Abstract
We present a parallel implementation of Huffman coding, a widely used entropy encoding algorithm, on the NVIDIA CUDA architecture. After constructing the Huffman codeword tree serially, we proceed in parallel by generating a byte stream in which each byte represents a single bit of the compressed output stream. The final step combines every 8 consecutive bytes into a single byte, again in parallel, to produce the final compressed output bit stream. Experimental results show speedups of up to 22× over the serial CPU implementation, without any constraint on the maximum codeword length or data entropy.
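The three stages described in the abstract can be illustrated with a small serial sketch. This is not the authors' CUDA code: it is a plain Python model of the data flow, in which each iteration of the two marked loops is independent and could in principle be assigned to one GPU thread. The function names (`build_codes`, `encode`, `decode`) are hypothetical, and the use of an exclusive prefix sum over codeword lengths to find each symbol's write position is an assumption, since the abstract does not say how output offsets are computed.

```python
import heapq
from collections import Counter
from itertools import accumulate


def build_codes(data: bytes) -> dict[int, str]:
    """Serial stage: build a Huffman code table (symbol -> bit string)."""
    freq = Counter(data)
    if len(freq) == 1:  # degenerate input: give the only symbol a 1-bit code
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial codeword})
    heap = [(w, sym, {sym: ""}) for sym, w in freq.items()]
    heapq.heapify(heap)
    tie = 256  # larger than any byte value, so tie-breakers stay unique
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]


def encode(data: bytes) -> tuple[bytes, int, dict[int, str]]:
    """Return (packed bytes, number of valid bits, code table)."""
    if not data:
        return b"", 0, {}
    codes = build_codes(data)

    # Assumption: an exclusive prefix sum of codeword lengths gives each
    # input symbol the bit position where its codeword starts.
    lengths = [len(codes[s]) for s in data]
    offsets = [0] + list(accumulate(lengths))
    total_bits = offsets[-1]
    padded_bits = (total_bits + 7) // 8 * 8  # round up to a whole byte

    # Stage 2: one byte per output bit. Iterations over i are independent.
    bit_bytes = bytearray(padded_bits)
    for i, s in enumerate(data):
        for k, bit in enumerate(codes[s]):
            bit_bytes[offsets[i] + k] = 1 if bit == "1" else 0

    # Stage 3: pack every 8 consecutive bit-bytes into one output byte,
    # most significant bit first. Iterations over j are independent.
    out = bytearray(padded_bits // 8)
    for j in range(len(out)):
        byte = 0
        for b in bit_bytes[8 * j: 8 * j + 8]:
            byte = (byte << 1) | b
        out[j] = byte
    return bytes(out), total_bits, codes


def decode(packed: bytes, total_bits: int, codes: dict[int, str]) -> bytes:
    """Serial reference decoder, used only to check the encoder."""
    inverse = {c: s for s, c in codes.items()}
    out, cur = bytearray(), ""
    for i in range(total_bits):
        bit = (packed[i // 8] >> (7 - i % 8)) & 1
        cur += "1" if bit else "0"
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return bytes(out)
```

Calling `encode(b"abracadabra")` and passing its three return values to `decode` should reproduce the input. The intermediate `bit_bytes` buffer is eight times larger than the final output; the sketch suggests why that trade-off can be attractive on a GPU, since every write in stage 2 targets its own byte and no two iterations need to update bits within the same byte.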
Original language | English |
---|---|
Title of host publication | 2014 IEEE Visual Communications and Image Processing Conference, VCIP 2014 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 311-314 |
Number of pages | 4 |
ISBN (Electronic) | 9781479961399 |
Publication status | Published - 27 Feb 2015 |
Externally published | Yes |
Event | 2014 IEEE Visual Communications and Image Processing Conference, VCIP 2014 - Valletta, Malta Duration: 7 Dec 2014 → 10 Dec 2014 |
Publication series
Name | 2014 IEEE Visual Communications and Image Processing Conference, VCIP 2014 |
---|
Conference
Conference | 2014 IEEE Visual Communications and Image Processing Conference, VCIP 2014 |
---|---|
Country/Territory | Malta |
City | Valletta |
Period | 7/12/14 → 10/12/14 |
Bibliographical note
Publisher Copyright: © 2014 IEEE.
Keywords
- CUDA
- GPGPU
- Huffman coding
- JPEG
- parallel computing
- variable length coding