Library to store codes of arbitrary bit length (Data Compression)

Hello everybody,

I am slowly learning the Zig language in my spare time. I do small projects as challenges and learn by building something “useful”. One of these is Huffman coding. As you probably know, some compression algorithms emit codes of variable bit length, which requires bit manipulation to pack them into an array of bytes. Does anybody know of an existing library/module in Zig that does that?
Thank you in advance.

Hello @Ziggurat, welcome to Ziggit :slight_smile:

I am not aware of any library for Huffman coding.

But I think it is a really good idea to implement something like this as an exercise, using only the Zig standard library.

The code table can be a map from each input character to its code: the number of bits and the bits themselves.
Input and output are streams of bytes (u8).
For each input byte, a map lookup yields some bits. Every time 8 bits have accumulated, one output byte is produced.

I think no library is needed here: for each bit of a Huffman code word, just shift an accumulator left and then OR in that bit. As soon as the accumulator is full, output it and start over. At the end of the input, a partially filled accumulator still has to be padded and written out.
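
For illustration, the shift-and-OR accumulator described above could look roughly like this. It is only a sketch, written in plain C so that it does not depend on any particular Zig version; the same shifts and ORs carry over to Zig directly. The names `BitWriter`, `put_bit` and `flush_bits` are made up for this example, and it assumes bits are packed most significant bit first and that the last byte is padded with zero bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit writer: packs single bits, MSB first, into a
 * caller-supplied byte buffer. */
typedef struct {
    uint8_t *out;    /* destination buffer */
    size_t   cap;    /* capacity of out in bytes */
    size_t   len;    /* bytes written so far */
    uint8_t  acc;    /* accumulator for the byte being built */
    unsigned nbits;  /* number of bits currently in acc (0..7) */
} BitWriter;

/* Append one bit. Returns 0 on success, -1 if the buffer is full. */
static int put_bit(BitWriter *w, unsigned bit) {
    assert(bit == 0 || bit == 1);
    /* This bit would complete a byte: make sure there is room for it. */
    if (w->nbits == 7 && w->len == w->cap) return -1;
    /* Shift the accumulator left and OR in the new bit. */
    w->acc = (uint8_t)((w->acc << 1) | bit);
    if (++w->nbits == 8) {
        /* Accumulator is full: output it and start over. */
        w->out[w->len++] = w->acc;
        w->acc = 0;
        w->nbits = 0;
    }
    return 0;
}

/* Pad a partially filled last byte with zero bits and write it out.
 * Returns 0 on success, -1 if the buffer is full. */
static int flush_bits(BitWriter *w) {
    while (w->nbits != 0) {
        if (put_bit(w, 0) != 0) return -1;
    }
    return 0;
}
```

As a quick check, writing the code words 101, 0 and 110 bit by bit and then flushing gives the single byte 0xAC (binary 10101100, where the last bit is padding). A decoder needs some way to know where the real data ends, for example a stored symbol count, because the padding bits could otherwise be read as another code word.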

Thank you for your recommendations :slight_smile:
I would rather not transfer each bit to the stream while traversing the Huffman tree. Instead, I would like to build the codebook in advance as a mapping between each input symbol and its Huffman code. Each Huffman code would then be written to the byte stream after being shifted so that it starts right after the last bit of the previous code. I saw such an implementation in C (MichaelDipperstein/huffman on GitHub, an ANSI C implementation of Huffman coding). I was wondering whether such an implementation, or a similar one, also exists in Zig. If I have time, I will try to implement it as part of the whole exercise to learn Zig.
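
A rough sketch of that codebook approach, again in plain C rather than Zig so it stays independent of Zig versions. This is not how the linked C implementation does it; it is just one possible way to append a whole code per call. The names `Code`, `CodeWriter`, `put_code`, `flush_codes` and `encode` are invented for this example, and it assumes codes of at most 32 bits, stored in the low bits of an integer and packed most significant bit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical codebook entry: the code is the low `len` bits of `bits`. */
typedef struct {
    uint32_t bits;
    uint8_t  len;    /* 1..32; 0 means the symbol has no code */
} Code;

typedef struct {
    uint8_t *out;    /* destination buffer */
    size_t   cap;    /* capacity of out in bytes */
    size_t   len;    /* bytes written so far */
    uint64_t acc;    /* pending bits, kept in the low nbits bits */
    unsigned nbits;  /* pending bit count, 0..7 between calls */
} CodeWriter;

/* Append a whole code right after the last bit already written.
 * Returns 0 on success, -1 if the buffer is full. */
static int put_code(CodeWriter *w, Code c) {
    assert(c.len >= 1 && c.len <= 32);
    uint64_t mask = ((uint64_t)1 << c.len) - 1;
    w->acc = (w->acc << c.len) | (c.bits & mask);
    w->nbits += c.len;                    /* at most 7 + 32 = 39 bits */
    while (w->nbits >= 8) {               /* emit every complete byte */
        if (w->len == w->cap) return -1;
        w->nbits -= 8;
        w->out[w->len++] = (uint8_t)(w->acc >> w->nbits);
    }
    w->acc &= ((uint64_t)1 << w->nbits) - 1;  /* keep only pending bits */
    return 0;
}

/* Write out a partially filled last byte, padded with zero bits. */
static int flush_codes(CodeWriter *w) {
    if (w->nbits == 0) return 0;
    if (w->len == w->cap) return -1;
    w->out[w->len++] = (uint8_t)(w->acc << (8 - w->nbits));
    w->acc = 0;
    w->nbits = 0;
    return 0;
}

/* Encode n input bytes with a 256-entry codebook built in advance.
 * Returns the number of output bytes, or -1 on error. */
static long encode(const Code book[256], const uint8_t *in, size_t n,
                   uint8_t *out, size_t cap) {
    CodeWriter w = { out, cap, 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        Code c = book[in[i]];
        if (c.len == 0) return -1;        /* symbol missing from codebook */
        if (put_code(&w, c) != 0) return -1;
    }
    if (flush_codes(&w) != 0) return -1;
    return (long)w.len;
}
```

With a toy codebook of a = 0, b = 10 and c = 11, encoding "abcabcab" produces 13 bits, which come out as the two bytes 0x5A and 0xD0 (the last three bits are padding). Compared with writing bit by bit, the only extra work is one shift by the code length per symbol.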

Thank you!