roaring bitmaps are a data structure for storing and operating on large sets of 32-bit integers. they’re used in major systems because of great compression and fast set operations.
this project is quite early with lots to be done. its a little messy still but i think i’m ok with the current design except some regret that i’ve over engineered my WordBitset with comptime options.
just wanted to share incase others are interested in this. human contributions wanted!
i managed to create a workflow on github so that this project’s docs get auto built and deployed. first time i’ve done that with a zig project. was pretty easy to do really. mirroring from codeberg to github for hosting docs and ci doesn’t seem too bad so far.
Thanks for your message. I saw your project was inspired by it to try myself. I’m following your validation strategy. Seems like you’ve covered a good bit of ground in your project. So far I’m somewhere between just porting the c code and trying for something simpler.
Some of my project’s c style memory management has been a bit difficult. I’m currently storing each container type as ptr and cardinality, capacity u32s with to/fromArrayList methods. and that works well.
Also like your project, I’m storing Conatiners as tagged pointers. Since each container has equal size, 128 bits, each is align 16 so the first 4 bits of every pointer will be clear. Though I’m only using 2 bits for 4 tags.
I want to add fuzzing and property based tests soon. I think roaring bitmaps are quite interesting and am enjoying learning about them.
Anyway, I’m curious to know how writing rawr went, what you learned and if you have any advice.
Just added frozen_view() to complete an initial frozen de/serialization cycle .
I think i’m happy now with a single shared allocation which gets cleaned up in deinit(). Better than my first pass with an out pointer the caller needed to free. Here’s how it looks now. And here is deinit.
I’ve struggled before with constructing a MultiArrayList from a buffer. Figured out something that works, seems ok along with MAL.capacityInBytes(). Curious to hear if anyone else has done similar. I’m hoping this will be fast - need start benchmarking soon.