Existing type to dump all my strings into?

When I need a data structure like this, I just copy it from the Zig source code, where it is called StringTable: https://codeberg.org/ziglang/zig/src/branch/master/src/link/StringTable.zig

There have been a bunch of posts about this data structure in the past, because it is quite neat in the way it combines the backing array with a hash set that indexes into that backing array (automatically adding de-duplication).
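To make the idea concrete, here is a minimal sketch of that combination. It is not a copy of the compiler's file: the name MiniStringTable is made up for illustration, and it assumes a recent Zig (roughly 0.14 or later) where the containers have `.empty` and std.hash_map provides StringIndexContext and StringIndexAdapter.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

/// Illustrative sketch only; `MiniStringTable` is a hypothetical name.
const MiniStringTable = struct {
    /// All strings back to back, each followed by a 0 byte.
    buffer: std.ArrayListUnmanaged(u8) = .empty,
    /// Set of offsets into `buffer`; hashing and equality look at the
    /// string stored at the offset, not at the offset itself.
    table: std.HashMapUnmanaged(
        u32,
        void,
        std.hash_map.StringIndexContext,
        std.hash_map.default_max_load_percentage,
    ) = .empty,

    pub fn deinit(self: *MiniStringTable, gpa: Allocator) void {
        self.buffer.deinit(gpa);
        self.table.deinit(gpa);
    }

    /// Returns the offset of `string`, appending it only if it is new.
    pub fn insert(self: *MiniStringTable, gpa: Allocator, string: []const u8) !u32 {
        // Reserve space first so nothing can fail after the set was touched.
        try self.buffer.ensureUnusedCapacity(gpa, string.len + 1);
        const gop = try self.table.getOrPutContextAdapted(
            gpa,
            string,
            std.hash_map.StringIndexAdapter{ .bytes = &self.buffer },
            std.hash_map.StringIndexContext{ .bytes = &self.buffer },
        );
        if (gop.found_existing) return gop.key_ptr.*;

        const offset: u32 = @intCast(self.buffer.items.len);
        self.buffer.appendSliceAssumeCapacity(string);
        self.buffer.appendAssumeCapacity(0);
        gop.key_ptr.* = offset;
        return offset;
    }

    /// Returns the string starting at `offset`, up to its terminating 0.
    pub fn get(self: *const MiniStringTable, offset: u32) []const u8 {
        return std.mem.sliceTo(self.buffer.items[offset..], 0);
    }
};

test "inserting the same string twice returns the same offset" {
    const gpa = std.testing.allocator;
    var st: MiniStringTable = .{};
    defer st.deinit(gpa);

    const a = try st.insert(gpa, "hello");
    const b = try st.insert(gpa, "hello");
    try std.testing.expectEqual(a, b);
    try std.testing.expectEqualStrings("hello", st.get(a));
}
```

The u32 you get back is the only handle you need to keep around, which is a lot smaller than a slice.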

The one caveat is that it uses zero-terminated strings, so you can’t use it to store strings that contain zero bytes. For that you would have to store the length explicitly, like with your OffsetSlice (though personally I would use u32 for the offset and length unless the strings really can be huge).
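As a rough sketch of the explicit-length variant: I am guessing at what your OffsetSlice looks like, so the field names and the slice helper below are assumptions, not your actual definition.

```zig
const std = @import("std");

/// Assumed shape of the handle: two u32 fields (8 bytes total)
/// instead of two usize fields.
const OffsetSlice = struct {
    offset: u32,
    len: u32,

    /// Hypothetical helper: resolves the handle against the backing bytes.
    fn slice(self: OffsetSlice, bytes: []const u8) []const u8 {
        return bytes[self.offset..][0..self.len];
    }
};

test "an explicit length allows embedded zero bytes" {
    // Two strings stored back to back, no terminators needed.
    const bytes: []const u8 = "ab\x00cd" ++ "xyz";
    const first: OffsetSlice = .{ .offset = 0, .len = 5 };
    const second: OffsetSlice = .{ .offset = 5, .len = 3 };

    try std.testing.expectEqualSlices(u8, "ab\x00cd", first.slice(bytes));
    try std.testing.expectEqualStrings("xyz", second.slice(bytes));
}
```

The trade-off is that the handle is twice as big as a bare u32 offset, and the stock StringIndexContext no longer works because it finds the end of a string by scanning for the 0 byte.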

In my code I also added a fromOwnedSlice method. That way I can easily store the backing array in a file, read it back in, and restore my string table from it:

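/// Takes ownership of `slice`, which must have been allocated with `gpa` and
/// is expected to hold zero-terminated strings stored back to back.
pub fn fromOwnedSlice(gpa: Allocator, slice: []u8) !@This() {
    var instance: @This() = .empty;
    instance.buffer = .fromOwnedSlice(slice);
    // rebuild hash table: register the start offset of every string
    var offset: u32 = 0;
    while (offset < instance.buffer.items.len) {
        // the context hashes the string that starts at `offset`
        try instance.table.putContext(gpa, offset, {}, StringIndexContext{
            .bytes = &instance.buffer,
        });

        // skip past the terminating zero to the start of the next string;
        // if there is no terminator, `offset` ends up at the buffer length
        offset = while (offset < instance.buffer.items.len) : (offset += 1) {
            if (instance.buffer.items[offset] == 0) break offset + 1;
        } else offset;
    }
    return instance;
}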