If it’s the StringHashMap, so be it: iterate the keys() and free them, then deinit the HashMap.
If it’s something else, don’t do that. Have the something else do it instead.
Example: diffz has a line-diff mode, and how it works is fairly clever: it hashes every line in both files, and converts them to a list of unique characters (UTF-8 encoded). It then run the Myer’s diff on just the characters, and iterates the character diffs, retrieving the lines from the HashMap. Result: line-based diffs.
That hash map doesn’t own the slices, so it doesn’t free them. They’re just views into the source buffers. The final diffs have copies, although that isn’t strictly necessary, but it means you can free the file buffers once the diffing is done and nothing bad will happen.
FWIW, for my Advent of Code solutions in zig, I use an interned string table. This is basically:
An ArrayList that holds all strings (and they are dynamically allocated copies of the slices passed in); the position in this table becomes the string id.
A StringHashMap which maps from each string to its position in the array (its id).
With this in place, I can easily compare two strings by comparing their positions in the ArrayList. When the string table is being deinited, I release the memory used by each string.
Not for the first time… It is not a surprise that the solution space for these things quickly converges to one or two good ideas. Can’t claim any flashes of genius, but it is good to get confirmation that others have walked the same path. Cheers!
There also was this old (clever) implementation that was used within the WASM namespace for a while (that still could be used for other things):
The benefit of that variant is that it doesn’t need to use/store any slices internally, it just stores the start position of specific null terminated keys.
What @gonzo described is not a StringArrayHashmap, for the purposes of string interning, it is no different from using a StringHashMap, the only difference is how efficient it is to iterate through the stored keys/values due to how it stores that data.
@mnemnion is using the first pointer given to the hash map as the ID, this can be done with a StringHashMap as well. @gonzo is using an index into an ArrayList of copies of the keys as the ID. This offers the ability to use a smaller type for the ID.
They don’t specify if they are using an ArrayList(u8) or ArrayList([]const u8)
Both are interning strings, but they are different.