Building off this question (and AoC day 8) - what’s the point of a HashMap’s getOrPut? Since it doesn’t accept a value, it seems misleadingly named to me - you still have to call map.put to put a value.
Contrast the two methods used here:
const expect = @import("std").testing.expect;
test "HashMap of Lists" {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var map = std.AutoHashMap(u8, std.ArrayList(u32)).init(allocator);
var list_for_a = std.ArrayList(u32).init(allocator);
try list_for_a.append(1);
try map.put('a', list_for_a);
if (!map.contains('b')) {
try map.put('b', std.ArrayList(u32).init(allocator));
}
try map.getPtr('b').?.append(2);
const response = try map.getOrPut('c');
// Seems that `getOrPut` _doesn't_ put anything - as uncommenting the following line causes a segmentation fault
// print("DEBUG - content at c is {}\n", .{response.value_ptr});
if (!response.found_existing) {
try map.put('c', std.ArrayList(u32).init(allocator));
}
// Now that we have manually _put_ something into the map, the previously-breaking line is legal:
print("DEBUG - content at c is {}\n", .{response.value_ptr});
try response.value_ptr.append(3);
try expect(map.get('a').?.items[0] == 1);
try expect(map.get('b').?.items[0] == 2);
try expect(map.get('c').?.items[0] == 3);
}
When would one prefer to use the getOrPut approach? Or, is there a better third approach that I’ve missed?
Welcome to the forum! The advantage of getOrPut is that it’s more efficient than doing contains followed by put, as it only needs to find the (existing or new) position of the key in the map once. The intended way to use getOrPut is not to call put in the case where an existing entry isn’t found, but to initialize the new entry’s value using the returned value_ptr:
if (!response.found_existing) {
response.value_ptr.* = std.ArrayList(u32).init(allocator);
}
getOrPut does put something: namely, the new entry in the map, but it doesn’t initialize the value for you. That’s where the returned entry comes in, by giving you a pointer to the new entry’s value for you to initialize it.
Hmm…I still have some thinking to do to get my head around Zig’s low-level operation, then (the majority of my experience is with Java, Python, and TypeScript), since to me “putting an entry with an uninitialized value into the map” is essentially equivalent to “not putting anything into the map”. Would the difference be that map.contains(key) would return true (but map.get(key) would still return null)? Or should I be thinking in terms of allocated/reserved memory (if so - how can getOrPut reserve memory for an ArrayList when that is - if I’ve understood it right - a growable data structure?)?
it only needs to find the (existing or new) position of the key in the map once.
Fascinating. So it sounds like you’re saying that response.value_ptr.* = ... would be faster to execute than map.put(key, ...), because the latter would have to (maybe I’m using the wrong terms here, but hopefully the intention is understandable) traverse the map again to find “where to put” the value corresponding with key? Is that because of the cost of hashing key (I would usually consider that to be negligible; but I guess at the levels we’re dealing with here, it’s relevant?), or because of some other factor?
Thanks for the helpful insight! This is really rewiring my brain to care about a lot of stuff that’s been safely-ignorable before
Yes and yes. Everytime you need to search for an element, you need to do the whole shebang: hash it, chase pointers and do comparisons with elements that hash collide.
If they return true or the value that you are looking for, then it’s fine. But returning false or null is not enough, as you’d have to redo all the work to add the value to the map.
The same way that you store any growable data structure anywhere. You store a pointer, and put the data somewhere else.
One other thing: the cost of finding the right spot to put that new element might be sort-of O(1) for a hash table-based map (but what about collisions?) – other map implementations might need more work. So the point of getOrPut() is allowing you to pay that cost only once.
I wish other languages exposed this level of flexibility too! Zig
Hmmm - I did some experiments to try to understand this better, and got some very confusing results. Comparisons of the pointer-based and non-pointer-based approaches are inconsistent - neither is consistently faster, though the non-pointer-based approach tends to win more often than not. Can anyone see what mistake I might have made in setting up the experiment?
I cannot see the mistake that caused the segmentation fault. I’d need to see the full stack trace for that. However the fact that it works at numbers less than 10 might suggest that it has something to do with pointer invalidation after container resizing. Not sure about the HashMap, but an ArrayList for example resizes for the first time after 8 elements.
But I do see a few problems with your benchmarking methodology:
You do not use the same seed for both attempts, so they end up doing different operations. Generally we want to avoid additional variance introduce by doing different operations. If you average over many runs this might not be a big deal though.
Are you even measuring the right thing? You are generating two random numbers per loop. Even worse, since it’s unlikely that you get the same key twice you end up allocating once every iteration. And the GPA is not known for being fast.
The order of execution matters. If you do the non-pointer based test first, you might get totally different results. The GPA for example will be a lot slower for the first allocations, since it always requires a syscall, whereas subsequent allocation in the second execution will be a lot faster.
The numbers seem suspiciously high. You are inserting 25 elements, even with the allocation, that should not take 200 µs. Did you maybe forget to compile in release mode?
This is not correct: there is no guarantee that the response pointers still refer to valid memory after you perform additional mutations (such as map.put) on the map. In this case, calling map.put tries to ensure space for the new entry in the map (before it checks to see if the key is already present), which will resize the map’s internal memory if not enough space is already allocated, invalidating any pointers into the old memory.
Just to clarify, you can also use zig run -OReleaseFast <filename> to compile and run your program in release mode. The default optimization mode (for both zig run and zig build-exe) is Debug, which is best for development (safety checks enabled, debug info available, few optimizations applied so it’s easier to step through the code in a debugger), but not a good choice for benchmarking.
Of course, if you want to do benchmarking at an executable level, it’s still best to precompile an executable using zig build-exe and use something like poop to get nice timing, memory usage, and other stats using two different commands (e.g. add a command-line option in your benchmarking program to select which approach should be executed).