Who is responsible for freeing memories when it comes to maps?

I am still not sure when it comes to memory management, who owns data and hence responsible for freeing it. And also when is ownership transferred.

I think one place this knowledge gap shows up is when it comes to maps and values stored in a map,

Who is responsible for freeing the memory used by a map. more specifically:

  • the memory used by the keys of the map? Is freeing this even a thing?
  • the things stored as values in the map?

If the map has a deinit method, will that also free both the map’s internal memory it uses and the memory of its values? or the values needs to be freed manually, before calling deinit on the map?

How does managed vs unmanaged variants affect this?

Trying to answer my own question

If the map has a deinit method, will that also free both the map’s internal memory it uses and the memory of it’s values?

I think it depends on the value stored in the map? if the value is not a pointer, then I guess the map’s deinit method also cleans it up.

if the value is just a pointer then a manually freeing the values memory would be needed.

Now question is, is it possible to ā€œmoveā€ a value to a map and have the map now own it? such that the map is now responsible for the freeing?

If the answer is that Zig does not have a ā€œmoveā€ semantics then how do one identify who is the final responsible for freeing memory?

Hello,
Zig does not impose any restrictions on memory ownership (for better or worse).
The hash map data structure in std, only frees its internal memory on deinit. So if you store anything that has its own separate lifetime, you need to free it yourself.

There is no move semantics in Zig, you can possibly store some smart structure that would keep track of who owns her and when deiniting the map it would go thru the elements and free them if needed. Alho i have to say this will cost more memory and make the code much more bug prone and harder to maintain (+ there are very few cases when this would be desired anyway).

You can also wrap the map around in your own struct and provide custom deinit or overall functions for it. Altho i would again raise a question why do you store data in a single map with marginly different lifetimes behaviors on map deinit?

In the end, programmer is the one responsible to free the memory, whenever its needed (Zig gives some tools to make it cleaner).

On the case of managed vs unmanaged, there have been a shift towards unmanaged variants of data structures in std lately. The only difference is that the managed variants hold allocator inside so you dont have to pass it to every function that might need it on call.
Even tho its more convinient it makes code sometimes harder to reason about and is larger in memory than unmanaged variants.

Robert :blush:

2 Likes

It’s not that bad, and it should cost much less memory or there’s no point in doing it. Not at all uncommon to be able to add a bool to a struct without increasing its size, just uses memory which would have been padding. That can track ownership with no size increase at all.

It’s useful when a data structure may or may not alter a string, for instance. The normalization functions in zg do this: if they don’t change the string, you get it back, and the struct keeps track of whether deinit frees the string or not. It does mean that you need to keep track of whether the old string is ā€˜in play’, but that’s easy to do, or you can just call a method which copies it if it wasn’t copied (and sets the owner flag accordingly).

Most of the time, a data structure should either own what it receives or it shouldn’t. Which applies needs to be clearly documented. Just part of the Zig experience: carefully engineered memory policies can be more efficient than any automated or semi-automated approach, RAII included, but it does mean that program correctness requires always having a clear sense of what owns memory and what’s just looking at it.

But to address @finlaydotb’s question directly: HashMap and data structures like it copy their keys and values, but only their keys and values, and you have to know what that means. A slice is a pointer and a length: a HashMap will copy that pointer and that length. It will not copy what the pointer points to. If you want that, you have to do it, and you need to iterate the map to free that memory (unless you have other plans, e.g. an arena is backing it).

1 Like

Without diving too deep into this great question, I want to add one little piece of advice: whenever you find yourself iterating over a data structure and freeing all the elements, ask yourself the question: is there a way to turn this O(N) memory allocation strategy into O(1)?

As @mnemnion hinted above, one way to do this would be to arena-allocate the backing memory for strings inside a hash map. There are other strategies as well. But in my experience, more often than not, freeing inside a for loop is a code smell.

It’s not terrible to do. In other programming languages there is practically no other way. But in Zig, we can do better :slightly_smiling_face:

8 Likes

In other languages, the machinery for doing all that freeing is mostly hidden from you, which is just as poisonous (for performance), but you can’t smell it. It’s the carbon monoxide of software engineering.

4 Likes

I’ll be curious what the other strategies are. I am currently working on a codebase where freeing inside a for loop is the main/only strategy for de-initialising values in a map, and it has never come up that, that could be a code smell…

It’s a code smell because it means you’re thinking about lifetimes of individual items when you should be thinking about lifetimes of collections or batches of items. This was alluded to above through the mention of arenas.

There are obviously exceptions, but often you can organize, say, the use of a map and its keys and values such that they can be allocated from an arena. Then you just deallocate everything in one single sweep instead of first the items in a loop and then the map itself, etc. This is especially true in session oriented situations (such as a web request), or when there are related short-lived collection of things (like in a frame of a game)

4 Likes

Andrew mentioned this strategy in his response, but he also alluded to other strategies. Curious about what those other strategies are.

In addition to arenas, some options I’ve encountered (which is not to say that you sometimes have to do things at an individual level)

  • Find a non-allocating solution (it’s incredible how often this is possible)
  • Calculate and preallocate enough space (TigerBeetle take this the whole way IIRC)
  • Consider other data structures that can allow for batching
  • Andrew’s Programming without pointers show a strategy that can help simplify memory management
5 Likes

Yeah, although sometimes you don’t have any other option, like if the resource is not memory but e.g. a file handle.

1 Like