Ergonomics of ArrayListUnmanaged

weskoerber · April 23, 2025, 7:09pm

This is a follow-up to a post I read earlier: Fastest way to 'switch' ownership of arraylist.

The topic of the post morphed into the use of multiple allocators with the various methods of ArrayListUnmanaged, and I wanted to follow up on this separately without contributing additional noise to the original topic. Specifically, I wanted to use something @ericlang said as a jumping-off point:

Theoretically appending to the list isn’t a problem in-and-of itself. Since each allocator instance maintains a list of its own allocations (I think), allocating is not an issue.

However, potential issues arise when memory is freed. Depending on which combination of allocators is used, the behavior could differ. Maybe some combinations will cause segfaults while others cause memory leaks. Or maybe everything works just fine for now ®™.

I can’t help but wonder whether the “unmanaged” flavor of containers will cause a lot of confusion and subtle, hard-to-track-down bugs, especially since an Allocator interface is being passed around, which makes it almost impossible to know what the concrete implementation is. Given these issues, it seems a bit unfortunate that the “managed” containers seem to be on the path to deprecation.

One counter-argument for this would be to just use a single allocator, but then that kind of defeats the purpose of the Allocator interface (and allocators in Zig as a whole).

I’m sure there’s an aspect of the “unmanaged” containers that I’m not considering that y’all will let me know about. And to be clear, I’m not saying that the “unmanaged” containers are objectively bad - there are certainly cases for them. I think I’m just struggling to understand why the “average” person should use them and be responsible for ensuring they’re using the right allocators over using a “managed” container, which pretty much makes this a non-issue.

Sze · April 23, 2025, 7:54pm

You can’t expect an allocator to keep an explicit list of its allocations. Basically, the only requirement is that if an alloc succeeded, the matching free will work too (when called on the allocator that was used to allocate).

This makes it possible to use arena/bump allocators, which mostly just allocate big chunks of memory and then return smaller pieces of that memory to callers of alloc, while either completely ignoring free calls or only honoring a free of the most recent allocation (because that one can be freed without introducing fragmentation/holes into the bigger memory chunks). Because those allocators don’t have a way to track individual memory pieces, they can’t free individual memory blocks, so they instead retain the memory until someone calls .reset or .deinit on the allocator.
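A minimal sketch of the arena case, assuming the Zig 0.14 standard library; `sumWithArena` is a hypothetical helper, not something from the thread. The list is never freed individually, and the arena's `deinit` releases everything in one step:

```zig
const std = @import("std");

/// Hypothetical helper: builds a list inside an arena and returns the sum
/// of its items. The list itself is never deinitialized.
fn sumWithArena(backing: std.mem.Allocator, count: u32) !u64 {
    var arena = std.heap.ArenaAllocator.init(backing);
    // One call releases every allocation made through the arena.
    defer arena.deinit();
    const a = arena.allocator();

    var list: std.ArrayListUnmanaged(u32) = .empty;
    // No `list.deinit(a)` here: the arena keeps the memory until
    // `arena.deinit()` (or a reset) hands it back to `backing`.
    var i: u32 = 0;
    while (i < count) : (i += 1) {
        try list.append(a, i);
    }

    var total: u64 = 0;
    for (list.items) |x| total += x;
    return total;
}
```

Calling `list.deinit(a)` would also be valid here; it is simply unnecessary once the arena owns the memory.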

Basically, every allocator can choose a different strategy for handling allocation and deallocation.

The benefit of the unmanaged variants is that when you create something that uses multiple data structures, for example multiple ArrayLists and HashMaps, you can share the allocator between all of them in a single field. That cuts down on redundancy and basically restores the managed style, except that it is now managed across a group of data structures in a way that makes sense for your application code. Things that are managed via specific structs can also share one or more allocators (either via shared fields or function parameters).
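As a rough illustration of the single shared field, here is a sketch assuming the Zig 0.14 std API. `Inventory` and its field names are made up for the example:

```zig
const std = @import("std");

/// Hypothetical application type: one allocator field serves both containers.
const Inventory = struct {
    gpa: std.mem.Allocator,
    names: std.ArrayListUnmanaged([]const u8) = .empty,
    counts: std.StringHashMapUnmanaged(u32) = .empty,

    fn init(gpa: std.mem.Allocator) Inventory {
        return .{ .gpa = gpa };
    }

    fn deinit(self: *Inventory) void {
        // Both containers are released with the same stored allocator.
        self.names.deinit(self.gpa);
        self.counts.deinit(self.gpa);
    }

    /// The name slice is borrowed, not copied; the caller keeps it alive.
    fn add(self: *Inventory, name: []const u8, count: u32) !void {
        try self.names.append(self.gpa, name);
        try self.counts.put(self.gpa, name, count);
    }
};
```

With managed containers, each of the two fields would carry its own copy of the allocator; here the parent struct stores it once.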

You also gain the option of not storing the allocator in a field at all and writing the code so that you only require the allocator once you actually allocate something. This means that initializing your data structure can become simpler: a complex data structure composed of simpler ones can be initialized via an .empty decl literal.
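A sketch of that style, again assuming Zig 0.14; `Graph` is a hypothetical type invented for the example. It has no allocator field, exposes its own `empty` declaration, and takes the allocator only in the methods that may allocate:

```zig
const std = @import("std");

/// Hypothetical composite type with no allocator field.
const Graph = struct {
    nodes: std.ArrayListUnmanaged(u32),
    edges: std.ArrayListUnmanaged([2]u32),

    /// Usable as a decl literal: `var g: Graph = .empty;`
    pub const empty: Graph = .{ .nodes = .empty, .edges = .empty };

    pub fn deinit(self: *Graph, gpa: std.mem.Allocator) void {
        self.nodes.deinit(gpa);
        self.edges.deinit(gpa);
    }

    pub fn addNode(self: *Graph, gpa: std.mem.Allocator, id: u32) !void {
        try self.nodes.append(gpa, id);
    }

    pub fn addEdge(self: *Graph, gpa: std.mem.Allocator, from: u32, to: u32) !void {
        try self.edges.append(gpa, .{ from, to });
    }

    /// No allocator parameter, so this method does not allocate.
    pub fn nodeCount(self: Graph) usize {
        return self.nodes.items.len;
    }
};
```

The trade-off is that the caller has to pass the same allocator to every method, including `deinit`.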

ericlang · April 23, 2025, 7:57pm

That was exactly what I was thinking. I can’t think of many cases in which you would want multiple allocators for one list. That’s why I was kind of flabbergasted when reading about the deprecation of ArrayList.
But the people who decided that are probably much smarter than me. I also should have a look at the unmanaged list code in more depth.
I hope we do not have to have multiple allocators for one byte in the future.

ForeverZer0 · April 23, 2025, 8:45pm

Just my two cents, but I often find that the argument “against” generally boils down to some theoretical confusion about which allocator was used to allocate a list, with the programmer being unsure which allocator to use when freeing it, and/or this somehow being antithetical to using multiple allocators.

My issue with these arguments is that they always seem entirely theoretical: I can’t envision, and have never seen, a real-world example where such a scenario would manifest. The ability to pass in the incorrect allocator is not somehow unique to the containers defined in the standard library. Literally any struct that doesn’t carry around an allocator with it is subject to the same “confusion”, but we never talk about these situations because in the real world this is not an actual issue, and multiple allocators are rarely used at a level of granularity where it could even become one.

I am not against the use of “managed” lists, but I do think that the “unmanaged” variant should be the default (i.e. the names would be ArrayList and ArrayListManaged), as it adheres much more closely to the Zig practice of being explicit. If I am passing in an allocator, I should expect that this function can/will allocate; otherwise not. Very simple, no surprises, and no need to surmise as much from the error union, if the function even chooses to return the OutOfMemory error.
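To make the signature argument concrete, here is a small sketch assuming Zig 0.14. Both function names are hypothetical; the only difference that matters is whether an allocator appears in the parameter list:

```zig
const std = @import("std");

/// Takes an allocator, so the caller can expect that it may allocate.
fn appendSquares(gpa: std.mem.Allocator, list: *std.ArrayListUnmanaged(u64), n: u64) !void {
    var i: u64 = 1;
    while (i <= n) : (i += 1) {
        try list.append(gpa, i * i);
    }
}

/// Takes no allocator, so it cannot grow the list through the
/// unmanaged API; it only reads the items.
fn sum(list: std.ArrayListUnmanaged(u64)) u64 {
    var total: u64 = 0;
    for (list.items) |x| total += x;
    return total;
}
```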

weskoerber · April 24, 2025, 7:27pm

To be clear, I wasn’t necessarily arguing “against” any particular thing here. I really just wanted to get some insight into the benefits of using unmanaged containers from folks who are better at writing Zig and understanding the ecosystem than I am, but I can see how my ignorance came across as such.

Admittedly, this post was theoretical, but the theory is far from impossible. You can absolutely have multiple allocators and free memory with the wrong one, even if it’s bad code.
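For reference, a sketch of what that situation looks like with two allocators in play, assuming Zig 0.14; `pairedLengths` and the variable names are hypothetical. The mismatched call is left as a comment because its effect depends on the concrete allocators involved:

```zig
const std = @import("std");

/// Hypothetical example: two lists, two allocators, each list released
/// with the allocator that allocated it.
fn pairedLengths(gpa: std.mem.Allocator) !usize {
    var buf: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const scratch = fba.allocator();

    var long_lived: std.ArrayListUnmanaged(u8) = .empty;
    defer long_lived.deinit(gpa); // allocated with gpa, freed with gpa
    var temp: std.ArrayListUnmanaged(u8) = .empty;
    defer temp.deinit(scratch); // allocated with scratch, freed with scratch

    try long_lived.appendSlice(gpa, "hello");
    try temp.appendSlice(scratch, "hi");

    // The mistake under discussion would be a mismatched pair such as
    //   long_lived.deinit(scratch);
    // The compiler accepts it, since both values are `std.mem.Allocator`.

    return long_lived.items.len + temp.items.len;
}
```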

I do appreciate the semantics of explicitly passing an allocator to a function that allocates memory. In that regard, the change seems better suited for Zig’s idea of avoiding hidden allocations.

ForeverZer0 · April 24, 2025, 8:53pm

I am not against either and fully support the inclusion of both in the standard library, nor do I feel strongly either way. I do think that explicitly requiring an allocator for a function that can allocate is more in harmony with the rest of the standard library, and with how people actually write Zig in the wild.

In my opinion, a struct carrying around an allocator is perfectly fine, but typically I would confine this to more complicated types, whereas lists and whatnot feel like one tier above language primitives (and often are in other languages). More often than not, a container is being used within the scope of a more advanced type, which may have multiple fields requiring an allocator, so it makes more sense to have this parent type carry the allocator, and pass it down to its children where needed. This is why I mentioned that in my opinion, “unmanaged” should be the default, and “managed” be the type with the qualified name.

All of this is obvious bike-shedding on my part. I won’t be upset if they choose to keep the status quo, but I do support the change; it personally makes sense to me. After a couple of months of using Zig, I found myself exclusively using the “unmanaged” variants, with the exception of a few one-offs and short-lived locals.

Sze · April 24, 2025, 9:26pm

From the Zig 0.14.0 Release Notes:

std.ArrayHashMap is now deprecated and aliased to std.ArrayHashMapWithAllocator.

To upgrade, switch to ArrayHashMapUnmanaged which will entail updating callsites to pass an allocator to methods that need one. After Zig 0.14.0 is released, std.ArrayHashMapWithAllocator will be removed and std.ArrayHashMapUnmanaged will be a deprecated alias of ArrayHashMap. After Zig 0.15.0 is released, the deprecated alias ArrayHashMapUnmanaged will be removed.

So the unmanaged variants will end up with the simple name and the managed ones will be removed.
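The call-site change is the same for every unmanaged container. Here is a sketch using ArrayListUnmanaged rather than ArrayHashMap, assuming Zig 0.14 naming; `collectDigits` is a hypothetical function and the managed version is shown only in comments:

```zig
const std = @import("std");

/// Hypothetical function: returns the first `n` ASCII digits as an owned slice.
fn collectDigits(gpa: std.mem.Allocator, n: u8) ![]u8 {
    // Managed style (Zig 0.14 `std.ArrayList`), for comparison only:
    //   var list = std.ArrayList(u8).init(gpa);
    //   errdefer list.deinit();
    //   try list.append('0' + i);
    //   return list.toOwnedSlice();

    // Unmanaged style: no allocator at init, and every method that can
    // allocate or free takes the allocator explicitly.
    var list: std.ArrayListUnmanaged(u8) = .empty;
    errdefer list.deinit(gpa);
    var i: u8 = 0;
    while (i < n) : (i += 1) {
        try list.append(gpa, '0' + i);
    }
    return list.toOwnedSlice(gpa);
}
```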

weskoerber · April 24, 2025, 9:59pm

I think this is what I needed to hear. It’s essentially what @Sze said…

…but I feel like your explanation resonated more with me - at least, made me better understand what they were saying, and why the unmanaged containers should be used (not that their reply was poor or anything - quite the opposite).