I understand allocation should be explicit and not implicit, and generally pass my allocators down through function calls so that the caller can handle the call to free the memory (while using errdefer to ensure errors don’t cause leaks). I have looked around for blog posts or documentation regarding anti-patterns and best practices beyond the call for explicit memory management, but have not found anything beyond “pass allocators as function arguments”. Does anyone have any great resources for common pitfalls with allocations and how best to avoid them? I want to ensure I’m doing things in an idiomatic fashion.
I think the main reason you find no clean list on this is that the answer is diverse.
Even the “always pass down the allocator” rule is not as strict as you think it is. It makes a lot of sense for libraries and simple data structures, but in your application it can make sense to, e.g., commit to a single global allocator or to always pass arenas to a particular function.
To give some examples: For short running CLI tools the general recommendation is to just use an arena for everything. On the other end of the spectrum, my project Cubyz (a game) uses a variety of different arenas and a global and a stack-local allocator with different use-cases as stated in the project guidelines.
If you are in doubt, I’d generally recommend using the DebugAllocator. It will be good enough for most cases and gives you more guardrails (such as leak detection), which help when you are new to manual memory management. Beyond that, I’d suggest keeping an eye out for places to use arena allocators; they can save a lot of complexity in your code.
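As a rough sketch of the two suggestions above combined (an arena for everything in a short-running CLI tool, backed by the DebugAllocator for leak reports), something like the following could work. It assumes Zig 0.14 or later, where std.heap.DebugAllocator exists under that name; the greet helper and the "top" argument are made up for illustration.

```zig
const std = @import("std");

// Hypothetical helper: the returned string lives as long as the arena does.
fn greet(arena: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.fmt.allocPrint(arena, "hello, {s}", .{name});
}

pub fn main() !void {
    // The DebugAllocator checks for leaks when deinit() is called.
    var debug_allocator: std.heap.DebugAllocator(.{}) = .init;
    defer _ = debug_allocator.deinit();

    // One arena for the whole run; everything is released by a single deinit().
    var arena_state = std.heap.ArenaAllocator.init(debug_allocator.allocator());
    defer arena_state.deinit();
    const arena = arena_state.allocator();

    const msg = try greet(arena, "top");
    std.debug.print("{s}\n", .{msg});
}
```

Note that greet never frees anything: the arena's owner decides when all of it goes away.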
Apart from that a few general tips:
Try to avoid creating many objects with complex lifetimes and reference counting. Instead try grouping objects of similar lifetime into arenas or use a more data oriented approach.
Here are also some related posts I’d suggest to check out for more practical examples:
There were a couple recent threads discussing different aspects of this question.
This is great, thank you! I am building a small CLI tool for viewing processes (I should call it YAT (yet another top XD)) to learn how to build TUI apps in Zig. I am using an arena allocator because it best fits my application: for procLists, I throw the whole thing away when I’m done with it. That may grow and change, but for now it’s a simple data structure and fits what you are describing.
I come from Golang where people have strong opinions on how to do things, and Zig feels a little more free, or in many cases freedumb enough to shoot yourself in your foot lol. I guess I was looking for hardline rules where I need to take a step back and view the application needs first beyond the dogma. Thank you for your time replying, I will take a look at the video/post you linked.
Most important best practice for me: never use the name allocator, always be precise about the contract:
- gpa: the code must call defer gpa.free to free stuff up.
- arena: the data will be bulk-deallocated at the appropriate system boundary, do not call free.
- scratch: private space for the function. Unlike arena, it’s definitely incorrect to return scratch data.
You can imagine using all three at the same time:
fn handle(
    request: *Request,
    arena: Allocator,
    gpa: Allocator,
    scratch: Allocator,
) Response
- arena is for request-scoped stuff, that’s where Response is allocated in.
- gpa is for data that outlives a single request. For example, if requests share a cache, data in the cache is managed via gpa.
- scratch is for data that is required to compute Response, but isn’t the actual response. For example, if you need to sort something, you can use scratch to allocate that buffer.
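To make the three contracts concrete, here is a minimal sketch of what a body for such a function might look like. The Request, Response, and Cache types and their fields are invented for illustration, the extra cache parameter is my addition, and the sketch assumes the caller resets scratch after the call (which is why nothing is freed there).

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical types, for illustration only.
const Request = struct { body: []const u8 };
const Response = struct { sorted: []const u8 };
const Cache = struct { last_body: ?[]u8 = null };

fn handle(
    request: *const Request,
    cache: *Cache,
    arena: Allocator,
    gpa: Allocator,
    scratch: Allocator,
) !Response {
    // scratch: private working buffer, never returned to the caller.
    const tmp = try scratch.dupe(u8, request.body);
    std.mem.sort(u8, tmp, {}, std.sort.asc(u8));

    // gpa: cached data outlives the request, so the old entry is freed explicitly.
    const copy = try gpa.dupe(u8, request.body);
    if (cache.last_body) |old| gpa.free(old);
    cache.last_body = copy;

    // arena: the response is bulk-deallocated by the caller, so no free here.
    return Response{ .sorted = try arena.dupe(u8, tmp) };
}
```

The sorted bytes are copied out of scratch into arena before returning, since returning scratch data directly would break the contract.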
As a fun example of that, TigerBeetle is built on top of gpa abstraction (we do free stuff, which is important for VOPR), but we use arena as our production implementation of gpa:
Hey look who showed up to the party, literally just finished the link to your blog post hahaha. Great insights though, adding this to my refactor list. Your reply, and blog post are much appreciated.
In this case, would you consider the following variation wise or unwise (or “just depends”)?
const Allocators = struct {
    request: Allocator,
    state: Allocator,
    scratch: Allocator,
};

// alloc can fail, so the call needs `try` and the return type becomes an error union.
fn handle(request: *Request, allocators: Allocators) !Response {
    const foo = try allocators.scratch.alloc(u8, 64); // for my sort....
    // ....
}
(A “just depends” answer might be offered like: “if there are lots of handle_X() variations, and they follow this pattern, then this can be a useful construct”… but I’m looking more for reflection on the naming - instead of the names “arena” and “gpa”, here, we have a proposition that might be considered even more specific, or “suggestive” - that’s good, right? Drawbacks?)
unwise — I’ve never seen all three used at the same time.
I must have misinterpreted this. I thought that was why you put all three in the function args. I confess, it’s not something I’ve seen, but I think I could imagine it, the way you proposed it, and think I see the value of it for your example of building a Response to a Request, where some scratch work might be involved, some state that outlives the Request/Response life might be needed, and, of course, building the Response (in a dumpable arena) is the actual purpose of the function, so….
But the back-question is: if a given allocator is really for the purpose of building a Response, then does it make even more sense to give it a name that suggests that specificity, rather than just “hinting”, with the name “arena”, that it should be arena-like? I think one answer to this is “NO”, under the logic provided, because naming it “arena” is supposed to signal, “do not call (defer) free - that’s not the way you’re supposed to use this one!” Of course, that understanding could be conveyed in other ways, but the name could certainly be considered one way to do it. That logic wouldn’t particularly hold up for why to call other allocators ‘gpa’ and ‘scratch’, rather than, possibly, ‘allocator.state’ and ‘allocator.temp’, or whatever - both of which should ask the alloc()-caller to be sure to call free(). It feels like there’s room to argue…?
But do you in practice end up passing arena like this a lot? If so, can you point us to a real-world example that you would consider a typical use case?
If the lifetime of the Request is already well-defined, then why not treat any internal allocation as an implementation detail? Why not accept the std.mem.Allocator on .init(), store it, and free on deinit, or even go further and use internal [_]u8 buffers, with or without FixedBufferAllocator?
In other words, the contract of “give me an allocator but don’t tell me when to alloc/free” feels like half-way towards managed memory. I tried to use it and felt like whenever I found the use case for it, I ended up going all the way to managed memory anyway.
Are there any examples when an API would benefit from making this contract but not going to managed pattern?
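For reference, the managed pattern being described (accept the allocator on init, store it, free on deinit) could be sketched roughly like this. The Request type and its path field are hypothetical, not taken from any real API.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical managed type: it remembers the allocator it was created with.
const Request = struct {
    gpa: Allocator,
    path: []u8,

    pub fn init(gpa: Allocator, path: []const u8) !Request {
        // The copy is owned by the Request until deinit() is called.
        return Request{ .gpa = gpa, .path = try gpa.dupe(u8, path) };
    }

    pub fn deinit(self: *Request) void {
        self.gpa.free(self.path);
        self.* = undefined;
    }
};
```

Here the caller only has to pair init with deinit; every internal allocation stays an implementation detail.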
Not about designing your API for proper lifetimes or allocation interfaces, but something I ran into recently caused me to start testing various allocators for performance. After you’ve minimized your alloc/free count in general, FixedBufferAllocator will typically be fastest, showing you the theoretical max you could ever hope to achieve with the other allocators. Big fan of std.heap.smp_allocator, especially after seeing how it compares to std.heap.c_allocator. smp was even competitive with arena-based strategies and bulk frees in my testing.
I’m guessing that depends a lot on how you’re using the allocator? Certainly FixedBuffer has categorical advantages, but, for the others, surely there are many reasons to choose one over another that often trump basic performance (dis)advantages on some normalized playing field?
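In case it helps anyone trying the same comparison, this is roughly how a FixedBufferAllocator can be dropped in behind the usual Allocator interface. The sumSquares function and the 1024-byte buffer size are arbitrary choices for the sketch, not part of my benchmark.

```zig
const std = @import("std");

// Hypothetical workload: any function that takes a std.mem.Allocator works unchanged.
fn sumSquares(allocator: std.mem.Allocator, n: usize) !u64 {
    const squares = try allocator.alloc(u64, n);
    defer allocator.free(squares);

    for (squares, 0..) |*s, i| {
        const x: u64 = @intCast(i);
        s.* = x * x;
    }

    var total: u64 = 0;
    for (squares) |s| total += s;
    return total;
}

pub fn main() !void {
    // All allocations come out of this stack buffer; no heap is involved.
    var buffer: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);

    const total = try sumSquares(fba.allocator(), 10);
    std.debug.print("{d}\n", .{total});
}
```

If the buffer is too small, alloc returns error.OutOfMemory instead of growing, so the buffer size has to be chosen up front.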
I agree that the names can be more specific, so here’s my interpretation:
fn handle(
    request: *Request,
    request_arena: Allocator,
    cache_gpa: Allocator,
    scratch: Allocator,
) Response
The allocator names have a scope prefix and a lifetime suffix, answering the questions: “what data am I allocating?” and “what are their lifetimes?” respectively. The scratch prefix can be inferred and is omitted.
As for the Allocators struct:
Bundling parameters in structs is useful, but it implies 2 things about your API:
- This combination of things is used often
- These things are conceptually/semantically linked
As was said before, (usually) neither of these is true. These allocators would probably only be used together in this function, and all of them are orthogonal to one another. I can certainly imagine scenarios to the contrary, but we’re talking in general here. If anything, I’d consider moving request_arena into the Request struct, since it’s unlikely you’d use one without the other.
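A rough sketch of that last idea, with request_arena living inside Request, might look like the following. The path and body fields are invented, cache_gpa is left out to keep it short, and the sketch assumes the caller resets scratch afterwards.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical types; only the placement of request_arena matters here.
const Request = struct {
    request_arena: Allocator,
    path: []const u8,
};
const Response = struct { body: []const u8 };

fn handle(request: *const Request, scratch: Allocator) !Response {
    // scratch: working buffer private to this call, never returned.
    const upper = try scratch.alloc(u8, request.path.len);
    _ = std.ascii.upperString(upper, request.path);

    // The response is allocated from the arena that travels with the request.
    return Response{ .body = try request.request_arena.dupe(u8, upper) };
}
```

The parameter list gets shorter, and the arena's lifetime is visibly tied to the Request it belongs to.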
@matklad feel free to overrule if I missed the mark!