Using ArenaAllocator in a short-lived command-line zig program

biosbob · April 3, 2024, 6:07pm

i’m writing a command-line program in zig that will do some “simple” reading/writing of ~50 files which each contain no more than 200 lines of text… once the processing finishes (in seconds, not minutes) the program exits…

currently, i’m taking a somewhat brute force approach to memory management:

i create a single std.heap.ArenaAllocator in my main.zig, which is used everywhere an Allocator is needed; and
i have a defer arena.deinit() statement immediately following, which will presumably free everything in one place…

said another way, there are many functions that return some object which (according to their docs) i’m responsible for freeing… KNOWING that i’m using an ArenaAllocator, what’s the point of adding a bunch of defer statements that are essentially invoking a “free NOP”… (clearly, were i to use a different sort of allocator, this would be important)…

in certain cases (eg., some config file parsers i’m playing with), it appears that the parser creates its own ArenaAllocator which is freed along with results of the parser in some defer parsed_obj.deinit() statement…

should i really worry about this if (say) i’m only parsing a handful of files???

and what about files themselves… i assume i should be “closing” files once they are no longer needed – which is yet another defer file.close() statement…

so what i’m still trying to understand is this: given my use of an ArenaAllocator, what are some guidelines for how much “incremental release of resources” i should still concern myself with given the short-lived nature of my command-line program???

dimdin · April 3, 2024, 6:45pm

How ArenaAllocator works:
ArenaAllocator has a child allocator and a linked list of buffers. For child_allocator, my recommendation is to pass a c_allocator if you already link with libc otherwise a page_allocator.
The first time you allocate memory, a new buffer is created (at least as large as the request) and a linked list node is added that points to that buffer. Later allocations are carved out of the free space left in the current buffer; if a request does not fit, the arena tries to grow the buffer in place, and if that fails a new buffer is allocated and added as a new node in the linked list. Each time you call free, if you free the last allocation, the free is effective, otherwise it is a nop.

You don’t have to call free, and the free call is effective if and only if the free is for the last allocation.
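A minimal sketch of that behaviour, assuming the std API as of roughly Zig 0.12 (names may differ in other releases). The helper `arenaDemo` is made up for illustration; it only shows that both kinds of free are legal and that a single `deinit` releases everything.

```zig
const std = @import("std");

/// Hypothetical helper: allocates from one arena, frees in both orders,
/// then allocates again. Returns the length of the final allocation.
fn arenaDemo(child: std.mem.Allocator) !usize {
    var arena = std.heap.ArenaAllocator.init(child);
    defer arena.deinit(); // releases every buffer in one step

    const a = arena.allocator();

    const first = try a.alloc(u8, 64);
    const second = try a.alloc(u8, 64);

    a.free(first); // not the most recent allocation: effectively a nop
    a.free(second); // most recent allocation: the arena can take the space back

    const third = try a.alloc(u8, 64);
    return third.len;
}

pub fn main() !void {
    const n = try arenaDemo(std.heap.page_allocator);
    std.debug.print("allocated {d} bytes after the frees\n", .{n});
}
```

Neither free call is required for correctness here; the `defer arena.deinit()` covers all three allocations.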

If you pass your ArenaAllocator as a child_allocator for the parser you don’t have to free anything.

Closing files has nothing to do with ArenaAllocator. You must close the files to ensure that buffers are flushed.

You must still manage any resources such as files or sockets. But you don’t have to worry about memory (at least until you hit an out of memory limit).
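To make the split concrete, here is a rough sketch, again assuming the std API of roughly Zig 0.12. The helper `readAll`, the file name `input.txt` and the 1 MiB cap are all invented for the example. The file handle gets its own `defer`, while the bytes that were read simply live in the arena.

```zig
const std = @import("std");

/// Hypothetical helper: reads a whole file into arena-owned memory.
fn readAll(a: std.mem.Allocator, dir: std.fs.Dir, path: []const u8) ![]u8 {
    const file = try dir.openFile(path, .{});
    defer file.close(); // the arena knows nothing about file handles
    return file.readToEndAlloc(a, 1 << 20); // 1 MiB cap, chosen arbitrarily
}

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit(); // memory: one release for everything

    // "input.txt" is a placeholder; main returns an error if it is missing.
    const text = try readAll(arena.allocator(), std.fs.cwd(), "input.txt");
    std.debug.print("{d} bytes\n", .{text.len});
}
```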

kristoff · April 3, 2024, 7:04pm

if your program is not memory hungry, which should be the case if an arena allocator works for your usecase (since it doesn’t really free memory), then you might even opt for not deiniting it at the end of the program.

leak the memory and have the OS clean that up for you. it’s a legitimate technique, plenty of programs do that.
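a rough sketch of what that looks like, assuming the std API of roughly Zig 0.12; `buildReport` is a made-up helper and the file count is just an example value:

```zig
const std = @import("std");

/// Hypothetical helper: builds a message with whatever allocator it is given.
fn buildReport(a: std.mem.Allocator, file_count: usize) ![]u8 {
    return std.fmt.allocPrint(a, "processed {d} files", .{file_count});
}

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    // Deliberately no `defer arena.deinit()`: the process is about to exit,
    // and the OS reclaims its memory at that point.
    const report = try buildReport(arena.allocator(), 50);
    std.debug.print("{s}\n", .{report});
}
```

this only makes sense for the top-level arena of a program that exits right away; anything long-running or reusable as a library should still deinit.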

AndrewCodeDev · April 3, 2024, 8:52pm

Short-lived arenas are also used in the standard library. We recently got a question about the json module, where you can see functions creating arenas under the hood to help with parsing and then freeing them after the function call.
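As a rough illustration with `std.json` (assuming the API of roughly Zig 0.11 to 0.12; the `Config` struct and both helper functions are hypothetical), the first variant lets the parser own an arena that the caller releases through `deinit()`, and the second hands the parser an arena the caller already owns:

```zig
const std = @import("std");

// Hypothetical config shape, just for the example.
const Config = struct {
    name: []const u8,
    retries: u32,
};

/// Variant 1: parseFromSlice returns a Parsed(Config) that carries its own
/// arena; the caller releases it with deinit().
fn parseOwned(gpa: std.mem.Allocator, text: []const u8) !u32 {
    const parsed = try std.json.parseFromSlice(Config, gpa, text, .{});
    defer parsed.deinit();
    return parsed.value.retries;
}

/// Variant 2: the caller passes its own arena allocator and frees nothing
/// individually. With default options, string fields may point into `text`
/// rather than the arena, so `text` should outlive the result.
fn parseIntoArena(arena: std.mem.Allocator, text: []const u8) !Config {
    return std.json.parseFromSliceLeaky(Config, arena, text, .{});
}

pub fn main() !void {
    const text =
        \\{"name": "demo", "retries": 3}
    ;

    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    const retries = try parseOwned(std.heap.page_allocator, text);
    const cfg = try parseIntoArena(arena.allocator(), text);
    std.debug.print("{s}: {d} / {d}\n", .{ cfg.name, cfg.retries, retries });
}
```

For a program that already routes everything through one arena, the second variant avoids an extra `deinit` per parsed file.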