It works well, but I don’t like having to call s.arena.allocator() every time I need to allocate during traversal. I think it would be more efficient to save the allocator in a struct field once and reuse it later. However, I can’t store s.arena.allocator() in a field during init(), because that requires taking a pointer to a struct that has not yet reached its final location on the stack. What can I do?
Another way, equally verbose in its use, is to keep the ArenaAllocator outside Printer and store its allocator in a member, allocator: std.mem.Allocator.
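A minimal sketch of that layout, assuming a hypothetical Printer whose only job here is to allocate a label during traversal (the label function and its format string are made up for illustration):

```zig
const std = @import("std");

// Hypothetical Printer: it does not own the arena, it only keeps the
// std.mem.Allocator interface value handed to it.
const Printer = struct {
    allocator: std.mem.Allocator,

    fn init(allocator: std.mem.Allocator) Printer {
        return .{ .allocator = allocator };
    }

    // Stand-in for an allocation made during traversal.
    fn label(self: Printer, depth: usize) ![]u8 {
        return std.fmt.allocPrint(self.allocator, "node@{d}", .{depth});
    }
};

pub fn main() !void {
    // The arena lives in the caller's frame, so its address stays valid
    // for as long as the Printer is in use.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    const printer = Printer.init(arena.allocator());
    const text = try printer.label(3);
    std.debug.print("{s}\n", .{text});
}
```

The caller now decides the arena's lifetime, which is the price of keeping Printer free of self-referencing pointers.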
But I prefer to use self.allocator() instead of self.allocator.
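Roughly what I mean by self.allocator(), as a sketch only (the Printer fields and the label helper are assumptions, not the original code):

```zig
const std = @import("std");

const Printer = struct {
    arena: std.heap.ArenaAllocator,

    fn init(child: std.mem.Allocator) Printer {
        // No pointer to the arena is taken here, so returning by value is safe.
        return .{ .arena = std.heap.ArenaAllocator.init(child) };
    }

    fn deinit(self: *Printer) void {
        self.arena.deinit();
    }

    // Builds the interface value from the Printer's current address,
    // so it stays correct even if the Printer was moved after init().
    fn allocator(self: *Printer) std.mem.Allocator {
        return self.arena.allocator();
    }

    // Stand-in for an allocation made during traversal.
    fn label(self: *Printer, depth: usize) ![]u8 {
        return std.fmt.allocPrint(self.allocator(), "node@{d}", .{depth});
    }
};

pub fn main() !void {
    var printer = Printer.init(std.heap.page_allocator);
    defer printer.deinit();

    const text = try printer.label(1);
    std.debug.print("{s}\n", .{text});
}
```

The init parameter is called child rather than allocator because Zig rejects a local that shadows the allocator declaration inside the struct.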
@dimdin Sorry, I am also not sure what the return self.arena.allocator; line means here. There is no field named allocator in the ArenaAllocator struct. If that function is meant to return .child_allocator (which does exist in the struct), this certainly defeats the whole point of using an arena allocator.
@dimdin em… The main concern was that I want to avoid calling arena.allocator() in the first place. I think it’s a bit too much to do the following on every call:
@aiac I think in the init body you are using stack pointers that will be invalidated as soon as the function exits. Second, the allocator for the buffer should be independent of the arena because the buffer is the result of a traversal that will be used after arena.deinit().
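One possible way to address both points, sketched under assumptions (the field names scratch and out, the render function, and the in-place init signature are all hypothetical): initialize the struct at its final address so the stored allocator never points at a dead stack frame, and give results their own allocator.

```zig
const std = @import("std");

const Printer = struct {
    arena: std.heap.ArenaAllocator,
    // Points at `arena` above; only valid while this Printer is not moved or copied.
    scratch: std.mem.Allocator,
    // Independent of the arena, so results survive arena.deinit().
    out: std.mem.Allocator,

    // In-place init: `self` already has its final address, so taking a
    // pointer to `self.arena` here does not dangle after init returns.
    fn init(self: *Printer, gpa: std.mem.Allocator) void {
        self.arena = std.heap.ArenaAllocator.init(gpa);
        self.scratch = self.arena.allocator();
        self.out = gpa;
    }

    fn deinit(self: *Printer) void {
        self.arena.deinit();
    }

    // Temporary work goes through the arena; the returned slice is copied
    // with `out` and must be freed by the caller.
    fn render(self: *Printer, depth: usize) ![]u8 {
        const tmp = try std.fmt.allocPrint(self.scratch, "node@{d}", .{depth});
        return self.out.dupe(u8, tmp);
    }
};

pub fn main() !void {
    var printer: Printer = undefined;
    printer.init(std.heap.page_allocator);

    const text = try printer.render(2);
    defer std.heap.page_allocator.free(text);

    printer.deinit(); // arena memory is gone, `text` is still valid
    std.debug.print("{s}\n", .{text});
}
```

The trade-off is that the Printer must stay where it was initialized; copying it would leave scratch pointing at the old arena.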
So it is a bit like a fat pointer, and passing it around should cost about the same as passing a slice. If anything, I would expect a slice to be more costly, because both its pointer and its length are dynamic, whereas here only the pointer is dynamic.
Unless you have measured this to be a performance problem, I think it is silly to try to avoid this. It might even be that having fewer fields on your printer struct makes it easier to pass around, and is thus better than avoiding a simple recompute (that could be quite optimized).
I think most other things need to be considered for optimization before trying to save a single call to allocator(), even when you have a bunch of calls.
On the contrary, calling arena.allocator() is probably even faster, because the compiler knows the allocator implementation. If you store the allocator in the struct, the compiler does not know this and needs to resort to runtime function calls.
Check out this godbolt: Compiler Explorer
As you can see, the version that uses arena.allocator() has fewer instructions and jumps to a static address, whereas the version that uses a prefilled allocator jumps to a dynamic address (which is usually slower).
Thank you all guys, especially @Sze and @IntegratedQuantum for hitting the root of my problem. I tested the version that passes arena.allocator() as an argument to the traverse function (1) vs inserting arena.allocator() directly where it’s needed (2):