How to get arena.allocator() during struct initialization?

timfayz · May 22, 2024, 9:20am

I’m trying to apply arena allocation in my printer (contrived example):

const std = @import("std");

const Printer = struct {
    arena: std.heap.ArenaAllocator,
    buffer: std.ArrayList(u8),

    fn init(alloc: std.mem.Allocator) Printer {
        return Printer{
            .arena = std.heap.ArenaAllocator.init(alloc),
            // It would be nice to write something like
            //     .alloc = ^.arena.allocator(),
            // here, but I think that is impossible and would require an
            // additional step such as initAlloc().
            .buffer = std.ArrayList(u8).init(alloc),
        };
    }

    fn deinit(s: *Printer) void {
        s.arena.deinit();
        s.buffer.deinit();
    }

    fn toOwnedSlice(s: *Printer, val: anytype) ![]u8 {
        try s.traverse(val, 0);
        return s.buffer.toOwnedSlice();
    }

    fn traverse(s: *Printer, val: anytype, comptime depth: usize) !void {
        switch (@typeInfo(@TypeOf(val))) {
            .Pointer => |ptr| {
                switch (ptr.size) {
                    .One => {
                        const interim = try std.fmt.allocPrint(s.arena.allocator(), "ptr:{p}\n", .{val}); // leak
                        try s.buffer.appendSlice("  " ** depth);
                        try s.buffer.appendSlice(interim);
                        try s.traverse(val.*, depth + 1);
                    },
                    else => unreachable,
                }
            },
            else => {
                const interim = try std.fmt.allocPrint(s.arena.allocator(), "val:{any}\n", .{val}); // leak
                try s.buffer.appendSlice("  " ** depth);
                try s.buffer.appendSlice(interim);
            },
        }
    }
};

test {
    var printer = Printer.init(std.testing.allocator);
    defer printer.deinit();

    const val: u8 = 42;
    const ptr: *const u8 = &val;
    const ptr_ptr: *const *const u8 = &ptr;
    const res = try printer.toOwnedSlice(ptr_ptr);
    std.testing.allocator.free(res);
}

It works well, but I don’t like having to call s.arena.allocator() every time I need to allocate during traversal. I think it would be more efficient to save the allocator in a struct field once and reuse it later. However, I can’t save s.arena.allocator() in a field during init(), because that requires taking a pointer to a struct that does not yet live at its final address. What can I do?

timfayz · May 22, 2024, 9:28am

Maybe I should do something like this:

     fn toOwnedSlice(s: *Printer, val: anytype) ![]u8 {
-        try s.traverse(val, 0);
+        try s.traverse(s.arena.allocator(), val, 0);
         return s.buffer.toOwnedSlice();
     }
 
-    fn traverse(s: *Printer, val: anytype, comptime depth: usize) !void {
+    fn traverse(s: *Printer, alloc: std.mem.Allocator, val: anytype, comptime depth: usize) !void {
         switch (@typeInfo(@TypeOf(val))) {

dimdin · May 22, 2024, 10:19am

You can have a helper allocator function:

fn allocator(self: *Printer) std.mem.Allocator {
    return self.arena.allocator();
}

Another way, equally verbose in use, is to keep the ArenaAllocator outside Printer and store its allocator in a field, allocator: std.mem.Allocator.
But I prefer self.allocator() to self.allocator.
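
The second option above (arena owned by the caller, Printer keeping only a std.mem.Allocator field) could look roughly like the sketch below. The trimmed-down Printer and its label() helper are hypothetical and exist only for illustration; they are not part of the original code.

```zig
const std = @import("std");

// Hypothetical, trimmed-down Printer: it does not own the arena,
// it only stores the allocator handed to it by the caller.
const Printer = struct {
    allocator: std.mem.Allocator,

    // Illustrative helper: formats one value using the stored allocator.
    fn label(self: Printer, value: u8) ![]u8 {
        return std.fmt.allocPrint(self.allocator, "val:{d}\n", .{value});
    }
};

test "arena lives outside the printer" {
    // The caller owns the arena, so its address is stable for the
    // whole lifetime of the printer.
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();

    const printer = Printer{ .allocator = arena.allocator() };
    const text = try printer.label(42);
    try std.testing.expectEqualStrings("val:42\n", text);
}
```

Because the arena is a local in the caller, the stored allocator stays valid for as long as that local is in scope, and everything label() allocates is released by the single arena.deinit().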

aiac · May 22, 2024, 11:00am

I thought you wanted something like this

github.com

ziglang/zig/blob/5fe9f88b13f37e14fcb91e155e3e686eccb89dfc/lib/std/json/static.zig#L93


      
              options: ParseOptions,
          ) ParseError(Scanner)!T {
              var scanner = Scanner.initCompleteInput(allocator, s);
              defer scanner.deinit();
          
              return parseFromTokenSourceLeaky(T, allocator, &scanner, options);
          }
          
          /// `scanner_or_reader` must be either a `*std.json.Scanner` with complete input or a `*std.json.Reader`.
          /// Note that `error.BufferUnderrun` is not actually possible to return from this function.
          pub fn parseFromTokenSource(
              comptime T: type,
              allocator: Allocator,
              scanner_or_reader: anytype,
              options: ParseOptions,
          ) ParseError(@TypeOf(scanner_or_reader.*))!Parsed(T) {
              var parsed = Parsed(T){
                  .arena = try allocator.create(ArenaAllocator),
                  .value = undefined,
              };
              errdefer allocator.destroy(parsed.arena);

timfayz · May 22, 2024, 11:16am

I’m not sure I understood exactly what you meant.
Did you mean I can do the following?

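const Printer = struct {
    arena: *std.heap.ArenaAllocator,
    alloc: std.mem.Allocator,
    ...

    // Returns !Printer because alloc.create() can fail.
    fn init(alloc: std.mem.Allocator) !Printer {
        var printer = Printer{
            .arena = try alloc.create(std.heap.ArenaAllocator),
            .alloc = undefined,
            ...
        };

        // create() returns uninitialized memory, so set up the arena first.
        printer.arena.* = std.heap.ArenaAllocator.init(alloc);
        printer.alloc = printer.arena.allocator();
        ...

        return printer;
    }
};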

@dimdin Sorry, I am also not sure what the return self.arena.allocator; line means here. There is no field named allocator in the ArenaAllocator struct. If that function is meant to return .child_allocator (which does exist in the struct), that certainly defeats the whole point of using an arena allocator.

dude_the_builder · May 22, 2024, 11:22am

I think you can have an init that takes a pointer to an uninitialized Printer:

    alloc: std.mem.Allocator,

    fn init(self: *Printer, alloc: std.mem.Allocator) void {
        self.* = .{
            .arena = std.heap.ArenaAllocator.init(alloc),
            .buffer = std.ArrayList(u8).init(alloc),
            .alloc = undefined, // set below, once the arena has its final address
        };

        self.alloc = self.arena.allocator();
    }

then

var printer: Printer = undefined;
printer.init(allocator);
// or Printer.init(&printer, allocator);
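
A self-contained sketch of this in-place init pattern, assuming a hypothetical Holder struct that stands in for Printer (no buffer field, to keep it short). The test checks that the stored allocator really points at the arena field inside the caller's variable.

```zig
const std = @import("std");

// Hypothetical stand-in for Printer, reduced to the two relevant fields.
const Holder = struct {
    arena: std.heap.ArenaAllocator,
    alloc: std.mem.Allocator,

    // Initializes the struct in place, so `self.arena` already has its
    // final address when the allocator is captured.
    fn init(self: *Holder, child: std.mem.Allocator) void {
        self.* = .{
            .arena = std.heap.ArenaAllocator.init(child),
            .alloc = undefined,
        };
        self.alloc = self.arena.allocator();
    }

    fn deinit(self: *Holder) void {
        self.arena.deinit();
        self.* = undefined;
    }
};

test "stored allocator points at the arena field" {
    var holder: Holder = undefined;
    holder.init(std.testing.allocator);
    defer holder.deinit();

    // The allocator's context pointer is the address of holder.arena.
    try std.testing.expect(@intFromPtr(holder.alloc.ptr) == @intFromPtr(&holder.arena));

    // Allocations through the stored allocator are freed by deinit().
    const bytes = try holder.alloc.alloc(u8, 16);
    try std.testing.expectEqual(@as(usize, 16), bytes.len);
}
```

One caveat with this design: the struct must not be copied or moved after init(), because the stored allocator keeps pointing at the original arena field.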

dimdin · May 22, 2024, 11:26am

My mistake.
The correct version is:

fn allocator(self: *Printer) std.mem.Allocator {
    return self.arena.allocator();
}

aiac · May 22, 2024, 11:41am

This should be feasible

fn init(alloc: std.mem.Allocator) Printer {
      const arena = std.heap.ArenaAllocator.init(alloc);
      const allocator = arena.allocator();
      return Printer{
          .arena = arena,
          .buffer = std.ArrayList(u8).init(allocator),
      };
  }

fn allocator(self: *Printer) std.mem.Allocator {
    return self.arena.allocator();
}

fn deinit(self: *Printer) void {
    self.arena.deinit();
    self.* = undefined;
}

dude_the_builder · May 22, 2024, 11:57am

This would use pointers to temporary memory. @AndrewCodeDev explains this in example 3 of “Pointers to Temporary Memory.”

NOTE: The arena can’t be const, so it would have to be allocated on the stack in this example.
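
To make the problem concrete, here is a minimal sketch using a hypothetical Broken struct (not from the thread). It captures the allocator from a local arena and then returns the struct by value. The test only compares addresses and never allocates through the stale allocator.

```zig
const std = @import("std");

// Hypothetical struct showing the pitfall; do not use this pattern.
const Broken = struct {
    arena: std.heap.ArenaAllocator,
    alloc: std.mem.Allocator,

    fn init(child: std.mem.Allocator) Broken {
        var arena = std.heap.ArenaAllocator.init(child);
        return .{
            .arena = arena, // copies the arena into the result
            .alloc = arena.allocator(), // still points at the local `arena`
        };
    }
};

test "allocator captured from a local does not follow the copy" {
    const broken = Broken.init(std.testing.allocator);
    defer broken.arena.deinit();

    // The stored context pointer refers to init()'s dead stack slot,
    // not to the arena that now lives inside `broken`.
    try std.testing.expect(@intFromPtr(broken.alloc.ptr) != @intFromPtr(&broken.arena));
}
```

Allocating through broken.alloc would read and write memory that no longer belongs to any arena, which is why the in-place init or a heap-allocated arena is needed.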

timfayz · May 22, 2024, 12:00pm

@dude_the_builder That is a good trick I should say!

@dimdin Hmm… My main concern was that I want to avoid calling arena.allocator() in the first place. I think it’s a bit too much to do the following on every call:

    pub fn allocator(self: *ArenaAllocator) Allocator {
        return .{
            .ptr = self,
            .vtable = &.{
                .alloc = alloc,
                .resize = resize,
                .free = free,
            },
        };
    }

@aiac I think in the init body you are using stack pointers that will be invalidated as soon as the function exits. Second, the allocator for the buffer should be independent of the arena because the buffer is the result of a traversal that will be used after arena.deinit().

aiac · May 22, 2024, 12:08pm

maybe this? (just like std.json.Parsed)

fn init(alloc: std.mem.Allocator) !Printer {
    // Put the arena on the heap so its address survives the return.
    const arena = blk: {
        const result = try alloc.create(std.heap.ArenaAllocator);
        errdefer alloc.destroy(result);
        result.* = std.heap.ArenaAllocator.init(alloc);
        break :blk result;
    };
    errdefer arena.deinit();

    return Printer{
        .arena = arena,
        .buffer = std.ArrayList(u8).init(arena.allocator()),
    };
}

fn allocator(self: Printer) std.mem.Allocator {
    return self.arena.allocator();
}

fn deinit(self: *Printer) void {
    // Named `child` so it does not shadow the allocator() method above.
    const child = self.arena.child_allocator;
    self.arena.deinit();
    child.destroy(self.arena);
    self.* = undefined;
}

Sze · May 22, 2024, 12:23pm

For the compiler this is just:

return .{
    .ptr = self,
    .vtable = static_address,
};

So it is a bit like a fat pointer and should cost about the same as passing around a slice. If anything, I would expect a slice to be more costly, because both its pointer and its length are dynamic, whereas here only the pointer is dynamic.

Unless you have measured this to be a performance problem, I think it is silly to try to avoid it. Having fewer fields on your printer struct may even make it easier to pass around, which could be better than avoiding a simple recompute (one that can be optimized quite well).

I think most other things should be considered for optimization before trying to save a single call to allocator(), even when you have a bunch of calls.
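
The "fat pointer" description can be checked with a short test. This is only a sketch: it relies on std.mem.Allocator having exactly the two fields ptr and vtable, as in the snippet quoted above.

```zig
const std = @import("std");

test "Allocator is a pointer pair" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();

    // Two independent calls to allocator() on the same arena.
    const a = arena.allocator();
    const b = arena.allocator();

    // The interface value is just two pointers wide.
    try std.testing.expect(@sizeOf(std.mem.Allocator) == 2 * @sizeOf(usize));
    // Both calls yield the same context pointer (the arena itself)...
    try std.testing.expect(a.ptr == b.ptr);
    // ...and the same statically allocated vtable.
    try std.testing.expect(a.vtable == b.vtable);
}
```

In other words, each call to allocator() only pairs the arena's address with a constant, so there is no allocation or setup work to save by caching it.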

IntegratedQuantum · May 22, 2024, 12:59pm

On the contrary, it is probably even faster, because the compiler knows the allocator implementation. If you store the allocator in the struct, the compiler no longer knows this and has to resort to runtime function calls.
Check out this godbolt: Compiler Explorer
As you can see, the version that uses arena.allocator() has fewer instructions and jumps to a static address, whereas the version that uses a prefilled allocator jumps to a dynamic address (which is usually slower).

timfayz · May 24, 2024, 10:34am

Thank you all guys, especially @Sze and @IntegratedQuantum for hitting the root of my problem. I tested the version that passes arena.allocator() as an argument to the traverse function (1) vs inserting arena.allocator() directly where it’s needed (2):

     fn toOwnedSlice(s: *Printer, val: anytype) ![]u8 {
1        try s.traverse(s.arena.allocator(), val, 0);
2        try s.traverse(val, 0);
         ...
     }
 
1    fn traverse(s: *Printer, alloc: std.mem.Allocator, val: anytype, comptime depth: usize) !void {
2    fn traverse(s: *Printer, val: anytype, comptime depth: usize) !void {
        ... 
1       interim = try std.fmt.allocPrint(alloc, "ptr:{p}\n", .{val});
2       interim = try std.fmt.allocPrint(s.arena.allocator(), "ptr:{p}\n", .{val});

Hyperfine showed a slight speed boost over the (2) version. So I ended up using it.