Segmentation fault when using arena allocator

luizcoro · September 26, 2023, 5:37pm

Hello, I’m trying to use the arena allocator to allocate some temporary pages in memory, then deallocate everything after flushing these pages to disk.

The fields of my struct look like this:

fd: std.os.fd_t,
root_ptr: u64,
mmap: struct {
    file_size: u64,
    mmap_size: u64,
    chunks: std.ArrayList([]align(page_size) u8),
},
tmp: struct {
    n_flushed: u64,
    pages: std.ArrayList([]u8),
    arena: std.heap.ArenaAllocator,
},
allocator: std.mem.Allocator,

I initialize some of these fields like this:

var self: Self = undefined;
self.fd = try std.os.open(path, std.os.O.RDWR | std.os.O.CREAT, 0o0644);
self.mmap.chunks = std.ArrayList([]align(page_size) u8).init(allocator);
self.tmp.pages = std.ArrayList([]u8).init(allocator);
self.tmp.arena = std.heap.ArenaAllocator.init(allocator);
self.allocator = self.tmp.arena.allocator();

errdefer self.deinit();

// some mmap related stuff...

But when I try to allocate a page like this inside some other function:

var tmp_page = try self.allocator.alloc(u8, page_size);

I get this error:

run test: error: Segmentation fault at address 0x0
/home/luiz/zig/zig-linux-x86_64-0.12.0-dev.415+5af5d87ad/lib/std/heap/arena_allocator.zig:186:77: 0x240cb1 in alloc (test)
            const cur_alloc_buf = @as([*]u8, @ptrCast(cur_node))[0..cur_node.data];
                                                                            ^
/home/luiz/zig/zig-linux-x86_64-0.12.0-dev.415+5af5d87ad/lib/std/mem/Allocator.zig:215:29: 0x23f4ec in allocBytesWithAlignment__anon_5707 (test)
    // The Zig Allocator interface is not intended to solve alignments beyond
                            ^
/home/luiz/zig/zig-linux-x86_64-0.12.0-dev.415+5af5d87ad/lib/std/mem/Allocator.zig:211:40: 0x23b44d in allocWithSizeAndAlignment__anon_4391 (test)
    return self.allocBytesWithAlignment(alignment, byte_count, return_address);
                                       ^
/home/luiz/zig/zig-linux-x86_64-0.12.0-dev.415+5af5d87ad/lib/std/mem/Allocator.zig:137:75: 0x22acc1 in alloc__anon_2412 (test)
    comptime optional_alignment: ?u29,
                                                                          ^
/home/luiz/Documents/codes/zig/zdb/src/DiskPager.zig:102:44: 0x231df9 in new (test)
    var tmp_page = try self.allocator.alloc(u8, page_size);
                                           ^

If instead I allocate like this:

var tmp_page = try self.tmp.arena.allocator().alloc(u8, page_size);

Everything works fine, but I would much prefer to have a separate allocator field, as it is part of the API I’m planning. How can I solve this problem when using the self.allocator field?

luizcoro · September 26, 2023, 6:04pm

I think I have the answer… hehe

Looking at the code in arena_allocator.zig, I realized that the allocator() function wraps the address of the arena. My code is inside an init function, which probably copies the Self struct when returning it, so the stored arena address no longer points at the live arena.

But now I don’t know how to solve it… should I allocate the Self struct on the heap?

edit: this indeed solves the problem
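For readers following along, here is a minimal sketch of what the heap-allocated version could look like. The `Pager` struct and its `create`/`destroy` functions are hypothetical stand-ins for the real `Self`, and the code assumes the allocator API of roughly Zig 0.11 to 0.12.

```zig
const std = @import("std");

// Hypothetical stand-in for the poster's Self struct.
const Pager = struct {
    arena: std.heap.ArenaAllocator,
    allocator: std.mem.Allocator,

    // The struct is created on the heap, so the address of its `arena`
    // field stays the same after `create` returns.
    pub fn create(gpa: std.mem.Allocator) !*Pager {
        const self = try gpa.create(Pager);
        errdefer gpa.destroy(self);
        self.arena = std.heap.ArenaAllocator.init(gpa);
        // The interface stores a pointer to self.arena, which no longer moves.
        self.allocator = self.arena.allocator();
        return self;
    }

    pub fn destroy(self: *Pager) void {
        // Read the backing allocator before tearing the arena down.
        const gpa = self.arena.child_allocator;
        self.arena.deinit();
        gpa.destroy(self);
    }
};
```

Because `create` returns a pointer rather than a copy, `pager.allocator.alloc(u8, page_size)` keeps working for as long as the `Pager` itself is alive.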

AndrewCodeDev · September 26, 2023, 6:42pm

luizcoro:

var self: Self = undefined;
self.fd = try std.os.open(path, std.os.O.RDWR | std.os.O.CREAT, 0o0644);
self.mmap.chunks = std.ArrayList([]align(page_size) u8).init(allocator);
self.tmp.pages = std.ArrayList([]u8).init(allocator);
self.tmp.arena = std.heap.ArenaAllocator.init(allocator);
self.allocator = self.tmp.arena.allocator();

In general, I would not make the arena a part of the struct using it. Allocators ideally should give handles to memory that subsidiary structures “own” (in reality, they are renting it… especially from a stateful allocator).

In fact, this is part of why the .allocator() interface exists. It hands a pointer to an allocator for a structure to use. That way, your arena can be off-site of the structure and is not tied into the lifetimes of your objects.

For instance, with ArrayList, you’ll notice that it only takes an allocator interface. It doesn’t destroy the allocator when it’s done; it just returns its memory to the allocator.

Here’s its deinit call, with no deinit called against the allocator itself:


        /// Release all allocated memory.
        pub fn deinit(self: Self) void {
            if (@sizeOf(T) > 0) {
                self.allocator.free(self.allocatedSlice());
            }
        }
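Applied to the pager from the question, that pattern might look like the sketch below. The `Pager` struct and its `newPage` method are hypothetical names, the arena is owned by the caller, and the managed `std.ArrayList(T).init(allocator)` form assumes roughly Zig 0.11 to 0.12.

```zig
const std = @import("std");

// Hypothetical stand-in for the pager: it stores only the allocator
// interface, never the arena itself.
const Pager = struct {
    allocator: std.mem.Allocator,
    pages: std.ArrayList([]u8),

    pub fn init(allocator: std.mem.Allocator) Pager {
        return .{
            .allocator = allocator,
            .pages = std.ArrayList([]u8).init(allocator),
        };
    }

    pub fn newPage(self: *Pager, size: usize) ![]u8 {
        const page = try self.allocator.alloc(u8, size);
        try self.pages.append(page);
        return page;
    }
};

pub fn main() !void {
    // The arena lives in the caller's scope, so its address never changes
    // while the pager is using it.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit(); // frees every page and the list's storage at once

    var pager = Pager.init(arena.allocator());
    const page = try pager.newPage(4096);
    @memset(page, 0);
}
```

Returning `Pager` by value is safe here because the interface points at `arena` in `main`, not at a field inside the struct being copied.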

LucasSantos · September 26, 2023, 7:04pm

@AndrewCodeDev already explained why this isn’t a good idea. But to solve the general problem of initializing a struct whose initialization code needs the struct’s address, the current solution is to have the init function take the struct by pointer and assume the value is uninitialized. At the call site, declare the struct as undefined and pass a pointer to it to the init function.

fn initRequiresAddress(self: *Self) void {
    // Initialize the fields of self here.
}

fn callsite() void {
    var self: Self = undefined;
    self.initRequiresAddress();
}

There are proposals to make this more ergonomic.
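As a rough illustration with the arena from the question, the init-by-pointer pattern could be sketched as follows. The `Pager` struct is a hypothetical stand-in for `Self`, and the code assumes the `std.heap.ArenaAllocator` API of roughly Zig 0.11 to 0.12.

```zig
const std = @import("std");

// Hypothetical stand-in for the poster's Self struct.
const Pager = struct {
    arena: std.heap.ArenaAllocator,
    allocator: std.mem.Allocator,

    // `self` already sits at its final address, so the interface created
    // here keeps pointing at a live `arena` field.
    pub fn init(self: *Pager, backing: std.mem.Allocator) void {
        self.arena = std.heap.ArenaAllocator.init(backing);
        self.allocator = self.arena.allocator();
    }

    pub fn deinit(self: *Pager) void {
        self.arena.deinit();
    }
};

pub fn main() !void {
    var pager: Pager = undefined;
    pager.init(std.heap.page_allocator);
    defer pager.deinit();

    const page = try pager.allocator.alloc(u8, 4096);
    @memset(page, 0);
}
```

This only holds while `pager` is not copied or moved after `init`; a copy would carry an `allocator` field that still points at the original's `arena`.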

AndrewCodeDev · September 26, 2023, 7:20pm

@luizcoro I don’t usually do this, but I am unmarking the solution provided. I understand that heap allocation will circumvent the current pointer issue, but in general this isn’t how the allocator interface was made to be used. We’re still tangling lifetimes here.

@LucasSantos provided a way to do this without heap allocation; however, as @LucasSantos mentioned, this is still only a workaround for your current situation.

Again, I’m only doing this because I want to promote best practices.

luizcoro · September 26, 2023, 7:34pm

I totally understand. I had that lone allocator field inside Self as an API for another struct to use. I think I understood your comment: instead of keeping a separate allocator field inside Self, which forces Self to be allocated on the heap, Self just provides a way to get the handle to whichever struct needs it, at the time it is needed. There is no address problem now since, in the scope initializing the other struct, everything is already set up. Thank you very much!
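One possible reading of this approach, sketched with hypothetical names (`Pager`, `tmpAllocator`) and assuming roughly Zig 0.11 to 0.12: the struct keeps the arena but no stored interface, and builds the interface on demand from its current address.

```zig
const std = @import("std");

// Hypothetical stand-in for the poster's Self struct.
const Pager = struct {
    arena: std.heap.ArenaAllocator,

    // Copying a freshly initialized arena is harmless: nothing points at
    // it yet and it has not allocated anything.
    pub fn init(backing: std.mem.Allocator) Pager {
        return .{ .arena = std.heap.ArenaAllocator.init(backing) };
    }

    pub fn deinit(self: *Pager) void {
        self.arena.deinit();
    }

    // Builds the interface from the struct's current address each time,
    // so no stale pointer is ever stored in a field.
    pub fn tmpAllocator(self: *Pager) std.mem.Allocator {
        return self.arena.allocator();
    }
};

pub fn main() !void {
    var pager = Pager.init(std.heap.page_allocator);
    defer pager.deinit(); // also releases the list's memory

    // The other struct receives the handle once the pager is in place.
    var list = std.ArrayList(u8).init(pager.tmpAllocator());
    try list.appendSlice("page data");
}
```

The handle is only valid while `pager` stays where it is, so it is best requested in the scope that finally owns the pager.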

AndrewCodeDev · September 26, 2023, 7:35pm

There is no address problem now since, in the scope initializing the other struct, everything is already set up.

@luizcoro Sounds like you’re on a better track! I’m marking your new comment as the solution.