A question about memory allocation

if i have a struct like this one

const A1 = struct {
  big_arr_1: [32768]u32,
  big_arr_2: [32768]u32,

  pub fn init(allocator: std.mem.Allocator) !*A1 {
      return try allocator.create(A1);
  }
};

is it safe to assume that big_arr_1 and big_arr_2 will also be allocated on the heap?
i’m asking because i’m getting massive FPS drops if those two arrays are specified with a compile-time-known length,

whereas allocating them at runtime with an allocator removes those FPS drops.

i don’t know the exact source of the FPS drops; i just know that making one change, runtime allocating those arrays, solves them.

Arrays in Zig are not secret pointers (unlike in some languages, cough C cough); they are the full memory, as a value.

Since you are allocating the struct they are in, yes they will be on the heap at runtime.

I think you are talking about slices, which don’t have a comptime-known length. In both cases they are heap allocated at runtime.
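A tiny sketch of that difference (hypothetical example, not from the thread):

```zig
const std = @import("std");

pub fn main() void {
    var a: [4]u32 = .{ 1, 2, 3, 4 };

    var b = a; // copies all 16 bytes: an array is a value
    b[0] = 99; // a[0] is still 1

    const s: []u32 = &a; // a slice is just a pointer + length
    s[0] = 42; // writes through to a

    std.debug.print("a[0] = {}\n", .{a[0]});
}
```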

Can you share the code where you use them, ideally the code that causes the issue if you can narrow that down?

And what zig version are you using?

Unnecessary memory copies are the likely cause. Zig used to have a bug where it copied arrays when indexing, but that should be fixed. It’s also possible that you have code that is copying them when it shouldn’t be.

The arrays get allocated wherever the allocator passed to the init method puts them, e.g. the heap or the stack.

Here is a page allocator passed to init (heap):
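Something like this sketch, reusing the A1 struct from the question (not the exact snippet from the post):

```zig
const std = @import("std");

const A1 = struct {
    big_arr_1: [32768]u32,
    big_arr_2: [32768]u32,

    pub fn init(allocator: std.mem.Allocator) !*A1 {
        return try allocator.create(A1);
    }
};

pub fn main() !void {
    // page_allocator requests pages straight from the OS, so a1
    // and both arrays inside it live on the heap.
    const a1 = try A1.init(std.heap.page_allocator);
    defer std.heap.page_allocator.destroy(a1);
    a1.big_arr_1[0] = 1;
}
```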

Here is a fixed buffer allocator passed to init (stack):
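Again a sketch with the same A1 struct (buffer sizing is illustrative; note this puts ~256 KiB on the stack, which is fine under a typical 8 MiB default stack but worth keeping in mind):

```zig
const std = @import("std");

const A1 = struct {
    big_arr_1: [32768]u32,
    big_arr_2: [32768]u32,

    pub fn init(allocator: std.mem.Allocator) !*A1 {
        return try allocator.create(A1);
    }
};

pub fn main() !void {
    // The backing buffer is a local variable, so the "allocation"
    // lives in this function's stack frame.
    var buf: [@sizeOf(A1)]u8 align(@alignOf(A1)) = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const a1 = try A1.init(fba.allocator());
    a1.big_arr_1[0] = 1;
}
```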

ah, that makes sense. here’s how i use those two arrays.

pub fn getBlock(self: *const LocalChunk, index: usize) Block {
    return @atomicLoad(Constants.Block, &self.blocks[index], std.builtin.AtomicOrder.acquire);
}

pub fn setBlock(self: *LocalChunk, index: usize, new_block: Block) BlockLight {
    @atomicStore(Constants.Block, &self.blocks[index], new_block, std.builtin.AtomicOrder.release);
    const data = new_block.blockData().light;
    self.light_emitter.setBit3D(index, data.emits);
    return self.fetchUpdateLight(index, data.default);
}

pub fn setLight(self: *LocalChunk, index: usize, new_light: BlockLight) void {
    @atomicStore(Constants.BlockLight, &self.lights[index], new_light, std.builtin.AtomicOrder.release);
}

pub fn getLight(self: *const LocalChunk, index: usize) BlockLight {
    return @atomicLoad(Constants.BlockLight, &self.lights[index], std.builtin.AtomicOrder.acquire);
}

pub fn fetchUpdateLight(self: *const LocalChunk, index: usize, new_light: BlockLight) BlockLight {
    return @atomicRmw(Constants.BlockLight, &self.lights[index], std.builtin.AtomicRmwOp.Xchg, new_light, std.builtin.AtomicOrder.acq_rel);
}

those are the only uses of those two arrays; everything else calls those funcs to get or set elements. right now the arrays are runtime allocated, as is the struct itself, so those funcs are built with that in mind

also just noticed fetchUpdateLight takes a constant pointer to the chunk even though it mutates the struct, interesting.

Check the assembly to see if it’s copying the whole array. Godbolt is good, but it’s a website, so it’s not usable unless you can reduce the code to something small.

There are various tools to see assembly locally; objdump is what I use on Linux. It’s usually preinstalled or comes with other dev tools.

Debuggers also typically let you see assembly.

It’s also worth asking which optimize modes the slowdown occurs in. It’s still worth reporting regardless, but you could work around the issue by using a different mode. If you’re using Debug on x86_64 Linux, then you’d be using the self-hosted backend, which has less optimal output but compiles faster; you can force the use of LLVM in the exe/lib/etc. options.
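In build.zig that would look roughly like this (a sketch; the `use_llvm` field and `createModule` shape match recent Zig master, so adjust for your version):

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const exe = b.addExecutable(.{
        .name = "app",
        .root_module = b.createModule(.{
            .root_source_file = b.path("src/main.zig"),
            .target = b.standardTargetOptions(.{}),
            .optimize = b.standardOptimizeOption(.{}),
        }),
        // Force the LLVM backend even for Debug builds, instead of
        // the faster-to-compile self-hosted backend.
        .use_llvm = true,
    });
    b.installArtifact(exe);
}
```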

That is a serious bug, but do confirm it’s actually a problem and not just a typo.

Oh no, this was straight up copied from my source code.
fetchUpdateLight is used in the program, so it’s not being ignored.
i’ll make a bug report with reproducible code soon.
EDIT: ^ this only works on heap-allocated slices; on arrays, the compiler knows it’s a constant and throws an error

Just tested my code; the issue still happens in Debug, ReleaseFast, and ReleaseSafe modes.
(i use latest master)


i don’t really know how to read asm, but i got a minimal asm dump of the relevant functions using -fstrip and -O ReleaseSmall. this is with the arrays set to a comptime-known length

	.intel_syntax noprefix
	.file	"local_chunk"
	.text
	.globl	getBlock
	.type	getBlock,@function
getBlock:
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
	.cfi_def_cfa_register rbp
	mov	eax, dword ptr [rdi + 4*rsi]
	pop	rbp
	.cfi_def_cfa rsp, 8
	ret
.Lfunc_end0:
	.size	getBlock, .Lfunc_end0-getBlock
	.cfi_endproc

	.globl	setBlock
	.type	setBlock,@function
setBlock:
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
.Lfunc_end1:
	.size	setBlock, .Lfunc_end1-setBlock
	.cfi_endproc

	.globl	fetchUpdateLight
	.type	fetchUpdateLight,@function
fetchUpdateLight:
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
	.cfi_def_cfa_register rbp
	mov	eax, edx
	xchg	dword ptr [rdi + 4*rsi + 131072], eax
	pop	rbp
	.cfi_def_cfa rsp, 8
	ret
.Lfunc_end2:
	.size	fetchUpdateLight, .Lfunc_end2-fetchUpdateLight
	.cfi_endproc

	.globl	setLight
	.type	setLight,@function
setLight:
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
	.cfi_def_cfa_register rbp
	mov	dword ptr [rdi + 4*rsi + 131072], edx
	pop	rbp
	.cfi_def_cfa rsp, 8
	ret
.Lfunc_end3:
	.size	setLight, .Lfunc_end3-setLight
	.cfi_endproc

	.globl	getLight
	.type	getLight,@function
getLight:
	.cfi_startproc
	push	rbp
	.cfi_def_cfa_offset 16
	.cfi_offset rbp, -16
	mov	rbp, rsp
	.cfi_def_cfa_register rbp
	mov	eax, dword ptr [rdi + 4*rsi + 131072]
	pop	rbp
	.cfi_def_cfa rsp, 8
	ret
.Lfunc_end4:
	.size	getLight, .Lfunc_end4-getLight
	.cfi_endproc

	.section	".note.GNU-stack","",@progbits

You’re sure the callsites are in turn used, so the function is actually analyzed? I have very similar code using rmw, and making the pointer argument const gives me the expected “cast discards const qualifier” error on master (just pulled today). 0.15.2 gives the same error on that code.

apologies, i should’ve clarified: i filed a bug report; it only happens with heap-allocated slices.
https://codeberg.org/ziglang/zig/issues/31444 is the issue i made.

Ah, interesting case, but I don’t think it has anything to do with heap allocation.
Same thing here without heap allocation:

const std = @import("std");

const Foo = struct {
    arr_1: []u32,
    arr_2: []u32,

    pub fn constButStillMutating(self: *const Foo, d: u32, index: usize) u32 {
        return @atomicRmw(u32, &self.arr_1[index], .Xchg, d, .acq_rel);
    }
};

pub fn main() !void {
    var arr1: [2]u32 = .{ 0, 0 };
    var arr2: [2]u32 = .{ 0, 0 };

    var v: Foo = undefined;
    var f = &v;
    f.* = .{
        .arr_1 = &arr1,
        .arr_2 = &arr2,
    };
    _ = f.constButStillMutating(2, 1);
}

The difference between your code and my own is that you have slices in the fields and I use arrays, so there’s one more level of indirection in your case. So it’s probably fine.

To clarify, I think you’re just seeing constness not propagating, which is how Zig works:

const std = @import("std");

const MutableStruct = struct {
    value: []i32 = undefined,
};

pub fn mutate(ptr: *const MutableStruct, new_value: i32) void {
    ptr.value[0] = new_value;
}

pub fn main() void {
    var arr: [2]i32 = .{ 0, 0 };

    const myStruct = MutableStruct{ .value = &arr };
    mutate(&myStruct, 20);
    std.debug.print("After mutation: {any}\n", .{myStruct.value});
}

🤦 the array version probably mixed us both up, as that one would be a problem. Of course slices are pointers and have their own constness.

Speaking of, if the data is in slices, then OP probably wouldn’t need to heap allocate the struct. I don’t know if they were doing that, but I figured I should point it out.
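i.e. something like this sketch (hypothetical field types; only the element buffers come from the allocator, while the struct itself is a plain stack value):

```zig
const std = @import("std");

const LocalChunk = struct {
    blocks: []u32, // placeholder element types for the sketch
    lights: []u32,
};

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    // The struct lives on the stack; only the slices' backing
    // memory is heap allocated.
    const chunk = LocalChunk{
        .blocks = try gpa.alloc(u32, 32768),
        .lights = try gpa.alloc(u32, 32768),
    };
    defer gpa.free(chunk.blocks);
    defer gpa.free(chunk.lights);
    chunk.blocks[0] = 1;
}
```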
