if i have a struct like this one
const A1 = struct {
    big_arr_1: [32768]u32,
    big_arr_2: [32768]u32,

    pub fn init(allocator: std.mem.Allocator) !*A1 {
        return try allocator.create(A1);
    }
};
is it safe to assume that big_arr_1 and big_arr_2 will also be allocated on the heap?
I'm asking because I'm getting massive FPS drops when those two arrays have a compile-time known length, whereas allocating them at runtime with an allocator removes the drops.
I don't know the exact source of the FPS drops; I just know that making that one change, runtime-allocating those arrays, fixes them.
Arrays in Zig are not secret pointers, unlike some languages (cough, C, cough): an array is the full memory, held by value.
Since you are allocating the struct they live in, yes, they will be on the heap at runtime.
I think you are thinking of slices, which don't have a comptime-known length. In both cases the data ends up heap allocated at runtime.
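A quick way to see the difference (a minimal sketch; the struct names here are made up for illustration):

```zig
const std = @import("std");

// A Zig array is the full value; a slice is just a pointer + length.
const WithArray = struct {
    data: [32768]u32, // 128 KiB of elements stored inline in the struct
};

const WithSlice = struct {
    data: []u32, // 16 bytes: pointer + length; elements live elsewhere
};

pub fn main() void {
    // WithArray is 131072 bytes, WithSlice is 16 bytes (on 64-bit targets)
    std.debug.print("WithArray: {d} bytes, WithSlice: {d} bytes\n", .{
        @sizeOf(WithArray), @sizeOf(WithSlice),
    });
}
```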
Can you share the code where you use them, ideally the code that causes the issue if you can narrow that down?
And what zig version are you using?
Unnecessary memory copies are the likely cause. Zig used to have a bug where it copied arrays when indexing, but that should be fixed. It's also possible that you have code that copies them when it shouldn't.
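For instance, a hypothetical sketch of an accidental copy: assigning a big array field to a local makes a full 128 KiB copy, while iterating through a pointer does not.

```zig
const std = @import("std");

const A1 = struct {
    big_arr_1: [32768]u32,
};

fn sumWithCopy(a: *const A1) u32 {
    const arr = a.big_arr_1; // copies all 128 KiB onto the stack
    var total: u32 = 0;
    for (arr) |x| total +%= x;
    return total;
}

fn sumNoCopy(a: *const A1) u32 {
    var total: u32 = 0;
    for (&a.big_arr_1) |*x| total +%= x.*; // iterates via element pointers, no copy
    return total;
}

pub fn main() void {
    var a: A1 = undefined;
    @memset(&a.big_arr_1, 1);
    // both print 32768; the first just does a lot of extra memory traffic
    std.debug.print("{d} {d}\n", .{ sumWithCopy(&a), sumNoCopy(&a) });
}
```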
The arrays get allocated wherever the allocator passed to the init method puts them, e.g. heap or stack.
For example, a page allocator passed to init puts the struct on the heap, while a fixed buffer allocator backed by a stack buffer keeps it on the stack.
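Both cases might look something like this (a minimal sketch, reusing the A1 from the first post):

```zig
const std = @import("std");

const A1 = struct {
    big_arr_1: [32768]u32,
    big_arr_2: [32768]u32,

    pub fn init(allocator: std.mem.Allocator) !*A1 {
        return try allocator.create(A1);
    }
};

pub fn main() !void {
    // Heap: page_allocator places the whole 256 KiB struct on the heap.
    const on_heap = try A1.init(std.heap.page_allocator);
    defer std.heap.page_allocator.destroy(on_heap);
    on_heap.big_arr_1[0] = 1;

    // Stack: a FixedBufferAllocator backed by a local buffer keeps it on the stack.
    var buf: [@sizeOf(A1)]u8 align(@alignOf(A1)) = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const on_stack = try A1.init(fba.allocator());
    on_stack.big_arr_2[0] = 2;
}
```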
Ah, that makes sense. Here's how I use those two arrays:
pub fn getBlock(self: *const LocalChunk, index: usize) Block {
    return @atomicLoad(Constants.Block, &self.blocks[index], std.builtin.AtomicOrder.acquire);
}

pub fn setBlock(self: *LocalChunk, index: usize, new_block: Block) BlockLight {
    @atomicStore(Constants.Block, &self.blocks[index], new_block, std.builtin.AtomicOrder.release);
    const data = new_block.blockData().light;
    self.light_emitter.setBit3D(index, data.emits);
    return self.fetchUpdateLight(index, data.default);
}

pub fn setLight(self: *LocalChunk, index: usize, new_light: BlockLight) void {
    @atomicStore(Constants.BlockLight, &self.lights[index], new_light, std.builtin.AtomicOrder.release);
}

pub fn getLight(self: *const LocalChunk, index: usize) BlockLight {
    return @atomicLoad(Constants.BlockLight, &self.lights[index], std.builtin.AtomicOrder.acquire);
}

pub fn fetchUpdateLight(self: *const LocalChunk, index: usize, new_light: BlockLight) BlockLight {
    return @atomicRmw(Constants.BlockLight, &self.lights[index], std.builtin.AtomicRmwOp.Xchg, new_light, std.builtin.AtomicOrder.acq_rel);
}
Those are the only uses of those two arrays; everything else calls these functions to get or set elements. Right now the arrays are runtime allocated, as is the struct itself, so the functions are written with that in mind.
Also, I just noticed fetchUpdateLight takes a const pointer to the chunk even though it mutates the struct. Interesting.
Check the assembly to see if it's copying the whole array. Godbolt is good, but it's a website, so it's not usable unless you can reduce the code to something small.
There are various tools to see assembly locally; objdump is what I use on Linux, and it's usually preinstalled or comes with other dev tools.
Debuggers also typically let you see assembly.
It's also worth asking which optimize modes the slowdown occurs in. It's still worth reporting regardless, but you could work around the issue by using a different mode. If you're using Debug on x86_64 Linux, then you'd be using the self-hosted backend, which has less optimal output but compiles faster; you can force the use of LLVM in the exe/lib/etc. options.
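Forcing LLVM could look something like this in build.zig (a sketch; `exe` here is assumed to be your existing compile step, and in recent versions the option can also be passed in addExecutable's options struct):

```zig
// build.zig sketch: opt out of the self-hosted backend, even in Debug builds
exe.use_llvm = true;
```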
That would be a serious bug, but do confirm it's actually a problem and not just a typo.
Oh no, this was copied straight from my source code.
fetchUpdateLight is used in the program, so it's not being ignored.
I'll make a bug report with reproducible code soon.
EDIT: ^ this only compiles for heap-allocated slices; with arrays, the compiler knows it's a constant and throws an error.
Just tested my code; the issue still happens in Debug, ReleaseFast, and ReleaseSafe modes.
(I use latest master.)
I don't really know how to read asm, but I got a minimal assembly dump of the relevant functions using -fstrip and -O ReleaseSmall. This is with the arrays set to a comptime-known length:
        .intel_syntax noprefix
        .file "local_chunk"
        .text
        .globl getBlock
        .type getBlock,@function
getBlock:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov eax, dword ptr [rdi + 4*rsi]
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end0:
        .size getBlock, .Lfunc_end0-getBlock
        .cfi_endproc
        .globl setBlock
        .type setBlock,@function
setBlock:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
.Lfunc_end1:
        .size setBlock, .Lfunc_end1-setBlock
        .cfi_endproc
        .globl fetchUpdateLight
        .type fetchUpdateLight,@function
fetchUpdateLight:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov eax, edx
        xchg dword ptr [rdi + 4*rsi + 131072], eax
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end2:
        .size fetchUpdateLight, .Lfunc_end2-fetchUpdateLight
        .cfi_endproc
        .globl setLight
        .type setLight,@function
setLight:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov dword ptr [rdi + 4*rsi + 131072], edx
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end3:
        .size setLight, .Lfunc_end3-setLight
        .cfi_endproc
        .globl getLight
        .type getLight,@function
getLight:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov eax, dword ptr [rdi + 4*rsi + 131072]
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end4:
        .size getLight, .Lfunc_end4-getLight
        .cfi_endproc
        .section ".note.GNU-stack","",@progbits
You’re sure the callsites are in turn used, so the function is actually analyzed? I have very similar code using rmw, and making the pointer argument const gives me the expected “cast discards const qualifier” error on master (just pulled today). 0.15.2 gives the same error on that code.
Apologies, I should've clarified: I filed a bug report, and it only happens with heap-allocated slices.
https://codeberg.org/ziglang/zig/issues/31444 is the issue i made.
Ah, interesting case, but I don’t think it has anything to do with heap allocation.
Same thing here without heap allocation:
const std = @import("std");

const Foo = struct {
    arr_1: []u32,
    arr_2: []u32,

    pub fn constButStillMutating(self: *const Foo, d: u32, index: usize) u32 {
        return @atomicRmw(u32, &self.arr_1[index], .Xchg, d, .acq_rel);
    }
};

pub fn main() !void {
    var arr1: [2]u32 = .{ 0, 0 };
    var arr2: [2]u32 = .{ 0, 0 };
    var v: Foo = undefined;
    var f = &v;
    f.* = .{
        .arr_1 = &arr1,
        .arr_2 = &arr2,
    };
    _ = f.constButStillMutating(2, 1);
}
The difference between your code and mine is that you have slices in the fields while I use arrays, so there's one more level of indirection in your case. So it's probably fine.
To clarify, I think you're just seeing that constness doesn't propagate through pointers, which is how Zig works:
const std = @import("std");

const MutableStruct = struct {
    value: []i32 = undefined,
};

pub fn mutate(ptr: *const MutableStruct, new_value: i32) void {
    ptr.value[0] = new_value;
}

pub fn main() void {
    var arr: [2]i32 = .{ 0, 0 };
    const myStruct = MutableStruct{ .value = &arr };
    mutate(&myStruct, 20);
    std.debug.print("After mutation: {any}\n", .{myStruct.value});
}
The array version probably mixed us both up, as that one would be a problem. Of course slices are pointers and have their own constness.
Speaking of which: if the data is in slices, then OP probably wouldn't need to heap allocate the struct at all. I don't know if they were doing that, but I figured I should point it out.
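E.g. a minimal sketch (element types simplified to u32 from OP's Block/BlockLight):

```zig
const std = @import("std");

const LocalChunk = struct {
    blocks: []u32,
    lights: []u32,
};

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    // Only the element storage is heap-allocated; the struct itself can live
    // on the stack, since it is just two slices (32 bytes on 64-bit targets).
    const chunk = LocalChunk{
        .blocks = try gpa.alloc(u32, 32768),
        .lights = try gpa.alloc(u32, 32768),
    };
    defer gpa.free(chunk.blocks);
    defer gpa.free(chunk.lights);
    chunk.blocks[0] = 1;
}
```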