Grouping variables by struct vs grouping them by prefix

chung-leong · June 29, 2024, 10:07pm

In a function I was working on there are three local variables:

var binding_size: usize = 0;
var binding_offset: usize = 0;
var binding_byte_codes: ?[]u8 = null;

The thought occurred to me that perhaps I can store these variables in an anonymous struct:

var binding: struct {
     size: usize = 0,
     offset: usize = 0,
     byte_codes: ?[]u8 = null
} = .{};

It seems a bit neater, although the only visible difference is that we’ve replaced the underscore with a period. I wonder if grouping them this way affects optimization.

AndrewCodeDev · June 29, 2024, 10:24pm

One thing that comes to mind is the default field values. With optimizations turned off, I can change the generated code by modifying the defaults:

var binding: struct {
     size: usize = 0,
     offset: usize = 0,
     byte_codes: ?[]u8 = null
} = .{};

.L__unnamed_1:
        .zero   32

Compared to changing one of the defaults to 1:

var binding: struct {
     size: usize = 0,
     offset: usize = 1,
     byte_codes: ?[]u8 = null
} = .{};

.L__unnamed_1:
        .quad   0
        .quad   1
        .zero   16

Whereas this doesn’t generate any unnamed offsite data:

var binding_size: usize = 0;
var binding_offset: usize = 0;
var binding_byte_codes: ?[]u8 = null;

I think the answer is therefore yes - it has side effects in at least one way.

In terms of optimization? I’d like to see this done with less trivial examples at higher optimization levels. In the basic examples that I tested on ReleaseFast, the compiler sees through the numeric values and just computes them directly. We’d have to investigate this more thoroughly.
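A slightly less trivial pair to compare might look like the sketch below. The function names and the loop body are made up for illustration and are not from the original code; the inputs come from the caller so the compiler cannot fold the result to a constant. The functions are marked `export` so they survive into the output and can be compared side by side at different optimization levels.

```zig
const std = @import("std");

// Hypothetical comparison: the same loop written with prefixed locals...
export fn sumWithLocals(data: [*]const u8, len: usize) usize {
    var binding_size: usize = 0;
    var binding_offset: usize = 0;
    var i: usize = 0;
    while (i < len) : (i += 1) {
        binding_offset += data[i];
        binding_size += 1;
    }
    return binding_size + binding_offset;
}

// ...and with the same state grouped as fields of an anonymous struct.
export fn sumWithStruct(data: [*]const u8, len: usize) usize {
    var binding: struct {
        size: usize = 0,
        offset: usize = 0,
    } = .{};
    var i: usize = 0;
    while (i < len) : (i += 1) {
        binding.offset += data[i];
        binding.size += 1;
    }
    return binding.size + binding.offset;
}

test "both groupings compute the same value" {
    const bytes = [_]u8{ 1, 2, 3 };
    try std.testing.expect(sumWithLocals(&bytes, bytes.len) == sumWithStruct(&bytes, bytes.len));
}
```

Both functions should return the same value for the same input; whether the emitted assembly matches is the thing to check, and I haven't verified that here.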

AndrewCodeDev · June 29, 2024, 10:35pm

I can reproduce these side-effects on ReleaseSmall, too.

chung-leong · June 30, 2024, 1:05am

What I wonder about is whether the compiler would think that because the variables are in a struct, it would need to keep their values in an actual location in memory in the event you pass the address of the struct to another function, or is it smart enough to know that it can store them in registers as though they were just independent variables.
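To make the question concrete, here is a minimal sketch of the case I have in mind. `Binding`, `bump`, and the two exported functions are hypothetical names, and `noinline` is only there to keep the optimizer from folding the call away, so the struct's address really does leave the function in one of the two variants.

```zig
const std = @import("std");

const Binding = struct {
    size: usize = 0,
    offset: usize = 0,
};

// Hypothetical helper; noinline so the pointer actually crosses a call boundary.
noinline fn bump(b: *Binding) void {
    b.size += 1;
    b.offset += 8;
}

// The address of `binding` is handed to another function.
export fn escaping() usize {
    var binding: Binding = .{};
    bump(&binding);
    return binding.size + binding.offset;
}

// The address of `binding` is never taken.
export fn notEscaping() usize {
    var binding: Binding = .{};
    binding.size += 1;
    binding.offset += 8;
    return binding.size + binding.offset;
}

test "both variants return the same value" {
    try std.testing.expect(escaping() == notEscaping());
}
```

Comparing the assembly for `escaping` and `notEscaping` would show whether the struct is kept in memory only when its address is passed along.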

AndrewCodeDev · June 30, 2024, 3:14am

In ReleaseFast, it sees through it. So it’s definitely possible. I remember hearing from Jason Turner that passing arguments in structs as opposed to independent variables caused the C++ compiler to emit larger data transfers instead of independent ones (it does something sub-optimal when you get the values back out… I don’t recall what video it was and this was a while ago). I’ve never independently verified this, but it may be worth checking into.

mnemnion · June 30, 2024, 3:43am

According to the C (and therefore, C++) abstract machine, structs must have a defined layout in memory, based on the order of declarations, and required alignment for the system.

The optimizer can still cheat, but it has to be able to get away with it. The program has to behave as though the struct follows layout rules, and since structs are semantically copied at function boundaries, this means that in function calls, they generally do.

Zig structs have no defined memory layout, and the compiler can decide whether a struct argument is passed as a copy or as a pointer. Together these allow more leeway in that regard. I’m keen to know to what degree this is taken advantage of in status quo.
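The layout difference can be observed with `@offsetOf` and `@sizeOf`. In this sketch the type names `CLayout` and `ZigLayout` are made up; an `extern struct` follows the C ABI layout for the target, while a plain `struct` leaves field order and size up to the compiler. The numbers in the test assume a target where `u32` has 4-byte alignment.

```zig
const std = @import("std");

// C ABI layout: fields stay in declaration order, with padding for alignment.
const CLayout = extern struct {
    a: u8,
    b: u32,
    c: u8,
};

// Plain Zig struct: the compiler is free to reorder fields.
const ZigLayout = struct {
    a: u8,
    b: u32,
    c: u8,
};

pub fn main() void {
    std.debug.print("extern: size={d} a={d} b={d} c={d}\n", .{
        @sizeOf(CLayout),
        @offsetOf(CLayout, "a"),
        @offsetOf(CLayout, "b"),
        @offsetOf(CLayout, "c"),
    });
    std.debug.print("plain:  size={d} a={d} b={d} c={d}\n", .{
        @sizeOf(ZigLayout),
        @offsetOf(ZigLayout, "a"),
        @offsetOf(ZigLayout, "b"),
        @offsetOf(ZigLayout, "c"),
    });
}

test "extern struct keeps declaration order" {
    try std.testing.expect(@offsetOf(CLayout, "a") == 0);
    try std.testing.expect(@offsetOf(CLayout, "b") == 4);
    try std.testing.expect(@offsetOf(CLayout, "c") == 8);
}
```

What the plain struct prints may differ between compiler versions and targets, which is rather the point: nothing should rely on it.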

chung-leong · June 30, 2024, 12:07pm

Silly me. Totally forgot that structs also function as namespaces in Zig. For the purpose of grouping variables, decls should be used instead of fields:

fn someFn() void {
    const binding = struct {
        var size: usize = 0;
        var offset: usize = 0;
        var byte_codes: ?[]u8 = null;
    };
    // ...
}
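One thing to keep in mind with this form: `var` decls inside a struct are container-level variables, so they have static lifetime and keep their values between calls, unlike ordinary locals, which are reinitialized on each call. A minimal sketch, with `nextId` and `state` as made-up names:

```zig
const std = @import("std");

// Hypothetical example: `counter` is a container-level variable,
// so it is initialized once and persists across calls.
fn nextId() usize {
    const state = struct {
        var counter: usize = 0;
    };
    state.counter += 1;
    return state.counter;
}

test "decls inside a function-local struct persist across calls" {
    try std.testing.expect(nextId() == 1);
    try std.testing.expect(nextId() == 2);
}
```

So this grouping fits when static storage is what you want. It is not a drop-in replacement for three locals if the function is recursive, called from several threads, or expects fresh values on every call.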