[Theoretical matter] Is it possible to compile and access a non-existent field?

I remember (no guarantee that it’s correct) that I was once able to compile a program that took an anytype value and accessed a “random” field. It felt like, depending on which structure I passed, it could work in one case and panic in another. However, that lingering impression may simply be because ZLS could not infer a tricky chain of type relationships.

I tried to reproduce this behaviour but couldn’t trick the compiler:

const std = @import("std");

pub fn accessWrongField(val: anytype) void {
    const val_T = @TypeOf(val);
    if (@typeInfo(val_T) == .Struct) {
        if (@hasField(val_T, "Field")) {
            for (val.Field) |*item| {
                _ = item;
                // const lets_try = item.wrong_field;
            }
        }
    }
}

pub fn main() !void {
    const Struct = struct { Field: []anyopaque };
    accessWrongField(Struct{ .Field = undefined });
}

This program appears to compile successfully, but running it produces "zsh: trace trap zig run file.zig" with a return code of 133. In any case, I wanted the commented-out line to work (though it is good that it doesn’t).

My understanding is that in “low-level” languages like C or Zig, the .field access is a result of pointer arithmetic—we calculate sizes of structs and the necessary shifts for accessing particular fields based on their type sizes. So, if there is an example where Zig allows compiling access to a field without knowing its type and size, it kind of ruins my world :smiley:. It would be very helpful to have some background or examples that provide insight into how this accessing works (or, in case it doesn’t, why) :slight_smile:.

It’s not compiling. The SIGTRAP is being triggered by the compiler. You can’t have a slice of anyopaque, because anyopaque doesn’t have a known size. Switch Field to []*anyopaque and it works. Field access is indeed just pointer arithmetic, and it’s trivial to access any field in any weird way that you wish.

const T = struct{
    f: u32,
};
const U = struct{
    f: u16,
    g: u16,
};

const t: T = undefined;
// &t is a *const T, so the target pointer has to be const as well.
const u: *const U = @ptrCast(&t);
// u.f accesses the first two bytes of t.f

We kind of do this when we use std.mem.sliceAsBytes. The entire struct is seen as just a bunch of bytes. When doing aliasing, you have to be careful with alignment and padding. Also, in Zig, structs that are not extern don’t have a defined layout.
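To make the "pointer arithmetic" point concrete, here is a minimal sketch, assuming an extern struct so the layout is defined. Header, its fields, and lenAddressMatchesOffset are hypothetical names made up for this illustration. It checks that the address of a field is the base address plus @offsetOf, and then views the same struct as bytes with std.mem.asBytes.

```zig
const std = @import("std");

// `extern` gives the struct a defined, C-compatible layout, so offsets are stable.
// `Header` and its fields are hypothetical names used only for this sketch.
const Header = extern struct {
    tag: u8,
    len: u32,
    crc: u16,
};

/// Returns true when the address of `h.len` equals the struct's base address
/// plus the compile-time offset of that field.
fn lenAddressMatchesOffset(h: *const Header) bool {
    const base = @intFromPtr(h);
    const field = @intFromPtr(&h.len);
    return field == base + @offsetOf(Header, "len");
}

pub fn main() void {
    var h = Header{ .tag = 1, .len = 2, .crc = 3 };
    h.len += 1; // mutate so that `var` is justified

    std.debug.print("offset of len: {d}, size: {d}\n", .{ @offsetOf(Header, "len"), @sizeOf(Header) });
    std.debug.print("address matches: {any}\n", .{lenAddressMatchesOffset(&h)});

    // The same memory viewed as raw bytes, padding included.
    const bytes = std.mem.asBytes(&h);
    std.debug.print("byte count: {d}\n", .{bytes.len});
}
```

The exact offsets depend on the target's alignment rules, so the sketch prints them rather than assuming particular values. With a plain (non-extern) struct the same check still holds, but the compiler is free to pick the field order.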


This is related to a common technique referred to as “coalesced access”, which is used to vectorize GPU loads on numeric arrays. They’ll define a struct with four data members or fewer (we’ll call it Vec4 and give it members w, x, y, z). They’ll then cast the head of the array to a *Vec4 and index into every 4th data element in the array, starting from zero. This can lead to big speedups because it coalesces the memory access and can let the machine do fewer independent copy operations.
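A minimal CPU-side sketch of that cast, in Zig rather than actual GPU code. The Vec4 layout and the sumX helper are assumptions made for illustration, and nothing here demonstrates the speedup itself, only the reinterpretation of the array.

```zig
const std = @import("std");

// `extern` keeps the four floats in declaration order with a defined layout.
// Vec4 and sumX are hypothetical names for this sketch.
const Vec4 = extern struct { w: f32, x: f32, y: f32, z: f32 };

/// Sums the `x` member of each Vec4-sized group in `data`.
fn sumX(data: *const [8]f32) f32 {
    // Same 32 bytes, now viewed as two Vec4 values instead of eight f32 values.
    const vecs: *const [2]Vec4 = @ptrCast(data);
    var total: f32 = 0;
    for (vecs) |v| total += v.x;
    return total;
}

pub fn main() void {
    var data = [_]f32{ 0, 1, 2, 3, 4, 5, 6, 7 };
    // The x members are data[1] and data[5].
    std.debug.print("sum of x: {d}\n", .{sumX(&data)});
}
```

The cast is only sound here because Vec4 is extern and has the same alignment as f32; with a plain struct the field order would not be guaranteed.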

It’s all just bytes - what you call them is a matter of convention and how you access them is a matter of size and alignment :slight_smile:


The closest thing to a non-existent field is a comptime field:

const std = @import("std");

const Seinfeld = struct {
    comptime bytes: [4]comptime_int = .{ 1, 2, 3, 4 },
};

test "test" {
    var array: [4]Seinfeld = undefined;
    std.debug.print("\nSize of Seinfeld: {d}\n", .{@sizeOf(Seinfeld)});
    for (array) |s| {
        std.debug.print("{d} {d} {d} {d}\n", .{ s.bytes[0], s.bytes[1], s.bytes[2], s.bytes[3] });
    }
}
Size of Seinfeld: 0
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4

Even though Seinfeld is a struct containing nothing, you can still create an array of it and the items seem to have data. This is a joke post, BTW :laughing:


Well… that’s kinda cool it works but why does it work? :smiling_face_with_tear:
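As far as I understand it, a comptime field is a compile-time constant attached to the type, so it takes no storage in each instance. A small sketch, assuming a hypothetical Mixed struct with one comptime field and one ordinary field:

```zig
const std = @import("std");

// `Mixed` is a hypothetical struct for this sketch.
const Mixed = struct {
    comptime version: u8 = 7, // known at compile time, not stored per instance
    value: u32, // the only field that occupies memory
};

pub fn main() void {
    var m = Mixed{ .value = 42 };
    m.value += 1; // mutate so that `var` is justified

    // @sizeOf counts only the runtime field.
    std.debug.print("size of Mixed: {d}, size of u32: {d}\n", .{ @sizeOf(Mixed), @sizeOf(u32) });
    std.debug.print("version: {d}, value: {d}\n", .{ m.version, m.value });
}
```

In the Seinfeld example every field is comptime, so nothing is left to store and the size comes out as 0, yet reading s.bytes still yields the constant.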

It’s not compiling. The SIGTRAP is being triggered by the compiler.

To be honest, I’m not sure why, if it is obvious that having a slice of anyopaque is not allowed, Zig doesn’t throw a compilation error. Also, I’m unsure about where exactly the SIGTRAP happens.

Regarding the example: thank you for showing how to cast pointers (I hadn’t reached that part yet :)). I couldn’t get it to work with t being const, though:

const std = @import("std");

const T = packed struct {
    f: u8,
};
const U = packed struct {
    f: u4,
    g: u4,
};

const t: T = .{ .f = 0xAB };
const u: *const U = @ptrCast(&t);

pub fn main() !void {
    std.log.debug("{any}", .{@import("builtin").target.cpu.arch.endian()});
    std.log.debug("{X}", .{t.f});
    std.log.debug("{X} {X}", .{ u.f, u.g });
}

Throws:

file.zig:17:34: error: dereference of '*align(1:0:1) const u4' exceeds bounds of containing decl of type 'u0'
    std.log.debug("{X} {X}", .{ u.f, u.g });
                                ~^~

However, when I get rid of pointer const-ness like this:

var t: T = .{ .f = 0xAB };
const u: *U = @ptrCast(&t);

I get the expected:

debug: builtin.Endian.little
debug: AB
debug: B A

Yes, that would have been ideal, but Zig is a work in progress. :man_shrugging:

Oof, this looks like a miscompilation to me. I’ve never seen this error before. I believe packed structs are still kind of iffy. I think the packed struct is interacting with the pointer in an unexpected way. I think the compiler is seeing that t is comptime-known and replacing it with a u0, and the fields are being transformed into global constants. This should have been transparent to us. But it looks like u didn’t get the memo and is actually trying to access the fields. When you change it to var, the compiler can’t do this transformation, which is why it works.


Don’t know if your use case requires a pointer, but if not, @bitCast works fine with the constants.

const std = @import("std");

const T = packed struct {
    f: u8,
};
const U = packed struct {
    f: u4,
    g: u4,
};

const t: T = .{ .f = 0xAB };
const u: U = @bitCast(t);

pub fn main() !void {
    std.log.debug("{any}", .{@import("builtin").target.cpu.arch.endian()});
    std.log.debug("{X}", .{t.f});
    std.log.debug("{X} {X}", .{ u.f, u.g });
}

Hm… that is an interesting insight. Shouldn’t this error be reported, or will it be fixed at some point anyway?

No.

Type reflection is all comptime. Zig is statically-typed. An anytype parameter does not make a run-time function that works for any type, it makes a different run-time function for every kind of type you use the function with. E.g. if I have:

fn add(a: anytype, b: @TypeOf(a)) @TypeOf(a) {
    return a + b;
}

Then in my usage code, I have:

_ = add(@as(u64, 2), 6);
_ = add(@as(i32, 1), 3);

These are calling two completely different functions at run-time! One takes two u64’s, the other takes two i32’s. (Of course for this trivial example the compiler could inline both functions, but for non-trivial code, my point stands)
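Here is a small sketch of how this plays out for the original question. WithCount, WithoutCount, and countOrZero are hypothetical names. Because @hasField is evaluated at compile time, the branch containing val.count is only analyzed for instantiations whose type actually has that field, so the function compiles for both structs without ever touching a non-existent field.

```zig
const std = @import("std");

// Hypothetical types for this sketch.
const WithCount = struct { count: u32 };
const WithoutCount = struct { name: []const u8 };

/// Returns `val.count` when the type has such a field, otherwise 0.
fn countOrZero(val: anytype) u32 {
    // The condition is comptime-known, so for a type without `count`
    // the body of this `if` is never analyzed.
    if (@hasField(@TypeOf(val), "count")) {
        return val.count;
    }
    return 0;
}

pub fn main() void {
    std.debug.print("{d}\n", .{countOrZero(WithCount{ .count = 3 })});
    std.debug.print("{d}\n", .{countOrZero(WithoutCount{ .name = "x" })});
}
```

Removing the @hasField guard and calling the function with WithoutCount should be rejected at compile time, since that instantiation would then contain an access to a field that does not exist.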

Thank you. I understand. But I’m not sure how this understanding explains the error: dereference of '*align(1:0:1) const u4' exceeds bounds of containing decl of type 'u0' in the example above…