Copying a comptime-known bare union

Given the following union types, how can I copy union1 into union2?

const U1 = union {
    a: comptime_int,
    b: u32,
};

const U2 = union {
    a: u16,
    b: u32,
};

const union1: U1 = .{ .a = 1234 };
var union2: U2 = union1; // error: expected type 'test.U2', found 'test.U1'

I think that the U1 type has to be a tagged union since any conversion from U1 to U2 needs to know which field of union1 is active. Since this only makes sense at comptime anyway, switching U1 to union(enum) shouldn’t have any impact beyond giving the type system a tag to work with.

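const std = @import("std");

// Copies the active field of the tagged union `u` into the same-named field of `U`.
fn coerceUnion(comptime U: type, u: anytype) U {
    inline for (@typeInfo(@TypeOf(u)).Union.fields) |fld| {
        // The tagged union coerces to its tag type, so this checks which field is active.
        if (@field(std.meta.Tag(@TypeOf(u)), fld.name) == u) {
            return @unionInit(U, fld.name, @field(u, fld.name));
        }
    }
    unreachable;
}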

test {
    const U1 = union(enum) {
        a: comptime_int,
        b: u32,
    };

    const U2 = union {
        a: u16,
        b: u32,
    };

    const union1: U1 = .{ .a = 1234 };
    const union2: U2 = coerceUnion(U2, union1);
    try std.testing.expectEqual(@as(u16, 1234), union2.a);
}

Somewhere behind the scenes, the compiler does know the active tag for bare unions at comptime (and at runtime too in safe optimization modes I think), but I don’t believe that info is exposed anywhere.

Looks like I’ll need to cast the union to bytes and read the field selector directly.

That doesn’t work. I get a weird “TODO: implement writeToMemory for type ‘in-bare-union.UnionA’” error when I dereference the pointer from std.mem.asBytes(). The strange thing is that the following runs without a problem:

const std = @import("std");

pub const U1 = union {
    a: u32,
    b: u32,
    c: comptime_int,
};

const union1: U1 = .{ .c = 1234 };

pub fn main() void {
    const bytes = std.mem.asBytes(&union1);
    const array = bytes.*;
    std.debug.print("{d}\n", .{array});
}
{ 0, 0, 0, 0, 2, 0, 0, 0 }

You can’t do that because comptime_int does not have a well-defined layout. It’s an infinite precision integer, therefore it can have any size. It probably also contains metadata inside of it. Therefore the compiler won’t let you see the bytes of it.
It gets even more complicated with unions. During comptime, unions have metadata, because they are tagged. You shouldn’t rely on the layout of the union at comptime, and the compiler probably will enforce this.
It gets even more complicated when you use pointers. Because comptime Zig is a managed language, pointers at comptime are not what we expect. You are not allowed to use @ptrFromInt or @intFromPtr at comptime, which makes it impossible to alias objects at comptime.


comptime_int has a size of zero. As you can see in the output above, the union does have a defined layout. Four bytes are used to store the u32, which are followed by a u8 selector plus padding. The comptime integer isn’t stored in the structure itself, much like comptime fields of structs.
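The size-of-zero claim can be checked directly. This is a minimal sketch, assuming a Zig version in which @sizeOf reports 0 for comptime-only types, as the language reference describes; the test name is made up for illustration.

```zig
const std = @import("std");

test "comptime-only types report a runtime size of zero" {
    // @sizeOf measures the runtime size, so a comptime-only type
    // such as comptime_int occupies no bytes in a runtime layout.
    try std.testing.expectEqual(@as(usize, 0), @sizeOf(comptime_int));
    try std.testing.expectEqual(@as(usize, 0), @sizeOf(type));
}
```

This only shows that the comptime_int field contributes nothing to @sizeOf(U1); it says nothing about how the compiler represents the value internally at comptime.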

The layout in that output isn’t guaranteed, though; it’s just how the compiler currently implements the active-field safety check in debug builds.

const std = @import("std");

pub const U1 = union {
    a: u32,
    b: u32,
    c: comptime_int,
};

const union1: U1 = .{ .c = 1234 };

pub fn main() void {
    const bytes = std.mem.asBytes(&union1);
    std.debug.print("union size: {d}\n", .{@sizeOf(U1)});
    for (bytes) |byte| {
        std.debug.print("{x:0<2}", .{byte});
    }
    std.debug.print("\n", .{});
}
> zig run temp.zig 
union size: 8
000000002000000
> zig run temp.zig -O ReleaseFast
union size: 4
00000000

Bare unions are for cases where you have some external information that tells you which field is the active one. If you don’t have that information, you can’t access the right field.

The internal debug-mode tag isn’t something you should rely on. It is only there so that you get an error if you misuse your bare union in debug mode.

If you want a union that includes information about the active tag, then use a tagged union, otherwise store that information somewhere else and pass it together with the bare union whenever you need to access its active field.
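The "store that information somewhere else" option could look roughly like the sketch below. The names Kind, Payload, Tracked and widen are hypothetical and exist only for illustration; the point is that the caller carries the tag next to the bare union and uses it to pick the field.

```zig
const std = @import("std");

// Hypothetical external tag, kept alongside the bare union by the caller.
const Kind = enum { a, b };

const Payload = union {
    a: u16,
    b: u32,
};

// The tag travels together with the bare union.
const Tracked = struct {
    kind: Kind,
    payload: Payload,
};

// Reads whichever field `kind` says is active and widens it to u32.
fn widen(t: Tracked) u32 {
    return switch (t.kind) {
        .a => t.payload.a,
        .b => t.payload.b,
    };
}

test "external tag selects the active field" {
    const t = Tracked{ .kind = .a, .payload = .{ .a = 1234 } };
    try std.testing.expectEqual(@as(u32, 1234), widen(t));
}
```

If the tag and payload always travel together like this, a tagged union (union(enum)) gives the same information with less bookkeeping.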


Crap, forgot about that. So even without whatever this TODO issue is, the approach is unworkable. Well, it’s an edge case. A comptime union is fundamentally senseless in any event.