About optional union(enum) niche optimization

I’ve been working on a parser recently where I have some tagged union types representing various AST nodes.
Some nodes require this union type, while others make it optional.

And I noticed something about the type size:

const Node = union(enum) {
    a,
    b,
};

std.debug.print("{}\n", .{@sizeOf(Node)});       // 1 byte
std.debug.print("{}\n", .{@sizeOf(?Node)});      // 2 bytes

In Zig, ?Node takes 2 bytes.

Rust has “niche optimization”, where Option can reuse an invalid discriminant value to represent None, so Option<Node> remains 1 byte:

enum Node {
    A,
    B,
}

std::mem::size_of::<Node>();           // 1 byte
std::mem::size_of::<Option<Node>>();   // still 1 byte

I looked through Zig’s codebase and found that Zig currently only applies such optimization for pointers and anyerror.

So I wonder: Is this a “not yet implemented” feature, or is there a design reason Zig intentionally avoids this optimization?

Initially I considered adding a .none variant directly inside the union, but found it hard to maintain, so finally I went back to the optional union even though it occupies slightly more space.

There is an open issue tracking this: Optimization opportunity for optional enum values · Issue #12186 · ziglang/zig · GitHub.

I also remember seeing discussion of a potential optimization that would compact:

union(enum) {
    a: enum { a, b, c },
    b: enum { d, e, f },
    c: enum { a, b },
} 

into

enum {
   a_a,
   a_b,
   a_c,
   b_d,
   b_e,
   b_f,
   c_a,
   c_b,
};

But I wasn’t able to find the link quickly. Will edit if I find it.

EDIT:
Also this: optional type optimization · Issue #104 · ziglang/zig · GitHub. It’s a bit closer to what you’re describing.

Ah, thanks for pointing those out! I searched for “optional union optimization” but apparently didn’t find the right issue :smiling_face_with_tear:

I’m also interested in this.

Meanwhile you can define a big enum with all the options (including .none or other special values), and in addition define enums that are subsets of the big enum, in a bitwise compatible way. Then, just add your own converter methods that go BigEnum -> ?SmallEnum, while the data that is stored long term is either BigEnum or SmallEnum but not ?SmallEnum (because of the bigger size).
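A minimal sketch of that suggestion, with hypothetical names (`Big`, `Small`, `toSmall` are mine, not from any real codebase): the big enum reserves a `.none` value, the subset shares the big enum’s integer tags, and `?Small` only exists transiently at use sites while stored data stays 1 byte.

```zig
const std = @import("std");

// Big reserves a `none` value; Small is bitwise-compatible with Big
// because its tag values are copied from Big's.
const Big = enum(u8) { none, a, b, c };
const Small = enum(u8) {
    a = @intFromEnum(Big.a),
    b = @intFromEnum(Big.b),
};

// Converter: stored data stays Big (1 byte); ?Small is produced only
// when the caller wants optional semantics.
fn toSmall(big: Big) ?Small {
    return switch (big) {
        .a, .b => @as(Small, @enumFromInt(@intFromEnum(big))),
        else => null,
    };
}

test {
    try std.testing.expectEqual(@as(?Small, .a), toSmall(.a));
    try std.testing.expectEqual(@as(?Small, null), toSmall(.none));
}
```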

I actually tried your approach at the beginning, but I found the DX to be quite poor.

The target AST has very complex subsets, so initially I put all node variants into a large enum, then used comptime type functions to construct many subset tagged unions (by passing in enum subsets and a function mapping each enum value to a node type).

However, this led to:

  • The code actually ended up with more lines (and wasn’t intuitive)
  • I lost all type inference and autocompletion when accessing the unions

Premature optimization is the root of all evil, so I’ll focus on implementing first, though I still feel like I’m wasting bits (especially with some double-layered unions).

Also, my original intention for defining a large set and deriving smaller subsets wasn’t to handle the null variant, but to give all tagged unions stable tags, so that casting from node subsets to the full set would be a no-op, making it easier to write visitors. However, I never actually tested whether this approach was safe; after losing type hints and autocompletion, I became concerned about code soundness and quickly abandoned it.

Fair. I’ve only done this with relatively small enums, and wrote the subset enums explicitly – which is duplication – but because of the code completion aspect, it’s probably the lesser evil compared to generating these enums in comptime.

It’s also possible to annotate the enum variants with comments saying which subsets they belong to and do code generation based on that (outside comptime). It probably feels like overkill, but in principle it makes sense, I think.

My understanding is that this is planned but not yet implemented. Well, it is only implemented for pointers: ?*T is the same size as *T.

But as far as I can tell, it does not extend to structs with pointer members. So

const S = struct {
    foo: *const i32, // or whatever
    bar: i32,
};

?S and S would not be the same size.
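A quick way to check this claim (a hypothetical test, assuming current Zig behavior; the exact sizes are target-dependent):

```zig
const std = @import("std");

const S = struct {
    foo: *const i32,
    bar: i32,
};

test {
    // The pointer niche applies to ?*T directly...
    try std.testing.expect(@sizeOf(?*const i32) == @sizeOf(*const i32));
    // ...but is not propagated through a struct containing a pointer,
    // so the optional needs an extra flag byte (plus padding).
    try std.testing.expect(@sizeOf(?S) > @sizeOf(S));
}
```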

Is Zig restricted to one level of ‘optionality’? If one can create long optional chains, it might start to cause issues. For example, if one uses ???i8, the niches may be hard to find, or it may cause weird behaviour?

Because it’s possible to get a pointer to the inner optional (with if (opt_x) |*x| ...), I believe the representation of each nested optional must be the same.

For example, you can have ??usize. The representation of ??usize must be an extra byte alongside the representation of ?usize. This is because, given a ??usize you can obtain a *?usize to its inner value.

In other words, the niches for nested optional types cannot be combined.
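A small illustration of the layout argument above (a sketch, assuming current Zig behavior for non-pointer optionals):

```zig
const std = @import("std");

test {
    const outer: ??usize = @as(?usize, 5);
    if (outer) |*inner| {
        // `inner` is a *const ?usize pointing at the inner optional,
        // so the inner ?usize must be laid out exactly like a
        // standalone ?usize.
        try std.testing.expect(inner.* != null);
    }
    // Consequently each nesting level needs its own flag; the niches
    // are not combined.
    try std.testing.expect(@sizeOf(??usize) > @sizeOf(?usize));
}
```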

Nothing really prevents you from doing ????T or even ?E!?T, except the weird syntax.

These worked fine when I tried them when learning and playing around with Zig.

You reminded me though, I neglected to complain about the fact that when printing a ???T, all the different nulls were printed in the same way – but maybe that’s a niche ( :drum: ) complaint.

After reading this proposal, I felt it was an elegant and harmonious way to implement niche optimization, but its scope was too broad. On the one hand, I thought it would be difficult to implement without significant effort; on the other, I had doubts about the final DX.

I think I would prefer this proposal; it’s small enough and conceptually very simple to understand, which aligns well with Zig’s preference for “explicit” approaches. I think the only problem is that it shifts the mental burden of how the end user might use the union to the union provider.
I don’t know why sno2 closed it, but I might try it out on a fork when I have time.

Language support is great, but why wait?

const std = @import("std");
const assert = std.debug.assert;

test {
    try testConversion(null);
    try testConversion(.a);
    try testConversion(.b);
}

fn testConversion(e: ?MyEnum) !void {
    const b: Niche(MyEnum, u8) = .pack(e);
    const e2 = b.unpack();
    try std.testing.expectEqual(e, e2);
}

const MyEnum = enum {
    a,
    b,
};

fn Niche(E: type, Backing: type) type {
    const e_info = @typeInfo(E).@"enum";
    const e_tag_info = @typeInfo(e_info.tag_type).int;
    const backing_info = @typeInfo(Backing).int;
    const sentinel: Backing = std.math.maxInt(Backing);

    assert(e_tag_info.signedness == .unsigned);
    assert(backing_info.signedness == .unsigned);
    assert(backing_info.bits >= e_tag_info.bits);
    assert(e_info.is_exhaustive or backing_info.bits > e_tag_info.bits);
    for (std.enums.values(E)) |e| assert(@intFromEnum(e) != sentinel);

    return struct {
        v: Backing,

        fn pack(maybe_e: ?E) @This() {
            if (maybe_e) |e| {
                return .{ .v = @intFromEnum(e) };
            }
            return .{ .v = sentinel };
        }

        fn unpack(b: @This()) ?E {
            if (b.v == sentinel) return null;
            return @enumFromInt(b.v);
        }
    };
}

You can explicitly assign the tag integer values. This also works on union(enum), but not on union(Tag), in which case you must specify the values in the definition of Tag.

const Subset = enum(i32) {
    a = @intFromEnum(Super.a),
    //...
};

It is a bit verbose, but if they are sequential in Super then only the first needs to be specified.

Even without it, a visitor function is still easy, assuming the subset payloads match the superset’s payloads.

// for unions:
switch (subset) {
    inline else => |p, t| return @unionInit(Superset, @tagName(t), p),
}

// for enums:
switch (subset) {
    inline else => |t| return @field(Superset, @tagName(t)),
}

If the tag values match it should optimise to a no-op.
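Putting the pieces together, here is a runnable sketch of that conversion (`SuperTag`, `Superset`, `Subset`, and `toSuperset` are hypothetical names for illustration), where the subset’s tag values are copied from the superset’s:

```zig
const std = @import("std");

const SuperTag = enum(u8) { a, b, c };
// Subset tags reuse the superset's integer values, so the conversion
// below does not need to remap anything at runtime.
const SubTag = enum(u8) {
    a = @intFromEnum(SuperTag.a),
    c = @intFromEnum(SuperTag.c),
};

const Superset = union(SuperTag) { a: i32, b: bool, c: void };
const Subset = union(SubTag) { a: i32, c: void };

fn toSuperset(subset: Subset) Superset {
    switch (subset) {
        // `p` is the payload, `t` the comptime-known tag; since the
        // tag values match, this should optimize to a no-op.
        inline else => |p, t| return @unionInit(Superset, @tagName(t), p),
    }
}

test {
    const s = toSuperset(.{ .a = 42 });
    try std.testing.expectEqual(@as(i32, 42), s.a);
}
```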

const Node = union(enum) {
    @"null": void,
    a: A,
    b: B,
};

Apart from optional pointers, I seldom use optional types and instead treat their optionality as part of a tagged union. I think the greatest significance of optional types is that they can cleanly express the optional-pointer optimization and stay compatible with C-style nullable pointers. In other scenarios, I believe their use is completely covered by tagged unions.

This is indeed suitable for an enum, but I think it’s not quite suitable for a union(enum) that might contain arbitrary values (in my scenario, a bunch of pointers).

In implementations, there’s always a trade-off between engineering and semantic considerations when modeling, which is reasonable — I might ultimately choose to do the same as you, BUT:

From my personal perspective, I believe treating null as a value of an entity is a very painful, Java-like approach. When I consider a node as an entity that might not exist, ?Node is the most semantically fitting and intuitive type: simply by examining the struct definitions of AST nodes, one can see where a node must exist and where it’s optional, which avoids having to constantly track the possibly-missing cases when visiting.

I agree: having if (opt_a) |a| {} is definitely a plus when it comes to ergonomics and understanding the code.

Intuitively, I understand this perspective, but my fondness for tagged unions still lies in their ability to convey richer semantics. Optional types can only convey that something does not exist, whereas tagged unions can also express why it does not exist:

const Node = union(enum) {
    unparsed: void,
    unspecified: void,
    parsed_null: void,
    a: A,
    b: B,
};

Although sometimes there may not be multiple reasons for null behind each optional type, I still believe that annotating the meaning of null through a tag for just this one reason is helpful for code readers to understand.
