A Nanopass framework for Zig

joelreymont · June 12, 2025, 11:00am

I’m trying to implement the Nanopass framework in Zig, which requires extending union(enum) types, among other things. I took a stab at it below, but is there a better way to do it?

Is there a way to make L2 show as nanopass.L2 instead of nanopass.extend(nanopass.L1,nanopass.L2__union_19543)?

Last but not least, unlike enum fields, union fields carry an extra alignment field. I’m concerned that its values may become invalid if I just concatenate the two sets of union fields. Do I need to recalculate it for my new composite union L2 and, if so, how?

const std = @import("std");

pub const L1 = union(enum) {
    a: usize,
    b: usize,
};

pub const L2 = extend(L1, union(enum) {
    c: usize,
});

pub fn extend(Src: type, Dst: type) type {
    const src = @typeInfo(Src).@"union";
    const dst = @typeInfo(Dst).@"union";
    const src_len = src.fields.len;
    const dst_len = dst.fields.len;
    var enum_fields: [src_len + dst_len]std.builtin.Type.EnumField = undefined;
    inline for (src.fields, 0..) |field, i| {
        enum_fields[i] = .{ .name = field.name, .value = i };
    }
    inline for (dst.fields, src_len..) |field, i| {
        enum_fields[i] = .{ .name = field.name, .value = i };
    }
    const fields = src.fields ++ dst.fields;
    const dst_enum = @Type(.{
        .@"enum" = .{
            .tag_type = std.math.IntFittingRange(0, fields.len - 1),
            .fields = &enum_fields,
            .decls = &.{},
            .is_exhaustive = true,
        },
    });
    return @Type(.{
        .@"union" = .{
            .decls = &.{},
            .layout = .auto,
            .tag_type = dst_enum,
            .fields = fields,
        },
    });
}

// ❯ zig run nanopass.zig
// nanopass.extend(nanopass.L1,nanopass.L2__union_19543){ .a = 10 }
// nanopass.L1{ .a = 10 }

pub fn main() void {
    std.debug.print("{}\n{}", .{ L2{ .a = 10 }, L1{ .a = 10 } });
}

To complete the framework, I’ll need to give users the ability to delete arms of the union.
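Deleting arms could follow the same pattern as extend. The sketch below is one possibility rather than a settled part of the framework: `without` and `L3` are hypothetical names, `L2` is written out by hand to keep the example self-contained, and it assumes the same Zig version as the listing above (one where `@typeInfo` exposes `.@"union"` and `@Type` is available).

```zig
const std = @import("std");

pub const L2 = union(enum) {
    a: usize,
    b: usize,
    c: usize,
};

// Hypothetical helper: returns a copy of union `T` without the arms in `names`.
pub fn without(comptime T: type, comptime names: []const []const u8) type {
    const EnumField = std.builtin.Type.EnumField;
    const UnionField = std.builtin.Type.UnionField;
    const src = @typeInfo(T).@"union".fields;

    var enum_buf: [src.len]EnumField = undefined;
    var union_buf: [src.len]UnionField = undefined;
    var n = 0;

    outer: for (src) |field| {
        for (names) |name| {
            if (std.mem.eql(u8, field.name, name)) continue :outer;
        }
        enum_buf[n] = .{ .name = field.name, .value = n };
        union_buf[n] = field; // keeps the original payload type and alignment
        n += 1;
    }
    if (n == 0) @compileError("cannot delete every arm of " ++ @typeName(T));

    // Copy the used prefix into constants before handing it to @Type.
    const enum_fields = enum_buf[0..n].*;
    const union_fields = union_buf[0..n].*;

    return @Type(.{
        .@"union" = .{
            .layout = .auto,
            .tag_type = @Type(.{ .@"enum" = .{
                .tag_type = std.math.IntFittingRange(0, n - 1),
                .fields = &enum_fields,
                .decls = &.{},
                .is_exhaustive = true,
            } }),
            .fields = &union_fields,
            .decls = &.{},
        },
    });
}

pub const L3 = without(L2, &.{"b"});

pub fn main() void {
    const v = L3{ .c = 3 };
    std.debug.print("{s} = {d}\n", .{ @tagName(v), v.c });
}
```

The surviving arms are renumbered from zero, so tag values are not stable across languages. Whether that matters depends on how passes compare tags.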

I also need to write the pass function that would recursively copy L1 into L2 for arms of the union we are not adding.
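For the arms that are carried over unchanged, the copy could lean on `inline else` and `@unionInit`. This is only a shallow, non-recursive sketch: `lift` is a hypothetical name, both unions are written out by hand, and any arm that is missing from the destination or has a different payload type would fail to compile until it gets an explicit prong of its own.

```zig
const std = @import("std");

const L1 = union(enum) { a: usize, b: usize };
const L2 = union(enum) { a: usize, b: usize, c: usize };

// Hypothetical helper: moves a union value into `Dst`, matching arms by name.
fn lift(comptime Dst: type, value: anytype) Dst {
    return switch (value) {
        // `inline else` generates one prong per arm, so `tag` is comptime-known
        // and can be turned into the field name that @unionInit expects.
        inline else => |payload, tag| @unionInit(Dst, @tagName(tag), payload),
    };
}

pub fn main() void {
    const old = L1{ .b = 7 };
    const lifted = lift(L2, old);
    std.debug.print("{s} = {d}\n", .{ @tagName(lifted), lifted.b });
}
```

A recursive version would need to call `lift` on the payload wherever the payload itself contains the old language’s types.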

P.S. I need to do a lot of AST manipulation for a new language I’m working on and I’d like to use the nanopass approach.

joelreymont · June 12, 2025, 12:31pm

I believe my alignment question has been answered here, and the answer is to use @alignOf(field.type) for each field.
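One way to check this on a given compiler is to compare what `@typeInfo` reports for an ordinary union against `@alignOf` of each payload type. A minimal sketch; `U` is a made-up type, and none of its fields has an explicit `align`.

```zig
const std = @import("std");

const U = union(enum) {
    a: usize,
    b: u8,
    c: []const u8,
};

pub fn main() void {
    // Print the alignment stored in the type info next to @alignOf of the payload.
    inline for (@typeInfo(U).@"union".fields) |field| {
        std.debug.print("{s}: reported {d}, @alignOf {d}\n", .{
            field.name, field.alignment, @alignOf(field.type),
        });
    }
}
```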

joelreymont · June 12, 2025, 1:43pm

This is what I ended up with after allowing new fields to replace existing ones.

Please critique!

const std = @import("std");

pub const L1 = extend(null, union(enum) {
    a: usize,
    b: usize,
});

pub const L2 = extend(L1, union(enum) {
    a: []const u8,
    c: usize,
});

pub fn extend(T1: ?type, T2: type) type {
    const EnumField = std.builtin.Type.EnumField;
    const UnionField = std.builtin.Type.UnionField;

    const fields1 = if (T1) |T|
        @typeInfo(T).@"union".fields
    else
        [0]UnionField{};

    const fields2 = @typeInfo(T2).@"union".fields;

    // how many fields are we replacing?
    var nchanged = 0;
    if (T1) |T| {
        for (fields2) |field| {
            if (@hasField(T, field.name))
                nchanged += 1;
        }
    }

    const len1 = fields1.len;
    const len2 = fields2.len;
    const len = len1 + len2 - nchanged;

    var enum_fields: [len]EnumField = undefined;
    var union_fields: [len]UnionField = undefined;

    var i = 0;

    inline for (fields1) |field| {
        if (@hasField(T2, field.name))
            continue;
        enum_fields[i] = .{
            .name = field.name,
            .value = i,
        };
        union_fields[i] = .{
            .name = field.name,
            .type = field.type,
            .alignment = @alignOf(field.type),
        };
        i += 1;
    }
    inline for (fields2) |field| {
        enum_fields[i] = .{
            .name = field.name,
            .value = i,
        };
        union_fields[i] = .{
            .name = field.name,
            .type = field.type,
            .alignment = @alignOf(field.type),
        };
        i += 1;
    }
    return @Type(.{
        .@"union" = .{
            .layout = .auto,
            .tag_type = @Type(.{ .@"enum" = .{
                .tag_type = std.math.IntFittingRange(0, len - 1),
                .fields = &enum_fields,
                .decls = &.{},
                .is_exhaustive = true,
            } }),
            .fields = &union_fields,
            .decls = &.{},
        },
    });
}

pub fn main() void {
    std.debug.print("{}\n{}", .{ L1{ .a = 10 }, L2{ .a = "foo" } });
}

joelreymont · June 12, 2025, 2:42pm

What I really want to do is take my parse tree and make an exact copy of it, except for the Binary structure.

The Binary structure should then change from this

pub const Binary = struct {
    lhs: *Expr,
    op: BinaryOp,
    rhs: *Expr,
};

to this

pub const Binary = struct {
    op: BinaryOp,
    args: *Expr,
};

A bunch of types in the AST module depend on the Expr type, so they would also need to be copied into the new structure so that they refer to the new Expr type.

Except… I don’t think this is possible in Zig because of the restriction on declarations within structures (types built with @Type must have an empty decls list).

I could just bite the bullet and copy AST into AST1 but that would suck.

It seems that Zig is not really suitable for the nanopass approach and I should focus on designing each version of the syntax tree to be as general as possible, to allow a maximum number of operations. For example, I should make the AST Lispy from the start since I print S-expressions after parsing.

Is there another way to accomplish what I’m trying to do?
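One direction that avoids @Type altogether is to put the whole AST inside a generic function and let a comptime parameter pick the shape of Binary, so every dependent type is re-instantiated against the matching Expr. The sketch below is an assumption-laden illustration, not a worked-out design: `Stage`, `Ast`, `Ast0`, `Ast1`, the `int` arm, and the `add`/`mul` operators are all hypothetical, and only Binary and its fields come from the post above.

```zig
const std = @import("std");

pub const BinaryOp = enum { add, mul };
pub const Stage = enum { parsed, lowered };

// Hypothetical: one namespace of AST types per stage of the compiler.
pub fn Ast(comptime stage: Stage) type {
    return struct {
        pub const Expr = union(enum) {
            int: i64,
            binary: Binary,
        };

        // Only this declaration differs between stages; Expr and anything
        // else declared here automatically refer to the matching version.
        pub const Binary = switch (stage) {
            .parsed => struct { lhs: *Expr, op: BinaryOp, rhs: *Expr },
            .lowered => struct { op: BinaryOp, args: *Expr },
        };
    };
}

pub const Ast0 = Ast(.parsed);
pub const Ast1 = Ast(.lowered);

pub fn main() void {
    var one = Ast0.Expr{ .int = 1 };
    var two = Ast0.Expr{ .int = 2 };
    const sum = Ast0.Expr{ .binary = .{ .lhs = &one, .op = .add, .rhs = &two } };

    var arg = Ast1.Expr{ .int = 3 };
    const lowered = Ast1.Expr{ .binary = .{ .op = .mul, .args = &arg } };

    std.debug.print("{s} {s}\n", .{
        @tagName(sum.binary.op), @tagName(lowered.binary.op),
    });
}
```

The trade-off is that every stage-dependent type has to live inside `Ast` and be spelled out as a `switch` on the stage, which is less declarative than a nanopass-style "extend this language" form.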