Elevating meta-programming into upstream meta-programs

biosbob · June 2, 2024, 5:15pm

SELF-REFERENTIAL DATA-STRUCTURES

I have come up with a “solution” to the problem described in this post – having statically-initialized structs that can reference one another. This solution represents an interesting use-case in the ongoing comptime meta-programming versus upstream meta-programs discussion.

Until #131 is resolved, it doesn’t appear that a comptime solution is possible. I was, however, able to create my linked data structure within the upstream meta-program. With the help of some generated code consumed downstream, the data was indeed statically initialized.

This generated code itself made use of comptime within the downstream program in a somewhat “advanced” manner – creating absolute symbols which contained the (linker-resolved) address of each static struct. The appropriate symbol (&node_123) was then inserted in each location where a pointer was statically assigned upstream.

Here’s a snip from the generated file, which illustrates the pattern:

    ...
comptime {
    asm (".globl \"em.coremark/ListBench__Elem$28\"");
    asm ("\"em.coremark/ListBench__Elem$28\" = \".gen.targ.em.coremark/ListBench__Elem\" + 28 * " ++ @"em.coremark/ListBench__Elem__SIZE");
}
extern const @"em.coremark/ListBench__Elem$28": usize;
const @"em.coremark/ListBench__Elem__28": *em.Import.@"em.coremark/ListBench".Elem = @constCast(@ptrCast(&@"em.coremark/ListBench__Elem$28"));

comptime {
    asm (".globl \"em.coremark/ListBench__Elem$29\"");
    asm ("\"em.coremark/ListBench__Elem$29\" = \".gen.targ.em.coremark/ListBench__Elem\" + 29 * " ++ @"em.coremark/ListBench__Elem__SIZE");
}
extern const @"em.coremark/ListBench__Elem$29": usize;
const @"em.coremark/ListBench__Elem__29": *em.Import.@"em.coremark/ListBench".Elem = @constCast(@ptrCast(&@"em.coremark/ListBench__Elem$29"));

pub var @"em.coremark/ListBench__Elem" = [_]em.Import.@"em.coremark/ListBench".Elem{
    em.Import.@"em.coremark/ListBench".Elem{
        .next = @"em.coremark/ListBench__Elem__1",
        .data = @"em.coremark/ListBench__Data__0",
    },
    em.Import.@"em.coremark/ListBench".Elem{
        .next = @"em.coremark/ListBench__Elem__2",
        .data = @"em.coremark/ListBench__Data__1",
    },
    ...
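
The `__SIZE` constant concatenated into the asm strings is not shown in the snip. As a rough sketch only, here is one way such a comptime string could be produced downstream, assuming it is simply the decimal text of `@sizeOf` for the element type. The `Elem` layout and the symbol names `Elem$1` and `Elem_array` are hypothetical stand-ins, not the generated names:

```zig
const std = @import("std");

// Hypothetical element type; the real Elem lives in em.coremark/ListBench.
const Elem = extern struct {
    next: ?*Elem,
    data: u32,
};

// Decimal text of the element size, evaluated at comptime for whatever
// target this file is compiled for (assumption: this is what __SIZE holds).
const elem_size_str = std.fmt.comptimePrint("{d}", .{@sizeOf(Elem)});

// Assembler text of the same shape as in the snip, built with `++`.
// "Elem$1" and "Elem_array" are made-up symbol names for illustration.
const directive = "\"Elem$1\" = \"Elem_array\" + 1 * " ++ elem_size_str;

pub fn main() void {
    // Print the directive rather than feeding it to asm(), so the sketch
    // runs on any host without needing matching linker symbols.
    std.debug.print("{s}\n", .{directive});
}
```

Because the size string is computed where the file is compiled, the offsets would follow the downstream target's layout rather than the upstream host's.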

Maybe I got lucky here, but because an extern const (the absolute symbol itself) is ultimately used in defining an array of statically-initialized elements, there is no “dependency loop” error. Said another way, the Zig compiler does NOT attempt to “know” the (comptime) value of this array initializer – which is ultimately “linked” by the linker.

As esoteric as this solution appears, it’s actually 100% portable: it works even when the upstream meta-program has 64-bit pointers while the downstream program targets a device with just 16-bit pointers. Needless to say, the @sizeOf of a struct containing pointer fields will often differ between upstream and downstream.
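
To make the size point concrete, here is a small sketch with a hypothetical two-pointer `Elem` (not the real ListBench type). Compiling it for different targets should report different sizes, since the struct is nothing but pointers:

```zig
const std = @import("std");

// Hypothetical element with two pointer fields, loosely modeled on the
// .next / .data fields in the generated snip.
const Elem = extern struct {
    next: ?*Elem,
    data: ?*u16,
};

// With only pointer fields, the struct size follows the pointer width of
// whichever target this file is compiled for.
comptime {
    std.debug.assert(@sizeOf(Elem) == 2 * @sizeOf(usize));
}

pub fn main() void {
    std.debug.print("pointer = {d} bytes, Elem = {d} bytes\n", .{
        @sizeOf(usize),
        @sizeOf(Elem),
    });
}
```

This is why any element offsets baked in upstream would be wrong downstream, and why the generated code defers the size to the downstream compile.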

While it would be possible to “manually” apply this pattern in any standalone program, it’s a little hard to see what’s really going on; and like comptime in general, debugging can be a challenge. Said another way, it wasn’t a “walk in the park” to get this working.

The upstream meta-program actually performs this initialization at its own run-time, where I can more easily debug the code. Inspecting the generated .zig file output by the meta-program is another “security blanket”, reassuring me that I’ve correctly configured my final program.