Elevating meta-programming into upstream meta-programs

It’s been a little over two months since I’d even heard of Zig; and shortly I’ll be showcasing some fruits of my journey – many of which have been sweetened by the exceptional support and encouragement fostered by this forum :pray:

In anticipation, however, I’d like to first brainstorm about a rather novel approach to meta-programming I’ve come to embrace when developing software for resource-constrained MCUs – where every byte of memory and μJoule of energy matters.

SOME BACKGROUND

I designed a programming language named EM back in 2010, which has seen use in a small number of very high-volume commercial applications using ultra-low-power wireless MCUs. A quick read of the twenty Q & A’s found here should give you a sense of where I’m coming from and where I’d like to go towards making this technology openly and freely available.

While my focus remains on developing MCU firmware that’s both higher-level in design and higher-performance in deployment, the burden of supporting EM as a general-purpose language for broad(er) use in the embedded space is a daunting effort – compared (say) with maintaining an internal, proprietary tool used by a handful of programmers.

Which brings us forward to Zig – a language that not only embodies many features I’ve already implemented in EM, but includes capabilities that I often wished I had in EM (but just didn’t have the time to develop). As they say: If you can’t beat’em, Zig•EM :wink:

FROM LANGUAGE TO FRAMEWORK

For the past two months, I’ve been working on a “proof-of-concept” in which the current EM programming language and runtime would effectively be grafted onto Zig. The net effect is that EM would be “downgraded” to a programming framework – but still targeting resource-constrained MCUs as before.

With a large base of legacy EM code in my software BAG (basement / attic / garage), I’m evolving a “re-write pattern” that I’m currently applying by hand to a (small) subset of this codebase featured in recent docs and blogs.

Said another way, each legacy EM source file (Uart.em, Timer.em, FFT.em, …) will morph into a corresponding Zig•EM file (Uart.em.zig, Timer.em.zig, …) that now relies upon the Zig compiler to produce MCU firmware.

Needless to say, I’m not forking Zig or creating my own divergent notation. By design, I’m using the language “as is” – embracing its capabilities while constraining myself to Zig in its current form. (No rants about the choice of keywords, requirements for semicolons, curly braces around blocks, and so forth :scream:)

Beyond some syntactic differences on the surface, both EM and Zig are (obviously) Turing-complete languages at their core. The trick, however, is to “zigify” programming concepts and constructs central to the EM language, while remaining true to their original meaning and intent.

To cite just a few examples:

  • an EM module or interface is a language construct that would require a specific “usage pattern” to realize in Zig;

  • the EM language has syntactically distinct primitives for toggling real-time debug pins (which I can emulate using Zig’s @"..." identifiers for these special functions); and

  • EM has a rather novel approach for turning .em source files into .out binaries, which is certainly worthy of some brainstorming.
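To illustrate the first bullet, here’s a minimal sketch (all names hypothetical – this is my reconstruction of the idea, not actual EM or Zig•EM code) of how an EM “module” might be realized as a Zig usage pattern: a struct used purely as a namespace, with file-static private state and public module functions.

```zig
const std = @import("std");

// Hypothetical sketch of an EM-style "module" as a Zig usage pattern:
// a struct namespace with private (container-level) state and a
// public interface. The Uart name and functions are illustrative only.
pub const Uart = struct {
    // private module state – a single static instance, as on an MCU
    var baud: u32 = 115200;

    pub fn setBaud(rate: u32) void {
        baud = rate;
    }

    pub fn currentBaud() u32 {
        return baud;
    }
};
```

The point is that what EM expresses as a first-class language construct becomes, in Zig, a convention the framework must enforce.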

ZIG COMPTIME || EM CONFIGURATION

And now we come to the heart of this topic – just how far can we leverage Zig’s comptime meta-programming to realize EM’s novel build flow, in which each program is actually executed TWICE. Full disclosure – nothing’s been decided, and everything still matters!!!

There is an inherent asymmetry between the host computer and the target MCU when cross-compiling for a resource-constrained embedded system: the former has virtually unlimited MIPS and memory, a file-system, internet access, etc; the latter might have just ~16K of memory!!

Needless to say, ANYTHING that can be (pre-)computed at build-time relieves pressure on run-time resources. Details notwithstanding, Zig and EM are philosophically aligned and kindred spirits on this point.
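As a concrete (if tiny) illustration of that shared philosophy, here’s a sketch of pre-computing a sine lookup table at comptime so the target never spends cycles or code space generating it – the table sizes and scaling are arbitrary choices for the example:

```zig
const std = @import("std");

// A 256-entry sine table, fully computed by the compiler and placed in
// read-only memory – the MCU pays only for the 512 bytes of data.
const sine_table: [256]i16 = blk: {
    @setEvalBranchQuota(2000); // headroom for the comptime loop
    var table: [256]i16 = undefined;
    for (&table, 0..) |*entry, i| {
        const angle = 2.0 * std.math.pi * @as(f64, @floatFromInt(i)) / 256.0;
        entry.* = @intFromFloat(@round(@sin(angle) * 32767.0));
    }
    break :blk table;
};
```

On a 16K-class MCU, trading a little flash for zero run-time computation is exactly the kind of hoisting being discussed here.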

For historical reasons, there has been no shortage of “configuration tools” in the embedded space which are invoked upstream from the cross-compiler itself – generating (C/C++) data-structures which encode hardware setup, scheduling policies, algorithm coefficients, etc. The “configuration language” is typically data-centric (XML, JSON) and often prepared using a GUI.

Long before conceiving EM, I was already using JavaScript as my configuration “meta-language” for writing complete programs that would compute and output statically-initialized data-structures (and sometimes small code fragments) consumed downstream during cross-compilation.

Even today, each .em source file is transpiled into a corresponding .js and .cpp file – with all of the former code aggregated into a JavaScript (meta-)program that executes on your host computer. See this blog post for a quick overview of the flow.

What’s novel, of course, is that EM is its own meta-language – and with the same sort of fluidity that Zig’s comptime affords. Said another way, one programming language serves as the lingua franca for two distinct programming domains – in my case, a resource-rich host computer and a resource-constrained target MCU.

While I’ve already leveraged comptime in implementing the EM framework in Zig (and I continue to learn more every day about what’s possible), I’ve perhaps taken a more conservative approach by funneling much of the application-specific meta-programming into a separate upstream meta-program – written in Zig, translated by the Zig compiler, and leveraging almost any library function in the Zig runtime.

With so many EM modules relying on application-level meta-programming for downstream configuration, having a “normal” program flow in which I can use print to trace my execution as well as read/write data files flattens the learning curve considerably. Leveraging some key features of the Zig language and compiler, partitioning a single .em.zig source file into elements that are either restricted to the host or target environment or else are common across both domains is actually quite expressive.

Whew!!! Let me stop here to get some feedback from y’all on whether this approach has merit, highlighting a rather novel way to use Zig in applications where the run-time domain is extremely limited in capability. Is this approach more powerful and expressive than comptime alone (elevating meta-programming to a complete meta-program)??? Or am I just being lazy :wink:

7 Likes

Hello! I’m relatively new here as well.

I really like that you went ahead and generated code from Javascript, I’ve been thinking about that for a while, which is part of why I ended up at Zig at all. Since I’ve had so much fun with Typescript, I was on the lookout for a way to apply that style of coding to a more low-level environment for Game development. I kinda wish I had explored a bit more on generating code from Javascript, but it would have also been a pretty nasty rabbit hole given Zig now exists and does a lot of what I love about the Typescript tooling!

Just a note on comptime - the performance to run code at comptime is relatively slow because I’m pretty sure it’s all interpreted. So if you can run code from the build.zig file or ahead of time, it can be tonnes faster and way more flexible, since you can fully take advantage of allocators and heap memory. I still use comptime absolutely everywhere as I’m building and learning, but if I’m processing big chunks of data, it’s not a great fit.

1 Like

SORRY, BAD LINK

Please, give me just 5 more minutes of your time and read the twenty Q & A’s found here for some important background to this post.

The whole thing sounds really cool. I’d like to see some examples (that may be hard to produce given the nature of this topic).

I think your work on runtime images is quite interesting and I’d like to know more about that and how this plays into it. From your link:

Pushing intelligence from the cloud down to distributed edge devices has become standard practice; and as we move further towards Ambient IoT, we'll want to push AI / ML algorithms even closer to our input sources.

While TinyML has emerged as a standard for embedded MCUs, early prototypes integrating EM with the Meta AI Glow compiler have yielded significantly smaller runtime images for the same .tflite model input.

1 Like

When I read the OP, I was thinking just the same. In my experience, and with the recent developments in the build system, Zig is a language that gives you three execution contexts: build time, comptime, and runtime. The tricky part is determining which time is the best choice for a given processing task in your project. I guess that’s always going to be project-specific and also would require some benchmarking and godbolt analysis to tweak to perfection.

4 Likes

Suppose I had some (numerical) algorithm which could execute at comptime OR runtime. A trivial example might be computing a checksum or perhaps encrypting some data.
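To make the trivial example concrete, here’s a sketch of a checksum that’s written once and payable in three different currencies – compiler interpretation at comptime, target cycles at run-time, or host cycles inside a separate meta-program (the function itself is a deliberately naive byte-sum, purely for illustration):

```zig
const std = @import("std");

// One function, multiple execution contexts.
fn checksum(bytes: []const u8) u32 {
    var sum: u32 = 0;
    for (bytes) |b| sum +%= b;
    return sum;
}

// Context 1: folded into the binary as a constant – the target never
// executes this, though the compiler interprets it (relatively slowly).
const static_sum = comptime checksum("hello");

// Context 2: the very same function, paid for at run-time on the MCU.
fn runtimeSum(data: []const u8) u32 {
    return checksum(data);
}
```

A hosted meta-program would be context 3: the same `checksum` source, compiled to native host code, running at full speed with heap and file access.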

If @Nathan-Franck is in fact correct in characterizing comptime as “interpreted execution” by the compiler, this gives even more weight to having a separate hosted meta-program to handle this processing more efficiently.

This approach naturally scales to far more complex algorithms (such as in the ML space), which effectively implement an application-specific form of profile-based optimization that involves a feedback loop.

As @dude_the_builder pointed out, there are definitely multiple execution contexts at play here. In the more specialized use-case of Zig•EM, an over-arching objective is to distill the target program by hoisting as much processing as possible into the hosted meta-program.

This is not unlike what happens in Zig today, when we realize that we can compute something at comptime – only now, we’ve expanded the latter to be a standalone executable image that enjoys the full power of Zig (including its OWN comptime phase).

But it’s still just ONE program whose individual .em.zig source files happen to be compiled/executed TWICE. How all of this plays into build.zig is still a work in progress; right now, I have a minimal command-line tool zig-em which coordinates a flow in which zig build-exe is invoked twice on the same program sources.

I’ll try to cobble together a sample .em.zig source file that illustrates some of these concepts.

I am aware that you are dealing with a program / language that has organized things in this manner – a 2-stage meta-program “interleaved” with the source that becomes the final program.

While I can see the benefits of that (Users can keep the meta parts close to the runtime parts they affect), I personally find using the build-system to structure these things more appealing.

I think using the build-system to structure this would mean:

  • putting the general computations into some re-usable module (instead of having the code directly in one particular place that is executed in two different ways)
  • creating a custom compile step that compiles the meta program, runs it, and collects its outputs
  • having another step that then makes use of those outputs
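Those steps can be sketched in a build.zig roughly as follows – a hedged sketch, not working Zig•EM code: the file names are placeholders, and the builder API shown follows a recent (0.12-era) std.Build interface that may differ across Zig versions.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Step 1: compile the meta-program for the HOST machine.
    const gen_exe = b.addExecutable(.{
        .name = "gen",
        .root_source_file = b.path("tools/gen.zig"), // placeholder path
        .target = b.graph.host,
    });

    // Step 2: run it, collecting the generated Zig source it prints.
    const run_gen = b.addRunArtifact(gen_exe);
    const generated = run_gen.captureStdOut();

    // Step 3: the final program imports the generated source by name.
    const exe = b.addExecutable(.{
        .name = "app",
        .root_source_file = b.path("src/main.zig"), // placeholder path
        .target = target,
        .optimize = optimize,
    });
    exe.root_module.addAnonymousImport("generated", .{
        .root_source_file = generated,
    });
    b.installArtifact(exe);
}
```

One nice property: the caching mentioned below falls out for free, since the run step is skipped whenever the meta-program and its inputs are unchanged.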

There are several benefits to that approach:

  • less need for C-style #ifdef switching in code, because you are able to organize your code to minimize it – which also makes the code simpler to read, since it doesn’t switch between different execution contexts / compile-time parameters, forcing you to mentally ignore big parts of the code based on which compile-time parameters you currently care about
  • by turning meta programs into explicit build steps, those can be cached
  • inputs and outputs of build steps become clearer parts of a standardized way of dealing with dependencies, making the single steps more understandable, because you don’t have implicit requirements like “this code needs to run in two different ways”
  • when something fails, it fails in an explicit step, instead of in a pseudo-step that could have run in this or that context, basically giving you clearer error messages
  • by writing meta programs as explicit build steps, they become clearer, less magical, more like normal programs that happen to be executed as a build step
  • instead of having magical switches peppered throughout your program, you have more coarse distinctions between what is the meta program (or what is an output of the meta program that gets imported) and what remains as the final program executing at runtime
  • programs / modules become simpler, to build and re-use, because they don’t have meta dependencies, they just have dependencies that happen to do “meta” things
  • build steps can fully implement all the things that need to happen, allowing people to use it without needing to understand all the details, but when they want to change it, they are logical pieces that can be read, understood and tested, step by step
  • build steps can replace external scripts, that may be platform dependent
  • reduce the amount of external dependencies, reliance on other ecosystems, that may have their own complex bootstrapping requirements

I would also say there are currently still some shortcomings:

  • the build system still seems daunting to many (myself included), but I think Build system tricks was a good step to start to demystify some things
  • by making things more explicit through separate build steps, code can become more verbose and require more work to organize, even if it can be understood easier
  • tooling for debugging and understanding isn’t really developed and requires you to figure out things yourself, instead of having some kind of build-step inspector/debugger

Sometimes magic switches feel better, but I think that is a short-term thing.

Personally, I think that frameworks are magic tricks. They make you think that things are simpler than they are – that you are an amazing programmer, because everything can be accomplished so easily – until some day you want to implement something simple and you run into an invisible barrier. Then you realize that an illusion was cast over you. This way of seeing things may be very useful, but the moment you need to go beyond that perspective and break out of that barrier, you are in a strange and foreign land: no longer the participant in a pleasant illusion, you suddenly need to become the magician who changes the framework, knows how it works, what assumptions it made, and how it is extended.
The sad part is, the moment you understand how it works, it no longer feels like magic and it loses its appeal – especially if the framework made a bunch of assumptions that you need to break, forcing you to change a lot about the framework.

I think different projects have different degrees of “magic” to them, and for some projects a bit more might be acceptable than for others, so I don’t want to prescribe for anyone how much magic vs explicitness they use.

I just find explicit, manual, and verbose preferable to more automatic things that may hide details from me (where I might not even know that I would want to know about them).

To bring it back to meta-programs, I think using explicit build steps for meta programs, would naturally lead you down a path where you need to be very explicit about what is meta and what isn’t. I think doing this might be more work initially, but it also might be easier to understand and maintain in the long run.


Incomplete thoughts about languages/frameworks Zig and Racket

Before Zig I was using Racket. I liked a lot about it, but it didn’t allow me to get close enough to the machine without touching its implementation or writing my own code-generator DSLs/languages, and I didn’t want to have to switch to C to write low-level code that could be used via the FFI. That is one of the things I didn’t like about Racket: it didn’t give me an easy path to write code that handles memory manually.

I also don’t like macros as much as I used to; I think the breaking point – where using comptime becomes too fiddly and difficult and using the build system instead becomes easier – is pretty well chosen.

It keeps you from building huge towers of macros that then have subtle bugs and half-working ways of interoperating with other macros – ways that only work half of the time, and only if everyone has thought about it for way too long before designing their macros.

That is one of the down-sides of macros: they require you to be too much of a big-brain thinker, sometimes to the point where you don’t understand how your own macro works a few days after you have written it. (And my macros were still relatively simple compared to what some Racket users are able to create.)
I also dislike the break between what is normal code and what is a macro, and the effect that has on being able to compose things.

Still the language oriented approach of racket has interesting aspects, being able to use modules written in different #langs together in one program is interesting and I think putting things that are “magic” into the semantics of specific language/dsl features seems more appropriate to me, than putting these things into parts of a framework.

In some ways a framework can start to look like a poor man’s language/DSL. When I think of it as a DSL, it seems natural that it would be implemented via a build step that clearly separates things into language constructs vs implementation details of the language.

Instead, frameworks often contain things that seem more like implicit DSL language constructs that were never turned into explicit ones – implemented as general-purpose language constructs with strange implicit ways of needing to be used, and with caveats that must be followed to avoid breaking what the framework intended.

I am not completely sure whether I just have experienced badly designed frameworks and maybe there are better designed ones, that don’t have these painful breaking points where things become messy.

However since I started avoiding frameworks, I didn’t have these big unexpected disappointments anymore, where something that is expected to be simple, becomes very difficult.

In some ways languages seem similar to frameworks. I think one thing that is a bit better with languages is that they tend to make the distinction between what is supported and what isn’t a bit clearer.
And thus it might still be disappointing – just a bit easier to anticipate – if the language communicates/documents well what it actually supports/intends.

I like Rackets idea of language oriented programming and being able to use different languages together, but I am not sure whether their approach to use macros to get there, is a good way.

I am optimistic that Zig could eventually grow into the bedrock underneath a bunch of languages that allows you to use and mix these languages without duplicating huge software ecosystems which essentially are alternative implementations of the same ideas (to differing degrees).

Maybe we could eventually have several languages that share a good portion of data structures and thus are able to easily exchange data between them within one process, or maybe even share parts of their implementations.

2 Likes

Among the many good points raised by @Sze here, the idea of having a clean way to express what is part of the “meta-program” versus the “final program” is a dominant theme. No magic, no #ifdef, etc.

AN EXAMPLE IN EM

To help the discussion, here’s a basic example program from my legacy EM codebase which I’ve just “re-written” in Zig•EM. The goal of this exercise is to ensure that the latter (framework) preserves both the expressiveness and performance of the former EM language.

[[ FULL DISCLOSURE – I’m truly motivated to graft EM onto Zig, and am not trying to “sell EM” as a language per se ]]

By way of summary, the TickerP program will continuously blink my board’s (green) AppLed every 1.0 s and my (red) SysLed every 1.5 s; this program has run on dozens and dozens of different boards, and typically consumes <1K of memory.

Like any module, this program clearly depends upon other EM modules brought into scope via the import directives at the top of the file. And like any good module, this program is explicit about the private features used internally to support its implementation.

The “final program” running on my target board begins execution with the intrinsic em$run function, which in turn starts a pair of TickerMgr.Ticker objects and then falls into the FiberMgr scheduling loop.

The (private) callback functions respectively bound to these Ticker objects will wink either AppLed or SysLed for 100 ms. The notation %%[c] and %%[d] (syntactic sugar for a direct call into the EM runtime) will toggle specific “debug pins” on my target board, which I can observe with my logic analyzer to capture sequencing as well as timing of program events.

Turning now to the em$construct intrinsic, this EM code is NOT executed at run-time on my target board but rather at config-time on my development host. Said another way, appTicker and sysTicker are statically created and initialized upstream from em$run(); and internally, TickerMgr.createH will likewise create other types of objects (including a Fiber thread).

Note that while FiberMgr per se allows a more dynamic binding of the tick-rate and callback at run-time, an even more efficient design would have had these parameters statically bound upstream at config-time.

Note also the keyword config applied to the declarations of appTicker and sysTicker. In EM, a config is effectively a var at config-time but a const at run-time.
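For readers without an EM background, one hedged way to mimic that var-upstream / const-downstream behavior in plain Zig is a small generic wrapper – the names and shape here are my own invention for illustration, not how Zig•EM actually implements configs:

```zig
const std = @import("std");

// A sketch of EM's `config` semantics: mutable while the host
// meta-program runs, read-only once its value reaches the target
// (where the frozen value would be emitted into generated code).
pub fn Config(comptime T: type) type {
    return struct {
        value: T,

        const Self = @This();

        // Host side: freely mutable during configuration.
        pub fn set(self: *Self, v: T) void {
            self.value = v;
        }

        // Target side: read-only access to the frozen value.
        pub fn get(self: *const Self) T {
            return self.value;
        }
    };
}
```

In a real two-phase flow, only `get` (over a `const` instance) would survive into the target build.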

SAME EXAMPLE IN ZIG•EM

[[ FULL DISCLOSURE – This is work-in-progress, something I’m not necessarily satisfied with in its current form ]]

(I’ve made some small hacks in ZLS to support some “extended semantic token” highlighting, hopefully to help see a few of the EM trees within the Zig forest :wink:)

One of the first things you’ll (hopefully) notice is the juxtaposition of the EM__HOST and EM__TARG scopes – explicitly compartmentalizing what portions of this program execute at config-time on the development host versus run-time on my target board. @Sze – once I discovered this pattern, all of my #ifdef-like code vanished.

Under the covers, the (Zig) type returned through em.Import has all of the top-level declarations within the corresponding source file PLUS a pub usingnamespace of either the file’s EM__HOST or EM__TARG scope depending upon the context. To me, this is quite expressive.
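For those who haven’t seen the mechanism, here’s a heavily reduced sketch of that compartmentalization – all names are hypothetical, and this is my reconstruction of the pattern described, not actual Zig•EM code. A comptime flag selects exactly one of the two scopes via usingnamespace:

```zig
const std = @import("std");

// Illustrative switch only – Zig•EM presumably sets this per build phase.
const is_host = @import("builtin").os.tag != .freestanding;

pub const EM__HOST = struct {
    pub fn greet() []const u8 {
        return "configured on the host";
    }
};

pub const EM__TARG = struct {
    pub fn greet() []const u8 {
        return "running on the target";
    }
};

// Common declarations would live at file scope; exactly one of the two
// phase-specific scopes is mixed in, so no #ifdef-style switching is
// needed inside function bodies.
pub usingnamespace if (is_host) EM__HOST else EM__TARG;
```

Importers see one coherent namespace per phase, which is what makes the pattern feel expressive.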

Where I’m currently less satisfied is with the “gymnastics” required to capture the essence of an EM config as mutable upstream and frozen downstream. Perhaps as an artifact of my (reflective) implementation, I do rely on these logically “private” configs being declared pub in Zig. Worse, I have to create a “wrapper” object which is then “unwrapped” at run-time to reveal the config’s actual value.

Honestly, it sure seems that all of this could/should be handled through Zig’s comptime – which was my first thought when learning Zig. At this point, some of the more complex computations that occur in my legacy EM code seem a little beyond the scope of Zig’s comptime – but maybe I should stand back and take another look.

Finally, the Zig•EM build sequence is currently controlled by a very simple Zig program – one that could obviously be used within a build.zig. What’s important, however, is that there is a rather tight coupling between these two program phases.

$ zig-em build -u em.test/em.examples.basic/FiberP.em.zig 
compiling HOST ...
compiling TARG ...
    image size: text (976) + const (12) + data (4) + bss (16)
done in 4.42 seconds

Note that the upstream meta-program along with the final target program can BOTH call an em.print function – implemented quite differently in each domain, and yet building on the same std.fmt code at its core.

That’s enough for now. But I would like to eventually settle on the original topic of this post – comptime meta-programming versus upstream meta-programs. :confused:

3 Likes

SELF-REFERENTIAL DATA-STRUCTURES

I have come up with a “solution” to the problem described in this post – having statically-initialized structs that can reference one another. This solution represents an interesting use-case in the ongoing comptime meta-programming versus upstream meta-programs discussion.

Until #131 is resolved, it doesn’t appear that a comptime solution is possible. I was, however, able to create my linked data structure within the upstream meta-program; and with the help of some generated code consumed downstream, the data was indeed statically initialized.

This generated code itself made use of comptime within the downstream program in a somewhat “advanced” manner – creating absolute symbols which contained the (linker-resolved) address of each static struct. The appropriate symbol (&node_123) was then inserted in each location where a pointer was statically assigned upstream.

Here’s a snip from the generated file, which illustrates the pattern:

    ...
comptime {
    asm (".globl \"em.coremark/ListBench__Elem$28\"");
    asm ("\"em.coremark/ListBench__Elem$28\" = \".gen.targ.em.coremark/ListBench__Elem\" + 28 * " ++ @"em.coremark/ListBench__Elem__SIZE");
}
extern const @"em.coremark/ListBench__Elem$28": usize;
const @"em.coremark/ListBench__Elem__28": *em.Import.@"em.coremark/ListBench".Elem = @constCast(@ptrCast(&@"em.coremark/ListBench__Elem$28"));

comptime {
    asm (".globl \"em.coremark/ListBench__Elem$29\"");
    asm ("\"em.coremark/ListBench__Elem$29\" = \".gen.targ.em.coremark/ListBench__Elem\" + 29 * " ++ @"em.coremark/ListBench__Elem__SIZE");
}
extern const @"em.coremark/ListBench__Elem$29": usize;
const @"em.coremark/ListBench__Elem__29": *em.Import.@"em.coremark/ListBench".Elem = @constCast(@ptrCast(&@"em.coremark/ListBench__Elem$29"));

pub var @"em.coremark/ListBench__Elem" = [_]em.Import.@"em.coremark/ListBench".Elem{
    em.Import.@"em.coremark/ListBench".Elem{
        .next = @"em.coremark/ListBench__Elem__1",
        .data = @"em.coremark/ListBench__Data__0",
    },
    em.Import.@"em.coremark/ListBench".Elem{
        .next = @"em.coremark/ListBench__Elem__2",
        .data = @"em.coremark/ListBench__Data__1",
    },
    ...

Maybe I got lucky here, but because an extern const (the absolute symbol itself) is ultimately used in defining an array of statically-initialized elements, there is no “dependency loop” error. Said another way, the Zig compiler does NOT attempt to “know” the (comptime) value of this array initializer – which is ultimately “linked” by the linker.

As esoteric as this solution appears, it’s actually 100% portable – agnostic to whether the upstream meta-program has 64-bit pointers while the downstream program has a target with just 16-bit pointers. Needless to say, @sizeOf a struct containing pointer fields will often be different upstream from downstream.

While it would be possible to “manually” apply this pattern in any standalone program, it’s a little hard to see what’s really going on; and like comptime in general, debugging can be a challenge. Said another way, it wasn’t a “walk in the park” to get this working.

The upstream meta-program actually performs this initialization at its own run-time – where I can more easily debug the code. Inspecting the generated .zig file output by the meta-program is another “security blanket” confirming that I’ve correctly configured my final program.