I never really got into the possibilities of build.zig. I really should do that soon.
I know we can do lots of things in there.
So question…
I am thinking about creating a “tiny” version of my chess engine as a fun side project.
The executable size of an “official tiny engine” is 4096 bytes max.
My “big” chess engine has a lot of compile time computed constants (lots of arrays).
That is blowing up the executable size of course.
So the plan is to stuff all these constants in a struct on the heap.
Is it possible to fill this heap-struct inside build.zig, thus avoiding runtime code. And somehow let this code execute at compiletime?
(Runtime code of course also consumes executable space).
I don’t think so? since the memory exists on the heap at runtime it must be set at runtime. it can either be set by copying bytes from rodata (i.e. potentially increasing the binary size), or by executing code that writes to that memory (i.e. potentially slowing down the execution of your program).
Are you sure your comptime code is the problem? Generally things should only show up in your executable’s data/rodata sections if runtime code actually references it, so moving your comptime code to build.zig and referencing it with @embedFile or by generating zig code that you @import isn’t likely to change the executable size.
One thing you could do more easily if you compute your arrays via build.zig is to compress them. Then when your program starts it can take the compressed byte arrays from @embedFile and decompress them into heap memory. Technically you could probably achieve the compression with just comptime code too, but it would likely be painfully slow to compile.
That said, 4K is positively tiny! I don’t think even a zig hello world compiled with ReleaseSmall ends up that small on most platforms, so you probably won’t be able to afford a deflate decompressor. You could take a look at simpler compression algorithms though. I came up with one a few years back for use on highly resource-constrained microcontrollers that might work well if the data isn’t super random.
That small size makes me think it may make more sense to look into the demo scene (or old games) and what they are using, my guess would be lots of formulas and (deterministic) pseudo randomly generated data mixed with reused data that is reinterpreted in multiple ways and probably also hand crafted assembly, but I haven’t looked into what they actually use. (just heard some cool anecdotes like using the program code as textures)
Would some sort of ‘domain-specific’ data compression be possible, e.g. how “noisy” is the data in those tables?
E.g. compute the tables in build.zig apply this ‘magic’ data compression, save to files, pull those files in via @embed (hopefully keeping the exe size small enough because the data is compressed), and then decompress into heap-tables (which hopefully is faster and less code than doing the full computation from scratch).
…or is there a sensible way to split the table computation into an expensive offline-part (which creates little intermediate data baked into the exe via @embed), and then ‘finish’ the computation in a runtime-code part at startup into heap tables?
…but IME: when you have code that generates the data tables, this is usually already the best “compression algorithm”. Usually nothing beats data-generation-code when it comes to “data density”.
Another crazy idea: build your own domain-specific virtual instruction set, write the table computation in this virtual “machine code”, and run the whole table computation at startup via a minimal interpreter (only makes sense if the virtual program + the interpreter is smaller than the native code which computes the table data - and it will probably be a lot slower). But AFAIK this technique of a special VM is used in scene demos which need to keep executable size down, but need to generate a lot of data.
PS: another idea: can the table computation be described by much fewer parameters and a generic algorithm? E.g. only store the parameters in the executable and run the algorithm to populate the heap tables. A little bit how in 3D rendering spherical harmonics might replace a whole cubemap for image-based lighting with just 9(?) parameters.
Some of the data can be embedded and compressed I think.
But not this one… here you can see some of the comptime computed const arrays.
"this is usually already the best “compression algorithm”
In my ideal world it would be great if that @comptime functions could fill a heap allocated struct if you know what i mean…
Another general idea that may or may not work for you: do the computations at build time, and then generate a zig file that’s compiled normally by the build system and which does the thing you want at runtime (like putting the computed constants on heap)
Another idea would be to go down the rabbithole of self-modifying code and somehow generate the code that generates the tables at runtime(maybe even one recursion more). This, totally based on intuition, seems doable because your table generation code looks quite repetitive.
I just did a ReleaseSmall on a completely empty main() and it was already 5KB.
Either I need to dive into rabbitholes or Zig is not the right language to do this
What number do you get with -OReleaseSmall -fomit-frame-pointer -fno-unwind-tables -fstrip -dead_strip followed by strip -R .comment <binary> ? That used to be effective, especially -fno-unwind-tables