Is there a roadmap / plan for "the right way" to generate files during a build?

Hi all,

I’ve been researching the best ways of generating code from build.zig that is then used in a later build step. My practical use case is porting a C/C++ project where CMake runs a bash script that generates some files, which are then included from one of the .cpp files.

From what I can tell, the preferred way to do this in Zig is to build a small Zig program and run it as a build step via something like addRunArtifact(), hooking the inputs and outputs into the build system via addFileArg() / addOutputFileArg() (docs). Doing this seems to require passing all input files and all output files as command line arguments to the executable, which I imagine becomes impractical if you’re dealing with a lot of files. Alternatively you need to spawn this executable many times, once per input+output pair, which also doesn’t seem nice.
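For reference, here is a minimal sketch of the run-artifact approach described above. It assumes the Zig 0.14-era std.Build API (field and function names have shifted between releases), and the paths tools/gen.zig, data/shaders.txt, src/main.cpp and the output name shaders.inc are hypothetical placeholders, not taken from any real project.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Hypothetical generator tool. It is built for the host because it
    // runs as part of the build, not on the final target.
    const gen = b.addExecutable(.{
        .name = "gen",
        .root_module = b.createModule(.{
            .root_source_file = b.path("tools/gen.zig"),
            .target = b.graph.host,
        }),
    });

    const run_gen = b.addRunArtifact(gen);
    // Input file passed as an argument; the build system knows about it.
    run_gen.addFileArg(b.path("data/shaders.txt"));
    // Output file passed as an argument; returns a LazyPath into the cache.
    const generated = run_gen.addOutputFileArg("shaders.inc");

    const app = b.addExecutable(.{
        .name = "app",
        .root_module = b.createModule(.{
            .target = target,
            .optimize = optimize,
            .link_libcpp = true,
        }),
    });
    app.addCSourceFiles(.{ .files = &.{"src/main.cpp"} });
    // Lets main.cpp do `#include "shaders.inc"`; using the LazyPath also
    // makes the compile step depend on the generator run.
    app.addIncludePath(generated.dirname());
    b.installArtifact(app);
}
```

With this wiring, the generator only needs to read its first argument and write to its second; it never has to know where the cache directory lives.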

However, to me it seems a lot less complex and wasteful to simply run a Zig function instead of running an executable (potentially lots of times) only to generate a few strings and write a few files. It seems like a bunch of people are creating steps with a custom makeFn function, but if you do that, it looks like a lot of work to integrate your custom makeFn into the caching system so that the function is not rerun unnecessarily, input files are watched, etc. It works, but this kind of custom build step seems to always run, regardless of whether the input files have changed. The input files are also not considered by --watch by default.
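For the simple case of "generate a few strings and write a few files", one option that avoids both a separate executable and a custom makeFn is a WriteFile step fed by an ordinary function. The sketch below assumes the Zig 0.14-era API; renderVersionHeader, version.h and src/main.cpp are hypothetical names used for illustration. Note the limitation: the function runs during the configure phase on every `zig build` invocation, and any input files it reads itself would not be tracked by the build system, so this is not a full replacement for a function-based generation step.

```zig
const std = @import("std");

// Hypothetical generator: a plain function that returns the file contents.
fn renderVersionHeader(b: *std.Build, major: u32, minor: u32) []const u8 {
    return b.fmt(
        \\#pragma once
        \\#define APP_VERSION_MAJOR {d}
        \\#define APP_VERSION_MINOR {d}
        \\
    , .{ major, minor });
}

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // The WriteFile step writes the given bytes into the cache when it runs.
    const wf = b.addWriteFiles();
    _ = wf.add("version.h", renderVersionHeader(b, 1, 2));

    const app = b.addExecutable(.{
        .name = "app",
        .root_module = b.createModule(.{
            .target = target,
            .optimize = optimize,
            .link_libcpp = true,
        }),
    });
    app.addCSourceFiles(.{ .files = &.{"src/main.cpp"} });
    // Lets main.cpp do `#include "version.h"`.
    app.addIncludePath(wf.getDirectory());
    b.installArtifact(app);
}
```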

Various posts like this one from @castholm indicate that the makeFn approach will be made impossible at some point.

There are others, like @floooh, who have asked for a function-based way of creating generation steps.

tl;dr: I guess what I’m asking is

  • whether it has been decided that custom makeFn-based steps will be deprecated
  • and/or whether there should be a nice API (and a general RFC for it) to run a Zig function with inputs and outputs for file/code generation, instead of spawning executables

Am I overthinking this? Is worrying about executing extra scripts instead of running functions premature optimisation?

Just pointing out: you can pass input and output directories too.
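A minimal sketch of the directory variant, assuming the Zig 0.14-era API, where Run steps have addDirectoryArg() and addOutputDirectoryArg(). The paths tools/gen.zig, templates and src/main.cpp and the output name generated are hypothetical. Whether changes to files inside the input directory trigger a rerun is worth verifying for your Zig version.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Hypothetical generator tool, built for the host.
    const gen = b.addExecutable(.{
        .name = "gen",
        .root_module = b.createModule(.{
            .root_source_file = b.path("tools/gen.zig"),
            .target = b.graph.host,
        }),
    });

    const run_gen = b.addRunArtifact(gen);
    // One argument for a whole input directory...
    run_gen.addDirectoryArg(b.path("templates"));
    // ...and one for a whole output directory in the cache.
    const out_dir = run_gen.addOutputDirectoryArg("generated");

    const app = b.addExecutable(.{
        .name = "app",
        .root_module = b.createModule(.{
            .target = target,
            .optimize = optimize,
            .link_libcpp = true,
        }),
    });
    app.addCSourceFiles(.{ .files = &.{"src/main.cpp"} });
    // Every file the tool wrote into the output directory can be included.
    app.addIncludePath(out_dir);
    b.installArtifact(app);
}
```

The tool then receives two paths and can walk the input directory itself, which avoids one argument (or one process) per file.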


Yeah, having to write a cmdline tool just for code generation is kind of the worst case and should only be needed when nothing else works.

There’s the boilerplate of cmdline arg parsing (and those args are not typesafe because they need to go through a string conversion), for more complex input args the build.zig may need to write a JSON file which then also needs to be parsed. Other input files may also need to be written, but the content of those files may only be available as the result of other build steps etc etc etc…

IMHO it would be better if the entire build system were built around the idea of ‘function DAG nodes’, e.g. a node in the build graph is essentially a function pointer, plus a way to pass input/output args between such nodes and to connect nodes into a dependency graph, where outputs of dependency nodes flow into inputs towards the root node.

E.g. each DAG node has:

  • a list of typed inputs
  • a list of typed outputs
  • an (ideally pure and async) function which turns the inputs into outputs
  • a way to connect node outputs to node inputs
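To make the list above a bit more concrete, here is a toy sketch of what such a ‘function node’ could look like in plain Zig. Everything in it (Node, pipe, countLines, isLarge) is hypothetical and exists only for illustration; it is not an existing or planned std.Build API, and it ignores caching, async execution and scheduling entirely.

```zig
const std = @import("std");

/// Hypothetical 'function node': one typed input, one typed output,
/// and a pure function that turns the input into the output.
pub fn Node(comptime In: type, comptime Out: type) type {
    return struct {
        func: *const fn (In) Out,
    };
}

/// Connects two nodes. The compiler rejects the call unless the output
/// type of `first` matches the input type of `second`.
pub fn pipe(
    comptime A: type,
    comptime B: type,
    comptime C: type,
    first: Node(A, B),
    second: Node(B, C),
    input: A,
) C {
    return second.func(first.func(input));
}

// Two example node functions, both pure.
fn countLines(text: []const u8) usize {
    return std.mem.count(u8, text, "\n");
}

fn isLarge(lines: usize) bool {
    return lines > 2;
}

pub fn main() void {
    const count_node = Node([]const u8, usize){ .func = countLines };
    const large_node = Node(usize, bool){ .func = isLarge };
    const result = pipe([]const u8, usize, bool, count_node, large_node, "a\nb\nc\n");
    std.debug.print("large: {}\n", .{result});
}
```

The point of the sketch is only the type checking at the connection: arguments stay typed values instead of going through a string conversion on a command line.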

Everything else (e.g. the current ‘build steps’) would be built on top of this fundamental ‘function node’, but (and that’s the important part) build.zigs may also ‘derive’ their own node types and hook them into the build system.

The code in those functions must be properly ‘isolated’ so that they can be scheduled in parallel.

Such a system might actually be a nice proof-of-concept of the new IO system :wink:

Bonus points for allowing inspection of the build graph for visualizations.


I wonder if it’s possible to use std.zon.Serializer to generate a ZON file at build time, then import that as a pseudo-namespace.
I’ve never tried any kind of build-time codegen, but in my mind if that’s possible it’d be the easiest/nicest way to do it.
