Complicated pre-build and post-build steps in build.zig

Continuing my mission to slowly replace a pile of ad-hoc bash and cmake with “pure” Zig, I’m wondering how best to go about longer pre- and post-build steps.

Simple command line tools can be called with Build.addSystemCommand, but for something like a bash script that modifies a lot of files or calls a lot of external tools, are there any good options?

You can run your complex bash scripts with Build.addSystemCommand. Can you explain in more detail what it is you are trying to do, and what you need beyond addSystemCommand?
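For reference, a minimal sketch of what that looks like in build.zig (the script path and wiring are made up for illustration):

```zig
// build.zig fragment — "scripts/post-build.sh" is a hypothetical script.
const run_script = b.addSystemCommand(&.{ "bash", "scripts/post-build.sh" });
// Make the default install step wait for the script to finish:
b.getInstallStep().dependOn(&run_script.step);
```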

What I would do is rewrite the bash scripts in Zig, and then, in build.zig, use addRunArtifact to run them. I’d then pass everything that the script needs as command-line arguments.

If my script takes too long to execute, I’d split it into several Zig scripts, create several addRunArtifact commands, and wire them together. That way, I’d lean on build.zig to cache partial results.

This is how, for example, multi-version builds of TigerBeetle work, where we need to use llvm-objcopy to “season” the binary appropriately.
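Not TigerBeetle’s actual build, but a hedged sketch of the pattern: compile a small Zig helper and run it as a step, passing what it needs on the command line (the tool name and paths are invented, and `exe` stands for some previously built artifact):

```zig
// Compile the helper, written in Zig, for the host machine.
const tool = b.addExecutable(.{
    .name = "season_binary", // hypothetical helper
    .root_source_file = b.path("tools/season_binary.zig"),
    .target = b.graph.host,
});
// Run it as a build step; inputs passed as file arguments are tracked
// by the build system, so the step re-runs when they change.
const run_tool = b.addRunArtifact(tool);
run_tool.addFileArg(exe.getEmittedBin());
```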

8 Likes

At my work I have an embedded Linux application that I am the maintainer of.

Currently the codebase is a mix of C and Python, and the build system a mix of bash and cmake, with some other tools for testing and verification. Zig seems like a perfect candidate for me to replace all of it with a single thing, eventually, so I’m trying to get future Zig development supported by replacing the build system first.

Currently it looks something like this:

  1. Untar a filesystem
  2. Download and compile dependencies
  3. Build the program I maintain
  4. Install the program in the filesystem
  5. Modify a bunch of system files to configure the system
  6. Retar the filesystem

I think I’ve got a handle on everything except 5, which is currently handled by several bash scripts. That’s what I’m looking at zigifying now.
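For the steps around it, a rough sketch of chaining the untar/retar ends with addSystemCommand dependencies (all paths are made up; the install step stands in for building and installing the program):

```zig
// Unpack the filesystem before anything is installed into it.
const untar = b.addSystemCommand(&.{ "tar", "-xf", "rootfs.tar", "-C", "build/rootfs" });
// Repack it afterwards.
const retar = b.addSystemCommand(&.{ "tar", "-cf", "rootfs-out.tar", "-C", "build/rootfs", "." });

// Install (and any file modification) must happen between untar and retar:
b.getInstallStep().dependOn(&untar.step);
retar.step.dependOn(b.getInstallStep());
```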

2 Likes

I think for #5 in your list I’d go with what @matklad suggested and rewrite that portion in Zig. I’m not aware of anything in build declarations that would make that step any easier than just running the bash script (I would love a correction on this if there is a way :slight_smile: )

Another approach:

In this case, you can use nektos/act to run the GitHub Actions workflow.

I’ve used mlugg/setup-zig to build Zig sources in this workflow.

I’m going to try this, it seems like a good way forward.

If my script takes too long to execute, I’d split it into several Zig scripts, create several addRunArtifact commands, and wire them together. That way, I’d lean on build.zig to cache partial results.

That is really clever, in a good way. How do you verify that you’re not caching things that have been invalidated?

You mostly get it for free, if you pass dependencies as CLI arguments. For example, my script needs to use llvm-objcopy, which is two steps:

  • download objcopy
  • use objcopy

The “use objcopy” script takes the path to objcopy as a CLI argument here.

And that path is a LazyPath from the Zig build system, which means that it tracks dependencies: Zig takes care to re-run the download step if its dependencies change, and then my “use” step if the resulting objcopy changes.

The dependency loop bottoms out at the explicitly-specified hash for the file downloaded from the internet.

The important thing is that I haven’t written a line of code which tracks if anything is fresh or not, I just very carefully used build.zig to make sure to re-use built-in dependency tracking.
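In build.zig terms, that wiring might look roughly like this (a sketch; `download_exe` and `use_exe` are assumed to be two helper executables built elsewhere in the build script):

```zig
// Step 1: run the downloader; addOutputFileArg yields a LazyPath to
// wherever the build system decides the output should live.
const download = b.addRunArtifact(download_exe);
const objcopy_path = download.addOutputFileArg("llvm-objcopy");

// Step 2: feed that LazyPath to the "use" step. Because it is a LazyPath,
// the dependency edge is tracked automatically, and "use" re-runs
// whenever the downloaded objcopy changes.
const use = b.addRunArtifact(use_exe);
use.addFileArg(objcopy_path);
```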

5 Likes

Admittedly it’s a bit of a hack, but what I ended up doing was creating a couple of wrappers around std.process.Child.run that either run a bash script or run a list of shell commands, like this:

const std = @import("std");
const run = @import("shell-cmd.zig").run;

const script = @embedFile("path/to/script.sh");

pub fn main() !void {
    var arena_root = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena_root.deinit();
    const arena: std.mem.Allocator = arena_root.allocator();

    var commands = std.mem.tokenizeScalar(u8, script, '\n');
    while (commands.next()) |command| {
        try run(arena, command);
    }
}

That way, I can access command line utilities in a way that’s opaque to the build system, which means I can start chipping away at making pure(r) zig implementations of the different build steps eventually without needing to replace all of it now.
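The `run` wrapper itself isn’t shown above; a minimal sketch of what `shell-cmd.zig` might look like, built on std.process.Child.run (error handling kept deliberately simple):

```zig
// shell-cmd.zig — hypothetical reconstruction of the wrapper described above.
const std = @import("std");

pub fn run(allocator: std.mem.Allocator, command: []const u8) !void {
    // Delegate the command line to bash so pipes, globs, etc. still work.
    const result = try std.process.Child.run(.{
        .allocator = allocator,
        .argv = &.{ "bash", "-c", command },
    });
    defer allocator.free(result.stdout);
    defer allocator.free(result.stderr);

    // Treat any non-zero exit (or abnormal termination) as a failure.
    switch (result.term) {
        .Exited => |code| if (code != 0) return error.CommandFailed,
        else => return error.CommandFailed,
    }
}
```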

Zig successfully builds my project end-to-end now! :partying_face:

(If I never have to write another line of cmake in my life, I’ll die slightly happier)

1 Like

What about a complicated pre-build step that is:

  • already written in Zig
  • placed in an external repo

A typical example: a Zig source-code generator written in Zig.

Of course I can build executables for different targets and fetch them during the build (like build_tigerbeetle_executable_get_objcopy), but it’s already Zig.

So, during the build, I’d like to:

  • build the code generator from the external Zig repo
  • run it as a pre-build step

But how to do this?

It is essentially the same as this, just that you put the generator inside a dependency: https://ziglang.org/learn/build-system/#generating-zig

It doesn’t really matter whether you generate code or data, the generator gives you a lazy path for the output, which then can be used in a module import, the generator gets automatically compiled and run, when it is setup that way.
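As a hedged sketch of that guide’s pattern (`tool` and `exe` are assumed to be defined elsewhere in build.zig):

```zig
// Run the generator; it writes its output to a path the build system picks.
const tool_run = b.addRunArtifact(tool);
const generated = tool_run.addOutputFileArg("generated.zig");

// The resulting LazyPath can then be wired up as a module import, so the
// generator is compiled and re-run automatically whenever needed.
exe.root_module.addAnonymousImport("generated", .{ .root_source_file = generated });
```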

Possibly I don’t understand this magic:

    const tool = b.addExecutable(.{
        .name = "generate_struct",
        .root_source_file = b.path("tools/generate_struct.zig"),
        .target = b.graph.host,
    });

In this example, the file tools/generate_struct.zig exists in the local repo.

In my case, all of the generator’s source code lives in a remote repo.

Looks like I need to do something similar to Fetching dependencies without build.zig.zon?

No, the generator dependency creates an exe artifact and your downstream project uses that artifact; just make sure that you pass b.graph.host as the target.

Start with building the generator as if it was a normal application project.
Then in the downstream project get the artifact and pass it to addRunArtifact.
Something like this:

const generator_dep = b.dependency("generator", .{
    .target = b.graph.host,
    .optimize = optimize,
});
const tool_step = b.addRunArtifact(generator_dep.artifact("generator"));

1 Like

I have to digest this

FWIW, I would really like a build step type that directly runs a user-provided Zig function at build time, instead of having to delegate the job to an ad-hoc compiled executable. More generally, I would like to be able to extend the Zig build system with user-provided custom build steps, which could come from the top-level build.zig or from a dependency.

It’s not clear to me what this would mean. Could you illustrate it with a pseudo-build.zig? As in, assume you have the functionality you want, what would calling it look like?

Also not clear to me what this means. What kind of build step would you want to create? Say the function b.addCustomBuildStep existed: what’s the signature? How do we use it?

It’s not clear to me what this would mean. Could you illustrate it with a pseudo-build.zig? As in, assume you have the functionality you want, what would calling it look like?

I’ll give it a shot, because this also seems like a useful feature to me.

Instead of:

const patch_mod = b.createModule(.{
    .root_source_file = b.path("path/to/main.zig"),
    .target = b.graph.host,
    .optimize = optimize,
});
const patch_exe = b.addExecutable(.{
    .name = "patch",
    .root_module = patch_mod,
});

const patch_run = b.addRunArtifact(patch_exe);
patch_run.addArg( "123" );
patch_run.addArg( "234" );

Something like:

const function_run = b.addRunFunction( func, .{
   // some struct specifying input and output arguments for caching purposes
} );

fn func( args: b.RunFunctionArgs ) void {
   // Would have been the main function in main.zig before
}

It probably wouldn’t need the full flexibility of addRunArtifact, but when all you want is to compile and run a single zig function for the host, having to split it off into a self contained file/module gets quite busy.

2 Likes

Something along the lines of (I’m cheating a bit by using a fictional ‘arrow-function syntax’, but that should just resolve to a function pointer):

const step = b.addFunction(.{
    .func = fn (args: MyArgs) void {
        // this code runs when the step executes
    },
    .args = .{
        .input_path = b.path(...),
        .output_path = b.path(...),
    },
});

…there would need to be some duck-typing when passing the args item, and there may need to be a way to define file dependencies so that the step only runs when needed.

About custom build steps I haven’t made up my mind yet… but maybe something like:

// first register the custom step somewhere in the build declaration
// (including package build.zigs)
b.registerCustomStep(.{
    .name = "MyBuildStep",
    .impl = .{
        // an 'interface' with function pointers?
    },
});

// and somewhere else, use the custom step
const myStep = b.addCustomStep("MyBuildStep", .{ ... });

…there is some overlap with the above addFunctionStep() though. I think the most important feature of a custom build step is that dependency package build.zigs could make custom features available to the toplevel project without the toplevel project build.zig having to import the package’s build.zig (e.g. an engine build.zig could make asset exporter build steps available to the toplevel game project).

PS: two examples where I would really like a custom build step are:

  1. wrapping the Emscripten linker into a build step instead of exposing a helper function in an external build.zig (pacman.zig/build.zig at 71bdbbb509a176502ed5ee7ff26141bc75b43a4e · floooh/pacman.zig · GitHub)
  2. wrapping the Sokol shader compiler into a custom link step instead of calling a helper function in a foreign build.zig (pacman.zig/build.zig at 71bdbbb509a176502ed5ee7ff26141bc75b43a4e · floooh/pacman.zig · GitHub)

…both of those helper functions are wrappers around b.addSystemCommand(), but the code required to setup the system command step is non-trivial and shouldn’t need to be replicated in top level build.zigs, while at the same time directly importing a dependency build.zig to call such helper functions feels like a hack/workaround (it’s great that it works though).
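For context, the workaround being described looks roughly like this (a sketch; importing a dependency’s build.zig by its build.zig.zon package name does work, but the helper function name here is invented for illustration):

```zig
const std = @import("std");
// Resolves to the "sokol" dependency's build.zig, because "sokol"
// is listed in this project's build.zig.zon:
const sokol_build = @import("sokol");

pub fn build(b: *std.Build) !void {
    // ...
    // Call a helper exported from the dependency's build.zig
    // (hypothetical name):
    // try sokol_build.compileShaders(b, .{ ... });
    _ = b;
}
```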

3 Likes

Repo of the generator is lcm2zig; for now it just prints to stderr and stdout:

pub fn main() !void {
    std.debug.print("\n\nSkeleton of generator\n\n", .{});

    const stdout_file = std.io.getStdOut().writer();
    var bw = std.io.bufferedWriter(stdout_file);
    const stdout = bw.writer();

    try stdout.print("\n\n--------------------------- RUN --------------------\n\n\n\n", .{});

    try bw.flush(); 
}

Dependency in the local build.zig.zon:

    .dependencies = .{
        .lcm2zig = .{
            .url = "git+https://github.com/g41797/lcm2zig#8eaa99e9ede45d1721069f5329dc1a88aa5494db",
            .hash = "lcm2zig-0.0.0-6QdwBIwpAAAfO039c-8sqjmHFCQgmvbChBQRt8dv2GMa",
        },
    },

Local build.zig:

    const lib = b.addLibrary(.{
        .linkage = .static,
        .name = "lcmz",
        .root_module = lib_mod,
    });

    const generator_dep = b.dependency("lcm2zig", .{
        .target = b.graph.host,
        .optimize = optimize,
    });

    const tool_step = b.addRunArtifact(generator_dep.artifact("lcm2zig"));
    lib.step.dependOn(&tool_step.step);

Result of zig build --summary all:


Skeleton of generator

--------------------------- RUN --------------------

Build Summary: 7/7 steps succeeded
install cached
├─ install lcmz cached
│  └─ zig build-lib lcmz Debug native cached 9ms MaxRSS:39M
│     └─ run lcm2zig success 127us MaxRSS:1M
│        └─ zig build-exe lcm2zig Debug native cached 10ms MaxRSS:39M
└─ install lcmz cached
   └─ zig build-exe lcmz Debug native cached 10ms MaxRSS:39M

So the order of activities is as expected:

  • zig build-exe lcm2zig
  • run lcm2zig
  • zig build-lib lcmz
  • install lcmz

Many thanks!

Ok, that helps me get the picture, thanks.

What about something like this?

_ = b.addNamespaceModule("build_helper", .{
    .target = target,
    .optimize = optimize,
    .root_namespace = build_helpers,
});

Then outside of the build function itself:

const build_helpers = struct {
     // Put the build module namespace here
};

Kind of hand-wavey, but this could be imported in anyone else’s build.zig and do things.

My problem every time I try to do something non-trivial with the build system is the near total lack of documentation. I end up spending a lot of time reviewing the doc comments and source code just to figure out what’s even potentially possible, and that has limits. Much of what I’ve figured out how to do is just cargo-culted from other people’s build files.

For instance, I don’t understand how you’re able to @import("sokol") within build.zig in that first link. Does the build system just let you do an import of anything in the build.zig.zon, as a treat?

How is someone supposed to figure that out? Is it line 2541 of Build.zig where you discover that? Because it isn’t in the documentation.

I just figured out that I can export a lazy path from a dependency, which is handy, I was able to update ZTAP so that users can import the sample test runner from there and don’t have to make a separate source file in their own repos. I found that little affordance at random, while researching something else (a something else I was not able to find a way to do, btw).

But it sure would be nice if there were some documents available. The official docs don’t even cover importing dependencies, which is, frankly, pathetic.

3 Likes