Invalidate cache if a file is missing

tsdtas · May 7, 2025, 7:08am

My build script generates some files that get loaded into my application at build/compile time with @embedFile. I need to keep the files accessible to some other system tools as well, so they get generated in the source tree instead of directly in the cache.

To ensure caching works, the generating Zig exe works like this:

Check if the file exists
If it does not, generate it
Copy the file to an addOutputFileArg-added cache location

The problem: If I manually delete the file, I still have it stored in Zig’s cache, because the build system does not know about the ‘real’ location of the generated files.

Is there a way for me to either

Tell the build system that this file is an output of the step, without relying on a generated path like addOutputFileArg

or

Invalidate the cache for the generating step manually by checking if the file is missing

… or some other option I haven’t thought of.

squeek502 · May 7, 2025, 7:18am

Unless I’m misunderstanding, Mutating source files in place from the Zig build system guide should work.

swenninger · May 7, 2025, 7:23am

Could you maybe generate the files in the cache and add an installFile[WithDir] step for the relevant files to copy them to some other directory for your other tools? Or would this only be able to put stuff in the actual zig-out directory?
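A minimal sketch of that idea, assuming the same 0.14-style build API used elsewhere in this thread. The generator path generator/main.zig and the destination generated/filename.bin are illustrative placeholders, and the generator is assumed to write its output to the path it receives as its first argument. b.addInstallFile copies relative to the install prefix, which is zig-out unless overridden with --prefix.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    // Hypothetical generator executable, built for the host machine.
    const generator_exe = b.addExecutable(.{
        .name = "generator",
        .root_module = b.createModule(.{
            .root_source_file = b.path("generator/main.zig"),
            .target = b.graph.host,
        }),
    });

    // Run the generator; the build system picks the output path inside the cache.
    const generator = b.addRunArtifact(generator_exe);
    const cached_file = generator.addOutputFileArg("filename.bin");

    // Copy the cached file to <prefix>/generated/filename.bin on `zig build`.
    const install = b.addInstallFile(cached_file, "generated/filename.bin");
    b.getInstallStep().dependOn(&install.step);
}
```

With this layout, deleting the installed copy and rerunning zig build should restore it from the cache, since the install step copies the file again without needing to rerun the generator.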

tsdtas · May 7, 2025, 7:25am

That does seem to be the exact thing I need.

Thank you for the link.

Will mark solved when I get it working.

tsdtas · May 7, 2025, 8:18am

After a little digging around the standard library to see what the functionality is called in the current build API, here’s the solution I ended up with:

// Create build step and file path
const update_generated = b.step("update-generated", "Generate or update required build files");
const generated_file_path_raw = "generated/filename.bin"; // relative to build root
const generated_file_path = b.path(generated_file_path_raw);

// Compile and run the file generator
const generator_mod = b.createModule(.{
    .root_source_file = b.path("generator/main.zig"),
    .target = b.graph.host,
    .optimize = optimize,
});

const generator_exe = b.addExecutable(.{
    .name = "generator",
    .root_module = generator_mod,
});

const generator = b.addRunArtifact(generator_exe);
const cached_file = generator.addOutputFileArg("generated");

// Copy the file from cache to source tree
const usf = b.addUpdateSourceFiles();
usf.addCopyFileToSource(cached_file, generated_file_path_raw);

// Update build graph
update_generated.dependOn(&usf.step);

// < --------------- snip ------------------------ >

// make filename.bin available for @embedFile usage
some_module.addAnonymousImport("filename", .{
    .root_source_file = generated_file_path,
});

Running zig build to build some_module now fails if generated/filename.bin doesn’t exist, until you run zig build update-generated, which creates the file.

The documentation @squeek502 linked mentions you shouldn’t do this as part of the normal build process, since it can mess up the cache, so I haven’t - this is acceptable behaviour for my use case.

Hope the solution is useful to someone else.

Edit: I was a tad quick declaring this a solution.

The code does what it says on the tin, so I’m leaving the example as is. However, it only fetches the files required by the rest of the build system from the cache if they’re missing; it doesn’t actually rerun the generating step, which happens to be important in this case. So it’s only half the solution to my problem.

If the generated file exists in the source tree, the cached version is fine. If it is missing from the source tree, the step needs to be rerun, regardless of cache state.

How can I tell Zig this?

Tosti · May 8, 2025, 8:30am

Just to ensure this is not an XY problem: are these tools called from some step in build.zig? Can these tools accept a file path as an argument? If so,

const tool = b.addSystemCommand(...);

creates a Run step, and

tool.addFileArg(cached_file);

adds the path of the generated file in the cache as an argument. In this case, copying to the source tree is not required.
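Put together, this could look roughly like the sketch below. The command name verify-tool and the step name verify are placeholders rather than anything from the original post, and the generator is assumed to live at generator/main.zig and to write to the path passed as its first argument.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    // Hypothetical generator executable, built for the host machine.
    const generator_exe = b.addExecutable(.{
        .name = "generator",
        .root_module = b.createModule(.{
            .root_source_file = b.path("generator/main.zig"),
            .target = b.graph.host,
        }),
    });

    const generator = b.addRunArtifact(generator_exe);
    const cached_file = generator.addOutputFileArg("filename.bin");

    // "verify-tool" stands in for whatever external program consumes the file.
    const tool = b.addSystemCommand(&.{"verify-tool"});
    // Passing the LazyPath makes the tool step depend on the generator step.
    tool.addFileArg(cached_file);

    // Expose it as `zig build verify`.
    const verify = b.step("verify", "Run the external tool on the generated file");
    verify.dependOn(&tool.step);
}
```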

tsdtas · May 8, 2025, 8:52am

They are not called from a build step, no.

The step I want to cache is generating some mock data for testing an embedded Linux application, which gets loaded directly into the application with @embedFile.

A copy of the data needs to be accessible on the host machine to verify the test results, and the tool in question expects the data available as files. The test harness itself is not built or run with Zig, and rewriting the test tool in Zig is a bit out of scope.

Well, at the moment.

FWIW, it’s not blocking for the project. It’s just a minor annoyance, and I’m wondering whether I can massage the build system into fixing it, or whether this is something that’ll be fixed when we (eventually) start porting our test suite to Zig as well.

Sze · May 8, 2025, 1:47pm

You could just start by running the existing test suite as a system command from your build.zig. If you already run tests, I imagine there should be some way to instead run a zig build test-suite command, which then calls the previous command and passes it some of the paths. That seems like the easiest option for bridging the gap.

Another option would be to actually install the files into the zig-out folder and then point your test-suite at that folder.

It is unclear to me how you generate your file. You only describe the outputs, but what is the input for generating the file? If the file is generated based on some other file, you should declare that other file as an input file to the generation step. That way you would have automatic dependency tracking, where the file gets regenerated when the input file changes.

Without something like that, the build system can’t know when it should regenerate your file, because it is unclear how it is generated and when it needs to be regenerated. This is the case, for example, if the generator hides its inputs by hardcoding them into the executable, or by doing its own requests or side effects.
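As a rough sketch of what declaring inputs could look like, assuming the 0.14-style build API: the file names generator/input.json and generator/schema.json are made up for illustration, and the generator is assumed to take the input path and then the output path as arguments.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    // Hypothetical generator executable, built for the host machine.
    const generator_exe = b.addExecutable(.{
        .name = "generator",
        .root_module = b.createModule(.{
            .root_source_file = b.path("generator/main.zig"),
            .target = b.graph.host,
        }),
    });

    const generator = b.addRunArtifact(generator_exe);

    // An input passed on the command line: the Run step tracks its contents,
    // so the step reruns when this file changes.
    generator.addFileArg(b.path("generator/input.json"));

    // An input the generator opens on its own, without receiving it as an
    // argument, can still be declared so that changes invalidate the cache.
    generator.addFileInput(b.path("generator/schema.json"));

    const cached_file = generator.addOutputFileArg("filename.bin");

    // Install the result so the sketch has a consumer for the output.
    const install = b.addInstallFile(cached_file, "generated/filename.bin");
    b.getInstallStep().dependOn(&install.step);
}
```

This only covers inputs that exist as files. Inputs baked into the generator binary are covered indirectly, because a rebuilt generator executable also changes the cache key of the run step.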

Can you describe a bit more how your generator works?