Passing build artifacts between C code

pvande · December 11, 2024, 6:27pm

I’m working on a project that is made up of two different compilation units, a static library and an executable — both written in C. Rather than linking against the static library, however, the executable uses #embed to statically store the library code.

The documentation for the Zig Build System has a great example of doing something very similar from Zig code, using the @embedFile builtin and an anonymous import, but #embed isn’t import-aware, and the library’s filename isn’t well-known.

I’ve got a few thoughts on how to accomplish my goals, but I’m not sure how exactly to enact them.

I could communicate the path to the compiled library via a C define/macro.
- To do this, the executable target would need to be configured after the library build finished, which I can’t find a way to do.
I could install or copy the library to a location specified in the build script, and communicate that location to the executable via a C define/macro.
- I can’t seem to find a way to do that install/copy in a way that doesn’t also change the executable’s install dir.
I could leverage the anonymous import approach, if Zig’s C compiler exposed that path to me as a define/macro.
I could rewrite the executable in Zig, if the C translation layer didn’t choke on its dependencies’ headers.

I’m hopeful that someone here can either give me other ideas to try, or fill in some gaps in my understanding.

Sze · December 12, 2024, 5:13am

Hi @pvande welcome to Ziggit!

I am a bit confused about what you are trying to do. @embedFile is usually used to embed static data into the executable, and I wouldn’t assume that the data is marked as executable; my assumption is that it is mapped as read-only, but not executable. (I haven’t verified that, but it seems logical given how I have seen it used in practice: so far always with data, never with code.)

Usually C code is used with Zig by either statically or dynamically linking it, where static is probably the more popular default.

I have trouble understanding what the point of that would be. If the code is available to be embedded, wouldn’t it also be possible to just statically link it?
It seems unnecessarily complicated to embed it instead.

Not saying you shouldn’t do it, I just wonder what would be the benefit of that approach?

If you create a build step that creates the library and you get a lazy path to the result, you should be able to use that as the .root_source_file of the anonymous import, and the build system would automatically ensure that it is built first.

I have trouble understanding your project structure. I think it would be helpful if you created a small example project that illustrates what you are doing, for example by creating an example repository.

At the moment I don’t understand how you compile your C code. Do you use the Zig build system?

pvande · December 12, 2024, 5:50am

I am a bit confused about what you are trying to do. @embedFile is usually used to embed static data into the executable, and I wouldn’t assume that the data is marked as executable; my assumption is that it is mapped as read-only, but not executable.

That’s correct; I’m trying to bundle this library into my executable as data.

I have trouble understanding what the point of that would be. If the code is available to be embedded, wouldn’t it also be possible to just statically link it? It seems unnecessarily complicated to embed it instead.

The executable in my project doesn’t run any of the library code itself. The compiled library is just binary data that the executable writes to the filesystem later.

If you create a build step that creates the library and you get a lazy path to the result, you should be able to use that as the .root_source_file of the anonymous import, and the build system would automatically ensure that it is built first.

The issue isn’t dependency ordering; that was fairly straightforward. It’s more that, AFAICT, the only mechanism I have for communicating the library Compile step’s output filename to the executable Compile step is C defines/macros, and I haven’t worked out how to configure that define/macro with data from a LazyPath.

I have trouble understanding your project structure. I think it would be helpful if you created a small example project that illustrates what you are doing, for example by creating an example repository.

The core is pretty simple:

// src/library.c
#include <stdio.h>

void foo() {
  puts("Hi there!");
}

// src/executable.c
char library[] = {
#embed LIBRARY
};

int main(int argc, char *argv[]) {
  // Do things with the `library` bytes.
}
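
The "do things with the `library` bytes" step, which here means writing the embedded data out to the filesystem, could be sketched roughly as follows. This is only an illustration under stated assumptions: `write_blob` is a hypothetical helper name, and a few literal bytes stand in for `#embed LIBRARY` so the snippet compiles on compilers without C23 `#embed` support.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in payload. In the project described above this array would be
   filled by `#embed LIBRARY`; literal bytes are used here (an assumption
   for illustration) so the sketch builds without C23 #embed support. */
static const unsigned char library[] = { 0x7f, 'E', 'L', 'F', 0x02, 0x01 };

/* Hypothetical helper: write `len` bytes from `data` to the file at `path`.
   Returns 0 on success and -1 on any I/O error. */
static int write_blob(const char *path, const unsigned char *data, size_t len) {
    FILE *f = fopen(path, "wb"); /* binary mode so bytes are not translated */
    if (f == NULL) {
        return -1;
    }
    size_t written = fwrite(data, 1, len, f);
    /* fclose can report a failed flush, so its result is checked as well. */
    if (fclose(f) != 0 || written != len) {
        return -1;
    }
    return 0;
}
```

From `main`, a call such as `write_blob("library.out", library, sizeof library)` would then produce the file; the output name is just an example.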

At the moment I don’t understand how you compile your C code, do you use the Zig build system?

I’ve been trying to move to the Zig build system; previously, I’d been scripting compilation by hand, which looked something like this:

clang src/library.c -o library.out -dynamiclib -Wl,-undefined,dynamic_lookup
clang src/executable.c -o executable -DLIBRARY='"library.out"'

castholm · December 12, 2024, 9:43am

Here’s an idea:

/* library.c */
/* (the actual contents of this file are irrelevant) */

// embed_glue.zig
const library = @embedFile("library");
export const library_ptr = library.ptr;
export const library_len = library.len;

/* executable.c */
#include <stdio.h>

extern const char *library_ptr;
extern size_t library_len;

int main(void) {
    /* print the first 16 bytes of the library binary */
    for (size_t i = 0; i < library_len && i < 16; i++) {
        printf("%02hhx", library_ptr[i]);
    }
    printf("\n");
    return 0;
}

// build.zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const library = b.addSharedLibrary(.{
        .name = "library",
        .target = target,
        .optimize = optimize,
        .link_libc = true,
    });
    library.addCSourceFile(.{ .file = b.path("library.c") });

    const embed_glue = b.addStaticLibrary(.{
        .name = "embed_glue",
        .root_source_file = b.path("embed_glue.zig"),
        .target = target,
        .optimize = optimize,
    });
    embed_glue.root_module.addAnonymousImport("library", .{
        .root_source_file = library.getEmittedBin(),
    });

    const executable = b.addExecutable(.{
        .name = "executable",
        .target = target,
        .optimize = optimize,
        .link_libc = true,
    });
    executable.addCSourceFile(.{ .file = b.path("executable.c") });
    executable.linkLibrary(embed_glue);

    b.installArtifact(executable);
}

I think it’s easier to let the code speak for itself, but some key takeaways:

- You can obtain a lazy path to a compiled artifact via compile_step.getEmittedBin().
- A Zig module is (more or less) just an alias for a root source file, which doesn’t have to be Zig code and can be arbitrary data.
- You can @embedFile modules.
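
To make the symbol contract between embed_glue.zig and executable.c more concrete, here is a C-only stand-in for the glue object. It is a sketch, not part of the build above: the `blob` bytes are made up, `hex_prefix` is a hypothetical helper, and `unsigned char` is used instead of the `char` in executable.c to keep the hex formatting simple (the matching declarations on the consumer side would need the same types).

```c
#include <stddef.h>
#include <stdio.h>

/* Made-up payload; in the real build these bytes come from @embedFile. */
static const unsigned char blob[] = { 0xde, 0xad, 0xbe, 0xef };

/* The two symbols the executable looks up at link time: a pointer to the
   first byte and the byte count. */
const unsigned char *const library_ptr = blob;
const size_t library_len = sizeof blob;

/* Hypothetical helper: format up to `max` leading bytes of `data` as
   lowercase hex into `out`, never writing more than `out_size` bytes.
   Returns the number of input bytes that were formatted. */
static size_t hex_prefix(char *out, size_t out_size,
                         const unsigned char *data, size_t len, size_t max) {
    size_t n = len < max ? len : max;
    /* Each byte takes two characters, plus one byte for the terminator. */
    while (n > 0 && 2 * n + 1 > out_size) {
        n--;
    }
    if (out_size == 0) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        snprintf(out + 2 * i, 3, "%02x", (unsigned)data[i]);
    }
    out[2 * n] = '\0';
    return n;
}
```

Calling `hex_prefix(buf, sizeof buf, library_ptr, library_len, 16)` mirrors the "first 16 bytes" loop in executable.c, but produces a string instead of printing directly.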

pvande · December 12, 2024, 5:12pm

Fantastic — that’s the option I couldn’t see! Thank you!