Side Effects / IO in Zig build system

Something I see come up again and again is package managers allowing arbitrary IO to be executed as part of installing a package. This is pretty common in languages like Python and JS, where a package will include some C code it calls out to for the obvious performance reasons. To support this, the language's package manager has hooks to run make, etc.

Of course, this is a huge security issue. A package that gets compromised will then be able to run whatever malicious code it wants just by you running your normal dependency install steps. (AFAIK, this is part of what happened with the recent compromised LiteLLM packages.)

My thoughts here are

  1. It seems like the Zig build system does allow arbitrary IO in my build script. When I include others’ packages as dependencies, does this run their build script at all, or just include some code? Does this have the same general threat vector, or are we generally okay here?
  2. If this isn’t a problem the Zig build system allows, maybe there’s something interesting here for other languages’ package managers to take advantage of. E.g. imagine a JS package installer that embeds Zig and allows packages to include what they need, so that Zig could compile the lower-level dependency for the end user’s system, but without opening the door to arbitrary IO.

There has been some previous discussion around this. In short, the compiler team is aware of this and there is discussion around how to make this better.

Thanks for the link

Sandboxing it will not really help here.

The problem here is that you can run arbitrary code during compilation which can do IO.

Even if the build.zig gets fully sandboxed from everything (filesystem, network etc.), this will still be possible by just putting the code into an executable which gets run as a dependency in the tree of the module it exposes.

Let’s say you make a library and expose a module called “super”:

const super_mod = b.addModule("super", ...);

The projects depending on you can then get it via dep.module("super").
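For context, a hedged sketch of what that looks like on the consumer’s side (assuming a recent Zig, roughly 0.12+; the dependency name `super_lib` and the variables `target`/`optimize` are illustrative):

```zig
// build.zig of the consuming project (sketch, not a complete file).
const dep = b.dependency("super_lib", .{
    .target = target,
    .optimize = optimize,
});

const exe = b.addExecutable(.{
    .name = "app",
    .root_source_file = b.path("src/main.zig"),
    .target = target,
    .optimize = optimize,
});

// Importing the module pulls in its entire step graph, including any
// run steps the library author attached to it.
exe.root_module.addImport("super", dep.module("super"));
```

The key point for the security discussion is that last line: you ask for a module, but you transitively get every build step it depends on.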

Now somebody malicious can go and make it run something during the make phase when you depend on that module:

const malicious = b.addExecutable(...);
const run_malicious = b.addRunArtifact(malicious);
const malicious_dummy = b.addObject(...);
malicious_dummy.step.dependOn(&run_malicious.step);
super_mod.addObject(malicious_dummy);

And that’s not really something you can prevent as a buildsystem.

After all there are also (many) valid use cases for this with the most commonly known probably being including the commit hash in the executable or running a code generator.
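A hedged sketch of the commit-hash use case, to show why this pattern is legitimately common (assumes `git` is on PATH and a recent Zig; exact API details vary between versions):

```zig
// build.zig fragment: capture the current commit at build time and
// expose it to the program as a build option.
const raw = b.run(&.{ "git", "rev-parse", "--short", "HEAD" });
const hash = std.mem.trim(u8, raw, " \t\r\n");

const opts = b.addOptions();
opts.addOption([]const u8, "commit_hash", hash);
exe.root_module.addOptions("build_info", opts);

// In src/main.zig:
//   const build_info = @import("build_info");
//   std.debug.print("built from {s}\n", .{build_info.commit_hash});
```

Spawning `git` here is exactly the kind of IO a sandbox would block, which is why removing it outright would break so many real builds.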

If you really want to close it, you would also need to remove addRunArtifact. And if you do that, you also make the buildsystem not usable in a LOT of contexts.

And even if you do this, a malicious actor can still put the malicious part into e.g. a test or just the regular software (after all, you will surely run it at some point even if only to test your software).

This is something you can only really solve by reviewing your dependencies.

Don’t see the build description as something separate from the rest of the code base, even if a lot of people like to do that. You need to keep it clean (and hopefully nice to read) just as well as the rest of the code base. It’s code. This goes for all buildsystems of all languages.


This is really the summation of it. While sandboxing and other strategies can help mitigate various attack vectors, in the end it is always going to come down to manual review of your dependencies and trusting them.

I think you could use some sort of capability model; the build system would need some different APIs to specify what certain artifacts are allowed to do. Then you could do something similar to what Fil-C does, just extended with some sort of permission system.
How Fil-C Works

I also think that something similar to DynamoRIO’s Program Shepherding could be useful:

Applying DynamoRIO to the security field resulted in a technique called program shepherding.[7] The program shepherding instrumentation monitors the origin of each program instruction and the control flow between instructions in order to prevent a security exploit from taking control of the program.

Secure Execution via Program Shepherding


I think another aspect is that you would have to clearly decide which parts are sandboxed and which aren’t, and where the border between them lies, and then have techniques that make sure no part that is supposed to be sandboxed can sneak into the non-sandboxed part. I think different capability-based or tracing techniques like the shepherding could work for that. (Or you might be able to run everything sandboxed at a much higher performance cost, or, with enough optimization, maybe not even that high.)

Because I am not super deep into the research in that area, I am sure that there are lots of other interesting techniques that could be explored that I have never heard of.

I think it is more a matter of how much work is required to use certain techniques and implement them. Long term I would expect that we can make these things more secure, maybe even without having to annotate too many pieces of code with what they are allowed to do.

While I agree that review and readable code is important, I think we also should look at automated tooling that is able to prove or trace things. I think following both paths and seeing which ideas can be implemented without too much overhead makes sense.

If those automated techniques can reduce the amount of code that could have exploits, then you need to review less. Not every program needs fine grained security, but for those that do, it would be cool to have a way to opt in to more sandboxing and more automatic runtime tracing.

The latter is done in DynamoRIO just by creating a dynamic copy of the original program which is basically jit re-compiled with tracing added.

Maybe we could eventually have a ReleaseSecure mode or something like that.

addRunArtifact already has, in most cases, information about what the inputs and outputs are. So you can restrict its access to those places, reducing the attack surface. There are cases which are impossible to sandbox, but in those cases it’s possible to provide escape hatches that you have to opt in to. By default, though, I think it’s a good idea if the build system reduces the attack surface and encourages good practices.
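For reference, a hedged sketch of a run step declaring its files up front, which is what would let a sandbox restrict it to exactly those paths (paths and names here are illustrative; `b.graph.host` is the host-target field in recent Zig versions):

```zig
// build.zig fragment: a code generator whose input and output are
// declared to the build system rather than accessed ad hoc.
const gen = b.addExecutable(.{
    .name = "gen",
    .root_source_file = b.path("tools/gen.zig"),
    .target = b.graph.host,
});

const run_gen = b.addRunArtifact(gen);
// Declared input: the build system knows gen reads this file...
run_gen.addFileArg(b.path("schema/api.json"));
// Declared output: ...and writes this one, into the build cache.
const generated = run_gen.addOutputFileArg("api.zig");

exe.root_module.addAnonymousImport("api", .{ .root_source_file = generated });
```

In principle a sandbox could confine this step to `schema/api.json` and the cache directory; the escape hatch would only be needed for steps that genuinely touch the network or arbitrary paths.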
