I’m trying to help Martin speed up Orca’s build. The problem I’m seeing is that zig’s build order randomization sometimes gets unlucky, like in this case where curl got built at the very end (see zig build-lib curl located in bottom right of image):
It’s unlucky because curl is only getting ~4 zig clang processes running in parallel, and there isn’t any other work left to saturate the rest of my 10 core machine. Everything else is already finished by the time curl starts compiling.
Compare to a “lucky” build where curl is started early and is able to overlap with more work (build completes ~5% faster):
So my question is: how can I influence the build order? I know about the --seed parameter, but aside from trying a bunch of seed values to find a lucky one (fragile), I’m not sure what other options there are.
What’s the total runtime difference between a lucky and an unlucky build?
This should only affect a build from scratch, right? While obviously it would be nicer for everything to be as fast as possible, this is something that you normally tend to pay for only once (…per Zig release, granted :^).
Do you have a situation where you pay this cost at a regular frequency?
I’m optimizing from-scratch builds, which happen on every cloud build, multiplied by the number of target platforms.
Yes, this is a “small” problem in Orca: a difference of only a few seconds. However, in the presence of larger libraries the difference will be bigger, and I’d like to have control over that. It only takes one long-running build-lib to affect build time variance.
I understand why shuffling build order can be desirable, but for cloud builds I’d prefer it use an optimal order.
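To make the “one long-running build-lib” point concrete, here is a toy model rather than a measurement. It assumes each build step occupies exactly one core and is handed to whichever core frees up first. The step durations and the `makespan` helper are made up for illustration, and the model ignores that curl itself fans out into ~4 zig clang processes.

```python
import heapq

def makespan(durations, workers):
    """Greedy list scheduling: each job, in the given order, starts on
    whichever worker becomes free first. Returns the total wall time."""
    free_at = [0.0] * workers           # time at which each worker is next idle
    heapq.heapify(free_at)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

# Hypothetical numbers: 40 one-second steps plus one 10-second step, 10 cores.
short_jobs = [1.0] * 40
long_job = [10.0]

unlucky = makespan(short_jobs + long_job, workers=10)  # long step starts last
lucky = makespan(long_job + short_jobs, workers=10)    # long step starts first

print(f"long step last:  {unlucky:.1f}s")
print(f"long step first: {lucky:.1f}s")
```

With these made-up numbers the unlucky order takes 14 s and the lucky one 10 s: the same work, but in the unlucky case the long step runs alone after everything else has drained. The gap grows with the length of the longest step, which is why a single big library can dominate the variance.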
Couldn’t you just build the platforms in parallel then? That should mitigate this effect.
This should not be the case, I see that Orca is using mlugg/setup-zig, which will make sure to save and restore the build cache. If you’re not seeing this happening on CI then something else is broken and needs to be fixed.
Yes, you’re right about Orca. The issue here is that I’m trying to learn how to solve a general problem, but I’m using Orca as a specific example.
Ignoring Orca and focusing on “can you control build order for cloud builds?”, it’s sounding like “you can’t” is the answer and one should use a build cache. Is that right?
Yes that’s right, currently there is no concept of step priority.