How stable is the package manager output and directory?

I have created a Zig dependency fetching mechanism for nixpkgs in the pull request “zig: replace generated dependencies with fixed-output derivations” (NixOS/nixpkgs #438523, by water-sucks). It uses the Zig package manager and zig build --fetch=all, but one concern is bit-for-bit reproducibility.

Since Zig is still pre-1.0, I am wondering if there are any present or future guarantees around the stability of the Zig package manager’s output, or if I should create a custom program that vendors this package directory in two stages, where:

  1. A custom fetcher creates a fixed-output derivation by fetching all packages recursively, similar to what the Zig package manager does now. This output remains stable across Zig versions, even if Zig itself changes how it fetches packages in future versions.
  2. A second step takes this stable directory and transforms it into a proper Zig package directory for whichever Zig version is required.
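
The two stages can be sketched in Python as below. This is a toy model rather than a working fetcher: the package records, the `stable_key` hash, and the two naming schemes (`scheme_a`, `scheme_b`) are hypothetical stand-ins and do not reproduce any real Zig hash format. It only shows that stage 1 can stay fixed while stage 2 varies per Zig version.

```python
import hashlib


def stable_key(files):
    """Version-independent key: SHA-256 over sorted (path, content) pairs."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode() + b"\0" + files[path] + b"\0")
    return h.hexdigest()


def stage1_vendor(packages):
    """Stage 1: build the stable vendor directory, keyed by stable_key.

    `packages` is a list of dicts with "name", "version" and "files".
    The result does not depend on any Zig version.
    """
    return {stable_key(p["files"]): p for p in packages}


# Hypothetical directory-naming schemes, one per Zig hash format.
def name_scheme_a(pkg, key):
    return "hex-" + key


def name_scheme_b(pkg, key):
    return f"{pkg['name']}-{pkg['version']}-{key[:16]}"


LAYOUTS = {"scheme_a": name_scheme_a, "scheme_b": name_scheme_b}


def stage2_layout(vendored, scheme):
    """Stage 2: rename the stable directory into a version-specific layout."""
    namer = LAYOUTS[scheme]
    return {namer(pkg, key): pkg["files"] for key, pkg in vendored.items()}
```

In this model only the output of `stage1_vendor` would need a fixed hash. `stage2_layout` is an ordinary deterministic transformation that can change with the Zig version without invalidating that hash.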

Thoughts on this?


It’s likely that hashes will keep changing. Guix implemented this by generating a build.zig.zon with the respective packages:


Unfortunately, the Guix approach is unlikely to work in Nix because of the way that fixed-output derivations (FODs) work.

I’m aware that the hashes will keep changing until 1.0. The problem in Nix land is that when the hashes change (as happened with Zig 0.15.0), the FOD is not rebuilt, since Nix considers the vendor directory to be already computed. Hence the need for bit-for-bit reproducibility of the FOD.

However, since the hashes themselves may change in the manifest, I am wondering the following. The only thing that the package manager is doing is:

  • Fetch dependency if needed
  • Calculate hash, move into Zig package store addressed by hash only
  • Filter out files not in the paths in build.zig.zon
  • Fetch dependencies of dependencies, as needed

in a loop.
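
As a rough illustration, the loop described above can be modelled in Python. This is a sketch under stated assumptions, not Zig's implementation: the `REGISTRY` dict stands in for the network and for each package's build.zig.zon, `content_hash` is a toy hash rather than Zig's real format, and the sketch assumes the hash is computed over the files that survive the `paths` filter.

```python
import hashlib

# Hypothetical in-memory "network": URL -> package contents. `paths` and
# `deps` stand in for what a real build.zig.zon would declare.
REGISTRY = {
    "https://example.invalid/app": {
        "files": {"build.zig": b"app build", "src/main.zig": b"app"},
        "paths": ["build.zig", "src"],
        "deps": ["https://example.invalid/lib"],
    },
    "https://example.invalid/lib": {
        "files": {"build.zig": b"lib build", "lib.zig": b"lib", "ci/run.sh": b"ci"},
        "paths": ["build.zig", "lib.zig"],
        "deps": ["https://example.invalid/util"],
    },
    "https://example.invalid/util": {
        "files": {"util.zig": b"util", "test/data.bin": b"\x00\x01"},
        "paths": ["util.zig"],
        "deps": [],
    },
}


def keep(path, include):
    """True if `path` is listed in `include` or lies under a listed directory."""
    return any(path == p or path.startswith(p + "/") for p in include)


def content_hash(files):
    """Toy content hash over sorted (path, bytes) pairs; not Zig's format."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode() + b"\0" + files[path] + b"\0")
    return h.hexdigest()


def fetch_all(root_url, registry=REGISTRY):
    """Return a package store {hash: files} holding root_url's transitive deps."""
    store = {}
    seen = set()
    pending = list(registry[root_url]["deps"])
    while pending:
        url = pending.pop()
        if url in seen:
            continue  # already fetched
        seen.add(url)
        pkg = registry[url]  # "fetch" the dependency
        files = {p: d for p, d in pkg["files"].items() if keep(p, pkg["paths"])}
        store[content_hash(files)] = files  # store addressed by hash only
        pending.extend(pkg["deps"])  # dependencies of dependencies
    return store
```

The store that `fetch_all` returns depends on the hash function, the filter rule, and the fetched bytes. A change to either of the first two between Zig versions would change the resulting directory even when upstream sources are identical.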

As far as I know, there are no plans to change this unless the package manager is completely redesigned, correct? If so, then I think my current approach using zig build --fetch=all is fine: even if package hashes change between versions, picking up the new hashes would require a Zig version upgrade anyway, because the build system itself has probably changed as well. As such, a policy that nixpkgs could adopt until Zig reaches 1.0 (and the hash format itself becomes stable, presumably) is to specify an explicit Zig version to build with.

I could be wrong, though. If I am wrong, and package manager output is going to change substantially besides hash format changes, then my two-stage program is the only thing that’s gonna work in the Nix context for fetching Zig dependencies without extensive FOD breakage. Not to mention, the second stage of transforming dependencies would have to vary based on the Zig version that is in use, so hopefully I am not wrong.


I don’t really know how nix works on a detailed level (only basic understanding).

Just asking some uninformed questions, hoping this could help make the situation more clear to me and possibly others reading this topic. Feel free to point out if those questions aren’t really relevant, or other concerns.

Why does it matter whether the package manager output changes?

As far as I understand Zig’s side of things it seems to me that a build.zig.zon describes the hash of all the dependencies and if any of those change then their hashes will become invalid (not matching the content) and once it is invalid there is no longer a point in trying to use it, instead you need to update the package.

If you want to get that hash for a zig application you could pretend you want to use it as a dependency from a freshly zig init-ed project by using zig fetch --save <project-repo> which then would download the project and calculate its hash.

I guess the part that I don’t quite understand is how the nix side deals with versioning: is a FOD supposed to stay valid no matter what, or could it get a version attached that represents the version of Zig’s package manager format/hash/impl?

There are packages that support multiple versions of Zig, so I think baking the Zig version into a flattened tree of packages would potentially unnecessarily multiply the number of packages that only differ in Zig versions that may not even differ at all in what they produce from a package management perspective.
Or it could restrict the Zig version number more than necessary.

Also, what is the level of reproducibility that nix cares about: is it only that artifacts are reproduced unchanged, or also that executing them produces the same outputs?

Would Zig itself be a nix package or not?

If the Zig standard library (or Zig creating different compiler outputs as the language implementation changes) can change independently of the Zig packages, then that could still result in programs producing different outputs, so I think to prevent that you theoretically would have to vendor the Zig version itself.
What is the goal here for nix, what needs to be frozen and what not?

And what is allowed to become obsolete/broken when things change (like a new package hash format, or other updates to Zig)?
Possibly requiring those packages to be updated…

Are there any easy to read guidelines about how things should be packaged for nix or is it a deep rabbit hole that requires hours of study?
If you have a pointer to something easy to read then I would take a look at it, but if it requires a lot of in-depth research, I can’t promise that I will spend huge amounts of time looking into it.


Why does it matter whether the package manager output changes?

A FOD has to be bit-for-bit reproducible because a fetcher relies on the hash of the content (see Advanced Attributes - Nix Reference Manual) to avoid re-computation. If the Zig package manager changes how its package structure is laid out, then all FOD hashes need to be updated, and finding out which FODs have changed outputs and which have not is a manual and pretty painful process. To my knowledge, there’s no good tooling that can automatically detect hash breakages, other than simply rebuilding the FOD with its hash set to a fake one to force re-computation and failure.
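
A simplified Python model of that behaviour is below. It is not Nix's actual store logic: the `Store` class, `tree_hash`, and the two layout functions with their directory names are hypothetical. It only illustrates why a changed fetcher output goes unnoticed while the old output is still cached.

```python
import hashlib


def tree_hash(tree):
    """Toy hash of a directory tree given as {path: bytes}."""
    h = hashlib.sha256()
    for path in sorted(tree):
        h.update(path.encode() + b"\0" + tree[path] + b"\0")
    return h.hexdigest()


class Store:
    """Outputs are looked up by their declared hash, as with a FOD."""

    def __init__(self):
        self.outputs = {}
        self.fetch_count = 0

    def realise(self, expected_hash, fetcher):
        if expected_hash in self.outputs:
            # Cache hit: the fetcher is never run, so a change in what it
            # would produce today is invisible.
            return self.outputs[expected_hash]
        self.fetch_count += 1
        tree = fetcher()
        actual = tree_hash(tree)
        if actual != expected_hash:
            raise ValueError(f"hash mismatch: expected {expected_hash}, got {actual}")
        self.outputs[expected_hash] = tree
        return tree


# Two hypothetical package-directory layouts for the same dependency.
def old_layout():
    return {"hex-abc123/lib.zig": b"lib"}


def new_layout():
    return {"lib-1.0.0-abc123/lib.zig": b"lib"}
```

With a warm store, `realise(tree_hash(old_layout()), new_layout)` returns the old tree without ever calling `new_layout`. With an empty store, the same call raises the hash mismatch. That matches the situation described above, where breakage only shows up once re-computation is forced.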

I guess the part that I don’t quite understand is how the nix side deals with versioning: is a FOD supposed to stay valid no matter what, or could it get a version attached that represents the version of Zig’s package manager format/hash/impl?

A FOD is supposed to refer to an immutable resource. If a URL changes under the hood, for example, it would not make sense to use this URL to create a FOD, for the reasons detailed above.

Currently, the way that Zig applications are packaged in nixpkgs is done with codegen using a tool called zon2nix, which generates Nix code by pre-fetching package URLs into the Nix store and computing their Nix hashes, and then creating a Zig package directory from that. This needs to be re-generated each time the build.zig.zon changes, which is not an automatic process, and it’s what I’m trying to avoid so that we can have transparent, simpler auto-updates for Zig-based packages in nixpkgs.

There are packages that support multiple versions of Zig, so I think baking the Zig version into a flattened tree of packages would potentially unnecessarily multiply the number of packages that only differ in Zig versions that may not even differ at all in what they produce from a package management perspective.

This is not really for the Zig side of packaging, but rather for system/distribution packaging, so I’m not really sure why this is a concern, since only one version of Zig is going to be used to build a given package for distribution. I’m not trying to restrict the Zig version number anywhere except for use within Nix and nixpkgs.

What is the goal here for nix, what needs to be frozen and what not?

The idea is that the package directory layout in $HOME/.cache/zig/p needs to have some reasonable guarantee of bit-for-bit reproducibility for me to use the current method I am using in nix to fetch Zig dependencies. That’s it. The Zig version itself doesn’t need to be vendored or anything like that, just the dependencies themselves in a stable format.

And what is allowed to become obsolete/broken when things change (like a new package hash format, or other updates to Zig)?

As far as I know, the only thing that has substantially changed in this package directory layout is the format of the hash used to address packages. This seems like acceptable breakage to me, because it would presumably require a Zig version update, and a Zig version update would require rebuilding FODs anyway: those packages would break entirely, and in a way that is detectable.

Imagine the following scenario:

  • Nix package example depends on zig (an alias to zig_0_13)
  • zig package alias gets updated from zig_0_13 to zig_0_15 in nixpkgs, which changes the package hashes
  • Package example now depends on zig_0_15, so in a rebuild, the package will be broken due to the old FOD with the old hashes still being used.

Now, the way to avoid this would be to require that a given package example name a specific Zig package version (such as zig_0_13) instead of using the zig package alias in nixpkgs, but I’m not sure if such a policy would be accepted or not.

If the hashes themselves were to stop changing (as I assume they would with a Zig 1.0 release), then this would not be a problem as far as I know, since no other changes have been made to the package directory’s layout since the package manager was released. My real question is whether I should be concerned about any other changes to this package directory.

if it requires a lot of in-depth research, I can’t promise that I will spend huge amounts of time looking into it.

No worries, I appreciate your questions and time taken already :}
