As we all know, we can use zig fetch --save ... to add a package to our project. It updates build.zig.zon with a hash field for each package, which indicates the package's integrity.
But my question is how can we ensure the hash field is right? How can we ensure there is no man-in-the-middle attack?
The package author can publish the right hash in the README, but there are two problems:
How can the author know the right hash? zig fetch --debug-hash <local path> is one solution, but it may include files ignored by git, so the author has to ensure that no unexpected files are included when calculating the hash, which is very error-prone.
If the package is hosted on GitHub and the readme file is included in src of build.zig.zon, there is a chicken-and-egg problem. If we update the hash in the README, the hash is no longer valid, since src has changed. Of course we can push a new commit to update the README, but I think that's kind of awkward.
When Zig fetches a package, either implicitly via the build system or explicitly via zig fetch, it unpacks the package to a directory in the global Zig cache, then computes the package hash and compares it with the declared hash. If the package contains a build.zig.zon manifest, only files matched by paths will be hashed, and all other files will be deleted.
zig fetch --debug-hash . can be used to obtain the hash of your package in advance, since it respects the paths rules.
The chicken-and-egg situation with the hash in the readme can be resolved by not including the readme in the paths rules. paths only needs to list files that have an effect on compilation. Files that don't affect the end user's ability to use the package (readmes, docs, CI scripts, etc.) don't need to be listed, though it might be wise to still include licenses and related notices. The comments in the zig init template even suggest this practice:
As a rule of thumb, one should list files required for compilation plus any license(s).
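As a rough illustration of that rule of thumb, a build.zig.zon fragment might look like the sketch below. The file and directory names (src, LICENSE, README.md) are assumptions about a typical project layout, not taken from any particular package, and the other manifest fields are left out on purpose.

```zig
.{
    // Other manifest fields (name, version, dependencies, ...) are
    // omitted from this fragment.
    .paths = .{
        // Files required for compilation.
        "build.zig",
        "build.zig.zon",
        "src",
        // License, kept so it ships with the package.
        "LICENSE",
        // README.md is intentionally not listed here, so editing it
        // (for example, to publish the hash) does not change the
        // package hash.
    },
}
```

With a paths list like this, the hash can be computed first and then written into the README without invalidating it.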
It makes no sense to put the hash in the README file. The README file is fetched the same way as the package contents, so if you trust one you trust the other, and if you do not trust one you do not trust the other.
When fetching a package for the first time, do it over a connection that you trust, such as HTTPS. This is called Trust On First Use. Once you have obtained the hash this way, it protects against tampering: any later fetch whose contents do not match the declared hash is rejected. This is why you can safely fetch Zig projects on untrusted networks.
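For reference, the pinned hash lives in the consumer's build.zig.zon next to the URL. The fragment below is only a sketch: the dependency name foo, the example.com URL, and the hash string are placeholders, not real values, and the other manifest fields are omitted.

```zig
.{
    // Other manifest fields are omitted from this fragment.
    .dependencies = .{
        // "foo" is a hypothetical dependency name.
        .foo = .{
            // Placeholder URL; use the package's real location.
            .url = "https://example.com/foo.tar.gz",
            // Placeholder; zig fetch --save writes the real hash here
            // the first time the package is fetched.
            .hash = "<hash recorded on first fetch>",
        },
    },
}
```

As long as this entry is committed alongside the project, every later fetch is checked against the same recorded hash.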