Bugs Rust Won’t Catch - Zig Strings

nm-remarkable · April 29, 2026, 4:47pm

I found the blog post “Bugs Rust Won’t Catch” (corrode Rust Consulting) extremely interesting in light of Zig’s choice not to have a String type. Many of the bugs in Rust’s uutils came from using String, with its UTF-8 encoding, for paths instead of the more correct array of u8.

Strings are a big pain point for most Zig users, and only after reading this article did I better understand why the choice was made not to have a String type in the standard library.

cryptocode · April 29, 2026, 5:19pm

Regarding the ‘comm’ CVE, perhaps std.fs.path.fmtAsUtf8Lossy should be nuked so as not to invite the same kind of path formatting problem. If someone insists, they can always reach into std.unicode, but at least then it’s not in the fs.path namespace.

squeek502 · April 29, 2026, 5:52pm

I’d argue that’d be learning the wrong lesson. IMO the lesson is really “you need to think things through when it comes to paths”; making any blind choice can and likely will lead to problems.

My full thoughts can be found here. This bit is particularly relevant:

Zig’s lack of a string type & invalid values

[…] there is no canonical/portable way to format an arbitrary path as valid UTF-8 (i.e. invalid UTF-8 sequences can be converted into � using a variety of algorithms, but the user cannot ever use that output to reconstruct the actual path)

For Zig, in #19005 I added std.fs.path.fmtAsUtf8Lossy and std.fs.path.fmtWtf16LeAsUtf8Lossy.

However, that may not be the way you want to print paths depending on your use case. For example, ls on Linux prints them shell-escaped:
$ touch `echo 'FF FF FF FF' | xxd -r -p`
$ ls
''$'\377\377\377\377'
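The point about lossy output not being reversible can be sketched in a few lines. This is only an illustration, written in Rust because that is the language of the code under discussion; the helper name `lossy` and the two file names are made up for the example. Two different invalid-UTF-8 names collapse to the same printed string, so the output cannot identify which file was meant:

```rust
// Hypothetical helper: format arbitrary bytes as UTF-8, replacing every
// invalid sequence with U+FFFD (the replacement character).
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // Two distinct (hypothetical) file names, neither of them valid UTF-8.
    let a: &[u8] = b"\xFF.txt";
    let b: &[u8] = b"\xFE.txt";

    // The raw names differ, but their lossy renderings are identical,
    // so the original bytes cannot be reconstructed from the output.
    assert_ne!(a, b);
    assert_eq!(lossy(a), lossy(b));
    println!("{} / {}", lossy(a), lossy(b));
}
```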

WeeBull · April 29, 2026, 6:02pm

Interesting blog post. I particularly like the acknowledgement of panic being a DDOS vector. Anybody remember the Cloudflare outage?

What’s also interesting is the discussion about bugs not shipped. It’s the standard fare of buffer overruns and null pointer dereferencing which Zig is also going to be pretty strong on.

That means, even if the tools were (and probably still are) buggy, they never had a bug that could be exploited to read arbitrary memory.

Ok, but in commands like pwd, ls, split, and od what information were you expecting an attacker to extract through a buffer overrun exploit? They don’t have access to anything except what you fed into them.

What’s left is, frankly, a more interesting class of bug. It lives at the boundary between our controlled Rust environment and the messy, chaotic outside world, where paths, bytes, strings, and syscalls are all tangled up in one eternal ball of sadness. That’s the new security boundary of modern systems code.

That was always the security boundary, which is what the chroot bug in the Rust re-write shows.

cryptocode · April 29, 2026, 6:11pm

Maybe, but since, as you state, there’s no canonical way to format arbitrary paths as UTF-8, having this one way in std still makes it easy to reach for when one perhaps shouldn’t. Perhaps clearer doc strings about the potential issues would be a compromise.

matklad · April 29, 2026, 6:14pm

FWIW, my personal lesson here is that good API design can’t prevent people from writing bad code. API-design-wise, Rust is great! It is very pedantic about what are strings, what are bytes, and what are OS-dependent paths. And the API SCREAMS at you if you use the wrong thing.

Like

// ra, rb are &[u8], raw bytes from the input files.
print!("{}", String::from_utf8_lossy(ra));
print!("{delim}{}", String::from_utf8_lossy(rb));

Who would’ve written that?! The only way this could be more obviously wrong is if your terminal gave an audible beep when you typed that, and the computer rebooted angrily. https://www.cve.org/CVERecord?id=CVE-2026-35375 is the same level of obviousness of brokenness.
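For contrast, a minimal sketch of the byte-preserving alternative: pass the raw bytes straight to the output stream instead of decoding them. The function name `write_pair` and the sample inputs are hypothetical, and this is not necessarily the fix uutils adopted:

```rust
use std::io::{self, Write};

// Hypothetical helper: write two raw fields separated by a delimiter.
// Nothing is decoded, so invalid UTF-8 passes through unchanged.
fn write_pair<W: Write>(out: &mut W, ra: &[u8], delim: &[u8], rb: &[u8]) -> io::Result<()> {
    out.write_all(ra)?;
    out.write_all(delim)?;
    out.write_all(rb)
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    // Sample inputs containing bytes that are not valid UTF-8.
    write_pair(&mut out, b"caf\xE9", b"\t", b"\xFF\xFE")?;
    out.write_all(b"\n")
}
```

Writing through any `Write` implementation also makes the behavior easy to check against an in-memory buffer.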

But, yeah, a lot of people do routinely write code like that. And people who don’t, arguably, don’t benefit that much from extra API type-safety.

I would still say that, on the margin, appropriate API salt & sugar can help, but that is surprisingly less effective than one would think.

It seems plausible that Zig’s transparent treatment of paths as bytes makes Zig programs more correct on Unix, but I am not sure that it doesn’t just exchange some bugs for others.

E.g., if I run zig fetch non-ascii-bytes, it sends an HTTP request with an invalid URL (so, violating the protocol, as far as I understand), instead of rejecting this early. Similarly, I’d be surprised if there are no subtle bugs with Zig making WTF-8 on Windows visible to users.

spiffyk · April 29, 2026, 8:34pm

The first chapter, “Don’t Trust a Path Across Two Syscalls”, is very important and not talked about enough, I think.

We should be teaching people in schools to never use a path twice if they wish to access the same file. Always prefer a file descriptor if you’re doing more than a single operation. In fact, working with paths in general should be avoided as much as possible if a different facility can do the job, because there are many edge cases and it’s very error prone, not to mention sensitive to OS changes. It’s generally just better to leave it to the OS to do it once, and then operate on the object that the OS gives you.
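A minimal sketch of the “open once, then use the handle” idea, in Rust since that is the language under discussion. The function name `read_regular_file` and the temporary file name are made up for the example. The path is resolved exactly once; the type check and the read both go through the open handle, so swapping the path in between cannot change which object is inspected:

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Hypothetical helper: resolve the path once, then query and read the
// already-open file instead of going back to the path.
fn read_regular_file(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // Metadata of the open handle (fstat on Unix), not of the path.
    let meta = file.metadata()?;
    if !meta.is_file() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "not a regular file"));
    }
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("fd_sketch_example.txt");
    std::fs::write(&path, b"hello")?;
    let data = read_regular_file(&path)?;
    println!("read {} bytes", data.len());
    std::fs::remove_file(&path)
}
```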

In that same vein, I really dislike the “Resolve Paths Before Comparing Them” rule from the blog. No. Instead of comparing paths, they should be calling stat on them and comparing inodes. User space resolution of paths in general is just not worth the hassle.
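Comparing by identity rather than by path text could look roughly like this on Unix. It is a sketch only: `same_file` is a hypothetical name, and it relies on the Unix-only `std::os::unix::fs::MetadataExt` trait. Two paths refer to the same file when both the device and the inode number match:

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Hypothetical helper: let the OS resolve both paths (stat follows
// symlinks), then compare device and inode numbers.
fn same_file(a: &Path, b: &Path) -> io::Result<bool> {
    let ma = fs::metadata(a)?;
    let mb = fs::metadata(b)?;
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}

fn main() -> io::Result<()> {
    // "." and the absolute current directory are spelled differently,
    // so a textual comparison would not treat them as equal.
    let cwd = std::env::current_dir()?;
    println!("{}", same_file(Path::new("."), &cwd)?);
    Ok(())
}
```

If the files are going to be used afterwards, calling `metadata()` on already-open `File` handles instead of on paths would avoid re-resolving the path later.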

I’m honestly a bit baffled that someone would set out to replace coreutils without this knowledge, which I feel is kind of crucial for this sort of endeavour.

EDIT: For these reasons I really take issue with the part of the post that says the codebase was “written by people who knew what they were doing”. I’m sorry, but no, it really wasn’t.

WeeBull · April 30, 2026, 11:37am

…and yet they shipped it as default in 25.10 (before this review), and are still pushing it to be the only version available in the future. IMHO Ubuntu needs to revert back to GNU coreutils and sudo.

I don’t understand why the Rust community has such damaging blind spots.

matklad · April 30, 2026, 12:03pm

Note that “Rust community” != “The set of people who historically worked on implementing rust coreutils” != “The set of people that decided to ship rust coreutils with Ubuntu”.

More generally, there isn’t such a thing as “programming language community”, and there’s especially no such thing as a “programming language community that can actually solve coordination problem and act cohesively with intention”.

spiffyk · April 30, 2026, 12:10pm

I just fail to see the rationale to push for uutils so much, other than “hurr, durr, C unsafe bad, Rust safe good”^[1]. IF there were frequent bugs stemming from memory safety issues in coreutils, I would kinda sorta understand, but for a project of its age they seem to be rather few and far between. I realize I’m probably preaching to the choir here, but am I missing something?

[1] But please do note that I have nothing in particular against Rust; I just don’t really work with it at the present time.

matklad · April 30, 2026, 12:39pm

I can see some:

For having uutils:

It’s MIT-licensed rather than GPL
It’s usable as a library (by, for example, nushell)

For carefully replacing coreutils in a distro with a Rust version:

unsafe bad. While older code generally has fewer vulns, this might also be a consequence of us being bad at discovering vulnerabilities? Given the recent deluge of AI-discovered weaknesses everywhere, I think this take aged way better than anyone could have imagined. I mean, given “Copy Fail: 732 Bytes to Root on Every Major Linux Distribution” (Xint), I would only be moderately surprised to learn that you can pwn GNU cat with tricky arguments.
I don’t think coreutils are completely frozen? They do get new features, and, with some amount of expected future development, it might make sense to go to Rust

For replacing coreutils in a distro with an obviously half-baked version:

Gets massively sketchier from here. But I can see how just throwing the code out there and making it everyone’s problem massively accelerates development speed, assuming we do want to migrate eventually. Yes, there’ll be a rocky start, but there’s going to be only a finite number of bugs, quickly fixed. Especially given that Rust is so much easier to hack on than older GNU projects.

WeeBull · April 30, 2026, 3:11pm

That’s a bad example for your case. It’s a logic bug in the kernel crypto API. To quote the author:

Copy Fail is a straight-line logic flaw. It triggers without races, retries, or crash-prone timing windows.

No programming language helps with logic bugs (not even Zig). My issue is that people talk about Rust code being “safe” or “unsafe” and don’t say they’re only talking about memory safety. Sure, that’s something, but it’s oversold. The emperor is only wearing lacy see-through underwear.

matklad · April 30, 2026, 3:31pm

That example was in support of the claim that the fact that few vulnerabilities were found in the past doesn’t guarantee that a lot of vulns won’t be found in the future, because we might get much better at finding them.

Though, I think it also supports the claim that absence of memory safety in C leads to security vulnerabilities?

The third call writes 4 bytes at offset assoclen + cryptlen, past the AEAD tag. The algorithm is using memory it does not own as a scratch pad.
….
In the AF_ALG in-place path, this write crosses from the output buffer into the chained page cache tag pages.

matklad · April 30, 2026, 3:41pm

The thing here is that people make all kinds of claims about all languages. You can’t really help this, that’s Kolmogorov’s zero-one law.

But you can choose whether you listen to the silliest claims, or the most enlightening ones. In my personal experience, people who know what they are doing are fairly pedantic and precise when they discuss Rust’s safety features, like memory safety, or the absolutely ground-breaking idea of making it hard to ignore errors.

nm-remarkable · April 30, 2026, 4:03pm

I think leaning on mockery when considering other people’s viewpoints can be an unfruitful path to take. In my view, programming languages can learn from each other, and patterns introduced by one language can unlock new approaches in others; for example, Rust’s newtype index pattern and its counterpart in Zig. That is to say: don’t dismiss other viewpoints so easily without even giving them a shot.

The rewrite of uutils is very interesting because the biggest reason to do it, in my view, is to learn from the process of rewriting and shipping it. I’m sure Ubuntu has gotten a lot of data on how to make future upgrades better, how to make compatibility testing better, what holes they have in their processes.

Two other major reasons, more on the uutils side than the distro side, are making coreutils available on more targets (Windows for one) and attracting more contributors. The second is the same reason Fish shell had for its rewrite, and it worked: Fish shell did get more contributors to join and help.

spiffyk · April 30, 2026, 4:45pm

Again, this time outside of a footnote, I’ve got nothing against Rust, not even against rewrites in Rust, I can see its clear benefits. I can also see the shortcomings of C, of which there are plenty (using it at my day job has me wishing every day for a migration to something better).

I did not really mean to target Rust itself (but yes, my mockery does put it across that way, sorry). What I don’t understand is the specific decision to ship a rewrite of coreutils, of all things, made by people clearly not as versed in systems programming (see above), in the LTS release of a very popular Linux distro. At least with their unsuccessful first Wayland push^[1] they targeted a non-LTS release and reverted it before the next LTS.

[1] And again, nothing against Wayland. I love Wayland, I use it and I’m never coming back to X11. It just really wasn’t ready back then.

castholm · April 30, 2026, 7:12pm

It’s about organizational politics and wanting to wield power and influence. I’m sure that there are many people involved on either side who advocate for their projects because they genuinely believe it’s what is best for the users/everyone, but it’s important to recognize that there’s more to it than technical arguments like “our software is more safe/performant/battle tested/maintainable/contributor friendly” and that neither side has purely altruistic motives.

(Note that I’m not judging anyone for this, desiring control is a universal human behavior.)

andrewrk · May 1, 2026, 12:27am

Ooh la la

chrboesch · May 1, 2026, 11:45am

This more or less forces other interested parties to participate. Or to switch distributions. For example, the French authorities have good programmers who are now being challenged. The same thing happened with Matrix, it’s now military standard.

unkempt6057 · May 2, 2026, 8:52pm

Not sure what you mean here. Is the French government now trying to patch these bugs in the Rust coreutils? Or migrate away from Ubuntu?