Zig’s lack of a string type & invalid values

It’s a thoughtful post! Regarding the String Question, I have little to add to what I said in that thread, other than to point out @dude_the_builder’s very nice string library.

Enums: Rust’s enum is more like Zig’s tagged union type combined with its enum type. In Rust you can have an enum like you describe, closer to C enums in spirit, or your enums can have a payload type. In Zig these are different things. I’ll stick to the Zig side of the equation for simplicity.

There are good reasons for an enum to treat every value outside its named set as invalid, one of which is exhaustive switching. The compiler will ensure that you cover every possible value of an enum when you switch on it, and if a new value is added and you didn’t use an else branch, you’ll be compelled to go back and add the new value to each switch statement. That’s a valuable property!
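To make that concrete, here is a minimal sketch. The Direction enum and the opposite function are made up for illustration; they aren't from the original post.

```zig
const std = @import("std");

// Hypothetical enum, used only for this illustration.
const Direction = enum { north, east, south, west };

// No else prong here. If a fifth field is ever added to Direction,
// this switch stops compiling until the new field is handled.
fn opposite(d: Direction) Direction {
    return switch (d) {
        .north => .south,
        .east => .west,
        .south => .north,
        .west => .east,
    };
}

test "every named value has a prong" {
    try std.testing.expect(opposite(.north) == .south);
    try std.testing.expect(opposite(.west) == .east);
}
```

Had the switch ended in an else prong, a newly added field would silently fall into it, and the compiler would have nothing to complain about.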

But enums don’t have to be exhaustive. You can just say

const Birbs = enum(u8) {
    robin,
    oriole,
    crow,
    _,
};

And now every possible octet is a potential Birb. You still get exhaustive switching over the named values: if I add .eagle, then switch statements which don’t use else have to add that. But there’s also a distinct switch prong, _, which covers every unnamed value the integer type can carry.
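Here is a sketch of what switching on that looks like. The describe function is hypothetical, and I'm assuming the builtin names from Zig 0.11 or later (@enumFromInt).

```zig
const std = @import("std");

const Birbs = enum(u8) {
    robin,
    oriole,
    crow,
    _,
};

// Hypothetical helper: every named field needs its own prong,
// and the _ prong picks up all the unnamed u8 values.
fn describe(b: Birbs) []const u8 {
    return switch (b) {
        .robin => "robin",
        .oriole => "oriole",
        .crow => "crow",
        _ => "some unnamed birb",
    };
}

test "unnamed values are still valid Birbs" {
    // 200 has no name, but it is a legal value of a non-exhaustive enum.
    const mystery: Birbs = @enumFromInt(200);
    try std.testing.expectEqualStrings("some unnamed birb", describe(mystery));
    try std.testing.expectEqualStrings("crow", describe(.crow));
}
```

Swapping the _ prong for else would also compile, but then adding .eagle later would not force anyone to revisit the switch.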

But in many cases there’s no advantage to this. If I have the classic .spade, .heart, .club, .diamond enum, well I might want to add .joker later, but eventually all the cards are in the pack, and if some integer value greater than 4 gets in there somehow, it’s a bug.

The reason I appreciate and prefer Zig’s slice-o-bytes approach to strings is that there are too many invariants, and some of them are incompatible with each other. Strings get really complex!

Sometimes, frequently even, the invariant that matters is the difference between “this string is a validated email address from the database” and “this string is user input which is supposedly an email but we haven’t checked yet”. Which is why I think that distinct types of some sort would be a strong addition to the language. No one’s worked out a really tight proposal there, though, and making a struct which holds a slice as a member is adequate and might be all we actually need.
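A rough sketch of the struct-holding-a-slice approach, under some loud assumptions: ValidatedEmail, validate, and domainOf are names I made up, and the check is deliberately naive (it only looks for an @ with something on both sides), nowhere near real email validation.

```zig
const std = @import("std");

// Hypothetical wrapper type. By convention you only obtain one through
// validate(); Zig struct fields are public, so nothing enforces this.
const ValidatedEmail = struct {
    bytes: []const u8,

    // Deliberately naive: requires an '@' that is neither the first
    // nor the last byte. A stand-in for whatever real check you need.
    fn validate(raw: []const u8) error{InvalidEmail}!ValidatedEmail {
        const at = std.mem.indexOfScalar(u8, raw, '@') orelse return error.InvalidEmail;
        if (at == 0 or at + 1 == raw.len) return error.InvalidEmail;
        return ValidatedEmail{ .bytes = raw };
    }
};

// Takes the wrapper rather than []const u8, so unchecked user input
// can't be passed in by accident.
fn domainOf(email: ValidatedEmail) []const u8 {
    const at = std.mem.indexOfScalar(u8, email.bytes, '@').?;
    return email.bytes[at + 1 ..];
}

test "only checked input reaches domainOf" {
    const email = try ValidatedEmail.validate("birb@example.com");
    try std.testing.expectEqualStrings("example.com", domainOf(email));
    try std.testing.expectError(error.InvalidEmail, ValidatedEmail.validate("not an email"));
}
```

The wrapper costs nothing at runtime beyond the slice itself; the type just records that the check already happened.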

I actually agree with you that Zig should make zeroed memory available through the allocator interface. I don’t think that the API should privilege that value, but as your linked post points out, modern hosted environments generally have cheap zero pages, and it’s an important optimization for some kinds of code.

Don’t agree at all that zeroInit should be a default, though. For one thing, pre-zeroed memory only applies to the heap, and only sometimes: the stack will very seldom already be zeroed out, and if the allocator is returning memory which was recently freed, that won’t be zeroed either. If we write something like a buffer which is declared inside of a while loop, then those semantics require the buffer to be zeroed out for each pass on the while loop, which is no bueno performance-wise.
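For a sense of what I mean by the loop case, here is a small sketch. sumChunks is a made-up function, and I'm assuming the slice-based @memcpy from Zig 0.11 or later.

```zig
const std = @import("std");

// Hypothetical example: sums the bytes of data, 64 at a time.
fn sumChunks(data: []const u8) u32 {
    var total: u32 = 0;
    var rest = data;
    while (rest.len > 0) {
        // Left undefined on purpose: only buf[0..n] is ever read, and those
        // bytes are written first. Zero-by-default semantics would have to
        // clear all 64 bytes on every pass for no benefit.
        var buf: [64]u8 = undefined;
        const n: usize = @min(rest.len, buf.len);
        @memcpy(buf[0..n], rest[0..n]);
        for (buf[0..n]) |b| total += b;
        rest = rest[n..];
    }
    return total;
}

test "buffer is fully written before it is read" {
    const ones = [_]u8{1} ** 100;
    try std.testing.expect(sumChunks(&ones) == 100);
}
```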

I don’t think you’ve made a good case that optionals are bad for performance. A ?*Something type has to be unwrapped before you use it, yes, but that’s the same thing as checking if a pointer is NULL in any of the billion dollar mistake languages, and it turns into the same machine code. The difference is that you never have to check a *Something before you use it. If it was initialized as undefined and you try and use it, that’s illegal behavior, but it’s a fairly shallow bug in most cases, and Debug builds help by filling undefined memory with 0xaa bytes, so a bad pointer tends to fault right away. Defensive NULL checks are all over responsible C code, because you kind of have to, since the type system won’t do it for you.
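A minimal sketch of the two pointer kinds side by side. Node, valueOrZero, and valueOf are hypothetical names for illustration.

```zig
const std = @import("std");

// Hypothetical payload type.
const Node = struct { value: u32 };

// Zig guarantees an optional pointer is the same size as a plain pointer:
// null is represented by address zero, with no extra tag.
comptime {
    std.debug.assert(@sizeOf(?*const Node) == @sizeOf(*const Node));
}

// May be absent, so it has to be unwrapped before use. This is the same
// comparison a C programmer writes by hand as a NULL check.
fn valueOrZero(maybe: ?*const Node) u32 {
    if (maybe) |node| return node.value;
    return 0;
}

// Cannot be null, so there is nothing to check.
fn valueOf(node: *const Node) u32 {
    return node.value;
}

test "unwrap once, then use freely" {
    const n = Node{ .value = 7 };
    try std.testing.expect(valueOrZero(&n) == 7);
    try std.testing.expect(valueOrZero(null) == 0);
    try std.testing.expect(valueOf(&n) == 7);
}
```

Passing null to valueOf is a compile error, which is the whole point: the check happens exactly once, at the boundary where the value might actually be missing.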

Zero can be a useful value, yes, but it’s just a value, and I’ve never seen the point in trying to make it special. That’s definitely not worth giving up null-safety for.
