I’ve had this post on ice for, well, awhile now. Zig 0.16 has taken some steps in this direction, which shows, I hope, a willingness to consider taking things further.
Without further ado:
Arithmetic Compatibility
Since Andrew had encouraging things to say about this general idea, it seems worthwhile to brainstorm a bit about how it can work. Especially after 0.16 comes with several improvements aimed squarely at less-painful arithmetic.
Zig’s casting rule is very simple, that’s good, and for casting it works just fine. Empirically, the way it’s used in arithmetic expressions is not ergonomic, this is an acknowledged pain point.
Any change needs all these things:
-
The rule must also be simple. It’s not possible to be simpler than the casting rule, so it won’t be, but it should be as simple as possible given that.
-
It must not make the language less safe. The casting rule imposes friction on downcasts, this encourages the use of wide-enough types, and biases code toward unsigned values for indices, both of which are, broadly speaking, good.
-
It must work for arbitrary integer ranges. That issue hasn’t been accepted, but it has support on the core team, and well, I for one hope we get it. So if a change to how this works helps make that possible, that’s good.
I think what follows meets all those criteria. We introduce a rule called arithmetic compatibility, that’s arithMETic, not aRITHmetic: adjective, not noun. It’s not a cast or a promotion, it’s a rule which determines when the combination of two operations and the result location are mutually compatible.
This is the rule: arithmetic is allowed when the image of the operation has less than, or equal to, the number of invalid results as if the operands were in the domain of the result type. Concretely, for C = A • B, we ask: if A and B are the same type as C, what percentage of results of the operation • are invalid? Then we ask again for the actual types of A and B, if the result is greater, the operation is not allowed, if it’s equal or less, the operation is allowed.
The rule is very simple, but, it is abstract and this is somewhat demanding on the user. But it does have to be, in order for it to work with Zig’s unusually wide variation in integer types, and to have a prayer of working with integer ranges.
I will now illustrate some of the more common results of this rule. I’ll stick with operands of the same bitwidth, which is the most important case. Don’t ignore that there’s a result location here, that’s important.
-
uN += iN: Allowed, as is-. Precisely 75% of results of signed-unsigned addition and subtraction fit into the unsigned result type, with both unsigned, it’s 50% + 1. -
uN *= iN: Not allowed. Any negative value ofiNgives an invalid result, and this makes the domain smaller than unsigned multiplication. If the signed value is small enough, the rule allows it: I’m not sure that it should. A negative value is never legal here, and it doesn’t seem like eliding the cast for this operation is fulfilling the obligation to keep things at least as safe as they are now. But this would complicate the rule. -
iN += uN: Not allowed. This turns out to have the same analysis asuN += iN, for nontrivial reasons. -
fN •= (i|u)Mfor any M: allowed. Our result location is a float, we accept that this comes with loss of precision in exchange for a very, very large range with saturation at the high and low ends, and underflow to zero for division. Note that this operation cannot panic under any circumstances, even if•is/and the integer is0, so we’re emphatically not doing anything unsafe here.
A nice thing about this rule is that it’s calculable. For one thing, that’s essential in order for the compiler to determine what to do, for another, we can give a clear error message even with arbitrary integer ranges.
Another advantage of considering the result location is we could have more cases of @as(T, a + b) and less of the dreaded @as(T, @intCast(a)) + @as(T, @intCast(b)). I will say: Rust code requires more a as T type code, but it’s much lighter on the page. This alone would make me miss that less when I’m writing Zig.
A caveat: this rule doesn’t work for shifting, but, shifting is not arithmetic. I propose no changes to how shifting currently works in Zig. It’s also basically meaningless for bit masking, which is also working just fine as it is.
Indexing: where I finally came down on this is that it’s a cast, and so the current rule holds. I bring it up simply because of the fact that a usize is almost always absurdly larger than the actual legal range of an index, and if you think about it that way: a [5]T has five usize values which are legal, and five isize values which are legal. An @intCast on a number does nothing in unsafe modes, and safe modes have bounds checking, so it’s kind of double-dipping.
But, I do think it’s good that this encourages the use of unsigned values for index types, and I expect that the arithmetic compatibility rule will ease the pain of working with offset types / deltas enough that relaxing this constraint wouldn’t matter in the end. Right now, the choice is between clearer signed arithmetic with a cast for every index, or a bunch of muddy and visually-heavy casting to keep an index unsigned, but then indexing doesn’t need a cast. This would remove that dichotomy, and actually strengthen the use of unsigned index types.
I could be persuaded otherwise. The “at least as safe” precept mostly holds; not if we consider cases like i8 vs u8 for a [256]T, but there’s no restriction on usize and below for that array in current Zig either, and “less safe” only kicks in here with valid positive indices which the signed type under consideration cannot represent.
It’s more conservative to see how much it matters after introducing arithmetic compatibility. Given where I’ve felt the pain in writing this kind of code, I suspect it will be much less worth addressing.
Simplification
This rule is elegant, and simple in that sense. I considered tripling the size of this post by brain-dumping all the number theory of how to improve on a brute-force calculation, but: really this is only necessary for arbitrary ranges. It’s not clear that we’re going to have those, so I’ll just note that it’s fast for addition and subtraction (constant time), and there are ways to memorize results from multiplication and division, such that the answer can be interpolated correctly.
For the numbers we do have, there are only a few rules here:
-
Integers are automatically promoted to float when the result location and other operand are floats. Zig 0.16 has that already for casts: intolerant of loss of precision, which is correct for casts, but the premise of arithmetic compatibility is that
float *= usizehas a float result type, so the code has ‘opted in’ to the usual tradeoffs found in float calculation. -
When the result location is unsigned, addition and subtraction may have either operand signed or unsigned, so long as both have equal or lesser bit width to the result location. Multiplication and division must have both operands unsigned. This is done through ‘natural’ arithmetic in two’s complement, safety checks confirm the result is
>=the left operand for addition, and<=the left operand for subtraction. -
When the result location is signed, any unsigned participants must be of lesser bit width than the result location, for all four arithmetic operations.
This summary isn’t quite a mere consequence of applying the rule, specifically it disallows any signed participants in multiplication and division for an unsigned result. I think that’s better in the absence of arbitrary ranged types, that’s where things get complex and a general rule is needed. For a system like we have, which is purely power-of-two based, you can modify the rule and say that, rather than comparing to the result location, compare as though each operand has its bit width, but the sign of the result location, and then you get exactly as sketched above. You can see why that doesn’t work for arbitrary ranges, because sign is not a central fact about a range in the same way as it is for power-of-two arithmetic.
Bonus Round
While I’m here, I would like to enter a plea for signed division. I do agree with the logic which removed it from the language: popular languages disagree about how to handle it, this makes things inherently ambiguous, taking a side arbitrarily is not the correct resolution. The “/ is not valid with signed operands” part, I’m in.
But punting to builtins is also bad, frankly. It’s just weird, division is an operator and using a function for it is, well, it’s unheard of, if we don’t count Lisp. We should be wary of arguments from popularity, but look: a lot of code, especially numeric code, gets translated from somewhere else, and making division a function screws up operator precedence in a major way. I have personally spent a few hours tracking down a bug caused by this, and I didn’t like it. Algebra is a language, and this makes it unduly hard to read arithmetic operations in Zig which employ signed division.
If I told you there’s a language where you can’t use / for signed divison, because it’s ambiguous whether truncation or flooring is intended, and therefore, you have to use either /- or /%, would you be at all confused about which one does what? I hope not! Zig already uses composite operators to handle several tricky corners of overflow-able arithmetic on fixed-sized values, to great effect I might add, and I hold to the opinion that adding these would have an improving effect.
This is the once-buggy line I had to fix:
bin_mid = @divTrunc(bin_max - bin_min, 2) + bin_min;
Binary search, it’s hardly a recondite operation.
I had transcribed it thus:
bin_mid = bin_max - @divTrunc(bin_min, 2) + bin_min;
Like every mistake, it’s obvious: in retrospect. What I want you to notice is that the precedence did, in fact, change. That kind of thing has consequences.
Compare:
bin_mid = (bin_max - bin_min) /% 2 + bin_min;
YMMV, but I find this much easier to read and understand.
I can make the same case for @rem and @mod, but not as confidently. The problem there is that % is itself ambiguous, and tacking another operator onto it can’t fix it. That said, once the user knows about /- and /%, learning what %- and %% do is a matter of analogy, which is something we’re pretty good at, as a general rule.
I come away from the Wikipedia article on modulo with the impression that describing truncated remainder as “remainder” and floored remainder as modulus is clear for contextual social reasons, but not because it faithfully reflects how mathematics deals with the same ambiguity: a remainder is a modulus, the several possibilities are handled with noun phrases, not by drawing an artificial distinction between the one and the other.
The relevance of that observation being that “floored remainder” and “truncated remainder” are fully cognate with their equivalent terms in addition, and both results are considered a modulus. So maybe we can handle symbolic definitions for those operators as well.
// Before
(@divFloor(a, b) * b) + @mod(a, b) == a
// After
(a /- b * b) + @mod(a, b) == a;
// Full change:
(a /- b * b) + a %- b == a;
// Before
(@divTrunc(a, b) * b) + @rem(a, b) == a
// After
a /% b * b + @rem(a, b) == a;
// Full change:
a /% b * b + a %% b == a;
One with redundant parens (like the original), one without. I mean redundant in terms of the result, I personally favor parens for anything more complicated than the above, meaning I wouldn’t use them for this but reasonable people can disagree which is clearer.
I’d even hazard that adding the third-choice lines to the documentation would go a long way toward fixing the meaning in users’ minds.
Even in the somewhat artifical context of documenting the meaning of those two builtins, I find the latter two easier to comprehend.
Readability
These things aren’t connected except insofar as the intention is to make Zig code using signed values easier to read. I consider that goal valuable. There’s a balance to be struck between
- Communicate intent precisely
And
- Favor reading code
I believe that the amount of casting in status quo, as well as handling division by function builtins, is worse for both of these goals. A systems language in which numerics-heavy code appears unfamiliar and must be heavily punctuated with casts is alienating one of its core audiences. I think it would be great, by the way, if the core language supported complex numbers and matrices. But one thing at a time.