I saw a YouTube video recently of Jonathan Blow describing his vision for how his Jai compiler should eventually have its own backends that can be used instead of LLVM. In it, he said there are two types of consumers of compilers:
The first type is the person who thinks the compiler should be able to give you whatever assembly it wants, so long as the effects are the same. To this person, how they write the code is mostly a matter of what looks nicest; they may not know much about how to make a program efficient, or they don’t care. They just want whatever free performance gains the compiler can offer them.
The second type is the person who pretty much knows an efficient way to accomplish their computation, and really just wants the compiler to handle the tedious, mundane optimizations, like finding the best combination of instructions that maps to what they expressed in their code. Of course, you might only know what you’re doing because you’ve written the code before, but the point remains the same.
I personally prefer to be in the second category, and it’s nice for me when I can run my own experiments on code I want to be as fast as possible. I would expect/hope that your two code samples compile differently, since (IMO) you are specifying that the two potentially unnecessary statements should be executed unconditionally, regardless of need. This is a form of software speculation, which is a performance win in some cases and a loss in others.
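To make the speculation idea concrete, here is a minimal Zig sketch (the function names and the stand-in `expensive` work are my own illustration, not the code from your post): the first version executes both statements unconditionally, the second only does the work when it’s needed.

```zig
const std = @import("std");

// Stand-in for some nontrivial work; purely illustrative.
fn expensive(x: u32) u32 {
    return x * x + x;
}

// Speculative form: both statements run unconditionally, even when `cond`
// is false and their results end up unused.
fn speculative(a: u32, b: u32, cond: bool) u32 {
    const lhs = expensive(a);
    const rhs = expensive(b);
    return if (cond) lhs + rhs else a;
}

// Lazy form: the same result, but the work only happens when it's needed.
fn lazy(a: u32, b: u32, cond: bool) u32 {
    if (cond) return expensive(a) + expensive(b);
    return a;
}

test "both forms agree" {
    try std.testing.expectEqual(speculative(3, 4, true), lazy(3, 4, true));
    try std.testing.expectEqual(speculative(3, 4, false), lazy(3, 4, false));
}
```

Which one is faster depends on how costly the work is and how predictable `cond` is, which is exactly why I’d rather the choice stay visible in the source.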
Others have touched on what the compiler can assume/prove for your particular example, but I want to address something more general:
One thing to keep in mind is that the compiler does not actually know the best way to compile all code; it uses heuristics and applies optimizations that are good on average, and in some scenarios they backfire. Current compilers are also incapable of asking the programmer for more information, whether that’s about an optimization they can almost prove is valid, or about whether an optimization is worth doing in the first place.

Aggressive auto-SIMDizing and loop unrolling can get you major performance wins in many cases, but imagine those techniques getting applied to a function that expects very small inputs. Let’s say I have a “sum” function that adds all the numbers in a slice together. The compiler might choose to auto-SIMDize it and unroll the SIMDized loop 4 or 8 times. On massive inputs this gives a major speedup, but if it so happens that I’m often passing slices smaller than (or just barely over) the size of the hardware vector, the optimization will actually make my code run much slower, and it needlessly bloats my binary, which means it needlessly bloats my instruction cache too.
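To sketch what I mean (this is just an illustration, assuming a reasonably recent Zig compiler; the width of 4 is an arbitrary choice): the scalar loop below is what I’d write for tiny inputs, while the hand-vectorized version is roughly the shape an auto-SIMDizing pass produces, including the scalar tail it has to emit for leftover elements.

```zig
const std = @import("std");

// Plain scalar sum: small and cheap to call, fine for tiny slices.
fn sumScalar(data: []const f32) f32 {
    var total: f32 = 0;
    for (data) |x| total += x;
    return total;
}

// Hand-vectorized sum with an arbitrarily chosen width of 4. For slices
// shorter than a vector, all the work falls through to the tail loop anyway,
// so the extra code buys nothing.
fn sumVector(data: []const f32) f32 {
    const Lane = @Vector(4, f32);
    var acc: Lane = @splat(0.0);
    var i: usize = 0;
    while (i + 4 <= data.len) : (i += 4) {
        const chunk: Lane = data[i..][0..4].*;
        acc += chunk;
    }
    var total: f32 = @reduce(.Add, acc);
    while (i < data.len) : (i += 1) total += data[i]; // scalar tail
    return total;
}

test "scalar and vector sums agree on exact inputs" {
    const data = [_]f32{ 1, 2, 3, 4, 5, 6, 7 };
    try std.testing.expectEqual(sumScalar(&data), sumVector(&data));
}
```

If I’m mostly summing four or five elements at a time, the scalar version is the one I want, and I’d rather make that call myself than have a heuristic make it for me.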
I want more choice in these matters. I want to run my own tests and decide for myself how the final binary should be optimized. In the case of SIMD, if I were writing the “sum” function and wanted it to be SIMD, I’d write it as SIMD. There are instances where I think auto-SIMDization should occur, primarily cases where a handful of statements can be smushed into one or a few vector instructions, but most of the time, if I wanted SIMD, I’d write it that way. When it comes to unrolling loops, I want to choose how much; a sketch of what I mean follows below. I don’t like that changing whitespace or a comment sometimes leads to a very different compilation, with loops unrolled by different amounts and different performance characteristics. I also don’t like it when I write an optimization by hand and the compiler automatically deoptimizes it back to what it was before. I can’t tell whether my change will make a difference down the road, once the code does more things and the compiler finally goes along with my “optimization”, which might be a win or a loss.
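For the unrolling case, the kind of control I mean looks like this (again just a sketch; the factor of 4 and the four independent accumulators are my choice, not anything the compiler dictates):

```zig
// Sum unrolled by a factor I chose (4), with four independent accumulators.
// Splitting the accumulator changes floating-point summation order, which is
// precisely the kind of decision I want to make explicitly rather than leave
// to a heuristic or a fast-math flag.
fn sumUnrolled4(data: []const f32) f32 {
    var t0: f32 = 0;
    var t1: f32 = 0;
    var t2: f32 = 0;
    var t3: f32 = 0;
    var i: usize = 0;
    while (i + 4 <= data.len) : (i += 4) {
        t0 += data[i];
        t1 += data[i + 1];
        t2 += data[i + 2];
        t3 += data[i + 3];
    }
    var total = t0 + t1 + t2 + t3;
    while (i < data.len) : (i += 1) total += data[i]; // leftover elements
    return total;
}
```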
I can understand the appeal of being in the first group I mentioned, but I heavily prefer having more control, and to me low-level control is one of the most important features of Zig. Therefore, I think we should argue for fewer automatic optimizations of Zig code in the situations I’ve mentioned, not more. Your example is another good case where an optimization should not happen automatically.