Plan to address "Attack of the killer features" parameter reference optimization aliasing issue just dropped!

To be clear, I’m not sure whether this was discovered as part of “Attack of the Killer Features”; that talk is just what first brought the concept (PRO) to my attention.
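
For anyone who hasn’t seen the talk, here is a minimal sketch of the kind of aliasing footgun PRO can cause. The struct and numbers are made up for illustration, and a struct this small would really be copied in registers, so treat it as a sketch of the hazard rather than a reproduction:

```zig
const std = @import("std");

const Pair = struct { a: i32, b: i32 };

// `v` reads like pass-by-value, but PRO may silently pass it by reference.
// If the caller passes the same object for both arguments, `out.*` and `v`
// then alias, and the second read of `v.a` observes the first write.
fn addTwice(out: *Pair, v: Pair) void {
    out.a += v.a;
    out.b += v.a; // with aliasing, v.a was already changed by the line above
}

pub fn main() void {
    var p = Pair{ .a = 1, .b = 10 };
    addTwice(&p, p); // by-value expectation: {2, 11}; with aliasing: {2, 12}
    std.debug.print("{} {}\n", .{ p.a, p.b });
}
```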

10 Likes

" So, here’s our conclusion. PRO in its current form will cease to exist. The Zig compiler’s optimizer will gain the ability to notice that a function is pure, and promote parameters to references accordingly."

Does this have implications for incremental compilation?

3 Likes

it will become desirable to use explicit *const T in cases where this was previously not idiomatic.

Yes, explicit is good.

10 Likes

I’m really glad they decided to go with this. I think it’s better to be explicit even if that leaves less room for the optimizer, because at the end of the day working code is better than that kind of footgun. I hope this gets implemented soon.

4 Likes

So we’re back to the C way of passing parameters. I’m glad that this is finally being addressed, but it’s sad that after all this time and all the talk about how awesome PRO was, we’re back exactly where we started.

2 Likes

I don’t think so? My interpretation is that very little has changed:

  1. Using *const T (reads like pass by reference) has always restricted, and will always restrict, the compiler to passing by reference. (No change here.)
  2. Using T (reads like pass by value) will only result in an automatic pass by reference if the function can be detected as pure. (The change here is that the compiler is just doing less optimization.)
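
A quick sketch of those two cases, with a made-up struct just for illustration:

```zig
const Big = struct { data: [1024]u64 };

// 1. Explicit *const T: guaranteed to be passed by reference.
fn sumRef(b: *const Big) u64 {
    var total: u64 = 0;
    for (b.data) |x| total += x;
    return total;
}

// 2. Plain T: semantically a copy; under the new plan the compiler may still
// pass it by reference behind the scenes, but only if it can prove the
// function is pure.
fn sumVal(b: Big) u64 {
    var total: u64 = 0;
    for (b.data) |x| total += x;
    return total;
}
```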

I don’t see this as a closed door to more aggressive PRO happening in the future (this is not a semantic change to the language, only an optimization). And I don’t see this as a change that requires a lot of changes to people’s code. (Unless there are a lot of people relying on PRO?)

Existing implementations that are using T may see a performance hit, but no aliasing bug.

I think this is just growing pains in implementing PRO.

1 Like

Maybe it’s a bigger change than I am thinking?

Will there be a lot of changes to the std lib to switch to *const T? Will we have two versions of many APIs, one for big data (*const T) and one for small data (T)?

PRO will only be applied to pure functions.
When writing our functions, we’ll once again have to start thinking about whether we are passing something big or small: big things we want to pass by pointer, and small things by value. That is the C way of passing parameters.
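
In other words, something like this again (types and sizes made up; the usual rule of thumb is to pass anything much bigger than a couple of machine words by *const):

```zig
const Point = struct { x: f32, y: f32 }; // a couple of machine words: pass as T
const Mesh = struct { vertices: [4096]f32 }; // kilobytes: pass as *const T

fn lengthSquared(p: Point) f32 {
    return p.x * p.x + p.y * p.y;
}

fn total(m: *const Mesh) f32 {
    var sum: f32 = 0;
    for (m.vertices) |v| sum += v;
    return sum;
}
```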

1 Like

I don’t see a solution description.

The first problem is how Zig will define a function as “pure”.
A definition might be: when you call the function with the same arguments, you get the same results. But can the compiler actually detect such functions?
Will an imported function from a library be defined as non-pure, or will Zig add a way to specify pure functions?

3 Likes

Certainly, a function will only be classified as pure if the compiler can see its body. Probably the rule will be:
Any function is pure unless it:

  • Is an external function (including dynamic functions, library functions, functions that came from C or functions that came from object files)
  • Does a syscall
  • Modifies global variables
  • Receives a pointer as a parameter and modifies it (including pointers in fields)
  • Calls an impure function
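
A hedged sketch of how that classification might play out (the struct and function names are made up for illustration; the actual rules are up to the compiler):

```zig
const std = @import("std");

const Big = struct { items: [256]u64 };

var call_count: u64 = 0; // a global

// Likely pure: the result depends only on the argument and there are no side
// effects, so a by-value Big parameter could safely be promoted to a reference.
fn sum(b: Big) u64 {
    var total: u64 = 0;
    for (b.items) |x| total += x;
    return total;
}

// Likely impure: writes through a pointer parameter.
fn zeroOut(b: *Big) void {
    for (&b.items) |*x| x.* = 0;
}

// Likely impure: modifies a global and prints (ultimately a syscall).
// Any function calling this one would be classified as impure as well.
fn logSum(b: Big) void {
    call_count += 1;
    std.debug.print("sum = {}\n", .{sum(b)});
}
```
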
2 Likes

just want it on record that I personally never claimed it was awesome. I only said it solved certain problems while creating other problems, and that its future was uncertain.

lots of people out there have bad zig takes

5 Likes

I wish the programming world had kept the distinction between “function” and “procedure”.

3 Likes

In Zig, a function’s purity also affects its availability at comptime. As such, perhaps it’s sensible to make it a property that programmers have to declare explicitly.

1 Like

Also, will the programmer be able to know what the compiler detected as ‘pure’, without looking at the assembly?

1 Like

I love that they just decided to go the explicit route. Back when I watched the talk and read some discussion about it, there were a lot of solutions that seemed like more overhead and complexity than just thinking about the arg size. I’m so glad that this proposal was made.

Unfortunately, software borrows a lot of terms from math and then butchers them. For one, the term “vector”, which can mean:

  1. A dynamically growable array in C++
  2. A math concept of a point, or of a magnitude and a direction.
  3. A CPU concept of an extra-large register that supports a different instruction set, with most instructions allowing the register to be semantically divided into multiple pieces which are operated on in parallel with each other. Also called a “SIMD vector”.
  4. A CPU concept similar to the previous one, except those multiple data elements are not operated on entirely in parallel but in consecutive time steps. So e.g. you might have a “vector processor” that allows you to operate on 512-byte vectors in one instruction, but it is probably going to sextuple-pump those instructions and pipeline starting them across the next 16 cycles. People familiar with this concept would refer to the previous concept as array processing, NOT vector processing. Although I think “array processing” could be an even worse term to say to software developers.
  5. And more. Vector - Wikipedia (And that list doesn’t even mention SIMD at the time of writing)

Within SIMD, we have shuffles and permutes. Based on the original definition you might think you can’t end up with multiple copies of the same individual parts, but actually that’s perfectly acceptable. In x86 lingo, a shuffle means “intra-lane” and conditionally zeroing if the top bit is set, and a permute means “cross-lane”. And in this case “lane” refers to 16-byte chunks of the vector, not the other definition of “lane” which refers to how big the pieces are (bytes, in this case). The term “swizzle” is more common in the GPU space for a similar concept, although I think it’s more of a compile-time decision, whereas arbitrary shuffles and permutes can be computed at runtime. The Broadcom VideoCore IV uses the terminology “lane rotate” instead of swizzle, which has the advantage that the word “rotate” hasn’t really been butchered that much by most software and hardware people.
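
Zig’s own @shuffle is a case in point: despite the name, the comptime mask is free to repeat indices, so you can end up with multiple copies of the same element. A small sketch (values chosen arbitrarily):

```zig
const std = @import("std");

pub fn main() void {
    const a: @Vector(4, u8) = .{ 10, 20, 30, 40 };
    // Non-negative mask entries index into `a` (negative entries would select
    // from the second operand, which is unused here). Repeating index 0
    // duplicates the first element, which is perfectly legal.
    const mask: @Vector(4, i32) = .{ 0, 0, 3, 2 };
    const shuffled: [4]u8 = @shuffle(u8, a, undefined, mask);
    std.debug.print("{any}\n", .{shuffled}); // { 10, 10, 40, 30 }
}
```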

Conceivably there could be a builtin function that allows you to query during compilation whether a function was detected as pure or not, but one wonders whether that would lead to people obsessing over purity when perhaps they shouldn’t.

It reminds me of a proposal that was floated for Zig where you could specify that a variable is constant beyond a certain point, or that it is not to be used after a certain point. Both of these are things you can do already with blocks and by making a new variable declared with const, but this feature would make it easier. Andrew Kelley rejected it on the basis that it would become “good practice” to always make sure you specify that a variable isn’t used anymore. Even though this kind of feels like a good idea, we have to ask ourselves whether it makes sense for devs to obsess over something which the compiler should be able to figure out pretty easily. Some people even argue that const falls in that category!
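
For reference, the existing pattern that proposal was trying to sugar over looks roughly like this (made-up computation, just to show the shape):

```zig
fn demo(items: []const u32) u32 {
    // "Constant beyond this point" today: do the mutation inside a block,
    // then bind the result to a new const.
    const total = blk: {
        var sum: u32 = 0;
        for (items) |x| sum += x;
        break :blk sum;
    };
    // From here on, `total` can no longer be reassigned.
    return total;
}
```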

Likewise, should programmers try to eliminate all the non-pure functions, except on the boundaries of the application (like unsafe in Rust)? Personally, I don’t know that that makes sense or matters that much.

3 Likes

12 posts were split to a new topic: Parameter passing

A post was merged into an existing topic: Parameter passing

I learned to program using Turbo Pascal, a language which preserved the distinction, and it wasn’t as nice as you might think.

A function was just a procedure which returns a value. You could pass by pointer and mutate stuff in a function just as easily as in a procedure; the difference was quite minor, and I think we’re better off just having a void return type rather than two names for essentially the same thing.

Whether that name is function or procedure is just stylistic: Odin abbreviates proc and Zig abbreviates fn but they end up being the same thing.

You could get a “real” function by declaring all parameters Const, but of course this is possible in Zig also.

Then you have languages in the ML family which have actual functions in the mathematical sense. But those don’t have procedures.

If there was a language which had procedures which can mutate state and functions which can’t, I never saw it. Could be pretty nice! But that would be a new thing.

2 Likes