Pass by value semantics

One of these languages is PL/SQL. But aliasing is possible there too, at least if you use the NOCOPY “hint” for IN OUT parameters, which the language itself advocates (you get a compiler warning without it). This just happened at work a few weeks ago, when a colleague wondered “WTF, this function actually changes an input parameter!?”


I find it weird to call ReleaseSafe “unoptimized”


Then Andrew and the core team came, stubbornly insisted on it for so long, and also failed

I would love a write-up on what they tried. From the outside it looks like the first implementation was really greedy, and memcpy everywhere got added when the issue blew up on Hacker News and Zig got a bad rap. I’m sure there is more to it.


Sadly, ReleaseSafe just means SlightlyLessWorseDebug.

ReleaseSafe gives me at least 10x the performance of debug. So not just slightly less worse. It is slightly better than Rust release mode.

I wonder how Odin gets away with it.
https://odin-lang.org/docs/overview/#parameters

By default, Odin procedures use the “odin” calling convention. This calling convention is the same as C, however it differs in a couple of ways:

It promotes values to a pointer if that’s more efficient on the target system, and
It includes a pointer to the current context as an implicit additional argument.

The promotion is enabled by the fact that all parameters are immutable in Odin, and its rules are consistent for a given type and platform and can be relied on since they are part of the calling convention.


I don’t know Odin, but I’m guessing it doesn’t. Has anyone tried the ArrayList test in it?

I have tried with the “merge” footgun. Odin is basically at Zig 0.9 level: alias all the things!

const A = struct { x: u32, y: u32, z: [100]u32 };

noinline fn merge(a: *A, b: A) void {
    a.x += b.y;
    a.y += b.x;
    var i: u32 = 0;
    while (i < 100) : (i += 1) {
        a.z[i] += b.z[i];
    }
}

export fn shootfoot(i: u32) u32 {
    var a: A = .{ .x = i, .y = i + 1, .z = undefined };
    merge(&a, a);
    return a.y;
}

This emits call fastcc void @merge(%A* %a, %A* %a), which is incorrect: both arguments point to the same a, so the by-value parameter b is not a copy.


Interesting! Thanks.

Didn’t know that, thanks! Some more context: Aliasing issues when value is passed by reference · Issue #2971 · odin-lang/Odin · GitHub


They also have the array list bug.


I suspect that the semantic solution that Pony uses might fit this pretty well?

Adding iso/ref/val annotations to declare the capability restrictions up front seems to do the job, and solves some tricky problems.

It’s a bit of an all-or-nothing solution though. Not sure how well that would work out with a non-GC language like Zig: can you still provide similar guarantees without GC?


I assume you need RAII plus ownership tracking in the compiler, just like Rust.

Any Ada users here? Looking at the docs, it seems to me that even Ada SPARK cannot catch all of the possible issues.

So they’re taking the position that the programmer should know that the semantics change if the item being passed is greater than 16 bytes in size. There’s no way anybody is getting caught out by that one.

Oooof!

I’m glad PRO got reverted in Zig.


I think it’s more likely they would document that when you pass a variable by value, you shouldn’t also pass it by pointer. And if you do, expect that mutating through the pointer may change the value argument.
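A minimal Zig sketch of what following such a documented rule could look like on the caller's side. The names mergeWithSelf and snapshot are hypothetical, and the struct is a cut-down version of the A from the merge example earlier in the thread. The idea is simply to take an explicit copy before the call, so the value argument never shares storage with the pointer argument, whatever the compiler does with parameter passing.

```zig
const std = @import("std");

const A = struct { x: u32, y: u32 };

// Same shape as the thread's merge: one pointer parameter, one value parameter.
fn merge(a: *A, b: A) void {
    a.x += b.y;
    a.y += b.x;
}

// Hypothetical helper: snapshot first, so the value argument lives in
// its own storage and cannot alias the pointer argument.
fn mergeWithSelf(a: *A) void {
    const snapshot = a.*;
    merge(a, snapshot);
}

pub fn main() void {
    var a = A{ .x = 1, .y = 2 };
    mergeWithSelf(&a);
    std.debug.print("x={d} y={d}\n", .{ a.x, a.y }); // prints "x=3 y=3"
}
```

The cost is that the caller has to remember to do this, which is exactly the burden the rule would place on users.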

I can’t speak for how other people think of it, but my expectation for PRO is this: when passing immutable, stateless data, the coder should not need to deliberately write parameters like noalias *const T just to avoid copies (otherwise the consistency of generics easily breaks, depending on the size of the type). The coder should be able to assume that the language performs this optimization during parameter passing, and should not have to design obscure function signatures for performance. The function signature then expresses only the intended semantics, and generic parameter forms don’t break because of it.

Regarding the aliasing issue, the language should be able to detect it and fall back to copying. Since we usually expect pass-by-value only for immutable, stateless data, ideal code should not have such aliasing. Similarly, if a coder deliberately writes noalias *const T parameters for performance reasons, the aliasing issue is already present. The problem never disappears; it just shows up in a different form. For an ideal PRO, it’s a performance fallback; for an interface deliberately written by the coder, it’s IB (illegal behavior).

The earlier version of Zig’s PRO had issues because it basically did not provide checks or fallbacks for aliasing. This is also why, in my philosophy, PRO should be based on the Windows calling convention.

Agreed! But can we trust the compiler 100%?

That is very handwavy. Falling back to copying implies either relying on compile-time escape analysis (which is often pessimistic and fails across boundaries) or introducing unexpected pointer comparisons at runtime. Neither of those is desirable or expected in systems code.
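To make the runtime option concrete, here is a minimal Zig sketch of what a "detect aliasing, else copy" fallback would amount to if written by hand. mergeByRef and mergeChecked are hypothetical names, not anything the compiler emits; the struct is the A from the merge example earlier in the thread. Note that the equality check only catches exact aliasing of the whole struct, which is an assumption of this sketch; partial overlap would need a range comparison.

```zig
const std = @import("std");

const A = struct { x: u32, y: u32, z: [100]u32 };

// Hypothetical lowering: b arrives by reference, as an elided copy would.
fn mergeByRef(a: *A, b: *const A) void {
    a.x += b.y;
    a.y += b.x;
    var i: usize = 0;
    while (i < 100) : (i += 1) a.z[i] += b.z[i];
}

// The hidden branch a runtime fallback would add to every such call.
fn mergeChecked(a: *A, b: *const A) void {
    const a_ro: *const A = a;
    if (a_ro == b) {
        const copy = b.*; // restore value semantics with a real copy
        mergeByRef(a, &copy);
    } else {
        mergeByRef(a, b);
    }
}

pub fn main() void {
    var a = A{ .x = 1, .y = 2, .z = [_]u32{0} ** 100 };
    mergeChecked(&a, &a);
    std.debug.print("x={d} y={d}\n", .{ a.x, a.y }); // prints "x=3 y=3"
}
```

Calling mergeByRef(&a, &a) directly on the same starting values would leave y at 5 instead of 3, because a.x has already been updated by the time it is read through b.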

An additional problem that, as far as I’ve seen, wasn’t discussed at all in this thread is multithreaded code. It’s not unreasonable to assume that code exists that operates on things as values while concurrent threads are updating the “original”. With a normal mutex this will likely rarely happen, but this is Zig and you can (and should be able to) do anything you please. Other synchronization methods where this would lead to issues are easy to create and also widely used. I haven’t looked into it, but intuitively lock-free code and something like a seqlock could suffer from this.

Other problems I can think of: (1) The loss of consistency in generics, where you might need to explicitly handle types larger than some processor-dependent threshold. The reason for this is entirely hidden and easily forgotten because I, at least, wouldn’t think about every possible way the thing I’m programming could be used while writing it. (2) Zig’s already unstable ABI becomes even more brittle, which makes things like this even harder to allow. (3) Any code can contain assembly, which is largely a black box to the compiler, especially to parts of the frontend, and which may expect something to be passed by value. (4) Code can be patched at runtime; I don’t think I need to say more about this one.

It also violates what, for me, is the key thing Zig stands for: that the amount of implicit things is reduced to the necessary minimum[1].

$ zig zen | sed -n '2p;4p;10p'
 * Communicate intent precisely.
 * Favor reading code over writing code.
 * Reduce the amount one must remember.

I believe something like this just doesn’t belong in a close-to-the-hardware language that allows complete control over the executed code.


  1. But not the absolute minimum, as this would be very awkward.

This is the reason I advocate treating it as ‘semantics’ rather than an ‘optimization,’ which means that in practice the code writer can predict whether it actually occurs. The semantics I envisage are very pessimistic: copy elision is guaranteed only when the parameters are immutable, and in all other scenarios there is no copy elision. Immutable parameters are also the rare case where I believe such optimizations should be adopted.

When PRO was removed, my initial thought was the same: I thought it was ‘nicely deleted.’ At the time, I was confident that I could implement similar behavior entirely in user space, with more explicit semantics. Then I actually tried to do it in user space and was completely defeated. I now admit that, without language support, it is currently difficult to implement generics with equivalent usability in user space. What particularly blocked me was the limited applicability of noalias: because it is a modifier on parameters rather than part of a type, and it applies only to pointer types, I could hardly come up with a user-space design of equivalent capability.