Pass by value semantics

jumpnbrownweasel · May 28, 2026, 8:13pm

Just for completeness, I think you’re referring to this issue:

github.com/ziglang/zig

eliminate hidden pass-by-reference footguns

opened 08:40PM - 01 Aug 20 UTC

closed 11:36AM - 28 Apr 26 UTC

ghost

proposal accepted

[Accepted Proposal](https://github.com/ziglang/zig/issues/5973#issuecomment-2380…332493)

-----

Zig 0.6.0 (not master).

This is related to, actually maybe a subset of, https://github.com/ziglang/zig/issues/4021 / https://github.com/ziglang/zig/issues/3696 (this issue doesn't involve result copy elision).

I understand that this was intended to be a feature of zig: args passed as values "may" be silently translated to pass-by-reference by the compiler. I think the intent was to stop the user from passing const pointers as an "optimization". But it's also a footgun, sort of like https://github.com/ziglang/zig/issues/2915.

The problem occurs when you have another non-const pointer aliasing the same memory as the argument value.

```zig
const std = @import("std");

const Thing = struct {
    value: u32,
};

const State = struct {
    thing: Thing,
};

fn inner(state: *State, thing: Thing) void {
    std.debug.warn("before: {}\n", .{thing.value}); // prints 10
    state.thing.value = 0;
    std.debug.warn("after: {}\n", .{thing.value}); // prints 0
}

pub fn main() void {
    var state: State = .{
        .thing = .{ .value = 10 },
    };
    inner(&state, state.thing);
}
```

The behavior here depends on the compiler implementation. It seems that right now, if `thing` is a struct value, it's passed by reference. But if it's a bare `u32`, it's passed by value. I don't know if it will always be this simple (I assume there are plans to pass "small" structs by value.)

The workaround for this situation is to make an explicit copy using a redundant-seeming optimization, probably accompanied by a comment explaining what's going on. Or else to restructure the code at a higher level, but then this footgun will still be lurking in the shadows.

I think that any optimistic "assume no aliases" optimization ought to be opt-in rather than opt-out. That would mean, either go back to the C way of things, or add a new syntax (some symbol that means "compiler can choose between value and const pointer").

Either way, a plain argument should always be passed by value. What do others think?
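For contrast, here is a minimal sketch of the scalar case the issue mentions, where the argument is a bare `u32` rather than the struct. `innerScalar` is a hypothetical variant of `inner` written for this illustration, and it returns the value it observes instead of printing, so it does not depend on the 0.6.0-era `std.debug.warn`.

```zig
const std = @import("std");

const Thing = struct { value: u32 };
const State = struct { thing: Thing };

// Hypothetical variant of `inner`: takes the bare u32 instead of the
// whole struct, writes through the pointer, then reports what the
// by-value argument looks like afterwards.
fn innerScalar(state: *State, value: u32) u32 {
    state.thing.value = 0;
    return value;
}

test "a scalar argument is a snapshot taken at the call site" {
    var state: State = .{ .thing = .{ .value = 10 } };
    // The u32 is read before the call, so the later write is not visible.
    try std.testing.expectEqual(@as(u32, 10), innerScalar(&state, state.thing.value));
    try std.testing.expectEqual(@as(u32, 0), state.thing.value);
}
```

This only shows the unaffected case. Whether the struct version behaves the same way is exactly what the issue is about, and it depends on the compiler version.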

LucasSantos91 · May 28, 2026, 8:19pm

Look again. Item can be a u8, and the issue still happens. The item is semantically copied into the function's stack frame, so it should be semantically safe to move the data inside the list somewhere else. But if the value was surreptitiously changed to a pointer, that pointer will be dangling after the realloc, before it could be copied.
Search for “Attack of the Killer Features” on YouTube, by SpexGuy.

LucasSantos91 · May 28, 2026, 8:21pm

Worst title ever for an issue. It should have been called “Hidden pass by reference miscompilation”. A footgun is when a feature is easy to misuse. This is correct code being broken by an optimization; that’s a miscompilation.

npc1054657282 · May 28, 2026, 8:28pm

It may need to be reiterated that here, the ‘pass-by-reference’ is designed based on the Windows calling convention.

For these aggregate types passed as a pointer, including __m128, the caller-allocated temporary memory must be 16-byte aligned.

Simply put, for passing larger structures, the Windows calling convention is: allocate a temporary copy on the stack, while passing the parameter by reference.

This allows the caller to optimize based on known information: the caller knows the mutability and aliasing of the parameter, and if the caller knows the parameter is immutable, the temporary copy can be omitted.

And based on Zig’s previous issues, I believe that Zig does not make a temporary copy for parameters that are not immutable either.
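To make the caller-side copy concrete, here is a hand-written model of the convention described above. It is only a sketch: `Big`, `sum` and `consume` are made-up names, and this is ordinary Zig code imitating the lowering by hand, not what any compiler actually emits.

```zig
const std = @import("std");

const Big = struct { data: [128]u8 };

// Read-only helper: adds up all bytes.
fn sum(big: *const Big) u32 {
    var total: u32 = 0;
    for (big.data) |byte| total += byte;
    return total;
}

// Stand-in for a callee that received its "by value" argument as a
// pointer to caller-owned temporary storage. It is free to modify
// that storage, because nobody else is supposed to see it.
fn consume(arg: *Big) u32 {
    arg.data[0] = 0;
    return sum(arg);
}

test "mutable source: the caller copies into a temporary first" {
    var original: Big = .{ .data = [_]u8{1} ** 128 };
    original.data[1] = 3; // the source is mutable, so a copy is required
    var tmp = original; // caller-allocated temporary
    try std.testing.expectEqual(@as(u32, 129), consume(&tmp));
    // The callee only ever touched the temporary.
    try std.testing.expectEqual(@as(u8, 1), original.data[0]);
}

test "immutable source: the temporary can be skipped" {
    const original: Big = .{ .data = [_]u8{1} ** 128 };
    // Nothing can write to `original`, so passing its address directly
    // to a read-only callee is indistinguishable from passing a copy.
    try std.testing.expectEqual(@as(u32, 128), sum(&original));
}
```

The second test is the case where the caller's knowledge lets it omit the copy. The first is the case where it cannot.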

pzittlau · May 28, 2026, 8:34pm

Oh, my bad. I overlooked the Windows part beforehand.

It’s late for me so excuse me if I’m wrong, but isn’t this still susceptible to this?

npc1054657282 · May 28, 2026, 8:38pm

pzittlau:

pub fn main(init: std.process.Init) !void {
    var list: std.ArrayList(LargeItem) = .empty;
    defer list.deinit(init.gpa);

    try list.append(init.gpa, LargeItem{ .data = [_]u8{42} ** 128 });
    // Value semantics guarantee this should be a safe snapshot copy.
    try ownAppend(&list, init.gpa, list.items[0]);
}

My understanding is as follows:
list is mutable, and list.items[0] is mutable.
Therefore, semantic analysis cannot conclude that the temporary copy can be eliminated here, so the value will still be copied into a temporary, following the usual pass-by-value semantics.

Does this explanation help?

LucasSantos91 · May 28, 2026, 8:43pm

Yes, now find any example where this is safe, guaranteed, and can be proven programmatically. There will be a minuscule number of trivial cases, which are already optimized in every compiler. Zig’s PRO was supposed to be the next step forward.

npc1054657282 · May 28, 2026, 8:47pm

My ambition is not that great; my idea is simply this: for stateless, immutable data, whether the pass-by-value copy can be optimized away can be determined semantically, so that even users who build without optimizations can be assured that there is no additional overhead here.

This is especially useful for generics.

matklad · May 28, 2026, 8:52pm

The thing is, whether the code is correct depends on the definition of the language. You could define a language such that this sort of aliasing is illegal behavior, and we might still get something like that eventually: noalias on all parameters by default (with debug safety); ability to specify mayalias · Issue #1108 · ziglang/zig · GitHub.

My understanding is that, at the time the issue was written, this was literally undefined behavior, in a sense that the language didn’t conclusively rule one way or another whether this sort of code is valid.

gwenzek · May 28, 2026, 8:53pm

copy optimization for pass-by-value can be determined semantically

I like the idea. Though I always hated the “will copy elision trigger” C++ game.
But maybe that’s a UX problem. If there were a way to ask the Zig compiler “is there a copy elision here?”, it would be way better.

LucasSantos91 · May 28, 2026, 9:10pm

With PRO, some types

Defining it this way would make some kinds of code impossible (literally) to write correctly. Which was exactly the case when PRO existed.

Take a look at this beautiful example. I doubt anyone will disagree with PRO being removed after seeing it.
In the example, the functions involved had no pointer arguments, and in fact, there’s not a single pointer to be seen, other than a discarded one (_ = &b), which happens after the function call. That user even explicitly created an unnecessary copy on the stack, even though the argument itself was passed by value and should be, semantically, copied. The copy explicitly eliminates aliasing.
PRO attacked anyways.
The compiler eliminated the explicit copy and passed a one-byte struct by reference, creating aliasing, even though semantically the user did everything to avoid aliasing.

LucasSantos91 · May 28, 2026, 9:13pm

I don’t think it’s possible to determine these conditions hold without analyzing the code, that is, enabling optimizations. And I’m pretty sure your small ambition is what LLVM already does, with optimizations enabled.

gwenzek · May 28, 2026, 9:35pm

I feel the discussion is a bit too much all or nothing.
I would love some reliable optimization that LLVM currently doesn’t do.

The main issue with PRO is that a pointer may alias the magic pointer behind one of the arguments.

But there can be no alias if nobody knows the magic pointer.
Which is true if you just created the value on the stack and you didn’t give the pointer to anyone.
It means that once a pointer to an argument or a stack variable has been taken, any further pass-by-reference call requires making a new copy.

In the array list example, items[0] isn’t on the stack, so it needs to be copied there before being passed by reference.

The other big footgun with naive PRO:

fn merge(a: *A, b: A) void { ... }

fn shootfoot() void {
  var a: A = .init;
  merge(&a, a);
}

Here shootfoot needs two copies of a, because the first argument is a reference to the same a.

gwenzek · May 28, 2026, 9:36pm

This example is baffling,
but it seems to be due to a poorly implemented PRO, rather than PRO in general.
Why would const b = a; not materialize a copy of the global variable a on the stack ?

npc1054657282 · May 28, 2026, 9:38pm

If we want to infer aliasing situations, I think it requires relatively complex semantic analysis. However, I believe that in most cases the demands we face are very simple: when we pass stateless data through the abstraction of a function, we do not want the parameter passing to incur an extra meaningless expensive copy in an unoptimized scenario. To explicitly ensure this, one would have to manually write noalias *const T in the function signature, which is usually not what we want.

The cost for users to achieve this guarantee is very simple: cultivate the good habit of avoiding var when using stateless data. Doing so makes it easy to determine through semantics that optimization is possible.
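For reference, this is roughly what the two spellings look like side by side. It is only a sketch: `Config`, `lookupByValue` and `lookupByPointer` are made-up names, and in Zig the `noalias` keyword is written in front of the parameter name.

```zig
const std = @import("std");

const Config = struct { table: [64]u32 };

// By value: semantically the callee gets its own copy. Whether a copy
// is physically made is up to the compiler and the build mode.
fn lookupByValue(cfg: Config, i: usize) u32 {
    return cfg.table[i];
}

// Explicit opt-in: the caller passes a pointer and promises that
// nothing else modifies the pointee during the call.
fn lookupByPointer(noalias cfg: *const Config, i: usize) u32 {
    return cfg.table[i];
}

test "both spellings read the same stateless data" {
    // Declared `const`: nothing can mutate it, which is the situation
    // where skipping the copy would be unobservable.
    const cfg: Config = .{ .table = [_]u32{7} ** 64 };
    try std.testing.expectEqual(@as(u32, 7), lookupByValue(cfg, 3));
    try std.testing.expectEqual(@as(u32, 7), lookupByPointer(&cfg, 3));
}
```

The pointer version gives the guarantee today, at the price of changing the signature and every call site. The wish above is to get the same guarantee for the by-value spelling when the data is `const`.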

matklad · May 28, 2026, 9:41pm

I wouldn’t be so categorical! I think there is a coherent language where passing parameters, and const bindings in general, is specified as creating an alias for data, where the aliased data is considered “borrowed” (so, mutating it while such a binding is active is illegal behavior), and where there’s an explicit operator to force a copy.

I can clearly see two ways to look at this whole design space:

You can see it as extremely weird that function parameter isn’t an independent copy of memory, and can alias something else.

But, equally, you could see it as extremely weird that compiler silently duplicates or moves values behind your back.

gwenzek · May 28, 2026, 9:48pm

I feel one thing that would help the compiler detect aliases is to prevent pointers from being copied by default.
Like fn (x: *Foo) void is not allowed to store x pointer anywhere. Only fn (x: *free-for-all Foo) can.

gwenzek · May 28, 2026, 9:55pm

LLVM is actually quite bad at removing copies, unless it inlines.

In this example I would expect exactly one copy, when receiving data from the outer world.

LucasSantos91 · May 28, 2026, 10:11pm

Maybe. I would hope so. My biggest wish for Zig is for PRO to come back correctly. I don’t know if it’s impossible, but the evidence does point in that direction. Like I said earlier, if we use C’s semantics for parameter passing, then we benefit from C’s optimizations in LLVM. And those guys have been optimizing C for a very long time, and they’ve done amazing things, but they have failed to implement this. Then Andrew and the core team came, stubbornly insisted on it for so long, and also failed. If it is possible, it will certainly take a genius with some kind of breakthrough. We’d have a better shot at this if we changed the semantics of parameter passing to something different from C, like the in/out parameters that some languages are trying.

LucasSantos91 · May 28, 2026, 10:26pm

That code is unoptimized. On ReleaseFast the behavior is what you expected.