Function Parameter Immutability

Only in single-threaded code, no? In MT code the value can change underneath you, since Zig is allowed to pass the argument by pointer to const if it decides to (and almost always does for non-native types), so you have to make an explicit copy. That’s a next-level footgun.

I think parameters are thread-local variables on their respective stacks?
Sure, your parameter can be a pointer to some shared memory, and then you have synchronization problems to deal with, but the pointer value itself is still immutable, even if it points to something that is mutable.

I don’t understand what you mean here. Just because something is multithreaded doesn’t mean that those threads access shared memory?
Threads can have their own islands of memory that they work on.

I mean that at a source level, you can try to pass by copy and it looks like pass by copy, but the compiler can turn it into pass by const pointer if it wants to, and there is no way to tell if that will happen or not.

So telling somebody that a parameter of their function is, by all accounts, an immutable copy sets up a nasty MT footgun, because the behavior can change per compile (e.g., if there is register pressure, the compiler can pass a pointer in one scenario and a copy in another).

fn aaa(x: Mine, ...) out
fn bbb(x: Mine, ...) out

It is entirely possible for those to have two different semantics in the same program or between compilations in MT code. It’s possible in ST code too if you have a self pointer in x or are using some form of coroutines.

But given that, if the compiler performs PRO (parameter reference optimization), the pointer is still a *const, so in practical terms the parameter is still immutable and multithreaded access is safe, no?

In the backend, definitely yes. In the front-end I’d say no. From the calling code’s perspective it should have the same semantics. From the receiving code’s perspective there is just no way to know that the syntactic copy isn’t really a copy and can change on you at any time.

From an MT perspective, you can’t even reliably copy what appears to already be a copy (but is just a pointer), because some other thread could be simultaneously modifying the struct.

I think I understand what you mean, but it seems to me that even if you try to make this mutation happen on purpose, you just can’t do it with the language alone. If it’s possible, I would love to see the code that achieves it. It would be like the sample Rust code floating around that demonstrates how you can access a reference beyond its lifetime.

I think you guys are just rehashing the hidden pass-by-reference footgun. It has been shown that mutations to parameters can happen because of how aliasing and PRO interact.
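For anyone who hasn’t seen it, here is a minimal single-threaded sketch of that aliasing interaction. The names `Big` and `writeThenRead` are made up for illustration, and whether the callee observes the write depends on the compiler version, the target, and the optimization mode, so no particular output is claimed.

```zig
const std = @import("std");

// Hypothetical struct, large enough that the compiler may prefer to pass it
// by hidden reference instead of copying it.
const Big = struct {
    values: [64]u32,
};

// `src` looks like a by-value copy at the source level.
fn writeThenRead(dst: *Big, src: Big) u32 {
    dst.values[0] = 99;
    // If `src` was lowered to a hidden reference to the same object that
    // `dst` points at, this read observes the write above.
    return src.values[0];
}

pub fn main() void {
    var big = Big{ .values = [_]u32{1} ** 64 };
    // The same object is passed both "by value" and by pointer.
    const seen = writeThenRead(&big, big);
    // 1 means a real copy was made, 99 means the parameter aliased `big`.
    std.debug.print("callee read {d}\n", .{seen});
}
```

Nothing in the signature of `writeThenRead` tells you which of the two results you will get.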

In an MT setting it’s trivial (there are already bug reports on this where the listed solution was: make a copy if you really want a copy, don’t rely on the calling convention to copy for you):

// Just a simple struct where we may be calculating a yearly
// average from monthly data. It can be anything, even just a simple
// collection of u32s: .jan, .feb, .mar, etc. There's no depth, so a shallow
// copy is the same as a deep copy.
const Data = struct {
    mo: [12]u32,
    // other random stuff
};

// thread 1 calls:
fn calc_avg(c: Data) u32 {
    // `c` might be a copy or might be a hidden *const Data
    var sum: u32 = 0;
    for (c.mo) |m| sum += m;
    return sum / 12;
}

// thread 2 calls:
fn set_data(d: *Data) void {
    // this sets d.mo elements in some manner;
    // something() is a placeholder for any source of new values
    for (0..12) |i| {
        d.mo[i] = something();
    }
}

If you want to reuse the Data struct, is calc_avg going to be affected by set_data? It’s a guess. The semantics are different depending on how the compiler optimizes it.

That’s the concept. A more real world scenario is this:

// This is the exact definition of a small buffer optimized string in tiny mode
const SsoString = struct {  len: u8, text: [15]u8 };

// thread 1
fn upcase(s: SsoString) SsoString { ... }

// thread 2
fn set(s: *SsoString) void { ... }

Good luck figuring out what that does, what is safe and what is not. upcase might be running over a string that set is mutating (e.g., I use a small-mode string to hold pieces of a network packet and then recycle them).

I can’t just pass them down. I have to make a copy to ensure they are not being clobbered when I send them to a function, but the compiler could then decide to make another copy to pass in or pull the values back out and put them in a register.
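A rough sketch of the “make a copy if you really want a copy” workaround, assuming you are willing to change the signature: take an explicit `*const` so the aliasing is visible, and copy into a local that the function owns. The body of `upcase` here is my own guess at what such a function would do, and the caller still has to make sure no other thread is writing while the copy itself is taken.

```zig
const std = @import("std");

// Mirrors the small-string struct from the post above.
const SsoString = struct { len: u8, text: [15]u8 };

// Explicit pointer in, explicit value out: the aliasing is visible in the
// signature instead of being left to the calling convention.
fn upcase(s: *const SsoString) SsoString {
    var out = s.*; // explicit local copy that this function owns
    // Assumes len <= 15; a larger len trips a bounds check in safe builds.
    for (out.text[0..out.len]) |*ch| {
        ch.* = std.ascii.toUpper(ch.*);
    }
    return out;
}

pub fn main() void {
    var shared = SsoString{
        .len = 3,
        .text = [_]u8{ 'a', 'b', 'c' } ++ ([_]u8{0} ** 12),
    };
    const up = upcase(&shared);
    // Recycling the original afterwards does not touch the returned copy.
    shared.text[0] = 'x';
    std.debug.print("{s}\n", .{up.text[0..up.len]});
}
```

This does not make the concurrent case safe by itself. It only removes the guesswork about whether the parameter is a copy or a reference.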

The question isn’t whether these are acceptable optimizations (they might be), but saying parameters at the lexical level are pass by value sets people up for blowing their foot off really easily (it’s happened a few times on blogs and GitHub).

It doesn’t even have to be MT. If you are using any form of coroutine (even some makeshift one that you don’t realize mimics a yield point, e.g., a network event loop), you can return from / re-enter the function and an argument can be completely different, because you went off and called code that mutated what lexically appears to be a copy.

You can even do it by accident with a struct that has a self pointer (or a chain of pointers that comes back to itself). If there were a true copy on the stack, a write through that pointer would modify the original in the caller’s stack frame and leave the copy alone. With a hidden reference, the write changes the parameter too, so the attempt to pass a copy and snapshot it gets subverted.
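A small sketch of the self-pointer case, with made-up names (`Node`, `bumpThenRead`) and padding added only to make the struct large enough that a hidden reference is plausible. Which value comes back depends on how the compiler decides to pass `n`.

```zig
const std = @import("std");

// Hypothetical struct holding a pointer back to itself.
const Node = struct {
    value: u32,
    self: ?*Node,
    pad: [64]u8,
};

// `n` reads like a snapshot of the caller's node.
fn bumpThenRead(n: Node) u32 {
    // Writes to the caller's original through the self pointer.
    if (n.self) |p| p.value += 1;
    // With a true copy this is still the old value; with a hidden reference
    // it is the value that was just written.
    return n.value;
}

pub fn main() void {
    var node = Node{ .value = 1, .self = null, .pad = [_]u8{0} ** 64 };
    node.self = &node;
    // 1 means `n` was a real copy, 2 means it aliased `node`.
    std.debug.print("callee read {d}\n", .{bumpThenRead(node)});
}
```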

Here’s a question: assuming Z has no pointers of any type and guess calls no other functions (i.e., it is a pure function), is this always thread safe?

fn guess(x: Z) bool { ... }

It depends on what the optimizer does.

I see. I recently saw an interview where Chris Lattner mentions that in Mojo all function parameters are always pointers. I guess that’s a viable alternative and shouldn’t hurt performance given pointers can be shuffled around in registers just like “small” values can, right?

I don’t know how Mojo works. I couldn’t find what you are describing (in fact, it says Mojo supports both value and reference semantics), but I’m not sure what that means or how it is implemented.

I can’t imagine putting native types on the stack so an address can be taken to pass it to a function. That seems like it would kill performance, but maybe it does and just relies on inlining to remove the need for it.

Java does a mix where all primitive types are passed by copy and all class types are passed as references, which seems to work in practice.

Found it: https://youtu.be/9ag0fPMmYPQ?si=u78iadVTDaVF3TkI&t=530

So actually, Mojo will borrow by default. If you use the owned qualifier, it’ll move, but only if you then use the caret on the argument when calling. If not, it’ll copy.

Edit: Check out around 14:00, where they explain that parameters are borrowed by default and if you try to mutate them in the function, you get a compiler error.

Edit2: This table is the essence of it all in Mojo:
