Hello, I’m quite new to Zig and I’m trying to understand why the * prefix is necessary in |*i| to get a pointer capture when looping over a reference to an iterable in a for loop:
const arr = [_]u8{1, 2, 3};
for (&arr) |*i| {
// "i" here is a pointer that can be dereferenced via: "i.*"
}
because when looping by reference (&), I expected the loop variable i to be a pointer directly (without the * prefix).
Also interestingly, the type of i in both:
1.) for (arr) |i| {}
2.) for (&arr) |i| {}
is the same (a u8), whereas I expected the second case to involve a pointer.
Could someone please shed some light or intuition on why that is the case?
The &arr syntax refers to the array itself, not to the items in the array - you are capturing the array by pointer and then unpacking each element by value. That’s why they’re the same.
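A minimal sketch of that point, assuming a reasonably recent Zig compiler and run with `zig test`. It only checks the capture types with `@TypeOf`; the test name is made up for this illustration.

```zig
const std = @import("std");

test "capture type follows the capture syntax, not the operand" {
    const arr = [_]u8{ 1, 2, 3 };

    // Array operand, value capture: i is a u8 copy.
    for (arr) |i| {
        try std.testing.expect(@TypeOf(i) == u8);
    }

    // Pointer-to-array operand, value capture: still a u8 copy.
    for (&arr) |i| {
        try std.testing.expect(@TypeOf(i) == u8);
    }

    // Pointer-to-array operand, pointer capture: a pointer to the element.
    // arr is const here, so the element pointer is *const u8.
    for (&arr) |*i| {
        try std.testing.expect(@TypeOf(i) == *const u8);
    }
}
```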
If you’re coming from Rust, I can understand why you would expect looping over a pointer to an array to produce pointers to the items; that’s exactly what Rust does. But in Zig it helps to think about the capture here |i| or |*i| as how you define function parameters fn foo(i: u8) versus fn foo(i: *u8). You are controlling whether you want to mutate the item or not just as you control whether you want to mutate a passed-in argument to a function or not.
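To make the function-parameter analogy concrete, here is a small sketch. `readByValue` and `bumpInPlace` are hypothetical helpers invented for this illustration; the capture in each loop has the same type as the parameter it is passed to.

```zig
const std = @import("std");

// Hypothetical helper: takes a copy, like a |i| capture.
fn readByValue(i: u8) u8 {
    return i + 1; // only the local copy is involved
}

// Hypothetical helper: takes a pointer, like a |*i| capture.
fn bumpInPlace(i: *u8) void {
    i.* += 1; // writes through the pointer into the caller's data
}

test "captures mirror parameter types" {
    var arr = [_]u8{ 1, 2, 3 };

    var sum: u32 = 0;
    for (arr) |i| {
        sum += readByValue(i); // i: u8
    }
    try std.testing.expectEqual(@as(u32, 9), sum);

    for (&arr) |*i| {
        bumpInPlace(i); // i: *u8
    }
    try std.testing.expectEqualSlices(u8, &[_]u8{ 2, 3, 4 }, &arr);
}
```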
I think one of the reasons is explicitness. One of Zig’s goals is to be very explicit. In one of Andrew’s demos he actually starts by having his audience read and debug some metaprogramming in Zig, which is notoriously one of the hardest parts of any language, and he makes the point that explicit syntax helps people interact with a language. The for loop capture syntax might feel a bit odd at first, but let’s take a quick example:
pub fn foo(objects: String) void {
for (objects) |obj| {
// do stuff
}
}
pub fn bar(objects: String) void {
for (&objects) |*obj| {
// do stuff
}
}
In the first example, where you don’t use explicit syntax for your iteration, you can’t tell whether you are iterating over a copy or modifying the source “objects” collection unless you know the definition of “String”. In the second example it’s more obvious to the naked eye that you are taking a pointer, which probably means you are modifying the content itself.
In this case, we’re iterating over the array without adding anything to the syntax. Each value of |i| is captured as a copy.
So what happens if I do this?
for (&numbers) |i| {
result *= i;
}
A few things happen here - first, we get a pointer to that array. That pointer is then converted into a slice by the loop. We’re now iterating over a slice and copying each value of |i| just like in the first loop. The capture |i| has not changed.
That’s exactly the same if we did the following:
for (numbers[0..]) |i| {
result *= i;
}
In the previous example, &numbers was implicitly converted to a slice. In this example, we’re explicitly asking for a slice.
In every example you try, whether the operand is numbers, &numbers or even numbers[0..], you’ll always be copying if you capture by |i|. The reverse is true too: whenever a |*i| capture is accepted, you’ll always be getting a pointer.
How you pass the array to the loop and how you capture its values are independent of each other.
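A short sketch of that independence, assuming a recent Zig compiler. The `numbers` values and the local names `a`, `b` and `c` are chosen only for illustration; all three operand forms produce the same product with a value capture, and the pointer capture mutates in place.

```zig
const std = @import("std");

test "operand form and capture form are independent" {
    var numbers = [_]u8{ 2, 3, 4 };

    // Value capture with three different operand forms.
    var a: u32 = 1;
    for (numbers) |i| {
        a *= i;
    }
    var b: u32 = 1;
    for (&numbers) |i| {
        b *= i;
    }
    var c: u32 = 1;
    for (numbers[0..]) |i| {
        c *= i;
    }
    try std.testing.expectEqual(@as(u32, 24), a);
    try std.testing.expectEqual(a, b);
    try std.testing.expectEqual(a, c);

    // Pointer capture over an explicit slice: writes reach the array.
    for (numbers[0..]) |*i| {
        i.* *= 2;
    }
    try std.testing.expectEqualSlices(u8, &[_]u8{ 4, 6, 8 }, &numbers);
}
```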
Keep in mind, this is a trivial example. I’m inclined to believe you’ll get the same result either way, but that’s a good thing for you to look into.
One way you can try to confuse the compiler is by passing an argument as a slice vs pointer vs array to a function that isn’t inlined and see what it has to come up with.
I tried to make it slightly less trivial, and I tried native and non-native types. For the native type, all 4 ways of writing the loop (even capturing a pointer) gave me the same code. It appears to be using an index under the hood regardless.
For the non-native type the capture was a little different, but no matter what I put in the for header (array or slice), it didn’t matter. I’m not sure about smaller structs.
However, I still don’t understand why providing the pointer capture value |*i| alone (or just the reference &arr) isn’t sufficient to declare my intent of wanting to mutate the items of an array, and hence wanting to deal with pointers?
If both for (numbers) |i| and for (&numbers) |i| ultimately copy over the capture value |i|, why would anyone opt for the latter? Also, does the latter implicitly dereference the capture to unpack each element by value?
Is there a use-case for for (numbers) |*i| and for (&numbers) |i|?
Yes, there are cases where one is preferable to another - absolutely.
Let’s get the obvious ones out of the way… if you want to mutate something, pointers are going to allow you to do that by default if you’re not pointing at a const thing.
However, let’s think this through carefully - how large is a pointer? It’s the size of a usize value. So if I copy a pointer, I’m copying something the size of a usize. Anything less than that size (such as i32, u8, etc…) is actually smaller than the pointer to that thing itself.
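The size argument can be checked directly with `@sizeOf`. This is a sketch that assumes a typical target where a pointer is wider than one byte; `Pair` is a hypothetical type standing in for "a pair of usize".

```zig
const std = @import("std");

// Hypothetical type, used only to illustrate a value larger than a pointer.
const Pair = struct { a: usize, b: usize };

test "pointer size versus small element types" {
    // A single-item pointer is as wide as usize.
    try std.testing.expect(@sizeOf(*u8) == @sizeOf(usize));
    // A u8 element is smaller than a pointer to it.
    try std.testing.expect(@sizeOf(u8) < @sizeOf(*u8));
    // Two usize fields: twice the size of a pointer.
    try std.testing.expect(@sizeOf(Pair) == 2 * @sizeOf(usize));
}
```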
I remember this coming up a lot in C++… “should I pass by reference or value?” because beginners often had the idea that a “copy” was slower, so they’d opt to make everything references. For small types, this actually isn’t true at all. In fact, if you are passing by pointer, you may have to go fetch that value every single time. Even if the thing is larger than a pointer (like a pair of usize), it may still be more optimal to copy that if you’re repeatedly using it in a local context.
Sometimes passing by reference is an optimization, but not always. For large things where the copy is particularly brutal, reference will be remarkably faster (especially if it’s going through multiple functions or in a loop where a big copy will happen every iteration).
Sorry to keep this going, but is my understanding below “roughly” correct?
var numbers = [_]u8{ 1, 2, 3 };
// Capture elements by value (copy):
for (numbers) |n| {
// "n: u8" a created copy of each element in "numbers"
}
// Capture elements by "read-only" reference:
for (&numbers) |n| {
// "n: u8" an accessed (read) value (dereferenced pointer) of each element in "numbers"
}
// Capture elements by "write" reference:
for (&numbers) |*n| {
// "n: *u8" a pointer capture value of each element in "numbers" that can be dereferenced via "n.*"
}
// Invalid:
for (numbers) |*n| {
// error: pointer capture of non pointer type
}
// Capture elements by value (copy):
for (numbers) |n| {
// "n: u8" a created copy of each element in "numbers"
}
Yes.
// Capture elements by "read-only" reference:
for (&numbers) |n| {
// "n: u8" an accessed (read) value (dereferenced pointer) of each element in "numbers"
}
These are copies, too. If you put really big types in your array, like…
…you’ll see the compiler start emitting memcpy instructions.
// Capture elements by "write" reference:
for (&numbers) |*n| {
// "n: *u8" a pointer capture value of each element in "numbers" that can be dereferenced via "n.*"
}
Essentially. The pointer itself is const. You can’t change what the pointer is pointing to, but you can change the value that it’s referring to.
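As a small sketch of that distinction (the values are arbitrary): writing through `n.*` is allowed, while reassigning `n` itself is rejected by the compiler because captures are constants. The rejected line is left commented out so the example still builds.

```zig
const std = @import("std");

test "pointer capture: the pointee is writable, the capture is not" {
    var numbers = [_]u8{ 1, 2, 3 };

    for (&numbers) |*n| {
        n.* += 10; // fine: modifies the element the pointer refers to
        // n = &numbers[0]; // would not compile: n itself is a constant
    }

    try std.testing.expectEqualSlices(u8, &[_]u8{ 11, 12, 13 }, &numbers);
}
```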
// Invalid:
for (numbers) |*n| {
// error: pointer capture of non pointer type
}
Yes - this currently will not compile. Essentially, if you capture with |i| then expect a copy.
In theory that seems correct, but I get the same asm for all of those with native types. For larger types I get the same code for anything without capture by pointer, and I still get the same code for capture by pointer with both ways of passing in the array.
You just posted the asm up top that showed the same code generated. How would there be a distinction then? Not trying to be a pain. I just can’t see the differences and get confused by them constantly.
For the last example, is that really an error? I thought I had it working on godbolt, but maybe it was the type I used instead of native types.
I think those two are the same. I get the same asm for native types for both of those. For structs I also get the same asm for both. I can’t find a distinction unless it matters how you use the capture (I didn’t test that).
I was just about to suggest that. Haven’t tried it, but I suspect the Zig compiler is smart enough to see that even if capturing by |*i| if you don’t actually mutate it, it will behave as |i|?
Which, btw, I definitely give you kudos here. There’s a reason this is marked as a footgun: it isn’t obvious when it emits copies. It’s definitely interesting that it’s selectively calling memcpy based on usage.
Are you dynamically linking against libc? Why is it calling into the library memcpy through the PLT? What compiler flags are you using? I thought LLVM had a compiler builtin for memcpy.
So if the capture isn’t dirtied it elides the copy. Good to know.
I’m actually surprised that it didn’t notice the store is never loaded and goes out of scope so that can be removed too. Actually really surprised.
The untouched store and the PLT call. LLVM really jacked up the codegen.