Inquiry about 'for' loop syntax when using references

OSuwaidi · April 9, 2024, 9:15pm

Hello, I’m quite new to Zig and I’m trying to understand why the * prefix is necessary in |*i| to make a pointer capture value when looping over a reference to an iterable in a for loop:

    const arr = [_]u8{1, 2, 3};
    for (&arr) |*i| {
        // "i" here is a pointer that can be dereferenced via: "i.*"
    }

because when looping by reference (&), I expected the loop variable i to be a pointer directly (without the * prefix).

Also interestingly, the type of i in both:
1.) for (arr) |i| {}
2.) for (&arr) |i| {}

is the same (a u8), whereas I expected the second case to involve a pointer.

Could someone please shed some light or intuition on why that is the case?

AndrewCodeDev · April 9, 2024, 9:50pm

Hey @OSuwaidi - welcome to Ziggit

The &arr syntax refers to the array itself, not to the items in the array - you are capturing the array by pointer and then unpacking each element by value. That’s why they’re the same.

dude_the_builder · April 9, 2024, 10:11pm

If you’re coming from Rust, I can understand why you would expect looping over a pointer to an array to produce pointers to the items; that’s exactly what Rust does. But in Zig it helps to think about the capture here |i| or |*i| as how you define function parameters fn foo(i: u8) versus fn foo(i: *u8). You are controlling whether you want to mutate the item or not just as you control whether you want to mutate a passed-in argument to a function or not.
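To make the function-parameter analogy concrete, here is a minimal sketch. The helpers plusOne and increment are hypothetical names introduced only for illustration; they are not from the thread.

```zig
const std = @import("std");

// Hypothetical helpers, named here only for illustration.
fn plusOne(i: u8) u8 {
    return i + 1; // `i` is a copy; the caller's value is untouched
}

fn increment(i: *u8) void {
    i.* += 1; // writes through the pointer to the caller's value
}

test "captures mirror by-value and by-pointer parameters" {
    var arr = [_]u8{ 1, 2, 3 };

    var sum: u32 = 0;
    for (arr) |i| {
        sum += plusOne(i); // |i| behaves like a parameter `i: u8`
    }
    try std.testing.expectEqual(@as(u32, 9), sum);
    // The array is unchanged after the by-value loop.
    try std.testing.expectEqualSlices(u8, &[_]u8{ 1, 2, 3 }, &arr);

    for (&arr) |*i| {
        increment(i); // |*i| behaves like a parameter `i: *u8`
    }
    try std.testing.expectEqualSlices(u8, &[_]u8{ 2, 3, 4 }, &arr);
}
```

Running this with zig test should show that only the |*i| loop changes the array.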

pierrelgol · April 10, 2024, 9:15am

I think one of the reasons is explicitness. One of Zig’s goals is to be very explicit. In one of Andrew’s demos he actually starts by making his audience read and debug some metaprogramming in Zig, which is notoriously one of the hardest parts of any language, and he makes the point that explicit syntax can help people interact with a language. Now, the for loop capture syntax might feel a bit odd at first, but let’s take a quick example:

pub fn foo(objects: String) void {
    for (objects) |obj| {
        // do stuff
    }
}

pub fn bar(objects: String) void {
    for (&objects) |*obj| {
        // do stuff
    }
}

In the first example, where you don’t use an explicit syntax for your iteration, unless you know the definition of “String” you don’t know whether you are iterating over a copy or modifying the source “objects” collection. Whereas in the second example it’s more obvious to the naked eye that you are taking a pointer, which probably means you are modifying the content itself.

nyc · April 10, 2024, 9:46pm

But there are 4 ways to write that, and while those two seem clear, the other two are then just as confusing. How would these differ from those?

 for (&objects) |obj|
 for (objects) |*obj|

AndrewCodeDev · April 10, 2024, 10:29pm

The confusion here is arising about why we’d use &something at all.

To clear things up, let’s take the example below:

export fn foo() i32 {

    const numbers: [5]i32 = .{ 1, 2, 3, 4, 5 };
 
    var result: i32 = 1;

    for (numbers) |i| {
        result *= i;
    }

    return result;
}

In this case, we’re iterating over the array without adding anything to the syntax. Each value of |i| is captured as a copy.

So what happens if I do this?

    for (&numbers) |i| {
        result *= i;
    }

A few things happen here - first, we get a pointer to that array. That pointer is then converted into a slice by the loop. We’re now iterating over a slice and copying each value of |i| just like in the first loop. The capture |i| has not changed.

That’s exactly the same if we did the following:

    for (numbers[0..]) |i| {
        result *= i;
    }

In the previous example, the &numbers was coerced to a slice. In this example, we’re explicitly asking for a slice.

In every example you’ll try, regardless of it being numbers, &numbers or even numbers[0..], you’ll always be copying if you capture by |i|. The reverse is true too - you’ll always be getting a pointer if you capture by |*i|.

How you pass the array to the loop and how you capture its values are independent of each other.
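As a rough check of that independence, the sketch below passes the same data as an array, as a pointer to an array, and as a slice, always capturing with |i|. The three function names are assumptions made up for this illustration.

```zig
const std = @import("std");

// Hypothetical helpers: same loop body and same |i| capture,
// only the type handed to the loop differs.
fn productOfArray(numbers: [5]i32) i32 {
    var result: i32 = 1;
    for (numbers) |i| {
        result *= i;
    }
    return result;
}

fn productOfPointer(numbers: *const [5]i32) i32 {
    var result: i32 = 1;
    for (numbers) |i| {
        result *= i;
    }
    return result;
}

fn productOfSlice(numbers: []const i32) i32 {
    var result: i32 = 1;
    for (numbers) |i| {
        result *= i;
    }
    return result;
}

test "numbers, &numbers and numbers[0..] give the same result" {
    const numbers: [5]i32 = .{ 1, 2, 3, 4, 5 };
    try std.testing.expectEqual(@as(i32, 120), productOfArray(numbers));
    try std.testing.expectEqual(@as(i32, 120), productOfPointer(&numbers));
    try std.testing.expectEqual(@as(i32, 120), productOfSlice(numbers[0..]));
}
```

This only checks the observable result, not the generated code; comparing the assembly is a separate exercise.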

Hope that helps.

nyc · April 10, 2024, 10:40pm

kinda, does that mean passing numbers all 3 ways will generate the same code?

edit: this is something I can check on godbolt. Thx for trying to explain though.

AndrewCodeDev · April 10, 2024, 10:42pm

Yes, it’s the same:

foo numbers:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 48
        mov     dword ptr [rbp - 20], 1
        mov     qword ptr [rbp - 16], 0
        jmp     .LBB0_2
        
foo &numbers:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 48
        mov     dword ptr [rbp - 20], 1
        mov     qword ptr [rbp - 16], 0
        jmp     .LBB0_2
        
foo numbers[0..]:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 48
        mov     dword ptr [rbp - 20], 1
        mov     qword ptr [rbp - 16], 0
        jmp     .LBB0_2

Keep in mind, this is a trivial example. I’m inclined to believe you’ll get the same result either way, but that’s a good thing for you to look into.

One way you can try to confuse the compiler is by passing an argument as a slice vs pointer vs array to a function that isn’t inlined and see what it has to come up with.

nyc · April 10, 2024, 10:57pm

I tried to make it slightly less trivial, and I tried native and non-native types. For the native type, all 4 ways of doing the loop (even capturing a pointer) gave me the same code. It must be using an index under the hood regardless.

For non-native type the capture was a little different, but no matter what I put in the for header, array, slice, it didn’t matter. I’m not sure about smaller structs.

OSuwaidi · April 11, 2024, 1:07am

That clears up a few things, thanks a lot!

However, I still don’t understand why providing the pointer capture value |*i| alone (or just the reference &arr) isn’t sufficient to declare my intent of wanting to mutate the items of an array, and hence wanting to deal with pointers?

If both for (numbers) |i| and for (&numbers) |i| ultimately copy over the capture value |i|, why would anyone opt for the latter? Also, does the latter implicitly dereference the capture to unpack each element by value?

Is there a use-case for for (numbers) |*i| and for (&numbers) |i|?

Thanks a lot!

AndrewCodeDev · April 11, 2024, 1:42am

@Osuwaidi, great questions.

Yes, there are cases where one is preferable to another - absolutely.

Let’s get the obvious ones out of the way… if you want to mutate something, pointers are going to allow you to do that by default if you’re not pointing at a const thing.

However, let’s think this through carefully - how large is a pointer? It’s the size of a usize value. So if I copy a pointer, I’m copying something the size of a usize. Anything less than that size (such as i32, u8, etc…) is actually smaller than the pointer to that thing itself.

I remember this coming up a lot in C++… “should I pass by reference or value?” because beginners often had the idea that a “copy” was slower, so they’d opt to make everything references. For small types, this actually isn’t true at all. In fact, if you are passing by pointer, you may have to go fetch that value every single time. Even if the thing is larger than a pointer (like a pair of usize), it may still be more optimal to copy that if you’re repeatedly using it in a local context.

Sometimes passing by reference is an optimization, but not always. For large things where the copy is particularly brutal, reference will be remarkably faster (especially if it’s going through multiple functions or in a loop where a big copy will happen every iteration).
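For the large-element case, a minimal sketch of reading each element in place through a pointer capture might look like this. MyType here is just an arbitrary big struct and sumInPlace is a hypothetical helper. Whether a by-value capture would actually emit a copy depends on the compiler and on how the capture is used, so treat this as a way to state intent rather than a guaranteed speedup.

```zig
const std = @import("std");

// An arbitrary large element type, assumed for illustration.
const MyType = struct {
    value: usize = 50,
    data: [200]usize = [_]usize{0} ** 200,
};

// Hypothetical helper: reads each element through a pointer capture.
fn sumInPlace(items: []const MyType) usize {
    var total: usize = 0;
    for (items) |*item| {
        // `item` is a *const MyType here, so the loop does not ask for
        // a MyType-sized copy of the element.
        total += item.value + item.data[42];
    }
    return total;
}

test "pointer capture over large elements" {
    const array = [_]MyType{.{}} ** 5;
    try std.testing.expectEqual(@as(usize, 250), sumInPlace(&array));
}
```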

Please check out the following for more information: Unintentional Copy On Capture

I think I’ll leave it there though

OSuwaidi · April 11, 2024, 6:32am

Sorry to keep this going, but is my understanding below “roughly” correct?

  var numbers = [_]u8{ 1, 2, 3 };

  // Capture elements by value (copy):
  for (numbers) |n| {
      // "n: u8" a created copy of each element in "numbers"
  }

  // Capture elements by "read-only" reference:
  for (&numbers) |n| {
      // "n: u8" an accessed (read) value (dereferenced pointer) of each element in "numbers"
  }

  // Capture elements by "write" reference:
  for (&numbers) |*n| {
      // "n: *u8" a pointer capture value of each element in "numbers" that can be dereferenced via "n.*"
  }

  // Invalid:
  for (numbers) |*n| {
      // error: pointer capture of non pointer type
  }

Thanks again as usual for your assistance!

AndrewCodeDev · April 11, 2024, 6:42am

No problem.

// Capture elements by value (copy):
for (numbers) |n| {
    // "n: u8" a created copy of each element in "numbers"
}

Yes.

// Capture elements by "read-only" reference:
for (&numbers) |n| {
    // "n: u8" an accessed (read) value (dereferenced pointer) of each element in "numbers"
}

These are copies, too. If you put really big types in your array, like…

const MyType = struct {
    value: usize = 50,
    data: [200]usize = .{ 0 } ** 200, // arbitrary big thing
};

…you’ll see the compiler start emitting memcpy instructions.

  // Capture elements by "write" reference:
  for (&numbers) |*n| {
      // "n: *u8" a pointer capture value of each element in "numbers" that can be dereferenced via "n.*"
  }

Essentially. The pointer itself is const. You can’t change what the pointer is pointing to, but you can change the value that it’s referring to.
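A small sketch of that distinction, under the assumption that the array is declared with var (doubleAll is a hypothetical helper). It also shows that iterating a const array with |*n| yields a pointer to const, so writes through it are rejected.

```zig
const std = @import("std");

// Hypothetical helper: doubles each element through a pointer capture.
fn doubleAll(items: []u8) void {
    for (items) |*n| {
        n.* *= 2; // allowed: writes to the element `n` points at
        // n = &items[0]; // would not compile: the capture itself is constant
    }
}

test "pointer capture mutability follows the array" {
    var numbers = [_]u8{ 1, 2, 3 };
    doubleAll(&numbers);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 2, 4, 6 }, &numbers);

    const fixed = [_]u8{ 1, 2, 3 };
    var sum: u32 = 0;
    for (&fixed) |*n| {
        // Elements of a const array are captured as pointers to const.
        try std.testing.expect(@TypeOf(n) == *const u8);
        sum += n.*;
        // n.* = 0; // would not compile: the element is const
    }
    try std.testing.expectEqual(@as(u32, 6), sum);
}
```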

  // Invalid:
  for (numbers) |*n| {
      // error: pointer capture of non pointer type
  }

Yes - this currently will not compile. Essentially, if you capture with |i| then expect a copy.

nyc · April 11, 2024, 2:36pm

In theory that seems correct, but I get the same asm for all those for native types. For larger types I get the same code for anything without capture by pointer and then I still get the same code for capture by pointer on both ways of passing in the array.

You just posted the asm up top that showed the same code generated. How would there be a distinction then? Not trying to be a pain. I just can’t see the differences and get confused by them constantly

For the last example, is that really an error? I thought I had it working on godbolt, but maybe it was the type I used instead of native types.

Am I missing something?

nyc · April 11, 2024, 2:42pm

OSuwaidi:

  // Capture elements by value (copy):
  for (numbers) |n| {
      // "n: u8" a created copy of each element in "numbers"
  }

  // Capture elements by "read-only" reference:
  for (&numbers) |n| {
      // "n: u8" an accessed (read) value (dereferenced pointer) of each element in "numbers"
  }

I think those two are the same. I get the same asm for native types for both of those. For structs I also get the same asm for both. I can’t find a distinction unless it matters how you use the capture (I didn’t test that).

dude_the_builder · April 11, 2024, 2:45pm

I was just about to suggest that. Haven’t tried it, but I suspect the Zig compiler is smart enough to see that even if capturing by |*i| if you don’t actually mutate it, it will behave as |i|?

AndrewCodeDev · April 11, 2024, 4:15pm

Here’s a trivial example you can put on godbolt and see this in action:

const MyType = struct {
    value: usize = 50,
    data: [200]usize = .{ 0 } ** 200, // arbitrary big thing
};

export fn foo() usize {

    var array: [5]MyType = undefined;

    array[0] = .{}; // silences undefined warning

    var result: usize  = 1;

    for (array[0..]) |i| {
        result *= i.value;
        result *= i.data[42];
    }

    return result;
}

If you comment out this bit:

        result *= i.data[42];

It removes this from the loop entirely:

        lea     rdi, [rbp - 1608]
        mov     edx, 1608
        call    memcpy@PLT

So before:

.LBB0_4:
        mov     rsi, qword ptr [rbp - 9720]
        mov     rax, qword ptr [rbp - 9728]
        imul    rax, rax, 1608
        add     rsi, rax
        lea     rdi, [rbp - 1608]
        mov     edx, 1608
        call    memcpy@PLT
        mov     rax, qword ptr [rbp - 1624]
        mul     qword ptr [rbp - 1608]
        mov     qword ptr [rbp - 9736], rax
        seto    al
        jo      .LBB0_6
        jmp     .LBB0_7

And after:

.LBB0_4:
        mov     rcx, qword ptr [rbp - 8112]
        mov     rax, qword ptr [rbp - 8120]
        imul    rax, rax, 1608
        add     rcx, rax
        mov     rax, qword ptr [rbp - 16]
        mul     qword ptr [rcx]
        mov     qword ptr [rbp - 8128], rax
        seto    al
        jo      .LBB0_6
        jmp     .LBB0_7

AndrewCodeDev · April 11, 2024, 4:20pm

Which, btw, I definitely give you kudos here. There’s a reason this is marked as a footgun: it doesn’t have obvious behavior in terms of when it emits copies. It’s definitely interesting that it’s selectively calling memcpy based on usage.

nyc · April 11, 2024, 4:20pm

Dynamically linking against libc? Why is it calling into library memcpy through the PLT? What compiler flags are you using? I thought llvm had a compiler builtin for memcpy.

So if the capture isn’t dirtied it elides the copy. Good to know.

I’m actually surprised that it didn’t notice the store is never loaded and goes out of scope so that can be removed too. Actually really surprised.

The untouched store and the PLT call. LLVM really jacked up the codegen.

AndrewCodeDev · April 11, 2024, 4:26pm

Nah, I was just marking it as export so I didn’t have to write a main function and it emits less assembly. You’ll get the same behavior with this:

const MyType = struct {
    value: usize = 50,
    data: [200]usize = .{ 0 } ** 200,
};

fn foo() usize {

    var array: [5]MyType = undefined;

    array[0] = .{}; // silences undefined warning

    var result: usize  = 1;

    for (array[0..]) |i| {
        result *= i.value;
        result *= i.data[42];
    }

    return result;
}

pub fn main() void {
    var x = foo();
    _ = &x;
}