Unintentional Copy On Capture

Zig’s capture syntax can unpack objects by value or by pointer:

for (slice) |elem| {
   // elem has been captured by value (copy)
}

for (slice) |*elem| {
   // elem has been captured by pointer (reference)
}

Capturing objects by pointer is particularly useful when iterating over a slice of objects that are each larger than a usize value.

The Footgun

If the captured element is large, capturing it by value can create unintentional copies. This is particularly painful when iterating over collections such as arrays or slices, where the copy can be repeated for every element.
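A minimal sketch of the two capture forms on a large element, to make the difference concrete. The `Particle` type and both function names are hypothetical and exist only for illustration. Whether the by-value version actually emits a copy depends on the compiler version and build mode, so treat this as an illustration of the pattern rather than a benchmark.

```zig
const std = @import("std");

// Hypothetical element type, large enough that a per-element copy matters.
const Particle = struct {
    position: [3]f64,
    velocity: [3]f64,
    history: [64]f64,
};

// Captures each element by value: `p` is a const copy of the element.
fn sumByValue(particles: []const Particle) f64 {
    var total: f64 = 0;
    for (particles) |p| {
        total += p.position[0];
    }
    return total;
}

// Captures each element by pointer: `p` is a `*const Particle`,
// so the element itself is not copied.
fn sumByPointer(particles: []const Particle) f64 {
    var total: f64 = 0;
    for (particles) |*p| {
        total += p.position[0];
    }
    return total;
}

pub fn main() void {
    var particles: [4]Particle = undefined;
    for (&particles) |*p| {
        p.* = std.mem.zeroes(Particle);
    }
    particles[2].position[0] = 1.5;

    // Both functions compute the same sum; only the capture form differs.
    std.debug.print("{d} {d}\n", .{ sumByValue(&particles), sumByPointer(&particles) });
}
```

Both functions return the same result, so switching a read-only loop from `|p|` to `|*p|` should not change behavior.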

The Workaround

Consider your payload carefully when writing range-based for-loops (or any capture).

Consider using index-based loops where possible, as slice[i].foo = x does not create a copy of the slice’s ith element.
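As a rough sketch of the index-based form, assuming a hypothetical `Item` type and `setAllFoo` helper (neither comes from the original post):

```zig
const std = @import("std");

// Hypothetical element type with one small field and a large payload.
const Item = struct {
    foo: u32,
    payload: [256]u8,
};

// Writes through the slice by index, so no element is captured at all.
fn setAllFoo(items: []Item, x: u32) void {
    for (0..items.len) |i| {
        items[i].foo = x;
    }
}

pub fn main() void {
    var items: [3]Item = undefined;
    for (&items) |*it| {
        it.* = .{ .foo = 0, .payload = [_]u8{0} ** 256 };
    }
    setAllFoo(&items, 42);
    std.debug.print("{d}\n", .{items[2].foo});
}
```

Only the index is captured here, and the write goes directly to the element in place.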

For optionals and tagged unions, the captures in if and switch statements can create unnecessary copies. To avoid them, capture by pointer.
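A small sketch of pointer captures on an optional and on a tagged union. The `Big` and `Message` types and the helper names are hypothetical, chosen only to show the syntax:

```zig
const std = @import("std");

// Hypothetical large payload type.
const Big = struct {
    data: [1024]u8,
    len: usize,
};

// Hypothetical tagged union with one large and one small variant.
const Message = union(enum) {
    blob: Big,
    code: u32,
};

// `|*b|` captures a pointer to the optional's payload instead of a copy.
fn optLen(maybe: *const ?Big) usize {
    if (maybe.*) |*b| {
        return b.len;
    }
    return 0;
}

// The same idea for a switch over a tagged union.
fn blobLen(msg: *const Message) usize {
    return switch (msg.*) {
        .blob => |*b| b.len,
        .code => 0,
    };
}

pub fn main() void {
    const maybe: ?Big = Big{ .data = [_]u8{0} ** 1024, .len = 3 };
    const msg = Message{ .code = 7 };
    std.debug.print("{d} {d}\n", .{ optLen(&maybe), blobLen(&msg) });
}
```

If the optional or union is reachable through a mutable pointer, the same `|*b|` capture also allows modifying the payload in place.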

In Summary

Zig’s capture syntax can unpack objects by value or by pointer. It is a helpful way to cut down on boilerplate code, but check your captures to ensure you don’t create unintentional copies.

I had a question regarding for loops. If I do something like:

for (arr, 0..) |_, i| {
	// here would zig try to copy the value even tho
	// it’s not being used?
}

I find myself doing something like this instead of for (0..arr.len) |i|, probably because my brain thinks the first one is faster to write. Also, I think this should apply to all other captures too.

This workaround is fine for loops but it does not work for captures in switches over tagged unions. In this case capture by pointer seems to be the only option to avoid copies.

@korke two things - you can’t actually take a pointer via |*_|, and I just checked the assembly… it removes memcpy calls by using |_, i| even in debug.

@slonik-az Good point for optionals - for arrays it’s still a valid alternative, but for capturing single optional and union payloads, it’s not. I encourage you to update the documentation. You should see an edit button at the bottom of that post - go ahead and add your point to the post. That’s why we opened the docs 🙂

Edited doc to include workaround for optionals and tagged unions.

Isn’t this wrong? Doesn’t the capture have the same semantics as function arguments, that is, the compiler decides whether to pass by value or by reference?

I can get Godbolt to emit memcpy instructions regardless of optimization level with very trivial examples on trunk.

Can you point out why you think this is incorrect? It looks like the effect of this is alive and well.

I can’t point to anywhere, but it does sound inconsistent vs. function argument passing.

I may be misunderstanding your post (and apologies if I am).

When I read “isn’t this wrong?” I was under the impression you were saying that the effect isn’t correct. I’m going by the generated assembly.

If you instead were saying “shouldn’t this work like parameter passing?” then I see your point - it probably should for consistency’s sake.

I asked on Discord: it used to work like parameter passing, but was deliberately changed to always copy because of the same footgun I linked in the original post.

Cool 👍 that’s a good piece of historical context.

It looks like if the iterated container is an array, then that array will also be copied.
For example, the following code prints 0 1, instead of 0 9.

const std = @import("std");

pub fn main() !void {
	var a: [2]u32 = .{0, 1};
	for (a, 0..) |v, i| {
		if (i == 0) {
			a[1] = 9;
		}
		std.debug.print("{} ", .{v});
	}
}

// output: 0 1

This is actually the same as Go’s for-range semantics. Some people think it is a good design; others think it adds complexity that confuses beginners in the early learning phase.

Another design is to only allow iterating over slices. If an array is fed in, it would be coerced to a slice.
To iterate over a copy of an array, we could explicitly make a copy of the array before the iteration.

Or you can capture “by pointer” rather than by value:

// untested:
const std = @import("std");

pub fn main() !void {
	var a: [2]u32 = .{0, 1};
	for (a, 0..) |*v, i| {
		if (i == 0) {
			a[1] = 9;
		}
		std.debug.print("{} ", .{v.*});
	}
}

I ran it. There is a small bug: the & is missing.

    for (&a, 0..) |*v, i| {

Output:

0 9

@slonik-az,

As @Sze pointed out, your code fails to compile.

In @Sze’s code, &a will be coerced to a slice.

My point here is: in that design, the iterated container can only be a slice. This would make the for semantics easy to explain.

I put the // untested comment in my code for a reason. It is really hard to type compilable code on a phone. I missed the & in front of a. Guilty as charged 🙂 Thanks for catching the typo.