Pointers and Assignment

AndrewCodeDev · February 11, 2024, 4:31am

Assignment is a fundamental operation in Zig (and many programming languages). Understanding how assignment interacts with pointers is crucial to writing correct Zig code.

Pointers come in several kinds (slices, many-item pointers, single-item pointers), but all of them behave the same way under assignment.

const x: usize = 42;

const y = &x; // get x's address

const z = y; // copies x's address
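To check this at runtime, here is a minimal sketch that compares the two pointers directly. The helper name sameAddress is made up for this example and is not part of the standard library.

```zig
const std = @import("std");

/// Hypothetical helper: true when both pointers hold the same address.
fn sameAddress(p: *const usize, q: *const usize) bool {
    return p == q;
}

pub fn main() void {
    const x: usize = 42;
    const y = &x; // y holds x's address
    const z = y; // z holds a copy of that address, not a copy of 42

    std.debug.print("same address: {}\n", .{sameAddress(y, z)});
    std.debug.print("value through z: {}\n", .{z.*});
}
```

Both y and z lead back to the single integer x; dereferencing either one reads the same memory.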

Note that the address of x has been copied, not the integer value of x. The same is true of user-defined types.

const A = struct {
    // slice contains a pointer and a length
    data: []const u8, 
};

// later...

// initialize the "data" field with a slice that points to the string.
const a = A {
    .data = "Hello, World!"
};

// we are copying the pointer and the length into b. We have not
// copied the value of "Hello, World!", we have copied the slice
// which contains the pointer to "Hello, World!" and still only
// have one string in memory pointed to from two places.
const b = a;
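One way to confirm that only the slice was copied is to compare the ptr and len fields of both slices. This is a small sketch; the helper sameSlice is a name invented for the example.

```zig
const std = @import("std");

const A = struct {
    // a slice is a pointer plus a length
    data: []const u8,
};

/// Hypothetical helper: true when two slices start at the same byte
/// and have the same length.
fn sameSlice(lhs: []const u8, rhs: []const u8) bool {
    return lhs.ptr == rhs.ptr and lhs.len == rhs.len;
}

pub fn main() void {
    const a = A{ .data = "Hello, World!" };
    const b = a; // copies the slice (pointer + length), not the bytes

    std.debug.print("shared storage: {}\n", .{sameSlice(a.data, b.data)});
}
```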

This is true of standard containers as well. For this example, we’ll consider ArrayList. Recall that ArrayList has 3 members…

items: a slice of the currently used data
capacity: the total number of elements the allocation can hold
allocator: an allocator interface which has a pointer to the allocator and the vtable

Assignment in this case will act much like our user-created type A:

// we make an array list with capacity for 100 elements to start
// (initCapacity can fail, so the result must be unwrapped with try)
var x = try ArrayList(T).initCapacity(allocator, 100);

// the pointers and integers of x have been copied to y
var y = x;

This diagram illustrates the data shared by both:

            + ------DATA-----+
            |                |
         x.items             |
                          y.items

Since they have the same pointer, they point to the same data.
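The sketch below makes the sharing visible, and also shows a consequence that is easy to miss: the length lives inside each copy, so appending through x does not update y. It assumes the ArrayList described in this post, the one that stores its allocator (std.ArrayList in Zig 0.11 through 0.14). Report and copyThenAppend are names invented for the example.

```zig
const std = @import("std");

/// Hypothetical result type for this example.
const Report = struct { shared: bool, x_len: usize, y_len: usize };

/// Copies a list by assignment, appends through the original, and reports
/// whether both still point at the same buffer and which length each sees.
fn copyThenAppend(allocator: std.mem.Allocator) !Report {
    var x = try std.ArrayList(u8).initCapacity(allocator, 100);
    defer x.deinit(); // only x is deinitialized; y is just a second view

    const y = x; // copies items (ptr + len), capacity and the allocator handle

    // Capacity is already 100, so this append does not reallocate.
    try x.append(7);

    return Report{
        .shared = x.items.ptr == y.items.ptr,
        .x_len = x.items.len,
        .y_len = y.items.len,
    };
}

pub fn main() !void {
    const r = try copyThenAppend(std.heap.page_allocator);
    std.debug.print("shared={} x.len={} y.len={}\n", .{ r.shared, r.x_len, r.y_len });
}
```

The two lists share one buffer but disagree about how many elements it holds, which is one more reason to treat a copied ArrayList as a snapshot rather than a second handle.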

In this next example, we have an ArrayList called array_list. We will copy the array_list.items slice member variable to another variable called buffer and modify the array_list to observe the effects:

// let's copy the slice
var buffer = array_list.items;

// let's modify that original list...
// (this can fail with an allocation error, so it needs try)
try array_list.ensureUnusedCapacity(array_list.capacity * 100);

// here's what can happen:

   Modify ArrayList Allocation ---> potentially frees ----> OLD DATA
          |                                                    ^
          v                                                    |           
   NEW LARGER DATA                                        var buffer
          ^
          |
   array_list.items

In this diagram, we can see that var buffer may still point to data that the ArrayList has abandoned. If growing the list required a new allocation, the old memory may have been freed, which leaves buffer dangling.

There is a helpful comment in the source file of ArrayList that describes this situation. Always pay attention to comments that mention “invalidating pointers”, because they describe operations that can have the effect outlined in the diagram above:

/// Modify the array so that it can hold at least `additional_count` **more** items.
/// Invalidates element pointers if additional memory is needed.
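One way to stay clear of stale pointers is to remember an index instead of a pointer and to take the slice only after the growth call. This is a sketch of that habit, not the only option; it assumes the allocator-storing std.ArrayList (Zig 0.11 through 0.14), and firstAfterGrowth is a name invented for the example.

```zig
const std = @import("std");

/// Grows the list, then reads the first element through a fresh slice.
fn firstAfterGrowth(allocator: std.mem.Allocator) !u8 {
    var array_list = std.ArrayList(u8).init(allocator);
    defer array_list.deinit();

    try array_list.append(42);

    // Remember a position, not a pointer: indices survive reallocation.
    const index: usize = 0;

    // May move the elements to a new, larger allocation.
    try array_list.ensureUnusedCapacity(1024);

    // Take the slice only after the call that can invalidate pointers.
    const buffer = array_list.items;
    return buffer[index];
}

pub fn main() !void {
    const value = try firstAfterGrowth(std.heap.page_allocator);
    std.debug.print("first element: {}\n", .{value});
}
```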

The same issue occurs if we copy the ArrayList itself:

var x = try ArrayList(T).initCapacity(allocator, 100);

var y = x;

// becomes...

               +-----DATA-----+
               |              |               
               x              y


x.deinit(); // frees data

y.deinit(); // tries to free the same data

Here we can observe a double free. Both array lists try to free the same memory: once x.deinit() has freed it, y.deinit() attempts to free memory that has already been released.
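If two independent lists are actually wanted, one option is clone, which allocates a second buffer and copies the elements, so each list owns memory it can free on its own. The sketch below assumes the allocator-storing std.ArrayList (Zig 0.11 through 0.14), where clone takes no arguments; cloneIsIndependent is a name invented for the example.

```zig
const std = @import("std");

/// Builds a list, clones it, and frees each list exactly once.
/// Returns true when the two lists use different buffers.
fn cloneIsIndependent(allocator: std.mem.Allocator) !bool {
    var x = try std.ArrayList(u8).initCapacity(allocator, 100);
    defer x.deinit(); // frees x's buffer

    try x.append(1);

    // clone allocates a new buffer and copies x's elements into it
    const y = try x.clone();
    defer y.deinit(); // frees y's own buffer, so there is no double free

    return x.items.ptr != y.items.ptr;
}

pub fn main() !void {
    const independent = try cloneIsIndependent(std.heap.page_allocator);
    std.debug.print("independent buffers: {}\n", .{independent});
}
```

When a second list is not needed, the simpler rule is to pick one owner and call deinit only on that one; a plain assignment like var y = x should then be treated as a temporary view.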

AndrewCodeDev · February 11, 2024, 4:32am

This needs a section about cloning and the difference from assignment.