How to get a non-const pointer to a struct's field from within a function of the struct?

I’m trying to implement simple Memoization while learning Zig. Learning from my questions here about HashMap.getOrPut, I’ve constructed the following attempted solution to a toy problem:

const std = @import("std");
const print = std.debug.print;

const expect = @import("std").testing.expect;

const FibonacciMemoizer = struct {
    data: std.AutoHashMap(u8, u32),
    fn findNthFibonacci(self: FibonacciMemoizer, n: u8) u32 {
        if (n == 0) {
            return 0;
        }
        if (n == 1) {
            return 1;
        } else {
            const response = try self.data.getOrPut(n);
            if (response.found_existing) {
                return response.value_ptr.*;
            } else {
                const return_value = self.findNthFibonacci(n - 1) + self.findNthFibonacci(n - 2);
                response.value_ptr = return_value;
                return return_value;
            }
        }
    }
};

test "find the 25th Fibonacci number" {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var inner_data = std.AutoHashMap(u8, u32).init(allocator);
    defer inner_data.deinit();

    const memoizer = FibonacciMemoizer{ .data = inner_data };

    try expect(memoizer.findNthFibonacci(0) == 0);
    try expect(memoizer.findNthFibonacci(1) == 1);
    try expect(memoizer.findNthFibonacci(10) == 55);
    try expect(memoizer.findNthFibonacci(19) == 4181);
}

However, this gives:

scratch.zig:15:43: error: expected type '*hash_map.HashMap(u8,u32,hash_map.AutoContext(u8),80)', found '*const hash_map.HashMap(u8,u32,hash_map.AutoContext(u8),80)'
            const response = try self.data.getOrPut(n);
                                 ~~~~~~~~~^~~~~~~~~
scratch.zig:15:43: note: cast discards const qualifier
/Users/scubbo/zig/zig-macos-x86_64-0.14.0-dev.2362+a47aa9dd9/lib/std/hash_map.zig:490:31: note: parameter type declared here
        pub fn getOrPut(self: *Self, key: K) Allocator.Error!GetOrPutResult {

I interpret this as saying that the reference to self.data within the struct’s function is giving an immutable pointer to data, and thus that self.getOrPut fails. Fair enough - that seems like correct behaviour. But, then - how do I get a mutable pointer to self.data?

What I’ve tried

Declaring as variable

I’ve tried declaring the field with var - i.e.

const FibonacciMemoizer = struct {
    var data: std.AutoHashMap(u8, u32),

which appears to be a syntax error.

Changing the function to accept a pointer to Self

I don’t know why this would work - my issue is with the mutability of the internal pointer, not with the reference to self. However, I’ve seen some suggestions (e.g. here) that the self-reference as the first parameter of a struct’s function should be a pointer, so I tried that…

...
const FibonacciMemoizer = struct {
    data: std.AutoHashMap(u8, u32),
    fn findNthFibonacci(self: *FibonacciMemoizer, n: u8) u32 { // <--- change here
        if (n == 0) {
...

and got

scratch.zig:37:24: error: expected type '*scratch.FibonacciMemoizer', found '*const scratch.FibonacciMemoizer'
    try expect(memoizer.findNthFibonacci(0) == 0);
               ~~~~~~~~^~~~~~~~~~~~~~~~~
scratch.zig:37:24: note: cast discards const qualifier
scratch.zig:8:31: note: parameter type declared here
    fn findNthFibonacci(self: *FibonacciMemoizer, n: u8) u32 {
                              ^~~~~~~~~~~~~~~~~~

Similar output arises if I change the test code to create memoizer as a pointer - that is:

...
    var inner_data = std.AutoHashMap(u8, u32).init(allocator);
    defer inner_data.deinit();

    const memoizer = &FibonacciMemoizer{ .data = inner_data }; // <--- change here

    try expect(memoizer.findNthFibonacci(0) == 0);
...

Similar Questions

  • This question is similar, but is not the same - it’s about mutable pointers to structs, not getting mutable pointers within a struct’s functions to the struct’s fields
  • This question appears to be about internals of compiler operation, and not relevant at this level
  • This question makes it clear that “_ arguments passed by value are immutable. The fact that you pass it by value, e.g. self: Bar, means that you don’t intend to mutate it_”, which still doesn’t explain the situation:
    • Even when I was passing self by-value rather than as a pointer, I wasn’t trying to mutate the object itself - I was trying to mutate one of its fields (or - have I misunderstood that? Does immutability “cascade” to the fields of an object?)
    • This suggests that my solution passing self as a pointer should have worked - but it didn’t.

After some significant tinkering, I found a working (albeit memory-leaky) solution - explicitly requesting the pointer as a var:

const std = @import("std");
const print = std.debug.print;

const expect = @import("std").testing.expect;

const FibonacciMemoizer = struct {
    data: std.AutoHashMap(u8, u32),
    fn findNthFibonacci(self: FibonacciMemoizer, n: u8) !u32 {
        if (n == 0) {
            return 0;
        }
        if (n == 1) {
            return 1;
        } else {
            // VVVVVVVV look here!
            var var_data = self.data; // <--- this is the important change!
            // ^^^^^^^^ look here!
            const response = try var_data.getOrPut(n);
            if (response.found_existing) {
                return response.value_ptr.*;
            } else {
                const return_value = try self.findNthFibonacci(n - 1) + try self.findNthFibonacci(n - 2);
                response.value_ptr.* = return_value;
                return return_value;
            }
        }
    }
};

test "find the 25th Fibonacci number" {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    var inner_data = std.AutoHashMap(u8, u32).init(allocator);
    defer inner_data.deinit();

    const memoizer = FibonacciMemoizer{ .data = inner_data };

    try expect(try memoizer.findNthFibonacci(0) == 0);
    try expect(try memoizer.findNthFibonacci(1) == 1);
    try expect(try memoizer.findNthFibonacci(10) == 55);
    try expect(try memoizer.findNthFibonacci(19) == 4181);
}

It works, but it seems verbose to require an extra line of declaration just to get a mutable pointer. Is there no way to get a non-const pointer when doing self.<field_name>.<function_on_field_name>()?

var var_data = self.data; is not a reference to data; it is a copy of the data.

Declare self as *FibonacciMemoizer to have a mutable pointer.

    fn findNthFibonacci(self: *FibonacciMemoizer, n: u8) !u32 {
2 Likes

The problem is not with your struct, but with your test. You’re declaring memoizer as const, so the entire struct becomes const. If you change it to var and use a pointer as @dimdin suggested, it will work.

2 Likes

Hmm, still no luck. I’ve changed the Memoizer’s declaration to a var, and the signature of the function to take a pointer, but I’m still getting an error about an unexpectedly-const pointer (though in this case it’s the Memoizer itself, rather than its internal field) - fiddle here, error:

Error: Error: <uuid>.zig:37:28: error: expected type '*<uuid>.FibonacciMemoizer', found '*const <uuid>.FibonacciMemoizer'
<uuid>.zig:37:28: note: cast discards const qualifier
<uuid>.zig:8:31: note: parameter type declared here

You’re taking a pointer to a temporary. Pointers to temporaries are const.

Make the FibonacciMemoizer{...} a separate variable if you need to take the address of it (or don’t take the address of it at all).

By solving a few problems the code becomes:

const std = @import("std");
const assert = std.debug.assert;

const FibonacciMemoizer = struct {
    data: std.AutoHashMap(u8, u32),

    fn findNthFibonacci(self: *FibonacciMemoizer, n: u8) !u32 {
        std.debug.print("findNthFibonacci({d})\n", .{n});
        if (n == 0) {
            return 0;
        } else if (n == 1) {
            return 1;
        } else {
            assert(n >= 2);
            const response = try self.data.getOrPut(n);
            if (response.found_existing) {
                return response.value_ptr.*;
            } else {
                const n1 = try self.findNthFibonacci(n - 1);
                const n2 = try self.findNthFibonacci(n - 2);
                assert(n1 >= n2);
                const return_value = n1 + n2;
                response.value_ptr.* = return_value;
                assert(response.key_ptr.* == n);
                return return_value;
            }
        }
    }

    fn init(allocator: std.mem.Allocator) FibonacciMemoizer {
        return .{
            .data = std.AutoHashMap(u8, u32).init(allocator),
        };
    }

    fn deinit(self: *FibonacciMemoizer) void {
        self.data.deinit();
    }
};

test FibonacciMemoizer {
    var memoizer = FibonacciMemoizer.init(std.testing.allocator);
    defer memoizer.deinit();

    try std.testing.expectEqual(0, try memoizer.findNthFibonacci(0));
    try std.testing.expectEqual(1, try memoizer.findNthFibonacci(1));
    try std.testing.expectEqual(55, try memoizer.findNthFibonacci(10));
    try std.testing.expectEqual(4181, try memoizer.findNthFibonacci(19));
}

But there is a huge problem: Before setting the new value to getOrPut response, two more findNthFibonacci are called, and loop generating a tree of calls the does not yet set their value. Before findNthFibonacci returns it sums two undefined values!

I am proposing to replace the HashMap with an array that holds all the fibonacci values upto n.

3 Likes

when you have complicated or possibly recursive logic along with a getOrPut operation, I recommend to take advantage of pointer stability locking safety feature:

const response = try self.data.getOrPut(n);
+self.data.lockPointers();
+defer self.data.unlockPointers();

This will reveal that you have a bug in calling the function recursively while holding a getOrPut result.

3 Likes

It seems that calling self.data.lockPointers() in recursive call led to lock twice.

fn findNthFibonacci(self: FibonacciMemoizer, n: u8) !u32 {
    // (snip)

    const response = try self.data.getOrPut(n);

    // First call of `lockPointers` will be successful.
    // But parent call of `findNthFibonacci` still have a lock. 
    // So second call of `lockPointers` will fail at assert(l.state == .unlocked)`.
    self.data.lockPointers();

    defer self.data.unlockPointers();

    // (snip)

    const return_value = try self.findNthFibonacci(n - 1) + try self.findNthFibonacci(n - 2);
}

Yes, exactly. Does that help you understand the problem?

Are you suggesting that I should refactor the logic to avoid recursion, so it works better with lockPointers()?

Especially, current implementation has seg fault issue with grow() of HashMap.
Are you suggesting that such a potentially bug-prone structure should be avoided?

I think he’s hoping you’ll realize that this pointer dereference can become invalidated:

response.value_ptr.* = return_value;

The reason is that in the recursive calls right before it, you are potentially growing the hashmap, meaning that by the time you come back, the hash map may have moved in memory several times. response will have become stale by that point.

The fix is to do a new lookup into the (potentially relocated) hash map before writing it.

5 Likes

Sorry, folks, I’m still not getting it. Here’s a simpler example (inspired by, but simplified from, Day 14 of Advent Of Code):

const std = @import("std");
const print = std.debug.print;

const expect = @import("std").testing.expect;

const Point = struct { x: u32, y: u32 };

const Robot = struct {
    position: Point,

    pub fn move_right(self: *Robot) void {
        self.position = Point{ .x = self.position.x + 1, .y = self.position.y };
    }

    pub fn from_line(line: []const u8) !Robot {
        const position_of_comma = std.mem.indexOf(u8, line, ",").?;
        const x = try std.fmt.parseInt(u32, line[0..position_of_comma], 10);
        const y = try std.fmt.parseInt(u32, line[position_of_comma + 1 ..], 10);
        return Robot{ .position = Point{ .x = x, .y = y } };
    }
};

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Pretend that this string was being read from a file.
    const input_data = "2,3\n10,0";

    var it = std.mem.splitScalar(u8, input_data, '\n');
    var robots = std.ArrayList(Robot).init(allocator);
    defer robots.deinit();

    while (it.next()) |line| {
        try robots.append(try Robot.from_line(line));
    }

    for (robots.items) |robot| {
        robot.move_right();
    }
    try expect(robots.items[0].position.x == 3);
    try expect(robots.items[1].position.x == 11);
}

(fiddle)

The above gives an error as robot.move_right() expected type '*<uuid>.Robot', found '*const <uuid>.Robot'.

If I duly change the signature of move_right to accept a *const Robot, (fiddle), I then get an error on the self.position = Point{... line, stating cannot assign to constant.

So, it seems like I’m stuck - the value returned from my “constructor” function (from_line) is const, which means that any functions I pass it into have to accept a const, which means that they cannot mutate any of its fields, and I don’t see any syntax that allows me to declare the return type of the function as var. What am I missing?

EDIT: explicitly, it looks like I can’t declare the robots as variable like so:

    while (it.next()) |line| {
        var robot = try Robot.from_line(line)
        try robots.append(robot);
    }

because this, rightly, complains that local variable is never mutated

EDIT2:

You’re taking a pointer to a temporary. Pointers to temporaries are const. Make the FibonacciMemoizer{...} a separate variable if you need to take the address of it

I don’t understand this, nor how it applies here. What is a “temporary”? The few usages of that word in the documentation mostly don’t appear to apply here - and while Result Locations seem like they might be relevant, I don’t follow how to apply that information here. This thread also uses the term, but doesn’t seem to define it. This discussion suggests that a temporary variable is one that goes out of scope, but I don’t see how that would be the case here - each robot is created within the body of main, so I don’t see how they would go out-of-scope before the end of the program.

The problem here isn’t that you are referencing a temporary, it’s that loops loop by value by default:

    for (&robots.items) |*robot| {
        robot.move_right();
    }

Is the syntax to loop by reference.

See “for reference” example here. Documentation - The Zig Programming Language

1 Like

Unfortunately, that still doesn’t work: fiddle, changing:

    for (&robots.items) |*robot| {
        robot.move_right();
    }

gives error: type '*[]<uuid>.Robot' is not indexable and not a range. I also tried for (&(robots.items)) |*robot| in case precedence was messing things up - same error.

EDIT: huh, weird. This seems to work - i.e.:

    for (robots.items) |*robot| {
        robot.move_right();
    }

I still have questions, though. If robot was a *const Robot before, then why does this syntax make it into a (variable) *Robot? There’s no var in play here. In fact, if robot was previously a *const Robot, then why isn’t *robot a **const Robot?

For slices the & before isn’t needed, because slices are already seen as a pointer + length pair which can be iterated over.
For arrays you need to use & to get a pointer to the array (if you want to capture a pointer to the value) (the length is statically known from the type of the array).

When you use |robot| the type isn’t *const Robot the type is just const Robot (a possibly copied value), then when you later call a method on it that can result in a *const Robot type.

When you use |*robot| you get a *Robot because the syntax of for loops is that when you use * to declare the payload, you get a mutable pointer to the value instead of the value itself.

4 Likes

And if you simplify move_right to

pub fn move_right(self: *Robot) void {
        self.position.x += 1;
}

it becomes easier to understand what is happening and why you need a pointer.

2 Likes

Huh. So calling a function on an object can “implicitly pointerize” that object? That is, if var/const object is a non-pointer, when I call object.function() then function might get passed a pointer to object? I didn’t know that, thank you - that certainly explains a large chunk of my confusion!

Do you have a link to some documentation that explains that behaviour? In particular, how would I predict/control that behaviour? I see here that “When [structs, unions, and arrays] are passed as parameters, Zig may choose to copy and pass by value, or pass by reference, whichever way Zig decides will be faster.”, but that seems to be behaviour out of my control. How would I know (while writing, in advance, before encountering an error) that an object is going to get “pointerized” when passed into an function? Is it simply “if the function’s signature specifies a pointer as a parameter”, or is there more going on than that?


I appreciate the example, but - not (yet) to me, it doesn’t :sweat_smile: I know that the arguments to a function are immutable, but:

  • move_right isn’t mutating self, but rather one of self’s fields’ fields. Is it perhaps the case that an immutable struct’s fields are themselves immutable? That’s backed up by this fiddle, but I don’t want to assume as I’ve been working from many wrong assumptions already. In particular, that behaviour doesn’t seem to be stated here or here.
  • why would making a variable into a pointer make it mutable? “Pointerness” and “Mutability” seem to be two orthogonal concepts - this article points out that you can have “a pointer to a value, a pointer to a const value, a const pointer to a value, and a const pointer to a const value”. That is - in order to allow move_right to mutate (a field of) self, why do we have to pass a pointer rather than passing a mutable value?

I didn’t find explict explanation of the method call syntax and semantics within the language reference, except here:

test "dot product" {
    const v1 = Vec3.init(1.0, 0.0, 0.0);
    const v2 = Vec3.init(0.0, 1.0, 0.0);
    try expect(v1.dot(v2) == 0.0);

    // Other than being available to call with dot syntax, struct methods are
    // not special. You can reference them as any other declaration inside
    // the struct:
    try expect(Vec3.dot(v1, v2) == 0.0);
}

But Ziglings is good at explaining more and separating between functions and methods, to make it more digestible for people who aren’t used to how Zig does things:

If the parameter is self: MyType it will either be passed by value or as reference, when the parameter is *const MyType or *MyType it will be passed as pointer.

why do we have to pass a pointer rather than passing a mutable value?

because function parameters are always const (immutable)
even if that were not the case, zig copies values, so your changes would only apply to the copy that the function has not to the robot in the array list

1 Like