Canonical design pattern for accessing dynamically-allocated elements

noodlecollie · August 26, 2024, 1:43pm

Let’s say I’m writing a struct which contains a dynamically-allocated array of items, e.g. points forming a polygon. Which of the following approaches is the most canonical in Zig?

```zig
// Option 1: Function returning an optional value,
// depending on whether the index is in range.
pub fn getPoint1(self: Polygon, index: usize) ?Point {
    return if (index < self.points.len) self.points[index] else null;
}

// Option 2: Function returning a value or an error,
// depending on whether the index is in range.
pub fn getPoint2(self: Polygon, index: usize) !Point {
    return if (index < self.points.len) self.points[index] else error.IndexOutOfRange;
}

// Option 3: Function returning a value, implicitly with
// the precondition (checked in debug mode) that the
// index must be in range.
pub fn getPoint3(self: Polygon, index: usize) Point {
    std.debug.assert(index < self.points.len);
    return self.points[index];
}

// Option 4: Direct user access to the underlying
// array, essentially delegating the access rules
// so that the user treats this like they would any
// other array (the "Not my problem" approach).
const point = poly.points[index];
```

My initial thought would be that option 2 is best, since this is what Zig errors are designed to assist with. However, I have seen other functions in the standard library take all three of the other approaches, so I was wondering about the rationale behind each of them.

LucasSantos91 · August 26, 2024, 1:50pm

It depends on whether you expect your users to try to access out-of-bounds elements. If trying to access an out-of-bounds element is considered normal usage and will happen frequently, use the optional. If it’s normal usage but will happen infrequently, use the error union (returning an error is implicitly marked as cold). If it’s not correct to access out-of-bounds elements, you can use either 3 or 4; that’s just a matter of style. Note that the assertion in 3 is redundant, because slice indexing is already bounds-checked in the safety-checked build modes (Debug and ReleaseSafe).
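To make the difference concrete, here is a minimal sketch of how each style looks at the call site. The `Point` and `Polygon` definitions and the accessor names (`getPointOrNull`, `getPointChecked`, `getPoint`) are assumptions made up for this illustration; they are not taken from the standard library or from the original question.

```zig
const std = @import("std");

// Hypothetical types, assumed for illustration only.
const Point = struct { x: f32, y: f32 };

const Polygon = struct {
    points: []const Point,

    // Out-of-range is expected and frequent: return an optional.
    pub fn getPointOrNull(self: Polygon, index: usize) ?Point {
        return if (index < self.points.len) self.points[index] else null;
    }

    // Out-of-range is legal but rare: return an error union.
    pub fn getPointChecked(self: Polygon, index: usize) error{IndexOutOfRange}!Point {
        if (index >= self.points.len) return error.IndexOutOfRange;
        return self.points[index];
    }

    // Out-of-range is a caller bug: index directly and rely on the
    // slice bounds check in safety-checked build modes.
    pub fn getPoint(self: Polygon, index: usize) Point {
        return self.points[index];
    }
};

test "call sites for each accessor style" {
    const pts = [_]Point{
        .{ .x = 0, .y = 0 },
        .{ .x = 1, .y = 0 },
        .{ .x = 0, .y = 1 },
    };
    const poly = Polygon{ .points = &pts };

    // Optional: the caller supplies a fallback with `orelse`.
    const a = poly.getPointOrNull(7) orelse Point{ .x = -1, .y = -1 };
    try std.testing.expectEqual(@as(f32, -1), a.x);

    // Error union: the caller propagates with `try` or handles with `catch`.
    try std.testing.expectError(error.IndexOutOfRange, poly.getPointChecked(3));
    const b = try poly.getPointChecked(1);
    try std.testing.expectEqual(@as(f32, 1), b.x);

    // Precondition: the caller is responsible for passing a valid index.
    try std.testing.expectEqual(@as(f32, 1), poly.getPoint(2).y);
}
```

The optional and error-union versions push a decision onto every caller (`orelse`, `try`, or `catch`), which is only worth it if an out-of-range index is something callers are expected to hit.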

noodlecollie · August 26, 2024, 2:10pm

That makes a lot of sense. I guess in the polygon example, given there will be a way to query how many points there are in the polygon, options 3 or 4 would be appropriate. There is never a reason to access outside the range of valid points, and the accessor method should not be treated as a way to check whether an index is valid.

A subsequent question, then: I understand that Zig does not support access specifiers on struct fields, only on functions. (I’ve read the GitHub issue discussion on this and fundamentally disagree with the rationale, but whatever.) Is there a canonical way to indicate which fields users should consider OK to access, and which should be considered private? The two that spring to mind are naming private fields with a leading underscore in a Python-style way, or alternatively wrapping private fields in an “internal” namespace, like certain C++ header libraries I’ve seen. I really don’t want to have to go and look at documentation to work something like that out; IMO it should be clear from intellisense completions alone.
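For reference, the two conventions mentioned above might look roughly like this. This is only a sketch: the type names, the `_cached_area` field, and the `internal` field are hypothetical, and neither convention is enforced by the compiler, so both remain fully accessible to callers.

```zig
const std = @import("std");

// Hypothetical type, assumed for illustration only.
const Point = struct { x: f32, y: f32 };

// Convention A: a leading underscore marks a field as "please don't touch".
const PolygonA = struct {
    points: []const Point,
    _cached_area: ?f32 = null,
};

// Convention B: implementation details are grouped under one nested
// field, so they appear behind a single `internal` entry in completions.
const PolygonB = struct {
    points: []const Point,
    internal: Internal = .{},

    const Internal = struct {
        cached_area: ?f32 = null,
    };
};

test "both conventions are naming hints only" {
    const a = PolygonA{ .points = &[_]Point{} };
    var b = PolygonB{ .points = &[_]Point{} };

    // Nothing stops a caller from reading or writing the "private" data.
    try std.testing.expect(a._cached_area == null);
    b.internal.cached_area = 0.5;
    try std.testing.expectEqual(@as(?f32, 0.5), b.internal.cached_area);
}
```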

LucasSantos91 · August 26, 2024, 3:04pm

Well, the Zig recommended way is to read the documentation…
Personally, I think even functions shouldn’t be private. We can frequently find a use for private fields and functions, and it’s a real bummer when we can’t do something that should be really easy because of an artificial limitation, like private fields. I have had this issue more than once when trying to set up a “len” field from a struct as a futex to wake up a thread. I’m willing to work around future changes the implementers make to the struct. In the vast majority of cases, no changes will happen.
Python is doing well without hiding anything from the users.