Can we simulate Rust-like traits in Zig (to add methods to a struct)?

That hits the point. Because Zig has not yet reached version 1.0, its users may simply be accustomed to standard library updates being mostly breaking changes, so less consideration is given to cases of ‘implementation changes without API changes.’ But the concern does make sense: when an API changes, the old API can be marked as deprecated or obsolete and kept around, and retaining it has little impact on code that doesn’t use it. Fields are different; a deprecated field cannot be retained. So if code relies on accessing a field directly, any change to that field during a library update will inevitably break the old code.

In practice, documentation can be relied upon to describe which fields are stable. In addition, test cases can help show the correct ways to access certain fields directly. As for the current Zig implementation, we have to embrace documentation. It already carries more than one might expect: even the input constraints for various parameters are stated only in the documentation.

But don’t get me wrong, I actually don’t like this. In my mind, documentation is the thing that becomes outdated most easily, much more so than the actual API, because when the implementation changes, stale documentation doesn’t break anything right away. Soon, the documentation accumulates a large amount of outdated information.

That all makes perfect sense, thank you for explaining!

I agree in general. But this one aspect – marking fields “internal” – seems relatively easy to add initially and keep updated. Plus, doing this is much better than any of the alternatives we have available.

I personally don’t like traits because they make every code base feel a little bit alien, and make it hard to track where the actual implementation comes from.

How do these methods make accessing the data simpler or safer?

If the internal implementation of a structure is volatile between versions, then when the structure is upgraded, access to the fields may become invalid due to implementation changes, while the access APIs will not.

Even if certain APIs are no longer maintained, marking them as deprecated will not immediately break old code that references them. For code that doesn’t use these APIs, their presence won’t cause any code bloat, thanks to Zig’s lazy compilation. This allows library users to upgrade gradually without immediately breaking their code.

If certain fields are no longer maintained, however, they must be removed and cannot be retained in any form, as this would otherwise significantly increase memory usage.
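To make the contrast concrete, here is a minimal sketch. `Buffer`, `len`, and `size` are made-up names for illustration and are not taken from the standard library. It assumes the library keeps a deprecated function around as a plain alias of its replacement:

```zig
const std = @import("std");

/// Hypothetical library type, for illustration only.
pub const Buffer = struct {
    // Internal representation; a library update may rename or replace it.
    bytes: []const u8,

    /// Current accessor.
    pub fn len(self: Buffer) usize {
        return self.bytes.len;
    }

    /// Deprecated: use `len`. Kept as an alias so old callers keep compiling.
    /// Declarations that are never referenced are not analyzed, so callers
    /// that ignore `size` pay nothing for it.
    pub const size = len;
};

pub fn main() void {
    const b = Buffer{ .bytes = "hello" };
    // Both the old and the new name resolve to the same function.
    std.debug.print("{d} {d}\n", .{ b.len(), b.size() });
}
```

There is no equivalent trick for a field: code that reads `b.bytes` directly stops compiling as soon as the field is renamed or removed.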

I wish there was a lib qualifier for functions/members, as an alternative to pub, which would mean the member is considered pub inside the library, but not from outside.

I’ve rethought this and feel that the current issue with fields and the issue with pub declarations in namespaces may be the same problem.

Not setting certain declarations in a namespace to pub can cause difficulties for advanced users attempting to extend the functionality. Similarly, if a struct’s fields are not publicly accessible but private, the same problem arises.

If namespace declarations lack pub to distinguish them, it becomes difficult for ordinary users to determine which declarations are suitable for direct use and which ones should not be used directly. The same problem applies to fields: some fields are more suitable for direct access, while others are not.

Currently, declarations are differentiated between public and private, so I often encounter the first problem. Fields are always public, which may lead to the second problem.

I think the ideal solution would be to make declarations and fields more consistent—all declarations and fields always accessible (whether public or private), and all declarations and fields also have a public or private distinction (perhaps declarations default to private, and fields default to public).

For ordinary users, always accessing public declarations and fields is sufficient.
For advanced users, providing an accessible channel for both private declarations and fields is sufficient, perhaps simply by adding a @unsafeImport with access to private fields.

I did this for some time in my project, but realized the underscore already communicates the intent, and having a bunch of self.private is a bit verbose and might have a cost (no idea if it’s totally optimized out or whether it could change the memory layout of the struct). A leading "_" could be a convention, but the downside is that my autocomplete puts those fields everywhere in the completion list instead of at the very bottom, while the private struct hides all of this.

I’m all for the everything is accessible but having a clean way to sort things out for the outside would be very nice.
The first entry in the Zig Zen: Communicate intent precisely.

I guess a straightforward solution for this would be a special variant of @import() which just makes everything public :)

…the idea of defining an ‘import context’ that determines access rights is actually interesting… e.g. while conversing with ChatGPT about ‘programming languages with unix-style file permissions instead of public/private’, it came up that there’s a “Java frontend” called Jif which allows annotating struct declarations like this:

class Document {
    String{Alice:} secret;    // Alice owns this field, no one else can read it
    String{Alice:Bob} shared; // Alice owns it, Bob is allowed to read
    String{} title;           // world-readable (like 644 in Unix)
}

…from here on I’m just making things up… e.g. the ‘import context’ could then look like this in Zig:

const bla = @import("bla.zig") as Bob;

…and this would apply the access rules for ‘Bob’ to the module…

But yeah… that’s a lot of extra annotations… but OTOH this Jif thingie seems to be about much more than just access rights: Jif

You’re right; even before you finished your reply, I realized that an @import variant was needed and made an edit.

Especially with zig, the granularity of public and private is file-based (which makes sense; access within the same file should be for “power users”). This is a strong argument for using @import as a controller for access permissions.

…it shouldn’t have any runtime cost. Not sure about memory layout (in terms of Zig’s automatic struct member reordering and struct alignment - e.g. whether such a nested struct remains one thing or is ‘dissolved’ into the parent struct).

Just for fun here’s what I’d like.

const Foo = struct {
    name: []const u8,
    details {
      is_dirty: bool = false,
      some_internal_counter: u32, 
    }

    fn updateCounter(self: *Foo) void {
         self.some_internal_counter += 1; // Just like a normal field
    }
};

Then from the outside

// Messing with the struct from outside using "!" instead of "."
foo!some_internal_counter = 0;

What I like about this is that moving a field in or out of “details” would have refactoring consequences and would also be friendly to ZLS.

It basically has to remain one thing, because a reference can be taken to it and passed around. That reference has to behave identically to the same struct, just not embedded in a bigger one.
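A small sketch of that argument, with made-up names (`Details`, `Foo`, `bump`): a function that takes a `*Details` cannot know whether the value lives on its own or inside a `Foo`, so both must look the same through the pointer.

```zig
const std = @import("std");

// Hypothetical types, for illustration only.
const Details = struct {
    is_dirty: bool = false,
    counter: u32 = 0,
};

const Foo = struct {
    name: []const u8,
    details: Details = .{},
};

// Works on any *Details, whether it is embedded in a Foo or standalone.
fn bump(d: *Details) void {
    d.counter += 1;
    d.is_dirty = true;
}

pub fn main() void {
    var foo = Foo{ .name = "a" };
    var standalone = Details{};
    bump(&foo.details); // pointer into the middle of `foo`
    bump(&standalone); // pointer to a free-standing value
    std.debug.print("{d} {d}\n", .{ foo.details.counter, standalone.counter });
}
```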

For now I gave an example of a simple ArrayList, so one might think that accessing items directly is not an issue, since it’s just a simple 1D array.

What about when our datastructure is more complicated? What about when it is a Matrix? A Sparse matrix with CSR (Compressed Sparse Row) storage system?

In those cases, it is not possible to access such data structures, by just accessing items, because the data structure is not a simple array anymore.

To access that data safely, other languages provide [] operator overloading, but since Zig doesn’t provide that, we need accessor methods, such as at() and in().
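As a stripped-down illustration of the idea (not the real structure, and without any allocation), here is a sketch of row-compressed storage over plain slices. The `Csr` type and its layout are assumptions made for this example; only the accessor names `at` and `in` follow the ones mentioned above.

```zig
const std = @import("std");

// Minimal sketch: rows of varying length are stored back to back in `items`.
// `heads[i]` is the offset where row i starts and `heads[i + 1]` where it ends.
const Csr = struct {
    items: []u32,
    heads: []const usize,

    // Read element j of row i by value.
    fn at(self: Csr, i: usize, j: usize) u32 {
        std.debug.assert(j < self.heads[i + 1] - self.heads[i]);
        return self.items[self.heads[i] + j];
    }

    // Pointer to element j of row i, for in-place updates.
    fn in(self: Csr, i: usize, j: usize) *u32 {
        std.debug.assert(j < self.heads[i + 1] - self.heads[i]);
        return &self.items[self.heads[i] + j];
    }
};

pub fn main() void {
    // Three rows: {10, 11}, {20}, {30, 31, 32}.
    var storage = [_]u32{ 10, 11, 20, 30, 31, 32 };
    const heads = [_]usize{ 0, 2, 3, 6 };
    const m = Csr{ .items = &storage, .heads = &heads };

    m.in(2, 1).* = 99; // overwrite the middle element of the last row
    std.debug.print("{d} {d}\n", .{ m.at(1, 0), m.at(2, 1) });
}
```

A caller indexing `items` directly would have to repeat the `heads[i] + j` arithmetic at every call site, which is exactly the part the accessors hide.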

Such a more complicated data structure I use for my work is shared below

See if you can see why at() and in() are necessary.

const std = @import("std");
const ArrayList = @import("arrayList.zig").ArrayList;

pub fn SparseArrayList(comptime T: type) type {
    return struct {
        items: ArrayList(T),
        heads: ArrayList(u32),
        len: usize,

        const Self = @This();

        pub fn init(allocator: std.mem.Allocator) !Self {
            var x = Self{
                .items = ArrayList(T).init(allocator),
                .heads = ArrayList(u32).init(allocator),
                .len = 1,
            };
            errdefer x.deinit();
            try x.heads.append(0);
            return x;
        }

        pub fn initCapacity(allocator: std.mem.Allocator, capacity: usize, factor: usize) !Self {
            var x = Self{
                .items = try ArrayList(T).initCapacity(allocator, capacity * factor),
                .heads = try ArrayList(u32).initCapacity(allocator, capacity),
                .len = 1,
            };
            errdefer x.deinit();
            try x.heads.append(0);
            return x;
        }

        pub fn deinit(self: Self) void {
            self.items.deinit();
            self.heads.deinit();
        }

        pub fn row_len(self: Self, i: usize) usize {
            return (self.heads.at(i + 1) - self.heads.at(i));
        }

        pub fn row(self: Self, i: usize) []T {
            return self.items.array.items[self.heads.at(i) .. self.heads.at(i) + self.row_len(i)];
        }

        pub fn at(self: Self, i: usize, j: usize) T {
            return self.items.at(self.heads.at(i) + j);
        }

        pub fn in(self: *Self, i: usize, j: usize) *T {
            return &self.items.array.items[self.heads.at(i) + j];
        }

        pub fn append(self: *Self, x: T) !void {
            try self.items.append(x);
        }

        pub fn bin(self: *Self, n: usize) !void {
            const old_entry: u32 = self.heads.getLast();
            const num_new_entries: u32 = @truncate(n);
            try self.heads.append(old_entry + num_new_entries);
            self.len += n;
        }

        pub fn fold(self: *Self) void {
            var sum: u32 = 0;
            for (self.heads.array.items) |*value| {
                sum += value.*;
                value.* = sum;
            }
        }
    };
}

I thought I might chime in on this very interesting topic. I think we can all agree we’re talking about tradeoffs and the fact that nobody has gotten it 100% right in the past.

Zig has made a very conscious decision to not deal with visibility, except for file structs/namespaces. Personally I’m not particularly happy with this decision, but as I mentioned above nobody has gotten it right so far so who am I to blame them. They’ve looked at the mess other languages have made and made a decision with their eyes wide open. The community around C has made that work for a long time, so it’s certainly doable.

I love the private/protected/internal/public keywords from C#, right until the moment they get in the way… which is ALL the time! The biggest flaw here is that it’s considered a hard rule that can only be overcome with very ugly reflection code. If they had simply allowed you to say myList.private.items to override the access modifier and allow you to access the internal private/internal/protected items array that would have been perfect for C#. If you did this to a library you accepted that your code could break. Same as you’re right now doing every day with zig.

I particularly love that when visibility is done right, the intent of the code becomes so much clearer, to the point where many C# libraries are written in a way where you don’t need documentation. The class names tells you what it does and the public members tells you how to use it. But I digress, this is my personal preference.

The one thing that I hope will be added to zig one day would be a keyword that would let you cleanly access the private members of namespaces. Even if visibility only exists on a namespace level it can still get in the way. When visibility is done wrong it can be a real bitch and being forced to make your own copies of libraries because someone thought it would be clean to hide some internal stuff that turned out to be useful to you is a real pain.

In principle one could have an outer struct that serves as the interface and contains a single struct called impl which “hides” the internal data structure. This way one could use the functions of the outer struct as the public API, but if you know what you’re doing and need some functionality not provided by the API, and consciously risk that things can break when the internals change, you can still access the fields inside impl directly.

This would probably be zero overhead at runtime (memory and CPU) and only slightly more to type when you use the interface, because obviously you need the parens for function call syntax.
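Roughly what I have in mind, as a sketch. `Counter`, `Impl`, `increment`, and `value` are invented names for the example; the only point is the shape, an outer struct whose single field is the `impl` struct.

```zig
const std = @import("std");

// Hypothetical type, for illustration only.
pub const Counter = struct {
    // The only field of the outer struct: everything internal lives in here.
    impl: Impl = .{},

    pub const Impl = struct {
        count: u32 = 0,
        is_dirty: bool = false,
    };

    // Public API: callers are expected to go through these functions.
    pub fn increment(self: *Counter) void {
        self.impl.count += 1;
        self.impl.is_dirty = true;
    }

    pub fn value(self: Counter) u32 {
        return self.impl.count;
    }
};

pub fn main() void {
    var c = Counter{};
    c.increment();

    // Escape hatch: nothing stops a caller from reaching into `impl`,
    // but the extra `.impl` makes the risk visible at the call site.
    c.impl.count = 10;

    std.debug.print("{d}\n", .{c.value()});
}
```

Normal use reads as `c.value()`, while the risky access reads as `c.impl.count`, so a search for `.impl.` finds every place that depends on internals.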

BTW just a few hours ago I stumbled over a blog entry from Bob Nystrom where he shared his thoughts about syntax for visibility/access control of fields and functions.

Part of the problem people were having, myself included, is that the names you chose provide no meaning, since we can’t see the code as context.

More descriptive names would be get and getPtr

The other issue I have with this whole thread is that you don’t need traits to do this; just add the functions to your types.

While it is good that everything can be accessed, one should not forget that some things only need to be accessed when you are doing genuinely weird things.

That’s why I like things like the namespacing method: it makes it clear and documents in code that something is not intended to be used by others, and it also lets you specify what is part of the public and the private API.

You can specify this difference already with functions (using or not using pub), but you can’t really do so with fields.

And yes, struct fields are part of an API. So changing them affects users, which makes it important to differentiate between public and private fields.

Ding! Ding! Ding! We have a winnah!

The problem is that anything which “locks away” a function, variable, etc. invariably gets in the way later on. I’m not even a big fan of pub, but I think that not having pub has some fairly strong implications around compilation.

I’d really rather that everything is accessible, but that pub makes it easy to get access and anything not-pub should require an extra hoop to access but still be able to do so.

Is this with code you’ve marked private or with some 3rd party library?