Suggestions on updating py.zig to not use usingnamespace?

About a year ago I wrote a wrapper/API for creating Python bindings with Zig 0.13 and 0.14.1. It is located at https://codeberg.org/frmdstryr/py.zig .

It splits each of Python's separate object protocols into mixins and then uses usingnamespace to add them, for example:

pub fn ObjectProtocol(comptime T: type) type {
    return struct {
        pub inline fn incref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_IncRef(@ptrCast(self));
        }

        pub inline fn decref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_DecRef(@ptrCast(self));
        }
        // etc
   };
}

pub const Object = extern struct {
    // The underlying python structure
    impl: c.PyObject,

    // Use the object protocol
    pub usingnamespace ObjectProtocol(@This());

};

pub const Tuple = extern struct {
    // The underlying python structure
    impl: c.PyTupleObject,

    // Tuple uses the object protocol
    pub usingnamespace ObjectProtocol(@This());

    // Tuple uses the SequenceProtocol
    pub usingnamespace SequenceProtocol(@This());
        
    // Some more tuple specific fns here...
};

This created a very clean and easy-to-use API that was then used to port a fairly large Python framework written in C++.

I am looking to update to Zig 0.15.2 (to use branch hints), but with usingnamespace now gone this design completely falls apart.

From what I can tell, none of the proposed solutions in Remove `usingnamespace` · Issue #20663 · ziglang/zig · GitHub work well. Either I would have to:

  • use a single type that wraps all C types and adds all functions to all objects (so a list would have dict functions, but using them would just produce a compile error)
  • dig through each C type (which may change between Python versions) and re-implement the structure in Zig using the proposed mixin approach
  • copy-paste all the protocol functions every time

Can anyone suggest how this project can be updated?


It sounds like you haven’t tried to implement any approach yet; actually try them before discounting them as not good enough.

I think you’ll find that they really aren’t that bad; regardless, you have to pick an approach.

You wouldn’t have to dig through each type any more than you already do.

The mixin-fields approach is the one I would recommend, from the little I have seen of your code.


Three options:

  1. Zero-length field:
pub fn ObjectProtocol(comptime T: type) type {
    return extern struct {
        pub inline fn incref(m: *@This()) void {
            const self: *T = @alignCast(@fieldParentPtr("as_object", m));
            // Use the _ variant since it should be non-null
            c._Py_IncRef(@ptrCast(self));
        }

        pub inline fn decref(m: *@This()) void {
            const self: *T = @alignCast(@fieldParentPtr("as_object", m));
            // Use the _ variant since it should be non-null
            c._Py_DecRef(@ptrCast(self));
        }
        // etc
    };
}

pub const Object = extern struct {
    // The underlying python structure
    impl: c.PyObject,
    // Use the object protocol
    as_object: ObjectProtocol(@This()) = .{},
};

pub const Tuple = extern struct {
    // The underlying python structure
    impl: c.PyTupleObject,
    // Tuple uses the object protocol
    as_object: ObjectProtocol(@This()) = .{},
    // Tuple uses the SequenceProtocol
    as_sequence: SequenceProtocol(@This()) = .{},
    // Some more tuple specific fns here...
};
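With this scheme, call sites keep method-call syntax through the field. A sketch of usage, assuming the declarations above (foo() stands in for however the underlying impl is obtained):

var tup: Tuple = .{ .impl = foo() };
// Dot syntax still works; the mixin recovers *Tuple
// internally via @fieldParentPtr("as_object", ...)
tup.as_object.incref();
tup.as_object.decref();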

The advantage of this approach is that it gives each mixin's functions an explicit namespace; if you need that layer of abstraction, it is helpful and better than the previous design.
However, this approach is essentially virtual inheritance, so it inherits virtual inheritance's problems. If the purpose of the mixin is merely code reuse, the extra abstraction it introduces may be harmful, since it amounts to ‘the demand for code reuse intruding into the abstraction’.
In addition, if we regard it as an ‘abstract interface’ rather than simple ‘code reuse’, we must face the fact that the interface fragilely relies on specific field names as customization points, polluting the field namespace.
Side note: I’m really looking forward to a ‘type-as-key’ symbol system; it would solve the fragility of string-based customization points and make refactoring safer.

  2. Sacrifice the simplicity of the call site
pub fn ObjectProtocol(comptime T: type) type {
    return struct {
        pub inline fn incref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_IncRef(@ptrCast(self));
        }

        pub inline fn decref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_DecRef(@ptrCast(self));
        }
        // etc
   };
}

pub const Object = extern struct {
    // The underlying python structure
    impl: c.PyObject,

    // Use the object protocol
    pub const object_ops = ObjectProtocol(@This());

};

pub const Tuple = extern struct {
    // The underlying python structure
    impl: c.PyTupleObject,

    // Tuple uses the object protocol
    pub const object_ops = ObjectProtocol(@This());

    // Tuple uses the SequenceProtocol
    pub const sequence_ops = SequenceProtocol(@This());
        
    // Some more tuple specific fns here...
};

The drawback of this approach is obvious: we cannot use method-call syntactic sugar; we have to write it like this:

var obj: Object = .{ .impl = foo() };
Object.object_ops.incref(&obj);

Nevertheless, I still believe this is a more robust abstraction than the zero-length-field scheme. The call site looks a bit uglier, but it is more sound.

Edit: A variant:

pub fn ObjectProtocol(comptime T: type) type {
    const allowed_types = .{Object, Tuple};
    comptime {
        var is_valid = false;
        for (allowed_types) |ValidType| {
            if (T == ValidType) is_valid = true;
        }
        if (!is_valid) {
            @compileError(@typeName(T) ++ " is not allowed to use ObjectProtocol");
        }
    }
    return struct {
        pub inline fn incref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_IncRef(@ptrCast(self));
        }

        pub inline fn decref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_DecRef(@ptrCast(self));
        }
        // etc
   };
}
pub const Object = extern struct {
    // The underlying python structure
    impl: c.PyObject,
};

pub const Tuple = extern struct {
    // The underlying python structure
    impl: c.PyTupleObject,
};

Use it like this:

var obj: Object = .{ .impl = foo() };
ObjectProtocol(Object).incref(&obj);
  3. Manual forwarding
pub fn ObjectProtocol(comptime T: type) type {
    return struct {
        pub inline fn incref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_IncRef(@ptrCast(self));
        }

        pub inline fn decref(self: *T) void {
            // Use the _ variant since it should be non-null
            c._Py_DecRef(@ptrCast(self));
        }
        // etc
   };
}

pub const Object = extern struct {
    // The underlying python structure
    impl: c.PyObject,

    // Use the object protocol
    pub const incref = ObjectProtocol(@This()).incref;
    pub const decref = ObjectProtocol(@This()).decref;
};

pub const Tuple = extern struct {
    // The underlying python structure
    impl: c.PyTupleObject,

    // Tuple uses the object protocol
    pub const incref = ObjectProtocol(@This()).incref;
    pub const decref = ObjectProtocol(@This()).decref;

    // Tuple uses the SequenceProtocol
    pub const foo = SequenceProtocol(@This()).foo;
        
    // Some more tuple specific fns here...
};

This approach is the most consistent with the original abstraction: if our goal is merely to avoid code repetition without introducing unnecessary layers of abstraction, this is the only reasonable approach.
If you need to manually forward hundreds of methods this way, it can be frustrating compared to usingnamespace. However, if we adhere to the principle of ‘reader first, writer second’, we will find that although this approach involves more boilerplate when writing, it is very reader-friendly and makes it easy to locate the declaration of every mixin function.
A minor readability flaw is that it introduces functions in a form different from a conventional function declaration, which may defeat declaration searches based on looking for fn methodname. This is something readers need to get used to.
For writers, a potential problem is that copy-paste errors can easily slip in, e.g. pub const decref = ObjectProtocol(@This()).incref;
Currently there is no good solution to this. I am looking forward to a simple reuse mechanism when declaring symbols, such as pub const decref = ObjectProtocol(@This()).@currentSymbol();. However, such a design may lack semantic orthogonality and probably does not have enough motivation to be incorporated into the language.

Fortunately, we can add unit tests to catch this kind of error:

test "ObjectProtocol mapping integrity" {
    const Proto = ObjectProtocol(Object);
    const proto_decls = @typeInfo(Proto).@"struct".decls;

    inline for (proto_decls) |decl| {
        if (!@hasDecl(Object, decl.name)) {
            @panic("Object is missing the protocol function: " ++ decl.name);
        }
        if (@field(Object, decl.name) != @field(Proto, decl.name)) {
            @panic("Object forwarding error: " ++ decl.name ++ " forwards to the wrong impl");
        }
    }
}

You can pass the field name as a parameter.

Regardless, @fieldParentPtr will give a compile error if the field doesn’t exist or if the field type is wrong.
For string names to be a problem, you would need two fields of the same mixin with similar names.
Note that if mixins take different parameters, including the field name if they work like that, then they will be different types.

So in practice it would be really odd if this was ever an issue, not that it isn’t possible.
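For example, the mixin could take the field name at comptime instead of hard-coding "as_object" (a sketch, not from the original code):

pub fn ObjectProtocol(comptime T: type, comptime field_name: []const u8) type {
    return extern struct {
        pub inline fn incref(m: *@This()) void {
            // Recover the parent from whatever field this mixin was placed in
            const self: *T = @alignCast(@fieldParentPtr(field_name, m));
            c._Py_IncRef(@ptrCast(self));
        }
        // etc
    };
}

pub const Tuple = extern struct {
    impl: c.PyTupleObject,
    as_object: ObjectProtocol(@This(), "as_object") = .{},
};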


This is correct.

You have one option which doesn’t offer your users inferior ergonomics: codegen.

Which is not the end of the world, but if you find my reply disappointing: I share that disappointment.


I thought about that too, but even codegen is not ideal, because any custom types would then also need to be somehow run through a code generator… (as opposed to the current, er, “old” method, which requires adding a single line for each protocol the type uses).

Thanks for the quick responses everyone.

The least disruptive change I can think of at the moment is based on option 2 from npc1054657282. It would in theory require embedding all the protocol functions into each type (sometimes with a dummy c.PyObject structure) and then adding a function to cast to that object (I haven’t tried it yet).

pub const Object = extern struct {
    // The underlying python structure
    impl: c.PyObject,

    // Use the object protocol
    pub inline fn incref(self: *Object) void {
        // Use the _ variant since it should be non-null
        c._Py_IncRef(@ptrCast(self));
    }

    pub inline fn decref(self: *Object) void {
        // Use the _ variant since it should be non-null
        c._Py_DecRef(@ptrCast(self));
    }

};

pub const Sequence = extern struct {
    // Dummy.. does not represent the entire structure
    impl: c.PyObject,

    pub inline fn contains(self: *Sequence, obj: *Object) bool {
        // etc
    }
};

pub const Tuple = extern struct {
    // The underlying python structure
    impl: c.PyTupleObject,

    // Tuple uses the object protocol
    pub inline fn asObject(self: *Tuple) *Object {
        return @ptrCast(self); // Or whatever funky cast is needed
    }

    // Tuple uses the SequenceProtocol
    pub inline fn asSequence(self: *Tuple) *Sequence {
        return @ptrCast(self); // Or whatever funky cast is needed
    }

};
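Call sites with this scheme would then read something like the following sketch (newTuple() is a hypothetical constructor, not part of the code above):

var tup: *Tuple = newTuple();
// Cast to the protocol type, then call the protocol function
tup.asObject().incref();
if (tup.asSequence().contains(obj)) {
    // ...
}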

Unfortunately, it is still a usability and readability downgrade compared to usingnamespace. Hopefully some future version of Zig will provide a better alternative.


yep.

That’d be nice, right? usingnamespace was a blunt instrument, it could easily be improved upon.

Don’t count on it. Andrew has been hostile to the idea of adding declarations to type reification:

I intentionally made this not possible, because this kind of functionality tends to be abused, and is generally not needed to solve any problems elegantly. I personally never want to have to read any Zig code that creates declarations in compile-time logic, I don’t want to have to implement such features in the compiler, and I don’t want to toil through codifying such behavior into a language specification.

Hear that? You think you need it to “solve problems elegantly”, but you’re abusing Zig.

The thing about Python bindings or whatever: a compiler doesn’t need them. That’s what Zig exists for: to compile Zig.

So don’t hold your breath.

Personally, I think that the zero-width field approach is well worth a try… I understand that it might be a big diff now but it could be worth the investment. It’s about as ergonomic at the callsite as name-injection, much more transparent to the reader, and also usually makes naming easier in the long run (you have less need for compound names as the namespace carries meaning).

But if you really don’t want that - what’s wrong with the wrapper that adds all the functions? Isn’t it effectively equivalent to what you have now? You can even inject better error messages than you would get in the simple absence of a symbol.

fn bad() noreturn {
    @compileError("Can’t touch this");
}

fn Wrap(T: type) type {
    return struct {
        t: T,
        pub const meth = if (@hasDecl(T, "__py_tuple__")) TupleProtocol.meth else bad;
    };
}

Thanks to everyone who commented. For anyone interested, I was able to migrate py.zig and my project.

While doing so however, I realized the mixin and ptr cast approaches all lose valuable type information that was previously there.

In my case, newref now requires “double casting”. For example, the old self.a = b.newref() is now self.a = @ptrCast(b.asObject().newref()). Meaning, if the type on either side of the assignment changes, it will now compile garbage code, whereas before it would detect the issue and give a compile error.

The zero-length-field approach should not have this problem (if it works; I didn’t try it), so if anyone is in a similar situation, I’d recommend trying that instead (I may give it a try eventually).

Sadly, this is the first time since Zig 0.6.0 that updating to a new version has left me disappointed (the only other time being when inline was removed, but that got reverted)… I’m not saying usingnamespace should come back (most of my usage of it in other Zig projects was indeed unnecessary, and I totally understand his viewpoint), but I’m still hopeful something more elegant than the current options will come before 1.0.
