`comptime T: type` arguments feel redundant at the callsite

kj4tmp · August 5, 2024, 6:09am

I’m wondering if anyone else feels that APIs that include a comptime T: type parameter can be a bit redundant.

For example, consider the following function, which takes a packed struct and converts it to bytes in the proper order for a binary protocol:

const std = @import("std");
const native_endian = @import("builtin").target.cpu.arch.endian();

/// convert a packed struct to bytes that can be sent via ethercat
/// 
/// the packed struct must have bitwidth that is a multiple of 8
pub fn pack_to_ecat(comptime T: type, packed_struct: T) [@divExact(@bitSizeOf(T), 8)]u8 {
    comptime std.debug.assert(@typeInfo(T).Struct.layout == .@"packed"); // must be a packed struct
    var bytes: [@divExact(@bitSizeOf(T), 8)]u8 = undefined;

    switch (native_endian) {
        .little => {
            bytes = @bitCast(packed_struct);
        },
        .big => {
            bytes = @bitCast(packed_struct);
            std.mem.reverse(u8, &bytes);
        },
    }
    return bytes;
}

Now at the callsite I must write:

test "pack_to_ecat" {
    const Command = packed struct(u8) {
        flag: bool = true,
        reserved: u7 = 0,
    };
    try std.testing.expectEqual(
        [_]u8{1},
        pack_to_ecat(Command, Command{}),
    );
}

This feels like I am asserting that the second argument Command{} has type Command, but I don’t care what type the arguments are; that’s why I made the function generic.

Generally, I want to use the full features of a language to express my intent, and my intent is to not care about the type, which pushes me towards this at the call site:

const my_command = Command{};
_ = pack_to_ecat(@TypeOf(my_command), my_command);

But now I’m thinking: the compiler already knows the type, so why should I have to say it?

kj4tmp · August 5, 2024, 6:16am

I guess one could also argue that this improves readability. For example, I can tell from the line std.mem.reverse(u8, &bytes) that I am reversing a sequence of u8.

kj4tmp · August 5, 2024, 6:21am

And the compiler already enforces that the two arguments have the same type. For example:

try std.testing.expectEqual(
        [_]u8{1, 7},
        pack_to_ecat(Command, Command2{}),
    );

produces the error:

src/nic.zig:265:39: error: expected type 'nic.test.pack_to_ecat.Command', found 'nic.test.pack_to_ecat.Command2'
        pack_to_ecat(Command, Command2{}),

AndrewCodeDev · August 5, 2024, 6:24am

There’s always anytype

pub fn pack_to_ecat(ps: anytype) [@divExact(@bitSizeOf(@TypeOf(ps)), 8)]u8

// ...

const y = pack_to_ecat(something);

For balance, I also made some arguments against it here: Generic Programming and anytype

kj4tmp · August 5, 2024, 6:35am

Great write-up! I think you are right that anytype may be more appropriate here, since I am not trying to communicate the partial specialization case (multiple arguments must have the same type).

kj4tmp · August 5, 2024, 6:41am

This is what the full implementation using anytype looks like:

/// convert a packed struct to bytes that can be sent via ethercat
/// 
/// the packed struct must have bitwidth that is a multiple of 8
pub fn pack_to_ecat(packed_struct: anytype) [@divExact(@bitSizeOf(@TypeOf(packed_struct)), 8)]u8 {
    comptime std.debug.assert(@typeInfo(@TypeOf(packed_struct)).Struct.layout == .@"packed"); // must be a packed struct
    var bytes: [@divExact(@bitSizeOf(@TypeOf(packed_struct)), 8)]u8 = undefined;

    switch (native_endian) {
        .little => {
            bytes = @bitCast(packed_struct);
        },
        .big => {
            bytes = @bitCast(packed_struct);
            std.mem.reverse(u8, &bytes);
        },
    }
    return bytes;
}

test "pack_to_ecat" {
    const Command = packed struct(u8) {
        flag: bool = true,
        reserved: u7 = 0,
    };
    try std.testing.expectEqual(
        [_]u8{1},
        pack_to_ecat(Command{}),
    );
}

Another language feature I have learned thanks to Ziggit!

dimdin · August 5, 2024, 7:53am

You can have compile time functions used to derive the return type:

fn PackedArray(T: type) type {
    return [@divExact(@bitSizeOf(T), 8)]u8;
}

pub fn pack_to_ecat(packed_struct: anytype) PackedArray(@TypeOf(packed_struct)) {
    // ...
    var bytes: PackedArray(@TypeOf(packed_struct)) = undefined;
    // ...
}

jmc · August 5, 2024, 9:18am

[pedantic reply] the T: type argument to PackedArray is (and should be) known at compile time so you could say comptime T: type instead

dimdin · August 5, 2024, 9:42am

It is nice, but you don’t have to add comptime. It is implied, because values of type type exist only at compile time.
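A minimal sketch of that point, assuming a Zig version that accepts a bare T: type parameter. The names PackedArrayA and PackedArrayB are made up here just to compare the two spellings:

```zig
const std = @import("std");

// Hypothetical name: the parameter is written without `comptime`.
fn PackedArrayA(T: type) type {
    return [@divExact(@bitSizeOf(T), 8)]u8;
}

// Hypothetical name: the same function with `comptime` spelled out.
fn PackedArrayB(comptime T: type) type {
    return [@divExact(@bitSizeOf(T), 8)]u8;
}

test "both spellings produce the same array type" {
    // u32 is 32 bits wide, so both functions return [4]u8.
    try std.testing.expect(PackedArrayA(u32) == PackedArrayB(u32));
    try std.testing.expect(PackedArrayA(u32) == [4]u8);
}
```

Either form can be called the same way, so whether to write comptime here is a matter of style.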

kristoff · August 5, 2024, 10:33am

There’s a key reason to use an explicit type argument sometimes.

Consider these two different signatures for eql:

fn eqlA(T: type, lhs: []const T, rhs: []const T) bool {}
fn eqlB(lhs: anytype, rhs: anytype) bool {}

Setting aside for a moment discussions about clarity, here’s a problem when you use eqlB:

eqlB("apple", "strawberry"); // eqlB(*const [5:0]u8, *const [10:0]u8)
eqlB("pear", "orange"); // eqlB(*const [4:0]u8, *const [6:0]u8)

The comments show the type of each argument. String literals, and sometimes sub-slices of string literals, have an extremely specific type.

In the example above we have generated 2 different instances of eqlB because the types passed in are different every single time.

By asking for T and defining the other arguments as slices, we make the compiler coerce all those values to one common type ([]const u8). This helps prevent the code bloat that comes from inadvertently causing a combinatorial explosion of the types given to eql.
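The difference in types can be checked directly. This is only a small sketch using @TypeOf and std.mem.eql from the standard library. It does not measure code size, it only shows that each literal has its own type until it is coerced to a slice:

```zig
const std = @import("std");

test "string literals carry their length in the type" {
    // Each literal is a pointer to a sentinel-terminated array of its own length.
    try std.testing.expect(@TypeOf("apple") == *const [5:0]u8);
    try std.testing.expect(@TypeOf("pear") == *const [4:0]u8);
    try std.testing.expect(@TypeOf("apple") != @TypeOf("pear"));

    // Coercing to a slice removes the length from the type.
    const a: []const u8 = "apple";
    const b: []const u8 = "pear";
    try std.testing.expect(@TypeOf(a) == @TypeOf(b));

    // std.mem.eql takes the element type explicitly, like eqlA above.
    try std.testing.expect(!std.mem.eql(u8, a, b));
}
```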

mnemnion · August 5, 2024, 12:26pm

This illustrates a limitation of the all-or-nothing approach to comptime genericity IMHO. It’s not such a bad thing, ok, you have to provide a type sometimes.

The type provided in eql is invoking Peer Type Resolution, basically. It would be nice to have a mechanism which did this more directly, writing this in a syntax Zig would never use:

pub fn eql(a: Type<T>, b: Type<T>) bool { ... }

It’s a pretty minor thing, because in many cases you can do something like this:

const std = @import("std");
const expect = std.testing.expect;

pub fn genre(a: anytype, b: @TypeOf(a)) bool {
    return a == b;
}

test "generic typeof" {
    var a32: u32 = 5;
    _ = &a32;
    var b32: u32 = 5;
    _ = &b32;
    // This works
    try expect(genre(a32, b32));
    var a64: u64 = 5;
    _ = &a64;
    // This also works
    try expect(genre(a64, b32));
    var b64: u64 = 5;
    _ = &b64;
    // ..but this will fail
    // try expect(genre(a32, b64));
}

(For those who may not know: declaring these as var makes them runtime values, and the _ = &var; line keeps the compiler from rejecting a var that is never mutated. As consts the values are comptime-known, so all of these work.)

So one of the arguments is privileged over the other, when what we want is what eql gets from providing the type, but without having to provide that type. For anything other than built-in types, this approach will work just fine.

But built-in types, and the various kinds of peer coercion which apply to them, are pretty central to Zig programming, and it would be a nice affordance to be able to say “generic function taking two slices of the same T” without also providing T at every call site.
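One partial workaround today, sketched under assumptions: a thin anytype wrapper can infer the element type with std.meta.Elem and forward to std.mem.eql. The name eqlInferred is made up for this example, and it assumes std.meta.Elem is available in your Zig version. Note that it still privileges the first argument, since b must coerce to a slice of a's element type, so it is not real peer type resolution:

```zig
const std = @import("std");

/// Hypothetical helper: takes the element type from `a`, then forwards to
/// std.mem.eql, which coerces both arguments to []const T.
fn eqlInferred(a: anytype, b: anytype) bool {
    const T = std.meta.Elem(@TypeOf(a));
    return std.mem.eql(T, a, b);
}

test "eqlInferred compares without naming the element type" {
    try std.testing.expect(eqlInferred("pear", "pear"));
    try std.testing.expect(!eqlInferred("apple", "strawberry"));

    // A slice and a literal can be mixed, because both coerce to []const u8.
    const s: []const u8 = "pear";
    try std.testing.expect(eqlInferred(s, "pear"));
}
```

The wrapper itself is still instantiated once per pair of argument types, but its body is a single call, and the comparison logic lives in the one std.mem.eql instance for u8.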

I know there’s an issue tracking a proposal around this, and like I said, it’s not exactly a big deal that we need to provide a type sometimes. But the topic being why they feel redundant at times, I think this is part of it: sometimes they kind of are.

Other times they’re essential, like a function which returns another type parameterized by the function’s type argument.
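A minimal sketch of that essential case, using a made-up Pair type rather than anything from the thread. There is no value argument to infer T from, so the caller has to name it:

```zig
const std = @import("std");

/// Hypothetical example: returns a struct type whose fields share the type T.
fn Pair(comptime T: type) type {
    return struct {
        first: T,
        second: T,

        const Self = @This();

        /// Returns a copy with the two fields exchanged.
        fn swapped(self: Self) Self {
            return .{ .first = self.second, .second = self.first };
        }
    };
}

test "Pair needs its type argument" {
    const p = Pair(u8){ .first = 1, .second = 2 };
    const q = p.swapped();
    try std.testing.expectEqual(@as(u8, 2), q.first);
    try std.testing.expectEqual(@as(u8, 1), q.second);
}
```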

AndrewCodeDev · August 5, 2024, 1:40pm

I made a similar argument in the doc actually and we both used eql too - what are the odds?