Multidimensional arrays with zero size

The following question is purely out of curiosity. I don’t currently have a real-world problem that needs this, but I wonder whether Zig has a solution for it.

Zig supports multidimensional arrays. Also, it is possible to have zero-sized arrays. Now what if we have two (or more) dimensional arrays where one dimension has a size of zero?

Consider:

const std = @import("std");

pub fn printDim(matrix: anytype) void {
    std.debug.print("{}x{}\n", .{ matrix.len, matrix[0].len });
}

pub fn main() void {
    const x: [17][0]i32 = undefined;
    printDim(x);
}

Output:

17x0

Note that this concept also exists in other languages. Let’s see how Octave deals with it (empty matrices in Octave):

$ octave
GNU Octave, version 10.2.0
…
octave:1> A = zeros(17, 0)
A = [](17x0)
octave:2> rows(A)
ans = 17
octave:3> columns(A)
ans = 0
octave:4> numel(A)
ans = 0
octave:5> B = zeros(0, 17)
B = [](0x17)
octave:6> rows(B)
ans = 0
octave:7> columns(B)
ans = 17
octave:8> numel(B)
ans = 0
octave:9> A == B
error: mx_el_eq: nonconformant arguments (op1 is 17x0, op2 is 0x17)

But getting back to Zig, let’s try:

-    const x: [17][0]i32 = undefined;
+    const x: [0][17]i32 = undefined;

Now it fails to compile, and we get:

multiary.zig:4:53: error: indexing into empty array is not allowed
    std.debug.print("{}x{}\n", .{ matrix.len, matrix[0].len });
                                              ~~~~~~^~~

So I wonder:

  • Are zero sized one-dimensional arrays used in Zig?
  • How about two-dimensional arrays where only one dimension (possibly the first?) has a size of zero? Are those used as well?
  • Is this a supported feature, or just a curious corner case that one shouldn’t rely on?
  • If I “initialize” a zero-sized array (with one or more dimensions) with undefined, is it then properly initialized? :thinking:
  • Can I reliably obtain the column count n of an m x n matrix that is represented as [m][n]T in Zig (for all m >= 0)?
    • If yes, how?
    • If not, would this be a nice-to-have?

Maybe a possible solution is to use a dedicated type (constructor), instead of the built-in arrays:

const std = @import("std");

pub fn Matrix(
    comptime T: type,
    comptime rows: usize,
    comptime cols: usize,
) type {
    return struct {
        values: [rows][cols]T,
        const row_count = rows;
        const col_count = cols;
        // Alternatively also:
        pub fn rowCount(self: @This()) usize {
            _ = self;
            return row_count;
        }
        pub fn colCount(self: @This()) usize {
            _ = self;
            return col_count;
        }
    };
}

pub fn printDim(matrix: anytype) void {
    std.debug.print("{}x{}\n", .{
        @TypeOf(matrix).row_count,
        @TypeOf(matrix).col_count,
    });
    // Alternatively:
    std.debug.print("{}x{}\n", .{
        matrix.rowCount(),
        matrix.colCount(),
    });
}

pub fn main() void {
    const a: Matrix(i32, 17, 0) = undefined;
    printDim(a);
    const b: Matrix(i32, 0, 17) = undefined;
    printDim(b);
}

Output:

17x0
17x0
0x17
0x17

That’s a bit clunky, but still cleaner than matrix[0].len (besides working for zero-sized arrays).

So final question:

  • For non-zero sized two-dimensional arrays, is it idiomatic to write matrix[0].len to obtain the size of the second dimension?

What do you think?

No, because as you have demonstrated, that only works if the dimensions are guaranteed to be greater than 0.

I would say if you require that the dimensions are never zero, then you can do it that way; otherwise you would use the comptime-available type information to extract the length information from the array type.

The length of arrays is always comptime known / part of the type.

One way to do this is this way:

const std = @import("std");

pub fn printDim(matrix: anytype) void {
    std.debug.print("{}x{}\n", .{ matrix.len, @typeInfo(std.meta.Child(@TypeOf(matrix))).array.len });
}

pub fn main() void {
    // const x: [17][0]i32 = undefined;
    const x: [0][17]i32 = undefined;
    printDim(x);
}

Are you sure it’s std.meta.Child? Wouldn’t std.meta.Elem be better?
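For plain arrays the two agree; they only diverge once pointers are involved. A quick sketch of how I understand the difference (assuming current std.meta behavior; not from the thread):

```zig
const std = @import("std");

test "Child vs Elem" {
    // For arrays, both unwrap one type level:
    try std.testing.expect(std.meta.Child([3][2]i32) == [2]i32);
    try std.testing.expect(std.meta.Elem([3][2]i32) == [2]i32);
    // For a pointer to an array, Child gives the pointed-to array,
    // while Elem looks through to the element type:
    try std.testing.expect(std.meta.Child(*const [3]i32) == [3]i32);
    try std.testing.expect(std.meta.Elem(*const [3]i32) == i32);
}
```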

Inspired by your solution, I tried something else:

const std = @import("std");

pub fn dim(value: anytype, comptime level: usize) usize {
    if (level == 0) {
        return value.len;
    } else {
        return dim(@as(std.meta.Elem(@TypeOf(value)), undefined), level - 1);
    }
}

pub fn rows(value: anytype) usize {
    return dim(value, 0);
}

pub fn cols(value: anytype) usize {
    return dim(value, 1);
}

pub fn printDim(matrix: anytype) void {
    std.debug.print("{}x{}\n", .{ rows(matrix), cols(matrix) });
}

pub fn main() void {
    const a: [3][2]i32 = undefined;
    const b: [3][0]i32 = undefined;
    const c: [0][2]i32 = undefined;
    const d: [0][0]i32 = undefined;
    printDim(a);
    printDim(b);
    printDim(c);
    printDim(d);
}

Output:

3x2
3x0
0x2
0x0

P.S.: The return value.len statement is maybe a bit risky: if len were an actual struct field, then len would be undefined here. Probably @typeInfo should be used there.

Improved version:

pub fn dim(value: anytype, comptime level: usize) usize {
    if (level == 0) {
        return switch (@typeInfo(@TypeOf(value))) {
            .array => |info| info.len,
            .vector => |info| info.len,
            else => @compileError("length not comptime known"),
        };
    } else {
        return dim(@as(std.meta.Elem(@TypeOf(value)), undefined), level - 1);
    }
}

And maybe even cleaner, avoiding the undefined trickery:

const std = @import("std");

pub fn typeDim(comptime T: type, comptime level: usize) usize {
    if (level == 0) {
        return switch (@typeInfo(T)) {
            .array => |info| info.len,
            .vector => |info| info.len,
            else => @compileError("length not comptime known"),
        };
    } else {
        return typeDim(std.meta.Elem(T), level - 1);
    }
}

pub fn rows(value: anytype) usize {
    return typeDim(@TypeOf(value), 0);
}

pub fn cols(value: anytype) usize {
    return typeDim(@TypeOf(value), 1);
}

pub fn printDim(matrix: anytype) void {
    std.debug.print("{}x{}\n", .{ rows(matrix), cols(matrix) });
}

pub fn main() void {
    const a: [3][2]i32 = undefined;
    const b: [3][0]i32 = undefined;
    const c: [0][2]i32 = undefined;
    const d: [0][0]i32 = undefined;
    printDim(a);
    printDim(b);
    printDim(c);
    printDim(d);
}

To all those questions I would answer more generally: yes, zero-sized types are used within Zig, and sometimes they can be pretty useful.
For me, multidimensional array types where at least one dimension is zero, thus turning the whole type into a zero-sized type, are just a special case of zero-sized types.

In the case of matrices / math, I would imagine that it can be helpful to treat those types in a fully generic manner, just so that you don’t have to turn them into a corner case that is treated specially.

At runtime these types basically turn into nothing, but I think they can be useful in combination with comptime code that uses them to coordinate other code.
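That “turn into nothing” can be made concrete with @sizeOf; a minimal check (not from the thread; the 272 assumes @sizeOf(i32) == 4):

```zig
const std = @import("std");

pub fn main() void {
    // Any zero dimension makes the whole array type zero-sized:
    std.debug.print("{} {} {}\n", .{
        @sizeOf([17][0]i32), // 0
        @sizeOf([0][17]i32), // 0
        @sizeOf([17][4]i32), // 17 * 4 * @sizeOf(i32) = 272
    });
}
```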

For example, I could imagine a library that processes a bunch of sizes at comptime to figure out the min or max size needed for something. You could represent that with types, e.g. const types: []const type = &.{ [0][10]void, [5][5]void, [7][0]void };, which would then be used to calculate the needed size for some other operation.

(of course in this case we also could use an explicit struct that is used at comptime, but depending on what the library does it could be useful to use specific zero sized types)

For example these two contain pretty much the same information:

const std = @import("std");

pub fn sizes1() void {
    const Size = struct {};
    const sizes = [_]type{
        [0][10]Size, [5][5]Size, [7][0]Size,
    };
    std.debug.print("sizes: {s}\n", .{std.fmt.comptimePrint("{any}", .{sizes})});
    // sizes: { [0][10]zerosized.sizes1.Size, [5][5]zerosized.sizes1.Size, [7][0]zerosized.sizes1.Size }
}

pub fn sizes2() void {
    const Size = struct { comptime_int, comptime_int };
    const sizes = [_]Size{
        .{ 0, 10 }, .{ 5, 5 }, .{ 7, 0 },
    };
    std.debug.print("sizes: {s}\n", .{std.fmt.comptimePrint("{any}", .{sizes})});
    // sizes: { { 0, 10 }, { 5, 5 }, { 7, 0 } }
}

pub fn main() void {
    sizes1();
    sizes2();
}

So if I wrote some comptime code that only needs one kind of size I would probably use sizes2, however if you had many different kinds of sizes it may be helpful to use something like sizes1 but with different types, for example if you used it to describe computation of sizes with different units.

A zero sized type doesn’t have any memory, so it is impossible to create an instance with invalid memory.
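In other words, “initializing” a zero-sized array with undefined is harmless by construction; a small sketch (my own, not from the thread):

```zig
const std = @import("std");

test "zero-sized undefined" {
    // There are no bits that could hold an undefined value:
    const m: [0][17]i32 = undefined;
    try std.testing.expectEqual(@as(usize, 0), m.len);
    try std.testing.expectEqual(@as(usize, 0), @sizeOf(@TypeOf(m)));
}
```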

For array types it results in the same code; another option would be to use @typeInfo directly.

I tried to improve my last attempt even further by including runtime dimensions (where possible):

const std = @import("std");

pub fn typeDim(comptime T: type, comptime level: usize) ?usize {
    return if (level == 0) switch (@typeInfo(T)) {
        .array => |info| info.len,
        .vector => |info| info.len,
        else => null,
    } else comptime ret: {
        break :ret typeDim(std.meta.Elem(T), level - 1) orelse
            @compileError("length not comptime known");
    };
}

pub fn dim(value: anytype, comptime level: usize) usize {
    return typeDim(@TypeOf(value), level) orelse value.len;
}

pub fn rows(value: anytype) usize {
    return dim(value, 0);
}

pub fn cols(value: anytype) usize {
    return dim(value, 1);
}

pub fn printDim(name: []const u8, matrix: anytype) void {
    std.debug.print("{s}: {}x{}\n", .{ name, rows(matrix), cols(matrix) });
}

pub fn main() void {
    const a: [3][2]i32 = undefined;
    const b: [3][0]i32 = undefined;
    const c: [0][2]i32 = undefined;
    const d: [0][0]i32 = undefined;
    printDim("a", a);
    printDim("b", b);
    printDim("c", c);
    printDim("d", d);
    const square_matrix: [3][3]i32 = .{
        [_]i32{ 1, 2, 3 },
        [_]i32{ 4, 5, 6 },
        [_]i32{ 7, 8, 9 },
    };
    printDim("square_matrix", square_matrix);
    printDim("slice of a square", square_matrix[0..1]);
    // This gives a compiler error (as intended):
    //const array_of_slices: [3][]i32 = undefined;
    //printDim("array_of_slices", array_of_slices);
}

Output:

a: 3x2
b: 3x0
c: 0x2
d: 0x0
square_matrix: 3x3
slice of a square: 1x3

Wouldn’t it be nice to have something like typeDim or dim in the std library?

I wonder why I needed a comptime block in the else branch of the typeDim function. Without it, it doesn’t work, and I get a compiler error:

-    } else comptime ret: {
-        break :ret typeDim(std.meta.Elem(T), level - 1) orelse
-            @compileError("length not comptime known");
-    };
+    } else typeDim(std.meta.Elem(T), level - 1) orelse
+        @compileError("length not comptime known");
multiary.zig:9:9: error: length not comptime known
        @compileError("length not comptime known");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
referenced by:
    dim__anon_22309: multiary.zig:13:19
    cols__anon_22271: multiary.zig:21:15
    6 reference(s) hidden; use '-freference-trace=8' to see all references
multiary.zig:9:9: error: length not comptime known
        @compileError("length not comptime known");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
multiary.zig:9:9: error: length not comptime known
        @compileError("length not comptime known");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
multiary.zig:9:9: error: length not comptime known
        @compileError("length not comptime known");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
multiary.zig:9:9: error: length not comptime known
        @compileError("length not comptime known");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
multiary.zig:9:9: error: length not comptime known
        @compileError("length not comptime known");
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Why is that? Is comptime then not just a restriction, but also an instruction to the compiler to try harder to evaluate something at compile time? And why didn’t this happen in my previous version, where I returned usize instead of ?usize?

When you use comptime, you tell the compiler to evaluate it at comptime. If you don’t, you leave it up to the compiler to choose based on the context and the types. For example, if the types flowing into an expression are comptime-only types, that means it has to be evaluated at comptime to some degree, but you haven’t forced the compiler to make the result fully evaluated at compile time.

Keep in mind that @compileError works like this:

This function, when semantically analyzed, causes a compile error with the message msg.

There are several ways that code avoids being semantically checked, such as using if or switch with compile time constants, and comptime functions.
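A tiny example of such avoided analysis: with a comptime-known condition, the untaken branch is never semantically checked, so a @compileError there does no harm (a sketch of my own, not from the thread):

```zig
pub fn pick(comptime flag: bool) u8 {
    if (flag) {
        return 42;
    } else {
        // Never analyzed as long as every caller passes `true`:
        @compileError("untaken branch was analyzed");
    }
}

pub fn main() void {
    _ = pick(true); // compiles; pick(false) would trigger the error
}
```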

So by forcing it to be run at comptime it works:

pub fn typeDim(comptime T: type, comptime level: usize) ?usize {
    return comptime if (level == 0) switch (@typeInfo(T)) {
        .array => |info| info.len,
        .vector => |info| info.len,
        else => null,
    } else typeDim(std.meta.Elem(T), level - 1) orelse
        @compileError("length not comptime known");
}

I think if you don’t force the compiler, it technically has a choice whether to analyze the code in depth, or to just start generating the corresponding code and only analyze where it needs to. However, Zig tries to be lazy/incremental about things.

Basically, if the compiler were more eager, it could automatically check in depth that the function can be fully evaluated at comptime. But because comptime operates more lazily, we reach and analyze the @compileError before the compiler realizes that the code doesn’t actually depend on runtime arguments. I think if the compiler were more eager in this case, there would be other cases where it evaluates too eagerly, so instead the programmer is required to specify explicitly that it needs to be evaluated at comptime (to avoid the compile error).

At least that is how I currently reason about this, I am not completely sure whether this is fully accurate.

I think the compiler is just written in a way where usize is more likely to be evaluated at comptime (more deeply) by default (without asking for it), while ?usize may default to being passed to code generation to generate a branch based on the optional.

I think if you really wanna know in detail you would have to look at the compiler implementation of how the comptime interpreter works in detail.

This post is related, especially this part of it:

Oh, I didn’t realize I can use comptime without braces (i.e. it works on any expression).

I did try this before, and it didn’t work:

 pub fn typeDim(comptime T: type, comptime level: usize) ?usize {
-    return comptime if (level == 0) switch (@typeInfo(T)) {
-        .array => |info| info.len,
-        .vector => |info| info.len,
-        else => null,
-    } else typeDim(std.meta.Elem(T), level - 1) orelse
-        @compileError("length not comptime known");
+    comptime {
+        return if (level == 0) switch (@typeInfo(T)) {
+            .array => |info| info.len,
+            .vector => |info| info.len,
+            else => null,
+        } else typeDim(std.meta.Elem(T), level - 1) orelse
+            @compileError("length not comptime known");
+    }
 }

Resulting in:

multiary.zig:5:9: error: function called at runtime cannot return value at comptime
        return if (level == 0) switch (@typeInfo(T)) {
        ^~~~~~
referenced by:
    dim__anon_22308: multiary.zig:15:19
    rows__anon_22268: multiary.zig:19:15
    6 reference(s) hidden; use '-freference-trace=8' to see all references
…

I guess that’s because typeDim can be called at runtime or at comptime. (Though I believe the compiler could fix this by internally replacing each return with a break :label and including a surrounding block internally.)


Those practical issues aside (which are really interesting to me), I think that some sort of dim function (maybe named lenLevel or nestedLen or similar) would be nice to have in std. If such a function were to be added, it would go into std.meta, I assume?

But perhaps it’s too niche to include. For comptime-known matrices (or slices of those) it would be nice though, IMHO. (Though maybe I’m missing some implications.)

Maybe. I think the main focus for std is currently still to serve the needs of the compiler, so things that aren’t strictly needed, or aren’t obviously useful to the whole Zig community, aren’t likely to be included; basically, things have to prove that they are worth adding. So single functions that could easily be defined in user code in the few cases where they are needed are unlikely to be included.

I am unsure what would be included in the standard library once we get to preparations for 1.0, however it is expected that it will get a big patch/rewrite:

Personally I hope that the Zig standard library will stay relatively small with 1.0, because I think having a small core is valuable and I also think that the package manager should be so robust at that point, that additional things can just be depended on via that.
However I could see certain math or data-structure related things being included, that aren’t strictly necessary, but agreed to be beneficial from a convenience standpoint.

std.meta is disliked and has shrunk over time; it is more likely to shrink further than to receive new additions. Not impossible, just not very likely.
Personally I like meta, but I also like that it has gotten smaller/simpler, because some things can now be done more easily with builtins directly, for example @FieldType.


Yeah that makes sense.

The dim function doesn’t seem to be as easy (to understand). I just tried to rewrite it a bit more cleanly (not using std.meta). Also, for anyone else who wants to determine the dimensions of a multidimensional array (or the row/column count of a matrix, for example), I would like to share my current result here. This is what I came up with:

const std = @import("std");

pub const Dim = union(enum) {
    static: usize,
    dynamic: void,
    unknown: void,
    none: void,
};

pub fn typeDim(comptime T: type, comptime level: usize) Dim {
    const z = comptime level == 0;
    return comptime switch (@typeInfo(T)) {
        .array => |info| if (z) .{ .static = info.len } else typeDim(info.child, level - 1),
        .vector => |info| if (z) .{ .static = info.len } else typeDim(info.child, level - 1),
        .pointer => |info| switch (info.size) {
            // Special handling of `.one` needed because slices
            // with comptime length are pointers to arrays.
            .one => typeDim(info.child, level),
            .slice => if (z) .{ .dynamic = {} } else typeDim(info.child, level - 1),
            .many, .c => if (z) .{ .unknown = {} } else typeDim(info.child, level - 1),
        },
        else => .{ .none = {} },
    };
}

pub fn dim(value: anytype, comptime level: usize) usize {
    return switch (comptime typeDim(@TypeOf(value), level)) {
        .static => |len| len,
        .dynamic => if (comptime (level == 0)) value.len else @compileError("dynamic dimension"),
        .unknown => @compileError("unknown dimension"),
        .none => @compileError("value has no dimension"),
    };
}

pub fn rows(value: anytype) usize {
    return dim(value, 0);
}

pub fn cols(value: anytype) usize {
    return dim(value, 1);
}

pub fn printDim(name: []const u8, matrix: anytype) void {
    std.debug.print("{s}: {}x{}\n", .{ name, rows(matrix), cols(matrix) });
}

pub fn main() void {
    const a: [3][2]i32 = undefined;
    const b: [3][0]i32 = undefined;
    const c: [0][2]i32 = undefined;
    const d: [0][0]i32 = undefined;
    printDim("a", a);
    printDim("b", b);
    printDim("c", c);
    printDim("d", d);
    const square_matrix: [3][3]i32 = .{
        [_]i32{ 1, 2, 3 },
        [_]i32{ 4, 5, 6 },
        [_]i32{ 7, 8, 9 },
    };
    printDim("square_matrix", square_matrix);
    printDim("slice of a square", square_matrix[0..1]);
    printDim("slice of a square", square_matrix[1..3]);
    // This gives a compiler error (as intended):
    //const array_of_slices: [3][]i32 = undefined;
    //printDim("array_of_slices", array_of_slices);
}

Not sure if I handled all cases properly.

What makes things a bit twisted is that square_matrix[0..1] and square_matrix[1..3] are not slices, but pointers to a [1][3]i32 and [2][3]i32 array, respectively. This is why std.meta.Elem has a quite confusing implementation as well.
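The difference is easy to see by printing the types; a quick sketch of my own (the type names in the comments are what I would expect, abbreviated):

```zig
const std = @import("std");

pub fn main() void {
    const square_matrix: [3][3]i32 = .{
        .{ 1, 2, 3 },
        .{ 4, 5, 6 },
        .{ 7, 8, 9 },
    };
    // Comptime-known bounds produce a pointer to an array:
    std.debug.print("{s}\n", .{@typeName(@TypeOf(square_matrix[0..1]))}); // *const [1][3]i32
    std.debug.print("{s}\n", .{@typeName(@TypeOf(square_matrix[1..3]))}); // *const [2][3]i32
    // A runtime-known bound produces an actual slice:
    var start: usize = 1;
    _ = &start;
    std.debug.print("{s}\n", .{@typeName(@TypeOf(square_matrix[start..]))}); // []const [3]i32
}
```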

I don’t expect this kind of code to be “quickly written” from scratch by any programmer who wants to solve an entirely different problem and not deal with low-level metaprogramming.

Generally I agree, of course. I also wonder, for example, if HTTP should really be included in a standard library, considering that’s an evolving standard as well (HTTP/1.1, HTTP/2, HTTP/3, …). But that’s a bit of another discussion.

Yeah, I feel like “dimensions of an array” is something very related to the language itself (unlike the HTTP example). So I would personally like something like that in std. But if not, it’s not a big deal. Not many people might actually need this.

Why disliked? Because it’s more idiomatic to pass explicit types when things get complicated?

I am not completely sure. I think it is partly because some of the implementations that were part of meta were pretty complicated and slow for things that should be easy, so it feels like there should be easier and more performant ways to do the same thing. That is the reason why some of them have already been made obsolete and removed.

I think one standpoint is that meta sometimes feels more like a crutch for things that already should be easy to express with the language or builtins, so that ideally you wouldn’t even need to have a meta namespace for such utilities.

But I don’t remember the details, I just vaguely remember it from some discussions or issue comments. I guess you would have to search through some issues/discussions to see whether there is any concrete/up-to-date plans for meta.

In the past meta also had a bunch of things removed that instead have become language features / builtins.

That is an interesting point. I had always suspected that it was because std.meta is somewhat of a grab-bag of miscellaneous functions that deal with types, essentially a “util” class, which is often considered Le Bad Thing™ in API design.

To me, it looks like std.meta allows you to compose types. So it’s some sort of language reflection. Maybe std.builtin is more the bare minimum to describe types, and std.meta lets you put types together in a more convenient way.

In a way it’s beautiful to be able to use Zig itself to implement these functions. But I guess it can slow down compile times? (And avoiding that would, in turn, make the compiler more complex.)

Maybe determining the dimensions of a multidimensional array is something directly to be included in the language anyway, like (hypothetically, this isn’t valid syntax) @len(matrix, 0) to get the row count and @len(matrix, 1) to get the column count.

I meant @builtins not the std.builtin namespace or @import("builtin"), can be a bit confusing with these different meanings of builtin.

I understood; I just asked myself why std.meta exists separately of the already existing std.builtin.

I think both are for reflection, but I tried to understand what’s the semantic difference between those two.

I guess the difference is that std.builtin is used by the “built-ins” (@…) (and thus can’t be implemented by a third-party package), while std.meta is not, and thus can be implemented elsewhere in any third-party package.


Consequently, a function determining the length of a multidimensional array would either belong to

  • std.meta
  • the built-in functions (@…)
  • your own code or third-party packages

But not:

  • std.builtin

My understanding of this problem is this: some of the internal implementations of std.meta are heavy, and having them available makes it easy to abuse them for an implementation that looks “clean” rather than one that actually compiles with better performance.
An example is this one, which checks if an error is in a specific set of errors.
In this thread, the OP considers the following solution “verbose”:

const std = @import("std");

pub fn inErrorSet(comptime err: anyerror, comptime Err: type) bool {
    if (@typeInfo(Err).error_set) |error_set| inline for (error_set) |err_info| {
        if (std.mem.eql(u8, @errorName(err), err_info.name)) {
            return true;
        }
    };
    return false;
}

Meanwhile, the solution with std.meta.FieldEnum seems much easier.

pub fn inErrorSet(comptime err: anyerror, comptime Err: type) bool {
    return @hasField(std.meta.FieldEnum(Err), @errorName(err));
}

However, the former solution in fact gets better performance. The latter copies all error names in the error set to generate its enum type at compile time, and calls std.math.IntFittingRange internally just to compute the maximum tag size. These steps are not actually needed here and harm compile-time performance.
The standard-library status and ease of use of std.meta make it easier for users to write “simple” code that hurts compilation performance.
