Struggling with nested array literals

Suppose I have something like this:

// initFromSlice2d is declared like:
// pub fn initFromSlice2d(alloc: std.mem.Allocator, initial: []const []const f32) @This()

const mat3x3 = [_][3]f32{
    [_]f32{ 1, 2, 3 },
    [_]f32{ 4, 5, 6 },
    [_]f32{ 7, 8, 9 },
};
var tens2d = Ndarray(f32).initFromSlice2d(arena.allocator(), &mat3x3);

Leads to the following error:

src\ndarray.zig:530:66: error: expected type '[]const []const f32', found '*const [3][3]f32'
    var tens2d = Ndarray(f32).initFromSlice2d(arena.allocator(), &mat3x3);
                                                                 ^~~~~~~
src\ndarray.zig:530:66: note: pointer type child '[3]f32' cannot cast into pointer type child '[]const f32'
src\ndarray.zig:147:67: note: parameter type declared here
        pub fn initFromSlice2d(alloc: std.mem.Allocator, initial: []const []const f32) @This() {
                                                                  ^~~~~~~~~~~~~~~~~~~

Without the address-of operator I get this:

src\ndarray.zig:530:66: error: array literal requires address-of operator (&) to coerce to slice type '[]const []const f32'
    var tens2d = Ndarray(f32).initFromSlice2d(arena.allocator(), mat3x3);
                                                                 ^~~~~~

How do I pass this into a function that takes a slice of slices as input, not a fixed-size array like the one above?

I think I get why I can’t pass the “just 9 floats” to a function that takes a slice of slices… but is there a convenient way to convert from the former format to the latter? I think I’m pretty OK with Zig in general, but I keep banging my head against array literals.

Oh OK, so I can make initFromSlice2d take anytype and it works:

pub fn initFromSlice2d(alloc: std.mem.Allocator, initial: anytype) @This() {
    var arr = @This().init(alloc, &[_]usize{ initial.len, initial[0].len });
    var c: usize = 0;
    for (0..arr.shape[0]) |i| {
        for (0..arr.shape[1]) |j| {
            arr.buf[c] = initial[i][j];
            c += 1;
        }
    }
    return arr;
}

Hardly ideal, as I just lost the type information from the initFromSlice2d signature.
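
One way I could claw back some of that safety might be to keep anytype but validate the argument at comptime. A rough, untested sketch (checkIndexable2d is a name I made up; the type-tag names are as in Zig 0.11):

const std = @import("std");

// Reject non-indexable arguments with a readable message instead of a
// cryptic error deep inside the function body.
fn checkIndexable2d(comptime T: type) void {
    switch (@typeInfo(T)) {
        .Array, .Pointer => {},
        else => @compileError("expected a 2-d array or slice, got " ++ @typeName(T)),
    }
}

fn sum2d(initial: anytype) f32 {
    comptime checkIndexable2d(@TypeOf(initial));
    var total: f32 = 0;
    for (initial) |row| {
        for (row) |v| total += v;
    }
    return total;
}

test "comptime check on anytype" {
    const mat = [_][3]f32{
        [_]f32{ 1, 2, 3 },
        [_]f32{ 4, 5, 6 },
    };
    try std.testing.expectEqual(@as(f32, 21), sum2d(mat));
}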

The problem you are running into is the difference between arrays and slices.

A [3]f32 array is literally 3 float values in memory, one after another. A slice like []const f32 is a pointer type: it stores an address and a length. To get a slice, you need somewhere valid to point to, which is either heap-allocated memory or an array on the stack. initFromSlice2d is expecting a slice of slices, but you are passing a pointer to an array of arrays, which can only coerce to a slice of arrays. To make it work, change the parameter type to []const [3]f32 rather than []const []const f32.
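
For reference, a minimal sketch of that change; sumAll stands in for your real function body just to keep the example self-contained:

const std = @import("std");

// Accept a slice of fixed-length rows instead of a slice of slices.
fn sumAll(rows: []const [3]f32) f32 {
    var total: f32 = 0;
    for (rows) |row| {
        for (row) |v| total += v;
    }
    return total;
}

test "pointer to array of arrays coerces to slice of arrays" {
    const mat3x3 = [_][3]f32{
        [_]f32{ 1, 2, 3 },
        [_]f32{ 4, 5, 6 },
        [_]f32{ 7, 8, 9 },
    };
    // &mat3x3 is *const [3][3]f32, which coerces to []const [3]f32.
    try std.testing.expectEqual(@as(f32, 45), sumAll(&mat3x3));
}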

If it were me, I would store all nine values in one flat [9]f32 array instead, since it makes conversion between arrays and slices less troublesome.
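
A rough sketch of that flat-storage idea (type and method names are just for illustration):

const std = @import("std");

// One contiguous buffer plus row-major index arithmetic.
const Mat3x3 = struct {
    buf: [9]f32,

    fn at(self: Mat3x3, row: usize, col: usize) f32 {
        return self.buf[row * 3 + col];
    }
};

test "flat storage" {
    const m = Mat3x3{ .buf = .{ 1, 2, 3, 4, 5, 6, 7, 8, 9 } };
    // Row 1, column 2 is element 1 * 3 + 2 = 5 of the buffer.
    try std.testing.expectEqual(@as(f32, 6), m.at(1, 2));
}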

More, Unnecessary Details

Arrays are a list of values of the same type, either with an explicit length ([3]f32{0, 0, 0}) or an inferred one ([_]f32{0, 0, 0}). The length is known at compile time, so any code that operates on the array knows exactly how much data to expect and can be optimized for exactly that amount of data.

Slices, on the other hand, are a pointer to a list of values of the same type. A slice points to memory rather than being the value itself, and it is represented as a memory address plus a size. A []u8 slice is basically equivalent to a struct like this:

const Slice = struct {
    ptr: [*]u8, // a pointer to an unknown number of u8 values
    len: usize,
};

An array, on the other hand, would be more like this:

// Just pretend we can use numbers as field names in a struct; this code will not work
const Array3u8 = struct {
    0: u8,
    1: u8,
    2: u8,
};

Where each of the fields is one of the indexes in the array.
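
To make the difference concrete, here is a tiny self-contained example of a slice pointing into a stack array:

const std = @import("std");

test "array vs slice" {
    var arr = [3]f32{ 1, 2, 3 }; // three f32 values, length known at comptime
    const sl: []f32 = &arr;      // pointer + len, pointing into arr
    try std.testing.expectEqual(@as(usize, 3), sl.len);
    sl[0] = 9;                   // writes through the pointer...
    try std.testing.expectEqual(@as(f32, 9), arr[0]); // ...into the array
}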

Why does this matter?

The function you are calling expects the type []const []const f32. This type is a slice of slices:

const Slicef32 = struct {
    ptr: [*]f32,
    len: usize,
};
const SliceOfSlicef32 = struct {
    ptr: [*]Slicef32,
    len: usize,
};

And the data you are trying to pass is a slice of arrays:

const Array3f32 = struct {
    0: f32,
    1: f32,
    2: f32,
};
const Array3Array3f32 = struct {
    0: Array3f32,
    1: Array3f32,
    2: Array3f32,
};
const SliceOfArray3f32 = struct { // what &mat3x3 coerces to via the address-of operator (&)
    ptr: [*]Array3f32,
    len: usize,
};

Array:

+-----+
| f32 |
+-----+
| f32 |
+-----+
| f32 |
+-----+

Slice:

+-----+
| ptr |
+-----+
| len |
+-----+

// memory at ptr
+-----------+
| f32 0     |
+-----------+
| f32 1     |
+-----------+
| ...       |
+-----------+
| f32 len-1 |
+-----------+

Thanks for the reply, it makes it clear why the code doesn’t compile.

The slice lengths are arbitrary, so changing the initFromSlice2d parameter type like this is not an option.

This loses the dimensionality of my input data. Maybe not such a big deal for a 3x3 matrix, but it makes for hard-to-read code for arbitrary-dimensional tensors.

I guess what I’m looking for is whether there are, for example, some utilities in the std lib for converting fixed-size arrays to slices at comptime? Right now this seems pretty cumbersome.
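
Something like this hypothetical helper is the best I have come up with so far (asSlice2d is a name I made up; the result borrows from the input array, which must outlive it):

const std = @import("std");

// Build an array of slices viewing the rows of a fixed-size 2-d array.
fn asSlice2d(comptime n: usize, comptime m: usize, arr: *const [n][m]f32) [n][]const f32 {
    var out: [n][]const f32 = undefined;
    for (arr, 0..) |*row, i| {
        out[i] = row; // *const [m]f32 coerces to []const f32
    }
    return out;
}

test "fixed-size array to slice of slices" {
    const mat = [_][3]f32{
        [_]f32{ 1, 2, 3 },
        [_]f32{ 4, 5, 6 },
    };
    const rows = asSlice2d(2, 3, &mat);
    // &rows is *const [2][]const f32, which coerces to []const []const f32.
    const slices: []const []const f32 = &rows;
    try std.testing.expectEqual(@as(f32, 6), slices[1][2]);
}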

After you instantiate an Ndarray, will the dimensions change during runtime? (I’m trying to understand why you need slices instead of arrays.)


The dims do not change once the Ndarray instance has been created. But I want to make it easy to create Ndarrays from arbitrarily sized float arrays. This is useful, for example, when writing tests, but there are many other uses too. Here’s an example of how I currently use this initialization in my tests:

const mat2x4 = [_][4]f32{
    [_]f32{ 1, 2, 3, 4 },
    [_]f32{ 5, 5, 6, 7 },
};
var tens2d = Ndarray(f32).initFromSlice2d(arena.allocator(), mat2x4);
const sum1 = tens2d.sum(arena.allocator(), .{ .axis = 0, .keep_dims = false });
try std.testing.expectEqualSlices(usize, &[_]usize{4}, sum1.shape);
for (0..4) |i| {
    try std.testing.expectEqual(@as(f32, mat2x4[0][i] + mat2x4[1][i]), sum1.get(&[_]usize{i}).item());
}

The need for slices is obvious to me, but maybe I’m not explaining it clearly. How else would you declare initFromSlice2d without using dynamic-length slices, while retaining the current API that needs to support arbitrary input sizes?

Making the dims a comptime argument of Ndarray is not something I’m interested in pursuing either. This is because the Ndarray module includes many functions that take as input one or more Ndarrays with arbitrary dimensionality and can return Ndarrays that again have different dims. I think comptime dims work well for low-dimensional matrices like the 4x4 used in 3D graphics, but when working with high-dimensional tensors in machine learning, there are just too many shapes and sizes of everything.
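
For contrast, the comptime-dims approach I am ruling out would look roughly like this (illustrative only):

const std = @import("std");

// Every distinct shape becomes a distinct type, which multiplies quickly
// once functions can return tensors with different dims than their inputs.
fn Matrix(comptime rows: usize, comptime cols: usize) type {
    return struct {
        data: [rows][cols]f32 = undefined,
    };
}

test "comptime-shaped matrix" {
    var m = Matrix(4, 4){};
    m.data[0][0] = 1;
    try std.testing.expectEqual(@as(f32, 1), m.data[0][0]);
}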


I was assuming the number of dimensions was fixed, sorry. If you want to pass the individual dimensions as slices, you need to take a reference to each of the sub-arrays and change the type like so:

const mat3x3 = [_][]const f32{
    &[_]f32{ 1, 2, 3 },
    &[_]f32{ 4, 5, 6 },
    &[_]f32{ 7, 8, 9 },
};

Thanks, this works!

const mat2x4 = [_][]const f32{
    &[_]f32{ 1, 2, 3, 4 },
    &[_]f32{ 5, 5, 6, 7 },
};
var tens2d = Ndarray(f32).initFromSlice2d(arena.allocator(), &mat2x4);

I was unable to get the types right when I tried to do this myself earlier. Miss a const or an & somewhere and it won’t compile, and it’s hard to tell from the error message why.


Yeah, the error reporting isn’t always the best; I’ve had trouble getting this sort of stuff to work in the past as well.