Function that accepts array of any length - is it a thing?

Fix a type T. Suppose you have some logic that wants to operate on arrays with value type T. It works for all array sizes, and you genuinely want to work with arrays - you want a new version of f to get compiled for every distinct length of array that it is called on.

This is perfectly achievable, by doing comptime introspection on the type of the argument passed to f, using @compileError to enforce that it’s an array and that the value type is T.
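
A minimal sketch of that enforcement, assuming T is the two-value enum from the example below:

const T = enum { foo, bar };

fn f(array: anytype) void {
    const A = @TypeOf(array);
    switch (@typeInfo(A)) {
        .array => |info| if (info.child != T)
            @compileError("f expects values of type " ++ @typeName(T)),
        else => @compileError("f expects an array, got " ++ @typeName(A)),
    }
    // array.len is comptime-known here, even when the element
    // values themselves are runtime data.
    _ = array;
}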

However, this kind of enforcement doesn’t tell the compiler ahead of time what types are accepted by f, which means that certain initialization expressions don’t work.

For example, if T = enum { foo, bar }, it might be nice to do

f(.{.foo, .bar, .bar, .foo});

and if T = [10]u32 it might be nice to do

f(.{@splat(42), @splat(1234)});

These are valid if f knows the array size at function-definition time, but not if it doesn’t. Given what we know about the arguments that f accepts (and can enforce at comptime), there is logically enough information in the pair of the initialization expression (an array literal) and the target (valid argument to f) to make this work. But is there a way to communicate it to the compiler?

What you want is a slice: []T is a pointer + length.

Pointers to arrays coerce to slices. Also, don’t forget about constness, since slices are pointers: a const slice is []const T.

fn foo(
    comptime T: type, 
    comptime size: usize, 
    array: [size]T,
) void
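
A hypothetical call then passes both comptime parameters explicitly:

foo(u32, 3, .{ 1, 2, 3 });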

Unfortunately, I don’t think slices quite achieve this - in the examples I gave, while it’s true that if you make f accept a []const T you would be able to do

f(&.{ .foo, .bar, .foo }); // T an enum
f(&.{ @splat(x), @splat(y) }); // T an array type

you wouldn’t be able to see the length of the argument (even though it’s in principle comptime known) inside f at comptime, without forcing the whole slice to be a comptime parameter.
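
A minimal sketch of that limitation (T as above; the error wording is approximate):

fn f(data: []const T) void {
    // compile error: data.len cannot be resolved at comptime here
    const n = comptime data.len;
    _ = n;
}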

The solution with passing the length of the array as a generic parameter does work:

fn f(T: type, comptime N: usize, data: *const [N]T)

(not that you need the *const at that point). But again, this parameter is logically redundant, or rather it would be redundant if there were a way to describe the notion of an array with a compile-time-known but not function-definition-time-known length.
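
For instance (a hypothetical call, with T an enum as in the earlier example), the length must be spelled out even though the literal already determines it:

const E = enum { foo, bar };
f(E, 3, &.{ .foo, .bar, .foo });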

Anyway, this is all just part of my exploring the limits of comptime, limits that exist for good reason and I’m not looking to expand - just understand.

You can use fn f(T: type, comptime data: []T) if you need it at comptime. I prefer that over arrays for ergonomics, and it achieves the same result.

It might even be more performant, since arrays are passed by value and so would be copied, unless you’re taking arrays through a pointer.

Ah, but I guess if you only need the length at comptime, not the data, I can understand using arrays in that case.

I don’t think the question about whether you want to pass a copy vs a pointer is material here, since the strongest pointer type (coercing to all the others)

*S // S = [N]T, say

carries the same comptime information as S itself. So you’re never ‘forced’ to use an array due to comptime requirements.
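
A minimal sketch of that equivalence (hypothetical function names):

fn byValue(data: [4]u8) void {
    const n = comptime data.len; // comptime-known
    _ = n;
}

fn byPointer(data: *const [4]u8) void {
    const n = comptime data.len; // equally comptime-known
    _ = n;
}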

You could make a second-order function that uses the type information:

pub fn makeGetDoubleLen(comptime N: usize, T: type) fn ([N]T) usize {
    return struct {
        pub fn getDoubleLen(arr: [N]T) usize {
            return arr.len * 2;
        }
    }.getDoubleLen;
}

// snip 

const getDoubleLen2 = makeGetDoubleLen(2, EnumType);
const double_len_2 = getDoubleLen2(.{ .foo, .bar });

const getDoubleLen10 = makeGetDoubleLen(10, u32);
const double_len_10 = getDoubleLen10(@splat(42));

I’m not sure if there’s any benefit to doing it that way over the fn(N: usize, T: type, data: [N]T) way, but it is possible.

If you’re looking for special syntax in Zig, the answer is almost always one of the two:

  • It’s possible in comptime by being a little more verbose
  • It’s not possible

Your answer is either what @LucasSantos91 suggested, or:

fn foo(array: anytype) void {
    my_awesome_comptime_reflection_lib.ensureThisTypeIsValidInThisParticularContext(@TypeOf(array));
    // ...
}

I think the right answer in this case is ‘it’s not possible’, and I’m fine with that.

The problem specification is that:

  • f should be able to infer at comptime the length from the argument, which rules out passing the length directly to f, as this defeats the purpose.
  • the argument should not need to be comptime-known. This rules out passing a slice to f (because you can’t know the length of a slice at comptime without knowing the whole slice at comptime).
  • f should provide enough information in its signature so that the value types of the argument are known ahead of time, enabling passing array literals whose elements are context-dependent expressions (like .foo, @splat(3.14), etc.). This rules out the anytype argument.

The first two points in your specification are contradictory.


Zig will likely never do the type inference you are asking for:

  1. Zig’s type inference is intentionally limited; the only reason it has type inference at all is to avoid redundant type specifications.
  2. Zig’s comptime is just Zig code; it can be arbitrarily complex, so collecting information on acceptable types is not simple.

You have 3 options:

  1. fn f(T: type, N: comptime_int, a: *const [N]T)
  2. fn f(a: anytype) (not recommended)
  3. an inline wrapper:

inline fn f(T: type, s: []const T) void {
    const arr = comptime s[0..s.len];
    _ = arr;
}

If you choose option 3, I recommend it be just a wrapper for option 1, as inline can very easily bloat your binary size and prevent optimisations; it only exists because the compiler can’t make perfect decisions.
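
A sketch of that wrapper, assuming the option-1 function is renamed fImpl (a hypothetical name) and the call site passes a slice whose length is comptime-known:

fn fImpl(T: type, comptime N: usize, data: *const [N]T) void {
    _ = data;
}

inline fn f(T: type, s: []const T) void {
    // Because f is inline, s.len can be comptime-known at the call
    // site, so slicing with comptime bounds recovers a *const [N]T.
    const arr = s[0..s.len];
    fImpl(T, arr.len, arr);
}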

I don’t believe these points are in contradiction. You can write a function f that will compile against any array as input, and it will be able to use the length (and of course value type) of the array at comptime, even if the array itself is not comptime known. But there is no way to specify the value type in the signature, without either:

  1. also specifying the length in the signature, or
  2. passing fully comptime known parameters (not what I was looking for), or
  3. giving up on knowing the length at comptime

The point, which does not really fall into the ‘special syntax’ category IMO, would be to enable expressions like

foo(.{
    @splat(1.0),
    @splat(2.0),
});

foo(.{
    @splat(1.0),
    @splat(2.0),
    @splat(3.0),
});

to compile. In this case, the compiler would understand that foo takes arrays of any length whose values are of type [3]f64, say, for which @splat makes sense.

P.S. This whole discussion is a matter of curiosity for me.

The point is not about arrays per se; any function call I write down involving arrays could also use a pointer-to-array instead, as far as I’m concerned. It just simplifies the notation for this discussion to stick to arrays.

On the point of redundancy: you could achieve the ‘splat at depth’ expression I was going for like this:

fn foo(comptime n: usize, arr: [n][3]f64) void { … }

foo(2, .{
    @splat(1.0),
    @splat(2.0),
});

foo(3, .{
    @splat(1.0),
    @splat(2.0),
    @splat(3.0),
});

but I would argue that the n parameter is in fact redundant, logically. The compiler knows the lengths of the literal expressions appearing as the second argument. This was an attempt to remove that redundancy.

In this specific case you could do:

pub fn foo(s: []const [3]f64) void {
	_ = s;
}

test foo {
	foo(&[_][3]f64{
		@splat(1),
		@splat(2),
		@splat(3),
	});
}

Not much extra syntax needed here to make it work.
As for accepting arrays of any length, you could try coercing the splats with @as(), but you would need to do a little @typeInfo() trickery with a wrapper function to make it work:

pub fn MakeFooArgs(s: anytype) type {
	return switch(@typeInfo(@TypeOf(s))){
		.@"struct" => |s_struct| @Type(.{.array = .{
			.len = s_struct.fields.len,
			.child = s_struct.fields[0].type,
			.sentinel_ptr = null,
		}}),
		else => @compileError("MakeFooArgs() expects a struct or tuple"),
	};
}

const std = @import("std");

pub fn foo(s: anytype) void {
	const s_t: type = MakeFooArgs(s);
	std.log.err("{s}: {any}", .{
		@typeName(s_t),
		@as(s_t, s),
	});
}

test foo {
	foo(.{
		@as([4]f64, @splat(1)),
		@as([4]f64, @splat(2)),
		@as([4]f64, @splat(3)),
	});
}

It seems strange to me to rule out anytype here. You can type-check the value completely. Either that, or use the redundant comptime n parameter. I see this as the price for Zig’s simplicity, and I’m OK with that. I’d much rather deal with this than a Turing-complete type system.


I didn’t make any points about arrays; I was just sticking to your example.

You’re right, that is redundant, but any solution to remove that redundancy adds quite a bit of complexity, even if it was just a special case for arrays.

I mean, Zig’s type system is Turing-complete, because it’s just Zig code with types as values.

You can do this:

foo(&.{
    @splat(1),
    @splat(2),
    @splat(3),
});

There are many such cases where your only choice is anytype (which, as you say, has no information in the signature) or a redundant type param. Based on watching the discussions and decisions for a few years now, I can say this is very unlikely to change. It is a trade-off between comptime as it exists today and something else, or at least more features that are closer to generics in other languages, which is something that has been very explicitly decided against. I’m glad for that, because the level of complexity would go way up if it were not the case.
