Anyerror!void versus !void

biosbob · August 9, 2024, 12:04pm

i just wrote a pair of mutually-recursive functions (each declared !void) and ran into the “unable to resolve inferred error set” messages…

following a suggestion here, i declared my functions to return anyerror!void – and everything compiles just fine…

what’s the difference between !void and anyerror!void ???

dee0xeed · August 9, 2024, 12:12pm

When the compiler cannot infer the error set, it requires anyerror, which is the set of all errors in a program.

biosbob · August 9, 2024, 12:12pm

i get that… so why isn’t that implicit in !void ???

dee0xeed · August 9, 2024, 12:14pm

An example from some of my code, function pointer:

const ReadDataFnPtr = *const fn(self: *EventSource) anyerror!void;

In this case the compiler does not know where the pointer will point; that is runtime information, so it wants anyerror.
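To make the function-pointer case concrete, here is a minimal sketch. The `Callback` type and the two failing functions are hypothetical names invented for illustration. Each function has its own inferred error set, but once it is stored behind the pointer type only anyerror is visible to the caller:

```zig
const std = @import("std");

// Hypothetical callback type: the pointee is only known at runtime,
// so the error set in the signature has to be spelled out.
const Callback = *const fn () anyerror!void;

// Each function has its own (inferred) error set...
fn failsWithTimeout() !void {
    return error.Timeout;
}

fn failsWithClosed() !void {
    return error.Closed;
}

pub fn main() void {
    // ...but both fit behind the same anyerror-returning pointer type.
    const callbacks = [_]Callback{ failsWithTimeout, failsWithClosed };
    for (callbacks) |cb| {
        cb() catch |err| switch (err) {
            error.Timeout => std.debug.print("timeout\n", .{}),
            // The concrete set was erased, so listing cases can never be
            // exhaustive and an else prong is required.
            else => std.debug.print("other: {s}\n", .{@errorName(err)}),
        };
    }
}
```

The `else` prong is the visible cost: the caller can no longer prove it has handled every error the callee can produce.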

Bobbias · August 9, 2024, 12:16pm

I haven’t looked at the internals, but typically what’s going on is that leaving the actual error type off actually asks zig to try to infer a specific type. Typically you don’t want type inference to fall back on any types, because that simply means something that fails to typecheck will simply infer the any type. This defeats the entire purpose of having static typing. The safer option is to simply fail at inference and require an explicit type be provided. If the user opts to use an any type that’s their choice, but they’re being explicit about it.

dee0xeed · August 9, 2024, 12:18pm

I think in the case of !void the error set will include only the errors which a function actually returns, whereas anyerror is the ‘global’ error set.

kristoff · August 9, 2024, 12:24pm

Another way of describing anyerror is as a form of type erasure.

Omitting the error set asks Zig to infer what the error union looks like, which can fail when recursion is involved.

In a sense anyerror is the void * of errors, using C lingo.
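For the recursion case, a sketch of the usual way out: give both functions a hand-written error set so nothing has to be inferred. `DepthError`, `stepA` and `stepB` are made-up names for illustration; with `!u32` on both functions instead, the compiler may report the inference failure described in the original post.

```zig
const std = @import("std");

// Hypothetical error set, written out by hand so nothing has to be inferred.
const DepthError = error{TooDeep};

// stepA and stepB call each other; both name the same explicit error set.
fn stepA(remaining: u32) DepthError!u32 {
    if (remaining > 100) return error.TooDeep;
    if (remaining == 0) return 0;
    return 1 + try stepB(remaining - 1);
}

fn stepB(remaining: u32) DepthError!u32 {
    if (remaining == 0) return 0;
    return 1 + try stepA(remaining - 1);
}

pub fn main() void {
    // The switch is exhaustive because the set is known to be {TooDeep}.
    const steps = stepA(4) catch |err| switch (err) {
        error.TooDeep => 0,
    };
    std.debug.print("steps: {d}\n", .{steps});
}
```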

As another example, one cannot create function types with an implicit error union because that implicitness is a form of genericity (i.e. the signature would be generic with regard to the error union), while function signatures with a hardcoded error set or anyerror are fair game.


const Foo = struct {
  a: *const fn (usize) error{Bar}!void, // ok
  b: *const fn (usize) anyerror!void, // ok
  c: *const fn (usize) !void, // compile error
};

@biosbob if in your use case it’s not too cumbersome to define the error set manually, I would recommend doing that over anyerror, as that’s both easier to understand for the reader and preserves the ability to switch over an error set that correctly represents the actual errors that the function can return. In contrast, switching over anyerror forces you to handle errors that the function might not even be able to produce.

If the actual error set is too annoying to fully define, then anyerror is fine.
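A small sketch of the difference when switching. `LoadError`, `load` and `loadErased` are hypothetical names used only for this comparison:

```zig
const std = @import("std");

// Hypothetical explicit error set.
const LoadError = error{ NotFound, Corrupt };

fn load(id: u32) LoadError!u32 {
    if (id == 0) return error.NotFound;
    if (id == 1) return error.Corrupt;
    return id * 2;
}

// Same behavior, but the signature erases the set to anyerror.
fn loadErased(id: u32) anyerror!u32 {
    return load(id);
}

fn describeExplicit(id: u32) []const u8 {
    // Exhaustive: adding a new error to LoadError turns this into a compile error.
    _ = load(id) catch |err| return switch (err) {
        error.NotFound => "not found",
        error.Corrupt => "corrupt",
    };
    return "ok";
}

fn describeErased(id: u32) []const u8 {
    // With anyerror an else prong is mandatory, and it also swallows
    // any error added later.
    _ = loadErased(id) catch |err| return switch (err) {
        error.NotFound => "not found",
        error.Corrupt => "corrupt",
        else => "unexpected",
    };
    return "ok";
}

pub fn main() void {
    for ([_]u32{ 0, 1, 2 }) |id| {
        std.debug.print("{d}: {s} / {s}\n", .{ id, describeExplicit(id), describeErased(id) });
    }
}
```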

mnemnion · August 9, 2024, 3:22pm

I have a sort of brute-force tip for doing this, if there’s a better technique, someone please let me know.

Just replace !T in the return type with error{E}!T. The actual letter E is fine unless that’s the whole error set of the function, in which case, it’s also fine.

In all other cases, it will fail to compile, and the compiler will tell you what the actual error set is. Which you can copy-paste right into the return value, or you can give it a name and use the name.

For a mutually-recursive function, since you already know it can’t infer the error type, you’ll need to knock out the functions temporarily, and then combine the error sets you get. That should work.

Hopefully at some point, either ZLS or an official language server will be able to do this for us. In the meantime, a bit of manual labor is not so bad.

Note that use of anytype can result in an error set which just can’t be inferred. You’ll need anyerror for that case.
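The placeholder trick, sketched on a made-up function. `parseDigit` and `DigitError` are hypothetical, and the first step appears only as a comment because it is meant to fail to compile:

```zig
const std = @import("std");

// Step 1 (temporary, intentionally does not compile): swap `!u8` for a
// placeholder set and read the real error names from the diagnostics:
//
//     fn parseDigit(c: u8) error{E}!u8 { ... }
//
// Step 2: copy the reported errors into a named set and use that name.
const DigitError = error{NotADigit};

fn parseDigit(c: u8) DigitError!u8 {
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

pub fn main() void {
    const value = parseDigit('7') catch |err| switch (err) {
        error.NotADigit => 0,
    };
    std.debug.print("digit: {d}\n", .{value});
}
```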

mlugg · August 9, 2024, 10:15pm

“If the actual error set is too annoying to fully define, then anyerror is fine.” (quoting kristoff)

I would push back a little on this. There are, IMO, very few cases where anyerror is appropriate to use. It’s necessary in some runtime interfaces (e.g. std.io.AnyReader), but I… actually can’t really think of any other case I’d recommend using it, at least off the top of my head.

I don’t really see how it could be “annoying” to define the error set in any case. If it’s a self-contained algorithm, then the errors are just the ones in that file, and you can just list them out in a const Error = error{ ... }, and if not, you define the error sets in the stuff you call into and combine them with ||.
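As a sketch of that approach: list the file's own errors in one set and merge in the set of whatever is called into with `||`. `ConfigError`, `PortError` and `parsePort` are hypothetical names; `std.fmt.parseInt` and `std.fmt.ParseIntError` come from the standard library.

```zig
const std = @import("std");

// Errors produced directly in this file (hypothetical names).
const ConfigError = error{ MissingValue, OutOfRange };

// Merge with the error set of the function we call into.
const PortError = ConfigError || std.fmt.ParseIntError;

fn parsePort(text: []const u8) PortError!u16 {
    if (text.len == 0) return error.MissingValue;
    const port = try std.fmt.parseInt(u16, text, 10);
    if (port < 1024) return error.OutOfRange;
    return port;
}

pub fn main() void {
    const inputs = [_][]const u8{ "8080", "", "80", "99999", "http" };
    for (inputs) |text| {
        if (parsePort(text)) |port| {
            std.debug.print("'{s}': port {d}\n", .{ text, port });
        } else |err| switch (err) {
            // Exhaustive over the merged set; no else prong needed.
            error.MissingValue => std.debug.print("'{s}': missing\n", .{text}),
            error.OutOfRange => std.debug.print("'{s}': reserved port\n", .{text}),
            error.Overflow => std.debug.print("'{s}': too large\n", .{text}),
            error.InvalidCharacter => std.debug.print("'{s}': not a number\n", .{text}),
        }
    }
}
```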

Using anyerror rather than a concrete error type has several disadvantages:

- You lose out on useful type signatures; you need to analyze the implementation to determine which errors perhaps need explicit handling (vs which just need an else => fatal() or whatever).
- In a similar vein, you lose the safety of actually making sure you’ve handled everything which needs handling. Exhaustive switch statements are a fantastic feature of Zig (yes, most modern languages do this), and IMO are a huge contributor to its safety. When using anyerror, you might be tempted to do else => unreachable because you know these cases can’t happen – you should assert this with the type system instead, so you get a compile error if that invariant is violated!
- It makes the optimizer’s job harder. If we tell the optimizer that only “success” and error.A are possible, it can easily omit an extra comparison; similarly for any finite set of errors. For instance, if your program has hundreds of errors but only a few are possible, the optimizer can in the worst case omit many comparisons, and in the best case perhaps use something like a jump table, when switching on the error.
- It’s viral! Since Zig code very often uses try to bubble some errors far up a program (perhaps even all the way to main), the problems listed above typically apply recursively, to every caller of this function, the moment you introduce one instance of anyerror. You can ultimately end up losing error safety for a huge chunk of your program just because you didn’t want to list the errors for one function.

dee0xeed · August 9, 2024, 10:22pm

see also this

mnemnion · August 9, 2024, 10:59pm

This reminds me of something I noticed recently. I’ve written code where a Writer of some sort is passed in using anytype, and when I was filling in the error types in that library, the compiler wasn’t able to infer an error set, which makes sense: if it doesn’t know the type of the parameter which has a try call, it can’t infer anything about what happens there in a generalizable way.

But something I’m working on just recently, a couple of days ago, has an init function which takes a writer type, because it needs to have a field to hold the writer, and then it uses that writer later. As it happens, my tests use an ArrayList.writer, at least so far, and so when I was decorating functions with error sets, it accepted error{OutOfMemory} as the set for that function.

I figured out later that this will probably break as soon as I pass in a different type, but it left me wondering: what happens to the error set for a function which gets specialized this way? Do the specialized functions get differently-inferred error sets, or does the whole function take a superset of all the concrete types’ error sets?

I think I found the answer in CountingWriter, which takes the error set off the WriterType which it’s passed and then uses that error set to decorate the return value of the member functions. That seems to imply that each function is specialized to a different error set and that therefore, this is the technique to use if the goal is to provide an explicit error set which is dependent on the type provided.

I bet this is even something which could be done with type reflection on functions which take an anytype writer, although it might be somewhat ponderous to write out the block necessary to perform that type reflection and extract the error from it.

If that last bit is also correct, then it should be possible to always provide an error set for a given function, which isn’t anyerror, even if it involves a big ol’ block of comptime reflection and error set unions and such.
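A rough sketch of the per-type error set idea, using made-up types rather than the real std writers: `FixedSink` and `Counting` are hypothetical, and the only assumption borrowed from std is the convention that the wrapped type exposes a public `Error` declaration.

```zig
const std = @import("std");

// Hypothetical sink that declares its own error set as a public decl.
const FixedSink = struct {
    pub const Error = error{NoSpaceLeft};

    buf: [8]u8 = undefined,
    len: usize = 0,

    pub fn write(self: *FixedSink, bytes: []const u8) Error!void {
        if (self.len + bytes.len > self.buf.len) return error.NoSpaceLeft;
        @memcpy(self.buf[self.len..][0..bytes.len], bytes);
        self.len += bytes.len;
    }
};

// Hypothetical wrapper: each instantiation reuses the error set of the
// sink type it was given, so the set is explicit yet depends on the type.
fn Counting(comptime Sink: type) type {
    return struct {
        sink: *Sink,
        count: usize = 0,

        pub const Error = Sink.Error;
        const Self = @This();

        pub fn write(self: *Self, bytes: []const u8) Error!void {
            try self.sink.write(bytes);
            self.count += bytes.len;
        }
    };
}

pub fn main() void {
    var sink = FixedSink{};
    var counting = Counting(FixedSink){ .sink = &sink };

    counting.write("hello") catch unreachable; // 5 bytes fit in the 8-byte buffer
    counting.write("world!") catch |err| switch (err) {
        // Exhaustive: Counting(FixedSink).Error is exactly error{NoSpaceLeft}.
        error.NoSpaceLeft => std.debug.print("sink full after {d} bytes\n", .{counting.count}),
    };
}
```

Instantiating `Counting` with a different sink type would give its `write` a different explicit error set, with no union across instantiations.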

dee0xeed · August 9, 2024, 11:08pm

‘Reader/Writer’ interfaces in the Zig stdlib are awful (IMHO),
because the UNIX “everything is a file” abstraction is already
quite a “high level” abstraction.

Well, BSD sockets were not very well designed
(a special API to create one (socket, not open(/dev/some/tcp)),
and then the very special bind, setsockopt, etc.).

But POSIX write() and read() work for almost everything.
Why these not-so-easy-to-grasp reader/writer interfaces in Zig?

How many “objects/things” in the Zig stdlib use this kind of (very weird) “interface”?

AndrewCodeDev · August 10, 2024, 1:00am

It’s also worth pointing out that anyerror is useful on function pointers where you cannot infer the error set. As was mentioned, I discourage this design paradigm in general, but that’s another use case.

Tosti · August 10, 2024, 12:41pm

I don’t understand the problem. The type of an anytype function parameter is resolved at comptime, and the compiler instantiates the function for this type. It can infer a return error set.

For example, this compiles and runs as expected (prints A\nB\n). In this example the compiler produces 2 instantiations of g, one for A (with return type error{A}!void), and another for B (with return type error{B}!void).

const std = @import("std");

const A = struct {
    fn f(_: A) !void {
        return error.A;
    }
};

const B = struct {
    fn f(_: B) !void {
        return error.B;
    }
};

fn g(p: anytype) !void {
    try p.f();
}

pub fn main() void {
    g(A{}) catch |e| switch(e) {
        error.A => std.debug.print("A\n", .{}),
    };
    g(B{}) catch |e| switch(e) {
        error.B => std.debug.print("B\n", .{}),
    };
}

dee0xeed · August 10, 2024, 2:14pm

I’m not sure but I guess anytype has nothing to do with it.
Consider the following example, an old-fashioned interface:

const log = @import("std").debug.print;

const Interface = struct {
    methodImpl: *const fn(i: *Interface) !void,
    //methodImpl: *const fn(i: *Interface) anyerror!void,

    fn method(i: *Interface) !void {
        return i.methodImpl(i);
    }
};

const Implementor = struct {
    i: Interface,
    d: u32,

    fn init(d: u32) Implementor {
        return .{
            .d = d,
            .i = .{.methodImpl = myMethodImpl},
        };
    }

    fn myMethodImpl(i: *Interface) !void {
        const me: *Implementor  = @fieldParentPtr("i", i);
        log("d = {}\n", .{me.d});
        return error.SomeError;
    }
};

pub fn main() !void {
    var a = Implementor.init(11);
    try a.i.method();
}

Try to compile it and you will get

main.zig:6:42: error: function type cannot have an inferred error set
    methodImpl: *const fn(i: *Interface) !void,

Add anyerror and it will work:

$ ./main 
d = 11
error: SomeError
/home/zed/2-coding/zig-lang/anyerror/main.zig:28:9: 0x1036feb in myMethodImpl (main)
        return error.SomeError;
        ^
/home/zed/2-coding/zig-lang/anyerror/main.zig:10:9: 0x10354c9 in method (main)
        return i.methodImpl(i);
        ^
/home/zed/2-coding/zig-lang/anyerror/main.zig:34:5: 0x1035609 in main (main)
    try a.i.method();
    ^

mnemnion · August 10, 2024, 2:33pm

Let’s look at what I said real quick:

So I had to leave the return value as !void, because, since the anytype wasn’t fixed for that function, the compiler was unable to infer a specific error set which I could tag the function with, in the code. It might have been better to say “synthesize” rather than “infer” here.

It seems that I didn’t use the word ‘problem’ at any point, so I also don’t see what the problem is here.

The interesting part is where I was passing a specific writer type into a function returning a struct type, and the compiler let me tag the error set with the only error set which I had happened to instantiate. This led me to wonder if I could (even though it’s a bad idea) just keep adding to the superset of every instantiated type’s error set for .write, or if the compiler is going to be seeing two mutually-incompatible sets at some point and I’ll need to do something else.

I’ll be working further on that code today, so I’ll have a chance to find out.

Also, yesterday I misread what was going on with the .Error field used in CountingWriter and some other places in std, it’s an explicit const declaration of the errors used as part of the GenericWriter interface.

This has me wondering, again, what would be involved in extracting the error sets from a given named decl function for a provided type, to have a comptime-synthesized error set which is correct for that type. I haven’t written any serious comptime introspection on the type of a member function, so I don’t know the answer. I’ve found some posts about using comptime introspection for validating interfaces, so I might start by finding those links and re-reading them.

The compiler likes to hand back types like '@typeInfo(@typeInfo(@TypeOf(example.test.foo debar.FooMe.fooTheBar)).Fn.return_type.?).ErrorUnion.error_set' when it can’t reduce an error set to something it can express as error{A,B,C}, so I expect that the move is to work backward from there.