Custom errors with payloads (with current zig)

Somewhere I read about users wanting errors with payloads, then I read about Zig's destructuring syntax for tuples, and I began wondering whether destructuring could help in creating custom errors with payloads. I nerd-sniped myself :sweat_smile:
So I began an exploration (procrastination) journey…

Here is what I came up with: custom errors are structs like this:

const UnexpectedToken = struct {
    expected: Token,
    got: Token,
    location: Location,

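    // Convenience constructor; a plain struct literal works just as well.
    pub fn init(expected: Token, got: Token, location: Location) @This() {
        return .{ .expected = expected, .got = got, .location = location };
    }
    // The one essential method: maps this payload type to its Zig error code.
    pub inline fn err(_: @This()) !void {
        return error.UnexpectedToken;
    }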
    pub fn format(
        self: UnexpectedToken,
        comptime fmt: []const u8,
        options: std.fmt.FormatOptions,
        writer: anytype,
    ) !void {
        _ = fmt;
        _ = options;
        try writer.writeAll("UnexpectedToken\n");
        try writer.print("    expected: {}\n", .{self.expected});
        try writer.print("    got: {}\n", .{self.got});
        try writer.print("    location: {}\n", .{self.location});
    }
};

The only really essential method is err; the rest is just fluff to make initialization and formatting nicer.

How do you write a function returning custom errors?

const payload = customerrors.Payload(.{});
const TokenOrFile = customerrors.Union(.{ UnexpectedToken, FileFetchError });
const NumbersErrors = customerrors.Union(.{ customerrors.AllocationError, TokenOrFile });
const Numbers = payload.Error(*Node, NumbersErrors);
fn parseNumbers(self: *Parser) Numbers.res {
    const choice = self.rng.random().uintLessThan(u16, 10);
    if (choice == 0) {
        return Numbers.fail(UnexpectedToken.init(.NUMBER, .SOMETHING_NOT_ALLOWED_IN_PARENS, .{
            .file = fakefile,
            .line = 10,
            .column = 33,
        }));
    } else if (choice == 1) {
        return Numbers.fail(FileFetchError{ .file = fakefile, .pos = 42 });
    } else {
        const node = self.randomNodeAllocFail() catch {
            return Numbers.fail(customerrors.AllocationError{ .src = @src() });
        };
        node.data = self.rng.random().uintLessThan(u16, 1000);
        return Numbers.success(node);
    }
}

Here customerrors.Union combines N different errors into a union so we can return any of them. We have to do this explicitly because we don’t have a way to infer it, but we can combine unions into bigger unions (they get flattened).
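To make the idea of a union of payload types concrete, here is a rough hand-written stand-in for the kind of type customerrors.Union might generate. This is only a sketch under assumptions: the field names, the trimmed-down payload structs and the err dispatch are made up for illustration and are not taken from the actual implementation.

```zig
const std = @import("std");

// Trimmed-down payload types (hypothetical fields, for illustration only).
const UnexpectedToken = struct {
    line: u32,
    pub fn err(_: UnexpectedToken) error{UnexpectedToken}!void {
        return error.UnexpectedToken;
    }
};

const FileFetchError = struct {
    pos: usize,
    pub fn err(_: FileFetchError) error{FileFetchError}!void {
        return error.FileFetchError;
    }
};

// Hand-written equivalent of a generated union of both payload types.
const TokenOrFile = union(enum) {
    unexpected_token: UnexpectedToken,
    file_fetch: FileFetchError,

    // Forward to whichever payload is active to obtain the Zig error code.
    pub fn err(self: TokenOrFile) error{ UnexpectedToken, FileFetchError }!void {
        switch (self) {
            inline else => |active| return active.err(),
        }
    }
};
```

With predictable field names like these, callers could also switch on the union directly instead of going through err.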

payload.Error(*Node, NumbersErrors) declares a result with a payload of type *Node and a custom error (union) NumbersErrors. Numbers is a struct whose functions form the interface for writing functions with custom errors. res is the result type, which is always a 2-element tuple (kind of like in Go, but try makes it nicer?): the first element is the function's result (only defined on success) and the second is an optional of the custom-error type. Simplified, success returns .{value, null} and fail returns .{undefined, custom-error}. However, fail constructs the union instance automatically based on the given type, and it can “disable” custom errors (e.g. based on build flags), in which case it always returns OpaqueError, which carries no more information than a Zig error code.
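Stripped of the generic machinery, the 2-tuple convention might look roughly like the following. This is a minimal sketch, not the library code: ParseFailure, Result, success, fail, parseDigit and printDigit are all hypothetical names, and there is no union or "disable" logic here.

```zig
const std = @import("std");

// Hypothetical payload-carrying error.
const ParseFailure = struct {
    column: u32,

    pub fn err(_: ParseFailure) error{ParseFailure}!void {
        return error.ParseFailure;
    }
};

// Result convention: .{ value, optional custom error }.
const Result = struct { u16, ?ParseFailure };

fn success(value: u16) Result {
    return .{ value, null };
}

fn fail(e: ParseFailure) Result {
    // The value slot is undefined on failure, so callers must check the error first.
    return .{ undefined, e };
}

fn parseDigit(c: u8) Result {
    if (c < '0' or c > '9') return fail(.{ .column = 1 });
    return success(c - '0');
}

// Call site: destructure the tuple, then handle the optional error.
fn printDigit(c: u8) error{ParseFailure}!void {
    const digit, const maybe_err = parseDigit(c);
    if (maybe_err) |e| return e.err();
    std.debug.print("digit: {d}\n", .{digit});
}
```

Nothing here stops a caller from reading digit before checking maybe_err, which is the main trade-off of the tuple form.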

How do you use a function?

fn parse(self: *Parser) !void {
    const node1 = try payload.unwrap(self.parseNumbers());

    const node2, const err = self.parseNumbers();
    try payload.check(err);

    const node3, const err2 = self.parseNumbers();
    payload.custom(err2) catch |e| {
        std.debug.print("the custom error was:\n{}\n", .{err2.?});
        std.debug.print("the error code is: {}\n", .{e});
        std.debug.dumpCurrentStackTrace(null); // TODO better stack trace
        return e;
    };

    const node4, const err3 = self.parseNumbers();
    if (err3) |custom_error| {
        std.debug.print("using if on the optional:\n{}\n", .{custom_error});
        std.debug.dumpCurrentStackTrace(null); // TODO better stack trace
        return custom_error.err();
    }

    // use nodes
    std.debug.print("{} {} {} {}\n", .{ node1, node2, node3, node4 });
}

Here payload.unwrap takes the 2-tuple returned from parseNumbers. On success it returns the success value; on failure it prints the custom error and fails with the Zig error code. So unwrap consumes the custom error information in a predefined way (it might make sense to have a customization function for this) and converts it to a Zig error.

payload.check(err) takes the custom error: on failure it prints the custom error and fails with the Zig error; otherwise it does nothing.

payload.custom(err2) fails with the Zig error on failure without printing anything; otherwise it does nothing.

You can also use if directly on the optional custom error value.
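For a single concrete error type, the three helpers described above could be approximated like this. It is a simplified sketch under assumptions: Failure and Result are made-up types, and the real helpers are presumably generic over the payload and error types rather than hard-coded.

```zig
const std = @import("std");

// Hypothetical custom error with a small payload.
const Failure = struct {
    code: u8,

    pub fn err(_: Failure) error{Failure}!void {
        return error.Failure;
    }
};

const Result = struct { u32, ?Failure };

// Like payload.custom: convert to the Zig error silently.
fn custom(maybe: ?Failure) error{Failure}!void {
    if (maybe) |e| return e.err();
}

// Like payload.check: print the payload, then convert to the Zig error.
fn check(maybe: ?Failure) error{Failure}!void {
    if (maybe) |e| {
        std.debug.print("custom error: code={d}\n", .{e.code});
        return e.err();
    }
}

// Like payload.unwrap: check the error part, then hand back the value.
fn unwrap(result: Result) error{Failure}!u32 {
    const value, const maybe = result;
    try check(maybe);
    return value;
}
```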


There are a bunch of things that could be explored further here:

  • when custom errors are “disabled”, how much of the code really disappears from what gets compiled into the program? I haven’t looked into this…
  • in functions that use custom errors can we capture stack trace information properly and print one unified stack trace that looks similar to a zig error just with more information?
  • we are converting custom errors to zig errors, maybe the other way around also makes sense sometimes?
  • what about resources being associated with errors?
  • should the generated union fields have better names?
  • switching on the custom error union would be better with predictable field names!?!

Here is my sketch of the idea:

This topic is about what can be done with current Zig; however, here is a link to an issue about adding a feature to the language: Allow returning a value with an error · Issue #2647 · ziglang/zig · GitHub


I currently don’t have a use case for errors with payloads (maybe in the future when I revisit my interpreter). Do you use custom errors of some kind? What are your use cases? What features do you miss? Is this useful?

If this idea is useful, maybe we can work together on creating a polished version of it that can be used as a library.

Could you explain why you decided to use Go/Odin-like 2-tuples instead of making the actual payload a part of the union? Making the payload a part of the union prevents accidentally using it without checking for an error. This is how built-in error unions work.

@Tosti I had the same question - maybe partial success is an option here? You could have another union member representing that too, though.

@Sze The only case I’ve personally worked on that had this functionality was sending network requests (my experience is limited here). I wanted information about why something failed, but that had to come from an outside source. I can see the benefit if you do not have all the information about what went wrong locally and you need to handle things further on your end based on that information. What I’m describing here is more like a response, though. Theoretically, you could have a unique integer identifying just about every situation you can think of, but the combinatorics of that can be awful if you have the potential for multiple independent errors.

On a more language-level note, it depends on how you interpret the word “error” as a construct in the language. Does it have to require the use of a keyword or can a struct with a string and a bool do the job? If you program a system around that, you can certainly treat those like errors. Maybe what people are more worried about here is that if it’s not a fundamental part of the language (like a keyword) then people won’t use them?

A use case where I wished for error payloads was parsing JSON. The error currently doesn’t contain any context on where and why parsing failed. (There is a way to accomplish the same by passing around a parsing context, but it’s not as handy as the error containing this info.)

If it proves to be useful and many people start to use it, everyone will come up with their own solution. One project will have its own generic to store errors with custom payloads, another project will have its own. It will be a pain to build an interface between the two.

That definitely happens with the C-family of languages. People throw everything including the kitchen sink. You have to catch ints, chars, basically whatever someone decided to throw that day… especially in older code. It was bad and I’ve seen YouTube tutorials promoting this same idea.

I think part of the answer is that I haven’t thought that much about it. It felt like there might be an idea here worth exploring, but I couldn’t tell what the details would look like just by trying to imagine all the implications, so I decided to explore it with code.
Another part is probably that I had seen this Odin vs Zig video before:

Does it? To me this seems similar:

const node_tuple = self.parseNumbers();
// have to pry open node_tuple to get to data

const node_result = self.parseNumbersMadeUpUnionVersion();
// have to pry open node_result to get to data

I think the one benefit of the latter might be if you always unpack the thing with switch statements. The things I like about the former are that you can easily say what is the result and what is the error, instead of the result just being one of the n cases. Also I like the idea of potentially having many different functions being able to use the same type for the error part.
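For comparison, a union-based result that is always unpacked with switch could be sketched as follows. All names here (ParseResult, Failure, parseDigit, describe) are made up for illustration; this is not code from the sketch linked above.

```zig
const std = @import("std");

// Hypothetical error payload.
const Failure = struct { column: u32 };

// Success value and error payload live in the same tagged union.
const ParseResult = union(enum) {
    ok: u32,
    failure: Failure,
};

fn parseDigit(c: u8) ParseResult {
    if (c < '0' or c > '9') return .{ .failure = .{ .column = 1 } };
    return .{ .ok = c - '0' };
}

// The switch must cover every case, so the failure branch cannot be skipped silently.
fn describe(result: ParseResult) []const u8 {
    return switch (result) {
        .ok => "ok",
        .failure => "failure",
    };
}
```

In this form the value is only reachable through the .ok capture, at the cost of tying the error type to one specific success type.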

Other things I wonder about but haven’t actually looked into are:

  • does the compiler do clever things for tuples being returned from functions, like treat them as if they were independent output parameters and optimize them individually, if that may be helpful?
  • what happens when I have a really big error type but a tiny success result or the other way around, can the tuple be optimized better?
  • if I “disable” these custom errors (reduce them to wrappers of Zig errors; the example has a flag -Dcustomerrors=false), can I get rid of the overhead of those errors, or get close to it? Would this be more difficult if it was all one union, or just different?
  • are there cases where you can write error handling functions that can be used for multiple functions that return that error type (where having the success result mixed into the union would prevent reuse)?

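One way the "disable" switch could work is a comptime-known flag that selects between a detailed payload and an empty one. The sketch below hard-codes the flag; in a real build it would presumably come from a build option such as the -Dcustomerrors flag mentioned above. DetailedError, OpaqueError (reusing the name from the text, but with a made-up definition), ErrorPayload and makeError are assumptions for illustration.

```zig
const std = @import("std");

// Assumption: hard-coded here, normally supplied by the build system.
const custom_errors_enabled = false;

const DetailedError = struct {
    line: u32,
    column: u32,

    pub fn err(_: DetailedError) error{ParseFailed}!void {
        return error.ParseFailed;
    }
};

// Carries no data, only the mapping to the Zig error code.
const OpaqueError = struct {
    pub fn err(_: OpaqueError) error{ParseFailed}!void {
        return error.ParseFailed;
    }
};

// Selected at compile time; only the chosen type ends up in function signatures.
const ErrorPayload = if (custom_errors_enabled) DetailedError else OpaqueError;

fn makeError(line: u32, column: u32) ErrorPayload {
    if (custom_errors_enabled) {
        return .{ .line = line, .column = column };
    } else {
        return .{};
    }
}
```

With the flag off, the payload type has size zero, which at least removes the data overhead; how much of the surrounding code disappears would still need to be checked in the generated output.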
Another thing: combining error unions makes sense to me, but how do I combine two things where one of the values isn’t an error but the success value? Then, when I want to convert back to the Zig error, I need some kind of convention or flag that tells me whether to return the error code or succeed. In a way you could argue that putting everything in one union is worse, because it puts everything in the same code path instead of putting special attention on what you are doing with the error, but I am not sure that is a strong argument. On the flip side, you could say that the switch at least complains about unhandled cases, and if you use it for both data and error, you are more likely to handle the error. So I don’t know.

It seems to me that without language support for these errors, you can’t really prevent someone from forgetting about an error (sure, there are unused variable errors, but if someone’s LSP just fixes that error away automatically, you could still forget about handling it).


Another thing I wanted to explore eventually is whether you can build up more complex error types as you go up the call stack and collect more information. For that, it seemed it might be easier to take apart and reassemble that 2-tuple than to deal with something more “complex”/structured.

But I also kind of like the pattern shown in the GitHub issue: just creating a struct at the call site and passing a reference to it via a config argument. Sure, it seems very ad hoc, but I also like its simplicity.
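A minimal sketch of that pattern, assuming a made-up parser: the caller owns a Diagnostics struct and passes an optional pointer to it through an options argument, and the function fills it in just before returning a plain Zig error. Diagnostics, Options and sumDigits are hypothetical names.

```zig
const std = @import("std");

// Filled in by the callee only when an error is returned.
const Diagnostics = struct {
    line: usize = 0,
    column: usize = 0,
};

const Options = struct {
    // Optional so callers who do not care pay nothing at the call site.
    diagnostics: ?*Diagnostics = null,
};

// Sums the decimal digits in text; newlines are allowed and start a new line.
fn sumDigits(text: []const u8, options: Options) error{InvalidCharacter}!u32 {
    var total: u32 = 0;
    var line: usize = 1;
    var column: usize = 1;
    for (text) |c| {
        if (c == '\n') {
            line += 1;
            column = 1;
            continue;
        }
        if (c < '0' or c > '9') {
            // Record where parsing failed before returning the plain error code.
            if (options.diagnostics) |d| d.* = .{ .line = line, .column = column };
            return error.InvalidCharacter;
        }
        total += c - '0';
        column += 1;
    }
    return total;
}
```

A caller that wants the context declares var diag = Diagnostics{}; and passes .{ .diagnostics = &diag }; everyone else passes .{} and uses try as usual.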