What is the idiomatic way to include additional information with errors?

Hi,

I have a function that checks if an expected token is found or not. In case the token is not found, it returns an error.

However it would be nice to also include additional information with the error, specifically,

  • At which token index the failure happened?
  • What was the expected token that we wanted to find?
  • What token did we actually find?
  • etc.

In Rust, the error itself would be a sum type enum (tagged union) where we could include the information, but in Zig we can’t do that.

The only alternative that seems possible is to explicitly pass a pointer to the functions to extract the error data.

Thanks.

fn matchTokenExact(tokens: []const Token, expect: TokenKind, itoken: usize) ParserError!usize {
    const it = matchToken(tokens, expect, itoken) orelse {
        return ParserError.TokenDidNotMatch; // <<< I Want to add information to the error
    };
    return it;
}

PS: This could be solved by allowing errors to be tagged unions in a future version of the language, like Rust. If heap-allocated data cannot be returned, at least allow stack-allocated data.

2 Likes

While I completely agree with the control flow of errors in Zig, I sometimes am looking for a neat way to give details too.
Curious about the best way!

The term you’re looking for is “Diagnostics”. See here for a multitude of threads on best idioms for this: Search results for 'Diagnostics' - Ziggit

5 Likes

Here’s the relevant (rejected) proposal:

2 Likes

Preface: I’ve never worked on a Zig application that actually needed to report errors, so this all is theoretical.

In my mental model of Zig error handling, the key observation is that “error codes” and “diagnostics” are separate. What Zig gives you via try and error are error codes, which are control flow constructs. Errors affect control flow in two ways:

  • Most code just try-propagates them, running defer/errdefer, where we only care about one bit of info – is it an error?
  • The caller might make a decision and handle the error. There’s going to be only a statically finite amount of different branches that the caller can take, and so it is sufficient to return just a single number for branching in the error handler.

This is more or less everything that Zig error handling does for you.
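
To make the two cases concrete, here is a minimal sketch. The error set and the functions (ParseError, parseDigit, parseTwoDigits, parseOrDefault) are made up for illustration and are not from any library:

```zig
const std = @import("std");

// Hypothetical error set and functions, for illustration only.
const ParseError = error{ UnexpectedToken, UnexpectedEof };

fn parseDigit(input: []const u8) ParseError!u8 {
    if (input.len == 0) return error.UnexpectedEof;
    if (!std.ascii.isDigit(input[0])) return error.UnexpectedToken;
    return input[0] - '0';
}

// Case 1: propagate. Only "did it fail?" matters here.
fn parseTwoDigits(input: []const u8) ParseError!u8 {
    const tens = try parseDigit(input);
    const ones = try parseDigit(input[1..]);
    return tens * 10 + ones;
}

// Case 2: handle. The caller branches over a finite set of error codes.
fn parseOrDefault(input: []const u8) u8 {
    return parseTwoDigits(input) catch |err| switch (err) {
        error.UnexpectedEof => 0,
        error.UnexpectedToken => 255,
    };
}
```

Note that nothing in either path carries a position or an "expected vs. found" pair; the error value is only a code to branch on.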

Then, there’s a separate task of reporting the errors to the user, and here you have to do it yourself; I don’t think Zig provides any built-in mechanism. You have two cases:

  • You know how the final error is going to be represented (a diagnostic in the terminal, an HTTP 500, etc). Here, instead of returning an error, you could immediately produce it in its final form. E.g., just write formatted output in English to stderr.
  • You don’t know the final destination of an error. Here, you’ll need to store a fully structured representation of the error somewhere. For lexers, this could mean something like:
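const Lexer = struct {
    source: []const u8,
    offset: usize,

    // Keep one of the two fields below, depending on whether all errors are fatal:
    last_diagnostic: ?Diagnostic, // fatal: only the most recent error matters
    diagnostics: std.ArrayList(Diagnostic), // non-fatal: collect every error

    const Diagnostic = struct { offset: u32, expected: Token, got: Token };
};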
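
As a rough sketch of how the "last diagnostic" variant might be used: the version below is simplified to walk a slice of already-made tokens, and the Token enum and the expect method are assumptions made for this example, not part of the struct above.

```zig
const std = @import("std");

// Hypothetical token type, for illustration only.
const Token = enum { ident, number, l_paren, r_paren, eof };

const Lexer = struct {
    tokens: []const Token,
    index: usize = 0,
    // Filled in right before an error is returned; null until then.
    last_diagnostic: ?Diagnostic = null,

    const Diagnostic = struct { index: usize, expected: Token, got: Token };

    fn expect(self: *Lexer, expected: Token) error{UnexpectedToken}!void {
        const got: Token = if (self.index < self.tokens.len) self.tokens[self.index] else .eof;
        if (got != expected) {
            // Store the details first, then return the plain error code.
            self.last_diagnostic = Diagnostic{
                .index = self.index,
                .expected = expected,
                .got = got,
            };
            return error.UnexpectedToken;
        }
        self.index += 1;
    }
};
```

Whoever catches error.UnexpectedToken can then read lexer.last_diagnostic and decide how to present it.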
9 Likes
3 Likes

Someone made a tutorial that uses optional struct fields to allow error messages.

I don’t really need to return string messages, however maybe we can use this method to create an ArrayList(ErrorInfo) to add additional information to the error.

Rather complicated but will work.

I’ve been struggling with this myself this week. I started by wanting to pass an errorHandler(string) down, but Zig is missing lambdas for that.

Then I thought I’d try the built-in error types, but they don’t seem to be unique, and I was losing context unless they all had silly names.

So then I tried handling them locally (think 5 steps that can all generate the same error), but it seems that try only pushes errors to the caller, and I couldn’t work out how to create some kind of local catch block.

Using errdefer as a common local catch block also didn’t work, because it seems the error is already being pushed to the caller at that point. The only thing I could think of there was some kind of ugly wrapper method in between caller and callee to consume the error, yuk.

Anyway, I think I found something that works, at least for me: @import("root") where the error is being generated, and then call a global logger from main.zig to capture some error context before the stack unwinds.

I just started learning Zig as well (last week). And I, too, am perplexed that errors cannot propagate contextual information upwards when one occurs. As a Python and Go programmer, this is proving challenging to wrap my head around.

It was so unexpected especially after learning about tagged unions, which are wonderful. When I went to define my first error, I suddenly realized I was unable to use them in errors.

Sure, in many cases, this info is not being used for additional control flow, but rather just going to a log or to the screen. But I’m having a hard time giving myself approval to print something to the screen in a library function.

“Who am I, as a library author, to dare print something to my caller’s screen?” What if it’s a TUI? I’ve been conditioned to think that is just as bad as calling exit(1) in a library function.

How does it report compilation errors then?

Take a look at std.log. You can log an error message in your library, but the user still has full control over how error messages are handled.

You’ll want to create a scoped log for your library, so the user can separate messages from your library from their own.
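
A minimal sketch of what that could look like. The scope name .my_parser and the parsePort function are hypothetical; std.log.scoped and std.fmt.parseInt are the only std calls used:

```zig
const std = @import("std");

// `.my_parser` is a hypothetical scope name; pick one that identifies your library.
const log = std.log.scoped(.my_parser);

pub fn parsePort(text: []const u8) error{InvalidPort}!u16 {
    return std.fmt.parseInt(u16, text, 10) catch |err| {
        // The library only emits the message; the application decides
        // where (or whether) it ends up.
        log.err("invalid port '{s}': {s}", .{ text, @errorName(err) });
        return error.InvalidPort;
    };
}
```

The application can filter on the scope or install its own log function through std_options in its root source file, so a TUI is free to route these messages somewhere other than the terminal.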

2 Likes

The various parsers in std and the clap library are good examples of how to do this.

The caller can provide a pointer to a type to hold contextual information, then they can do whatever they want with it when they get an error.
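
Applied to the function from the original question, that might look roughly like the sketch below. The Diagnostics struct and the extra diag parameter are assumptions for illustration, and the token slice is simplified to hold TokenKind values directly:

```zig
const std = @import("std");

// Hypothetical types, for illustration only.
const TokenKind = enum { ident, number, comma };

const Diagnostics = struct {
    index: usize = 0,
    expected: TokenKind = .ident,
    found: ?TokenKind = null, // null means end of input
};

fn matchTokenExact(
    tokens: []const TokenKind,
    expect: TokenKind,
    itoken: usize,
    diag: ?*Diagnostics, // pass null if you do not care about the details
) error{TokenDidNotMatch}!usize {
    const found: ?TokenKind = if (itoken < tokens.len) tokens[itoken] else null;
    const matched = if (found) |f| f == expect else false;
    if (matched) return itoken + 1;

    // Fill in the caller's struct (if any) before returning the error code.
    if (diag) |d| {
        d.* = Diagnostics{ .index = itoken, .expected = expect, .found = found };
    }
    return error.TokenDidNotMatch;
}
```

Making the pointer optional keeps the call cheap for callers that only want the error code.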

1 Like

The typical convention is to keep a field for a diagnostic message or data, possibly nullable. On error, you set the info string and propagate the error up. Any user that catches the error should know that the details are at a separate location.

That’s also essentially what C libraries do.

It’s preferable to use a defined type instead of a string, it’s more flexible and potentially, depending on what you’re doing, easier to manage its memory.

In what sense? I can’t think of how a strong error type helps with flexibility and memory management. You can define the error message as a new type if you want to, but that doesn’t invalidate the use of error codes for control flow.

Or are you thinking of Zig automatically freeing up memory for your error object?

We are talking about storing context in addition to Zig’s errors when it’s needed.

Say you are making a JSON parser: when a parsing error occurs, you want to be able to tell the caller the line and column of the error.

Your suggestion of a string requires an arbitrary amount of memory and therefore allocation, which you can push onto the caller. Allocation is also fallible, which you don’t want when you are in the process of returning an error.

The caller then needs to parse the string you provided if it wants to do anything other than present it raw to the user. Maybe you provide an API to make that less error-prone, but it’s a lot of work compared to storing two numbers.
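
A small sketch of the "two numbers" idea, with made-up names (ParseDiagnostic, locate, render). The struct needs no allocator, and a message can still be produced later, outside the error path, into a buffer the caller owns:

```zig
const std = @import("std");

// Hypothetical diagnostic for a JSON-like parser: two numbers, no allocation.
const ParseDiagnostic = struct {
    line: u32 = 0,
    column: u32 = 0,

    // Rendering is optional and done by the caller into its own buffer.
    fn render(self: ParseDiagnostic, buf: []u8) ![]u8 {
        return std.fmt.bufPrint(buf, "syntax error at {d}:{d}", .{ self.line, self.column });
    }
};

// Hypothetical helper: derive 1-based line and column from a byte offset.
fn locate(source: []const u8, offset: usize) ParseDiagnostic {
    var diag = ParseDiagnostic{ .line = 1, .column = 1 };
    for (source[0..offset]) |byte| {
        if (byte == '\n') {
            diag.line += 1;
            diag.column = 1;
        } else {
            diag.column += 1;
        }
    }
    return diag;
}
```

A caller that wants to highlight the position in an editor reads line and column directly; a caller that only wants text calls render.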

3 Likes

Tbh, I keep thinking about why we can’t have proper error payloads when we already have all the ingredients (tagged-unions and optionals).

What is the current limitation that prevents error sets from becoming tagged unions where the error code is the tag? Is it the automatically inferred global error set, or is it some obscure syntax detail?

How would an error result with a per-error-type payload be different from a tagged-union-style error set?

E.g. instead of:

const FileOpenError = error{
    AccessDenied,
    OutOfMemory,
    FileNotFound,
};

Something like this:

const FileOpenError = error{
    AccessDenied: { a: i32, b: i32, c: i32 },
    OutOfMemory: { d: i32 },
    FileNotFound: void,
};
4 Likes

It’s not a technical limitation, it’s a deliberate separation of error control flow and diagnostic information.

If you want to, you can make a result tagged union like rust, though you will sacrifice some convenience zig gives you with its built-in functionality.
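
For comparison, a hand-rolled result union could look something like this sketch. All names (MatchResult, Mismatch, matchTokenResult, matchPair) are hypothetical:

```zig
const std = @import("std");

// Hypothetical types, for illustration only.
const TokenKind = enum { ident, number, comma };

const Mismatch = struct { index: usize, expected: TokenKind, found: ?TokenKind };

const MatchResult = union(enum) {
    ok: usize, // index of the next token
    mismatch: Mismatch,
};

fn matchTokenResult(tokens: []const TokenKind, expect: TokenKind, itoken: usize) MatchResult {
    const found: ?TokenKind = if (itoken < tokens.len) tokens[itoken] else null;
    const matched = if (found) |f| f == expect else false;
    if (matched) return MatchResult{ .ok = itoken + 1 };
    return MatchResult{
        .mismatch = Mismatch{ .index = itoken, .expected = expect, .found = found },
    };
}

// No `try` here: every caller has to switch on the union and forward it by hand.
fn matchPair(tokens: []const TokenKind) MatchResult {
    const next = switch (matchTokenResult(tokens, .ident, 0)) {
        .ok => |i| i,
        .mismatch => |m| return MatchResult{ .mismatch = m },
    };
    return matchTokenResult(tokens, .comma, next);
}
```

The payload travels with the failure, but try, catch, errdefer, and error return traces no longer apply to it, which is the convenience being given up.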

While the needs of diagnostic information vary wildly, error control flow doesn’t.

1 Like

…which is the whole point of using Zig’s builtin error handling. Without that it’s just fancy return error codes.

While the needs of diagnostic information vary wildly, error control flow doesn’t

In that case the payloads would all be void which would be fully equivalent with the current error code system.

I just don’t see returning error information through a separate side channel (like an out-pointer or a diagnostic object) as a useful programming concept. That was already a bad idea in the C era: see C’s errno (keeping diagnostics separate from the error looks a lot like “errno 2.0”), or OpenGL’s KHR_debug extension, which is also pretty close to the diagnostic idea. These are all a PITA to work with.

4 Likes

This is the pattern I currently use. I made some changes in practice, such as always using ArenaAllocator to allocate memory.

pub const Diagnostics = struct {
    arena: std.heap.ArenaAllocator,
    error_stack: std.ArrayListUnmanaged(Error) = .empty,
    last_diagnostic: Diagnostic = undefined,
    double_error: ?anyerror = null,
    pub const Error = struct {
        code: anyerror,
        diagnostic: Diagnostic,
    };
    pub fn clear(self: *Diagnostics) void {
        _ = self.arena.reset(.free_all);
        self.error_stack = .empty;
        self.last_diagnostic = undefined;
        self.double_error = null;
    }
    pub fn log_all(self: *Diagnostics, last_error: ?anyerror) void {
        if (last_error) |err| {
            if (self.double_error) |double_error| {
                std.log.err("double error: {s}", .{@errorName(double_error)});
            }
            self.last_diagnostic.log(err);
            var it = std.mem.reverseIterator(self.error_stack.items);
            while (it.nextPtr()) |item| {
                item.diagnostic.log(item.code);
            }
        } else return;
    }
};

pub const Diagnostic = union {
    DiagnosticGIT_ERROR: c_helper.DiagnosticGIT_ERROR,
    DiagnosticUnknownCError: c_helper.DiagnosticUnknownCError,
    pub fn enterStack(last_diagnostic: *@This(), last_error: anyerror) !void {
        var diagnostics: *Diagnostics = @fieldParentPtr("last_diagnostic", last_diagnostic);
        if (diagnostics.double_error != null) {
            return last_error;
        }
        diagnostics.error_stack.append(diagnostics.arena.allocator(), .{ .code = last_error, .diagnostic = last_diagnostic.* }) catch |double_error| {
            diagnostics.double_error = double_error;
            return last_error;
        };
        last_diagnostic.* = undefined;
    }
    pub fn getAllocator(last_diagnostic: *@This()) std.mem.Allocator {
        const diagnostics: *Diagnostics = @fieldParentPtr("last_diagnostic", last_diagnostic);
        return diagnostics.arena.allocator();
    }
    pub fn unableToConstructDiagnostic(last_diagnostic: *@This(), err: anyerror) !void {
        const diagnostics: *Diagnostics = @fieldParentPtr("last_diagnostic", last_diagnostic);
        diagnostics.double_error = err;
        return error.UnableToConstructDiagnostic;
    }

    pub fn log(self: *Diagnostic, err: anyerror) void {
        inline for (@typeInfo(Diagnostic).@"union".fields) |field| {
            if (std.mem.eql(u8, @errorName(err), field.name)) {
                if (@hasDecl(@FieldType(Diagnostic, field.name), "log")) {
                    @field(self, field.name).log();
                } else {
                    std.log.err("{s}:{}\n", .{ field.name, @field(self, field.name) });
                }
                return;
            }
        }
        std.log.err("{s}\n", .{@errorName(err)});
    }
};
2 Likes