Minimal Viable Zig Error Contexts

9 Likes

Ha! so we are just linking 30+ page nerd snipe papers in our otherwise enticingly short blogs now :slight_smile:

In true “reserve first” fashion, I find myself pushing all my resource acquisition as early as possible in my program’s lifetime, which sometimes has the secondary effect of making error reporting easier:

  1. have list of files to process from user
  2. open all the files (building up array list of fd) (early-exit on failure with error reporting)
  3. process all the files

This also has the secondary effect of enabling batched error reporting (e.g. reporting every file that cannot be opened in one go).
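
For illustration, the acquire-everything-first steps with batched reporting could look roughly like the sketch below. `acquire` and `acquireAll` are hypothetical names, and `acquire` is only a stand-in for a real open call (it fails on empty paths), so treat this as a shape rather than real file handling:

```zig
const std = @import("std");

// Hypothetical stand-in for opening a file: fails for empty paths,
// otherwise returns a fake handle (the path length).
fn acquire(path: []const u8) error{NotFound}!usize {
    if (path.len == 0) return error.NotFound;
    return path.len;
}

// Acquire every resource up front. Each failure is reported as it is found,
// and the function fails once at the end instead of stopping at the first one.
fn acquireAll(paths: []const []const u8, out: []usize) error{AcquireFailed}!void {
    std.debug.assert(out.len >= paths.len);
    var failures: usize = 0;
    for (paths, 0..) |path, i| {
        out[i] = acquire(path) catch |err| {
            std.debug.print("failed to open '{s}': {s}\n", .{ path, @errorName(err) });
            failures += 1;
            continue;
        };
    }
    if (failures != 0) return error.AcquireFailed;
}
```

Processing then only starts once `acquireAll` has succeeded, so the processing code itself has no open-failure paths to report.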

Maybe it’s more complicated :person_shrugging:, but what I can tell you is that I am too lazy and will never use the diagnostics pattern

3 Likes

We cannot avoid the hard task of explaining why something like a cancelled_error exception or error_code is
a poor representation of a cancelled result, because it does not appear at first glance to introduce a lot of noise.
This paper is focused on explaining why errors are a bad way to represent a cancelled result.

OK, I’m intrigued. But after this is a bunch of C++ mumbo jumbo. Is anyone able to translate Cthulhu into English so I can understand why they think error.Canceled is a bad idea?

I think the argument stems from their goal of representing an async task as a finite state machine that is visible to the user (at least in C++26 senders/receivers). The paper is from 2019, but you can already see them talking about senders with effectively three state “channels”: done, error, canceled. Admittedly I do not understand why the distinction is needed between error/canceled, other than the fact that error handling in C++ is annoying and they do not have true error sets to cleanly propagate the error.Canceled in a robustly enforced manner.

The C++ committee loves to make things “correct by construction” through types and template metaprogramming, but always seems to fail to see that the easier and better thing is right in front of them.

1 Like

I’m not a C++ expert, but I’m not sure I’m really convinced by that paper. It just seems like a bunch of tacit admissions that exceptions are a terrible mechanism for control flow.

One of their examples is a retry pattern, which they argue wouldn’t need to be aware of any error other than cancelled. This just seems nuts to me; there are all sorts of errors that you wouldn’t want to continue retrying for.

I just love “serendipitous success” naming! But one thing I do remember is the issue of a “catch all” clause which silently swallows cancelation exceptions.

E.g., I often write code that tries to read a “cache” file from disk, re-creating it on failure:

read_file() catch {
    try write_file();
};

The thinking here is that even if the error is a true IO failure, and not just NotFound, I’d rather report it on the write path.

In the presence of cancelation, this code is buggy.

I wonder if we can have something like

pub const Cancelable = @Inedible(error{
    Canceled,
});

with the semantics that errors in inedible error sets can’t be ignored silently, such that the compiler errors out on the above snippet and requires an explicit

switch (err) {
    error.Canceled => {},
    else => {},
}

which makes the bug obvious.
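
To make the failure mode concrete, here is a minimal sketch in today's Zig (so without the imagined `@Inedible`). `readCache`, `writeCache`, `loadSwallowing` and `loadExplicit` are made-up names, and `readCache` is hard-wired to report cancelation purely for demonstration:

```zig
const std = @import("std");

// Hypothetical error set for illustration; not taken from std.
const ReadError = error{ NotFound, Canceled };

// Hypothetical stand-in for reading the cache file: always reports cancelation.
fn readCache() ReadError!void {
    return error.Canceled;
}

// Records whether the fallback path ran.
var rebuilt: bool = false;

// Hypothetical stand-in for re-creating the cache file.
fn writeCache() error{Canceled}!void {
    rebuilt = true;
}

// Buggy: the bare `catch` treats Canceled like any other failure and rebuilds.
fn loadSwallowing() !void {
    readCache() catch {
        try writeCache();
    };
}

// Explicit: Canceled propagates, every other error falls back to a rebuild.
fn loadExplicit() !void {
    readCache() catch |err| switch (err) {
        error.Canceled => return error.Canceled,
        else => try writeCache(),
    };
}
```

With this setup `loadSwallowing` returns successfully after rebuilding the cache even though the task was canceled, while `loadExplicit` returns `error.Canceled` without touching the cache.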

2 Likes

From the perspective of language design, errdefer should not be used for purposes other than resource cleanup, see #23734

catch was designed for exactly this use case, but the complaints seem to focus mainly on having to duplicate the error logging at every call site. So a simple coping strategy is to wrap the error logging in a temporary function.

fn process_file(io: std.Io, path: []const u8) !void {
    const with_log = struct {
        inline fn openFile(dir: std.Io.Dir, io_: std.Io, sub_path: []const u8, options: std.Io.Dir.OpenFileOptions) !std.Io.File {
            return dir.openFile(io_, sub_path, options) catch |err| {
                std.log.err("failed to open file '{s}': {t}", .{ sub_path, err });
                return err;
            };
        }
    };
    const fd = try with_log.openFile(std.Io.Dir.cwd(), io, path, .{});
    defer fd.close(io);
    // ...
}

That doesn’t really scale, since it makes the code really obscure.
Let’s take one of the code samples from the blog post linked at the end of this one as an example (and add errdefer to Rust for simplicity):

pub async fn connect_ws(
    local_addr: SocketAddr, host: &str
) -> Result<WebSocket> {
    errdefer error!("local_addr={local_addr}");
    let peer_addr = resolve(host)?;
    errdefer error!("peer_addr={peer_addr}");
    let socket = bind(local_addr)?;
    errdefer error!("socket={socket}");
    socket.connect(peer_addr)?;
    errdefer error!("host={host}");
    let tls_connection = tls_handshake(host, socket).await?;
    errdefer error!("tls_connection={tls_connection}");
    let ws_connection = ws_handshake(tls_connection).await?;
    ws_connection
}

You would need to create one such wrapper function for every individual errdefer here (or use a Diagnostics Factory as linked further down the blog->blog link chain).

Readability would suffer a lot from that, making it VERY hard to figure out what the code actually tries to achieve.

Yes, the diagnostics pattern is always the most correct approach. In comparison, wrapping the call in a temporary function is rather ugly.

As for errdefer as a solution, I tend to think of it as a coincidentally workable ‘beautiful accident’, but considering that it was not designed for this purpose, I would try to avoid using it in situations other than solving exception safety.

To prevent potential misunderstanding, this talks about

which I linked earlier this morning (so the link wasn’t there in the original version of my article)

3 Likes

There’s one huge drawback though — the error message is logged, even if the error is subsequently handled

Putting on my hat as someone who has worked on error monitoring, that drawback is much more significant than you would expect. When people who are not familiar with a situation investigate, sometimes under pressure, seeing errors show up that were actually handled and benign is a great source of confusion during incidents.

6 Likes

Maybe I’ve written too much Go at my day job, but the straw man “first attempt” given in the post:

const fd = dir.openFile(io, path, .{}) catch |err| {
    log.err("failed to open file '{s}': {t}", .{path, err});
    return err;
};

would probably work for me in a “script-y” scenario.

I’m not sure what you mean by “inedible error set”? What error sets can be eaten, and which not? And what does it mean to eat an error set?

An inedible error is an error you aren’t allowed to swallow.

Errors in inedible error sets (a suggestion, not a real Zig feature) can’t be ignored silently, even via an explicit discard. That is, _ = err;, err catch return, and switch (err) { else => return } all fail at compile time. switch (err) { error.Canceled => return } is the only construct that allows handling an inedible error.

I think this might also disallow implicit conversion of an inedible error into ‘anytype’ and, again, require explicit by-name listing of the offending errors.

As a corollary, you wouldn’t be able to bubble an inedible error out of main.

I’m about 10% sure that this is a good idea, but it might help with accidentally swallowing ‘Canceled’, and I think it could also help with some subsystem-internal domain errors. E.g. at TigerBeetle we model async iterators as ‘error{Pending}!?T’, and that Pending mustn’t be swallowed.

4 Likes

Thanks. I was wondering if autocorrect was changing “indelible” to “inedible”, but then I was still confused :-). Not a bad way to put it.

I was musing about an AI hallucinating after getting an edible error set. (OT-sorry)

So we have the else clause to explicitly group all the cases that we’re not interested in, and then we add a feature that makes the else clause not work? The next step would be a feature to eat inedible errors:

switch (@eatingDisorder(err)) {
    else => {
        // I really want to silently ignore these errors
    },
}

Sounds like an arms race.

2 Likes

Honestly, the better approach for me would be to just have an explicit error set in one of the calling functions that doesn’t have Canceled or Pending inside of it. Then you can also easily see the boundary where this needs to be resolved.

You could even have some comptime stuff that basically creates an inferred error set from a set of functions but always disallows some other set of errors, so basically a set difference.
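
A rough sketch of the explicit-boundary idea (the hand-written error set only, not the comptime set difference). `step`, `BoundaryError` and `boundary` are hypothetical names chosen for the example:

```zig
const std = @import("std");

// Hypothetical inner operation that may be canceled.
fn step(cancel: bool) error{ Canceled, OutOfMemory }!u32 {
    if (cancel) return error.Canceled;
    return 42;
}

// The boundary's explicit error set deliberately omits Canceled, so a plain
// `try step(...)` inside `boundary` would be rejected by the compiler.
const BoundaryError = error{OutOfMemory};

fn boundary(cancel: bool) BoundaryError!?u32 {
    const value = step(cancel) catch |err| switch (err) {
        // Cancelation is resolved here: report "no result" instead of an error.
        error.Canceled => return null,
        error.OutOfMemory => return error.OutOfMemory,
    };
    return value;
}
```

Since `BoundaryError` has no `Canceled` member, the compiler forces some decision about cancelation at this function, which is what makes the boundary visible.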

1 Like

Explicit error sets solve the opposite problem: they force you to handle Canceled at the boundaries. They don’t help with the problem of accidentally ignoring Canceled together with other errors, the

read_file() catch {
    try write_file();
};

example.

3 Likes

Oh yeah in that example you’re right, sorry.

Maybe that’s just me but I nearly always look at what errors can be returned and think about it.
Now that 0.16 is live and I’m using it I will probably just create me a snippet that inserts this:

catch |err| switch(err) {
    error.Canceled => return,
    else => ${1:},
};

This is simpler than having some notion of errors which must be handled, which would also light a fire under those who are already complaining that Zig is too strict for fast prototyping.