Can errdefer and defer be combined?

chung-leong · April 20, 2025, 6:08pm

The return is supposed to indicate the end of a function. We don’t want to deviate too far from that by allowing operations of all sorts post return.

mnemnion · April 20, 2025, 6:48pm

I knew that would come up, which is why I said this:

Jesus said, “the poor you will always have with you”, and so it is with bad code. A defer |retval| block would of course (as written) be a constant capture, and it would be a reasonable guard against misbehavior to forbid a pointer capture here. This is also true:

We would get some useful patterns out of it, such as the ability to define postconditions which apply to every return statement in the function, without repetition, and the ability to use those postconditions to swap a return value for an error.

I do think it’s important that this would work on a ‘catch means release’ basis. I think a case could be made that allowing an arbitrary amount of capturing defers and errdefers, rather than, say, one of either per function, would fit the “this feature is hard to use correctly and that will result in bad code too often to justify what we’d get from it” template.

But I would also argue that such a case would apply equally to the ?E signature for capturing defer. Overall that approach smacks of nerfing: an arbitrary limitation which confuses new users and just becomes a thing you have to memorize, and “why can’t I capture T with defer” is a question we’ll never get to stop answering.

Zig has very few weird corners like that, and that’s a big part of why I like it so much. I’d like to round down the ones it does have, and not add more. And I’d rather have no capturing defer, than one which concerns itself only with errors and doesn’t let me run postconditional assertions.

chung-leong · April 22, 2025, 12:07pm

At risk of stating the obvious, a programming language should be capable of handling scenarios that programmers may encounter. It does happen that clean-up procedures differ depending on whether an error has happened. The OP provided a real world example. I can easily imagine another:

Suppose we have a function that runs a query on a database. The first thing it does is acquire a connection from a pool. Now the next statement depends on the return value. If the rows are returned as a slice of structs, a defer statement releasing the connection back to the pool would follow. If an iterator is returned, we would use an errdefer instead.

When an error occurs, we might advise the pool to not reuse the connection in the future. A server timeout could mean the connection is no longer usable, for instance. As things stand, we can easily make this distinction only in one version of our function. For no reason at all we’re forced to employ a workaround to handle the exact same scenario just because the function in happier circumstances would return a slice of structs.
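A minimal sketch of the two versions described above. Everything here (Pool, Conn, RowIterator, runQuery, discard) is a hypothetical stand-in invented for illustration, not a real database API:

```zig
const std = @import("std");

// Hypothetical pool and connection types, invented for this sketch.
const Conn = struct { id: u32 };

const Pool = struct {
    released: u32 = 0,
    discarded: u32 = 0,
    conn: Conn = .{ .id = 1 },

    fn acquire(self: *Pool) *Conn {
        return &self.conn;
    }
    fn release(self: *Pool, _: *Conn) void {
        self.released += 1;
    }
    // Tells the pool not to reuse a connection that may be unusable.
    fn discard(self: *Pool, _: *Conn) void {
        self.discarded += 1;
    }
};

const Row = struct { value: u32 };

const RowIterator = struct {
    pool: *Pool,
    conn: *Conn,

    // The caller hands the connection back once iteration is done.
    fn deinit(self: *RowIterator) void {
        self.pool.release(self.conn);
    }
};

// Stand-in for the actual query; fails on demand.
fn runQuery(conn: *Conn, fail: bool) error{Timeout}!void {
    _ = conn;
    if (fail) return error.Timeout;
}

// Iterator version: on success the connection outlives the call, so only
// the error path cleans up, and it can treat the connection as suspect.
fn queryIter(pool: *Pool, fail: bool) !RowIterator {
    const conn = pool.acquire();
    errdefer pool.discard(conn);
    try runQuery(conn, fail);
    return RowIterator{ .pool = pool, .conn = conn };
}

// Slice version: the connection goes back on every path, so a plain defer
// follows, and that defer cannot tell success from failure.
fn querySlice(pool: *Pool, allocator: std.mem.Allocator, fail: bool) ![]Row {
    const conn = pool.acquire();
    defer pool.release(conn);
    try runQuery(conn, fail);
    const rows = try allocator.alloc(Row, 1);
    rows[0] = .{ .value = 42 };
    return rows;
}
```

In querySlice the timed-out connection is released like any other, which is the asymmetry being pointed out.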

mnemnion · April 22, 2025, 3:03pm

I think we may have both run out of interesting things to say to one another on this question. I’ll just point out this:

Is of course possible with the signature E!T, which is (in a useful sense) a superset of ?E. I’m aware it solves problems, and which ones, and I think it’s a partial solution where we should prefer a complete one.

Until (unless? I wonder) mere users can open and debate language proposals again, it’s all fairly theoretical. If that does happen, and I hope it will, we can all hash out the details then.

chung-leong · April 22, 2025, 10:29pm

Well, I’m not making a suggestion on how I think the language can be improved. I’m pointing out an actual flaw in the language. An errdefer statement might need to capture the error not because it’s somehow an error handler. It needs to capture the error because it’s a type of defer statement. It’s just illogical for defer itself to be unable to capture the error.

As for capturing the full retval, if you can come up with a usage scenario where you’d need the non-error portion of the error union in order to do a clean-up, I’d be all ears. I’m pretty sure you can’t. An error set, especially an inferred one, can potentially contain information on a very broad range of concerns. It might tell us that our SQL statement has a syntactical error. It might tell us that our server has spontaneously turned into a whale. When the need to capture the error is rare, the need to capture the right side of the exclamation mark is basically infinitesimally small.

mnemnion · May 1, 2025, 4:32pm

fn blah() !ReturnValue {
    var handle = try magic.getOpenFile();
    defer |ret_err| {
        if (ret_err) |ret| {
            // Can still throw errors
            try handle.close();
            // Because we're returning `ret`
            // After some post-conditions 
            assert(meetsContract(ret));
            return ret;
        } else |err| {
            logger.log("some error {}", .{err});
            // Can check if the error
            // allows us to close the handle,
            // or if that would just fail again
            return err;
        }
    }
    // Do stuff with the handle here,
    // It will always be closed
}

This only works with a catch-means-release rule, and to have a rule like that, you do have to catch the full E!T value.

Kind of bulky written out like that, but easily reduced to an inline function call if that’s desired. Of course half the lines are meta-commentary which a real function would not need.

tensorush · May 1, 2025, 4:45pm

Btw, errdefer capture feature is getting cut:

github.com/ziglang/zig

Proposal: remove capturing `errdefer` from the language

opened 03:21PM - 30 Apr 25 UTC

mlugg

proposal accepted

## Background

`defer` is a core and incredibly useful component of Zig, which helps avoid bugs in resource management. Similarly, `errdefer` helps avoid bugs in situations where a resource should be cleaned up only on error conditions.

However, there is a third, lesser-known, piece of syntax: *capturing* `errdefer`. It looks like this:

```zig
errdefer |err| {
    // `err` is the error being returned; you can do stuff with it!
    // Like a normal `defer`/`errdefer`, you can't `return` or `try` in this block, so you can't change
    // which error is returned; but you can e.g. log it
}
```

This feature is used *incredibly* rarely; many experienced Zig users do not even know that it exists. As a data point, in the Zig repository, ignoring test coverage and documentation for it, there are exactly two uses of this language feature:

https://github.com/ziglang/zig/blob/8e79fc64cddf6832e4446b20d282084ae69f93ae/tools/update_cpu_features.zig#L1529

https://github.com/ziglang/zig/blob/8e79fc64cddf6832e4446b20d282084ae69f93ae/lib/fuzzer.zig#L298

A feature being rarely used is not necessarily in itself a reason to remove that feature. However, `errdefer` has a bigger design problem.

## The Problem

Here's a question: in `errdefer |err|`, what is the *type* of `err`?

The obvious thing would be that it has the type of "every error which can be returned after this point in the function", but this isn't a feasible definition; it brings a good amount of (quite boring) complexity to the language specification in terms of how it interacts with things like inferred error sets, and implementing this would require a type of logic in the compiler which likely has compiler performance implications.

So, in reality, there are 3 reasonable choices:

* The type of `err` is the current function's error set.
* The type of `err` is `anyerror`.
* The type of `err` is the error type which is being returned at a given `return` site (so the `errdefer` body is reanalyzed at every possible error return with a different type for `err`).

Let's go through these three options. Note that right now, the compiler has inconsistent and unhelpful behavior here, so this *is* an unsolved problem.

The first option is actually fairly good, with two big caveats:

* In functions with inferred error sets, `errdefer |err| switch (err)` becomes impossible (it would emit a dependency loop because we don't know what cases need to be in the `switch`).
* It makes something like `errdefer |err| fatal("error: {}", .{err});` impossible, since the function needs to return an error for `err` to be typed correctly.

The second option means that any switch on the captured `err` must have an `else` prong, so if you want to `switch` on the captured error, this option is strictly worse than using a wrapper function which `switch`es on the returned error (since at this point the error type is known and can be exhaustively switched on). However, people are likely to reach for `errdefer` anyway out of convenience, and shoot themselves in the foot by losing type safety.

The third option is what we *usually* do today, but:

* It breaks `switch` on the captured error, because `@TypeOf(err)` will usually only contain a subset of all possible errors, so you'll get errors that `switch` prongs are impossible (because e.g. `error.Foo` can't be returned at *this* particular return site, so `err` is `error{Bar}` instead).
* It makes it significantly harder for any compiler implementation to deduplicate `errdefer`s across different error return sites, because they are analyzed differently. (This Zig implementation does not do this deduplication anyway, but it's good for the language spec to make it viable!)

One observation here is that all three of these solutions have big problems with `switch` on the captured error -- and this is a construct we want to *encourage*, not discourage!

The main other use case for capturing `errdefer` is to log errors. The thing is, this use case is actually not a brilliant use for this construct, because it can lead to code bloat (due to the error logging logic being duplicated at every error return site). Generally, the body of an `errdefer` should be incredibly simple; it's essentially just intended for resource cleanup. Capturing `errdefer` encourages using the construct in more complex ways, which is usually not a good thing!

Also, if you're logging an error, this generally indicates you aren't going to explicitly handle the different error cases higher on the call stack, so it's probably desirable to "collapse" these errors down into one (e.g. turn your set of 5 errors into a blanket `error.FooFailed`). This is something `errdefer |err|` does not support. As such, there are several big advantages to using a "wrapper function" approach instead, like this:

```zig
pub fn foo() error{FooFailed}!void {
    fooInner() catch |err| {
        std.log.err("error: {s}", .{@errorName(err)});
        return error.FooFailed;
    };
}
fn fooInner() !void {
    // ...
}
```

So, capturing `errdefer` syntax raises a difficult design problem, where all solutions seem unsatisfactory. Furthermore, the common use cases for this feature -- and indeed, the ones it seems to encourage -- tend to emit bloated code and suffer from worse error sets. That leads on to this proposal.

## Proposal

Remove `errdefer |err|` syntax from Zig.

Uh... yeah, that's kinda it.

## Migration

The two uses in the compiler can be trivially rewritten with these diffs.

`lib/fuzzer.zig`

```diff
 fn traceValue(f: *Fuzzer, x: usize) void {
-    errdefer |err| oom(err);
-    try f.traced_comparisons.put(gpa, x, {});
+    f.traced_comparisons.put(gpa, x, {}) catch |err| oom(err);
 }
```

`tools/update_cpu_features.zig`

```diff
 fn processOneTarget(job: Job) void {
-    errdefer |err| std.debug.panic("panic: {s}", .{@errorName(err)});
+    processOneTargetInner(job) catch |err| std.debug.panic("panic: {s}", .{@errorName(err)});
+}
+fn processOneTargetInner(job: Job) !void {
     const target = job.target;
```

These are also basically the two solutions for current uses of this feature: either split your function in two, or perhaps use `catch` at the error site(s) if there aren't many. This improves clarity (one less language feature to understand in order to understand your code!), type safety (no issues with `switch` exhaustivity, and you can "collapse" error sets where you want as discussed above), and code bloat (no duplication of the error handling path at every possible error return).

I doubt defer will be getting a capture. But it’s for the best.

ericlang · May 1, 2025, 6:59pm

"Remove errdefer |err| syntax from Zig. Uh… yeah, that’s kinda it."
Ha funny.

I indeed did not know you could capture the error. Which we will not be able to do later on anyway.

It reminds me a lot of Exceptions (very bad idea they are) in other languages. I never knew (like really, really knew) what to do with them.

I agree that errdefer should only be there for resource cleanup. Although I am aware of the logical difficulties which can arise.

The extreme simplicity of errors in Zig I love. That’s what I always needed.

mnemnion · May 1, 2025, 7:16pm

This shouldn’t be a problem, it’s actually easier than stitching together identical final blocks. The various return points just unconditionally jump to the defer logic.

On the other hand, as the issue @tensorush linked to points out, anything which could be done with a capturing defer (or errdefer), can be done with an inner function. Even weird nested stuff, which really should be avoided, could be emulated with multiple levels of nesting, and inline can be used to make the result identical for most purposes.

So it’s basically syntax sugar, and not worth complicating the language over. I didn’t care for errdefer having capture semantics when defer didn’t; removing errdefer |err| is a satisfying resolution of that inconsistency.
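The inner-function emulation mentioned above can be sketched roughly like this. The names computeInner, compute, and meetsContract are made up for illustration; the point is only that the outer function sees the complete E!T result exactly once, so a postcondition or an error swap lives in a single place:

```zig
const std = @import("std");

// Hypothetical postcondition, invented for this sketch.
fn meetsContract(value: u32) bool {
    return value % 2 == 0;
}

// All the early returns live in the inner function.
fn computeInner(input: u32) error{TooLarge}!u32 {
    if (input > 100) return error.TooLarge;
    if (input == 0) return 2;
    return input * 2;
}

// The outer function plays the role of the capturing defer: it handles
// the error case and checks the postcondition for every return path.
fn compute(input: u32) error{ComputeFailed}!u32 {
    const value = computeInner(input) catch |err| {
        std.debug.print("compute failed: {s}\n", .{@errorName(err)});
        // Collapse the inner error set into a single error.
        return error.ComputeFailed;
    };
    std.debug.assert(meetsContract(value));
    return value;
}
```

Marking computeInner as inline should make the result close to a single function for most purposes, though that is a suggestion rather than something measured here.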

chung-leong · May 1, 2025, 8:24pm

If close() fails, the only reasonable reaction is to either ignore the error or panic. Doesn’t make sense to return an error when the caller cannot do anything in response.

mnemnion · May 1, 2025, 9:54pm

file.close() doesn’t happen to return an error; a better example would be something like buffered_writer.flush(), which can.

Code can always catch any error which it doesn’t want to bother the caller with. Current defer can’t return, so a fallible function call in a defer block would be obliged to do this.

But really, something like this gets almost all the benefits, without adding complexity to the language. As I said, removing capturing errdefer eliminates the idiosyncrasy with less effort and complexity, which is the way to go.
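As a small illustration of the point that a fallible call in a defer block has to catch its own error: the Writer type below is a hypothetical stand-in (not std's buffered writer), used only so the sketch is self-contained.

```zig
const std = @import("std");

// Hypothetical writer whose flush can fail; invented for this sketch.
const Writer = struct {
    pending: usize = 0,
    broken: bool = false,
    dropped: usize = 0,

    fn flush(self: *Writer) error{WriteFailed}!void {
        if (self.broken) return error.WriteFailed;
        self.pending = 0;
    }
};

fn writeRecord(w: *Writer, len: usize) void {
    // `try w.flush()` is rejected inside a defer, so the error is
    // handled on the spot instead of being passed to the caller.
    defer w.flush() catch {
        // Record the failure and move on.
        w.dropped += w.pending;
        w.pending = 0;
    };
    w.pending += len;
}
```

Whether to ignore, count, or panic on such a failure is a per-call decision; the sketch just counts the dropped bytes.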

Sze · May 1, 2025, 10:34pm

If OutOfMemory is the only possible error, wouldn’t it be better to remove the else branch so that it becomes a compile error if any other error gets added to the inferred error set?

mnemnion · May 2, 2025, 12:44am

Oh wait. Looks like I circled back and doubled up on something there. I converted two of the setup functions (that and Graphemes I believe) to use individual catch unreachable branches for each error, but that was getting tedious, so I started doing it like this (different link, same trick).

Looks like what happened is that I went back to GeneralCategories and set up the error filter for it even though it doesn’t strictly need it. Easy enough to change, it’s a beta after all.

I think the basic pattern here is sound however. Any error other than OutOfMemory would indicate a bug in either zg or std, which I would need to fix somehow. There’s no point in user code doing the filtering for those, and by the contract of the operation, allocation failure is the only error which makes sense to expose. So if stdlib split one of those errors into two new errors, I would want the else branch to get rid of the new one automatically.

It’s a good general point, only use an else branch if the contract stipulates that any new members would go into the else branch. In this case it does. These functions decompress a segment of memory into a data structure, the only syscalls are from the allocator and it’s otherwise deterministic.
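For readers without the linked code, here is a rough sketch of the kind of error filter being discussed. The function names and the CorruptInput error are assumptions made up for this example, not taken from zg:

```zig
const std = @import("std");

// Hypothetical inner routine with an inferred error set.
fn decompressInner(allocator: std.mem.Allocator, corrupt: bool) ![]u8 {
    if (corrupt) return error.CorruptInput;
    return allocator.alloc(u8, 16);
}

// Only allocation failure is part of the public contract.
fn decompress(allocator: std.mem.Allocator) error{OutOfMemory}![]u8 {
    return decompressInner(allocator, false) catch |err| switch (err) {
        error.OutOfMemory => |e| return e,
        // Assumption: the embedded data is always valid, so any other
        // error would be a bug rather than something callers can handle.
        else => unreachable,
    };
}
```

Listing each remaining error explicitly instead of using else would turn a newly added error into a compile error, which is the trade-off raised in the question above.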

mnemnion · May 2, 2025, 12:47am

I had missed a try while I was going through setting up those filters, so I hadn’t noticed that I had almost finished doing it the hard way.

chung-leong · May 2, 2025, 10:39am

Your own example demonstrates why what you’re pushing for is ill-advised. The following is a resource leak:

    defer |ret| {
        try buffered_writer.flush();
        handle.close();
        return ret;
    }

Programmers cannot make this mistake currently.

mnemnion · May 2, 2025, 4:28pm

That looks more like your example than anything I wrote. Considering you wrote that, calling it “my example” is some sort of rhetorical move, no?

I wonder what you’re getting out of all this dead horse beating.

This is of course false:

try buffered_writer.flush();
handle.close();
return ret;

Same resource leak. The existence of defer does not compel the use of it, and it is also the case that defer statements can have bugs.

But hey, here’s “my” example:

    defer |ret| {
        defer handle.close();
        try buffered_writer.flush();
        return ret;
    }

Sze · May 2, 2025, 4:42pm

Neither try nor return can be used within a defer, so the whole discussion doesn’t make sense. And @tensorush has already shown that we will get fewer captures, not more.

    defer {
        const val = try getSwitched(4);
        std.debug.print("val {}", .{val});
    }

temp24.zig:26:21: error: 'try' not allowed inside defer expression
        const val = try getSwitched(4);
                    ^~~~~~~~~~~~~~~~~~
temp24.zig:25:5: note: defer expression here
    defer {
    ^~~~~

chung-leong · May 3, 2025, 9:25am

This discussion has gone completely off the rails. Somehow we’ve gotten further from a solution to the OP’s problem than before. Without capture in errdefer, if your code needs the error you have to catch it yourself, in every single instance. We go from this:

    try doA();
    try doB();
    try doC(try doD(), try doE(), try doF());

To this:

    var last_error_maybe: ?anyerror = null;
    doA() catch |err| {
        last_error_maybe = err;
        return err;
    };
    doB() catch |err| {
        last_error_maybe = err;
        return err;
    };
    doC(doD() catch |err| {
        last_error_maybe = err;
        return err;
    }, doE() catch |err| {
        last_error_maybe = err;
        return err;
    }, doF() catch |err| {
        last_error_maybe = err;
        return err;
    }) catch |err| {
        last_error_maybe = err;
        return err;
    };

Why are we forcing people to write this crazy stuff? Just let defer/errdefer statements see the error. In other systems, it’s not uncommon to expose the last error globally. In C/POSIX we have errno. In Windows we have GetLastError(). In PHP, error_get_last(). The list goes on.

phatchman · May 3, 2025, 11:57am

chung-leong:

    var last_error_maybe: ?anyerror = null;
    doA() catch |err| {
        last_error_maybe = err;
        return err;
    };
    doB() catch |err| {
        last_error_maybe = err;
        return err;
    };
    doC(doD() catch |err| {
        last_error_maybe = err;
        return err;
    }, doE() catch |err| {
        last_error_maybe = err;
        return err;
    }, doF() catch |err| {
        last_error_maybe = err;
        return err;
    }) catch |err| {
        last_error_maybe = err;
        return err;
    };

Wouldn’t your example be solved by putting all of those calls in a separate function and doing a single catch on that function?

If you must do it as a single function, the best I’ve come up with is below.

    const last_error_maybe: ?anyerror = err: {
        doA() catch |err| break :err err;
        doB() catch |err| break :err err;
        break :err null;
    };
    if (last_error_maybe) |err| {
        std.debug.print("err: {s}\n", .{@errorName(err)});
    }

milogreg · May 3, 2025, 7:58pm

I agree that this is cumbersome, but it’s somewhat unrelated to OP’s specific goal: to conditionally execute based on whether an error exists, not based on what the error is.

OP’s problem could be solved without errdefer capture or any additional boilerplate for error-returning calls:


{
    // Related to DBus message iterators
    var iter: Message.Iter = undefined;
    parent.openContainer(&iter);

    // Abandon on error
    errdefer parent.abandonContainer(&iter);

    // But close only if it doesn't error
    var returned_err = false;
    defer if (!returned_err) parent.closeContainer(&iter);
    errdefer returned_err = true;

    // Do other things with iter
}