Iterating optional error unions (i.e. `!?T`)

weskoerber · July 18, 2024, 10:28pm

What’s the canonical way to iterate over optional error unions? The langref documents while usage with optionals and error unions separately. Can they be used together?

For example, std.fs.Dir.Iterator.next() returns Error!?Entry – an optional error union – and I want to do something like this:

var iter = std.fs.cwd().iterate();
while (iter.next()) |entry| {
    // use the entry
    std.debug.print("{s}\n", .{entry.name}); // this doesn't work -- optional doesn't support field access
} else |err| {
    // handle the error union
    std.debug.print("error: {s}", .{@errorName(err)});
}

I’d expect the first capture, |entry|, to capture the optional’s value (std.fs.Dir.Entry). However, it is still optional (?std.fs.Dir.Entry). Why doesn’t the expression capture the optional’s value?

weskoerber · July 18, 2024, 10:30pm

I typically use the following workaround:

var iter = std.fs.cwd().iterate();
while (iter.next()) |maybe_entry| {
    const entry = maybe_entry orelse break;
    // use the entry
    std.debug.print("{s}\n", .{entry.name}); 
} else |err| {
    // handle the error union
    std.debug.print("error: {s}", .{@errorName(err)});
}

LucasSantos91 · July 18, 2024, 11:27pm

weskoerber:

var iter = std.fs.cwd().iterate();
while (iter.next()) |maybe_entry| {
    const entry = maybe_entry orelse break;
    // use the entry
    std.debug.print("{s}\n", .{entry.name}); 
} else |err| {
    // handle the error union
    std.debug.print("error: {s}", .{@errorName(err)});
}

Looks good to me, I don’t see it as a “workaround”.
If you find it difficult to wrap your head around the notation, you can think of it like this:

const MaybeEntry = ?Entry;
const ResultOfNext = Error!MaybeEntry;

weskoerber · July 18, 2024, 11:40pm

I don’t find it difficult to wrap my head around the types of the captures. My question had more to do with the syntax of the language and the possible inconsistency here.

For example, when iterating over optionals, the result is as I expect. std.mem.TokenIterator.next() returns an optional slice (?[]const u8) – a slice ([]const u8) if the iterator is not at the end and null if it is at the end.

var iter = std.mem.tokenizeScalar(u8, "   abc def     ghi  ", ' ');
while (iter.next()) |tok| {
    std.debug.print("{s}\n", .{tok});
}

The above code prints

abc
def
ghi

In this example, the capture inline with the while loop unwraps the optional value. Why doesn’t it do the same in the case of an optional error union?

weskoerber · July 18, 2024, 11:44pm

It looks like this is intentional:

While loop with Error union and Optional value · Issue #8460 · ziglang/zig · GitHub

Calder-Ty · July 18, 2024, 11:45pm

I think “Optional Error Union” is not the right terminology. When I read it, I think of ?Error!T. Such a type, if it exists, would be very strange, as you would then be wrapping the error union itself in an optional.

The type !?T is an Error Union whose payload type is optional. So you have to unwrap the error first and then check the optional payload.

LucasSantos91 · July 18, 2024, 11:46pm

It did. You showed this perfectly with the code you called a “workaround”. Think of an onion of types: Error! (?Entry)

// iter.next() returns `Error! (?Entry)`
while (iter.next()) |maybe_entry| {
  // Peel one layer of the onion, which is the (Error!). 
  // We are left with (?Entry).
  comptime std.debug.assert(@TypeOf(maybe_entry) == ?Entry);
  const entry = maybe_entry orelse break;
  // Peel another layer, which is the optional (?).
  // We are left with Entry
  comptime std.debug.assert(@TypeOf(entry) == Entry);
}
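
For anyone who wants to run the peeling end to end, here is a minimal self-contained sketch. Countdown, jam_at, drain and error.Jammed are made up for illustration and only stand in for a fallible iterator such as std.fs.Dir.Iterator; none of them come from the standard library.

```zig
const std = @import("std");

// Hypothetical stand-in for a fallible iterator: yields remaining, ..., 1 and
// then null, or fails with error.Jammed once `remaining` equals `jam_at`.
const Countdown = struct {
    remaining: u32,
    jam_at: ?u32 = null,

    fn next(self: *Countdown) error{Jammed}!?u32 {
        if (self.jam_at) |j| {
            if (self.remaining == j) return error.Jammed;
        }
        if (self.remaining == 0) return null;
        defer self.remaining -= 1;
        return self.remaining;
    }
};

// Returns the sum of the yielded values, or null if the iterator failed.
fn drain(it: *Countdown) ?u32 {
    var total: u32 = 0;
    while (it.next()) |maybe_n| {
        // First layer (the error union) is gone: maybe_n is ?u32.
        const n = maybe_n orelse break;
        // Second layer (the optional) is gone: n is u32.
        total += n;
    } else |err| {
        // Only reached when next() returned an error, not on `break`.
        std.debug.print("error: {s}\n", .{@errorName(err)});
        return null;
    }
    return total;
}

test "peeling both layers" {
    var ok = Countdown{ .remaining = 3 };
    try std.testing.expectEqual(@as(?u32, 6), drain(&ok));

    var bad = Countdown{ .remaining = 3, .jam_at = 1 };
    try std.testing.expectEqual(@as(?u32, null), drain(&bad));
}
```

Saved to a file, this should run with zig test. The break on null leaves the loop without entering the else |err| branch, which is why the two exits stay separate.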

dude_the_builder · July 18, 2024, 11:47pm

If you were OK with propagating the error, this works as you expect:

while (try iter.next()) |entry| {

And yes, I see your point on the possible inconsistency. It seems that once the else clause with an error capture marks the loop as the error-union variant of while, it behaves only in that way and not like the optional variant, even if an optional is what’s left after unwrapping the error.
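
A minimal sketch of the try form in a complete program, assuming a made-up iterator: Counter, its broken flag, sum and error.Broken are hypothetical names, not anything from std.

```zig
const std = @import("std");

// Hypothetical iterator: yields 0, 1, 2 and then null, or fails when `broken`.
const Counter = struct {
    i: u32 = 0,
    broken: bool = false,

    fn next(self: *Counter) error{Broken}!?u32 {
        if (self.broken) return error.Broken;
        if (self.i == 3) return null;
        defer self.i += 1;
        return self.i;
    }
};

fn sum(counter: *Counter) !u32 {
    var total: u32 = 0;
    // `try` strips the error layer before `while` sees the value, so this is
    // the optional variant of the loop and `n` is a plain u32.
    while (try counter.next()) |n| {
        total += n;
    }
    return total;
}

test "try leaves only the optional for the loop" {
    var good = Counter{};
    try std.testing.expectEqual(@as(u32, 3), try sum(&good));

    var bad = Counter{ .broken = true };
    try std.testing.expectError(error.Broken, sum(&bad));
}
```

The trade-off is that the error leaves the function instead of being handled next to the loop, so this only fits when propagating is acceptable.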

weskoerber · July 18, 2024, 11:57pm

Thanks guys. I think that pretty well sums it up: in a while condition, unwrapping an optional is mutually exclusive with unwrapping an error union (and vice versa).

mnemnion · July 19, 2024, 12:02am

I had an idea about this, and I’m still not sure if it’s brilliant or terrible.

Basically, we already have this:

if (a()) { 
    // boolean true 
} else {
   // boolean false
}

if (a()) |cap| {
   // optional
} else {
   // null case
}

if (a()) |cap| {
   // error union, you can tell because
} else |err| {
  // there's a second capture
}

Which are all syntactically distinct, that’s an important trait IMHO.

Well, why not take it a bit further?

if (!a()) { 
    // is a null case because:
} else |cap| {
   // it has an else-capture but no if-capture,
   // it has a `!` as well.  Sometimes the capture
   // branch is long, and you want to, say, return
   // from the null case first, y'know?
}

if (a()) |opt_cap| {
    // oh no! an error and an optional!
    // this is the happy path where you get something
} else {
   // here's that null branch
} else |err| {
   // and here's the error
   // yep. double else. mandatory, double, else
}

So a !?T while loop would be

while (can_error.next()) |cap| {
    // do stuff
} else { 
     // must have a null case but it can be empty,
     // and it breaks the while loop
} else |err| {
    // error case, also breaks the while loop
}

The saving grace of this is that a double-else is illegal, except for a condition which is !?T, in which case, it is mandatory.

weskoerber · July 19, 2024, 12:20am

Funny enough, I actually tried the double else but, of course, got a compile error. It feels so natural, though!

I think this would be a fantastic addition to the language and would make !?T handling so much cleaner.

dude_the_builder · July 19, 2024, 12:41am

For some reason, I see it more “natural” this way:

while (can_error_next()) |cap| {
    // no error, not null
} else |err| {
    // error comes first just like the type !?T
} else {
    // null; The absence of a capture kinda mirrors 
    // the absence of a value.
}

mnemnion · July 19, 2024, 1:22am

I see it more like a double-unwrapping, so !?T reads as T, ?, !.

My reasoning is this. Right now, what you get is the following:


const std = @import("std");

fn optionErr(i: usize) !?f64 {
    switch (i) {
        0 => return 1.5,
        1 => return null,
        else => return error.NotBinary,
    }
}

test "optional error" {
    const val: usize = 0; // set to whatever
    if (optionErr(val)) |cap| {
        if (cap) |inner_cap| {
            std.debug.print("Got a float {d}\n", .{inner_cap});
        } else {
            std.debug.print("It's a null\n", .{});
        }
    } else |err| {
        std.debug.print("Got an error {s}\n", .{@errorName(err)});
    }
}

So a “double else” just unwraps the inner conditional:

test "optional error" {
    const val: usize = 0; // set to whatever
    if (optionErr(val)) |cap| {
        std.debug.print("Got a float {d}\n", .{cap});
    } else {
        std.debug.print("It's a null\n", .{});
    } else |err| {
        std.debug.print("Got an error {!}\n", .{err});
    }
}

But the complexity of this is evidenced by you and I seeing different interpretations as more natural.

The saving grace of the idea is that, like the three existing varieties of if statement, the correct form for a given type would be mandatory. Whether the error comes first or second, it requires two else clauses, and one of them has no capture, because it’s null.

Unless there’s a leading ! in the conditional, of course:

test "optional error reversed" {
    const val: usize = 0; // set to whatever
    if (!optionErr(val)) {
        std.debug.print("It's a null\n", .{});
    } else |cap| {
        std.debug.print("Got a float {d}\n", .{cap});
    } else |err| {
        std.debug.print("Got an error {!}\n", .{err});
    }
}

I think that one would be more confusing if the error-capturing else came before the null else, because logically, that would put the intended value all the way at the bottom.

But the cool thing about all of them is that you don’t need any type or semantic analysis at all to know what kind of if statement you’re dealing with; it’s built into the parse. Whether it’s correct depends on the types, of course, but that’s true in general.