Mimicking switch fall-through

nyc · June 7, 2024, 1:14pm

I have a switch statement where each arm is a continuation of the next (think Duff’s Device but no loop). How would I do this better?

const state = det_state();
switch(state) {
  state_a => { x = a(x); x = b(x); x = c(x); }
  state_b => { x = b(x); x = c(x); }
  state_c => { x = c(x); }
};

There’s six arms in the real code. Trying to find a zero code version without all the repetition (so looping through an array of function pointers won’t meet the goals).

Dok8tavo · June 7, 2024, 1:50pm

Idk if it suits your taste, but it’s the best I found so far:

const State = enum(u8) {
    start, 
    a,
    b, 
    c,
    ...
    f, 

    // Of course you could write you own implementation depending on your needs
    pub fn fallthrough(self: *State) bool {
        if (self.* == .f) return false;
        self.* = @enumFromInt(1 + @intFromEnum(self.*));
        return true;
    } 
};

var should_continue = true;
/// this is the switch
while (should_continue) : (should_continue = self.fallthrough()) switch (state) {
    .a => doA(), 
    .b => doB(),
    .c => doC(), 
    ...
    .f => doF(), 
};

Now I didn’t test it but it should work. It’s probably not worth it imo. It depends on readability, and I don’t know what you’re doing but surely there’s something else you could do rather than a fallthrough, like extract a function with early returns or smth.

chung-leong · June 7, 2024, 2:29pm

Maybe something like this?

const stage: usize = switch(state) {
  state_a => 0,
  state_b => 1,
  state_c => 2,
};
if (stage <= 0) {
    x = a(x);
}
if (stage <= 1) {
    x = b(x);
}
if (stage <= 2) {
    x = c(x);
}

nyc · June 7, 2024, 2:33pm

its for coroutine code where each arm is a yield point (kind of not exactly but close enough – oddly enough this also appears in some code generation code where i’m trying to do the minimal workload between compilation cycles).

llvm mighjt be able to figure to move the switch test to the end of each arm, but then it would need to notice each arm just fowraded to rthe “next” and info may me long lost by then, but in its naieve form, the branch target predictor would choke on that pretty badly.

early return

This the inverse problem. I’m not trying to end sooner, but enter later. Early return would solve doing “a then b then c” or “a then b” or “just a”. This solves doing “a then b then c” or “b then c” or “just c” where each step depends on the previous.

nyc · June 7, 2024, 2:36pm

That kind of what i was thinking too - seems heavyweight, but definitely seems doable for the compiler to optimize the extra checks out. I wonder what the code gen on that will look like, but definite possibility that will work.

mnemnion · June 8, 2024, 10:53pm

There’s an issue in flight for labeled continue which will help with that. I’m looking forward to that landing, for handling some VM dispatch logic that I’d use a goto for in C. “can be used to implement Duff’s Device” was one of the comments on that issue, so it should solve the problem.

AndrewCodeDev · June 8, 2024, 10:59pm

I really like that proposal - it’s unique and offers something more than syntax sugar.

Validark · June 9, 2024, 1:56am

Definitely! The only way to sort of guarantee the emit you want (it still might¿ duplicate your code, but hopefully not if it’s non-trivial) right now is with tail calls, which in some situations gets super convoluted when you have to pass around 10 variables and manage all those properly across function boundaries. But here it is for your simple example:

switch(state) {
  state_a => do_state_a(),
  state_b => do_state_b(),
  state_c => do_state_c(),
};

fn do_state_a(x: anytype) @TypeOf(x) {
    return @call(.always_tail, do_state_b, .{ a(x) });
}

fn do_state_b(x: anytype) @TypeOf(x) {
    return @call(.always_tail, do_state_c, .{ b(x) });
}

// Redundant because it's the last one,
// but hopefully you see the pattern
fn do_state_c(x: anytype) @TypeOf(x) {
    return c(x);
}

Also, tail call optimization doesn’t exist on some platforms, I think WASM, for example.

(LLVM has historically not been able to optimize out my checks on a variable that was set in the previous instruction. Set a = 1, then check if a == 1 has been generated for me before.)

chung-leong · June 9, 2024, 5:07am

The compiler does seem capable of divining your intention and generating the optimal code:

mnemnion · June 9, 2024, 4:24pm

Switch optimizations are notoriously brittle. They’ll often work for test code, and then fail on real code, where the logic confounds common subexpression elimination, even though it would be valid to do so.

Not to mention that duplication of code which is intended to be identical is a very bad thing to have in your program. If one spot changes, they all need to change, that’s very easy to miss when the repeated code is separated by more than a screen.

The duplication also doesn’t tell the reader (this includes you, later) what’s supposed to be going on.

All in all, the labeled continue is a great addition to the language. Dropping goto did mean losing some important control flow, and labeled continue adds the last of that back.

The issue is actively being worked on, which you can track here if you’d like.

chung-leong · June 11, 2024, 11:10am

Labeled continue will certainly be very useful. I wonder if it’ll be capable of handling something like this:

    sw: switch (@typeInfo(T) {
        .Optional => |op| continue :sw @typeInfo(op.child),
        .ErrorUnion => |eu| continue :sw @typeInfo(eu.payload),
        // other cases
    }