Thoughts on Block Expressions Syntax?

My Rust experience is rather limited (I largely let it fall by the wayside once I discovered Zig), but one thing I really like about it that Zig does not have is, surprisingly, a syntactic one: block expressions.

In Zig, we have labelled blocks to accomplish the same task, but where I most often find these useful is when I have a basic if/else to set a variable (essentially a ternary) that is too long/complex to be comfortable on a single line. This leaves me with the following options:

1. Do nothing. Let the expression run long on a single line.

const some_variable_name = if (some_long_condition) some_long_result_if_condition_true else some_long_result_if_condition_false;

2. Use a label.

const some_variable_name = if (some_long_condition) result: {
    break :result some_long_result_if_condition_true;
} else some_long_result_if_condition_false;

…or alternatively…

const some_variable_name = result: {
    if (some_long_condition) {
        break :result some_long_result_if_condition_true;
    } else {
        break :result some_long_result_if_condition_false;
    }
};

3. Extend it to multiple lines without braces or a label.

const some_variable_name = if (some_long_condition)
    some_long_result_if_condition_true
else
    some_long_result_if_condition_false;

While #3 is the closest to my preferred way, and allows for placing breakpoints normally, I don’t personally like breaking an if statement across multiple lines without braces. I know that formatters and Zig’s strictness largely protect against errors like Apple’s infamous “goto fail” vulnerability, but my preferred way is to use a block expression (i.e. omit the final semicolon of the block, but still be able to use braces).

const some_variable_name = if (some_long_condition) {
    some_long_result_if_condition_true
} else {
    some_long_result_if_condition_false
};

I was curious about others’ thoughts on this syntax, and if they feel it would be a welcome addition to the language?

My thoughts are that many people have spent time thinking about this related issue, and looking over the discussion, I don’t feel like I have much to add:

2 Likes

I really don’t like Rust’s ‘dangling expression’ as the result of a block, but I also don’t like Zig’s forced labels.

E.g. my compromise for returning a block result would look like this:

const bla = {
    const result = some complex expression;
    break result;
};

…and a label would only be required when skipping blocks, e.g.:

const bla = label: {
    const res1 = ...;
    const res2 = ...;
    if (blub) {
         break :label res1;
    } else {
         break :label res2;
    }
};

There are probably some hidden edge cases I’m missing which prevent this syntax, not sure.

For instance, it gets tricky because a regular break in a for/while loop will always break out of the loop instead of breaking out of the current block.

…maybe a different keyword would make sense instead, something more similar to return, maybe leave :wink:

6 Likes

I have no problem with using break to explicitly return the contents of a block expression. I prefer explicit return types to using a dangling statement at the end of a block as the return value. If not explicitly specified, the block type is void, which is very elegant and consistent.

Perhaps the only consideration is that there are no anonymous blocks; all blocks must have a name.

I used to think about this, but I now completely agree that the current approach is difficult to improve on.

Labels themselves are difficult to omit. In practice, it’s easy to accidentally introduce new code blocks during refactoring. Suppose that block expressions are allowed to return without labels. If we originally had a statement:

const bar = {
    var foo = if (condition) baz else break bla;
    foo += 1;
    break foo;
};

Suppose one day we develop a new preference for formatting conditionals, for example always using a block after an if condition, and while refactoring the code we accidentally make a mistake:

const bar = {
    var foo = if (condition) baz else {
        break bla;
    };
    foo += 1;
    break foo;
};

Because unlabeled breaks are allowed, the inadvertent introduction of a block changes the meaning of the statement: break bla now exits the new inner block instead of the outer one.
Enforcing labels can prevent this error.
Another possibility I’ve considered is to always retain the :, but omit the label name before the :, such as:

const bar = : {
    var foo = if (condition) baz else break : bla;
    foo += 1;
    break : foo;
};

However, I’m concerned about whether this usage might be ambiguous, as it’s clear that : is used not only before the label name but also to separate identifiers and types.
If keywords are the only way to eliminate ambiguity, I’d prefer mandatory label names for all labels. I think a blk: would be more concise than a mandatory keyword. In my current practice, I use blk: as a replacement for anonymous label names.
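For reference, here is a minimal sketch of the blk: convention in today's Zig, assuming 0.11 or later for the single-argument @intCast. The clampPercent helper is made up for illustration; the point is only that a labeled break keeps targeting the same block even when it sits inside a nested one.

```zig
const std = @import("std");

// Hypothetical helper, for illustration only: clamps a value into 0..100.
fn clampPercent(raw: i32) u8 {
    // `blk` is just a conventional name; any identifier works as a label.
    const clamped: i32 = blk: {
        if (raw < 0) {
            // The label makes the target explicit even from a nested block.
            break :blk 0;
        }
        if (raw > 100) break :blk 100;
        break :blk raw;
    };
    return @intCast(clamped);
}

test "labeled block yields a value" {
    try std.testing.expectEqual(@as(u8, 0), clampPercent(-5));
    try std.testing.expectEqual(@as(u8, 42), clampPercent(42));
    try std.testing.expectEqual(@as(u8, 100), clampPercent(250));
}
```

Adding or removing the braces around either if does not change which block the break leaves, which is the refactoring safety argued for above.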

3 Likes

This would make a common case extremely annoying:

while (iter.next()) |each| {
    if (someCondition(each)) {
        break;
    }
}

Under this proposal, the break would break out of the if instead of the while.

(edit - i see it was kinda considered in a later paragraph, my bad! i guess this just serves as an explicit example where the idea of break targeting the nearest parent is not great)
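To make the current behaviour concrete, here is a small sketch in today's Zig (the firstEven helper and its data are hypothetical). An unlabeled break inside a braced if leaves the enclosing while, and it can even carry a value when the loop is used as an expression with an else branch.

```zig
const std = @import("std");

// Hypothetical helper, for illustration only: index of the first even
// number in `items`, or null if there is none.
fn firstEven(items: []const u32) ?usize {
    var i: usize = 0;
    const found: ?usize = while (i < items.len) : (i += 1) {
        if (items[i] % 2 == 0) {
            // Unlabeled `break` leaves the `while`, not the `if` block.
            break i;
        }
    } else null; // Runs only when the loop ends without a `break`.
    return found;
}

test "unlabeled break targets the loop" {
    try std.testing.expectEqual(@as(?usize, 2), firstEven(&[_]u32{ 1, 3, 4, 5 }));
    try std.testing.expectEqual(@as(?usize, null), firstEven(&[_]u32{ 1, 3 }));
}
```

If an unlabeled break targeted the nearest block instead, the loop above would need a label just to keep working.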

2 Likes

break :{} value;
Rust has the turbofish, Zig has the lips.
;^)

4 Likes

Thinking idly about this, if break (or a block-specific keyword, like Hare’s yield) were used to return values from blocks without a label, it would create the following weird situation (Zig pseudocode):

// 1
const c = {
    // logic goes here
    if (condition) {
        yield value;
    }
    // more logic
};

// 2
const c = {
    // logic goes here
    if (condition)
        yield value;
    // more logic
};

These two snippets become subtly semantically different. And you’d end up using labels anyway to disambiguate it. (I don’t know how Hare deals with this)

2 Likes

I don’t know, removing curly braces changing the semantics isn’t that surprising.

My OCD actually triggers on the single line if-body without braces, that should clearly be a compiler error :wink: (and should only be allowed when if is used as expression)

2 Likes

Heh, same here. But what about when the body and the if are on the same line, e.g.:

if (!self.updated_focus) self.focus = 0;

For a long time, I always had braces around the body, but I’ve started liking this again after a long hiatus.

2 Likes

of course! why on earth make 3 lines of that.

I think it depends on many factors. Comments also change the way I write.
The nullable case reads badly. I don’t like Zig’s nullable syntax.

        if (!is_first_call and self.check_stop(.quiet)) return 0;

        // Evaluate.
        score = eval.evaluate(pos, false);

        // Fail high.
        if (score >= beta) return beta;

        // Raise alpha.
        if (score > alpha) alpha = score;

        // Too deep.
        if (ply >= max_search_depth) return score;

        // Probe.
        var tt_move: Move = .empty;
        if (self.tt_probe(key, 0, ply, alpha, beta, &tt_move)) |tt_score| {
            return tt_score;
        }

More tricky to read.

        if (!is_first_call and self.check_stop(.quiet)) return 0;
        score = eval.evaluate(pos, false);
        if (score >= beta) return beta;
        if (score > alpha) alpha = score;
        if (ply >= max_search_depth) return score;
        var tt_move: Move = .empty;
        if (self.tt_probe(key, 0, ply, alpha, beta, &tt_move)) |tt_score| return tt_score;

1 Like

That’s fine IMHO, I just really don’t like this formatting:

if (!self.updated_focus)
    self.focus = 0;

1 Like

Surprising or not, it makes control flow and data flow dependent on whether punctuation exists or not, which is not “Ziggy”.

4 Likes

Surprising that such a thing even exists, you forget a semicolon, your program breaks and for the compiler it’s all OK.

Anyway there’s also the switch(condition) alternative.

const a = switch (condition) {
    true => ...,
    false => ...,
};
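A filled-in version of that switch might look like the sketch below. The describe function and its strings are made up for illustration; the idea is just that each branch sits on its own line inside braces, with no label or break needed.

```zig
const std = @import("std");

// Hypothetical function and strings, for illustration only.
fn describe(is_ready: bool) []const u8 {
    // Switching on a bool must cover both `true` and `false`.
    const message: []const u8 = switch (is_ready) {
        true => "ready to proceed",
        false => "still waiting",
    };
    return message;
}

test "switch on a bool as a multi-line ternary" {
    try std.testing.expectEqualStrings("ready to proceed", describe(true));
    try std.testing.expectEqualStrings("still waiting", describe(false));
}
```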

What is the context?

Sorry. The block expressions where omitting the last semicolon indicates the value returned by the block:


It did exist, but it doesn’t anymore, and a proposal to bring it back was rejected.
Did you mean surprising that it existed in the past?

Yes, and surprising if some languages do it. At least surprising for me.

2 Likes

The fact that it existed in Zig was a surprise to me too. In hindsight I feel kind of foolish even creating this topic; I hadn’t realized I was a couple of years too late to a conversation that was already hashed out.

Beyond some topical experimentation, I didn’t get invested in the language until about version 0.12, so I am ignorant of some of this history on features/syntax that were removed.

In other languages, I agree, but Zig programs usually are written with auto-formatting, and then the indentation level clearly shows the scope, so it’s easy to understand while reading, it doesn’t matter if there are curly braces or not.

Actually, I’m one of those who really prefer Python’s indent level based syntax.

1 Like