Basically, for a given state I need to handle a certain subset of possible characters one way and any other character another way, where "any other character" may also be EOF. So here I’m emulating a goto to the end of the anything_else block by breaking out of the block when self.peek() is null.
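A minimal sketch of that pattern, assuming a byte-oriented tokenizer. The `Tokenizer`, `State`, and `stepCommentStart` names are hypothetical, and this `peek` is a simplified stand-in that returns a byte rather than a decoded codepoint:

```zig
const std = @import("std");

// Hypothetical names throughout; only the control flow matters here.
const State = enum { comment_start, comment_start_dash, comment };

const Tokenizer = struct {
    input: []const u8,
    pos: usize = 0,
    state: State = .comment_start,

    // Simplified stand-in: returns the next byte, or null at EOF.
    fn peek(self: *const Tokenizer) ?u8 {
        if (self.pos >= self.input.len) return null;
        return self.input[self.pos];
    }

    fn stepCommentStart(self: *Tokenizer) void {
        special: {
            // peek() is called once; null (EOF) leaves the block and
            // lands on the shared anything_else handling below.
            const c = self.peek() orelse break :special;
            if (c == '-') {
                self.pos += 1;
                self.state = .comment_start_dash;
                return;
            }
        }
        // anything_else: EOF or any byte other than '-'.
        self.state = .comment;
    }
};
```

The `break :special` plays the role of a forward goto: EOF and every non-hyphen byte end up on the same line after the block.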
Great question. I’ve looked into this a bit myself so I’d like to hear more people’s thoughts on this, too.
In your case, I’d say it’s unnecessary from the example that was provided. That said, I understand that examples can become arbitrarily more complex. Let’s take a look at your code first though.
That’s starting to act more like a goto and is an elegant way to unpack things and break out if the need arises.
In short, for replacing if statements or direct calls… I wouldn’t prefer block or goto as an escape. For breaking out of nested code, totally. It would be good to have someone pitch some examples and then we could workshop them together to find solutions. @dee0xeed always has some good examples from C.
I strongly recommend you read that. If you feel like it doesn’t answer your question, we can certainly stay here and discuss - just wanted to point out that we have existing resources on this.
The reason I didn’t go for this is that I wanted it to be clear that EOF is handled the same way as anything_else (in this case, any codepoint that’s not a hyphen), whereas in some other cases EOF is handled specially. If I wrote it the way you showed, it would be harder to tell whether EOF is being handled specially or is just part of anything_else.
Also, in some cases the handling of anything_else is not simply a switch to another state; it may involve emitting multiple tokens, etc. In such scenarios, I didn’t want to write that logic once for EOF and then again for the final anything_else case.
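To illustrate the duplication concern, here is a rough sketch in which anything_else emits two tokens and changes state. All names (`Token`, `State`, `stepMarkerOpen`, the fixed-size `emitted` buffer) are made up for the example and are not from my actual tokenizer:

```zig
const std = @import("std");

// Hypothetical token and state names, for illustration only.
const Token = enum { lt, bang, marker_open };
const State = enum { data, marker_open, marker };

const Tokenizer = struct {
    input: []const u8,
    pos: usize = 0,
    state: State = .marker_open,
    emitted: [8]Token = undefined,
    emitted_len: usize = 0,

    // Simplified stand-in: returns the next byte, or null at EOF.
    fn peek(self: *const Tokenizer) ?u8 {
        if (self.pos >= self.input.len) return null;
        return self.input[self.pos];
    }

    fn emit(self: *Tokenizer, token: Token) void {
        self.emitted[self.emitted_len] = token;
        self.emitted_len += 1;
    }

    fn stepMarkerOpen(self: *Tokenizer) void {
        special: {
            const c = self.peek() orelse break :special; // EOF
            if (c != '-') break :special; // any other byte
            self.pos += 1;
            self.emit(.marker_open);
            self.state = .marker;
            return;
        }
        // anything_else, written once and shared by EOF and other bytes.
        self.emit(.lt);
        self.emit(.bang);
        self.state = .data;
    }
};
```

Without the labeled block, the three anything_else lines would have to appear in both the EOF branch and the fallback branch, or be pulled out into a helper function.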
I hadn’t considered this, but in this case I’d have to call self.peek() multiple times, which means decoding the codepoint repeatedly. Here’s what peek looks like: