Probably too general of a question, for which I apologize, but I am hoping to write a lexer in Zig. In thinking of how it would be tackled, say you have a graph:
A → B
A → C
B → D
B → E
C → D
C → E
where A, B, C, D, and E are not characters, but representations of nodes in the graph (which, themselves could contain a graph within).
First would be to consume characters until some delimiter and compare against “A”, right? If that doesn’t match, fail. But, if it does, we then need to see if the next “token” matches “B” or “C”.
So then you check “B”, if it matches, good, then you move on to checking if “D” matches, if it works, great, if it doesn’t check “E”, else it fails.
Here is where my first question arises, before any section of the graph with parallel nodes, is one meant to “save” the position to jump back? I.e., in the aforementioned example, traveling A → B → D and A → B → E didn’t match. So, one would have to restart from A → C right?
My second question is, can it taken as an assumption that any well-defined grammars that have these parallel nodes ensured that those parallel nodes are unique? What I mean is, it is impossible for a string to satisfy multiple routes?
If the two things above are true: I want to minimize stored state as I don’t care about the the knowledge gained throughout the processing of a statement–just that it abides by the grammar.
I would need my current position. I fear I need an unbounded stack to store all the “stop points” along the graph to go back to. I think that’s it? I wish there was a way to not require the whole “history”.