Hello,
I’ve been working on a regex engine since December, with a similar philosophy to RE2.
This project initially started because I needed a regex analysis library, but I somehow derailed quite badly and ended up making a matching engine instead.
Here are a couple of design philosophies that shaped this project:
- Sensible defaults: the default automaton should be very small and fast, without having to cover all possible use cases. There should be a hierarchy of engines that are used as needed. The default engine shouldn’t even support capture groups.
- Zig’s comptime should be fully used while optimizing, e.g. to minimize integer sizes, infer optimal context lengths, etc.
- All automata and search algorithms should be highly resilient to untrusted input
- The entire API should work the same whether calls are being done at comptime or runtime.
- Interesting machine topologies that scale well as the number of compiled machines increases should be included
- Compiled machines should be fully immutable and the core matching API fully allocator-free
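
To make a few of these points concrete, here is a minimal sketch of a hand-built DFA for `ab*c`. It is not pzre's API: `Dfa`, `buildAbStarC`, and `isMatch` are hypothetical names made up for illustration. It only shows the general idea of picking the smallest state integer at comptime, keeping the machine immutable, and matching without an allocator.

```zig
const std = @import("std");

// Hypothetical illustration only; none of these names come from pzre.
fn Dfa(comptime state_count: usize) type {
    return struct {
        const Self = @This();

        // Smallest unsigned integer that can hold every state index,
        // chosen at comptime from the number of states.
        pub const State = std.math.IntFittingRange(0, state_count - 1);

        transitions: [state_count][256]State,
        accepting: [state_count]bool,

        // Full-input match. Takes the machine by const pointer and
        // never allocates.
        pub fn isMatch(self: *const Self, input: []const u8) bool {
            var state: State = 0;
            for (input) |byte| state = self.transitions[state][byte];
            return self.accepting[state];
        }
    };
}

// States: 0 = start, 1 = after 'a', 2 = accept, 3 = dead.
fn buildAbStarC() Dfa(4) {
    const D = Dfa(4);
    // Every transition defaults to the dead state.
    var table = [_][256]D.State{[_]D.State{3} ** 256} ** 4;
    table[0]['a'] = 1;
    table[1]['b'] = 1;
    table[1]['c'] = 2;
    return .{
        .transitions = table,
        .accepting = .{ false, false, true, false },
    };
}

test "same call at comptime and at runtime" {
    const machine = comptime buildAbStarC();

    // Four states fit in two bits.
    try std.testing.expect(@TypeOf(machine).State == u2);

    // Evaluated by the compiler.
    comptime std.debug.assert(machine.isMatch("abbc"));

    // Evaluated when the test runs.
    try std.testing.expect(machine.isMatch("abbc"));
    try std.testing.expect(!machine.isMatch("abx"));
}
```

Saving this to a file and running `zig test` on it should exercise both paths. A real engine would generate the table from a pattern rather than by hand; this sketch skips that part entirely.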
The engine is currently quite usable. I spent a good amount of time implementing the core for v0.1.0, tested it thoroughly, and thought through as many edge cases as I could, but there are probably still bugs somewhere.
Check it out: pzre
I’d be happy to hear your thoughts or receive any feedback.
Supported Zig versions
0.16.0
AI / LLM usage disclosure
Very minimal: some tests are generated. I don’t use AI autocompletion or AI-assisted code editors.