1 Introduction
one-more-re-nightmare is a new regular expression compiler implemented in Common Lisp.
The aim is to generate code that will perform as well, if not better, than hand-written searching code. We believe we have achieved this aim; as finite state machine generation removes redundant tests that would not typically be removed by a human programmer, compilation is performed for specific array types, and the compiler can also make use of SIMD intrinsics for simple loops. We also intend to produce excellent analysis, to help with writing regular expressions that do the "right thing".
1.1 Prior work
The derivative approach was introduced by the late Janusz Brzozowski in (Brzozowski 1964).
The name of the library is due to the song One More Red Nightmare by the band King Crimson in 1974.
Ronald Johnston wrote on the APL\3000 compiler in (Johnston 1979), which produced specialised code for different array layouts on demand.
Scott Owens, John Reppy and Aaron Turon showed how to produce finite state machines that are very close to minimal in size in (Owens et al. 2009), using the derivative approach. This minimality was achieved by using additional rewrite rules, allowing the machine generation process to reuse more states, instead of producing more redundant states.
Jim Newton compiled finite state machines to Common Lisp code in (Newton et al. 2016), to implement regular type expressions.
We were first introduced to using the Common Lisp compiler as a backend for a just-in-time compiler by the Petalisp language by Marco Heisig (Heisig 2018).
Gilbert Baumann described how to implement submatching using derivatives in a currently unpublished paper.