2 Interface

2.1 Syntax

We currently use our own syntax for regular expressions, since the POSIX syntax does not allow for expressing complements or intersections of regular expressions.

E ::= E | E            union
    | E & E            intersection
    | E E              concatenate
    | ¬E               complement
    | ~E               complement
    | E*               zero or more repeats
    | E+               one or more repeats
    | E{i}             repeat
    | «E»              submatch
    | (E)              change precedence
    | [r]              character range
    | [¬r]             complement ranges
    | $                every character
    | c                literal character
    | <empty string>

r ::= c                single character
    | c-c              character range

i ::= <integer>

c ::= <single character>

Rules higher on this list bind more loosely (have lower precedence) than rules lower on the list. For example, the expression ab&cd|ef is equivalent to ((ab)&(cd))|(ef).
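As a quick check of the precedence rules, the sketch below compares the example expression with its fully parenthesised form. It assumes the library is already loaded and that first-match is exported from a package named one-more-re-nightmare; same-first-match-p and the input string are made up for this illustration.

```lisp
;; Assumption: the library is loaded and exports FIRST-MATCH from a
;; package named ONE-MORE-RE-NIGHTMARE.
(defun same-first-match-p (string)
  "Return true if both spellings of the expression give EQUALP results on STRING."
  (equalp (one-more-re-nightmare:first-match "ab&cd|ef" string)
          (one-more-re-nightmare:first-match "((ab)&(cd))|(ef)" string)))

;; Hypothetical input; if the two spellings parse to the same expression,
;; this call returns true.
(same-first-match-p "abcdef")
```

Neither spelling contains «», so both produce only the two whole-match registers and the vectors can be compared directly.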

2.2 Matching

Note that one-more-re-nightmare can avoid a cache lookup (which involves acquiring a lock and searching a hash table) if the regular expression is a literal string or a constant variable bound to a string.

first-match regular-expression string &key start end    Function

first-string-match regular-expression string &key start end    Function

Find the first match for regular-expression in string between start and end.

first-match returns nil if there is no match. Otherwise it returns a simple vector in which each element is a register. The first two registers are always the start and end of the whole match; subsequent registers are the start and end of each submatch, in pairs. A register is either a bounding index of string, or nil when the corresponding submatch did not take part in the match.

first-string-match returns nil if there is no match. Otherwise it returns a simple vector in which each element is either a fresh string or nil (when there is no submatch).


Examples

(first-match "[0-9]([0-9]| )+" "Phone: 632 3003")
;; => #(7 15)

(first-string-match "[0-9]([0-9]| )+" "Phone: 632 3003")
;; => #("632 3003")

 

(first-match
 "«[0-9]+»x«[0-9]+»|«[0-9]+»p"
 "Foobar 1920x1080 17-inch display")
;; => #(7 16 7 11 12 16 NIL NIL)

(first-string-match
 "«[0-9]+»x«[0-9]+»|«[0-9]+»p"
 "Foobar 1920x1080 17-inch display")
;; => #("1920x1080" "1920" "1080" NIL)
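To make the register layout concrete, here is a minimal sketch in plain Common Lisp that turns a register vector, shaped like the ones first-match returns, into the substrings that first-string-match reports. registers-to-strings is a hypothetical helper written for this illustration; it is not part of the library and says nothing about how the library builds its results.

```lisp
(defun registers-to-strings (registers string)
  "Read REGISTERS as consecutive (start end) pairs and extract each substring
of STRING. A pair containing NIL (a submatch that took no part) yields NIL."
  (let ((result (make-array (floor (length registers) 2)
                            :initial-element nil)))
    (loop for i from 0 below (length registers) by 2
          for start = (svref registers i)
          for end = (svref registers (1+ i))
          when (and start end)
            do (setf (svref result (floor i 2))
                     (subseq string start end)))
    result))

;; Register vector copied from the first-match example above.
(registers-to-strings #(7 16 7 11 12 16 nil nil)
                      "Foobar 1920x1080 17-inch display")
;; => #("1920x1080" "1920" "1080" NIL)
```

The first pair always describes the whole match, so the first element of the result is the full matched text.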

all-matches regular-expression string &key start end    Function

all-string-matches regular-expression string &key start end    Function

Find all matches for regular-expression in string between start and end.

Both functions return a list of matches; all-matches represents matches as first-match does, and all-string-matches represents matches as first-string-match does.


Examples

(all-matches
 "«[0-9]+»x«[0-9]+»|«[0-9]+»p"
 "Foobar 1920x1080 17-inch display or Quux 19-inch 720p display?")
;; => (#(7 16 7 11 12 16 NIL NIL) #(49 53 NIL NIL NIL NIL 49 52))

(all-string-matches
 "«[0-9]+»x«[0-9]+»|«[0-9]+»p"
 "Foobar 1920x1080 17-inch display or Quux 19-inch 720p display?")
;; => (#("1920x1080" "1920" "1080" NIL) #("720p" NIL NIL "720"))

do-matches ((&rest registers) regular-expression string &key start end) &body body    Macro

do-matches iterates over all matches for regular-expression in string, evaluating body once per match. The variables named in registers are bound to the registers produced, as described for first-match.

It is possible to provide fewer variables than registers in the regular expression, but an error will be signalled if there are more variables than registers.
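A short usage sketch may help here. It binds only the two whole-match registers, which is fewer than the expression produces, and collects the matched text. It assumes the library is already loaded and that do-matches is exported from a package named one-more-re-nightmare; matched-substrings is a hypothetical function introduced for this example.

```lisp
;; Assumption: the library is loaded and exports DO-MATCHES from a
;; package named ONE-MORE-RE-NIGHTMARE.
(defun matched-substrings (string)
  "Collect the text of every match in STRING, ignoring submatch registers."
  (let ((results '()))
    ;; Only the first two registers (whole-match start and end) are bound.
    ;; The expression produces eight registers, so binding fewer is allowed.
    (one-more-re-nightmare:do-matches ((start end)
                                       "«[0-9]+»x«[0-9]+»|«[0-9]+»p"
                                       string)
      (push (subseq string start end) results))
    (nreverse results)))

(matched-substrings
 "Foobar 1920x1080 17-inch display or Quux 19-inch 720p display?")
;; => ("1920x1080" "720p"), going by the all-string-matches example
```

Unlike all-string-matches, this form allocates strings only for the registers the body actually uses.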

2.3 Compiling

The compiler can be run manually when the regular expression is not known at compile time and the code cache takes too long to search. (The latter can happen when many threads access the code cache and the time taken searching is sufficiently short, as lookups currently grab a global lock.)

compiled-regular-expression Class

An object representing a compiled regular expression. An instance of this class can be provided as a regular expression to all the searching functions, instead of a string.

compile-regular-expression expression    Function

Compile a regular expression, returning an instance of compiled-regular-expression.

An error is signalled if the expression has invalid syntax.
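The following sketch shows the intended use: an expression assembled at run time is compiled once, and the resulting object is passed to a searching function in place of a string. It assumes the library is already loaded and exports these names from a package named one-more-re-nightmare; make-resolution-matcher and *resolution* are hypothetical names chosen for the example.

```lisp
;; Assumption: the library is loaded and exports these names from a
;; package named ONE-MORE-RE-NIGHTMARE.
(defun make-resolution-matcher (separator)
  "Build the expression text at run time, then compile it once.
SEPARATOR is a one-character string such as \"x\"."
  (one-more-re-nightmare:compile-regular-expression
   (concatenate 'string "«[0-9]+»" separator "«[0-9]+»")))

;; Compile once and keep the object around for repeated searches.
(defvar *resolution* (make-resolution-matcher "x"))

;; The compiled object goes where a string expression would otherwise go.
(one-more-re-nightmare:first-string-match
 *resolution* "Foobar 1920x1080 17-inch display")
```

Keeping the compiled object in a variable means later searches do not need to consult the code cache for this expression.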