Documentation ¶
Index ¶
Constants ¶
const ( KNil = iota KRune KClass KAssert KWild )
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Expression ¶
type Node ¶
type Node struct { E []*Edge // Out-edges. Id int // Index number. Scoped to a family. Accept int // True if this is an accepting state. Set []int // The NFA nodes represented by a DFA node. }
func BuildNfa ¶
func BuildNfa[E Expression](expressions []E) ([]*Node, error)
BuildNfa Regex -> NFA (Nondeterministic Finite Automaton) We cannot have our alphabet be all Unicode characters. Instead, we compute an alphabet for each regex:
Singles: we add single runes used in the regex: any rune not in a range. These are held in `singles`.
Ranges: entire ranges become elements of the alphabet. If ranges in the same expression overlap, we break them up into non-overlapping ranges. The generated code checks singles before ranges, so there's no need to break up a range if it contains a single. These are maintained in sorted order in `lim`.
Wild: we add an element representing all other runes.
e.g. the alphabet of /[0-9]*[Ee][2-5]*/ is singles: { E, e }, lim: { [0-1], [2-5], [6-9] } and the wild element.