Documentation ¶
Index ¶
- Constants
- type Automaton
- type AutomatonBuilder
- type AutomatonProvider
- type CharacterRunAutomaton
- type DaciukMihovAutomatonBuilder
- type FrozenIntSet
- type IntPair
- type Kind
- type PointTransitionSet
- type PointTransitions
- type PointTransitionsArray
- type RegExp
- type RunAutomaton
- type SortedIntSet
- type StateList
- type StateListNode
- type StatePair
- type Transition
- type TransitionList
Constants ¶
const ( REGEXP_UNION = Kind(1) REGEXP_CONCATENATION = Kind(2) REGEXP_INTERSECTION = Kind(3) REGEXP_OPTIONAL = Kind(4) REGEXP_REPEAT = Kind(5) REGEXP_REPEAT_MIN = Kind(6) REGEXP_REPEAT_MINMAX = Kind(7) REGEXP_COMPLEMENT = Kind(8) REGEXP_CHAR = Kind(9) REGEXP_CHAR_RANGE = Kind(10) REGEXP_ANYCHAR = Kind(11) REGEXP_EMPTY = Kind(12) REGEXP_STRING = Kind(13) REGEXP_ANYSTRING = Kind(14) REGEXP_AUTOMATON = Kind(15) REGEXP_INTERVAL = Kind(16) )
const ( INTERSECTION = 0x0001 // & COMPLEMENT = 0x0002 // ~ EMPTY = 0x0004 // # ANYSTRING = 0x0008 // @ AUTOMATON = 0x0010 // <identifier> INTERVAL = 0x0020 // <n-m> ALL = 0xffff // enables all optional regexp syntax. NONE = 0x0000 // enables no optional regexp syntax. )
Syntax flags
const HASHMAP_CUTOVER = 30
const MIN_CODE_POINT = 0
Go doesn't have unicode.MinRune which should be 0
const TREE_MAP_CUTOVER = 30
If we hold more than this many states, we swtich from O(N*2) linear ops to O(N log(N)) TreeMap
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Automaton ¶
type Automaton struct {
// contains filtered or unexported fields
}
Represents an automaton and all its states and transitions. States are integers and must be created using {@link #createState}. Mark a state as an accept state using {@link #setAccept}. Add transitions using {@link #addTransition}. Each state must have all of its transitions added at once; if this is too restrictive then use {@link Automaton.Builder} instead. State 0 is always the initial state. Once a state is finished, either because you've starting adding transitions to another state or you call {@link #finishState}, then that states transitions are sorted (first by min, then max, then dest) and reduced (transitions with adjacent labels going to the same dest are combined).
func MakeEmpty ¶
func MakeEmpty() *Automaton
Returns a new (deterministic) automaton with the empty language.
type AutomatonBuilder ¶
type AutomatonBuilder struct {
// contains filtered or unexported fields
}
type AutomatonProvider ¶
Automaton provider for RegExp.
type CharacterRunAutomaton ¶
type CharacterRunAutomaton struct {
*RunAutomaton
}
Automaton representation for matching []char
func NewCharacterRunAutomaton ¶
func NewCharacterRunAutomaton(a *Automaton) *CharacterRunAutomaton
type DaciukMihovAutomatonBuilder ¶
type DaciukMihovAutomatonBuilder struct { }
type FrozenIntSet ¶
type FrozenIntSet struct {
// contains filtered or unexported fields
}
func (*FrozenIntSet) String ¶
func (fis *FrozenIntSet) String() string
type PointTransitionSet ¶
type PointTransitionSet struct {
// contains filtered or unexported fields
}
func (*PointTransitionSet) String ¶
func (pts *PointTransitionSet) String() string
type PointTransitions ¶
type PointTransitions struct {
// contains filtered or unexported fields
}
Holds all transitions that start on this int point, or end at this point-1
type PointTransitionsArray ¶
type PointTransitionsArray []*PointTransitions
func (PointTransitionsArray) Len ¶
func (a PointTransitionsArray) Len() int
func (PointTransitionsArray) Less ¶
func (a PointTransitionsArray) Less(i, j int) bool
func (PointTransitionsArray) Swap ¶
func (a PointTransitionsArray) Swap(i, j int)
type RegExp ¶
type RegExp struct {
// contains filtered or unexported fields
}
Regular Expression extension to Automaton.
Regular expressions are built from the following abstract syntax:
regexp ::= unionexp | unionexp ::= interexp | unionexp (union) | interexp interexp ::= concatexp & interexp (intersection) [OPTIONAL] | concatexp concatexp ::= repeatexp concatexp (concatenation) | repeatexp repeatexp ::= repeatexp ? (zero or one occurrence) | repeatexp * (zero or more occurrences) | repeatexp + (one or more occurrences) | repeatexp {n} (n occurrences) | repeatexp {n,} (n or more occurrences) | repeatexp {n,m} (n to m occurrences, including both) | complexp complexp ::= ~ complexp (complement) [OPTIONAL] | charclassexp charclassexp ::= [ charclasses ] (character class) | [^ charclasses ] (negated character class) | simpleexp charclasses ::= charclass charclasses | charclass charclass ::= charexp - charexp (character range, including end-points) | charexp simpleexp ::= charexp | . (any single character) | # (the empty language) [OPTIONAL] | @ (any string) [OPTIONAL] | " <Unicode string without double-quotes> " (a string) | ( ) (the empty string) | ( unionexp ) (precedence override) | < <identifier> > (named automaton) [OPTIONAL] | <n-m> (numerical interval) [OPTIONAL] charexp ::= <Unicode character> (a single non-reserved character) | \ <Unicode character> (a single character)
The productions marked [OPTIONAL] are only allowed if specified by the syntax flags passed to the RegExp constructor. The reserved characters used in the (enabled) syntax must be escaped with backslash (\) or double-quotes ("..."). (In contrast to other regexp syntaxes, this is required also in character classes.) Be aware that dash (-) has a special meaning in charclass expressions. An identifier is a string not containing right angle bracket (>) or dash (-). Numerical intervals are specified by non-negative decimal integers and include both end points, and if n and m have the same number of digits, then the conforming strings must have that length (i.e. prefixed by 0's).
func NewRegExpWithFlag ¶
Constructs new RegExp from a string.
func (*RegExp) ToAutomaton ¶
Constructs new Automaton from this RegExp. Same as ToAutomaton(nil) (empty automaton map).
type RunAutomaton ¶
type RunAutomaton struct {
// contains filtered or unexported fields
}
Finite-state automaton with fast run operation.
func (*RunAutomaton) String ¶
func (ra *RunAutomaton) String() string
type SortedIntSet ¶
type SortedIntSet struct {
// contains filtered or unexported fields
}
Just holds a set of []int states, plus a corresponding []int count per state. Used by determinize().
I have to disable hashCode and use string key to mimic Lucene's custom hashing function here.
func (*SortedIntSet) String ¶
func (sis *SortedIntSet) String() string
type StateListNode ¶
type StateListNode struct {
// contains filtered or unexported fields
}
type StatePair ¶
type StatePair struct {
// contains filtered or unexported fields
}
Pair of states.
type Transition ¶
type Transition struct {
// contains filtered or unexported fields
}
Holds one transition from an {@link Automaton}. This is typically used temporarily when iterating through transitions by invoking {@link Automaton#initTransition} and {@link Automaton#getNextTransition}.
func (*Transition) String ¶
func (t *Transition) String() string
type TransitionList ¶
type TransitionList struct {
// contains filtered or unexported fields
}
Simple custom []*Transition