automaton

package

v0.0.0-...-d0be9ee Latest Latest Go to latest Published: Dec 10, 2015 License: Apache-2.0 Imports: 8 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/ironsweet/golucene

Links

Open Source Insights

Documentation ¶

Index ¶

Constants
type Automaton
- func MakeEmpty() *Automaton
- func (a *Automaton) IsAccept(state int) bool
- func (a *Automaton) String() string
type AutomatonBuilder
type AutomatonProvider
type CharacterRunAutomaton
- func NewCharacterRunAutomaton(a *Automaton) *CharacterRunAutomaton
type DaciukMihovAutomatonBuilder
type FrozenIntSet
- func (fis *FrozenIntSet) String() string
type IntPair
type Kind
type PointTransitionSet
- func (pts *PointTransitionSet) String() string
type PointTransitions
type PointTransitionsArray
- func (a PointTransitionsArray) Len() int
- func (a PointTransitionsArray) Less(i, j int) bool
- func (a PointTransitionsArray) Swap(i, j int)
type RegExp
- func NewRegExp(s string) *RegExp
- func NewRegExpWithFlag(s string, flags int) *RegExp
- func (re *RegExp) String() string
- func (re *RegExp) ToAutomaton() *Automaton
type RunAutomaton
- func (ra *RunAutomaton) String() string
type SortedIntSet
- func (sis *SortedIntSet) String() string
type StateList
type StateListNode
type StatePair
type Transition
- func (t *Transition) String() string
type TransitionList

Constants ¶

View Source

const (
	REGEXP_UNION         = Kind(1)
	REGEXP_CONCATENATION = Kind(2)
	REGEXP_INTERSECTION  = Kind(3)
	REGEXP_OPTIONAL      = Kind(4)
	REGEXP_REPEAT        = Kind(5)
	REGEXP_REPEAT_MIN    = Kind(6)
	REGEXP_REPEAT_MINMAX = Kind(7)
	REGEXP_COMPLEMENT    = Kind(8)
	REGEXP_CHAR          = Kind(9)
	REGEXP_CHAR_RANGE    = Kind(10)
	REGEXP_ANYCHAR       = Kind(11)
	REGEXP_EMPTY         = Kind(12)
	REGEXP_STRING        = Kind(13)
	REGEXP_ANYSTRING     = Kind(14)
	REGEXP_AUTOMATON     = Kind(15)
	REGEXP_INTERVAL      = Kind(16)
)

View Source

const (
	INTERSECTION = 0x0001 // &
	COMPLEMENT   = 0x0002 // ~
	EMPTY        = 0x0004 // #
	ANYSTRING    = 0x0008 // @
	AUTOMATON    = 0x0010 // <identifier>
	INTERVAL     = 0x0020 // <n-m>
	ALL          = 0xffff // enables all optional regexp syntax.
	NONE         = 0x0000 // enables no optional regexp syntax.

)

Syntax flags

View Source

const HASHMAP_CUTOVER = 30

View Source

const MIN_CODE_POINT = 0

Go doesn't have unicode.MinRune which should be 0

View Source

const TREE_MAP_CUTOVER = 30

If we hold more than this many states, we swtich from O(N*2) linear ops to O(N log(N)) TreeMap

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Automaton ¶

type Automaton struct {
	// contains filtered or unexported fields
}

Represents an automaton and all its states and transitions. States are integers and must be created using {@link #createState}. Mark a state as an accept state using {@link #setAccept}. Add transitions using {@link #addTransition}. Each state must have all of its transitions added at once; if this is too restrictive then use {@link Automaton.Builder} instead. State 0 is always the initial state. Once a state is finished, either because you've starting adding transitions to another state or you call {@link #finishState}, then that states transitions are sorted (first by min, then max, then dest) and reduced (transitions with adjacent labels going to the same dest are combined).

func MakeEmpty ¶

func MakeEmpty() *Automaton

Returns a new (deterministic) automaton with the empty language.

func (*Automaton) IsAccept ¶

func (a *Automaton) IsAccept(state int) bool

Returns true if this state is an accept state.

func (*Automaton) String ¶

func (a *Automaton) String() string

type AutomatonBuilder ¶

type AutomatonBuilder struct {
	// contains filtered or unexported fields
}

type AutomatonProvider ¶

type AutomatonProvider func(name string) *Automaton

Automaton provider for RegExp.

type CharacterRunAutomaton ¶

type CharacterRunAutomaton struct {
	*RunAutomaton
}

Automaton representation for matching []char

func NewCharacterRunAutomaton ¶

func NewCharacterRunAutomaton(a *Automaton) *CharacterRunAutomaton

type DaciukMihovAutomatonBuilder ¶

type DaciukMihovAutomatonBuilder struct {
}

type FrozenIntSet ¶

type FrozenIntSet struct {
	// contains filtered or unexported fields
}

func (*FrozenIntSet) String ¶

func (fis *FrozenIntSet) String() string

type IntPair ¶

type IntPair struct {
	// contains filtered or unexported fields
}

type Kind ¶

type Kind int

type PointTransitionSet ¶

type PointTransitionSet struct {
	// contains filtered or unexported fields
}

func (*PointTransitionSet) String ¶

func (pts *PointTransitionSet) String() string

type PointTransitions ¶

type PointTransitions struct {
	// contains filtered or unexported fields
}

Holds all transitions that start on this int point, or end at this point-1

type PointTransitionsArray ¶

type PointTransitionsArray []*PointTransitions

func (PointTransitionsArray) Len ¶

func (a PointTransitionsArray) Len() int

func (PointTransitionsArray) Less ¶

func (a PointTransitionsArray) Less(i, j int) bool

func (PointTransitionsArray) Swap ¶

func (a PointTransitionsArray) Swap(i, j int)

type RegExp ¶

type RegExp struct {
	// contains filtered or unexported fields
}

Regular Expression extension to Automaton.

Regular expressions are built from the following abstract syntax:

	regexp	::= unionexp
 			|
	unionexp ::= interexp | unionexp 	(union)
	 		| interexp
	interexp ::= concatexp & interexp 	(intersection) 						[OPTIONAL]
 			| concatexp
	concatexp ::= repeatexp concatexp	 (concatenation)
 			| repeatexp
	repeatexp ::= repeatexp ? 			(zero or one occurrence)
 			| repeatexp * 				(zero or more occurrences)
 			| repeatexp + 				(one or more occurrences)
 			| repeatexp {n} 			(n occurrences)
 			| repeatexp {n,} 			(n or more occurrences)
 			| repeatexp {n,m} 			(n to m occurrences, including both)
 			| complexp
	complexp ::= ~ complexp 			(complement) 						[OPTIONAL]
 			| charclassexp
	charclassexp ::= [ charclasses ] 	(character class)
 			| [^ charclasses ] 			(negated character class)
 			| simpleexp
	charclasses ::= charclass charclasses
 			| charclass
	charclass ::= charexp - charexp 	(character range, including end-points)
 			| charexp
	simpleexp ::= charexp
 			| . 						(any single character)
 			| # 						(the empty language) 				[OPTIONAL]
 			| @ 						(any string) 						[OPTIONAL]
			| " <Unicode string without double-quotes>  " (a string)
 			| ( ) 						(the empty string)
 			| ( unionexp ) 				(precedence override)
 			| < <identifier> > 			(named automaton) 					[OPTIONAL]
			| <n-m> 					(numerical interval) 				[OPTIONAL]
	charexp ::= <Unicode character> 	(a single non-reserved character)
 			| \ <Unicode character>  	(a single character)

The productions marked [OPTIONAL] are only allowed if specified by the syntax flags passed to the RegExp constructor. The reserved characters used in the (enabled) syntax must be escaped with backslash (\) or double-quotes ("..."). (In contrast to other regexp syntaxes, this is required also in character classes.) Be aware that dash (-) has a special meaning in charclass expressions. An identifier is a string not containing right angle bracket (>) or dash (-). Numerical intervals are specified by non-negative decimal integers and include both end points, and if n and m have the same number of digits, then the conforming strings must have that length (i.e. prefixed by 0's).

func NewRegExp ¶

func NewRegExp(s string) *RegExp

Constructs new RegExp from a string. Same as RegExp(s, ALL)

func NewRegExpWithFlag ¶

func NewRegExpWithFlag(s string, flags int) *RegExp

Constructs new RegExp from a string.

func (*RegExp) String ¶

func (re *RegExp) String() string

Constructs string from parsed regular expression

func (*RegExp) ToAutomaton ¶

func (re *RegExp) ToAutomaton() *Automaton

Constructs new Automaton from this RegExp. Same as ToAutomaton(nil) (empty automaton map).

type RunAutomaton ¶

type RunAutomaton struct {
	// contains filtered or unexported fields
}

Finite-state automaton with fast run operation.

func (*RunAutomaton) String ¶

func (ra *RunAutomaton) String() string

type SortedIntSet ¶

type SortedIntSet struct {
	// contains filtered or unexported fields
}

Just holds a set of []int states, plus a corresponding []int count per state. Used by determinize().

I have to disable hashCode and use string key to mimic Lucene's custom hashing function here.

func (*SortedIntSet) String ¶

func (sis *SortedIntSet) String() string

type StateList ¶

type StateList struct {
	// contains filtered or unexported fields
}

type StateListNode ¶

type StateListNode struct {
	// contains filtered or unexported fields
}

type StatePair ¶

type StatePair struct {
	// contains filtered or unexported fields
}

Pair of states.

type Transition ¶

type Transition struct {
	// contains filtered or unexported fields
}

Holds one transition from an {@link Automaton}. This is typically used temporarily when iterating through transitions by invoking {@link Automaton#initTransition} and {@link Automaton#getNextTransition}.

func (*Transition) String ¶

func (t *Transition) String() string

type TransitionList ¶

type TransitionList struct {
	// contains filtered or unexported fields
}

Simple custom []*Transition

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL