Documentation ¶
Overview ¶
Package ahocorasick implements the Aho-Corasick string searching algorithm in Go.
The algorithm is implemented using a double array trie for increased access speed and reduced memory consumption.
The algorithm uses an alphabet size of 256, so can only be used to match byte patterns.
Index ¶
Examples ¶
Constants ¶
const ( AlphabetSize int64 = 256 // The size of the alphabet is fixed to the size of a byte. RootState int64 = 0 // The root state of the trie is always 0. EmptyCell int64 = -1 // Represents an unused cell. DefaultBase int64 = 0 // The default base for new states. )
const MagicNumber int32 = 0x45495254
Variables ¶
This section is empty.
Functions ¶
func DecodeByte ¶
func EncodeByte ¶
EncodeByte optimizes for ASCII text by shifting to 0x41 ('A'). Also adds one to avoid byte == 0.
func ReadStrings ¶
Read string patterns from a text file, one pattern on each line.
Types ¶
type Match ¶
type Match struct {
// contains filtered or unexported fields
}
Represents a matched pattern.
func (*Match) MatchString ¶
Just to make working with strings a little more comfortable.
type Trie ¶
type Trie struct {
// contains filtered or unexported fields
}
Trie implementing the Aho-Corasick algorithm. Uses two arrays (base and check) for transitions (as described by Aho).
A transition in the trie from state s to state t on symbol c is described by:
base[s] + c = t check[t] = s
Note that the symbol c is actually stored as c + 1 in this implementation for easier handling of transition on 0.
func (*Trie) MatchFirst ¶
Same as Match, but returns immediately after the first matched pattern.
func (*Trie) MatchString ¶
Helper method to make matching strings a little more comfortable.
Example ¶
trie := NewTrieBuilder(). AddStrings([]string{"hers", "his", "he", "she"}). Build() matches := trie.MatchString("she is here") fmt.Println(len(matches))
Output: 3
func (*Trie) MatchStringFirst ¶
Helper method to make matching a string a little more comfortable.
func (*Trie) NumPatterns ¶
type TrieBuilder ¶
type TrieBuilder struct {
// contains filtered or unexported fields
}
A TrieBuilder must be used to properly build Tries.
func (*TrieBuilder) AddPattern ¶
func (tb *TrieBuilder) AddPattern(pattern []byte) *TrieBuilder
Add a new pattern to be built into the resulting Trie.
func (*TrieBuilder) AddPatterns ¶
func (tb *TrieBuilder) AddPatterns(patterns [][]byte) *TrieBuilder
A helper method to make adding multiple patterns a little more comfortable.
func (*TrieBuilder) AddString ¶
func (tb *TrieBuilder) AddString(pattern string) *TrieBuilder
A helper method to make adding a string pattern more comfortable.
func (*TrieBuilder) AddStrings ¶
func (tb *TrieBuilder) AddStrings(patterns []string) *TrieBuilder
A helper method to make adding multiple string patterns a little more comfortable.
func (*TrieBuilder) Build ¶
func (tb *TrieBuilder) Build() *Trie
Build the trie.
Example ¶
builder := NewTrieBuilder() builder.AddPattern([]byte{0x44, 0x22, 0x31, 0x52, 0x32, 0x00, 0x01, 0x01}) builder.AddStrings([]string{"hello", "world"}) trie := builder.Build() fmt.Println(len(trie.MatchString("hello!")))
Output: 1
type TrieGrapher ¶
type TrieGrapher struct {
// contains filtered or unexported fields
}
A TrieGrapher is used to output a Trie in the DOT graph description language.
func (*TrieGrapher) DrawFailLinks ¶
func (tg *TrieGrapher) DrawFailLinks(b bool) *TrieGrapher
Toggle inclusion of fail links in the graph.
func (*TrieGrapher) Graph ¶
func (tg *TrieGrapher) Graph(path string) error
Output the DOT graph to a file.
Example ¶
trie := NewTrieBuilder(). AddStrings([]string{"his", "hers", "he", "she"}). Build() grapher := NewTrieGrapher(trie).DrawFailLinks(true) grapher.Graph("trie.dot") if _, err := exec.LookPath("dot"); err == nil { if err := exec.Command("dot", "-Tpng", "-o", "trie.png", "trie.dot").Run(); err != nil { fmt.Println(err) } else { fmt.Println("OK") } // Output: OK } else {
Output: