Documentation ¶
Index ¶
- Constants
- func AddMatcherToGroup(g MatcherGroup, matcher Matcher, value uint32) error
- func CompositeMatches(matches [][]uint32) []uint32
- func CompositeMatchesReverse(matches [][]uint32) []uint32
- func MemHash(seed uint32, input string) uint32
- func RollingHash(hash uint32, input string) uint32
- func ToDomain(pattern string) (string, error)
- type ACAutomatonMatcherGroup
- func (ac *ACAutomatonMatcherGroup) AddDomainMatcher(matcher DomainMatcher, value uint32)
- func (ac *ACAutomatonMatcherGroup) AddFullMatcher(matcher FullMatcher, value uint32)
- func (ac *ACAutomatonMatcherGroup) AddSubstrMatcher(matcher SubstrMatcher, value uint32)
- func (ac *ACAutomatonMatcherGroup) Build() error
- func (ac *ACAutomatonMatcherGroup) Match(input string) []uint32
- func (ac *ACAutomatonMatcherGroup) MatchAny(input string) bool
- type DomainMatcher
- type DomainMatcherGroup
- type FullMatcher
- type FullMatcherGroup
- type IndexMatcher
- type LinearIndexMatcher
- type Matcher
- type MatcherGroup
- type MatcherGroupForAll
- type MatcherGroupForDomain
- type MatcherGroupForFull
- type MatcherGroupForRegex
- type MatcherGroupForSubstr
- type MphIndexMatcher
- type MphMatcherGroup
- func (g *MphMatcherGroup) AddDomainMatcher(matcher DomainMatcher, value uint32)
- func (g *MphMatcherGroup) AddFullMatcher(matcher FullMatcher, value uint32)
- func (g *MphMatcherGroup) Build() error
- func (g *MphMatcherGroup) Lookup(rollingHash uint32, input string) uint32
- func (g *MphMatcherGroup) Match(input string) []uint32
- func (g *MphMatcherGroup) MatchAny(input string) bool
- type RegexMatcher
- type SimpleMatcherGroup
- type SubstrMatcher
- type SubstrMatcherGroup
- type Type
Constants ¶
const PrimeRK = 16777619
PrimeRK is the prime base used in the Rabin-Karp algorithm.
Variables ¶
This section is empty.
Functions ¶
func AddMatcherToGroup ¶
func AddMatcherToGroup(g MatcherGroup, matcher Matcher, value uint32) error
AddMatcherToGroup is a helper function that tries to add a Matcher to any kind of MatcherGroup. It returns an error if the MatcherGroup does not accept the provided Matcher's type. This function is provided to help write code that tests a MatcherGroup.
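The dispatch described above can be pictured as a type switch on the matcher combined with interface assertions on the group. The following is a minimal sketch under that assumption, not the package's implementation; `fullMatcher`, `domainMatcher`, `groupForFull`, `groupForDomain`, `fullOnlyGroup` and `addToGroup` are hypothetical stand-ins defined only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the package's matcher types.
type fullMatcher string
type domainMatcher string

// Hypothetical stand-ins for MatcherGroupForFull and MatcherGroupForDomain.
type groupForFull interface {
	AddFullMatcher(m fullMatcher, value uint32)
}
type groupForDomain interface {
	AddDomainMatcher(m domainMatcher, value uint32)
}

// fullOnlyGroup accepts only full matchers.
type fullOnlyGroup struct{ values map[fullMatcher]uint32 }

func (g *fullOnlyGroup) AddFullMatcher(m fullMatcher, value uint32) { g.values[m] = value }

// addToGroup adds matcher to g when g implements the Add method for the
// matcher's type, and reports an error otherwise.
func addToGroup(g interface{}, matcher interface{}, value uint32) error {
	switch m := matcher.(type) {
	case fullMatcher:
		if fg, ok := g.(groupForFull); ok {
			fg.AddFullMatcher(m, value)
			return nil
		}
	case domainMatcher:
		if dg, ok := g.(groupForDomain); ok {
			dg.AddDomainMatcher(m, value)
			return nil
		}
	}
	return errors.New("matcher type not accepted by this group")
}

func main() {
	g := &fullOnlyGroup{values: map[fullMatcher]uint32{}}
	fmt.Println(addToGroup(g, fullMatcher("example.com"), 1) == nil)   // true
	fmt.Println(addToGroup(g, domainMatcher("example.com"), 2) == nil) // false
}
```

Checking the returned error in a test makes it easy to confirm which matcher types a given group supports.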
func CompositeMatches ¶
func CompositeMatches(matches [][]uint32) []uint32
CompositeMatches flattens the matches slice to produce a single slice of matched indices. It is designed to avoid new memory allocations where possible.
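As an illustration of the flattening behaviour, here is a minimal sketch. The function name `flatten` is hypothetical, and the allocation strategy shown (returning early for zero or one group, otherwise allocating once) is an assumption about how allocations could be avoided, not a description of the package's code.

```go
package main

import "fmt"

// flatten concatenates the per-group results into one slice.
// With zero or one group it returns without allocating.
func flatten(matches [][]uint32) []uint32 {
	switch len(matches) {
	case 0:
		return nil
	case 1:
		return matches[0]
	}
	total := 0
	for _, m := range matches {
		total += len(m)
	}
	result := make([]uint32, 0, total) // single allocation
	for _, m := range matches {
		result = append(result, m...)
	}
	return result
}

func main() {
	fmt.Println(flatten([][]uint32{{1, 2}, {3}, {4, 5}})) // [1 2 3 4 5]
}
```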
func CompositeMatchesReverse ¶
func CompositeMatchesReverse(matches [][]uint32) []uint32
CompositeMatchesReverse flattens the matches slice to produce a single slice of matched indices. It is designed so that:
- All matchers are concatenated in reverse order, so the matcher that matches further ranks higher.
- Indices within the same matcher keep their original order.
- New memory allocations are avoided where possible.
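The reverse ordering can be sketched as follows. `flattenReverse` is a hypothetical name, and the code is only an illustration of the ordering rules listed above: groups are visited from last to first, while each group's own indices stay in order.

```go
package main

import "fmt"

// flattenReverse concatenates the groups from last to first while keeping
// the order of indices inside each group.
func flattenReverse(matches [][]uint32) []uint32 {
	switch len(matches) {
	case 0:
		return nil
	case 1:
		return matches[0]
	}
	total := 0
	for _, m := range matches {
		total += len(m)
	}
	result := make([]uint32, 0, total)
	for i := len(matches) - 1; i >= 0; i-- {
		result = append(result, matches[i]...)
	}
	return result
}

func main() {
	fmt.Println(flattenReverse([][]uint32{{1, 2}, {3}, {4, 5}})) // [4 5 3 1 2]
}
```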
func MemHash ¶
func MemHash(seed uint32, input string) uint32
MemHash is the hash function used by Go maps. It uses available hardware instructions (it behaves as aeshash if AES instructions are available). With different seeds, each MemHash<seed> acts as a distinct hash function.
func RollingHash ¶
func RollingHash(hash uint32, input string) uint32
RollingHash calculates the rolling murmurHash of the given string based on a provided suffix hash.
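To show what "based on a provided suffix hash" can mean, the sketch below assumes the hash is a polynomial in PrimeRK evaluated from the last byte of the input to the first. That formula is an assumption made for illustration, and the package's actual mixing step may differ; `rollingHash` and `primeRK` are local, hypothetical names.

```go
package main

import "fmt"

const primeRK = 16777619 // same value as PrimeRK

// rollingHash extends hash, the hash of some suffix, with the bytes of
// input, consuming input from its last byte to its first.
func rollingHash(hash uint32, input string) uint32 {
	for i := len(input) - 1; i >= 0; i-- {
		hash = hash*primeRK + uint32(input[i])
	}
	return hash
}

func main() {
	whole := rollingHash(0, "www.example.com")
	suffix := rollingHash(0, "example.com")
	extended := rollingHash(suffix, "www.")
	fmt.Println(whole == extended) // true
}
```

Under this assumption, the hash of every suffix of a domain can be obtained in one right-to-left pass, which is what makes a Rabin-Karp style lookup of each parent domain cheap.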
func ToDomain ¶
func ToDomain(pattern string) (string, error)
ToDomain converts the input pattern to a domain string, and returns an error if such a conversion cannot be made.
- The result conforms to the Letter-Digit-Hyphen (LDH) subset (https://tools.ietf.org/html/rfc952):
  - Letters A to Z (case-insensitive; converted to lowercase)
  - Digits 0 to 9
  - Hyphens (-) and periods (.)
- If the pattern contains any non-ASCII characters, the domain is converted from an internationalized domain name to Punycode.
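The LDH rules above can be sketched with the standard library alone. This is a simplified illustration: the hypothetical helper `toLDH` lowercases and validates ASCII input, and it rejects non-ASCII input instead of performing the Punycode conversion that ToDomain is documented to do.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// toLDH lowercases pattern and checks that it uses only letters, digits,
// hyphens and periods. Non-ASCII input is rejected here; the Punycode
// conversion is deliberately left out of this sketch.
func toLDH(pattern string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(pattern); i++ {
		c := pattern[i]
		switch {
		case c >= 'A' && c <= 'Z':
			c += 'a' - 'A' // convert to lowercase
		case c >= 'a' && c <= 'z', c >= '0' && c <= '9', c == '-', c == '.':
			// already in the LDH subset
		default:
			return "", errors.New("pattern is outside the LDH subset")
		}
		b.WriteByte(c)
	}
	return b.String(), nil
}

func main() {
	domain, err := toLDH("Example-1.COM")
	fmt.Println(domain, err) // example-1.com <nil>
	_, err = toLDH("exa_mple.com")
	fmt.Println(err != nil) // true
}
```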
Types ¶
type ACAutomatonMatcherGroup ¶
type ACAutomatonMatcherGroup struct {
// contains filtered or unexported fields
}
ACAutomatonMatcherGroup is an implementation of MatcherGroup. It uses an Aho-Corasick (AC) automaton to support Full, Domain and Substr matchers. Trie nodes are character based.
NOTICE: ACAutomatonMatcherGroup currently uses a restricted charset (the LDH subset). Upstream code must ensure that all patterns and inputs passed to it are within this charset.
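For readers unfamiliar with the technique, here is a compact sketch of an Aho-Corasick automaton that handles substring patterns only. It is not the package's implementation: the type `ac` and its methods are hypothetical, child links are stored in maps instead of a fixed LDH-indexed table, and Full and Domain semantics are not covered.

```go
package main

import "fmt"

type acNode struct {
	next   map[byte]int // child links of the trie
	fail   int          // failure link (longest proper suffix that is a trie path)
	values []uint32     // values of all patterns ending at this state
}

type ac struct{ nodes []acNode }

func newAC() *ac { return &ac{nodes: []acNode{{next: map[byte]int{}}}} }

// add inserts pattern into the trie.
func (a *ac) add(pattern string, value uint32) {
	cur := 0
	for i := 0; i < len(pattern); i++ {
		nxt, ok := a.nodes[cur].next[pattern[i]]
		if !ok {
			a.nodes = append(a.nodes, acNode{next: map[byte]int{}})
			nxt = len(a.nodes) - 1
			a.nodes[cur].next[pattern[i]] = nxt
		}
		cur = nxt
	}
	a.nodes[cur].values = append(a.nodes[cur].values, value)
}

// build computes failure links breadth-first. Call it once, after all adds.
func (a *ac) build() {
	var queue []int
	for _, child := range a.nodes[0].next {
		queue = append(queue, child) // depth-1 nodes fail to the root
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for c, child := range a.nodes[cur].next {
			f := a.nodes[cur].fail
			for f != 0 {
				if _, ok := a.nodes[f].next[c]; ok {
					break
				}
				f = a.nodes[f].fail
			}
			if t, ok := a.nodes[f].next[c]; ok {
				a.nodes[child].fail = t
			}
			// Inherit the outputs of the failure target.
			a.nodes[child].values = append(a.nodes[child].values, a.nodes[a.nodes[child].fail].values...)
			queue = append(queue, child)
		}
	}
}

// match returns the values of all patterns found in input, in scan order.
func (a *ac) match(input string) []uint32 {
	var result []uint32
	cur := 0
	for i := 0; i < len(input); i++ {
		c := input[i]
		for cur != 0 {
			if _, ok := a.nodes[cur].next[c]; ok {
				break
			}
			cur = a.nodes[cur].fail
		}
		if t, ok := a.nodes[cur].next[c]; ok {
			cur = t
		}
		result = append(result, a.nodes[cur].values...)
	}
	return result
}

func main() {
	a := newAC()
	a.add("he", 1)
	a.add("she", 2)
	a.add("hers", 3)
	a.build()
	fmt.Println(a.match("ushers")) // [2 1 3]
}
```

The input is scanned once regardless of the number of patterns, which is the main reason to prefer an automaton over testing each substring pattern in turn.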
func NewACAutomatonMatcherGroup ¶
func NewACAutomatonMatcherGroup() *ACAutomatonMatcherGroup
func (*ACAutomatonMatcherGroup) AddDomainMatcher ¶
func (ac *ACAutomatonMatcherGroup) AddDomainMatcher(matcher DomainMatcher, value uint32)
AddDomainMatcher implements MatcherGroupForDomain.AddDomainMatcher.
func (*ACAutomatonMatcherGroup) AddFullMatcher ¶
func (ac *ACAutomatonMatcherGroup) AddFullMatcher(matcher FullMatcher, value uint32)
AddFullMatcher implements MatcherGroupForFull.AddFullMatcher.
func (*ACAutomatonMatcherGroup) AddSubstrMatcher ¶
func (ac *ACAutomatonMatcherGroup) AddSubstrMatcher(matcher SubstrMatcher, value uint32)
AddSubstrMatcher implements MatcherGroupForSubstr.AddSubstrMatcher.
func (*ACAutomatonMatcherGroup) Build ¶
func (ac *ACAutomatonMatcherGroup) Build() error
func (*ACAutomatonMatcherGroup) Match ¶
func (ac *ACAutomatonMatcherGroup) Match(input string) []uint32
Match implements MatcherGroup.Match.
func (*ACAutomatonMatcherGroup) MatchAny ¶
func (ac *ACAutomatonMatcherGroup) MatchAny(input string) bool
MatchAny implements MatcherGroup.MatchAny.
type DomainMatcher ¶
type DomainMatcher string
DomainMatcher is an implementation of Matcher.
func (DomainMatcher) Match ¶
func (m DomainMatcher) Match(s string) bool
func (DomainMatcher) Pattern ¶
func (m DomainMatcher) Pattern() string
func (DomainMatcher) String ¶
func (m DomainMatcher) String() string
func (DomainMatcher) Type ¶
func (DomainMatcher) Type() Type
type DomainMatcherGroup ¶
type DomainMatcherGroup struct {
// contains filtered or unexported fields
}
DomainMatcherGroup is an implementation of MatcherGroup. It uses a trie to optimize both memory consumption and lookup speed. Trie nodes are domain-label based.
func NewDomainMatcherGroup ¶
func NewDomainMatcherGroup() *DomainMatcherGroup
func (*DomainMatcherGroup) AddDomainMatcher ¶
func (g *DomainMatcherGroup) AddDomainMatcher(matcher DomainMatcher, value uint32)
AddDomainMatcher implements MatcherGroupForDomain.AddDomainMatcher.
func (*DomainMatcherGroup) Match ¶
func (g *DomainMatcherGroup) Match(input string) []uint32
Match implements MatcherGroup.Match.
func (*DomainMatcherGroup) MatchAny ¶
func (g *DomainMatcherGroup) MatchAny(input string) bool
MatchAny implements MatcherGroup.MatchAny.
type FullMatcher ¶
type FullMatcher string
FullMatcher is an implementation of Matcher.
func (FullMatcher) Match ¶
func (m FullMatcher) Match(s string) bool
func (FullMatcher) Pattern ¶
func (m FullMatcher) Pattern() string
func (FullMatcher) String ¶
func (m FullMatcher) String() string
func (FullMatcher) Type ¶
func (FullMatcher) Type() Type
type FullMatcherGroup ¶
type FullMatcherGroup struct {
// contains filtered or unexported fields
}
FullMatcherGroup is an implementation of MatcherGroup. It uses a hash table to facilitate exact match lookup.
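The hash-table approach amounts to keying values by the exact pattern string. A minimal sketch follows, with the hypothetical type `fullGroup` standing in for the real group and Go's built-in map standing in for whatever table the package uses.

```go
package main

import "fmt"

// fullGroup maps each exact pattern to the values registered for it.
type fullGroup struct{ values map[string][]uint32 }

func (g *fullGroup) add(pattern string, value uint32) {
	if g.values == nil {
		g.values = map[string][]uint32{}
	}
	g.values[pattern] = append(g.values[pattern], value)
}

// match returns the values of all patterns equal to input.
func (g *fullGroup) match(input string) []uint32 { return g.values[input] }

// matchAny reports whether any pattern equals input.
func (g *fullGroup) matchAny(input string) bool {
	_, ok := g.values[input]
	return ok
}

func main() {
	var g fullGroup
	g.add("example.com", 1)
	g.add("example.com", 2)
	fmt.Println(g.match("example.com"))        // [1 2]
	fmt.Println(g.matchAny("www.example.com")) // false
}
```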
func NewFullMatcherGroup ¶
func NewFullMatcherGroup() *FullMatcherGroup
func (*FullMatcherGroup) AddFullMatcher ¶
func (g *FullMatcherGroup) AddFullMatcher(matcher FullMatcher, value uint32)
AddFullMatcher implements MatcherGroupForFull.AddFullMatcher.
func (*FullMatcherGroup) Match ¶
func (g *FullMatcherGroup) Match(input string) []uint32
Match implements MatcherGroup.Match.
func (*FullMatcherGroup) MatchAny ¶
func (g *FullMatcherGroup) MatchAny(input string) bool
MatchAny implements MatcherGroup.MatchAny.
type IndexMatcher ¶
type IndexMatcher interface {
	// Size returns number of matchers added to IndexMatcher.
	Size() uint32

	// Add adds a new Matcher to IndexMatcher, and returns its index. The index will never be 0.
	Add(matcher Matcher) uint32

	// Build builds the IndexMatcher to be ready for matching.
	Build() error

	// Match returns the indices of all matchers that matches the input.
	//   * Empty array is returned if no such matcher exists.
	//   * The order of returned matchers should follow priority specification.
	// Priority specification:
	//   1. Priority between matcher types: full > domain > substr > regex.
	//   2. Priority of same-priority matchers matching at same position: the early added takes precedence.
	//   3. Priority of domain matchers matching at different levels: the further matched domain takes precedence.
	//   4. Priority of substr matchers matching at different positions: the further matched substr takes precedence.
	Match(input string) []uint32

	// MatchAny returns true as soon as one matching matcher is found.
	MatchAny(input string) bool
}
IndexMatcher is a general type of matcher that accepts all kinds of basic matchers. It should:
- Accept all Matcher types with no exception.
- Optimize string matching with a combination of MatcherGroups.
- Obey certain priority order specification when returning matched Matchers.
type LinearIndexMatcher ¶
type LinearIndexMatcher struct {
// contains filtered or unexported fields
}
LinearIndexMatcher is an implementation of IndexMatcher.
func NewLinearIndexMatcher ¶
func NewLinearIndexMatcher() *LinearIndexMatcher
func (*LinearIndexMatcher) Add ¶
func (g *LinearIndexMatcher) Add(matcher Matcher) uint32
Add implements IndexMatcher.Add.
func (*LinearIndexMatcher) Build ¶
func (*LinearIndexMatcher) Build() error
Build implements IndexMatcher.Build.
func (*LinearIndexMatcher) Match ¶
func (g *LinearIndexMatcher) Match(input string) []uint32
Match implements IndexMatcher.Match.
func (*LinearIndexMatcher) MatchAny ¶
func (g *LinearIndexMatcher) MatchAny(input string) bool
MatchAny implements IndexMatcher.MatchAny.
func (*LinearIndexMatcher) Size ¶
func (g *LinearIndexMatcher) Size() uint32
Size implements IndexMatcher.Size.
type Matcher ¶
type Matcher interface {
	// Type returns the matcher's type.
	Type() Type

	// Pattern returns the matcher's raw string representation.
	Pattern() string

	// String returns a string representation of the matcher containing its type and pattern.
	String() string

	// Match returns true if the given string matches a predefined pattern.
	//   * This method is seldom used for performance reason
	//     and is generally taken over by their corresponding MatcherGroup.
	Match(input string) bool
}
Matcher is the interface that determines whether a string matches a pattern.
- This is a basic matcher that represents one kind of match semantics (full, substr, domain or regex).
type MatcherGroup ¶
type MatcherGroup interface {
	// Match returns all matched matchers with their corresponding values.
	Match(input string) []uint32

	// MatchAny returns true as soon as one matching matcher is found.
	MatchAny(input string) bool
}
MatcherGroup is an advanced type of matcher that accepts a set of basic Matchers of certain types (not necessarily all matcher types). For example:
- FullMatcherGroup accepts FullMatcher and uses a hash table to facilitate lookup.
- DomainMatcherGroup accepts DomainMatcher and uses a trie to optimize both memory consumption and lookup speed.
type MatcherGroupForAll ¶
MatcherGroupForAll is an interface indicating a MatcherGroup could accept all types of matchers.
type MatcherGroupForDomain ¶
type MatcherGroupForDomain interface {
AddDomainMatcher(matcher DomainMatcher, value uint32)
}
MatcherGroupForDomain is an interface indicating a MatcherGroup could accept DomainMatchers.
type MatcherGroupForFull ¶
type MatcherGroupForFull interface {
AddFullMatcher(matcher FullMatcher, value uint32)
}
MatcherGroupForFull is an interface indicating a MatcherGroup could accept FullMatchers.
type MatcherGroupForRegex ¶
type MatcherGroupForRegex interface {
AddRegexMatcher(matcher *RegexMatcher, value uint32)
}
MatcherGroupForRegex is an interface indicating a MatcherGroup could accept RegexMatchers.
type MatcherGroupForSubstr ¶
type MatcherGroupForSubstr interface {
AddSubstrMatcher(matcher SubstrMatcher, value uint32)
}
MatcherGroupForSubstr is an interface indicating a MatcherGroup could accept SubstrMatchers.
type MphIndexMatcher ¶
type MphIndexMatcher struct {
// contains filtered or unexported fields
}
A MphIndexMatcher is divided into three parts:
1. `full` and `domain` patterns are matched by the Rabin-Karp algorithm and a minimal perfect hash table;
2. `substr` patterns are matched by an AC automaton;
3. `regex` patterns are matched with the regex library.
func NewMphIndexMatcher ¶
func NewMphIndexMatcher() *MphIndexMatcher
func (*MphIndexMatcher) Add ¶
func (g *MphIndexMatcher) Add(matcher Matcher) uint32
Add implements IndexMatcher.Add.
func (*MphIndexMatcher) Build ¶
func (g *MphIndexMatcher) Build() error
Build implements IndexMatcher.Build.
func (*MphIndexMatcher) Match ¶
func (g *MphIndexMatcher) Match(input string) []uint32
Match implements IndexMatcher.Match.
func (*MphIndexMatcher) MatchAny ¶
func (g *MphIndexMatcher) MatchAny(input string) bool
MatchAny implements IndexMatcher.MatchAny.
func (*MphIndexMatcher) Size ¶
func (g *MphIndexMatcher) Size() uint32
Size implements IndexMatcher.Size.
type MphMatcherGroup ¶
type MphMatcherGroup struct {
// contains filtered or unexported fields
}
MphMatcherGroup is an implementation of MatcherGroup. It implements Rabin-Karp algorithm and minimal perfect hash table for Full and Domain matcher.
func NewMphMatcherGroup ¶
func NewMphMatcherGroup() *MphMatcherGroup
func (*MphMatcherGroup) AddDomainMatcher ¶
func (g *MphMatcherGroup) AddDomainMatcher(matcher DomainMatcher, value uint32)
AddDomainMatcher implements MatcherGroupForDomain.AddDomainMatcher.
func (*MphMatcherGroup) AddFullMatcher ¶
func (g *MphMatcherGroup) AddFullMatcher(matcher FullMatcher, value uint32)
AddFullMatcher implements MatcherGroupForFull.AddFullMatcher.
func (*MphMatcherGroup) Build ¶
func (g *MphMatcherGroup) Build() error
Build builds a minimal perfect hash table for the inserted rules. Algorithm used: Hash, displace, and compress. See http://cmph.sourceforge.net/papers/esa09.pdf
func (*MphMatcherGroup) Lookup ¶
func (g *MphMatcherGroup) Lookup(rollingHash uint32, input string) uint32
Lookup searches for input in minimal perfect hash table and returns its index. 0 indicates not found.
func (*MphMatcherGroup) Match ¶
func (g *MphMatcherGroup) Match(input string) []uint32
Match implements MatcherGroup.Match.
func (*MphMatcherGroup) MatchAny ¶
func (g *MphMatcherGroup) MatchAny(input string) bool
MatchAny implements MatcherGroup.MatchAny.
type RegexMatcher ¶
type RegexMatcher struct {
// contains filtered or unexported fields
}
RegexMatcher is an implementation of Matcher.
func (*RegexMatcher) Match ¶
func (m *RegexMatcher) Match(s string) bool
func (*RegexMatcher) Pattern ¶
func (m *RegexMatcher) Pattern() string
func (*RegexMatcher) String ¶
func (m *RegexMatcher) String() string
func (*RegexMatcher) Type ¶
func (*RegexMatcher) Type() Type
type SimpleMatcherGroup ¶
type SimpleMatcherGroup struct {
// contains filtered or unexported fields
}
SimpleMatcherGroup is an implementation of MatcherGroup. It simply stores all matchers in an array and sequentially matches them.
func (*SimpleMatcherGroup) AddMatcher ¶
func (g *SimpleMatcherGroup) AddMatcher(matcher Matcher, value uint32)
AddMatcher implements MatcherGroupForAll.AddMatcher.
func (*SimpleMatcherGroup) Match ¶
func (g *SimpleMatcherGroup) Match(input string) []uint32
Match implements MatcherGroup.Match.
func (*SimpleMatcherGroup) MatchAny ¶
func (g *SimpleMatcherGroup) MatchAny(input string) bool
MatchAny implements MatcherGroup.MatchAny.
type SubstrMatcher ¶
type SubstrMatcher string
SubstrMatcher is an implementation of Matcher.
func (SubstrMatcher) Match ¶
func (m SubstrMatcher) Match(s string) bool
func (SubstrMatcher) Pattern ¶
func (m SubstrMatcher) Pattern() string
func (SubstrMatcher) String ¶
func (m SubstrMatcher) String() string
func (SubstrMatcher) Type ¶
func (SubstrMatcher) Type() Type
type SubstrMatcherGroup ¶
type SubstrMatcherGroup struct {
// contains filtered or unexported fields
}
SubstrMatcherGroup is an implementation of MatcherGroup. It is implemented simply, to comply with the priority specification of Substr matchers.
func (*SubstrMatcherGroup) AddSubstrMatcher ¶
func (g *SubstrMatcherGroup) AddSubstrMatcher(matcher SubstrMatcher, value uint32)
AddSubstrMatcher implements MatcherGroupForSubstr.AddSubstrMatcher.
func (*SubstrMatcherGroup) Match ¶
func (g *SubstrMatcherGroup) Match(input string) []uint32
Match implements MatcherGroup.Match.
func (*SubstrMatcherGroup) MatchAny ¶
func (g *SubstrMatcherGroup) MatchAny(input string) bool
MatchAny implements MatcherGroup.MatchAny.
type Type ¶
type Type byte
Type is the type of the matcher.
const (
	// Full is the type of matcher that the input string must exactly equal the pattern.
	Full Type = 0
	// Domain is the type of matcher that the input string must be a sub-domain of the pattern or the pattern itself.
	Domain Type = 1
	// Substr is the type of matcher that the input string must contain the pattern as a sub-string.
	Substr Type = 2
	// Regex is the type of matcher that the input string must match the regular-expression pattern.
	Regex Type = 3
)