Documentation ¶
Overview ¶
Package fuzzy implements a fuzzy matching algorithm.
Index ¶
Constants ¶
const ( // MaxInputSize is the maximum size of the input scored against the fuzzy matcher. Longer inputs // will be truncated to this size. MaxInputSize = 127 // MaxPatternSize is the maximum size of the pattern used to construct the fuzzy matcher. Longer // inputs are truncated to this size. MaxPatternSize = 63 )
Variables ¶
This section is empty.
Functions ¶
func BestMatch ¶
BestMatch returns the name most similar to the pattern, using fuzzy matching, or the empty string.
func LastSegment ¶
LastSegment returns the substring representing the last segment from the input, where each byte has an associated RuneRole in the roles slice. This makes sense only for inputs of Symbol or Filename type.
func Words ¶
func Words(roles []RuneRole, consume WordConsumer)
Words find word delimiters in an input based on its bytes' mappings to rune roles. The offset delimiters for each word are fed to the provided consumer function.
Types ¶
type Matcher ¶
type Matcher struct {
// contains filtered or unexported fields
}
Matcher implements a fuzzy matching algorithm for scoring candidates against a pattern. The matcher does not support parallel usage.
func NewMatcher ¶
NewMatcher returns a new fuzzy matcher for scoring candidates against the provided pattern.
func (*Matcher) MatchedRanges ¶
MatchedRanges returns matches ranges for the last scored string as a flattened array of [begin, end) byte offset pairs.
func (*Matcher) Score ¶
Score returns the score returned by matching the candidate to the pattern. This is not designed for parallel use. Multiple candidates must be scored sequentially. Returns a score between 0 and 1 (0 - no match, 1 - perfect match).
func (*Matcher) ScoreChunks ¶
func (*Matcher) ScoreTable ¶
ScoreTable returns the score table computed for the provided candidate. Used only for debugging.
type RuneRole ¶
type RuneRole byte
RuneRole specifies the role of a rune in the context of an input.
const ( // RNone specifies a rune without any role in the input (i.e., whitespace/non-ASCII). RNone RuneRole = iota // RSep specifies a rune with the role of segment separator. RSep // RTail specifies a rune which is a lower-case tail in a word in the input. RTail // RUCTail specifies a rune which is an upper-case tail in a word in the input. RUCTail // RHead specifies a rune which is the first character in a word in the input. RHead )
type SymbolMatcher ¶
type SymbolMatcher struct {
// contains filtered or unexported fields
}
SymbolMatcher implements a fuzzy matching algorithm optimized for Go symbols of the form:
example.com/path/to/package.object.field
Knowing that we are matching symbols like this allows us to make the following optimizations:
- We can incorporate right-to-left relevance directly into the score calculation.
- We can match from right to left, discarding leading bytes if the input is too long.
- We just take the right-most match without losing too much precision. This allows us to use an O(n) algorithm.
- We can operate directly on chunked strings; in many cases we will be storing the package path and/or package name separately from the symbol or identifiers, so doing this avoids allocating strings.
- We can return the index of the right-most match, allowing us to trim irrelevant qualification.
This implementation is experimental, serving as a reference fast algorithm to compare to the fuzzy algorithm implemented by Matcher.
func NewSymbolMatcher ¶
func NewSymbolMatcher(pattern string) *SymbolMatcher
NewSymbolMatcher creates a SymbolMatcher that may be used to match the given search pattern.
Currently this matcher only accepts case-insensitive fuzzy patterns.
An empty pattern matches no input.
func (*SymbolMatcher) Match ¶
func (m *SymbolMatcher) Match(chunks []string) (int, float64)
Match looks for the right-most match of the search pattern within the symbol represented by concatenating the given chunks, returning its offset and score.
If a match is found, the first return value will hold the absolute byte offset within all chunks for the start of the symbol. In other words, the index of the match within strings.Join(chunks, ""). If no match is found, the first return value will be -1.
The second return value will be the score of the match, which is always between 0 and 1, inclusive. A score of 0 indicates no match.
type WordConsumer ¶
type WordConsumer func(start, end int)
WordConsumer defines a consumer for a word delimited by the [start,end) byte offsets in an input (start is inclusive, end is exclusive).