fuzzy

package
v0.0.0-...-57c1bf3
Warning

This package is not in the latest version of its module.

Published: Sep 1, 2019 License: BSD-3-Clause Imports: 3 Imported by: 0

Documentation

Overview

Package fuzzy implements a fuzzy matching algorithm.

Constants

const (
	// MaxInputSize is the maximum size of the input scored against the fuzzy matcher. Longer inputs
	// will be truncated to this size.
	MaxInputSize = 127
	// MaxPatternSize is the maximum size of the pattern used to construct the fuzzy matcher. Longer
	// patterns are truncated to this size.
	MaxPatternSize = 63
)
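
The truncation described by these constants can be illustrated with a minimal sketch. It assumes the sizes are measured in bytes, which the documentation does not state explicitly; `truncate` and the lower-case constant names are hypothetical stand-ins, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the documented limits, redeclared so the sketch is self-contained.
const (
	maxInputSize   = 127
	maxPatternSize = 63
)

// truncate cuts s to at most max bytes (assumption: sizes are byte counts).
// Cutting at a byte offset may split a multi-byte UTF-8 sequence.
func truncate(s string, max int) string {
	if len(s) > max {
		return s[:max]
	}
	return s
}

func main() {
	long := strings.Repeat("x", 200)
	fmt.Println(len(truncate(long, maxInputSize)))   // length after input truncation
	fmt.Println(len(truncate(long, maxPatternSize))) // length after pattern truncation
}
```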

Variables

This section is empty.

Functions

func LastSegment

func LastSegment(input string, roles []RuneRole) string

LastSegment returns the substring representing the last segment from the input, where each byte has an associated RuneRole in the roles slice. This makes sense only for inputs of Symbol or Filename type.
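
As a rough illustration of the idea, the sketch below extracts the last segment from a string and a parallel slice of roles. The `role` type, `lastSegment`, and `markSeparators` are hypothetical local stand-ins; skipping trailing separators is an assumption about the behavior, and the roles are produced by a simplified helper rather than by RuneRoles.

```go
package main

import "fmt"

// role is a local stand-in mirroring the documented RuneRole values.
type role byte

const (
	rNone role = iota
	rSep
	rTail
	rUCTail
	rHead
)

// markSeparators is a simplified helper: it marks sep bytes as separators
// and everything else as a tail. Real role detection is more detailed.
func markSeparators(input string, sep byte) []role {
	roles := make([]role, len(input))
	for i := 0; i < len(input); i++ {
		if input[i] == sep {
			roles[i] = rSep
		} else {
			roles[i] = rTail
		}
	}
	return roles
}

// lastSegment returns the bytes between the last two separator runs.
// Assumptions: roles has one entry per byte of input, and trailing
// separators are ignored.
func lastSegment(input string, roles []role) string {
	end := len(input) - 1
	for end >= 0 && roles[end] == rSep {
		end--
	}
	if end < 0 {
		return ""
	}
	start := end - 1
	for start >= 0 && roles[start] != rSep {
		start--
	}
	return input[start+1 : end+1]
}

func main() {
	path := "a/b/file.go"
	fmt.Println(lastSegment(path, markSeparators(path, '/')))
}
```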

func ToLower

func ToLower(input string, reuse []byte) []byte

ToLower transforms the input string to lower case and stores the result in the output byte slice. Lower casing considers only ASCII values; non-ASCII values are left unmodified. It stops when it has parsed all input or filled the output slice. If the output slice (passed as reuse) is nil, a new one is created.
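
The documented behavior can be sketched as follows. `toLowerASCII` is a hypothetical stand-in written from the description above, not the package's own code; sizing a nil buffer to the input length is an assumption.

```go
package main

import "fmt"

// toLowerASCII lower-cases ASCII letters only and writes into reuse.
func toLowerASCII(input string, reuse []byte) []byte {
	output := reuse
	if output == nil {
		// Assumption: a nil buffer is replaced by one sized to the input.
		output = make([]byte, len(input))
	}
	n := len(input)
	if len(output) < n {
		n = len(output) // stop once the output slice is full
	}
	for i := 0; i < n; i++ {
		c := input[i]
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A' // ASCII upper case to lower case
		}
		output[i] = c // non-ASCII bytes are copied unchanged
	}
	return output[:n]
}

func main() {
	fmt.Printf("%s\n", toLowerASCII("FooBar", nil))

	// Reusing a small buffer truncates the result to the buffer's length.
	buf := make([]byte, 3)
	fmt.Printf("%s\n", toLowerASCII("HELLO", buf))
}
```

Passing a preallocated buffer avoids one allocation per call, which matters when many candidates are lower-cased in a loop.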

func Words

func Words(roles []RuneRole, consume WordConsumer)

Words finds word delimiters in an input based on its bytes' mappings to rune roles. The start and end offsets of each word are passed to the provided consumer function.
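
A minimal sketch of this kind of word splitting is shown below. It assumes that a word starts at a head byte and extends through the tail bytes that follow, and that separators and role-less bytes end a word; `role`, `wordConsumer`, `words`, and `collectWords` are hypothetical local stand-ins.

```go
package main

import "fmt"

// role is a local stand-in mirroring the documented RuneRole values.
type role byte

const (
	rNone role = iota
	rSep
	rTail
	rUCTail
	rHead
)

// wordConsumer receives the [start, end) byte offsets of one word.
type wordConsumer func(start, end int)

// words reports each run of head-plus-tail bytes to consume.
func words(roles []role, consume wordConsumer) {
	wordStart := 0
	for i, r := range roles {
		switch r {
		case rTail, rUCTail:
			// Tails extend the current word.
		case rHead, rNone, rSep:
			if i != wordStart {
				consume(wordStart, i)
			}
			wordStart = i
			if r != rHead {
				wordStart = i + 1 // separators and role-less bytes are skipped
			}
		}
	}
	if wordStart != len(roles) {
		consume(wordStart, len(roles))
	}
}

// collectWords slices input at the offsets reported by words.
func collectWords(input string, roles []role) []string {
	var out []string
	words(roles, func(start, end int) {
		out = append(out, input[start:end])
	})
	return out
}

func main() {
	// Hand-written roles for "fooBar.baz", for illustration only.
	roles := []role{rHead, rTail, rTail, rHead, rTail, rTail, rSep, rHead, rTail, rTail}
	fmt.Println(collectWords("fooBar.baz", roles))
}
```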

Types

type Input

type Input int

Input specifies the type of the input. This influences how the runes are interpreted with respect to segmenting the input.

const (
	// Text represents a text input type. Input is not segmented.
	Text Input = iota
	// Filename represents a filepath input type with '/' segment delimiters.
	Filename
	// Symbol represents a symbol input type with '.' and ':' segment delimiters.
	Symbol
)

type Matcher

type Matcher struct {
	// contains filtered or unexported fields
}

Matcher implements a fuzzy matching algorithm for scoring candidates against a pattern. The matcher does not support parallel usage.

func NewMatcher

func NewMatcher(pattern string, input Input) *Matcher

NewMatcher returns a new fuzzy matcher for scoring candidates against the provided pattern.

func (*Matcher) MatchedRanges

func (m *Matcher) MatchedRanges() []int

MatchedRanges returns the matched ranges for the last scored string as a flattened array of [begin, end) byte offset pairs.
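
To make the flattened layout concrete, the sketch below decodes such a slice into the substrings it covers. `matchedSubstrings` is a hypothetical helper, and the offsets in `main` are written by hand for illustration rather than produced by a Matcher.

```go
package main

import "fmt"

// matchedSubstrings decodes a flattened list of [begin, end) byte offsets
// (begin0, end0, begin1, end1, ...) into the substrings of candidate.
func matchedSubstrings(candidate string, ranges []int) []string {
	var out []string
	for i := 0; i+1 < len(ranges); i += 2 {
		begin, end := ranges[i], ranges[i+1]
		if begin < 0 || end > len(candidate) || begin > end {
			continue // skip malformed pairs defensively
		}
		out = append(out, candidate[begin:end])
	}
	return out
}

func main() {
	// Hand-written offsets: bytes [0, 2) and [5, 8) of the candidate.
	ranges := []int{0, 2, 5, 8}
	fmt.Println(matchedSubstrings("fuzzyMatcher", ranges))
}
```

A caller could use the same loop to highlight matched spans in a user interface instead of collecting substrings.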

func (*Matcher) Score

func (m *Matcher) Score(candidate string) float32

Score returns the score obtained by matching the candidate against the pattern. It is not designed for parallel use; multiple candidates must be scored sequentially. The score is between 0 and 1 (0 means no match, 1 means a perfect match).
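
One way to respect the sequential-use constraint is to score every candidate in a single loop and sort afterwards. The sketch below shows that pattern against a hypothetical `scorer` interface that captures only the documented Score method; `prefixScorer` is a toy stand-in that returns values in the same 0 to 1 range and is not the fuzzy algorithm.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scorer is a hypothetical interface with the documented Score signature.
type scorer interface {
	Score(candidate string) float32
}

// prefixScorer is a toy scorer used only to make the sketch runnable.
type prefixScorer struct{ pattern string }

func (p prefixScorer) Score(candidate string) float32 {
	if p.pattern == "" || !strings.HasPrefix(candidate, p.pattern) {
		return 0 // 0 means no match
	}
	return float32(len(p.pattern)) / float32(len(candidate))
}

type scored struct {
	candidate string
	score     float32
}

// rank scores candidates one at a time on the calling goroutine,
// drops non-matches, and orders the rest from best to worst.
func rank(m scorer, candidates []string) []scored {
	var out []scored
	for _, c := range candidates {
		if s := m.Score(c); s > 0 {
			out = append(out, scored{c, s})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].score > out[j].score })
	return out
}

func main() {
	for _, r := range rank(prefixScorer{pattern: "fo"}, []string{"bar", "foobar", "fo"}) {
		fmt.Printf("%s %.2f\n", r.candidate, r.score)
	}
}
```

If candidates must be scored from several goroutines, giving each goroutine its own matcher is one option, since a single matcher does not support parallel use.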

func (*Matcher) ScoreTable

func (m *Matcher) ScoreTable(candidate string) string

ScoreTable returns the score table computed for the provided candidate. Used only for debugging.

func (*Matcher) SetInput

func (m *Matcher) SetInput(input Input)

SetInput updates the input type for subsequent scoring attempts.

type RuneRole

type RuneRole byte

RuneRole specifies the role of a rune in the context of an input.

const (
	// RNone specifies a rune without any role in the input (i.e., whitespace/non-ASCII).
	RNone RuneRole = iota
	// RSep specifies a rune with the role of segment separator.
	RSep
	// RTail specifies a rune which is a lower-case tail in a word in the input.
	RTail
	// RUCTail specifies a rune which is an upper-case tail in a word in the input.
	RUCTail
	// RHead specifies a rune which is the first character in a word in the input.
	RHead
)

func RuneRoles

func RuneRoles(str string, input Input, reuse []RuneRole) []RuneRole

RuneRoles detects the role of each byte in an input string and stores the roles in the output slice. The role depends on the input type. It stops when it has parsed the whole string or filled the output. If the output slice (passed as reuse) is nil, a new one is created.

type WordConsumer

type WordConsumer func(start, end int)

WordConsumer defines a consumer for a word delimited by the [start,end) byte offsets in an input (start is inclusive, end is exclusive).
