Documentation ¶
Index ¶
Constants ¶
const DEFAULT_MAX_TOKEN_LENGTH = int(math.MaxInt32)
Variables ¶
var EMPTY_STOPSET = auto.NewCharacterRunAutomaton(auto.MakeEmpty())
var WHITESPACE = auto.NewCharacterRunAutomaton(auto.NewRegExp("[^ \t\r\n]+").ToAutomaton())
Acts similarly to WhitespaceTokenizer.
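As a quick illustration of what this automaton accepts (the Run method name below is an assumption about this port's CharacterRunAutomaton API, which is not shown in this excerpt, and the import path is a placeholder):

package example_test

import (
	"testing"

	// Placeholder import path; substitute the module that defines WHITESPACE.
	tu "example.com/project/analysis/testutil"
)

func TestWhitespaceAutomaton(t *testing.T) {
	// Run is an assumed match method on *auto.CharacterRunAutomaton.
	if !tu.WHITESPACE.Run("hello") {
		t.Error("a run of non-whitespace should match")
	}
	if tu.WHITESPACE.Run("a b") {
		t.Error("a string containing a space should not match")
	}
}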
Functions ¶
This section is empty.
Types ¶
type MockAnalyzer ¶
type MockAnalyzer struct {
	*ca.AnalyzerImpl
	// contains filtered or unexported fields
}
Analyzer for testing
This analyzer is a replacement for Whitespace/Simple/KeywordAnalyzers for unit tests. If you are testing a custom component such as a query parser or analyzer-wrapper that consumes analysis streams, it's a great idea to test it with this analyzer instead. MockAnalyzer has the following behavior (a construction sketch follows the list):

1. By default, the assertions in MockTokenizer are turned on for extra checks that the consumer is consuming properly. These checks can be disabled with SetEnableChecks(bool).
2. Payload data is randomly injected into the streams for more thorough testing of payloads.
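A minimal construction sketch mirroring the documented defaults, using the constructor and variables declared on this page. The import path is a placeholder for wherever this package actually lives:

package example_test

import (
	"math/rand"
	"testing"

	// Placeholder import path; substitute this package's module path.
	tu "example.com/project/analysis/testutil"
)

func TestQueryParserWithMockAnalyzer(t *testing.T) {
	r := rand.New(rand.NewSource(42)) // fixed seed keeps failures reproducible
	// Whitespace tokenization, lowercasing enabled, no stopword filtering.
	a := tu.NewMockAnalyzer(r, tu.WHITESPACE, true, tu.EMPTY_STOPSET)
	_ = a // hand a to the component under test, e.g. a query parser
}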
func NewMockAnalyzer ¶
func NewMockAnalyzer(r *rand.Rand, runAutomaton *auto.CharacterRunAutomaton, lowerCase bool, filter *auto.CharacterRunAutomaton) *MockAnalyzer
Creates a new MockAnalyzer.
func NewMockAnalyzer3 ¶
func NewMockAnalyzer3(r *rand.Rand, runAutomaton *auto.CharacterRunAutomaton, lowerCase bool) *MockAnalyzer
func NewMockAnalyzerWithRandom ¶
func NewMockAnalyzerWithRandom(r *rand.Rand) *MockAnalyzer
Creates a whitespace-lowercasing analyzer with no stopword removal.
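Per the doc comment above, this is a one-call shorthand for the whitespace-lowercasing configuration. The sketch below reuses the placeholder import path from the earlier examples:

package example_test

import (
	"math/rand"
	"testing"

	tu "example.com/project/analysis/testutil" // placeholder path, as above
)

func TestConvenienceConstructor(t *testing.T) {
	r := rand.New(rand.NewSource(7))
	// Per the doc comment: whitespace tokenization, lowercasing, no stopwords.
	a := tu.NewMockAnalyzerWithRandom(r)
	_ = a
}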
type MockTokenizer ¶
type MockTokenizer struct { }
Tokenizer for testing.
This tokenizer is a replacement for the WHITESPACE, SIMPLE, and KEYWORD tokenizers. If you are writing a component such as a TokenFilter, it's a great idea to test it by wrapping this tokenizer instead, for the extra checks. This tokenizer has the following behavior (a usage sketch follows the list):
1. An internal state machine is used to check consumer consistency. These checks can be disabled with DisableChecks(bool).
2. For convenience, it optionally lowercases the terms it outputs.
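A sketch of the intended TokenFilter-testing workflow. This excerpt shows no constructor for MockTokenizer, so NewMockTokenizer and NewMyTokenFilter below are hypothetical placeholders for the real constructor and the component under test:

package example_test

import (
	"testing"

	tu "example.com/project/analysis/testutil" // placeholder path, as above
)

func TestMyFilterAgainstMockTokenizer(t *testing.T) {
	// Hypothetical constructor; the real signature is not shown in this doc.
	tok := tu.NewMockTokenizer(tu.WHITESPACE, true)
	// NewMyTokenFilter stands in for the filter under test. Consuming the
	// resulting stream exercises MockTokenizer's internal state machine,
	// which flags incorrect Reset/IncrementToken/End/Close call sequences.
	filter := NewMyTokenFilter(tok)
	_ = filter
}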