Documentation ¶
Overview ¶
Functions for finding documents by full text search
Functions for parsing a search query
Index ¶
Constants ¶
This section is empty.
Variables ¶
var WEIGHT = []float64{0.080, 2.327, 3.040} // [BM25 words, BM25 bigrams, bit vector]
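WEIGHT holds the weights for the three relevance features (BM25 on words, BM25 on bigrams, and the bit vector), obtained from logistic regression.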
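As a rough illustration of how such weights might be applied, the sketch below combines three feature values into a single score with a weighted sum. The function name combineScores and the absence of an intercept term are assumptions made for this example; the package's actual scoring code is not shown on this page and may differ.

```go
package main

import "fmt"

// weight mirrors the values of the package-level WEIGHT variable:
// [BM25 words, BM25 bigrams, bit vector].
var weight = []float64{0.080, 2.327, 3.040}

// combineScores is a hypothetical helper (not part of the package API).
// It returns the weighted sum of the three relevance features.
func combineScores(bm25Words, bm25Bigrams, bitVector float64) float64 {
	return weight[0]*bm25Words + weight[1]*bm25Bigrams + weight[2]*bitVector
}

func main() {
	// Illustrative feature values only.
	score := combineScores(1.0, 0.5, 1.0)
	fmt.Printf("combined score: %.4f\n", score)
}
```

With these weights, the bigram and bit vector features dominate the word-level BM25 score, so documents matching longer stretches of the query would rank higher under this sketch.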
Functions ¶
Types ¶
type Collection ¶
type Collection struct {
GlossFile, Title string
}
type DictQueryParser ¶
type DictQueryParser struct{ Tokenizer tokenizer.DictTokenizer }
func (DictQueryParser) ParseQuery ¶
func (parser DictQueryParser) ParseQuery(query string) []TextSegment
ParseQuery parses the query text using dictionary lookups.
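One common way to parse text by dictionary lookup is greedy longest-match segmentation. The sketch below shows that technique under stated assumptions: the segment type, the parseQuery function, and the map[string]bool dictionary are simplified stand-ins invented for this example, and the algorithm used by tokenizer.DictTokenizer may differ.

```go
package main

import "fmt"

// segment is a simplified, hypothetical stand-in for TextSegment.
type segment struct {
	QueryText string
	InDict    bool // true when QueryText was found in the dictionary
}

// parseQuery splits query into segments by greedy longest match:
// at each position it takes the longest run of characters found in
// dict, or a single character when nothing matches.
func parseQuery(query string, dict map[string]bool) []segment {
	runes := []rune(query) // index by rune, not byte, for Chinese text
	var segs []segment
	for i := 0; i < len(runes); {
		matched := false
		for j := len(runes); j > i; j-- {
			w := string(runes[i:j])
			if dict[w] {
				segs = append(segs, segment{QueryText: w, InDict: true})
				i = j
				matched = true
				break
			}
		}
		if !matched {
			segs = append(segs, segment{QueryText: string(runes[i])})
			i++
		}
	}
	return segs
}

func main() {
	dict := map[string]bool{"你好": true, "世界": true}
	for _, s := range parseQuery("你好,世界", dict) {
		fmt.Printf("%q in dictionary: %v\n", s.QueryText, s.InDict)
	}
}
```

Punctuation and unknown characters fall through as single-character segments that are not marked as dictionary matches, which parallels the description of TextSegment on this page.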
type DocFinder ¶ added in v0.0.17
type DocFinder interface {
	FindDocuments(ctx context.Context, dictSearcher *dictionary.Searcher, parser QueryParser, query string, advanced bool) (*QueryResults, error)
	FindDocumentsInCol(ctx context.Context, dictSearcher *dictionary.Searcher, parser QueryParser, query, col_gloss_file string) (*QueryResults, error)
	GetColMap() map[string]string
	Inititialized() bool
}
DocFinder finds documents.
type DocInfo ¶ added in v0.0.60
type DocInfo struct {
CorpusFile, GlossFile, Title, TitleCN, TitleEN, CollectionFile, CollectionTitle string
}
type DocTitleFinder ¶ added in v0.0.52
type DocTitleFinder interface {
FindDocuments(ctx context.Context, query string) (*QueryResults, error)
}
DocTitleFinder finds documents by title.
func NewDocTitleFinder ¶ added in v0.0.52
func NewDocTitleFinder(infoCache map[string]DocInfo) DocTitleFinder
NewDocTitleFinder initializes a DocTitleFinder implementation.
Params:
infoCache: a map of document information, keyed by the Chinese part of the title
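To make the shape of infoCache concrete, the sketch below builds a map keyed by the Chinese part of each title and looks up an entry. The docInfo type is a local copy of a few DocInfo fields, findByTitle is a hypothetical helper, and the file names are made up for illustration; how the real DocTitleFinder matches a query against the cache is not documented here and is assumed to be an exact key lookup only for this example.

```go
package main

import "fmt"

// docInfo is a local stand-in with a subset of the DocInfo fields.
type docInfo struct {
	GlossFile, Title, TitleCN, TitleEN string
}

// buildCache indexes documents by the Chinese part of the title,
// matching the key convention described for infoCache.
func buildCache(docs []docInfo) map[string]docInfo {
	cache := make(map[string]docInfo, len(docs))
	for _, d := range docs {
		cache[d.TitleCN] = d
	}
	return cache
}

// findByTitle is a hypothetical helper that does an exact lookup
// on the Chinese title.
func findByTitle(cache map[string]docInfo, titleCN string) (docInfo, bool) {
	d, ok := cache[titleCN]
	return d, ok
}

func main() {
	// Illustrative entry; the file name is invented.
	cache := buildCache([]docInfo{
		{GlossFile: "example/lunyu.html", Title: "論語 Analects", TitleCN: "論語", TitleEN: "Analects"},
	})
	if d, ok := findByTitle(cache, "論語"); ok {
		fmt.Println(d.TitleEN, d.GlossFile)
	}
}
```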
type Document ¶
type QueryParser ¶
type QueryParser interface {
ParseQuery(query string) []TextSegment
}
QueryParser parses input queries into a slice of text segments.
func MakeQueryParser ¶
func MakeQueryParser(dict map[string]dicttypes.Word) QueryParser
MakeQueryParser creates a QueryParser backed by the given dictionary.
type QueryResults ¶
type QueryResults struct {
Query, CollectionFile string
NumCollections, NumDocuments int
Collections []Collection
Documents []Document
Terms []TextSegment
SimilarTerms []TextSegment
}
type TextSegment ¶
A TextSegment contains the QueryText searched for and, possibly, a matching dictionary entry. Only Chinese words found in the dictionary have matching dictionary entries. Non-Chinese text, punctuation, and unknown Chinese words have nil DictEntry values, and matching values are included in the Senses field.