Documentation ¶
Overview ¶
Functions for finding collections by partial match on collection title
Functions for parsing a search query
Constants ¶
const (
	MAX_RETURNED   = 50
	MIN_SIMILARITY = -4.75
	AVG_DOC_LEN    = 4497
	INTERCEPT      = -4.75 // From logistic regression
)
Variables ¶
var (
	// From logistic regression
	WEIGHT = []float64{0.080, 2.327, 3.040} // [BM25 words, BM25 bigrams, bit vector]
)
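The comments on INTERCEPT and WEIGHT indicate that both come from a logistic regression over three features. As a minimal sketch, assuming the similarity score is the intercept plus the weighted sum of those features (the usual log-odds form), the combination might look like the following. The helper names similarity and probability are hypothetical and are not part of this package; how the package actually applies MIN_SIMILARITY is not documented here.

```go
package main

import (
	"fmt"
	"math"
)

// Values copied from the package documentation above.
const (
	MIN_SIMILARITY = -4.75
	INTERCEPT      = -4.75 // From logistic regression
)

// [BM25 words, BM25 bigrams, bit vector]
var WEIGHT = []float64{0.080, 2.327, 3.040}

// similarity is a hypothetical helper. It assumes the score is the
// log-odds of a logistic regression: intercept plus weighted features.
func similarity(bm25Words, bm25Bigrams, bitVector float64) float64 {
	return INTERCEPT +
		WEIGHT[0]*bm25Words +
		WEIGHT[1]*bm25Bigrams +
		WEIGHT[2]*bitVector
}

// probability maps a log-odds score to the range (0, 1) with the
// standard logistic function.
func probability(logOdds float64) float64 {
	return 1.0 / (1.0 + math.Exp(-logOdds))
}

func main() {
	s := similarity(10.0, 1.0, 1.0)
	fmt.Printf("similarity=%.3f probability=%.3f\n", s, probability(s))

	// With all features at zero the score equals INTERCEPT, which has
	// the same value as MIN_SIMILARITY.
	fmt.Println(similarity(0, 0, 0) == MIN_SIMILARITY)
}
```

Under this assumption, a document that matches none of the three features scores exactly MIN_SIMILARITY, which may be why the two constants share a value.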
Functions ¶
This section is empty.
Types ¶
type Collection ¶
type Collection struct {
GlossFile, Title string
}
type DictQueryParser ¶
type DictQueryParser struct{ Tokenizer tokenizer.DictTokenizer }
func (DictQueryParser) ParseQuery ¶
func (parser DictQueryParser) ParseQuery(query string) []TextSegment
ParseQuery splits the query text into segments using dictionary lookups.
type Document ¶
type QueryParser ¶
type QueryParser interface {
ParseQuery(query string) []TextSegment
}
Parses input queries into a slice of text segments
func MakeQueryParser ¶
func MakeQueryParser(dict map[string]dicttypes.Word) QueryParser
Creates a QueryParser that uses the given dictionary.
type QueryResults ¶
type QueryResults struct {
Query, CollectionFile string
NumCollections, NumDocuments int
Collections []Collection
Documents []Document
Terms []TextSegment
}
func FindDocuments ¶
func FindDocuments(parser QueryParser, query string, advanced bool) (QueryResults, error)
Returns a QueryResults object containing matching collections, documents, and dictionary words. For dictionary lookup, each text segment contains the QueryText searched for and possibly a matching dictionary entry. Only Chinese words found in the dictionary have matching dictionary entries. If the query contains no Chinese words, the Chinese word senses matching the English or Pinyin are included in the TextSegment.Senses field.
func FindDocumentsInCol ¶
func FindDocumentsInCol(parser QueryParser, query, col_gloss_file string) (QueryResults, error)
Returns a QueryResults object containing matching collections, documents, and dictionary words within a specific collection. For dictionary lookup, each text segment contains the QueryText searched for and possibly a matching dictionary entry. Only Chinese words found in the dictionary have matching dictionary entries. If the query contains no Chinese words, the Chinese word senses matching the English or Pinyin are included in the TextSegment.Senses field.
type TextSegment ¶
A text segment contains the QueryText searched for and possibly a matching dictionary entry. There will only be matching dictionary entries for Chinese words in the dictionary. Non-Chinese text, punctuation, and unknown Chinese words will have nil DictEntry values and matching values will be included in the Senses field.
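The TextSegment definition is not shown on this page, so the sketch below uses stand-in types to illustrate how a caller might separate segments that have a dictionary entry from those that do not. The field names QueryText, DictEntry, and Senses come from the description above; their types, the word struct, and the splitByEntry helper are assumptions made for this example.

```go
package main

import "fmt"

// word is a hypothetical stand-in for a dictionary entry type.
type word struct {
	Simplified string
	Pinyin     string
}

// textSegment is a simplified stand-in for TextSegment. The field
// types are assumptions; only the field names come from the docs.
type textSegment struct {
	QueryText string
	DictEntry *word    // nil when there is no matching entry
	Senses    []string // assumed element type, for illustration
}

// splitByEntry returns the QueryText of segments that have a
// dictionary entry and of those that do not, preserving order.
func splitByEntry(terms []textSegment) (found, missing []string) {
	for _, t := range terms {
		if t.DictEntry != nil {
			found = append(found, t.QueryText)
		} else {
			missing = append(missing, t.QueryText)
		}
	}
	return found, missing
}

func main() {
	terms := []textSegment{
		{QueryText: "你好", DictEntry: &word{Simplified: "你好", Pinyin: "nǐ hǎo"}},
		{QueryText: "hello", Senses: []string{"你好"}},
		{QueryText: "!"},
	}
	found, missing := splitByEntry(terms)
	fmt.Println("with entry:", found)
	fmt.Println("without entry:", missing)
}
```

Checking DictEntry against nil before use mirrors the rule stated above that non-Chinese text, punctuation, and unknown Chinese words carry no dictionary entry.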