Documentation ¶
Overview ¶
Package index represents an n-gram index as a read-only structure. It provides both low-level methods for accessing the internal n-gram tree and higher-level methods for searching for a specific word.
Index ¶
- Constants
- type DynamicNgramIndex
- func (nib *DynamicNgramIndex) AddNgram(ngram []int, count int, metadata []column.AttrVal)
- func (nib *DynamicNgramIndex) Finish()
- func (nib *DynamicNgramIndex) GetIndex() *NgramIndex
- func (nib *DynamicNgramIndex) GetInfo() string
- func (nib *DynamicNgramIndex) GetNgramsAt(position int) *NgramSearchResult
- func (nib *DynamicNgramIndex) MetadataWriter() *column.MetadataWriter
- func (nib *DynamicNgramIndex) Save(dirPath string) error
- type NgramIndex
- type NgramResultItem
- type NgramSearchResult
- func (nsr *NgramSearchResult) Append(other *NgramSearchResult)
- func (nsr *NgramSearchResult) Filter(fn func(*NgramResultItem) bool)
- func (nsr *NgramSearchResult) HasNext() bool
- func (nsr *NgramSearchResult) Next() *NgramResultItem
- func (nsr *NgramSearchResult) RemoveNext(v *NgramResultItem) *NgramResultItem
- func (nsr *NgramSearchResult) ResetCursor()
- func (nsr *NgramSearchResult) Size() int
- func (nsr *NgramSearchResult) Slice(leftIdx int, rightIdx int) bool
- type SearchableIndex
- func (si *SearchableIndex) GetCol0Idx(widx int) int
- func (si *SearchableIndex) GetNgramsOf(word string) *NgramSearchResult
- func (si *SearchableIndex) GetNgramsOfColIdx(idx int) *NgramSearchResult
- func (si *SearchableIndex) GetNgramsOfWidx(idx int) *NgramSearchResult
- func (si *SearchableIndex) LoadRange(fromIdx int, toIdx int)
Constants ¶
const (
	// MaxNgramSize specifies the largest n-gram
	// (1-gram, 2-gram,..., n-gram) size Gloomy supports
	MaxNgramSize = 10
)
Types ¶
type DynamicNgramIndex ¶
type DynamicNgramIndex struct {
// contains filtered or unexported fields
}
DynamicNgramIndex is an n-gram index that allows adding new items.
func NewDynamicNgramIndex ¶
func NewDynamicNgramIndex(ngramSize int, initialLength int, attrMap map[string]string) *DynamicNgramIndex
NewDynamicNgramIndex creates a new instance of DynamicNgramIndex
func (*DynamicNgramIndex) AddNgram ¶
func (nib *DynamicNgramIndex) AddNgram(ngram []int, count int, metadata []column.AttrVal)
AddNgram adds a new n-gram, represented as an array of word indices, to the index.
func (*DynamicNgramIndex) Finish ¶
func (nib *DynamicNgramIndex) Finish()
Finish should be called once all n-grams have been added. The method frees up some memory preallocated for new n-grams.
func (*DynamicNgramIndex) GetIndex ¶
func (nib *DynamicNgramIndex) GetIndex() *NgramIndex
GetIndex returns the internal index structure.
func (*DynamicNgramIndex) GetInfo ¶
func (nib *DynamicNgramIndex) GetInfo() string
GetInfo returns brief human-readable information about the index.
func (*DynamicNgramIndex) GetNgramsAt ¶
func (nib *DynamicNgramIndex) GetNgramsAt(position int) *NgramSearchResult
GetNgramsAt returns all the n-grams whose first word index equals position.
func (*DynamicNgramIndex) MetadataWriter ¶
func (nib *DynamicNgramIndex) MetadataWriter() *column.MetadataWriter
MetadataWriter provides access to the attached metadata index writer.
func (*DynamicNgramIndex) Save ¶
func (nib *DynamicNgramIndex) Save(dirPath string) error
Save stores the current index data in a set of files within the provided directory.
type NgramIndex ¶
type NgramIndex struct {
// contains filtered or unexported fields
}
NgramIndex is a low-level implementation of an n-gram index.
func LoadNgramIndex ¶
func LoadNgramIndex(dirPath string, attrs []string) *NgramIndex
LoadNgramIndex loads index data from within a specified directory.
func NewNgramIndex ¶
func NewNgramIndex(ngramSize int, initialLength int, attrMap map[string]string) *NgramIndex
NewNgramIndex creates a new empty instance of NgramIndex
func (*NgramIndex) GetInfo ¶
func (n *NgramIndex) GetInfo() string
GetInfo returns a human readable overview of the index
func (*NgramIndex) GetNgramsAt ¶
func (n *NgramIndex) GetNgramsAt(position int) *NgramSearchResult
GetNgramsAt returns all the n-grams whose first word index equals position.
func (*NgramIndex) LoadRange ¶
func (n *NgramIndex) LoadRange(fromPos int, toPos int)
LoadRange loads data for all the configured n-gram and metadata columns delimited by the interval [fromPos, toPos] applied to the zeroth n-gram column. The matching intervals in the following columns are calculated automatically (e.g. 100-200 on the 0th column may mean 1700-3500 on the 1st column and 7000-9000 on the 2nd column).
Both interval ends are included.
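The way an interval on the zeroth column can translate into intervals on deeper columns is easier to see with a small model. The following is only a sketch under an assumed layout, not the package's actual storage format: it supposes that the children of neighbouring items are stored contiguously in the next column and that each item records the position of its first child. The names `childRange` and `firstChild` are hypothetical.

```go
package main

import "fmt"

// childRange maps an inclusive interval [from, to] of one column to the
// inclusive interval occupied by the children of those items in the next
// column. firstChild[i] is the position of the first child of item i and
// nextLen is the length of the next column (both hypothetical).
// The caller must pass 0 <= from <= to < len(firstChild).
func childRange(firstChild []int, nextLen, from, to int) (lo, hi int) {
	lo = firstChild[from]
	// By default the children of the last item run to the end of the column.
	hi = nextLen - 1
	if to+1 < len(firstChild) {
		// Otherwise they end right before the first child of item to+1.
		hi = firstChild[to+1] - 1
	}
	return lo, hi
}

func main() {
	// Column 0 holds 4 items, column 1 holds 8 items.
	firstChild := []int{0, 2, 3, 6}
	lo, hi := childRange(firstChild, 8, 1, 2)
	fmt.Println(lo, hi)
}
```

Applying the same mapping column by column would produce the intervals for the 2nd and later columns from a single interval on the zeroth one.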
type NgramResultItem ¶
type NgramSearchResult ¶
type NgramSearchResult struct {
// contains filtered or unexported fields
}
NgramSearchResult is a low-level result representation where n-grams are just arrays of integers (i.e. no translation to actual words yet).
The result is implemented to behave as a kind of iterator (using HasNext(), Next() methods) rather than copying all the result data into an array.
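The iterator behaviour described above can be illustrated with a stand-alone model. The types `resultItem` and `searchResult` below are hypothetical stand-ins written for this example; they assume a singly linked list with a cursor and are not the package's actual implementation.

```go
package main

import "fmt"

// resultItem is a hypothetical stand-in for NgramResultItem:
// an n-gram stored as word indices plus a link to the next item.
type resultItem struct {
	ngram []int
	next  *resultItem
}

// searchResult is a hypothetical stand-in for NgramSearchResult.
type searchResult struct {
	first, last, curr *resultItem
	size              int
}

// add appends an n-gram to the end of the list.
func (r *searchResult) add(ngram []int) {
	item := &resultItem{ngram: ngram}
	if r.first == nil {
		r.first, r.curr = item, item
	} else {
		r.last.next = item
	}
	r.last = item
	r.size++
}

// HasNext reports whether Next would return a valid item.
func (r *searchResult) HasNext() bool { return r.curr != nil }

// Next returns the item under the cursor and advances the cursor.
func (r *searchResult) Next() *resultItem {
	item := r.curr
	if item != nil {
		r.curr = item.next
	}
	return item
}

// ResetCursor moves the cursor back to the first item.
func (r *searchResult) ResetCursor() { r.curr = r.first }

// Size returns the number of items in O(1) using a stored counter.
func (r *searchResult) Size() int { return r.size }

// collect drains the iterator into a slice of n-grams.
func collect(r *searchResult) [][]int {
	var out [][]int
	for r.HasNext() {
		out = append(out, r.Next().ngram)
	}
	return out
}

func main() {
	res := &searchResult{}
	res.add([]int{1, 2})
	res.add([]int{1, 3})
	fmt.Println(collect(res), res.Size())
}
```

In this model a caller that needs a second pass over the same result calls ResetCursor first, because a fully drained iterator reports HasNext() == false.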
func (*NgramSearchResult) Append ¶
func (nsr *NgramSearchResult) Append(other *NgramSearchResult)
func (*NgramSearchResult) Filter ¶
func (nsr *NgramSearchResult) Filter(fn func(*NgramResultItem) bool)
func (*NgramSearchResult) HasNext ¶
func (nsr *NgramSearchResult) HasNext() bool
HasNext tests whether the result has at least one item left (i.e. whether it is possible to call Next() and get a valid row)
func (*NgramSearchResult) Next ¶
func (nsr *NgramSearchResult) Next() *NgramResultItem
Next returns the following result item.
func (*NgramSearchResult) RemoveNext ¶
func (nsr *NgramSearchResult) RemoveNext(v *NgramResultItem) *NgramResultItem
RemoveNext removes the item that follows 'v'. If v is nil, the first item is removed. The call resets the iterator to the first item.
func (*NgramSearchResult) ResetCursor ¶
func (nsr *NgramSearchResult) ResetCursor()
ResetCursor moves the cursor (the pointer to the current result item) back to the first result item.
func (*NgramSearchResult) Size ¶
func (nsr *NgramSearchResult) Size() int
Size returns the size of the result (this is an O(1) operation).
func (*NgramSearchResult) Slice ¶
func (nsr *NgramSearchResult) Slice(leftIdx int, rightIdx int) bool
Slice trims the internal list so that it keeps only the items from leftIdx (inclusive) up to rightIdx (exclusive). It returns true if the list was actually sliced and false otherwise. Slicing is performed only if rightIdx is strictly greater than leftIdx.
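The half-open interval and the boolean return value can be modelled on a plain Go slice. This is a minimal sketch, not the package's code: `sliceItems` and `clamp` are hypothetical names, and clamping out-of-range indices to the list bounds is an assumption, since the documentation does not state how such indices are treated.

```go
package main

import "fmt"

// clamp limits v to the interval [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// sliceItems keeps the items from leftIdx (inclusive) up to rightIdx
// (exclusive) and reports whether slicing took place. Nothing happens
// unless rightIdx is strictly greater than leftIdx.
// Clamping the indices to the list bounds is an assumption of this sketch.
func sliceItems(items [][]int, leftIdx, rightIdx int) ([][]int, bool) {
	if rightIdx <= leftIdx {
		return items, false
	}
	leftIdx = clamp(leftIdx, 0, len(items))
	rightIdx = clamp(rightIdx, 0, len(items))
	return items[leftIdx:rightIdx], true
}

func main() {
	items := [][]int{{1}, {2}, {3}, {4}}
	out, ok := sliceItems(items, 1, 3)
	fmt.Println(out, ok)
}
```

A typical use of such an operation is paging: a caller asking for results 20 to 29 would pass leftIdx 20 and rightIdx 30.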
type SearchableIndex ¶
type SearchableIndex struct {
// contains filtered or unexported fields
}
SearchableIndex is a higher-level representation of an n-gram index with some functions for searching.
Please note that SearchableIndex does not handle data loading automatically. It provides the LoadRange method to load a specified part of the column data, but the loading logic is up to the search routine (which decides which words are actually being looked for by parsing a query).
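Since loading is left to the search routine, that routine has to work out which part of the zeroth column it needs before it asks for n-grams. The sketch below shows one possible approach: resolve each word dictionary index with GetCol0Idx and load the smallest interval covering all of them. The interface `rangeLoader`, the function `loadForWords` and the test double `fakeIndex` are hypothetical; only the two method signatures are taken from this documentation. Treating a negative GetCol0Idx result as "word not present" and treating toIdx as inclusive are assumptions.

```go
package main

import "fmt"

// rangeLoader lists the two SearchableIndex methods this sketch relies on;
// the signatures are copied from the documentation.
type rangeLoader interface {
	GetCol0Idx(widx int) int
	LoadRange(fromIdx int, toIdx int)
}

// loadForWords loads the smallest zero-column interval covering all the
// given word dictionary indices. It returns false if nothing was loaded.
// A negative GetCol0Idx result is assumed to mean "not found".
func loadForWords(idx rangeLoader, widxs []int) bool {
	from, to := -1, -1
	for _, w := range widxs {
		c := idx.GetCol0Idx(w)
		if c < 0 {
			continue
		}
		if from < 0 || c < from {
			from = c
		}
		if c > to {
			to = c
		}
	}
	if from < 0 {
		return false
	}
	idx.LoadRange(from, to)
	return true
}

// fakeIndex is a test double standing in for *SearchableIndex.
type fakeIndex struct {
	col0   map[int]int // word dictionary index -> zero-column index
	loaded [][2]int    // recorded LoadRange calls
}

func (f *fakeIndex) GetCol0Idx(widx int) int {
	if c, ok := f.col0[widx]; ok {
		return c
	}
	return -1
}

func (f *fakeIndex) LoadRange(fromIdx int, toIdx int) {
	f.loaded = append(f.loaded, [2]int{fromIdx, toIdx})
}

func main() {
	idx := &fakeIndex{col0: map[int]int{10: 4, 11: 9, 12: 6}}
	fmt.Println(loadForWords(idx, []int{11, 10, 99}), idx.loaded)
}
```

Loading one covering interval keeps the number of LoadRange calls low at the cost of reading items between the requested words; a routine dealing with widely scattered words might prefer several smaller calls.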
func OpenSearchableIndex ¶
func OpenSearchableIndex(index *NgramIndex, wstore *wdict.WordDictReader) *SearchableIndex
OpenSearchableIndex creates an instance of SearchableIndex based on the provided NgramIndex instance and word dictionary reader (WordDictReader).
func (*SearchableIndex) GetCol0Idx ¶
func (si *SearchableIndex) GetCol0Idx(widx int) int
GetCol0Idx returns the index within the zero column of the word identified by its index within the word dictionary.
func (*SearchableIndex) GetNgramsOf ¶
func (si *SearchableIndex) GetNgramsOf(word string) *NgramSearchResult
GetNgramsOf returns all the n-grams with first word equal to the 'word' argument
func (*SearchableIndex) GetNgramsOfColIdx ¶
func (si *SearchableIndex) GetNgramsOfColIdx(idx int) *NgramSearchResult
GetNgramsOfColIdx returns all the n-grams with the first word identified by its index within zero column
func (*SearchableIndex) GetNgramsOfWidx ¶
func (si *SearchableIndex) GetNgramsOfWidx(idx int) *NgramSearchResult
GetNgramsOfWidx returns all the n-grams with the first word identified by its word dictionary index value
func (*SearchableIndex) LoadRange ¶
func (si *SearchableIndex) LoadRange(fromIdx int, toIdx int)
LoadRange loads column data starting from fromIdx up to toIdx