index

package
v0.0.0-...-9c66339 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 15, 2018 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package index represents an n-gram index as a read-only structure. It provides both low-level methods for accessing the internal n-gram tree and higher-level methods for searching for a specific word.

Constants

const (
	// MaxNgramSize specifies the largest n-gram
	// (1-gram, 2-gram,..., n-gram) size Gloomy supports
	MaxNgramSize = 10
)

Types

type DynamicNgramIndex

type DynamicNgramIndex struct {
	// contains filtered or unexported fields
}

DynamicNgramIndex allows adding items to the index.

func NewDynamicNgramIndex

func NewDynamicNgramIndex(ngramSize int, initialLength int, attrMap map[string]string) *DynamicNgramIndex

NewDynamicNgramIndex creates a new instance of DynamicNgramIndex

func (*DynamicNgramIndex) AddNgram

func (nib *DynamicNgramIndex) AddNgram(ngram []int, count int, metadata []column.AttrVal)

AddNgram adds a new n-gram represented as an array of indices to the index
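To illustrate what "an array of indices" means, here is a minimal sketch of turning words into integer indices before an n-gram is added. The wordDict type and its methods are hypothetical and exist only for this example; they are not the package's (or wdict's) actual API.

```go
package main

import "fmt"

// wordDict is a hypothetical word dictionary used only for illustration.
type wordDict struct {
	ids map[string]int
}

// idOf returns the index of a word, assigning a new one on first use.
func (d *wordDict) idOf(word string) int {
	if id, ok := d.ids[word]; ok {
		return id
	}
	id := len(d.ids)
	d.ids[word] = id
	return id
}

// encode turns a sequence of words into the index form of an n-gram.
func (d *wordDict) encode(words []string) []int {
	ngram := make([]int, len(words))
	for i, w := range words {
		ngram[i] = d.idOf(w)
	}
	return ngram
}

func main() {
	dict := &wordDict{ids: make(map[string]int)}
	fmt.Println(dict.encode([]string{"new", "york", "city"})) // [0 1 2]
	fmt.Println(dict.encode([]string{"new", "city"}))         // [0 2]
}
```

A slice produced this way has the same shape as the ngram argument of AddNgram.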

func (*DynamicNgramIndex) Finish

func (nib *DynamicNgramIndex) Finish()

Finish should be called once adding of n-grams is done. The method frees up some memory preallocated for new n-grams.
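One common way to free preallocated memory in Go is to copy the used part of a slice into an exactly sized one. The sketch below shows that idea with a hypothetical growableColumn type; whether the package does it this way is an assumption.

```go
package main

import "fmt"

// growableColumn is a hypothetical stand-in for a column that is filled
// incrementally; it is not the package's actual type.
type growableColumn struct {
	items []int
}

// newGrowableColumn preallocates room for initialLength items.
func newGrowableColumn(initialLength int) *growableColumn {
	return &growableColumn{items: make([]int, 0, initialLength)}
}

// add appends a value, growing the backing array when needed.
func (c *growableColumn) add(v int) {
	c.items = append(c.items, v)
}

// finish copies the used part into an exactly sized slice so that the
// unused preallocated tail can be garbage collected.
func (c *growableColumn) finish() {
	trimmed := make([]int, len(c.items))
	copy(trimmed, c.items)
	c.items = trimmed
}

func main() {
	col := newGrowableColumn(1000)
	col.add(7)
	col.add(42)
	fmt.Println(len(col.items), cap(col.items)) // 2 1000
	col.finish()
	fmt.Println(len(col.items), cap(col.items)) // 2 2
}
```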

func (*DynamicNgramIndex) GetIndex

func (nib *DynamicNgramIndex) GetIndex() *NgramIndex

GetIndex returns internal index structure

func (*DynamicNgramIndex) GetInfo

func (nib *DynamicNgramIndex) GetInfo() string

GetInfo returns brief human-readable information about the index.

func (*DynamicNgramIndex) GetNgramsAt

func (nib *DynamicNgramIndex) GetNgramsAt(position int) *NgramSearchResult

GetNgramsAt returns all the n-grams whose first word index equals position.

func (*DynamicNgramIndex) MetadataWriter

func (nib *DynamicNgramIndex) MetadataWriter() *column.MetadataWriter

MetadataWriter provides access to attached metadata index writer

func (*DynamicNgramIndex) Save

func (nib *DynamicNgramIndex) Save(dirPath string) error

Save stores the current index data to a set of files within the provided directory.

type NgramIndex

type NgramIndex struct {
	// contains filtered or unexported fields
}

NgramIndex is a low-level implementation of an n-gram index.

func LoadNgramIndex

func LoadNgramIndex(dirPath string, attrs []string) *NgramIndex

LoadNgramIndex loads index data from within a specified directory.

func NewNgramIndex

func NewNgramIndex(ngramSize int, initialLength int, attrMap map[string]string) *NgramIndex

NewNgramIndex creates a new empty instance of NgramIndex

func (*NgramIndex) GetInfo

func (n *NgramIndex) GetInfo() string

GetInfo returns a human readable overview of the index

func (*NgramIndex) GetNgramsAt

func (n *NgramIndex) GetNgramsAt(position int) *NgramSearchResult

GetNgramsAt returns all the n-grams whose first word index equals position.

func (*NgramIndex) LoadRange

func (n *NgramIndex) LoadRange(fromPos int, toPos int)

LoadRange loads data for all the configured n-gram and metadata columns delimited by the interval [fromPos, toPos] applied to the zeroth n-gram column. The matching ranges on the following columns are calculated automatically (e.g. 100-200 on the 0th column may mean 1700-3500 on the 1st column and 7000-9000 on the 2nd column).

Both interval ends are included.

type NgramResultItem

type NgramResultItem struct {
	Ngram    []int
	Count    int
	Metadata []string
	// contains filtered or unexported fields
}

type NgramSearchResult

type NgramSearchResult struct {
	// contains filtered or unexported fields
}

NgramSearchResult is a low level result representation where n-grams are just arrays of integers (i.e. no translation to actual words yet).

The result is implemented to behave as a kind of iterator (using HasNext(), Next() methods) rather than copying all the result data into an array.
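A minimal sketch of this iterator style follows. It uses simplified stand-in types (resultItem, searchResult) backed by a linked list, which is an assumption about the internals; only the HasNext/Next/ResetCursor pattern itself comes from the documentation.

```go
package main

import "fmt"

// resultItem and searchResult are simplified stand-ins for illustration.
type resultItem struct {
	ngram []int
	count int
	next  *resultItem
}

type searchResult struct {
	first  *resultItem
	last   *resultItem
	cursor *resultItem
}

// add appends an item to the end of the list.
func (r *searchResult) add(ngram []int, count int) {
	item := &resultItem{ngram: ngram, count: count}
	if r.first == nil {
		r.first = item
	} else {
		r.last.next = item
	}
	if r.cursor == nil {
		r.cursor = item
	}
	r.last = item
}

// hasNext reports whether next would return a valid item.
func (r *searchResult) hasNext() bool { return r.cursor != nil }

// next returns the current item and advances the cursor.
func (r *searchResult) next() *resultItem {
	item := r.cursor
	if item != nil {
		r.cursor = item.next
	}
	return item
}

// resetCursor rewinds the cursor to the first item.
func (r *searchResult) resetCursor() { r.cursor = r.first }

func main() {
	res := &searchResult{}
	res.add([]int{1, 2}, 10)
	res.add([]int{1, 3}, 4)
	for res.hasNext() {
		item := res.next()
		fmt.Println(item.ngram, item.count)
	}
	res.resetCursor()
	fmt.Println(res.hasNext()) // true
}
```

Iterating this way avoids copying the whole result into a new array when a caller only needs to walk it once.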

func (*NgramSearchResult) Append

func (nsr *NgramSearchResult) Append(other *NgramSearchResult)

func (*NgramSearchResult) Filter

func (nsr *NgramSearchResult) Filter(fn func(*NgramResultItem) bool)

func (*NgramSearchResult) HasNext

func (nsr *NgramSearchResult) HasNext() bool

HasNext tests whether the result has at least one item left (i.e. whether it is possible to call Next() and get a valid row)

func (*NgramSearchResult) Next

func (nsr *NgramSearchResult) Next() *NgramResultItem

Next returns the next result item.

func (*NgramSearchResult) RemoveNext

func (nsr *NgramSearchResult) RemoveNext(v *NgramResultItem) *NgramResultItem

RemoveNext removes the item that follows 'v'. If v is nil, the first item is removed. The call resets the iterator to the first item.
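The behavior described above can be sketched on a singly linked list. The item and list types are hypothetical, and returning the removed item is an assumption of this sketch, since the documentation does not say what RemoveNext returns.

```go
package main

import "fmt"

// item and list are hypothetical stand-ins for illustration.
type item struct {
	value int
	next  *item
}

type list struct {
	first  *item
	cursor *item
}

// push appends a value to the end of the list (O(n), fine for a sketch).
func (l *list) push(v int) *item {
	it := &item{value: v}
	if l.first == nil {
		l.first = it
	} else {
		curr := l.first
		for curr.next != nil {
			curr = curr.next
		}
		curr.next = it
	}
	l.cursor = l.first
	return it
}

// removeNext unlinks the item after v, or the first item when v is nil,
// and rewinds the cursor. Returning the removed item is an assumption.
func (l *list) removeNext(v *item) *item {
	var removed *item
	if v == nil {
		removed = l.first
		if removed != nil {
			l.first = removed.next
		}
	} else {
		removed = v.next
		if removed != nil {
			v.next = removed.next
		}
	}
	if removed != nil {
		removed.next = nil
	}
	l.cursor = l.first
	return removed
}

// values collects the list contents into a slice.
func (l *list) values() []int {
	out := []int{}
	for curr := l.first; curr != nil; curr = curr.next {
		out = append(out, curr.value)
	}
	return out
}

func main() {
	l := &list{}
	a := l.push(10)
	l.push(20)
	l.push(30)
	l.removeNext(a)         // removes 20
	fmt.Println(l.values()) // [10 30]
	l.removeNext(nil)       // removes 10
	fmt.Println(l.values()) // [30]
}
```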

func (*NgramSearchResult) ResetCursor

func (nsr *NgramSearchResult) ResetCursor()

ResetCursor moves the cursor back to the first result item.

func (*NgramSearchResult) Size

func (nsr *NgramSearchResult) Size() int

Size returns the size of the result (this is an O(1) operation).

func (*NgramSearchResult) Slice

func (nsr *NgramSearchResult) Slice(leftIdx int, rightIdx int) bool

Slice slices the internal list, preserving items from leftIdx (inclusive) up to rightIdx (exclusive). It returns true if a slice was actually performed and false otherwise. The slice is performed only if rightIdx is strictly greater than leftIdx.
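The half-open interval semantics can be sketched on a plain Go slice. The result type is hypothetical, and clamping out-of-range indices is an assumption of this sketch rather than documented behavior.

```go
package main

import "fmt"

// result is a hypothetical stand-in backed by a plain slice.
type result struct {
	items []int
}

// slice keeps the items in [leftIdx, rightIdx) and reports whether a
// slice was performed. Clamping the indices is this sketch's assumption.
func (r *result) slice(leftIdx, rightIdx int) bool {
	if leftIdx < 0 {
		leftIdx = 0
	}
	if rightIdx > len(r.items) {
		rightIdx = len(r.items)
	}
	if rightIdx <= leftIdx {
		return false // nothing to keep, leave the result untouched
	}
	r.items = r.items[leftIdx:rightIdx]
	return true
}

func main() {
	r := &result{items: []int{10, 20, 30, 40, 50}}
	fmt.Println(r.slice(1, 3), r.items) // true [20 30]
	fmt.Println(r.slice(2, 2), r.items) // false [20 30]
}
```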

type SearchableIndex

type SearchableIndex struct {
	// contains filtered or unexported fields
}

SearchableIndex is a higher-level representation of an n-gram index with functions that allow searching.

Please note that SearchableIndex does not handle data loading automatically. It provides the LoadRange method to load a specified part of the column data, but the loading logic is up to the search routine (which decides which words are actually being looked for by parsing a query).

func OpenSearchableIndex

func OpenSearchableIndex(index *NgramIndex, wstore *wdict.WordDictReader) *SearchableIndex

OpenSearchableIndex creates an instance of SearchableIndex based on an internal NgramIndex instance and a WordDictReader instance.

func (*SearchableIndex) GetCol0Idx

func (si *SearchableIndex) GetCol0Idx(widx int) int

GetCol0Idx returns the index within the zero column of the provided word, which is identified by its index within the word dictionary.

func (*SearchableIndex) GetNgramsOf

func (si *SearchableIndex) GetNgramsOf(word string) *NgramSearchResult

GetNgramsOf returns all the n-grams with first word equal to the 'word' argument

func (*SearchableIndex) GetNgramsOfColIdx

func (si *SearchableIndex) GetNgramsOfColIdx(idx int) *NgramSearchResult

GetNgramsOfColIdx returns all the n-grams with the first word identified by its index within zero column

func (*SearchableIndex) GetNgramsOfWidx

func (si *SearchableIndex) GetNgramsOfWidx(idx int) *NgramSearchResult

GetNgramsOfWidx returns all the n-grams with the first word identified by its word dictionary index value
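The three lookup variants differ in how the first word is identified: by string, by word dictionary index, or by position in the zero column. The sketch below models that chain with a hypothetical miniIndex type. It assumes the zero column stores sorted dictionary indices and uses binary search, which may not match the package's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// miniIndex is a hypothetical model: dict maps words to dictionary
// indices, col0 holds the sorted dictionary indices of first words.
type miniIndex struct {
	dict map[string]int
	col0 []int
}

// col0Idx finds the position of a dictionary index within column zero,
// or -1 when no n-gram starts with that word.
func (m *miniIndex) col0Idx(widx int) int {
	i := sort.SearchInts(m.col0, widx)
	if i < len(m.col0) && m.col0[i] == widx {
		return i
	}
	return -1
}

// lookup resolves word -> dictionary index -> column-zero position.
func (m *miniIndex) lookup(word string) int {
	widx, ok := m.dict[word]
	if !ok {
		return -1 // word is not in the dictionary at all
	}
	return m.col0Idx(widx)
}

func main() {
	m := &miniIndex{
		dict: map[string]int{"a": 0, "cat": 3, "dog": 5, "the": 9},
		col0: []int{0, 5, 9}, // "cat" never starts an n-gram here
	}
	fmt.Println(m.lookup("dog")) // 1
	fmt.Println(m.lookup("cat")) // -1
	fmt.Println(m.lookup("emu")) // -1
}
```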

func (*SearchableIndex) LoadRange

func (si *SearchableIndex) LoadRange(fromIdx int, toIdx int)

LoadRange loads column data starting from fromIdx up to toIdx
