Documentation ¶
Overview ¶
Package for ngram analysis
Index ¶
- Constants
- type Bigram
- type BigramFreq
- type BigramFreqMap
- func (bfmPtr *BigramFreqMap) GetBigram(bigram *Bigram) BigramFreq
- func (bfmPtr *BigramFreqMap) GetBigramVal(id1, id2 int) (*Bigram, bool)
- func (bfmPtr *BigramFreqMap) Merge(more BigramFreqMap)
- func (bfmPtr *BigramFreqMap) PutBigram(bigram *Bigram)
- func (bfmPtr *BigramFreqMap) PutBigramFreq(bigramFreq BigramFreq)
- type CollocationMap
- type SortedBFM
Constants ¶
const MAX_COLLOCATIONS = 10 // Max to report
Max collocation elements for a single word
const MAX_STORE = 100 // Max to store
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Bigram ¶
type Bigram struct {
	HeadwordDef1 *dicttypes.Word // First headword
	HeadwordDef2 *dicttypes.Word // Second headword
	Example, ExFile, ExDocTitle, ExColTitle *string
}
A struct to hold an instance of a bigram. Since the text could be either simplified or traditional, bigrams are indexed by the headword ids. An example of the bigram is also included so that usage context can be investigated.
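The idea of indexing by headword id rather than by text can be sketched as follows. This is only an illustration: the `word` and `bigram` types are simplified, hypothetical stand-ins for `dicttypes.Word` and `Bigram`, and the `"id1:id2"` key format is an assumption, not necessarily the format the package uses.

```go
package main

import "fmt"

// word is a hypothetical stand-in for dicttypes.Word; only the fields
// needed for this sketch are included.
type word struct {
	HeadwordId  int
	Simplified  string
	Traditional string
}

// bigram mirrors the shape of Bigram with the example fields omitted.
type bigram struct {
	First, Second *word
}

// key builds a map key from the two headword ids, so the same key results
// whether the source text was simplified or traditional.
// The "id1:id2" format is an assumption made for this sketch.
func (b bigram) key() string {
	return fmt.Sprintf("%d:%d", b.First.HeadwordId, b.Second.HeadwordId)
}

func main() {
	w1 := &word{HeadwordId: 1, Simplified: "学", Traditional: "學"}
	w2 := &word{HeadwordId: 2, Simplified: "习", Traditional: "習"}
	b := bigram{First: w1, Second: w2}
	fmt.Println(b.key())
}
```

Because the key depends only on the ids, a bigram seen in a simplified text and the same bigram seen in a traditional text would land in the same map entry.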
var NULL_BIGRAM_PTR *Bigram
func NullBigram ¶
func NullBigram() *Bigram
func (*Bigram) ContainsFunctionWord ¶
Bigrams that contain function words should be excluded
func (*Bigram) Simplified ¶
The simplified text of the bigram
func (*Bigram) Traditional ¶
The traditional text of the bigram
type BigramFreq ¶
Single record of the frequency of occurrence of a bigram
func SortedFreq ¶
func SortedFreq(bfm BigramFreqMap) []BigramFreq
Get the bigram frequencies as a sorted array
type BigramFreqMap ¶
type BigramFreqMap map[string]BigramFreq
Map of the frequency of occurrence of a bigram in a collection of texts
func (*BigramFreqMap) GetBigram ¶
func (bfmPtr *BigramFreqMap) GetBigram(bigram *Bigram) BigramFreq
Get the frequency record for the bigram from the bigram frequency map
func (*BigramFreqMap) GetBigramVal ¶
func (bfmPtr *BigramFreqMap) GetBigramVal(id1, id2 int) (*Bigram, bool)
Reports whether the bigram frequency map contains a bigram with this combination of headword ids
func (*BigramFreqMap) Merge ¶
func (bfmPtr *BigramFreqMap) Merge(more BigramFreqMap)
Merge another bigram frequency map
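A merge of two frequency maps could look roughly like the sketch below. It assumes that merging sums the counts of keys present in both maps, which the description above does not state explicitly. `freqMap` is a hypothetical, simplified stand-in for `BigramFreqMap` that stores a plain count per key.

```go
package main

import "fmt"

// freqMap is a hypothetical stand-in for BigramFreqMap, mapping a bigram
// key to a plain occurrence count.
type freqMap map[string]int

// merge adds the counts from more into m. Keys present in both maps have
// their counts summed (an assumption made for this sketch).
func (m freqMap) merge(more freqMap) {
	for k, v := range more {
		m[k] += v
	}
}

func main() {
	a := freqMap{"1:2": 3, "2:3": 1}
	b := freqMap{"1:2": 2, "4:5": 7}
	a.merge(b)
	fmt.Println(a["1:2"], a["2:3"], a["4:5"])
}
```

The sketch uses a value receiver because Go maps are reference types; the package itself declares its methods on pointer receivers.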
func (*BigramFreqMap) PutBigram ¶
func (bfmPtr *BigramFreqMap) PutBigram(bigram *Bigram)
Put the bigram in the bigram frequency map
func (*BigramFreqMap) PutBigramFreq ¶
func (bfmPtr *BigramFreqMap) PutBigramFreq(bigramFreq BigramFreq)
Put the bigram frequency record in the bigram frequency map
type CollocationMap ¶
type CollocationMap map[int]BigramFreqMap
The key is the headword id, each entry is a bigram frequency map
func (*CollocationMap) MergeCollocationMap ¶
func (cmPtr *CollocationMap) MergeCollocationMap(more CollocationMap)
Merge another collocation map into this one
func (*CollocationMap) PutBigram ¶
func (cmPtr *CollocationMap) PutBigram(headwordId int, bigram *Bigram)
Put the bigram in the bigram frequency map for the specific word
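Since a collocation map is a map of maps, adding a bigram for a headword needs the inner map to exist first. The following is a minimal sketch of that pattern with hypothetical types: `collocations` stands in for `CollocationMap`, and the inner map holds plain counts instead of `BigramFreq` records.

```go
package main

import "fmt"

// collocations is a hypothetical stand-in for CollocationMap: the outer key
// is a headword id, the inner map counts bigram keys for that headword.
type collocations map[int]map[string]int

// put increments the count of bigramKey under headwordId, creating the
// inner map on first use.
func (c collocations) put(headwordId int, bigramKey string) {
	inner, ok := c[headwordId]
	if !ok {
		inner = map[string]int{}
		c[headwordId] = inner
	}
	inner[bigramKey]++
}

func main() {
	c := collocations{}
	c.put(7, "7:8")
	c.put(7, "7:8")
	c.put(7, "7:9")
	fmt.Println(c[7]["7:8"], c[7]["7:9"])
}
```

Writing to a missing inner map without the existence check would panic, because the zero value of a map type is nil.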
func (*CollocationMap) PutBigramFreq ¶
func (cmPtr *CollocationMap) PutBigramFreq(key int, bigramFreq BigramFreq)
Add the BigramFreq object to the CollocationMap
func (*CollocationMap) SortedCollocations ¶
func (cmPtr *CollocationMap) SortedCollocations(headwordId int) []BigramFreq
Get the sorted collocations for a given headword, making sure that there are at least two of each and with the total number less than MAX_COLLOCATIONS
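One way to read the description above is: keep only bigrams that occur at least twice, sort them by descending frequency, and cap the result at MAX_COLLOCATIONS entries. The sketch below follows that reading. The minimum-count interpretation, the inclusive cap, and all names (`freq`, `topCollocations`, `maxCollocations`) are assumptions for illustration, not the package's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// maxCollocations mirrors MAX_COLLOCATIONS from the constants section.
const maxCollocations = 10

// freq is a hypothetical, simplified stand-in for BigramFreq.
type freq struct {
	Key   string
	Count int
}

// topCollocations returns the entries with Count >= minCount, most frequent
// first, truncated to at most limit entries. Both the threshold and the
// inclusive cap are assumptions made for this sketch.
func topCollocations(m map[string]int, minCount, limit int) []freq {
	out := []freq{}
	for k, v := range m {
		if v >= minCount {
			out = append(out, freq{Key: k, Count: v})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Key < out[j].Key // tie-break for a stable order
	})
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	m := map[string]int{"1:2": 5, "1:3": 1, "1:4": 2}
	for _, f := range topCollocations(m, 2, maxCollocations) {
		fmt.Println(f.Key, f.Count)
	}
}
```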
type SortedBFM ¶
type SortedBFM struct {
// contains filtered or unexported fields
}
Sorted into descending order with most frequent bigram first
func NewSortedBFM ¶
func NewSortedBFM(bfm BigramFreqMap) *SortedBFM
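A descending frequency sort of this kind is commonly done in Go by flattening the map into a slice and implementing `sort.Interface`. The sketch below shows that approach with hypothetical types; the unexported fields of SortedBFM are not documented here, so this is not a reconstruction of its internals. The tie-break on the key is an addition for deterministic output.

```go
package main

import (
	"fmt"
	"sort"
)

// freq is a hypothetical, simplified stand-in for BigramFreq.
type freq struct {
	Key   string
	Count int
}

// byCountDesc implements sort.Interface, ordering by descending count and
// then by key so that equal counts sort deterministically.
type byCountDesc []freq

func (s byCountDesc) Len() int      { return len(s) }
func (s byCountDesc) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func (s byCountDesc) Less(i, j int) bool {
	if s[i].Count != s[j].Count {
		return s[i].Count > s[j].Count
	}
	return s[i].Key < s[j].Key
}

// sortedFreq flattens the map into a slice and sorts it with the most
// frequent entry first.
func sortedFreq(m map[string]int) []freq {
	out := make([]freq, 0, len(m))
	for k, v := range m {
		out = append(out, freq{Key: k, Count: v})
	}
	sort.Sort(byCountDesc(out))
	return out
}

func main() {
	for _, f := range sortedFreq(map[string]int{"1:2": 1, "2:3": 3, "3:4": 2}) {
		fmt.Println(f.Key, f.Count)
	}
}
```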