The highest tagged major version is v2.

package rank

Version: v2.1.1+incompatible | Published: Sep 5, 2018 | License: MIT | Imports: 2 | Imported by: 15

Repository

github.com/DavidBelicza/TextRank

Documentation ¶

Index ¶

Constants
func Calculate(ranks *Rank, algorithm Algorithm)
type Algorithm
type AlgorithmChain
- func NewAlgorithmChain() *AlgorithmChain
- func (a *AlgorithmChain) WeightingHits(wordID int, rank *Rank) float32
- func (a *AlgorithmChain) WeightingRelation(word1ID int, word2ID int, rank *Rank) float32
type AlgorithmDefault
- func NewAlgorithmDefault() *AlgorithmDefault
- func (a *AlgorithmDefault) WeightingHits(wordID int, rank *Rank) float32
- func (a *AlgorithmDefault) WeightingRelation(word1ID int, word2ID int, rank *Rank) float32
type Phrase
- func FindPhrases(ranks *Rank) []Phrase
type Rank
- func NewRank() *Rank
- func (rank *Rank) AddNewWord(word string, prevWordIdx int, sentenceID int) (wordID int)
- func (rank *Rank) GetWordData() map[int]*Word
- func (rank *Rank) IsWordExist(word string) bool
- func (rank *Rank) UpdateRightConnection(wordID int, rightWordID int)
- func (rank *Rank) UpdateWord(word string, prevWordIdx int, sentenceID int) (wordID int)
type Relation
- func (relation *Relation) AddRelation(wordID int, relatedWordID int, sentenceID int)
type Score
type Sentence
- func FindSentences(ranks *Rank, kind int, limit int) []Sentence
- func FindSentencesByPhrases(ranks *Rank, words []string) []Sentence
- func FindSentencesFrom(ranks *Rank, id int, limit int) []Sentence
type SingleWord
- func FindSingleWords(ranks *Rank) []SingleWord
type Word

Constants ¶

const ByQty = 0

ByQty filters by the occurrence of words.

const ByRelation = 1

ByRelation filters by phrase weight.

Variables ¶

This section is empty.

Functions ¶

func Calculate ¶

func Calculate(ranks *Rank, algorithm Algorithm)

Calculate function ranks the words using the given Algorithm implementation.

Types ¶

type Algorithm ¶

type Algorithm interface {
	WeightingRelation(
		word1ID int,
		word2ID int,
		rank *Rank,
	) float32

	WeightingHits(
		wordID int,
		rank *Rank,
	) float32
}

Algorithm interface and its methods make polymorphic use of the weighting process possible: any type that implements both methods can be passed to Calculate.
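As a rough illustration of that polymorphism, the sketch below implements the two-method interface with a custom type. The Word, Score, Relation and Rank types here are trimmed local stand-ins that mirror only the documented field names, so the block compiles on its own; RawCount and sampleRank are hypothetical names, and returning raw counts is an assumption for the example, not a description of what the library's built-in algorithms do.

```go
package main

import "fmt"

// Local stand-ins mirroring only the documented fields used below.
// They are NOT the library's types; with the real package, use rank.Rank etc.
type Word struct{ Qty int }

type Score struct{ Qty int }

type Relation struct{ Node map[int]map[int]Score }

type Rank struct {
	Relation Relation
	Words    map[int]*Word
}

// Algorithm has the same method set as the documented interface.
type Algorithm interface {
	WeightingRelation(word1ID int, word2ID int, rank *Rank) float32
	WeightingHits(wordID int, rank *Rank) float32
}

// RawCount is a hypothetical implementation that returns raw occurrence counts.
type RawCount struct{}

// WeightingRelation returns how often the pair (word1ID, word2ID) was seen.
// Reading a missing key from a map yields the zero Score, so unknown pairs weigh 0.
func (RawCount) WeightingRelation(word1ID int, word2ID int, rank *Rank) float32 {
	return float32(rank.Relation.Node[word1ID][word2ID].Qty)
}

// WeightingHits returns how often the word was seen, or 0 for an unknown ID.
func (RawCount) WeightingHits(wordID int, rank *Rank) float32 {
	w, ok := rank.Words[wordID]
	if !ok {
		return 0
	}
	return float32(w.Qty)
}

// sampleRank builds a tiny data set: word 0 seen 3 times, word 1 seen twice,
// and the pair (0, 1) seen twice.
func sampleRank() *Rank {
	return &Rank{
		Relation: Relation{Node: map[int]map[int]Score{0: {1: {Qty: 2}}}},
		Words:    map[int]*Word{0: {Qty: 3}, 1: {Qty: 2}},
	}
}

func main() {
	var algo Algorithm = RawCount{} // any implementation fits the interface
	r := sampleRank()
	fmt.Println(algo.WeightingHits(0, r), algo.WeightingRelation(0, 1, r))
}
```

With the real package, a type with these two methods (taking *rank.Rank) should be usable wherever Calculate expects an Algorithm.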

type AlgorithmChain ¶

type AlgorithmChain struct{}

AlgorithmChain struct is the combined implementation of Algorithm. It is a good example of how weighting can be changed by a different implementation. It can weight a word or a phrase by comparing them.

func NewAlgorithmChain ¶

func NewAlgorithmChain() *AlgorithmChain

NewAlgorithmChain constructor returns an AlgorithmChain pointer.

func (*AlgorithmChain) WeightingHits ¶

func (a *AlgorithmChain) WeightingHits(
	wordID int,
	rank *Rank,
) float32

WeightingHits method ranks the words by their occurrence.

func (*AlgorithmChain) WeightingRelation ¶

func (a *AlgorithmChain) WeightingRelation(
	word1ID int,
	word2ID int,
	rank *Rank,
) float32

WeightingRelation method combines the text rank algorithm with word occurrence to weight a phrase.

type AlgorithmDefault ¶

type AlgorithmDefault struct{}

AlgorithmDefault struct is the basic implementation of Algorithm. It can weight a word or phrase by comparing them.

func NewAlgorithmDefault ¶

func NewAlgorithmDefault() *AlgorithmDefault

NewAlgorithmDefault constructor returns an AlgorithmDefault pointer.

func (*AlgorithmDefault) WeightingHits ¶

func (a *AlgorithmDefault) WeightingHits(
	wordID int,
	rank *Rank,
) float32

WeightingHits method ranks the words by their occurrence.

func (*AlgorithmDefault) WeightingRelation ¶

func (a *AlgorithmDefault) WeightingRelation(
	word1ID int,
	word2ID int,
	rank *Rank,
) float32

WeightingRelation method is the traditional text rank algorithm for weighting a phrase.

type Phrase ¶

type Phrase struct {
	LeftID  int
	RightID int
	Left    string
	Right   string
	Weight  float32
	Qty     int
}

Phrase struct contains a single phrase and its data.

LeftID is the ID of word 1.

RightID is the ID of word 2.

Left is the token of word 1.

Right is the token of word 2.

Weight is between 0.00 and 1.00.

Qty is the number of occurrences of the phrase.
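Weights in this package are documented as lying between 0.00 and 1.00, and both Rank and Relation keep Min and Max occurrence values. One common way to map a raw count onto that range is min-max normalization, sketched below. Whether the library uses exactly this formula is not stated here, so treat it as an assumption; normalize is a hypothetical helper, and returning 1 when all counts are equal is a choice made only for this example.

```go
package main

import "fmt"

// normalize maps a raw count onto [0, 1] relative to the smallest (lo)
// and largest (hi) observed counts. When hi <= lo there is no spread,
// so the function returns 1 (an arbitrary choice for this sketch).
func normalize(qty, lo, hi float32) float32 {
	if hi <= lo {
		return 1
	}
	return (qty - lo) / (hi - lo)
}

func main() {
	// Counts 1, 3 and 5 in a data set whose rarest item occurs once
	// and whose most frequent item occurs five times.
	for _, qty := range []float32{1, 3, 5} {
		fmt.Printf("qty=%v weight=%.2f\n", qty, normalize(qty, 1, 5))
	}
}
```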

func FindPhrases ¶

func FindPhrases(ranks *Rank) []Phrase

FindPhrases function is wrapped by textrank.FindPhrases. Use the wrapper instead.

type Rank ¶

type Rank struct {
	Max         float32
	Min         float32
	Relation    Relation
	SentenceMap map[int]string
	Words       map[int]*Word
	WordValID   map[string]int
}

Rank struct contains all original raw sentences, words, tokens, phrases, indexes, word hits, phrase hits and minimum-maximum values.

Max is the occurrence of the most used word.

Min is the occurrence of the least used word. It is always greater than 0.

Relation is the Relation object that contains the phrases.

SentenceMap contains raw sentences. Key is the sentence ID, value is the sentence itself.

Words contains Word objects. Key is the word ID, value is a pointer to the Word.

WordValID contains words. Key is the word/token, value is the ID.
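The Words and WordValID fields together form a two-way index: ID to word data, and token to ID. The sketch below shows that idea with a trimmed local Word type and a hypothetical Index type. Assigning IDs from the current map size is an assumption for the example; the library's own ID scheme is not described here.

```go
package main

import "fmt"

// Word is a trimmed stand-in for the documented Word struct.
type Word struct {
	ID    int
	Token string
	Qty   int
}

// Index is a hypothetical type holding the two maps named after the
// documented Rank fields.
type Index struct {
	Words     map[int]*Word  // word ID -> word data
	WordValID map[string]int // token   -> word ID
}

// NewIndex returns an Index with both maps initialized.
func NewIndex() *Index {
	return &Index{
		Words:     make(map[int]*Word),
		WordValID: make(map[string]int),
	}
}

// Add registers a token. A known token keeps its ID and only its
// counter grows; a new token receives the next free ID.
func (ix *Index) Add(token string) int {
	if id, ok := ix.WordValID[token]; ok {
		ix.Words[id].Qty++
		return id
	}
	id := len(ix.Words)
	ix.Words[id] = &Word{ID: id, Token: token, Qty: 1}
	ix.WordValID[token] = id
	return id
}

func main() {
	ix := NewIndex()
	for _, tok := range []string{"go", "text", "go"} {
		id := ix.Add(tok)
		fmt.Printf("%s -> id %d (qty %d)\n", tok, id, ix.Words[id].Qty)
	}
}
```

Keeping both maps makes the "does this word exist" check and the "fetch data by ID" lookup each a single map access.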

func NewRank ¶

func NewRank() *Rank

NewRank constructor returns a Rank pointer.

func (*Rank) AddNewWord ¶

func (rank *Rank) AddNewWord(word string, prevWordIdx int, sentenceID int) (wordID int)

AddNewWord method adds a new word to the rank object and assigns its ID.

func (*Rank) GetWordData ¶

func (rank *Rank) GetWordData() map[int]*Word

GetWordData method returns all words as a map from word ID to Word pointer.

func (*Rank) IsWordExist ¶

func (rank *Rank) IsWordExist(word string) bool

IsWordExist method returns true when the given word is already in the rank.

func (*Rank) UpdateRightConnection ¶

func (rank *Rank) UpdateRightConnection(wordID int, rightWordID int)

UpdateRightConnection method adds the right connection to the word. It can be used once a word has been added and the next word is known.

func (*Rank) UpdateWord ¶

func (rank *Rank) UpdateWord(word string, prevWordIdx int, sentenceID int) (wordID int)

UpdateWord method updates a word that already exists in the rank object. It returns the word's ID.

type Relation ¶

type Relation struct {
	Max  float32
	Min  float32
	Node map[int]map[int]Score
}

Relation struct contains the phrase data.

Max is the occurrence of the most used phrase.

Min is the occurrence of the least used phrase. It is always greater than 0.

Node contains the Scores. The first key is the ID of word 1, the second key is the ID of word 2, and the value is the Score that contains the data about their relation.
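The nested-map layout of Node can be sketched as follows, using local copies of the documented Score and Relation shapes. addPair is a hypothetical helper written for this example and is not the library's AddRelation; in particular, storing the pair in one direction only is an assumption. The sketch mainly shows that a Score stored by value must be written back after it is modified.

```go
package main

import "fmt"

// Score and Relation are local copies of the documented struct shapes.
type Score struct {
	Qty         int
	Weight      float32
	SentenceIDs []int
}

type Relation struct {
	Node map[int]map[int]Score
}

// addPair records one occurrence of the pair (word1ID, word2ID) in the
// given sentence. Inner maps are created lazily. Because the map stores
// Score values rather than pointers, the updated copy is stored back.
func (r *Relation) addPair(word1ID, word2ID, sentenceID int) {
	if r.Node == nil {
		r.Node = make(map[int]map[int]Score)
	}
	if r.Node[word1ID] == nil {
		r.Node[word1ID] = make(map[int]Score)
	}
	s := r.Node[word1ID][word2ID] // zero Score when the pair is new
	s.Qty++
	s.SentenceIDs = append(s.SentenceIDs, sentenceID)
	r.Node[word1ID][word2ID] = s
}

func main() {
	var rel Relation
	rel.addPair(0, 1, 0) // pair seen in sentence 0
	rel.addPair(0, 1, 2) // and again in sentence 2
	s := rel.Node[0][1]
	fmt.Printf("qty=%d sentences=%v\n", s.Qty, s.SentenceIDs)
}
```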

func (*Relation) AddRelation ¶

func (relation *Relation) AddRelation(wordID int, relatedWordID int, sentenceID int)

AddRelation method adds a new relation to Relation object.

type Score ¶

type Score struct {
	Qty         int
	Weight      float32
	SentenceIDs []int
}

Score struct contains data about a relation of two words.

Qty is the occurrence of the phrase.

Weight is the weight of the phrase between 0.00 and 1.00.

SentenceIDs contains the IDs of all sentences that contain the phrase.

type Sentence ¶

type Sentence struct {
	ID    int
	Value string
}

Sentence struct contains a single sentence and its data.

func FindSentences ¶

func FindSentences(ranks *Rank, kind int, limit int) []Sentence

FindSentences function is wrapped by textrank.FindSentencesByRelationWeight and textrank.FindSentencesByWordQtyWeight. Use the wrappers instead.

func FindSentencesByPhrases ¶

func FindSentencesByPhrases(ranks *Rank, words []string) []Sentence

FindSentencesByPhrases function is wrapped by textrank.FindSentencesByPhraseChain. Use the wrapper instead.

func FindSentencesFrom ¶

func FindSentencesFrom(ranks *Rank, id int, limit int) []Sentence

FindSentencesFrom function is wrapped by textrank.FindSentencesFrom. Use the wrapper instead.

type SingleWord ¶

type SingleWord struct {
	ID     int
	Word   string
	Weight float32
	Qty    int
}

SingleWord struct contains a single word and its data.

ID is the ID of the word.

Word is the word itself, the token.

Weight is the weight of the word between 0.00 and 1.00.

Qty is the number of occurrences of the word.

func FindSingleWords ¶

func FindSingleWords(ranks *Rank) []SingleWord

FindSingleWords function is wrapped by textrank.FindSingleWords. Use the wrapper instead.

type Word ¶

type Word struct {
	ID              int
	SentenceIDs     []int
	ConnectionLeft  map[int]int
	ConnectionRight map[int]int
	Token           string
	Qty             int
	Weight          float32
}

Word struct contains all data about the words.

If a word occurs multiple times in the text, every occurrence points to the same ID, so each Word is unique.

SentenceIDs contains the IDs of all sentences that contain the word.

ConnectionLeft contains all words that are connected to this word on the left side. The map key is the ID of the related word and the value is the number of occurrences.

ConnectionRight contains all words that are connected to this word on the right side. The map key is the ID of the related word and the value is the number of occurrences.

Token is the word itself in its tokenized form, not the original.

Qty is the number of occurrences of the word.

Weight is the weight of the word between 0.00 and 1.00.
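The left and right connection maps can be pictured with the sketch below, which walks a sequence of word IDs and counts each adjacent pair on both sides. It uses a trimmed local Word type; newWord, sampleWords and linkSequence are hypothetical helpers, and the sketch ignores sentence boundaries, which the library may treat differently.

```go
package main

import "fmt"

// Word is a trimmed stand-in for the documented Word struct.
type Word struct {
	ID              int
	Token           string
	ConnectionLeft  map[int]int // ID of the word on the left  -> occurrences
	ConnectionRight map[int]int // ID of the word on the right -> occurrences
}

// newWord returns a Word with both connection maps initialized.
func newWord(id int, token string) *Word {
	return &Word{
		ID:              id,
		Token:           token,
		ConnectionLeft:  make(map[int]int),
		ConnectionRight: make(map[int]int),
	}
}

// sampleWords builds four unique words keyed by ID.
func sampleWords() map[int]*Word {
	words := make(map[int]*Word)
	for id, tok := range []string{"go", "is", "fun", "fast"} {
		words[id] = newWord(id, tok)
	}
	return words
}

// linkSequence records every adjacent pair in ids: the left word gains a
// right connection and the right word gains a left connection.
func linkSequence(words map[int]*Word, ids []int) {
	for i := 0; i+1 < len(ids); i++ {
		left, right := words[ids[i]], words[ids[i+1]]
		left.ConnectionRight[right.ID]++
		right.ConnectionLeft[left.ID]++
	}
}

func main() {
	words := sampleWords()
	// "go is fun go is fast" expressed as word IDs.
	linkSequence(words, []int{0, 1, 2, 0, 1, 3})
	fmt.Println("right of go:", words[0].ConnectionRight)
	fmt.Println("left of is: ", words[1].ConnectionLeft)
}
```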
