processor

package
v0.4.0
Warning

This package is not in the latest version of its module.

Published: May 24, 2018 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation


Constants

This section is empty.

Variables

This section is empty.

Functions

func Filter

func Filter(strs []string) []string

Filter ...

func RegisterStopWords

func RegisterStopWords(words []string)

RegisterStopWords ...
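
The documentation does not say what Filter removes or how RegisterStopWords stores its input. As a minimal sketch only, assuming Filter drops case-insensitive matches of previously registered stop words, the pair could behave roughly like this. The lowercase names registerStopWords, filter and stopWords are hypothetical and are not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a hypothetical package-level registry (an assumption;
// the real package may store stop words differently).
var stopWords = map[string]struct{}{}

// registerStopWords adds words to the registry in lower case.
func registerStopWords(words []string) {
	for _, w := range words {
		stopWords[strings.ToLower(w)] = struct{}{}
	}
}

// filter returns the input strings that are not registered stop words.
// Matching is case-insensitive; the original casing is preserved.
func filter(strs []string) []string {
	out := make([]string, 0, len(strs))
	for _, s := range strs {
		if _, stop := stopWords[strings.ToLower(s)]; !stop {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	registerStopWords([]string{"a", "about", "the"})
	fmt.Println(filter([]string{"A", "story", "about", "the", "foo"}))
}
```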

Types

type Tag

type Tag struct {
	Value string
	Score float64
	Count int
}

Tag holds some arbitrary string value (e.g. a word) along with some extra data about it.

func ParseHTML

func ParseHTML(lines []string, verbose bool) []*Tag

ParseHTML receives lines of raw HTML from the Web and returns tags prioritised by the importance of the HTML elements that wrap each sentence.

Example:

<h1>A story about foo
<p> Foo was a good guy but, had a quite poor time management skills,
therefore he had issues with shipping all his tasks. Though foo had heaps
of other amazing skills, which gained him a fortune.

Result:

foo: 5 + 1 = 6, story: 5, management: 1 + 1 = 2, skills: 1 + 1 = 2.
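
The arithmetic in the example suggests that a word inside h1 earns 5 points and a word inside p earns 1 point. The following is a rough sketch of that idea, not the package's implementation: the weights are inferred from the example, and the element type, the tagWeights table and scoreWords are hypothetical names. It adds the weight for every occurrence of a word, so its totals need not match the package's own counting.

```go
package main

import (
	"fmt"
	"strings"
)

// element is a hypothetical, already-parsed HTML element:
// the tag name and the text it wraps.
type element struct {
	tag  string
	text string
}

// tagWeights is an assumed weighting inferred from the example
// (h1 = 5, p = 1); the real package may use other values.
var tagWeights = map[string]float64{"h1": 5, "p": 1}

// scoreWords adds the weight of the enclosing tag to the score of
// every word occurrence. Unknown tags contribute a weight of 0.
func scoreWords(elems []element) map[string]float64 {
	scores := make(map[string]float64)
	for _, e := range elems {
		w := tagWeights[e.tag]
		for _, word := range strings.Fields(strings.ToLower(e.text)) {
			scores[word] += w
		}
	}
	return scores
}

func main() {
	scores := scoreWords([]element{
		{tag: "h1", text: "A story about foo"},
		{tag: "p", text: "Foo had amazing skills"},
	})
	fmt.Println(scores["foo"], scores["story"], scores["skills"])
}
```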

func ParseText

func ParseText(tokens []string) []*Tag

ParseText ...

func Run

func Run(items []*Tag, limit int) []*Tag

Run first sorts the given list by score, then iterates over it and de-dupes items by merging inflections. It then sorts the de-duped list by score in descending order and returns only the requested number of items (limit), or everything if the result is smaller than limit.
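
The sort, merge, re-sort and truncate sequence can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: topTags and stem are hypothetical names, and stem merely strips a trailing "s" as a naive stand-in for real inflection handling, which the documentation does not describe.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Tag mirrors the struct documented above.
type Tag struct {
	Value string
	Score float64
	Count int
}

// stem is a naive, hypothetical stand-in for inflection handling:
// it lower-cases the word and strips one trailing "s".
func stem(s string) string {
	return strings.TrimSuffix(strings.ToLower(s), "s")
}

// sortByScore orders tags by score, highest first.
func sortByScore(list []*Tag) {
	sort.SliceStable(list, func(i, j int) bool {
		return list[i].Score > list[j].Score
	})
}

// topTags sorts, merges tags that share a stem, re-sorts, and
// returns at most limit tags. The input slice and its tags are
// not modified.
func topTags(items []*Tag, limit int) []*Tag {
	sorted := append([]*Tag(nil), items...)
	sortByScore(sorted)

	seen := make(map[string]*Tag)
	var merged []*Tag
	for _, t := range sorted {
		key := stem(t.Value)
		if prev, ok := seen[key]; ok {
			// Fold the lower-ranked inflection into the first one seen.
			prev.Score += t.Score
			prev.Count += t.Count
			continue
		}
		c := *t
		seen[key] = &c
		merged = append(merged, &c)
	}

	sortByScore(merged)
	if limit >= 0 && limit < len(merged) {
		merged = merged[:limit]
	}
	return merged
}

func main() {
	tags := []*Tag{
		{Value: "skill", Score: 1, Count: 1},
		{Value: "foo", Score: 6, Count: 2},
		{Value: "skills", Score: 2, Count: 2},
		{Value: "story", Score: 5, Count: 1},
	}
	for _, t := range topTags(tags, 2) {
		fmt.Println(t.Value, t.Score, t.Count)
	}
}
```

Merging into the highest-scoring variant keeps the most prominent spelling as the surviving Value; a real implementation might prefer the base form instead.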


func (*Tag) String

func (t *Tag) String() string
