irtools

v0.0.0-...-ab36419
Published: Aug 1, 2014 License: MIT Imports: 3 Imported by: 0

README

ir-tools

A set of Go packages for information retrieval.

Requires

github.com/dchest/stemmer/porter2

Documentation

Functions

func Combine

func Combine(dest, source *map[string]int)

Combine takes two term-frequency maps and adds the counts in the source map to the destination map.
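
As a rough illustration of the merge described above, here is a minimal sketch. The lowercase `combine` is a hypothetical stand-in written for this example, not the package's source; it only assumes that counts for the same term are summed.

```go
package main

import "fmt"

// combine is a hypothetical stand-in for Combine: it adds every count in
// source to the matching entry in dest. The pointer parameters mirror the
// documented signature.
func combine(dest, source *map[string]int) {
	for term, n := range *source {
		(*dest)[term] += n // a missing term starts from the zero value
	}
}

func main() {
	dest := map[string]int{"go": 2, "map": 1}
	source := map[string]int{"go": 1, "index": 4}
	combine(&dest, &source)
	fmt.Println(dest["go"], dest["map"], dest["index"]) // prints "3 1 4"
}
```

In this sketch the destination map must already be initialized, since writing to a nil map panics in Go.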

func Count

func Count(terms []string) (counted map[string]int)

Count takes a list of terms and produces a frequency map for the list.
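
The counting step might look like the sketch below. The lowercase `count` is a hypothetical re-implementation for illustration; it assumes terms are compared as exact strings, with no stemming or case folding.

```go
package main

import "fmt"

// count is a hypothetical stand-in for Count: it maps each distinct term
// to the number of times it appears in terms.
func count(terms []string) map[string]int {
	counted := make(map[string]int, len(terms))
	for _, t := range terms {
		counted[t]++
	}
	return counted
}

func main() {
	freq := count([]string{"to", "be", "or", "not", "to", "be"})
	fmt.Println(freq["to"], freq["be"], freq["or"], freq["not"]) // prints "2 2 1 1"
}
```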

func Difference

func Difference(m1, m2 map[string]int) (count int, difference []string)

Difference counts the terms in m2 that are not in m1 and returns that count along with the terms.
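
One plausible reading of this behavior is sketched below. The lowercase `difference` is hypothetical; it assumes that distinct terms are counted (frequencies are ignored) and sorts the result so the output is deterministic, which the package itself may not do.

```go
package main

import (
	"fmt"
	"sort"
)

// difference is a hypothetical stand-in for Difference: it collects the
// keys of m2 that are absent from m1 and returns how many there are.
func difference(m1, m2 map[string]int) (int, []string) {
	var missing []string
	for term := range m2 {
		if _, ok := m1[term]; !ok {
			missing = append(missing, term)
		}
	}
	sort.Strings(missing) // map iteration order is random; sort for stable output
	return len(missing), missing
}

func main() {
	m1 := map[string]int{"cat": 1, "dog": 2}
	m2 := map[string]int{"dog": 1, "emu": 3, "ant": 1}
	n, terms := difference(m1, m2)
	fmt.Println(n, terms) // prints "2 [ant emu]"
}
```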

func ReadLines

func ReadLines(path string) ([]string, error)

ReadLines reads a whole file into memory and returns a slice of its lines.

func URLFilter

func URLFilter(text string) (clean string, urls []string)

URLFilter removes URLs from a string and returns the cleaned string along with any removed URLs.
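
The sketch below shows one way such a filter could work. The lowercase `urlFilter`, the `urlPattern` regular expression, and the whitespace cleanup are all assumptions made for this example; the package's actual matching rules are not documented here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlPattern is an assumed, deliberately simple pattern: "http://" or
// "https://" followed by any run of non-space characters.
var urlPattern = regexp.MustCompile(`https?://\S+`)

// urlFilter is a hypothetical stand-in for URLFilter: it returns text with
// URLs removed, plus the URLs in the order they appeared.
func urlFilter(text string) (clean string, urls []string) {
	urls = urlPattern.FindAllString(text, -1)
	clean = urlPattern.ReplaceAllString(text, "")
	// Collapse the gaps left behind by removed URLs (an assumption).
	clean = strings.Join(strings.Fields(clean), " ")
	return clean, urls
}

func main() {
	clean, urls := urlFilter("read http://example.com/a and https://example.org now")
	fmt.Printf("%q %q\n", clean, urls)
	// prints "read and now" ["http://example.com/a" "https://example.org"]
}
```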

Directories

Path Synopsis
URL Filter removes URLs from the input tokens and returns the remaining tokens.
A very simple sorted map.
TweetTokenizer is a tokenizer designed explicitly for parsing Tweets and other Twitter content.
LowercaseFilter is badly named: rather than filtering out lowercase characters, as the name would imply, it converts text to lowercase.
