paicehusk

package module
v0.0.0-...-d62367a Latest
Published: Jun 19, 2013 License: BSD-2-Clause Imports: 3 Imported by: 7

README

##Golang Implementation of the Paice/Husk stemming algorithm

This project was created for the QUT course INB344. Details on the algorithm can be found here. This implementation is primarily based on the ANSI C implementation by Andy Stark. Effort has been put into the correctness of the algorithm, but verification is hampered because many of the existing implementations give differing results. Any comments, assistance, or pull requests are welcome.

##TODO

  • Benchmarks

##Demo

A demo App Engine project utilizing this package exists here.

Documentation

Overview

Package paicehusk provides an implementation of the Paice/Husk stemmer, along with a default ruleset for the English language.

Constants

This section is empty.

Variables

var DefaultRules = NewRuleTable(strings.Split(defaultRules, "\n"))

DefaultRules is a default ruleset for the English language.

Functions

func ParseRule

func ParseRule(s string) (r *rule, ok bool)

ParseRule parses a rule of the form: |suffix|intact flag|number to strip|append|continue flag. The suffix is written in reverse.

For example, the rule ht*2. means: if the stem is still intact, strip the suffix th and make no further attempts to stem the word.

The rule nois4j> means: strip the suffix sion, append a j, and check for any further rules to follow.
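The rule format described above can be sketched as a small stand-alone parser. This is only an illustration, not the package's code: the type sketchRule and the functions parseSketch and reverseASCII are hypothetical names, and the sketch assumes lowercase ASCII suffixes, `*` as the intact flag, a single-digit strip count, and `>` or `.` as the continue flag, as suggested by the two example rules.

```go
package main

import "fmt"

// sketchRule is a hypothetical stand-in for the package's unexported rule type.
type sketchRule struct {
	suffix string // suffix in normal reading order, e.g. "sion"
	intact bool   // rule applies only while the stem is still intact
	strip  int    // number of trailing characters to remove
	add    string // string appended after stripping (may be empty)
	cont   bool   // true for '>', false for '.'
}

func isLower(b byte) bool { return b >= 'a' && b <= 'z' }

// reverseASCII reverses a string byte by byte (ASCII input assumed).
func reverseASCII(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// parseSketch parses rules such as "ht*2." or "nois4j>".
func parseSketch(s string) (sketchRule, bool) {
	var r sketchRule
	i := 0
	// Reversed suffix: one or more leading lowercase letters.
	for i < len(s) && isLower(s[i]) {
		i++
	}
	if i == 0 {
		return r, false
	}
	r.suffix = reverseASCII(s[:i])
	// Optional intact flag.
	if i < len(s) && s[i] == '*' {
		r.intact = true
		i++
	}
	// Number of characters to strip (a single digit is assumed).
	if i >= len(s) || s[i] < '0' || s[i] > '9' {
		return r, false
	}
	r.strip = int(s[i] - '0')
	i++
	// Optional letters to append.
	j := i
	for j < len(s) && isLower(s[j]) {
		j++
	}
	r.add = s[i:j]
	// Exactly one terminating continue flag must remain.
	if j != len(s)-1 {
		return r, false
	}
	switch s[j] {
	case '>':
		r.cont = true
	case '.':
		r.cont = false
	default:
		return r, false
	}
	return r, true
}

func main() {
	for _, raw := range []string{"ht*2.", "nois4j>"} {
		r, ok := parseSketch(raw)
		fmt.Printf("%q -> %+v (ok=%v)\n", raw, r, ok)
	}
}
```

The sketch returns a value rather than a pointer to keep the example short; the real ParseRule returns a *rule.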

func ValidRule

func ValidRule(s string) (rule string, ok bool)

ValidRule validates a rule.

Types

type RuleTable

type RuleTable struct {
	Table map[string][]*rule
}

RuleTable stores rules keyed by the final letter of the suffix they act on, allowing for easy lookup.
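The lookup idea can be illustrated with a minimal sketch. It is not the package's implementation: buildSketchTable and candidates are hypothetical helpers, the table holds raw rule strings instead of parsed *rule values, and the rule n1> is made up purely to show two rules sharing a key. Because rule strings store the suffix reversed, the first character of a rule string is the final letter of its suffix.

```go
package main

import "fmt"

// buildSketchTable groups raw rule strings by the final letter of the
// suffix they act on. Rule strings store the suffix reversed, so that
// letter is the first byte of the rule string.
func buildSketchTable(rules []string) map[string][]string {
	table := make(map[string][]string)
	for _, r := range rules {
		if r == "" {
			continue // skip blank lines in a rule list
		}
		key := r[:1]
		table[key] = append(table[key], r)
	}
	return table
}

// candidates returns the rules worth trying for word: those filed
// under the word's last letter.
func candidates(table map[string][]string, word string) []string {
	if word == "" {
		return nil
	}
	return table[word[len(word)-1:]]
}

func main() {
	// "ht*2." and "nois4j>" come from the ParseRule documentation;
	// "n1>" is an illustrative rule invented for this sketch.
	table := buildSketchTable([]string{"ht*2.", "nois4j>", "n1>"})
	fmt.Println(candidates(table, "provision"))
	fmt.Println(candidates(table, "width"))
	fmt.Println(candidates(table, "cat"))
}
```

Only the rules filed under a word's last letter need to be examined, which keeps each lookup small regardless of the total number of rules.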

func NewRuleTable

func NewRuleTable(f []string) (table *RuleTable)

NewRuleTable returns a new RuleTable instance built from the rule strings in f.

func (*RuleTable) Stem

func (r *RuleTable) Stem(word string) string

Stem stems the string word using the rules in the RuleTable r, following the algorithm described at: http://www.comp.lancs.ac.uk/computing/research/stemming/Links/paice.htm
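A heavily simplified sketch of the rule-application loop may help when reading Stem. It is not the package's code and deliberately omits the acceptability checks of the full algorithm (the conditions on the length and vowel content of the remaining stem). The names stepRule, exampleTable, and stemSketch are hypothetical, and the table contains only the two rules from the ParseRule documentation, already parsed by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// stepRule is a hypothetical, already-parsed rule.
type stepRule struct {
	suffix string // suffix in normal reading order
	intact bool   // apply only if no rule has been applied yet
	strip  int    // number of trailing characters to remove
	add    string // string appended after stripping
	cont   bool   // keep stemming after this rule fires
}

// exampleTable holds the two documented example rules, keyed by the
// final letter of their suffix.
func exampleTable() map[byte][]stepRule {
	return map[byte][]stepRule{
		'h': {{suffix: "th", intact: true, strip: 2}},               // ht*2.
		'n': {{suffix: "sion", strip: 4, add: "j", cont: true}},     // nois4j>
	}
}

// stemSketch repeatedly applies the first matching rule for the word's
// last letter. It stops when no rule matches or when the rule that
// fired does not ask to continue. Acceptability checks are omitted.
func stemSketch(word string, table map[byte][]stepRule) string {
	intact := true
	for word != "" {
		applied, cont := false, false
		for _, r := range table[word[len(word)-1]] {
			if r.intact && !intact {
				continue // rule is restricted to untouched words
			}
			if !strings.HasSuffix(word, r.suffix) || r.strip > len(word) {
				continue
			}
			word = word[:len(word)-r.strip] + r.add
			intact = false
			applied, cont = true, r.cont
			break
		}
		if !applied || !cont {
			break
		}
	}
	return word
}

func main() {
	table := exampleTable()
	fmt.Println(stemSketch("width", table))     // wid
	fmt.Println(stemSketch("provision", table)) // provij
}
```

With only these two rules, "provision" becomes "provij" and the loop then stops because no rule is filed under j; a fuller ruleset would be expected to carry on from there.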
