russian

package
v0.10.0 Latest
Published: Aug 13, 2024 License: MIT Imports: 4 Imported by: 13

README

Snowball Russian

This package implements the Russian language Snowball stemmer.

Russian overview

Russian has 33 letters: 10 vowels, 21 consonants (counting "й") and 2 unpronounced signs. In print, most capital letters have the same shape as their lowercase forms; the cursive capital and lowercase forms differ.

Implementation

The Russian language stemmer comprises preprocessing followed by a number of steps. Each of these is defined in a separate file in this package. All of the steps operate on a SnowballWord from the snowballword package and modify the word in place.
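The in-place pipeline described above can be sketched roughly as follows. This is only an illustration: the `word` type is a hypothetical stand-in for SnowballWord, and `lowercase` and `dropSoftSign` are made-up steps, not the steps this package actually defines.

```go
package main

import (
	"fmt"
	"strings"
)

// word is a hypothetical stand-in for SnowballWord: it holds the runes of
// the word so that each step can edit them in place.
type word struct {
	RS []rune
}

// step is any function that mutates the word it is given.
type step func(w *word)

// lowercase is an illustrative preprocessing step (an assumption, not the
// package's real preprocessing).
func lowercase(w *word) {
	w.RS = []rune(strings.ToLower(string(w.RS)))
}

// dropSoftSign is an illustrative step that removes a trailing "ь".
func dropSoftSign(w *word) {
	if n := len(w.RS); n > 0 && w.RS[n-1] == 'ь' {
		w.RS = w.RS[:n-1]
	}
}

// run builds one word value and passes the same pointer to every step,
// so each step sees the changes made by the previous one.
func run(s string, steps ...step) string {
	w := &word{RS: []rune(s)}
	for _, st := range steps {
		st(w)
	}
	return string(w.RS)
}

func main() {
	fmt.Println(run("Конь", lowercase, dropSoftSign)) // prints "кон"
}
```

Passing a pointer through the steps avoids allocating a new string at every stage, which is presumably why the steps are written to modify the word in place.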

Caveats

The example vocabulary for the original Russian Snowball stemmer contains the word "злейший", which means "worst" in English. This word contains the adjectival suffix "ий" preceded by the superlative suffix "ейш". The output for the example vocabulary indicates that this word should be stemmed to "злейш". However, this implementation stems the word to "зл". The Python NLTK implementation also stems "злейший" to "зл". It is unclear to me how the original Snowball implementation could produce "злейш", so I removed that word from the tests.
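The two suffix removals mentioned above can be traced by hand with a naive sketch. It strips the suffixes unconditionally and ignores the region restrictions (such as RV and R2) that the Snowball definition places on suffix removal, so it is not this package's algorithm; `stripSuffixes` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// stripSuffixes removes each suffix in order if the word ends with it.
// It is a deliberately naive illustration with no region checks.
func stripSuffixes(word string, suffixes ...string) string {
	for _, s := range suffixes {
		word = strings.TrimSuffix(word, s)
	}
	return word
}

func main() {
	// Remove the adjectival suffix first.
	afterAdjectival := stripSuffixes("злейший", "ий")
	// Then remove the superlative suffix from what is left.
	afterSuperlative := stripSuffixes(afterAdjectival, "ейш")

	fmt.Println(afterAdjectival)  // prints "злейш"
	fmt.Println(afterSuperlative) // prints "зл"
}
```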

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsStopWord added in v0.9.0

func IsStopWord(word string) bool

Return `true` if the input `word` is a Russian stop word.
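A stop-word check of this kind can be sketched as a set lookup. The lowercase `isStopWord` and the four-word `stopWords` sample below are assumptions for illustration; the package's real list and its internal representation are not shown on this page.

```go
package main

import "fmt"

// stopWords is a tiny, hypothetical sample of common Russian function
// words; it is not the list used by this package.
var stopWords = map[string]struct{}{
	"и":  {},
	"в":  {},
	"не": {},
	"на": {},
}

// isStopWord reports whether word is in the sample set above.
func isStopWord(word string) bool {
	_, ok := stopWords[word]
	return ok
}

func main() {
	fmt.Println(isStopWord("и"))     // prints "true"
	fmt.Println(isStopWord("книга")) // prints "false"
}
```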

func Stem

func Stem(word string, stemStopwWords bool) string

Stem a Russian word. Together with IsStopWord, this is one of the two exported functions in this package.
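A caller might apply a stemmer with this signature to every word of a text, roughly as sketched below. To keep the sketch self-contained, the stemmer is passed in as a function value; in real use one would pass `russian.Stem`. The names `stemFunc`, `stemAll` and `fakeStem` are hypothetical, and `fakeStem` only lowercases its input rather than stemming it.

```go
package main

import (
	"fmt"
	"strings"
)

// stemFunc has the same shape as the Stem signature documented above.
type stemFunc func(word string, stemStopWords bool) string

// stemAll splits text on whitespace and applies stem to each token.
func stemAll(text string, stem stemFunc, stemStopWords bool) []string {
	fields := strings.Fields(text)
	out := make([]string, 0, len(fields))
	for _, f := range fields {
		out = append(out, stem(f, stemStopWords))
	}
	return out
}

// fakeStem is a stand-in for a real stemmer: it lowercases the word and
// ignores the stop-word flag.
func fakeStem(word string, _ bool) string {
	return strings.ToLower(word)
}

func main() {
	fmt.Println(stemAll("Читать Книги", fakeStem, true)) // prints "[читать книги]"
}
```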

Types

This section is empty.
