hungarian

package
v0.0.0-...-76e5571
Warning

This package is not in the latest version of its module.
Published: Feb 14, 2024 License: MIT Imports: 5 Imported by: 0

README

Snowball Hungarian

This package implements the Snowball stemming algorithm for the Hungarian language, by Anna Tordai (atordai@science.uval.nl).

Implementation

The Hungarian language stemmer comprises preprocessing, a number of steps, and postprocessing. Each of these is defined in a separate file in this package. All of the steps operate on a SnowballWord from the snowballword package and modify the word in place.
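The in-place pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the package's code: the `word` type is a hypothetical stand-in for `snowballword.SnowballWord`, and `preprocess`, `stripInessive`, `postprocess`, and `run` are invented names. The single toy step only strips the inessive case endings "ban" and "ben"; the real algorithm has more steps and more conditions.

```go
package main

import (
	"fmt"
	"strings"
)

// word is a hypothetical stand-in for snowballword.SnowballWord.
// The real type is defined in the snowballword package and differs from this.
type word struct {
	RS []rune // runes of the word, modified in place by every stage
}

// step is one stage of the pipeline; it mutates the word it receives.
type step func(w *word)

// preprocess is an illustrative stage that lowercases the input.
func preprocess(w *word) {
	w.RS = []rune(strings.ToLower(string(w.RS)))
}

// stripInessive is a toy step (an assumption, not the real algorithm):
// it removes a trailing "ban" or "ben" case ending.
func stripInessive(w *word) {
	s := string(w.RS)
	for _, suf := range []string{"ban", "ben"} {
		if strings.HasSuffix(s, suf) {
			w.RS = []rune(strings.TrimSuffix(s, suf))
			return
		}
	}
}

// postprocess is a placeholder stage that leaves the word unchanged.
func postprocess(w *word) {}

// run applies the stages in order to a single word and returns the result.
func run(s string, steps ...step) string {
	w := &word{RS: []rune(s)}
	for _, st := range steps {
		st(w)
	}
	return string(w.RS)
}

func main() {
	fmt.Println(run("Házban", preprocess, stripInessive, postprocess)) // prints "ház"
}
```

Passing a pointer to one shared value is what lets each stage see the edits made by the stage before it.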

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsStopWord

func IsStopWord(word string) bool

IsStopWord returns true if the word is a stop word.

Hungarian stop word list prepared by Anna Tordai

https://snowballstem.org/algorithms/hungarian/stop.txt
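A stop word check of this kind is usually a set lookup. The sketch below assumes a map-based set and uses a handful of common Hungarian function words chosen for illustration. `sampleStopWords` and `isStopWordSketch` are hypothetical names, and the sample is not the package's actual list, which comes from the URL above.

```go
package main

import "fmt"

// sampleStopWords is a tiny illustrative set, NOT the package's real list.
var sampleStopWords = map[string]struct{}{
	"a":    {},
	"az":   {},
	"és":   {},
	"hogy": {},
	"nem":  {},
}

// isStopWordSketch reports whether word is in the sample set.
// The comparison is exact, so callers should lowercase the word first.
func isStopWordSketch(word string) bool {
	_, ok := sampleStopWords[word]
	return ok
}

func main() {
	for _, w := range []string{"és", "ház"} {
		fmt.Printf("%q stop word: %v\n", w, isStopWordSketch(w))
	}
}
```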

func Stem

func Stem(word string, stemStopwWords bool) string

Stem returns the stem of a Hungarian word. It is the main entry point of this package; IsStopWord and StemSentence are also exported.

This stemming algorithm removes the inflectional suffixes of nouns. Nouns are inflected for case, person/possession and number.

Letters in Hungarian include the following accented forms:

á   é   í   ó   ö   ő   ú   ü   ű

The following letters are vowels:

a   á   e   é   i   í   o   ó   ö   ő   u   ú   ü   ű
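The vowel set above maps directly onto a rune test. The following is a small sketch; `isVowel` and `firstVowel` are illustrative names and are not taken from the package.

```go
package main

import "fmt"

// isVowel reports whether r is one of the Hungarian vowels listed above.
// Only lowercase letters are handled.
func isVowel(r rune) bool {
	switch r {
	case 'a', 'á', 'e', 'é', 'i', 'í', 'o', 'ó', 'ö', 'ő', 'u', 'ú', 'ü', 'ű':
		return true
	}
	return false
}

// firstVowel returns the rune index of the first vowel in word, or -1 if none.
func firstVowel(word string) int {
	for i, r := range []rune(word) {
		if isVowel(r) {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(isVowel('ő'), isVowel('k')) // prints "true false"
	fmt.Println(firstVowel("szép"))          // prints "2"
}
```

Working on runes rather than bytes matters here, because the accented vowels occupy two bytes each in UTF-8.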

The following letters are digraphs:

cs   dz   dzs   gy   ly   ny   ty   zs
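One way to use this list is to split a word into letters while treating each multi-character letter as a single unit. The sketch below assumes a greedy, longest-match-first scan over exactly the entries listed above. `multigraphs` and `splitLetters` are hypothetical names, and a purely greedy scan can misread a sequence that spans a morpheme boundary, so treat it as an approximation.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// multigraphs holds the multi-character letters from the list above,
// with "dzs" first so that it is tried before its prefix "dz".
var multigraphs = []string{"dzs", "cs", "dz", "gy", "ly", "ny", "ty", "zs"}

// splitLetters breaks word into letters, keeping each listed
// multi-character letter together as one element.
func splitLetters(word string) []string {
	var letters []string
	for len(word) > 0 {
		matched := false
		for _, m := range multigraphs {
			if strings.HasPrefix(word, m) {
				letters = append(letters, m)
				word = word[len(m):]
				matched = true
				break
			}
		}
		if !matched {
			// Consume one full UTF-8 encoded rune, e.g. "ő".
			_, size := utf8.DecodeRuneInString(word)
			letters = append(letters, word[:size])
			word = word[size:]
		}
	}
	return letters
}

func main() {
	fmt.Println(splitLetters("gyerek")) // prints "[gy e r e k]"
	fmt.Println(splitLetters("dzsem"))  // prints "[dzs e m]"
}
```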

A double consonant is defined as:

bb   cc   ccs   dd   ff   gg   ggy   jj   kk   ll   lly   mm   nn   nny   pp   rr   ss   ssz   tt   tty   vv   zz   zzs
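A stemmer typically needs to detect such an ending and reduce it to a single consonant. The sketch below shows one plausible way to do that. `endsInDoubleConsonant` and `undouble` are illustrative names, and the rule of dropping the second-to-last letter is an assumption made for this example, not a statement of what the package does.

```go
package main

import (
	"fmt"
	"strings"
)

// doubleConsonants lists the endings defined above.
var doubleConsonants = []string{
	"bb", "cc", "ccs", "dd", "ff", "gg", "ggy", "jj", "kk", "ll", "lly",
	"mm", "nn", "nny", "pp", "rr", "ss", "ssz", "tt", "tty", "vv", "zz", "zzs",
}

// endsInDoubleConsonant reports whether word ends in one of the listed forms.
func endsInDoubleConsonant(word string) bool {
	for _, d := range doubleConsonants {
		if strings.HasSuffix(word, d) {
			return true
		}
	}
	return false
}

// undouble removes the second-to-last letter of a word that ends in a
// double consonant, so "ll" becomes "l" and "ssz" becomes "sz".
// Byte slicing is safe here because every listed ending is plain ASCII.
func undouble(word string) string {
	if !endsInDoubleConsonant(word) {
		return word
	}
	n := len(word)
	return word[:n-2] + word[n-1:]
}

func main() {
	fmt.Println(undouble("toll"))  // prints "tol"
	fmt.Println(undouble("hossz")) // prints "hosz"
	fmt.Println(undouble("ház"))   // prints "ház"
}
```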

func StemSentence

func StemSentence(pairs [][2]string, s string) [][2]string

Types

This section is empty.
