stopwordsiso

Version: v0.1.5
Published: Apr 5, 2024 License: MIT Imports: 5 Imported by: 1

README

stopwords-iso

stopwords-iso is a Go package that removes stop words from text.

Example

You can remove stopwords for a single language:

package main

import (
	"fmt"
	"log"

	sw "github.com/toadharvard/stopwords-iso"
)

func main() {
	// Load the language-to-stopwords mapping; stop if it cannot be read.
	stopwordsMapping, err := sw.NewStopwordsMapping()
	if err != nil {
		log.Fatal(err)
	}

	originalString := "This is a sample string with some stopwords."
	language := "en"

	// Remove only the English stopwords.
	clearedString := stopwordsMapping.ClearStringByLang(originalString, language)
	fmt.Printf("Cleared string: %s\n", clearedString)
}

Or remove the stopwords of all supported languages at once:

package main

import (
	"fmt"
	"log"

	sw "github.com/toadharvard/stopwords-iso"
)

func main() {
	stopwordsMapping, err := sw.NewStopwordsMapping()
	if err != nil {
		log.Fatal(err)
	}

	originalString := "the book on the table y la pluma es de ella und da Licht ist aus et la porte est ouverte и я it's"

	// Remove stopwords regardless of which language they belong to.
	clearedString := stopwordsMapping.ClearString(originalString)
	fmt.Printf("Cleared string: %s\n", clearedString)
}

Supported languages

This package uses the stopwords-iso word lists. All languages supported by stopwords-iso are listed here: https://github.com/stopwords-iso/stopwords-iso?tab=readme-ov-file#credits

License

Distributed under the MIT license. See LICENSE for more information.

Documentation

Types

type ISOCode639_1

type ISOCode639_1 = string

type StopwordsMapping

type StopwordsMapping map[string][]string
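
Because StopwordsMapping is a plain map from language code to a slice of words, it can be inspected with ordinary map operations. The sketch below lists the language codes in sorted order. It uses a local stand-in type and made-up sample data rather than the package itself, so the names stopwordsMapping and languages are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// stopwordsMapping is a local stand-in with the same underlying type as
// StopwordsMapping; it is not the library's type.
type stopwordsMapping map[string][]string

// languages returns the language codes present in the mapping, sorted
// so that the result does not depend on map iteration order.
func languages(m stopwordsMapping) []string {
	codes := make([]string, 0, len(m))
	for code := range m {
		codes = append(codes, code)
	}
	sort.Strings(codes)
	return codes
}

func main() {
	// Hypothetical sample data, not the real stopwords-iso lists.
	m := stopwordsMapping{
		"en": {"the", "is", "a"},
		"de": {"und", "ist"},
	}
	for _, code := range languages(m) {
		fmt.Printf("%s: %d stopwords\n", code, len(m[code]))
	}
}
```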

func NewStopwordsMapping

func NewStopwordsMapping() (StopwordsMapping, error)

NewStopwordsMapping initializes a new StopwordsMapping from a JSON file.

Returns:
- StopwordsMapping: a map from language to its stopwords.
- error: a non-nil error if reading or unmarshaling the JSON file failed.
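
To illustrate what unmarshaling a JSON document into this kind of map can look like, here is a minimal sketch using encoding/json. The inline JSON, the stand-in type, and the loadMapping helper are all assumptions; the package's actual file name and loading code are not shown on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stopwordsMapping is a local stand-in for the package's map type.
type stopwordsMapping map[string][]string

// sampleJSON is hypothetical data shaped as language code -> word list.
const sampleJSON = `{"en": ["the", "is"], "es": ["la", "de"]}`

// loadMapping decodes a JSON object of string arrays into the map and
// wraps any decoding error with context.
func loadMapping(data []byte) (stopwordsMapping, error) {
	var m stopwordsMapping
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("unmarshal stopwords: %w", err)
	}
	return m, nil
}

func main() {
	m, err := loadMapping([]byte(sampleJSON))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(m["en"]), len(m["es"]))
}
```

The error return is the reason the README examples should check the second result of NewStopwordsMapping instead of discarding it.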

func (*StopwordsMapping) ClearString

func (m *StopwordsMapping) ClearString(str string) string

ClearString clears the given string by removing stopwords for all languages.

Parameters:
- str: the string to be cleared.

Returns:
- string: the cleared string.
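
Removing stopwords for every language at once is aggressive, because a word that is ordinary in one language may be a stopword in another. The following sketch shows the idea with a union set built from all lists. It is not the package's implementation: the clearAll helper, whitespace tokenization, case-insensitive matching, and the sample data are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// stopwordsMapping is a local stand-in for the package's map type.
type stopwordsMapping map[string][]string

// clearAll drops every whitespace-separated token that appears in the
// stopword list of any language, comparing in lower case.
func clearAll(m stopwordsMapping, text string) string {
	// Build the union of all per-language lists.
	union := make(map[string]struct{})
	for _, words := range m {
		for _, w := range words {
			union[strings.ToLower(w)] = struct{}{}
		}
	}
	kept := make([]string, 0)
	for _, tok := range strings.Fields(text) {
		if _, found := union[strings.ToLower(tok)]; !found {
			kept = append(kept, tok)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	// Hypothetical sample data, not the real stopwords-iso lists.
	m := stopwordsMapping{
		"en": {"the", "is", "on"},
		"de": {"die", "und"},
	}
	// "die" is removed even though the text is English.
	fmt.Println(clearAll(m, "The die is on the table"))
}
```

When the language of the input is known, ClearStringByLang avoids this kind of cross-language removal.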

func (*StopwordsMapping) ClearStringByLang

func (m *StopwordsMapping) ClearStringByLang(str string, language ISOCode639_1) string

ClearStringByLang clears the given string by removing all stopwords in the specified language.

Parameters:
- str: the string to be cleared.
- language: the language of the stopwords to be removed, in ISO 639-1 format.

Returns:
- string: the cleared string.

func (*StopwordsMapping) IsStopword added in v0.1.4

func (m *StopwordsMapping) IsStopword(word string, language ISOCode639_1) bool

IsStopword checks if the given word is a stopword for the specified language. It takes a word string and a language ISOCode639_1 as parameters and returns a boolean.
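
A per-word check like this is useful when you tokenize text yourself and only need a yes/no answer per token. The sketch below shows one way such a lookup could work over a language-to-words map. The isStopword helper, the stand-in type, and the case-insensitive comparison are assumptions; the package's own matching rules (for example, case handling) are not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// stopwordsMapping is a local stand-in for the package's map type.
type stopwordsMapping map[string][]string

// isStopword reports whether word is listed for the given language.
// An unknown language has no list, so the result is false.
func isStopword(m stopwordsMapping, word, language string) bool {
	for _, w := range m[language] {
		if strings.EqualFold(w, word) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical sample data, not the real stopwords-iso lists.
	m := stopwordsMapping{"en": {"the", "is"}, "de": {"und"}}

	for _, tok := range []string{"The", "table", "und"} {
		fmt.Printf("%q en=%t de=%t\n", tok,
			isStopword(m, tok, "en"), isStopword(m, tok, "de"))
	}
}
```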
