contractions

package
v1.0.9
Published: Jun 12, 2022 License: MIT Imports: 3 Imported by: 1

README

Contractions expander for Jargon

This package implements a TokenExpander for use with the jargon lemmatizer, expanding common English contractions into separate words.

Examples:

  • don't → do not
  • We’ve → We have
  • SHE'S → SHE IS

It handles lower, Title and UPPER case tokens, as well as straight ' and smart ’ apostrophes.

Command line

Assuming you have installed the Jargon CLI, use the -cont flag to specify this contractions expander.

echo "I would've called but he's away from his phone" | jargon -cont

In your code

package main

import (
    "fmt"
    "strings"

    "github.com/clipperhouse/jargon"
    "github.com/clipperhouse/jargon/filters/contractions"
)

var lem = jargon.NewLemmatizer(contractions.Expander)

func main() {
    text := "I would've called but he's away from his phone"
    r := strings.NewReader(text)
    tokens := jargon.Tokenize(r)

    // Pass the tokens on to the lemmatizer, which expands the contractions
    lemmas := lem.Lemmatize(tokens)
    for {
        // Read from lemmas, not tokens, to see the expanded output
        lemma := lemmas.Next()
        if lemma == nil {
            break
        }

        fmt.Print(lemma)
    }
}

Implementation

The Lookup method satisfies the jargon.TokenFilter interface.

Here is the base list of contractions. Variations (case, apostrophes) are code-generated.
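As a rough illustration of how case and apostrophe variations could be generated from a base list, here is a minimal sketch using only the standard library. The names base, titleCase and variations are hypothetical, and the three entries are a stand-in for the real list; this is not the package's actual generator.

```go
package main

import (
	"fmt"
	"strings"
)

// base is a tiny, hypothetical stand-in for the package's base list.
// Keys are lower case and use the straight apostrophe.
var base = map[string]string{
	"don't": "do not",
	"we've": "we have",
	"she's": "she is",
}

// titleCase upper-cases the first byte of s. It assumes ASCII input,
// which holds for the entries above.
func titleCase(s string) string {
	if s == "" {
		return s
	}
	return strings.ToUpper(s[:1]) + s[1:]
}

// variations builds a lookup table holding lower, Title and UPPER case
// forms of every entry, each keyed with both straight and smart apostrophes.
func variations(base map[string]string) map[string]string {
	table := make(map[string]string, len(base)*6)
	for contraction, expansion := range base {
		forms := [][2]string{
			{contraction, expansion},
			{titleCase(contraction), titleCase(expansion)},
			{strings.ToUpper(contraction), strings.ToUpper(expansion)},
		}
		for _, f := range forms {
			table[f[0]] = f[1]                               // straight '
			table[strings.ReplaceAll(f[0], "'", "’")] = f[1] // smart ’
		}
	}
	return table
}

func main() {
	table := variations(base)
	fmt.Println(table["We’ve"]) // We have
	fmt.Println(table["SHE'S"]) // SHE IS
	fmt.Println(len(table))     // 18
}
```

Precomputing every variation keeps the runtime lookup to a single exact-match map access, at the cost of a table six times the size of the base list.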

Documentation

Overview

Package contractions provides a filter to expand English contractions, such as "don't" → "do not", for use with jargon.

Functions

func Expand

func Expand(incoming *jargon.TokenStream) *jargon.TokenStream

Expand converts single-token contractions to their non-contracted versions. Examples:

  • don't → do not
  • We’ve → We have
  • SHE'S → SHE IS
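To make the single-token-to-several-words idea concrete, here is a minimal standard-library sketch that works on a plain slice of strings rather than a *jargon.TokenStream. The expansions table and the expand function are hypothetical, and strings.Fields is only a whitespace splitter standing in for jargon's tokenizer.

```go
package main

import (
	"fmt"
	"strings"
)

// expansions is a hypothetical, hand-written table for illustration only.
var expansions = map[string]string{
	"would've": "would have",
	"he's":     "he is",
	"he’s":     "he is",
}

// expand replaces each contraction token with the words of its expansion
// and passes every other token through unchanged.
func expand(tokens []string) []string {
	var out []string
	for _, t := range tokens {
		if e, ok := expansions[t]; ok {
			// One incoming token becomes several outgoing tokens.
			out = append(out, strings.Fields(e)...)
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	tokens := strings.Fields("I would've called but he's away from his phone")
	fmt.Println(strings.Join(expand(tokens), " "))
	// I would have called but he is away from his phone
}
```

The output can hold more tokens than the input, so a filter of this kind cannot simply rewrite tokens in place.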
