Documentation ¶
Functions ¶
func NewSentenceTokenizer ¶
func NewSentenceTokenizer(s *sentences.Storage) (*sentences.DefaultSentenceTokenizer, error)
NewSentenceTokenizer returns a sentence tokenizer customized for English.
Types ¶
type MultiPunctWordAnnotation ¶
type MultiPunctWordAnnotation struct {
*sentences.Storage
sentences.TokenParser
sentences.TokenGrouper
sentences.Ortho
}
MultiPunctWordAnnotation attempts to identify custom abbreviations that contain multiple periods, e.g. F.B.I.
func (*MultiPunctWordAnnotation) Annotate ¶
func (a *MultiPunctWordAnnotation) Annotate(tokens []*sentences.Token) []*sentences.Token
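The idea of spotting a multi-period abbreviation such as F.B.I. can be illustrated with a small standalone sketch. It is not the package's implementation: the pattern multiPeriodAbbr and the helper looksLikeAbbreviation are hypothetical names introduced here, and the sketch works on plain strings rather than *sentences.Token values. It assumes an abbreviation is a token made only of two or more letter-plus-period pairs.

```go
package main

import (
	"fmt"
	"regexp"
)

// multiPeriodAbbr is a hypothetical pattern (not taken from the package).
// It matches tokens built entirely from two or more letter+period pairs.
var multiPeriodAbbr = regexp.MustCompile(`^(?:\pL\.){2,}$`)

// looksLikeAbbreviation reports whether tok resembles a multi-period
// abbreviation such as "F.B.I.". A single initial like "A." or an
// ordinary word with a trailing period does not match.
func looksLikeAbbreviation(tok string) bool {
	return multiPeriodAbbr.MatchString(tok)
}

func main() {
	for _, tok := range []string{"F.B.I.", "e.g.", "A.", "ended."} {
		fmt.Printf("%-9q %v\n", tok, looksLikeAbbreviation(tok))
	}
}
```

A real annotator would likely also weigh context, such as whether the following token looks like the start of a sentence, before deciding that the period does not end one.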
type WordTokenizer ¶
type WordTokenizer struct {
sentences.DefaultWordTokenizer
}
func NewWordTokenizer ¶
func NewWordTokenizer(p sentences.PunctStrings) *WordTokenizer
func (*WordTokenizer) HasSentEndChars ¶
func (e *WordTokenizer) HasSentEndChars(t *sentences.Token) bool
HasSentEndChars reports whether the token contains sentence-ending punctuation, not counting a plain final period.
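A rough standalone sketch of this kind of check follows. It is an assumption about the intent, not the package's code: hasSentEndChars is a hypothetical helper that takes a plain string instead of a *sentences.Token and only knows ASCII closing quotes and brackets.

```go
package main

import (
	"fmt"
	"strings"
)

// hasSentEndChars is a hypothetical stand-in for the method documented above.
// It strips trailing ASCII closers (" ' ) ]) and then returns true when the
// remainder ends in "?" or "!", or ends in "." and at least one closer was
// stripped. A bare trailing period is deliberately not counted.
func hasSentEndChars(tok string) bool {
	trimmed := strings.TrimRight(tok, `"')]`)
	closed := len(trimmed) < len(tok)
	switch {
	case strings.HasSuffix(trimmed, "?"), strings.HasSuffix(trimmed, "!"):
		return true
	case strings.HasSuffix(trimmed, "."):
		return closed
	}
	return false
}

func main() {
	for _, tok := range []string{"done.", `done."`, "what?", "stop!)", "word"} {
		fmt.Printf("%-9q %v\n", tok, hasSentEndChars(tok))
	}
}
```

Leaving the bare final period out keeps this check separate from the ordinary period handling, which has to distinguish abbreviations from true sentence ends.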