Documentation
Overview
Package uax29 provides Unicode text segmentation (UAX #29) for words, sentences, and graphemes.
See the words, sentences, and graphemes packages for details and usage.
For more information on the UAX #29 spec: https://unicode.org/reports/tr29/
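To see why segmentation is needed at all, the sketch below uses only the Go standard library, not this package's API. It shows that neither byte length nor rune count matches what a reader perceives as one character; under UAX #29, a base letter followed by a combining mark forms a single grapheme cluster. The helper name `counts` is hypothetical and exists only for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts is a hypothetical helper: it reports the byte length and the
// rune (code point) count of s. Neither value is a grapheme count.
func counts(s string) (byteLen, runeCount int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// "e" followed by U+0301 COMBINING ACUTE ACCENT.
	// A reader sees one character; UAX #29 treats it as one grapheme cluster.
	s := "e\u0301"

	b, r := counts(s)
	fmt.Println("bytes:", b) // 1 byte for "e" plus 2 bytes for U+0301
	fmt.Println("runes:", r) // two code points
}
```

Splitting this string on byte or rune boundaries would separate the accent from its base letter. The graphemes, words, and sentences packages exist to find boundaries that avoid that kind of split; see their own documentation for the actual API.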
Directories
Path | Synopsis |
---|---|
(path not shown) | Package main generates tries of Unicode properties by calling go generate at the repository root. |
triegen | Package triegen implements a code generator for a trie for associating unsigned integer values with UTF-8 encoded runes. |
graphemes | Package graphemes implements Unicode grapheme cluster boundaries: https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries |
iterators | Package iterators is a support (base types) package for other packages in UAX29. |
filter | Package filter provides methods for filtering via Scanners and Segmenters. |
transformer | Package transformer provides a few handy transformers, for use with Scanner and Segmenter. |
sentences | Package sentences implements Unicode sentence boundaries: https://unicode.org/reports/tr29/#Sentence_Boundaries |
words | Package words implements Unicode word boundaries: https://unicode.org/reports/tr29/#Word_Boundaries |