Documentation
Overview
Package textproc provides text processing functions that operate on channels of runes.
Index
- Variables
- func ConvertLineTerminatorsToLF(in <-chan rune) <-chan rune
- func EmitLFLineContent(in <-chan rune) <-chan []rune
- func EmitLFParagraphContent(in <-chan rune) <-chan []rune
- func EnsureFinalLFIfNonEmpty(in <-chan rune) <-chan rune
- func Read(r io.Reader) (<-chan rune, <-chan error)
- func SortLFLinesI(in <-chan rune) <-chan rune
- func SortLFParagraphsI(in <-chan rune) <-chan rune
- func TrimLFTrailingWhiteSpace(in <-chan rune) <-chan rune
- func TrimLeadingEmptyLFLines(in <-chan rune) <-chan rune
- func TrimTrailingEmptyLFLines(in <-chan rune) <-chan rune
- type Processor
- type Tokenizer
Variables
var ErrInvalidUTF8 = errors.New("Invalid UTF-8")
ErrInvalidUTF8 is the error returned when the input is not valid UTF-8.
Functions
func ConvertLineTerminatorsToLF(in <-chan rune) <-chan rune
ConvertLineTerminatorsToLF converts "\r" and "\r\n" to "\n".
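The conversion can be pictured as a small state machine over the rune stream. The following is a minimal sketch of the described behavior, not the package's own code; `convertToLF`, `feed` and `collect` are hypothetical names introduced only for this illustration.

```go
package main

import "fmt"

// convertToLF is an illustrative stand-in for ConvertLineTerminatorsToLF.
func convertToLF(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		afterCR := false // true when the previous rune was '\r'
		for r := range in {
			switch {
			case r == '\r':
				out <- '\n'
				afterCR = true
			case r == '\n' && afterCR:
				afterCR = false // second half of "\r\n", already emitted
			default:
				out <- r
				afterCR = false
			}
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	fmt.Printf("%q\n", collect(convertToLF(feed("a\r\nb\rc\n")))) // prints "a\nb\nc\n"
}
```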
func EmitLFLineContent(in <-chan rune) <-chan []rune (added in v2.1.0)
EmitLFLineContent emits the content of each line (excluding the line terminator "\n") as a token.
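A minimal sketch of this kind of line tokenizer follows. It is not the package's implementation: `emitLines`, `feed` and `tokens` are hypothetical names, and emitting a final line that lacks a terminating "\n" is an assumption the description does not settle.

```go
package main

import "fmt"

// emitLines is an illustrative stand-in for EmitLFLineContent.
func emitLines(in <-chan rune) <-chan []rune {
	out := make(chan []rune)
	go func() {
		defer close(out)
		var line []rune
		for r := range in {
			if r == '\n' {
				out <- line // the terminator itself is not part of the token
				line = nil
				continue
			}
			line = append(line, r)
		}
		if len(line) > 0 { // assumption: an unterminated final line is emitted too
			out <- line
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// tokens drains a token channel into a slice of strings.
func tokens(in <-chan []rune) []string {
	var ts []string
	for t := range in {
		ts = append(ts, string(t))
	}
	return ts
}

func main() {
	fmt.Printf("%q\n", tokens(emitLines(feed("one\n\ntwo\n")))) // prints ["one" "" "two"]
}
```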
func EmitLFParagraphContent(in <-chan rune) <-chan []rune (added in v2.1.0)
EmitLFParagraphContent emits the content of each paragraph (excluding the line terminator of the paragraph's last line) as a token.
A paragraph consists of adjacent non-empty lines. Lines are terminated by "\n".
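To make the paragraph definition concrete, here is a rough sketch of a tokenizer that groups adjacent non-empty lines. The names `emitParagraphs`, `feed` and `tokens` are hypothetical, and the handling of input that ends without "\n" is an assumption rather than documented behavior.

```go
package main

import "fmt"

// emitParagraphs is an illustrative stand-in for EmitLFParagraphContent.
func emitParagraphs(in <-chan rune) <-chan []rune {
	out := make(chan []rune)
	go func() {
		defer close(out)
		var para []rune
		atLineStart := true
		for r := range in {
			if r != '\n' {
				para = append(para, r)
				atLineStart = false
				continue
			}
			if !atLineStart {
				para = append(para, r) // terminator of a non-empty line
			} else if len(para) > 0 {
				out <- para[:len(para)-1] // empty line: paragraph ends, drop last "\n"
				para = nil
			}
			atLineStart = true
		}
		if n := len(para); n > 0 {
			if para[n-1] == '\n' {
				para = para[:n-1]
			}
			out <- para // assumption: input may end without an empty line
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// tokens drains a token channel into a slice of strings.
func tokens(in <-chan []rune) []string {
	var ts []string
	for t := range in {
		ts = append(ts, string(t))
	}
	return ts
}

func main() {
	fmt.Printf("%q\n", tokens(emitParagraphs(feed("a\nb\n\nc\n")))) // prints ["a\nb" "c"]
}
```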
func EnsureFinalLFIfNonEmpty(in <-chan rune) <-chan rune
EnsureFinalLFIfNonEmpty ensures non-empty content ends with "\n".
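One way to get this behavior is to remember the last rune that passed through and append "\n" at the end only when needed. The sketch below uses the hypothetical names `ensureFinalLF`, `feed` and `collect`; it illustrates the idea and is not taken from the package.

```go
package main

import "fmt"

// ensureFinalLF is an illustrative stand-in for EnsureFinalLFIfNonEmpty.
func ensureFinalLF(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		seen := false // whether any rune has been forwarded
		var last rune
		for r := range in {
			out <- r
			seen, last = true, r
		}
		if seen && last != '\n' {
			out <- '\n'
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	for _, s := range []string{"abc", "abc\n", ""} {
		fmt.Printf("%q -> %q\n", s, collect(ensureFinalLF(feed(s))))
	}
}
```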
func Read(r io.Reader) (<-chan rune, <-chan error)
Read returns a rune channel and an error channel. All runes read from r as UTF-8 are sent on the rune channel, which is then closed. After that, the error from r is sent on the error channel, which is then closed.
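The ordering described above means a caller drains the rune channel first and only then receives from the error channel. The following sketch mimics that protocol with `bufio.Reader.ReadRune`; `read`, `readAll` and `errInvalidUTF8` are hypothetical stand-ins. Reporting end of input as a nil error, and reporting invalid bytes with the stand-in error, are assumptions about how such a reader might behave.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
	"unicode/utf8"
)

// errInvalidUTF8 is a local stand-in for the package's ErrInvalidUTF8.
var errInvalidUTF8 = errors.New("invalid UTF-8")

// read is an illustrative stand-in for Read.
func read(r io.Reader) (<-chan rune, <-chan error) {
	runes := make(chan rune)
	errs := make(chan error)
	go func() {
		br := bufio.NewReader(r)
		var err error
		for {
			var c rune
			var size int
			c, size, err = br.ReadRune()
			if err != nil {
				break
			}
			// ReadRune reports an invalid byte as U+FFFD with size 1.
			if c == utf8.RuneError && size == 1 {
				err = errInvalidUTF8
				break
			}
			runes <- c
		}
		if err == io.EOF {
			err = nil // assumption: a clean end of input is not an error
		}
		close(runes) // runes first ...
		errs <- err  // ... then the single error value
		close(errs)
	}()
	return runes, errs
}

// readAll drains the rune channel, then receives the error.
func readAll(s string) (string, error) {
	runes, errs := read(strings.NewReader(s))
	var got []rune
	for c := range runes {
		got = append(got, c)
	}
	return string(got), <-errs
}

func main() {
	text, err := readAll("héllo")
	fmt.Printf("%q %v\n", text, err)
	_, err = readAll("a\xffb")
	fmt.Println(err)
}
```

Because the error channel in this sketch is unbuffered, a consumer that stops reading runes early would leave the goroutine blocked; draining the rune channel fully avoids that.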
func SortLFLinesI(in <-chan rune) <-chan rune
SortLFLinesI reads the content of all lines excluding the line terminator "\n", sorts that content in case-insensitive order and adds "\n" after each item.
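Sorting requires the whole input, so a processor like this has to buffer every line before it can emit anything. Below is a minimal sketch under stated assumptions: `sortLinesI`, `feed` and `collect` are hypothetical names, case-insensitivity is approximated by comparing `strings.ToLower` forms, and ties keep their input order. The package may compare differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortLinesI is an illustrative stand-in for SortLFLinesI.
func sortLinesI(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		var lines []string
		var cur []rune
		for r := range in {
			if r == '\n' {
				lines = append(lines, string(cur))
				cur = cur[:0]
				continue
			}
			cur = append(cur, r)
		}
		if len(cur) > 0 { // assumption: an unterminated final line counts as a line
			lines = append(lines, string(cur))
		}
		// Assumption: compare lower-cased forms; equal keys keep input order.
		sort.SliceStable(lines, func(i, j int) bool {
			return strings.ToLower(lines[i]) < strings.ToLower(lines[j])
		})
		for _, l := range lines {
			for _, r := range l {
				out <- r
			}
			out <- '\n'
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	fmt.Print(collect(sortLinesI(feed("banana\nApple\ncherry\n"))))
}
```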
func SortLFParagraphsI(in <-chan rune) <-chan rune
SortLFParagraphsI reads the content of all paragraphs excluding the line terminator of a paragraph's last line, sorts that content in case-insensitive order, joins the items with "\n\n" and adds "\n" after the last item.
A paragraph consists of adjacent non-empty lines. Lines are terminated by "\n".
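The paragraph sort can be sketched as three steps: split into paragraphs, sort, then join with "\n\n" plus a final "\n". The code below is an illustration only. `sortParagraphsI`, `feed` and `collect` are hypothetical names, the lower-case comparison is an assumption, and so is emitting nothing at all for input that contains no paragraphs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortParagraphsI is an illustrative stand-in for SortLFParagraphsI.
func sortParagraphsI(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		var all []rune
		for r := range in {
			all = append(all, r)
		}
		var paras, cur []string
		flush := func() { // close the paragraph collected so far, if any
			if len(cur) > 0 {
				paras = append(paras, strings.Join(cur, "\n"))
				cur = nil
			}
		}
		for _, line := range strings.Split(string(all), "\n") {
			if line == "" {
				flush() // an empty line separates paragraphs
			} else {
				cur = append(cur, line)
			}
		}
		flush()
		// Assumption: compare lower-cased forms; equal keys keep input order.
		sort.SliceStable(paras, func(i, j int) bool {
			return strings.ToLower(paras[i]) < strings.ToLower(paras[j])
		})
		for _, r := range strings.Join(paras, "\n\n") {
			out <- r
		}
		if len(paras) > 0 {
			out <- '\n'
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	fmt.Print(collect(sortParagraphsI(feed("pear\nplum\n\nApple\n\n\nfig\n"))))
}
```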
func TrimLFTrailingWhiteSpace(in <-chan rune) <-chan rune
TrimLFTrailingWhiteSpace removes white space at the end of lines. Lines are terminated by "\n".
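Trailing white space can be removed in a streaming fashion by holding white space back until it is known whether more content follows on the same line. In this sketch, `trimTrailingSpace`, `feed` and `collect` are hypothetical names. Treating every rune for which `unicode.IsSpace` is true as white space, and dropping white space before the end of an unterminated last line, are assumptions.

```go
package main

import (
	"fmt"
	"unicode"
)

// trimTrailingSpace is an illustrative stand-in for TrimLFTrailingWhiteSpace.
func trimTrailingSpace(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		var pending []rune // white space not yet known to be trailing
		for r := range in {
			switch {
			case r == '\n':
				pending = pending[:0] // it was trailing: drop it
				out <- r
			case unicode.IsSpace(r): // assumption: this is the white space set
				pending = append(pending, r)
			default:
				for _, p := range pending { // it was interior: keep it
					out <- p
				}
				pending = pending[:0]
				out <- r
			}
		}
		// Assumption: white space still pending at end of input is dropped.
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	fmt.Printf("%q\n", collect(trimTrailingSpace(feed("a \t\nb c  \n")))) // prints "a\nb c\n"
}
```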
func TrimLeadingEmptyLFLines(in <-chan rune) <-chan rune
TrimLeadingEmptyLFLines removes empty lines at the start of the input. Lines are terminated by "\n".
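Since an empty line is just a "\n" with nothing before it, dropping leading empty lines amounts to skipping "\n" runes until the first other rune arrives. A minimal sketch, with `trimLeadingEmpty`, `feed` and `collect` as hypothetical names:

```go
package main

import "fmt"

// trimLeadingEmpty is an illustrative stand-in for TrimLeadingEmptyLFLines.
func trimLeadingEmpty(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		leading := true // still before the first non-empty line
		for r := range in {
			if leading && r == '\n' {
				continue
			}
			leading = false
			out <- r
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	fmt.Printf("%q\n", collect(trimLeadingEmpty(feed("\n\nfoo\n\nbar\n")))) // prints "foo\n\nbar\n"
}
```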
func TrimTrailingEmptyLFLines(in <-chan rune) <-chan rune
TrimTrailingEmptyLFLines removes empty lines at the end of the input. Lines are terminated by "\n".
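Trimming at the end is harder to stream than trimming at the start, because a "\n" can only be classified once the next rune (or the end of input) is seen. The sketch below counts pending "\n" runes and releases them only when more content follows. `trimTrailingEmpty`, `feed` and `collect` are hypothetical names, and emitting nothing for input made up solely of empty lines is an assumption.

```go
package main

import "fmt"

// trimTrailingEmpty is an illustrative stand-in for TrimTrailingEmptyLFLines.
func trimTrailingEmpty(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		pending := 0     // "\n" runes seen since the last other rune
		started := false // whether any non-"\n" rune has been forwarded
		for r := range in {
			if r == '\n' {
				pending++
				continue
			}
			for ; pending > 0; pending-- { // more content follows: keep them
				out <- '\n'
			}
			out <- r
			started = true
		}
		if started && pending > 0 {
			out <- '\n' // keep only the terminator of the last non-empty line
		}
	}()
	return out
}

// feed sends the runes of s on a channel and then closes it.
func feed(s string) <-chan rune {
	ch := make(chan rune)
	go func() {
		defer close(ch)
		for _, r := range s {
			ch <- r
		}
	}()
	return ch
}

// collect drains a rune channel into a string.
func collect(in <-chan rune) string {
	var rs []rune
	for r := range in {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	fmt.Printf("%q\n", collect(trimTrailingEmpty(feed("foo\n\nbar\n\n\n")))) // prints "foo\n\nbar\n"
}
```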