Documentation ¶
Overview ¶
Package utf8reader provides a utility to wrap an io.Reader that contains text in an arbitrary encoding and produce an io.Reader that outputs UTF-8 encoded text. The package automatically detects the original encoding and converts the input to UTF-8. Additionally, it can normalize the text to a specified Unicode normalization form (NFC or NFD).
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func WithNormalization ¶ added in v0.5.1
func WithNormalization(nor string) option
WithNormalization sets the Unicode normalization form, which can be "NFC" or "NFD". By default, no normalization is done. WithNormalization("NFC") is equivalent to WithTransform(norm.NFC), and WithNormalization("NFD") is equivalent to WithTransform(norm.NFD).
func WithPeekSize ¶ added in v0.3.0
func WithPeekSize(size int) option
WithPeekSize sets the number of bytes to peek. By default it peeks 4096 bytes. The peeked bytes are used to detect the encoding.
func WithTransform ¶ added in v0.5.1
func WithTransform(transformers ...transform.Transformer) option
WithTransform appends one or more transformers.
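The functions above all return an unexported option type, which suggests the functional options pattern. The sketch below shows how that pattern typically works. The settings struct, the lowercase function names, and newSettings are hypothetical stand-ins; only the 4096-byte default comes from the documentation above, and the package's real internals may differ.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for a reader's internal configuration.
type settings struct {
	peekSize int
	forms    []string
}

// option mutates settings. The real package's option type is unexported,
// so its actual definition is an assumption here.
type option func(*settings)

// withPeekSize mirrors the idea of WithPeekSize: it overrides the default.
func withPeekSize(size int) option {
	return func(s *settings) { s.peekSize = size }
}

// withForm mirrors the idea of an appending option: each call adds an entry.
func withForm(name string) option {
	return func(s *settings) { s.forms = append(s.forms, name) }
}

// newSettings starts from the defaults and applies the options in order.
func newSettings(opts ...option) settings {
	s := settings{peekSize: 4096} // documented default peek size
	for _, o := range opts {
		o(&s)
	}
	return s
}

func main() {
	s := newSettings(withPeekSize(1024), withForm("NFC"))
	fmt.Println(s.peekSize, s.forms) // prints 1024 [NFC]
}
```

With this pattern, callers that pass no options get the defaults, and new options can be added later without changing the constructor's signature.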
Types ¶
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader wraps an io.Reader to convert its input to UTF-8 encoding, if required.
func New ¶
New creates a Reader that converts the input to UTF-8. If encoding detection fails, the input is passed through unchanged and Encoding() returns an empty string.
func (*Reader) Encoding ¶
Encoding returns the encoding detected from the input, or an empty string if detection failed or an error occurred during detection.