Documentation ¶
Overview ¶
Library to encode/decode ISO-8859 byte streams to/from UTF-8.
This library is complete in terms of the ISO 8859 standard, i.e. all 15 parts are present.
An io.Reader and an io.Writer are also provided, for reading and writing ISO-8859 streams on top of an underlying io.Reader or io.Writer.
Windows-1252 conversion is also included. Windows-1252 assigns common printable characters to some positions that ISO-8859-1 leaves undefined.
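The difference between the two charsets can be illustrated with a minimal standard-library sketch. This is not the package's implementation: the names decodeLatin1, decodeCP1252Subset and cp1252Overrides are hypothetical, and the override table holds only three of the Windows-1252 assignments in the 0x80-0x9F range.

```go
package main

import "fmt"

// cp1252Overrides lists a few of the printable characters that Windows-1252
// assigns in the 0x80-0x9F range (illustrative subset, not the full table).
var cp1252Overrides = map[byte]rune{
	0x80: '\u20AC', // euro sign
	0x85: '\u2026', // horizontal ellipsis
	0x99: '\u2122', // trade mark sign
}

// decodeLatin1 converts bytes to a UTF-8 string by mapping every byte to
// the Unicode code point with the same numeric value. For simplicity the
// sketch does not reject the 0x80-0x9F range.
func decodeLatin1(latin []byte) string {
	runes := make([]rune, len(latin))
	for i, b := range latin {
		runes[i] = rune(b)
	}
	return string(runes)
}

// decodeCP1252Subset does the same, but applies the overrides above first.
func decodeCP1252Subset(in []byte) string {
	runes := make([]rune, len(in))
	for i, b := range in {
		if r, ok := cp1252Overrides[b]; ok {
			runes[i] = r
		} else {
			runes[i] = rune(b)
		}
	}
	return string(runes)
}

func main() {
	fmt.Println(decodeLatin1([]byte{0x63, 0x61, 0x66, 0xE9}))  // café
	fmt.Println(decodeCP1252Subset([]byte{0x80, 0x31, 0x30})) // €10
}
```

Because every ISO-8859-1 byte corresponds to a code point below U+0100, decoding never fails in this sketch; the other parts of the standard need a real lookup table.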
Index ¶
- Constants
- func Available() (all []string)
- func Decode(charset int, latin []byte) (utf_8 []byte, err error)
- func Encode(charset int, utf_8 []byte) (latin []byte, success int, err error)
- func NewReader(charset int, r io.Reader) io.Reader
- func NewWriter(charset int, w io.Writer) io.Writer
- type Converter
- type LatinReader
- type LatinWriter
- type UnicodeError
- type UnknownByteError
- type UnknownRuneError
Constants ¶
const (
	ISO_8859_1 = iota
	ISO_8859_2
	ISO_8859_3
	ISO_8859_4
	ISO_8859_5
	ISO_8859_6
	ISO_8859_7
	ISO_8859_8
	ISO_8859_9
	ISO_8859_10
	ISO_8859_11
	ISO_8859_13
	ISO_8859_14
	ISO_8859_15
	ISO_8859_16

	// Extended Latin-1 (Windows only, not a standard)
	Windows1252

	// Common aliases for the standards
	Latin1   = ISO_8859_1
	Latin2   = ISO_8859_2
	Latin3   = ISO_8859_3
	Latin4   = ISO_8859_4
	Cyrillic = ISO_8859_5
	Arabic   = ISO_8859_6
	Greek    = ISO_8859_7
	Hebrew   = ISO_8859_8
	Latin5   = ISO_8859_9
	Latin6   = ISO_8859_10
	Thai     = ISO_8859_11
	Latin7   = ISO_8859_13
	Latin8   = ISO_8859_14
	Latin9   = ISO_8859_15
	Latin10  = ISO_8859_16

	// The numbers (1,2) are just meant to
	// distinguish PARTIAL from ILLEGAL
	PARTIAL = UnicodeError(1)
	ILLEGAL = UnicodeError(2)
)
Constants used to fetch *Converter
Functions ¶
func Available ¶
func Available() (all []string)
Available returns the string representation of all available encodings.
Types ¶
type Converter ¶
type Converter struct {
// contains filtered or unexported fields
}
A Converter holds mappings from ISO 8859 => UTF-8, and vice versa.
func (*Converter) Decode ¶
Convert an ISO 8859 byte sequence into a UTF-8 byte sequence. If this function returns an UnknownByteError, the charset of the Converter does not have a Unicode mapping for a byte found in latin.
func (*Converter) Encode ¶
Convert a UTF-8 byte sequence into an ISO 8859 byte sequence. This function returns one of two kinds of error. A UnicodeError, either latinx.PARTIAL or latinx.ILLEGAL, means that a partial UTF-8 symbol or an illegal UTF-8 sequence was found. When a UnicodeError is returned, success < len(utf_8), and success indicates how many bytes of utf_8 were successfully converted into ISO 8859 bytes. An UnknownRuneError means that the charset of the Converter has no mapping for a rune (Unicode code point) found in the utf_8 slice.
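The error semantics described above can be sketched with the standard library alone. The following is a hedged illustration restricted to ISO 8859-1, not the package's code: encodeLatin1 and the three error variables are hypothetical stand-ins for Encode, PARTIAL, ILLEGAL and UnknownRuneError, and reporting success for the unknown-rune case as well is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Hypothetical stand-ins for the package's error values.
var (
	errPartial = errors.New("partial UTF-8 sequence")
	errIllegal = errors.New("illegal UTF-8 sequence")
	errUnknown = errors.New("rune has no mapping in charset")
)

// encodeLatin1 converts UTF-8 to ISO 8859-1. success is the number of
// input bytes consumed before the function returned.
func encodeLatin1(utf8In []byte) (latin []byte, success int, err error) {
	for success < len(utf8In) {
		rest := utf8In[success:]
		if !utf8.FullRune(rest) {
			return latin, success, errPartial // input ends mid-rune
		}
		r, size := utf8.DecodeRune(rest)
		if r == utf8.RuneError && size == 1 {
			return latin, success, errIllegal // invalid byte sequence
		}
		if r > 0xFF {
			return latin, success, errUnknown // not representable in Latin-1
		}
		latin = append(latin, byte(r))
		success += size
	}
	return latin, success, nil
}

func main() {
	out, n, err := encodeLatin1([]byte("caf\u00e9"))
	fmt.Printf("% x %d %v\n", out, n, err) // 63 61 66 e9 5 <nil>

	// The first two bytes of a three-byte rune: a partial sequence.
	_, n, err = encodeLatin1([]byte{'a', 0xE2, 0x82})
	fmt.Println(n, err) // 1 partial UTF-8 sequence
}
```

A caller that receives the partial error can keep the unconsumed tail, utf8In[success:], and retry once more input has arrived.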
type LatinReader ¶
type LatinReader struct {
// contains filtered or unexported fields
}
A LatinReader reads an ISO-8859 stream from the underlying reader, decodes it to UTF-8, and writes the result to a *bytes.Buffer, which stores the possibly larger decoded byte stream. Once the decoded data has been written to the buffer, a Read(p []byte) on the *bytes.Buffer is performed.
type LatinWriter ¶
type LatinWriter struct {
// contains filtered or unexported fields
}
A LatinWriter encodes UTF-8 byte streams into the selected ISO 8859 charset before writing them to the underlying io.Writer.
func (*LatinWriter) Write ¶
func (w *LatinWriter) Write(p []byte) (n int, err error)
The returned n represents how much of the input we were able to write. This may differ from the actual number of bytes written, since Converter.Encode converts multibyte UTF-8 into single-byte ISO 8859. For example, if you write []byte("€€€") using charset ISO_8859_15, Write returns 9, but only 3 bytes are written to the underlying io.Writer.
type UnicodeError ¶
type UnicodeError int
Error type for partial or illegal UTF-8 sequences
func (UnicodeError) Error ¶
func (e UnicodeError) Error() string
type UnknownByteError ¶
type UnknownByteError string
Error type for unknown ISO 8859 byte
func (UnknownByteError) Error ¶
func (e UnknownByteError) Error() string
type UnknownRuneError ¶
type UnknownRuneError string
Error type for unknown UTF-8 runes
func (UnknownRuneError) Error ¶
func (e UnknownRuneError) Error() string