Documentation ¶
Overview ¶
This package is a character-set conversion library for Go.
Index ¶
Constants ¶
const (
	// SUCCESS means that the character was converted with no problems.
	SUCCESS = Status(iota)

	// INVALID_CHAR means that the source contained invalid bytes, or that the character
	// could not be represented in the destination encoding.
	// The Encoder or Decoder should have output a substitute character.
	INVALID_CHAR

	// NO_ROOM means there were not enough input bytes to form a complete character,
	// or there was not enough room in the output buffer to write a complete character.
	// No bytes were written, and no internal state was changed in the Encoder or Decoder.
	NO_ROOM

	// STATE_ONLY means that bytes were read or written indicating a state transition,
	// but no actual character was processed. (Examples: byte order marks, ISO-2022 escape sequences)
	STATE_ONLY
)
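The four status values are easiest to see side by side. The following is a minimal sketch, not code from this package: it redeclares Status locally so it compiles on its own, and decodeUTF16BE is a hypothetical decoder whose signature (bytes in; rune, size, and status out) is an assumption modeled on utf8.DecodeRune. It handles only the Basic Multilingual Plane and treats every surrogate as invalid.

```go
package main

import "fmt"

// Status and its constants are redeclared here so the sketch is self-contained.
type Status int

const (
	SUCCESS = Status(iota)
	INVALID_CHAR
	NO_ROOM
	STATE_ONLY
)

// decodeUTF16BE is a hypothetical big-endian UTF-16 decoder (BMP only).
// Its signature is an assumption, not taken from the package.
func decodeUTF16BE(p []byte) (c rune, size int, status Status) {
	if len(p) < 2 {
		return 0, 0, NO_ROOM // incomplete code unit: nothing is consumed
	}
	u := rune(p[0])<<8 | rune(p[1])
	switch {
	case u == 0xFEFF:
		return 0, 2, STATE_ONLY // byte order mark: bytes consumed, no character
	case u >= 0xD800 && u <= 0xDFFF:
		return 0xFFFD, 2, INVALID_CHAR // surrogates are not paired in this sketch
	}
	return u, 2, SUCCESS
}

// statuses decodes p and records the status of every step.
func statuses(p []byte) []Status {
	var out []Status
	for len(p) > 0 {
		_, size, st := decodeUTF16BE(p)
		out = append(out, st)
		if st == NO_ROOM {
			break // more input is needed before decoding can continue
		}
		p = p[size:]
	}
	return out
}

func main() {
	// BOM, 'A', a lone surrogate, then a single dangling byte.
	input := []byte{0xFE, 0xFF, 0x00, 0x41, 0xD8, 0x00, 0x00}
	fmt.Println(statuses(input)) // prints [3 0 1 2]
}
```

The output lists STATE_ONLY, SUCCESS, INVALID_CHAR, and NO_ROOM in that order, as their integer values.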
Variables ¶
This section is empty.
Functions ¶
func RegisterCharset ¶
func RegisterCharset(cs *Charset)
RegisterCharset adds a charset to the charsetMap.
Types ¶
type Charset ¶
type Charset struct {
	// Name is the character set's canonical name.
	Name string

	// Aliases is a list of alternate names.
	Aliases []string

	// NewDecoder returns a Decoder to convert from the charset to Unicode.
	NewDecoder func() Decoder

	// NewEncoder returns an Encoder to convert from Unicode to the charset.
	NewEncoder func() Encoder
}
A Charset represents a character set that can be converted, and contains functions to create Decoders and Encoders for strings in that character set.
func GetCharset ¶
GetCharset fetches a charset by name. If the name is not found, it returns nil.
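Registration and lookup can be pictured as a map keyed by name. The sketch below is an assumption about how such a registry might work, not the package's implementation: Charset is trimmed to its two name fields, and keys are matched exactly, whereas the real package may normalize names before comparing them.

```go
package main

import "fmt"

// Charset is a trimmed copy of the documented struct; the NewDecoder and
// NewEncoder fields are omitted because the registry logic does not need them.
type Charset struct {
	Name    string
	Aliases []string
}

// charsetMap is named after the map mentioned in the RegisterCharset doc.
// Keying it by exact name and alias is an assumption of this sketch.
var charsetMap = make(map[string]*Charset)

// RegisterCharset files cs under its canonical name and every alias.
func RegisterCharset(cs *Charset) {
	charsetMap[cs.Name] = cs
	for _, alias := range cs.Aliases {
		charsetMap[alias] = cs
	}
}

// GetCharset returns the registered charset, or nil if name is unknown.
func GetCharset(name string) *Charset {
	return charsetMap[name] // a missing key yields the nil pointer
}

func main() {
	RegisterCharset(&Charset{Name: "ISO-8859-1", Aliases: []string{"latin1", "L1"}})
	fmt.Println(GetCharset("latin1").Name)            // prints ISO-8859-1
	fmt.Println(GetCharset("no-such-charset") == nil) // prints true
}
```

Because a failed lookup returns nil, callers should check the result before using it.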
type Decoder ¶
A Decoder is a function that decodes a character set, one character at a time. It works much like utf8.DecodeRune, but has an additional status return value.
func EntityDecoder ¶
func EntityDecoder() Decoder
EntityDecoder returns a Decoder that decodes HTML character entities. If there is no valid character entity at the current position, it returns INVALID_CHAR, so it needs to be combined with another Decoder via FallbackDecoder.
func FallbackDecoder ¶
FallbackDecoder combines a series of Decoders into one. If the first Decoder returns a status of INVALID_CHAR, the others are tried as well.
Note: if the text to be decoded ends with a sequence of bytes that is not a valid character in the first charset, but it could be the beginning of a valid character, the FallbackDecoder will give a status of NO_ROOM instead of falling back to the other Decoders.
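The fallback rule and the NO_ROOM caveat can be sketched as follows. This is an illustration under assumptions, not the package's code: the Decoder signature is assumed, and fallback, decodeUTF8, and decodeLatin1 are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

type Status int

const (
	SUCCESS = Status(iota)
	INVALID_CHAR
	NO_ROOM
	STATE_ONLY
)

// Decoder uses an assumed signature: bytes in; rune, size, and status out.
type Decoder func(p []byte) (c rune, size int, status Status)

// fallback tries each decoder in turn and moves on only after INVALID_CHAR.
// It expects at least one decoder.
func fallback(decoders ...Decoder) Decoder {
	return func(p []byte) (c rune, size int, status Status) {
		for _, d := range decoders {
			c, size, status = d(p)
			if status != INVALID_CHAR {
				return c, size, status
			}
		}
		return c, size, status // every decoder rejected the input
	}
}

func decodeUTF8(p []byte) (rune, int, Status) {
	if !utf8.FullRune(p) {
		return 0, 0, NO_ROOM // empty, or could still become valid with more input
	}
	c, size := utf8.DecodeRune(p)
	if c == utf8.RuneError && size == 1 {
		return c, 1, INVALID_CHAR
	}
	return c, size, SUCCESS
}

func decodeLatin1(p []byte) (rune, int, Status) {
	if len(p) == 0 {
		return 0, 0, NO_ROOM
	}
	return rune(p[0]), 1, SUCCESS
}

func main() {
	d := fallback(decodeUTF8, decodeLatin1)

	// 0xE9 followed by 'a' is invalid UTF-8, so the Latin-1 decoder takes over.
	c, size, st := d([]byte{0xE9, 'a'})
	fmt.Printf("%c %d %v\n", c, size, st == SUCCESS)

	// A lone trailing 0xE9 could start a three-byte UTF-8 sequence, so the
	// first decoder reports NO_ROOM and no fallback happens.
	_, _, st = d([]byte{0xE9})
	fmt.Println(st == NO_ROOM)
}
```

The second call shows the situation described in the note: the byte is valid Latin-1, but the combined decoder never reaches the Latin-1 decoder because the first decoder asked for more input instead of rejecting it.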
func NewDecoder ¶
NewDecoder returns a Decoder to decode the named charset. If the name is not found, it returns nil.
func (Decoder) ConvertString ¶
ConvertString converts a string from d's encoding to UTF-8.
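A whole-string conversion amounts to calling the Decoder in a loop and acting on each status. The sketch below assumes the Decoder signature shown and uses a hypothetical convertString helper. In particular, replacing a truncated final character with U+FFFD is this sketch's choice and may differ from what the package does.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

type Status int

const (
	SUCCESS = Status(iota)
	INVALID_CHAR
	NO_ROOM
	STATE_ONLY
)

// Decoder uses an assumed signature: bytes in; rune, size, and status out.
type Decoder func(p []byte) (c rune, size int, status Status)

// decodeLatin1 is a stand-in decoder for ISO-8859-1.
func decodeLatin1(p []byte) (rune, int, Status) {
	if len(p) == 0 {
		return 0, 0, NO_ROOM
	}
	return rune(p[0]), 1, SUCCESS
}

// convertString drives d over s and collects the result as UTF-8.
func convertString(d Decoder, s string) string {
	p := []byte(s)
	var b strings.Builder
	for len(p) > 0 {
		c, size, status := d(p)
		switch status {
		case STATE_ONLY:
			// Nothing to emit; only skip the consumed bytes.
		case NO_ROOM:
			// Truncated character at the end of the input: substitute and stop.
			b.WriteRune(utf8.RuneError)
			size = len(p)
		default:
			// SUCCESS, or INVALID_CHAR with the decoder's substitute in c.
			b.WriteRune(c)
		}
		p = p[size:]
	}
	return b.String()
}

func main() {
	fmt.Println(convertString(decodeLatin1, "caf\xe9")) // prints café
}
```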
type Encoder ¶
An Encoder is a function that encodes a character set, one character at a time. It works much like utf8.EncodeRune, but has an additional status return value.
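An encoder in this style might look like the sketch below. The signature (output buffer and rune in, size and status out) is an assumption modeled on utf8.EncodeRune, and encodeLatin1 and encodeAll are hypothetical names. The '?' substitute is also this sketch's choice.

```go
package main

import "fmt"

type Status int

const (
	SUCCESS = Status(iota)
	INVALID_CHAR
	NO_ROOM
	STATE_ONLY
)

// encodeLatin1 writes c into p as ISO-8859-1.
// The signature is an assumption, not taken from the package.
func encodeLatin1(p []byte, c rune) (size int, status Status) {
	if len(p) == 0 {
		return 0, NO_ROOM // no space in the output buffer: nothing is written
	}
	if c < 0 || c > 0xFF {
		p[0] = '?' // substitute for a character the charset cannot represent
		return 1, INVALID_CHAR
	}
	p[0] = byte(c)
	return 1, SUCCESS
}

// encodeAll encodes every rune of s, keeping the substitutes.
func encodeAll(s string) []byte {
	var out []byte
	buf := make([]byte, 4)
	for _, c := range s {
		n, _ := encodeLatin1(buf, c)
		out = append(out, buf[:n]...)
	}
	return out
}

func main() {
	fmt.Printf("% x\n", encodeAll("é€")) // prints e9 3f
}
```

The euro sign has no Latin-1 encoding, so it comes out as the substitute byte 0x3f.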
func NewEncoder ¶
NewEncoder returns an Encoder to encode the named charset.
func (Encoder) ConvertString ¶
ConvertString converts a string from UTF-8 to e's encoding.
type MBCSTable ¶
type MBCSTable struct {
// contains filtered or unexported fields
}
An MBCSTable holds the data to convert to and from Unicode.
func (*MBCSTable) AddCharacter ¶
AddCharacter adds a character to the table. rune is its Unicode code point, and bytes contains the bytes used to encode it in the character set.
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader implements character-set decoding for an io.Reader object.
type Status ¶
type Status int
Status is the type for the status return value from a Decoder or Encoder.