Documentation ¶
Overview ¶
The charset package implements translation between character sets. It uses Unicode as the intermediate representation. Because it can be large, the character set data is separated from the charset package. It can be embedded in the Go executable by importing the data package:
import _ "code.google.com/p/go-charset/data"
It can also be made available in a data directory (by setting CharsetDir).
Index ¶
- Variables
- func Names() []string
- func NewReader(charset string, r io.Reader) (io.Reader, error)
- func NewTranslatingReader(r io.Reader, tr Translator) io.Reader
- func NewTranslatingWriter(w io.Writer, tr Translator) io.WriteCloser
- func NewWriter(charset string, w io.Writer) (io.WriteCloser, error)
- func NormalizedName(s string) string
- func Register(factory Factory)
- func RegisterDataFile(name string, open func() (io.ReadCloser, error))
- type Charset
- type Factory
- type Translator
Examples ¶
Constants ¶
This section is empty.
Variables ¶
var CharsetDir = "/usr/local/lib/go-charset/datafiles"
CharsetDir gives the location of the default data file directory. This directory will be used for files with names that have not been registered with RegisterDataFile.
Functions ¶
func Names ¶
func Names() []string
Names returns the canonical names of all supported character sets, in alphabetical order.
func NewReader ¶
func NewReader(charset string, r io.Reader) (io.Reader, error)
NewReader returns a new Reader that translates from the named character set to UTF-8 as it reads r.
Example ¶
package main

import (
	"fmt"
	"github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/charset"
	"io/ioutil"
	"log"
	"strings"

	_ "github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/data"
)

func main() {
	r, err := charset.NewReader("latin1", strings.NewReader("\xa35 for Pepp\xe9"))
	if err != nil {
		log.Fatal(err)
	}
	result, err := ioutil.ReadAll(r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", result)
}
Output: £5 for Peppé
func NewTranslatingReader ¶
func NewTranslatingReader(r io.Reader, tr Translator) io.Reader
NewTranslatingReader returns a new Reader that translates data using the given Translator as it reads r.
func NewTranslatingWriter ¶
func NewTranslatingWriter(w io.Writer, tr Translator) io.WriteCloser
NewTranslatingWriter returns a new WriteCloser writing to w. It passes the written bytes through the given Translator.
func NewWriter ¶
func NewWriter(charset string, w io.Writer) (io.WriteCloser, error)
NewWriter returns a new WriteCloser writing to w. It converts writes of UTF-8 text into writes on w of text in the named character set. Calling Close is necessary to flush any remaining partially translated characters to the output.
Example ¶
package main

import (
	"bytes"
	"fmt"
	"github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/charset"
	"log"

	_ "github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/data"
)

func main() {
	buf := new(bytes.Buffer)
	w, err := charset.NewWriter("latin1", buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Fprintf(w, "£5 for Peppé")
	w.Close()
	fmt.Printf("%q\n", buf.Bytes())
}
Output: "\xa35 for Pepp\xe9"
func NormalizedName ¶
func NormalizedName(s string) string
NormalizedName returns s with all Roman capitals mapped to lower case, and '_' mapped to '-'.
func Register ¶
func Register(factory Factory)
Register registers a new Factory which will be consulted when NewReader or NewWriter needs a character set translator for a given name.
func RegisterDataFile ¶
func RegisterDataFile(name string, open func() (io.ReadCloser, error))
RegisterDataFile registers the existence of a given data file with the given name that may be used by a character-set converter. It is intended to be used by packages that wish to embed data in the executable binary, and should not be used normally.
Types ¶
type Charset ¶
type Charset struct {
	Name    string   // Canonical name of character set.
	Aliases []string // Known aliases.
	Desc    string   // Description.
	NoFrom  bool     // Not possible to translate from this charset.
	NoTo    bool     // Not possible to translate to this charset.
}
Charset holds information about a given character set.
type Factory ¶
type Factory interface {
	// TranslatorFrom creates a translator that will translate from the named
	// character set to UTF-8.
	TranslatorFrom(name string) (Translator, error)

	// TranslatorTo creates a translator that will translate from UTF-8 to the
	// named character set.
	TranslatorTo(name string) (Translator, error)

	// Names returns all the character set names accessible through the factory.
	Names() []string

	// Info returns information on the named character set. It returns nil if the
	// factory doesn't recognise the given name.
	Info(name string) *Charset
}
A Factory can be used to make character set translators.
type Translator ¶
type Translator interface {
	Translate(data []byte, eof bool) (n int, cdata []byte, err error)
}
Translator represents a character set converter. The Translate method translates the given data and returns the number of bytes of data consumed, a slice containing the converted data (which may be overwritten on the next call to Translate), and any conversion error. If eof is true, the data represents the final bytes of the input.
func TranslatorFrom ¶
func TranslatorFrom(charset string) (Translator, error)
TranslatorFrom returns a translator that will translate from the named character set to UTF-8.
func TranslatorTo ¶
func TranslatorTo(charset string) (Translator, error)
TranslatorTo returns a translator that will translate from UTF-8 to the named character set.