norm

package
v0.0.0-...-2286dd8 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 29, 2012 License: BSD-3-Clause Imports: 2 Imported by: 0

Documentation

Overview

Package norm contains types and functions for normalizing Unicode strings.

Index

Constants

const MaxSegmentSize = maxByteBufferSize
const Version = "6.0.0"

Version is the Unicode edition from which the tables are derived.

Variables

This section is empty.

Functions

This section is empty.

Types

type Form

type Form int

A Form denotes a canonical representation of Unicode code points. The Unicode-defined normalization and equivalence forms are:

NFC   Unicode Normalization Form C
NFD   Unicode Normalization Form D
NFKC  Unicode Normalization Form KC
NFKD  Unicode Normalization Form KD

For a Form f, this documentation uses the notation f(x) to mean the bytes or string x converted to the given form. A position n in x is called a boundary if conversion to the form can proceed independently on both sides:

f(x) == append(f(x[0:n]), f(x[n:])...)

References: http://unicode.org/reports/tr15/ and http://unicode.org/notes/tn5/.

const (
	NFC Form = iota
	NFD
	NFKC
	NFKD
)

func (Form) Append

func (f Form) Append(out []byte, src ...byte) []byte

Append returns f(append(out, src...)). The buffer out must be nil, empty, or equal to f(out).

func (Form) AppendString

func (f Form) AppendString(out []byte, src string) []byte

AppendString returns f(append(out, []byte(src)...)). The buffer out must be nil, empty, or equal to f(out).

func (Form) Bytes

func (f Form) Bytes(b []byte) []byte

Bytes returns f(b). It may return b itself if f(b) == b.

func (Form) FirstBoundary

func (f Form) FirstBoundary(b []byte) int

FirstBoundary returns the position i of the first boundary in b or -1 if b contains no boundary.

func (Form) FirstBoundaryInString

func (f Form) FirstBoundaryInString(s string) int

FirstBoundaryInString returns the position i of the first boundary in s or -1 if s contains no boundary.

func (Form) IsNormal

func (f Form) IsNormal(b []byte) bool

IsNormal returns true if b == f(b).

func (Form) IsNormalString

func (f Form) IsNormalString(s string) bool

IsNormalString returns true if s == f(s).

func (Form) LastBoundary

func (f Form) LastBoundary(b []byte) int

LastBoundary returns the position i of the last boundary in b or -1 if b contains no boundary.

func (Form) QuickSpan

func (f Form) QuickSpan(b []byte) int

QuickSpan returns a boundary n such that b[0:n] == f(b[0:n]). It is not guaranteed to return the largest such n.

func (Form) QuickSpanString

func (f Form) QuickSpanString(s string) int

QuickSpanString returns a boundary n such that s[0:n] == f(s[0:n]). It is not guaranteed to return the largest such n.

func (Form) Reader

func (f Form) Reader(r io.Reader) io.Reader

Reader returns a new reader that implements Read by reading data from r and returning f(data).

func (Form) String

func (f Form) String(s string) string

String returns f(s).

func (Form) Writer

func (f Form) Writer(w io.Writer) io.WriteCloser

Writer returns a new writer that implements Write(b) by writing f(b) to w. The returned writer may use an internal buffer to maintain state across Write calls. Calling its Close method writes any buffered data to w.

type Iter

type Iter struct {
	// contains filtered or unexported fields
}

An Iter iterates over a string or byte slice, while normalizing it to a given Form.

func (*Iter) Done

func (i *Iter) Done() bool

Done returns true if there is no more input to process.

func (*Iter) Next

func (i *Iter) Next(buf []byte) int

Next writes f(i.input[i.Pos():n]...) to buffer buf, where n is the largest boundary of i.input such that the result fits in buf. It returns the number of bytes written to buf. len(buf) should be at least MaxSegmentSize. Done must be false before calling Next.

func (*Iter) Pos

func (i *Iter) Pos() int

Pos returns the byte position at which the next call to Next will commence processing.

func (*Iter) SetInput

func (i *Iter) SetInput(f Form, src []byte)

SetInput initializes i to iterate over src after normalizing it to Form f.

func (*Iter) SetInputString

func (i *Iter) SetInputString(f Form, src string)

SetInputString initializes i to iterate over src after normalizing it to Form f.
