Documentation
Constants
const (
	// Limits of the first byte of a UTF-8 encoded Unicode code point outside the ASCII range
	UTF8FirstByteMin = 194 // \U00000080 in UTF-8 starts with byte 194
	UTF8FirstByteMax = 244 // \U0010FFFF in UTF-8 starts with byte 244
)
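The two limits can be checked with the standard library. The following is a minimal sketch using only `unicode/utf8`; the helper name `firstByte` is hypothetical and is not part of this package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstByte returns the first byte of the UTF-8 encoding of r.
// Hypothetical helper, used here only for illustration.
func firstByte(r rune) byte {
	var b [utf8.UTFMax]byte
	utf8.EncodeRune(b[:], r)
	return b[0]
}

func main() {
	// U+0080 is the first code point outside ASCII (encoded as C2 80);
	// U+10FFFF is the last valid code point (encoded as F4 8F BF BF).
	fmt.Println(firstByte(0x80), firstByte(0x10FFFF)) // prints "194 244"
}
```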
Variables
var (
	// ErrUTF8 is raised if the input has rune errors
	ErrUTF8 = errors.New("UTF-8 encoding error")

	// ErrImpure is raised if the string is not purely double UTF-8 encoded
	// Impurity criteria:
	// - some runes have values above 255
	// - some consecutive runes with value < 256 do not combine to make a valid rune
	ErrImpure = errors.New("FixDoubleUTF8: skip (impure input)")
)
Functions
func FixDoubleUTF8
FixDoubleUTF8 fixes double UTF-8 encoding in place.
The function is conservative: nothing is changed unless the input is purely double encoded.
On error, buf is returned unchanged. On success, if double UTF-8 encoding was found, the returned slice is shorter than the input.
Two errors may be returned:
- ErrUTF8: the input is not valid UTF-8
- ErrImpure: the input is valid UTF-8, but it is not purely double encoded (see the impurity criteria listed under ErrImpure above)
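The contract described above can be illustrated with a small, self-contained sketch. This is not the package's implementation: the function `fixDoubleUTF8Sketch`, its signature, and the error variables `errUTF8` and `errImpure` are assumptions made for illustration, and the sketch uses a temporary buffer before copying back into buf, which the real function may not do.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Hypothetical stand-ins for the package's ErrUTF8 and ErrImpure.
var (
	errUTF8   = errors.New("UTF-8 encoding error")
	errImpure = errors.New("impure input")
)

// fixDoubleUTF8Sketch is an illustrative reimplementation of the documented
// behavior (assumed signature). On error it returns buf untouched.
func fixDoubleUTF8Sketch(buf []byte) ([]byte, error) {
	out := make([]byte, 0, len(buf))
	for i := 0; i < len(buf); {
		r, size := utf8.DecodeRune(buf[i:])
		if r == utf8.RuneError && size == 1 {
			return buf, errUTF8 // invalid UTF-8 in the input
		}
		if r > 255 {
			return buf, errImpure // rune cannot be a single misread byte
		}
		out = append(out, byte(r))
		i += size
	}
	// The collected bytes must themselves form valid UTF-8.
	if !utf8.Valid(out) {
		return buf, errImpure
	}
	// Write the result back into buf so the returned slice shares its storage.
	n := copy(buf, out)
	return buf[:n], nil
}

func main() {
	// "café" whose UTF-8 bytes were misread as Latin-1 and encoded again.
	double := []byte("caf\u00c3\u00a9")
	fixed, err := fixDoubleUTF8Sketch(double)
	fmt.Printf("%q %v\n", fixed, err)

	// Correctly encoded text is rejected as impure and left unchanged.
	_, err = fixDoubleUTF8Sketch([]byte("café"))
	fmt.Println(err)
}
```

In this sketch, pure ASCII input passes through with the same length, since every ASCII byte decodes to itself.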
func FixHTMLEntities
Types
This section is empty.