Documentation ¶
Index ¶
- Constants
- func BOMBOCU1() []byte
- func BOMGB18030() []byte
- func BOMSCSU() []byte
- func BOMUTF1() []byte
- func BOMUTF16BE() []byte
- func BOMUTF16LE() []byte
- func BOMUTF32BE() []byte
- func BOMUTF32LE() []byte
- func BOMUTF7() []byte
- func BOMUTF8() []byte
- func BOMUTF_EBCDIC() []byte
- func BOMs() map[Encoding][]byte
- func ReadBOMOfEncoding(r io.Reader, enc Encoding) (prefix []byte, err error)
- func SkipBOM(r io.Reader, enc Encoding) (err error)
- type Encoding
- type Probe
- type Report
Constants ¶
const (
	ErrUnknownEncoding = "unknown encoding: %v"
	ErrBOMIsNotFound   = "byte order mark is not found"
	ErrDuplicateProbe  = "duplicate probe for encoding %v"
)
const (
	EncodingUTF8       = Encoding(1)  // UTF-8 Encoding.
	EncodingUTF16BE    = Encoding(2)  // UTF-16 (BE, Big Endian) Encoding.
	EncodingUTF16LE    = Encoding(3)  // UTF-16 (LE, Little Endian) Encoding.
	EncodingUTF32BE    = Encoding(4)  // UTF-32 (BE, Big Endian) Encoding.
	EncodingUTF32LE    = Encoding(5)  // UTF-32 (LE, Little Endian) Encoding.
	EncodingUTF7       = Encoding(6)  // UTF-7 Encoding.
	EncodingUTF1       = Encoding(8)  // UTF-1 Encoding.
	EncodingUTF_EBCDIC = Encoding(9)  // UTF-EBCDIC Encoding.
	EncodingSCSU       = Encoding(10) // SCSU Encoding.
	EncodingBOCU1      = Encoding(11) // BOCU-1 Encoding.
	EncodingGB18030    = Encoding(12) // GB18030 Encoding.
)
const (
	ErrArraysHaveDifferentLengths = "arrays have different lengths: %v vs %v"
	ErrNoData                     = "no data"
)
Variables ¶
This section is empty.
Functions ¶
func BOMGB18030 ¶ added in v0.9.0
func BOMGB18030() []byte
func BOMUTF16BE ¶ added in v0.9.0
func BOMUTF16BE() []byte
func BOMUTF16LE ¶ added in v0.9.0
func BOMUTF16LE() []byte
func BOMUTF32BE ¶ added in v0.9.0
func BOMUTF32BE() []byte
func BOMUTF32LE ¶ added in v0.9.0
func BOMUTF32LE() []byte
func BOMUTF_EBCDIC ¶ added in v0.9.0
func BOMUTF_EBCDIC() []byte
func ReadBOMOfEncoding ¶ added in v0.9.0
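func ReadBOMOfEncoding(r io.Reader, enc Encoding) (prefix []byte, err error)
ReadBOMOfEncoding tries to read the BOM of the specified encoding from the stream. The prefix that was read from the stream is always returned, whether or not the BOM was found.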
Types ¶
type Encoding ¶
type Encoding byte
Encoding is an encoding type, usually a text encoding of Unicode characters. Unicode on Wikipedia: https://en.wikipedia.org/wiki/Unicode
func PossibleEncodings ¶
func PossibleEncodings() []Encoding
PossibleEncodings returns a list of all possible encodings, excluding the unknown encoding.
type Probe ¶ added in v0.9.0
type Probe struct {
	// Encoding is the specified encoding which is searched in the probes.
	Encoding Encoding

	// Probability is the probability of the encoding to be used in the probes.
	Probability tsb.TSB

	// ReadBytesCount is the number of bytes which were read to get the probe.
	ReadBytesCount int
}
Probe stores the result of probing the text for a specified encoding. In other words, it stores the probability that the probed data uses the specified encoding.
func ProbeForEncoding ¶ added in v0.9.0
ProbeForEncoding tries to search the probes for the specified encoding.
func (*Probe) IsAccurate ¶ added in v0.9.0
IsAccurate tells whether the probe result is accurate. A result is accurate when its probability is an exact 'yes' or an exact 'no'.
type Report ¶ added in v0.9.0
type Report struct {
// contains filtered or unexported fields
}
Report stores the result of making probes for all possible encodings.
func GetEncodingsReport ¶ added in v0.9.0
func GetEncodingsReport(data []byte, encodingsToProbe map[Encoding]bool) (report *Report, err error)
GetEncodingsReport tries to get the report about the specified probes. Please note that some encodings have similar BOMs and this fact can make probe results inaccurate.
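The ambiguity comes from the BOM byte sequences themselves: the UTF-16LE BOM (FF FE) is a prefix of the UTF-32LE BOM (FF FE 00 00). The sketch below shows this with a plain prefix check. It is independent of this package; the boms table and the candidates function are hypothetical names for this illustration, and the string labels are not the package's Encoding constants.

```go
package main

import (
	"bytes"
	"fmt"
)

// boms lists some well-known byte order marks.
// The names are labels for this sketch only.
var boms = []struct {
	name string
	bom  []byte
}{
	{"UTF-8", []byte{0xEF, 0xBB, 0xBF}},
	{"UTF-16BE", []byte{0xFE, 0xFF}},
	{"UTF-16LE", []byte{0xFF, 0xFE}},
	{"UTF-32BE", []byte{0x00, 0x00, 0xFE, 0xFF}},
	{"UTF-32LE", []byte{0xFF, 0xFE, 0x00, 0x00}},
}

// candidates (hypothetical helper) returns the names of all encodings
// whose BOM is a prefix of data, in table order.
func candidates(data []byte) []string {
	var names []string
	for _, e := range boms {
		if bytes.HasPrefix(data, e.bom) {
			names = append(names, e.name)
		}
	}
	return names
}

func main() {
	// A UTF-32LE BOM followed by the letter 'A' encoded as UTF-32LE.
	data := []byte{0xFF, 0xFE, 0x00, 0x00, 0x41, 0x00, 0x00, 0x00}
	fmt.Println(candidates(data)) // prints [UTF-16LE UTF-32LE]
}
```

Both encodings match the same input, so a BOM check alone cannot give a definite answer here.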
func (*Report) GetAccurateProbes ¶ added in v0.9.0
GetAccurateProbes returns accurate probes of the report.
func (*Report) IsAccurate ¶ added in v0.9.0
IsAccurate tells whether all the probes of the report are accurate or not.
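The accuracy notions above can be illustrated with a three-valued result. The sketch below is an assumption-laden model, not the package's code: triState is a hypothetical stand-in for tsb.TSB (whose actual API is not shown on this page), and probe, isAccurate, accurateProbes and allAccurate are simplified hypothetical counterparts of Probe, (*Probe).IsAccurate, (*Report).GetAccurateProbes and (*Report).IsAccurate.

```go
package main

import "fmt"

// triState is a hypothetical stand-in for the tsb.TSB type.
type triState int

const (
	maybe triState = iota // the probe could not decide
	yes                   // the data definitely matches the encoding
	no                    // the data definitely does not match
)

// probe is a simplified, hypothetical version of Probe.
type probe struct {
	encoding    string
	probability triState
}

// isAccurate reports whether the probe gave a definite 'yes' or 'no'.
func (p probe) isAccurate() bool { return p.probability != maybe }

// accurateProbes keeps only the probes with a definite answer.
func accurateProbes(probes []probe) []probe {
	var out []probe
	for _, p := range probes {
		if p.isAccurate() {
			out = append(out, p)
		}
	}
	return out
}

// allAccurate reports whether every probe gave a definite answer.
func allAccurate(probes []probe) bool {
	return len(accurateProbes(probes)) == len(probes)
}

func main() {
	probes := []probe{
		{"UTF-8", no},
		{"UTF-16LE", maybe},
		{"UTF-32LE", yes},
	}
	fmt.Println(len(accurateProbes(probes)), allAccurate(probes)) // prints 2 false
}
```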