collations

package
v0.16.2
Warning

This package is not in the latest version of its module.

Published: May 9, 2023 License: Apache-2.0 Imports: 16 Imported by: 6

Documentation

Constants

const (
	CollationUtf8ID    = 33
	CollationUtf8mb4ID = 255
	CollationBinaryID  = 63
)

A few interesting character set values. See http://dev.mysql.com/doc/internals/en/character-set.html#packet-Protocol::CharacterSet

const PadToMax = math.MaxInt32

Variables

This section is empty.

Functions

func Convert added in v0.14.0

func Convert(dst []byte, dstCollation Collation, src []byte, srcCollation Collation) ([]byte, error)

Convert converts the bytes in `src`, which are encoded in `srcCollation`'s charset, into a byte slice encoded in `dstCollation`'s charset. The resulting byte slice is appended to `dst` and returned.
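
The append-to-`dst` convention described above can be illustrated without the package itself. The following is a minimal standalone sketch, not the Vitess implementation: `iso88591ToUTF8` is a hypothetical helper that transcodes ISO-8859-1 bytes to UTF-8 for one fixed charset pair and appends the result to `dst`, in the same shape as Convert.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// iso88591ToUTF8 is a hypothetical helper (not part of the collations package).
// It appends the UTF-8 encoding of the ISO-8859-1 bytes in src to dst and
// returns the extended slice. Every ISO-8859-1 byte maps to the Unicode code
// point with the same value, so this direction cannot fail.
func iso88591ToUTF8(dst, src []byte) []byte {
	for _, b := range src {
		dst = utf8.AppendRune(dst, rune(b))
	}
	return dst
}

func main() {
	// 0xE9 is "é" in ISO-8859-1; the existing content of dst is preserved.
	out := iso88591ToUTF8([]byte("name="), []byte{'c', 'a', 'f', 0xE9})
	fmt.Println(string(out))
}
```

Passing a nil `dst` simply lets the slice grow as needed, which is the usual contract for append-style Go APIs.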

func Length added in v0.16.0

func Length(collation Collation, input []byte) int

Length returns the number of codepoints in the input, based on the given collation.

func Slice added in v0.14.0

func Slice(collation Collation, input []byte, from, to int) []byte

Slice returns the substring in `input[from:to]`, where `from` and `to` are collation-aware character indices instead of bytes.
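
To make the difference between character indices and byte indices concrete, here is a small standalone sketch that assumes UTF-8 input only. `sliceRunes` is a hypothetical helper, not the package's Slice, and it clamps out-of-range indices, which the real function may or may not do.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sliceRunes is a hypothetical helper (not part of the collations package).
// It returns input[from:to] where from and to count UTF-8 encoded characters
// rather than bytes. Indices past the end of the input are clamped.
func sliceRunes(input []byte, from, to int) []byte {
	start, end := len(input), len(input)
	pos, idx := 0, 0
	for pos < len(input) {
		if idx == from {
			start = pos
		}
		if idx == to {
			end = pos
			break
		}
		_, size := utf8.DecodeRune(input[pos:])
		pos += size
		idx++
	}
	if start > end {
		start = end
	}
	return input[start:end]
}

func main() {
	s := []byte("héllo wörld")
	// The byte length and the character count differ for multi-byte input.
	fmt.Println(len(s), utf8.RuneCount(s))
	fmt.Println(string(sliceRunes(s, 1, 5)))
}
```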

func Validate added in v0.14.0

func Validate(collation Collation, input []byte) bool

Validate returns whether the given `input` is properly encoded with the character set for the given collation.
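
As a rough illustration of what charset validation involves, the sketch below checks three encodings using only the standard library. The `validate` function and its string-keyed charset names are assumptions made for this example; the real Validate takes a Collation and delegates to its charset.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// validate is a hypothetical helper (not part of the collations package).
// It reports whether input is well-formed for the named character set.
func validate(charsetName string, input []byte) bool {
	switch charsetName {
	case "ascii":
		for _, b := range input {
			if b >= utf8.RuneSelf { // any byte >= 0x80 is outside ASCII
				return false
			}
		}
		return true
	case "utf8mb4":
		return utf8.Valid(input)
	case "utf8mb3":
		// utf8mb3 only covers the Basic Multilingual Plane.
		if !utf8.Valid(input) {
			return false
		}
		for _, r := range string(input) {
			if r > 0xFFFF {
				return false
			}
		}
		return true
	}
	return false // unknown charset in this sketch
}

func main() {
	emoji := []byte("\U0001F600")
	fmt.Println(validate("utf8mb4", emoji), validate("utf8mb3", emoji))
}
```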

Types

type CaseAwareCollation added in v0.16.0

type CaseAwareCollation interface {
	Collation
	ToUpper(dst []byte, src []byte) []byte
	ToLower(dst []byte, src []byte) []byte
}

CaseAwareCollation implements lowercase and uppercase conventions for collations.

type Coercibility

type Coercibility byte

Coercibility is a numeric value that represents the precedence of a collation when applied to a SQL expression. When trying to coerce the collations of two different expressions so that they can be compared, the expression with the lowest coercibility value will win and its collation will be forced upon the other expression.

The rules for assigning a Coercibility value to an expression are as follows:

  • An explicit COLLATE clause has a coercibility of 0 (not coercible at all).
  • The concatenation of two strings with different collations has a coercibility of 1.
  • The collation of a column or a stored routine parameter or local variable has a coercibility of 2.
  • A “system constant” (the string returned by functions such as USER() or VERSION()) has a coercibility of 3.
  • The collation of a literal has a coercibility of 4.
  • The collation of a numeric or temporal value has a coercibility of 5.
  • NULL or an expression that is derived from NULL has a coercibility of 6.

According to the MySQL documentation, Coercibility is an actual word of the English language, although the Vitess maintainers disagree with this assessment.

See: https://dev.mysql.com/doc/refman/8.0/en/charset-collation-coercibility.html

const (
	CoerceExplicit Coercibility = iota
	CoerceNone
	CoerceImplicit
	CoerceSysconst
	CoerceCoercible
	CoerceNumeric
	CoerceIgnorable
)
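
The "lowest coercibility wins" rule can be sketched in a few lines. This is a deliberately simplified standalone model: the lowercase types and `pickCollation` are hypothetical, and the real resolution performed by MySQL and by MergeCollations has further rules (for example around repertoires and charset supersets) that are not modeled here.

```go
package main

import (
	"errors"
	"fmt"
)

// coercibility is a local copy of the ordering used by the Coercibility type.
type coercibility byte

const (
	coerceExplicit coercibility = iota
	coerceNone
	coerceImplicit
	coerceSysconst
	coerceCoercible
	coerceNumeric
	coerceIgnorable
)

// typedExpr is a hypothetical pairing of a collation name and its coercibility.
type typedExpr struct {
	collation string
	coerc     coercibility
}

// pickCollation returns the collation of the side with the lower
// coercibility value. Equal coercibility with different collations is
// reported as an error in this simplified model.
func pickCollation(left, right typedExpr) (string, error) {
	switch {
	case left.collation == right.collation:
		return left.collation, nil
	case left.coerc < right.coerc:
		return left.collation, nil
	case right.coerc < left.coerc:
		return right.collation, nil
	}
	return "", errors.New("illegal mix of collations")
}

func main() {
	column := typedExpr{collation: "utf8mb4_general_ci", coerc: coerceImplicit}
	literal := typedExpr{collation: "latin1_swedish_ci", coerc: coerceCoercible}
	winner, err := pickCollation(column, literal)
	fmt.Println(winner, err)
}
```

Here the column (coercibility 2) wins over the literal (coercibility 4), so the literal would be the side that gets transcoded.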

func (Coercibility) String

func (ci Coercibility) String() string

type Coercion

type Coercion func(dst, in []byte) ([]byte, error)

Coercion is a function that transforms its `in` argument into a specific character set. The `dst` argument is used as the destination for the coerced bytes, but it can be nil.
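
A function with this shape might look like the following standalone sketch. The `coercion` type mirrors the signature above, while `utf8ToISO88591` is a hypothetical transcoder chosen because it can fail: it targets plain ISO-8859-1, not MySQL's own latin1 charset.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// coercion mirrors the signature of the Coercion type.
type coercion func(dst, in []byte) ([]byte, error)

// utf8ToISO88591 is a hypothetical coercion (not part of the collations
// package). It appends the ISO-8859-1 encoding of the UTF-8 input to dst and
// fails on malformed input or on code points above U+00FF.
func utf8ToISO88591(dst, in []byte) ([]byte, error) {
	for len(in) > 0 {
		r, size := utf8.DecodeRune(in)
		if r == utf8.RuneError && size == 1 {
			return nil, errors.New("invalid UTF-8 input")
		}
		if r > 0xFF {
			return nil, fmt.Errorf("cannot represent %q in ISO-8859-1", r)
		}
		dst = append(dst, byte(r))
		in = in[size:]
	}
	return dst, nil
}

func main() {
	var toLatin coercion = utf8ToISO88591

	out, err := toLatin(nil, []byte("café")) // nil dst is allowed
	fmt.Println(len(out), err)

	_, err = toLatin(nil, []byte("€")) // U+20AC has no ISO-8859-1 encoding
	fmt.Println(err)
}
```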

type CoercionOptions

type CoercionOptions struct {
	// ConvertToSuperset allows merging two different collations as long
	// as the charset of one of them is a strict superset of the other. In
	// order to operate on the two expressions, one of them will need to
	// be transcoded. This transcoding will always be safe because the string
	// with the smallest repertoire will be transcoded to its superset, which
	// cannot fail.
	ConvertToSuperset bool

	// ConvertWithCoercion allows merging two different collations by forcing
	// a coercion as long as the coercibility of the two sides is lax enough.
	// This will force a transcoding of one of the expressions even if their
	// respective charsets are not a strict superset, so the resulting transcoding
	// CAN fail depending on the content of their strings.
	ConvertWithCoercion bool
}

CoercionOptions is used to configure how aggressive the algorithm can be when merging two different collations by transcoding them.

type Collation

type Collation interface {
	// Init initializes the internal state for the collation the first time it is used
	Init()

	// ID returns the numerical identifier for this collation. This is the same
	// value that is returned by MySQL in a query's headers to identify the collation
	// for a given column
	ID() ID

	// Name is the full name of this collation, in the form of "ENCODING_LANG_SENSITIVITY"
	Name() string

	// Collate compares two strings using this collation. `left` and `right` must be the
	// two strings encoded in the proper encoding for this collation. If `isPrefix` is true,
	// the function instead behaves equivalently to `strings.HasPrefix(left, right)`, but
	// being collation-aware.
	// It returns a numeric value like a normal comparison function: <0 if left < right,
	// 0 if left == right, >0 if left > right
	Collate(left, right []byte, isPrefix bool) int

	// WeightString returns a weight string for the given `src` string. A weight string
	// is a binary representation of the weights for the given string, that can be
	// compared byte-wise to return identical results to collating this string.
	//
	// This means:
	//		bytes.Compare(WeightString(left), WeightString(right)) == Collate(left, right)
	//
	// The semantics of this API have been carefully designed to match MySQL's behavior
	// in its `strnxfrm` API. Most notably, the `numCodepoints` argument implies different
	// behaviors depending on the collation's padding mode:
	//
	// - For collations that pad WITH SPACE (that is, all legacy collations in MySQL except
	//	for the newly introduced UCA v9.0.0 utf8mb4 collations in MySQL 8.0), `numCodepoints`
	// 	can have the following values:
	//
	//		- if `numCodepoints` is any integer greater than zero, this treats the `src` string
	//		as if it were in a `CHAR(numCodepoints)` column in MySQL, meaning that the resulting
	//		weight string will be padded with the weight for the SPACE character until it becomes
	//		wide enough to fill the `CHAR` column. This is necessary to perform weight comparisons
	//		in fixed-`CHAR` columns. If `numCodepoints` is smaller than the actual amount of
	//		codepoints stored in `src`, the result is unspecified.
	//
	//		- if `numCodepoints` is zero, this is equivalent to `numCodepoints = RuneCount(src)`,
	//		meaning that the resulting weight string will have no padding at the end: it'll only have
	//		the weight values for the exact amount of codepoints contained in `src`. This is the
	//		behavior required to sort `VARCHAR` columns.
	//
	//		- if `numCodepoints` is the special constant PadToMax, then the `dst` slice must be
	//		pre-allocated to a zero-length slice with enough capacity to hold the complete weight
	//		string, and any remaining capacity in `dst` will be filled by the weights for the
	//		padding character, repeatedly. This is a special flag used by MySQL when performing
	//		filesorts, where all the sorting keys must have identical sizes, even for `VARCHAR`
	//		columns.
	//
	//	- For collations that have NO PAD (that is, the newly introduced UCA v9.0.0 utf8mb4 collations
	//	in MySQL 8.0), `numCodepoints` can only have the special constant `PadToMax`, which will make
	//	the weight string padding equivalent to a PAD SPACE collation (as explained in the previous
	//	section). All other values for `numCodepoints` are ignored, because NO PAD collations always
	//	return the weights for the codepoints in their strings, with no further padding at the end.
	//
	// The resulting weight string is written to `dst`, which can be pre-allocated to
	// WeightStringLen() bytes to prevent growing the slice. `dst` can also be nil, in which
	// case it will grow dynamically. If `numCodepoints` has the special PadToMax value explained
	// earlier, `dst` MUST be pre-allocated to the target size or the function will return an
	// empty slice.
	WeightString(dst, src []byte, numCodepoints int) []byte

	// WeightStringLen returns a size (in bytes) that would fit any weight strings for a string
	// with `numCodepoints` using this collation. Note that this is an upper bound for the size
	// of the string, and in practice weight strings can be significantly smaller than the
	// returned value.
	WeightStringLen(numCodepoints int) int

	// Hash returns a 32 or 64 bit identifier (depending on the platform) that uniquely identifies
	// the given string based on this collation. It is functionally equivalent to calling WeightString
	// and then hashing the result.
	//
	// Consequently, if the hashes for two strings are different, then the two strings are considered
	// different according to this collation. If the hashes for two strings are equal, the two strings
	// may or may not be considered equal according to this collation, because hashes can collide unlike
	// weight strings.
	//
	// The numCodepoints argument has the same behavior as in WeightString: if this collation uses PAD SPACE,
	// the hash will interpret the source string as if it were stored in a `CHAR(n)` column. If the value of
	// numCodepoints is 0, this is equivalent to setting `numCodepoints = RuneCount(src)`.
	// For collations with NO PAD, the numCodepoints argument is ignored.
	Hash(src []byte, numCodepoints int) HashCode

	// Wildcard returns a matcher for the given wildcard pattern. The matcher can be used to repeatedly
	// test different strings to check if they match the pattern. The pattern must be a traditional wildcard
	// pattern, which may contain the provided special characters for matching one character or several characters.
	// The provided `escape` character will be used as an escape sequence in front of the other special characters.
	//
	// This method is fully collation aware; the matching will be performed according to the underlying collation.
	// I.e. if this is a case-insensitive collation, matching will be case-insensitive.
	//
	// The returned WildcardPattern is always valid, but if the provided special characters do not exist in this
	// collation's repertoire, the returned pattern will not match any strings. Likewise, if the provided pattern
	// has invalid syntax, the returned pattern will not match any strings.
	//
	// If the provided special characters are 0, the defaults to parse an SQL 'LIKE' statement will be used.
	// That is, '_' for matching one character, '%' for matching many, and '\\' for escape.
	//
	// This method can also be used for Shell-like matching with '?', '*' and '\\' as their respective special
	// characters.
	Wildcard(pat []byte, matchOne, matchMany, escape rune) WildcardPattern

	// Charset returns the Charset with which this collation is encoded
	Charset() charset.Charset

	// IsBinary returns whether this collation is a binary collation
	IsBinary() bool
}

Collation implements a MySQL-compatible collation. It defines how two strings with the same encoding are compared for sort order and equality.
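
The relationship between Collate and WeightString documented in the interface can be shown with a toy collation. The sketch below is standalone and hypothetical: it models an ASCII-only, case-insensitive, PAD SPACE collation whose weight for each byte is its uppercase form, and it does not implement the PadToMax mode.

```go
package main

import (
	"bytes"
	"fmt"
)

// upper returns the weight of one ASCII byte: letters are folded to uppercase.
func upper(b byte) byte {
	if 'a' <= b && b <= 'z' {
		return b - 'a' + 'A'
	}
	return b
}

// collate compares two ASCII strings case-insensitively and returns -1, 0 or 1.
func collate(left, right []byte) int {
	n := len(left)
	if len(right) < n {
		n = len(right)
	}
	for i := 0; i < n; i++ {
		l, r := upper(left[i]), upper(right[i])
		if l != r {
			if l < r {
				return -1
			}
			return 1
		}
	}
	switch {
	case len(left) < len(right):
		return -1
	case len(left) > len(right):
		return 1
	}
	return 0
}

// weightString appends the weights of src to dst. If numCodepoints exceeds
// the length of src, the result is padded with the weight of the space
// character, as for a CHAR(numCodepoints) column. Zero means "no padding".
func weightString(dst, src []byte, numCodepoints int) []byte {
	for _, b := range src {
		dst = append(dst, upper(b))
	}
	for i := len(src); i < numCodepoints; i++ {
		dst = append(dst, ' ')
	}
	return dst
}

func main() {
	l, r := []byte("abc"), []byte("ABD")
	// Byte-wise comparison of weight strings agrees with collate.
	fmt.Println(bytes.Compare(weightString(nil, l, 0), weightString(nil, r, 0)) == collate(l, r))
	// With CHAR(4) padding, "a" and "A " produce identical weight strings.
	fmt.Println(bytes.Equal(weightString(nil, []byte("a"), 4), weightString(nil, []byte("A "), 4)))
}
```

Because the weight strings are plain bytes, they can be sorted or indexed with ordinary byte comparison, which is the main reason to materialize them instead of calling the comparison function repeatedly.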

type Collation_8bit_bin

type Collation_8bit_bin struct {
	// contains filtered or unexported fields
}

func (*Collation_8bit_bin) Charset

func (c *Collation_8bit_bin) Charset() charset.Charset

func (*Collation_8bit_bin) Collate

func (c *Collation_8bit_bin) Collate(left, right []byte, rightIsPrefix bool) int

func (*Collation_8bit_bin) Hash

func (c *Collation_8bit_bin) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_8bit_bin) ID

func (c *Collation_8bit_bin) ID() ID

func (*Collation_8bit_bin) Init

func (c *Collation_8bit_bin) Init()

func (*Collation_8bit_bin) IsBinary

func (c *Collation_8bit_bin) IsBinary() bool

func (*Collation_8bit_bin) Name

func (c *Collation_8bit_bin) Name() string

func (*Collation_8bit_bin) ToLower added in v0.16.0

func (c *Collation_8bit_bin) ToLower(dst, src []byte) []byte

func (*Collation_8bit_bin) ToUpper added in v0.16.0

func (c *Collation_8bit_bin) ToUpper(dst, src []byte) []byte

func (*Collation_8bit_bin) WeightString

func (c *Collation_8bit_bin) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_8bit_bin) WeightStringLen

func (c *Collation_8bit_bin) WeightStringLen(numBytes int) int

func (*Collation_8bit_bin) Wildcard

func (c *Collation_8bit_bin) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_8bit_simple_ci

type Collation_8bit_simple_ci struct {
	// contains filtered or unexported fields
}

func (*Collation_8bit_simple_ci) Charset

func (*Collation_8bit_simple_ci) Collate

func (c *Collation_8bit_simple_ci) Collate(left, right []byte, rightIsPrefix bool) int

func (*Collation_8bit_simple_ci) Hash

func (c *Collation_8bit_simple_ci) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_8bit_simple_ci) ID

func (c *Collation_8bit_simple_ci) ID() ID

func (*Collation_8bit_simple_ci) Init

func (c *Collation_8bit_simple_ci) Init()

func (*Collation_8bit_simple_ci) IsBinary

func (c *Collation_8bit_simple_ci) IsBinary() bool

func (*Collation_8bit_simple_ci) Name

func (c *Collation_8bit_simple_ci) Name() string

func (*Collation_8bit_simple_ci) ToLower added in v0.16.0

func (c *Collation_8bit_simple_ci) ToLower(dst, src []byte) []byte

func (*Collation_8bit_simple_ci) ToUpper added in v0.16.0

func (c *Collation_8bit_simple_ci) ToUpper(dst, src []byte) []byte

func (*Collation_8bit_simple_ci) WeightString

func (c *Collation_8bit_simple_ci) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_8bit_simple_ci) WeightStringLen

func (c *Collation_8bit_simple_ci) WeightStringLen(numBytes int) int

func (*Collation_8bit_simple_ci) Wildcard

func (c *Collation_8bit_simple_ci) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_binary

type Collation_binary struct{}

func (*Collation_binary) Charset

func (c *Collation_binary) Charset() charset.Charset

func (*Collation_binary) Collate

func (c *Collation_binary) Collate(left, right []byte, isPrefix bool) int

func (*Collation_binary) Hash

func (c *Collation_binary) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_binary) ID

func (c *Collation_binary) ID() ID

func (*Collation_binary) Init

func (c *Collation_binary) Init()

func (*Collation_binary) IsBinary

func (c *Collation_binary) IsBinary() bool

func (*Collation_binary) Name

func (c *Collation_binary) Name() string

func (*Collation_binary) ToLower added in v0.16.0

func (c *Collation_binary) ToLower(dst, raw []byte) []byte

func (*Collation_binary) ToUpper added in v0.16.0

func (c *Collation_binary) ToUpper(dst, raw []byte) []byte

func (*Collation_binary) WeightString

func (c *Collation_binary) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_binary) WeightStringLen

func (c *Collation_binary) WeightStringLen(numBytes int) int

func (*Collation_binary) Wildcard

func (c *Collation_binary) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_multibyte

type Collation_multibyte struct {
	// contains filtered or unexported fields
}

func (*Collation_multibyte) Charset

func (c *Collation_multibyte) Charset() charset.Charset

func (*Collation_multibyte) Collate

func (c *Collation_multibyte) Collate(left, right []byte, isPrefix bool) int

func (*Collation_multibyte) Hash

func (c *Collation_multibyte) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_multibyte) ID

func (c *Collation_multibyte) ID() ID

func (*Collation_multibyte) Init

func (c *Collation_multibyte) Init()

func (*Collation_multibyte) IsBinary

func (c *Collation_multibyte) IsBinary() bool

func (*Collation_multibyte) Name

func (c *Collation_multibyte) Name() string

func (*Collation_multibyte) WeightString

func (c *Collation_multibyte) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_multibyte) WeightStringLen

func (c *Collation_multibyte) WeightStringLen(numCodepoints int) int

func (*Collation_multibyte) Wildcard

func (c *Collation_multibyte) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_uca_legacy

type Collation_uca_legacy struct {
	// contains filtered or unexported fields
}

func (*Collation_uca_legacy) Charset

func (c *Collation_uca_legacy) Charset() charset.Charset

func (*Collation_uca_legacy) Collate

func (c *Collation_uca_legacy) Collate(left, right []byte, isPrefix bool) int

func (*Collation_uca_legacy) Hash

func (c *Collation_uca_legacy) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_uca_legacy) ID

func (c *Collation_uca_legacy) ID() ID

func (*Collation_uca_legacy) Init

func (c *Collation_uca_legacy) Init()

func (*Collation_uca_legacy) IsBinary

func (c *Collation_uca_legacy) IsBinary() bool

func (*Collation_uca_legacy) Name

func (c *Collation_uca_legacy) Name() string

func (*Collation_uca_legacy) WeightString

func (c *Collation_uca_legacy) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_uca_legacy) WeightStringLen

func (c *Collation_uca_legacy) WeightStringLen(numBytes int) int

func (*Collation_uca_legacy) Wildcard

func (c *Collation_uca_legacy) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_unicode_bin

type Collation_unicode_bin struct {
	// contains filtered or unexported fields
}

func (*Collation_unicode_bin) Charset

func (c *Collation_unicode_bin) Charset() charset.Charset

func (*Collation_unicode_bin) Collate

func (c *Collation_unicode_bin) Collate(left, right []byte, isPrefix bool) int

func (*Collation_unicode_bin) Hash

func (c *Collation_unicode_bin) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_unicode_bin) ID

func (c *Collation_unicode_bin) ID() ID

func (*Collation_unicode_bin) Init

func (c *Collation_unicode_bin) Init()

func (*Collation_unicode_bin) IsBinary

func (c *Collation_unicode_bin) IsBinary() bool

func (*Collation_unicode_bin) Name

func (c *Collation_unicode_bin) Name() string

func (*Collation_unicode_bin) WeightString

func (c *Collation_unicode_bin) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_unicode_bin) WeightStringLen

func (c *Collation_unicode_bin) WeightStringLen(numBytes int) int

func (*Collation_unicode_bin) Wildcard

func (c *Collation_unicode_bin) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_unicode_general_ci

type Collation_unicode_general_ci struct {
	// contains filtered or unexported fields
}

func (*Collation_unicode_general_ci) Charset

func (*Collation_unicode_general_ci) Collate

func (c *Collation_unicode_general_ci) Collate(left, right []byte, isPrefix bool) int

func (*Collation_unicode_general_ci) Hash

func (c *Collation_unicode_general_ci) Hash(src []byte, numCodepoints int) HashCode

func (*Collation_unicode_general_ci) ID

func (*Collation_unicode_general_ci) Init

func (c *Collation_unicode_general_ci) Init()

func (*Collation_unicode_general_ci) IsBinary

func (c *Collation_unicode_general_ci) IsBinary() bool

func (*Collation_unicode_general_ci) Name

func (*Collation_unicode_general_ci) WeightString

func (c *Collation_unicode_general_ci) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_unicode_general_ci) WeightStringLen

func (c *Collation_unicode_general_ci) WeightStringLen(numBytes int) int

func (*Collation_unicode_general_ci) Wildcard

func (c *Collation_unicode_general_ci) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_utf8mb4_0900_bin

type Collation_utf8mb4_0900_bin struct{}

func (*Collation_utf8mb4_0900_bin) Charset

func (*Collation_utf8mb4_0900_bin) Collate

func (c *Collation_utf8mb4_0900_bin) Collate(left, right []byte, isPrefix bool) int

func (*Collation_utf8mb4_0900_bin) Hash

func (c *Collation_utf8mb4_0900_bin) Hash(src []byte, _ int) HashCode

func (*Collation_utf8mb4_0900_bin) ID

func (*Collation_utf8mb4_0900_bin) Init

func (c *Collation_utf8mb4_0900_bin) Init()

func (*Collation_utf8mb4_0900_bin) IsBinary

func (c *Collation_utf8mb4_0900_bin) IsBinary() bool

func (*Collation_utf8mb4_0900_bin) Name

func (*Collation_utf8mb4_0900_bin) ToLower added in v0.16.0

func (c *Collation_utf8mb4_0900_bin) ToLower(dst, src []byte) []byte

func (*Collation_utf8mb4_0900_bin) ToUpper added in v0.16.0

func (c *Collation_utf8mb4_0900_bin) ToUpper(dst, src []byte) []byte

func (*Collation_utf8mb4_0900_bin) WeightString

func (c *Collation_utf8mb4_0900_bin) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_utf8mb4_0900_bin) WeightStringLen

func (c *Collation_utf8mb4_0900_bin) WeightStringLen(numBytes int) int

func (*Collation_utf8mb4_0900_bin) Wildcard

func (c *Collation_utf8mb4_0900_bin) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Collation_utf8mb4_uca_0900

type Collation_utf8mb4_uca_0900 struct {
	// contains filtered or unexported fields
}

func (*Collation_utf8mb4_uca_0900) Charset

func (*Collation_utf8mb4_uca_0900) Collate

func (c *Collation_utf8mb4_uca_0900) Collate(left, right []byte, rightIsPrefix bool) int

func (*Collation_utf8mb4_uca_0900) Hash

func (c *Collation_utf8mb4_uca_0900) Hash(src []byte, _ int) HashCode

func (*Collation_utf8mb4_uca_0900) ID

func (*Collation_utf8mb4_uca_0900) Init

func (c *Collation_utf8mb4_uca_0900) Init()

func (*Collation_utf8mb4_uca_0900) IsBinary

func (c *Collation_utf8mb4_uca_0900) IsBinary() bool

func (*Collation_utf8mb4_uca_0900) Name

func (*Collation_utf8mb4_uca_0900) ToLower added in v0.16.0

func (c *Collation_utf8mb4_uca_0900) ToLower(dst, src []byte) []byte

func (*Collation_utf8mb4_uca_0900) ToUpper added in v0.16.0

func (c *Collation_utf8mb4_uca_0900) ToUpper(dst, src []byte) []byte

func (*Collation_utf8mb4_uca_0900) WeightString

func (c *Collation_utf8mb4_uca_0900) WeightString(dst, src []byte, numCodepoints int) []byte

func (*Collation_utf8mb4_uca_0900) WeightStringLen

func (c *Collation_utf8mb4_uca_0900) WeightStringLen(numBytes int) int

func (*Collation_utf8mb4_uca_0900) Wildcard

func (c *Collation_utf8mb4_uca_0900) Wildcard(pat []byte, matchOne rune, matchMany rune, escape rune) WildcardPattern

type Environment

type Environment struct {
	// contains filtered or unexported fields
}

Environment is a collation environment for a MySQL version, which contains a database of collations and defaults for that specific version.

func Local

func Local() *Environment

Local is the default collation Environment for Vitess. This depends on the value of the `mysql_server_version` flag passed to this Vitess process.

func NewEnvironment

func NewEnvironment(serverVersion string) *Environment

NewEnvironment creates a collation Environment for the given MySQL version string. The version string must be in the format that is sent by the server as the version packet when opening a new MySQL connection.

func (*Environment) AllCollations

func (env *Environment) AllCollations() (all []Collation)

AllCollations returns a slice with all known collations in Vitess. This is an expensive call because it will initialize the internal state of all the collations before returning them. Used for testing/debugging.

func (*Environment) BinaryCollationForCharset

func (env *Environment) BinaryCollationForCharset(charset string) Collation

BinaryCollationForCharset returns the default binary collation for a charset

func (*Environment) CharsetAlias added in v0.14.0

func (env *Environment) CharsetAlias(charset string) (alias string, ok bool)

CharsetAlias returns the internal charset name for the given charset. For now, this only maps `utf8` to `utf8mb3`; in future versions of MySQL, this mapping will change, so it's important to use this helper so that Vitess code has a consistent mapping for the active collations environment.

func (*Environment) CollationAlias added in v0.16.0

func (env *Environment) CollationAlias(collation string) (string, bool)

CollationAlias returns the internal collation name for the given collation. For now, this maps all `utf8` collation names to `utf8mb3` collation names; in future versions of MySQL, this mapping will change, so it's important to use this helper so that Vitess code has a consistent mapping for the active collations environment.

func (*Environment) DefaultCollationForCharset

func (env *Environment) DefaultCollationForCharset(charset string) Collation

DefaultCollationForCharset returns the default collation for a charset

func (*Environment) DefaultConnectionCharset

func (env *Environment) DefaultConnectionCharset() uint8

DefaultConnectionCharset is the default charset that Vitess will use when negotiating a charset in a MySQL connection handshake. Note that in this context, a 'charset' is equivalent to a Collation ID, with the exception that it can only fit in 1 byte. For MySQL 8.0+ environments, the default charset is `utf8mb4_0900_ai_ci`. For older MySQL environments, the default charset is `utf8mb4_general_ci`.
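
The version-dependent default described above could be modeled as follows. This is a standalone sketch under stated assumptions: `defaultConnectionCharset` is a hypothetical free function, the ID 255 comes from the CollationUtf8mb4ID constant, the ID 45 for `utf8mb4_general_ci` is assumed from MySQL's collation table, and only the major version number is inspected (non-MySQL version strings are not handled).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	idUtf8mb4GeneralCI uint8 = 45  // assumed ID of utf8mb4_general_ci
	idUtf8mb40900AICI  uint8 = 255 // utf8mb4_0900_ai_ci (CollationUtf8mb4ID)
)

// defaultConnectionCharset is a hypothetical helper (not part of the
// collations package). It returns the handshake charset byte for a server
// version string such as "8.0.30-Vitess". Unparseable versions fall back to
// the older default.
func defaultConnectionCharset(serverVersion string) uint8 {
	major := strings.SplitN(serverVersion, ".", 2)[0]
	n, err := strconv.Atoi(major)
	if err == nil && n >= 8 {
		return idUtf8mb40900AICI
	}
	return idUtf8mb4GeneralCI
}

func main() {
	fmt.Println(defaultConnectionCharset("8.0.30-Vitess"))
	fmt.Println(defaultConnectionCharset("5.7.9-Vitess"))
}
```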

func (*Environment) EnsureCollate

func (env *Environment) EnsureCollate(fromID, toID ID) error

func (*Environment) LookupByID

func (env *Environment) LookupByID(id ID) Collation

LookupByID returns the collation with the given numerical identifier. The collation is initialized if it's the first time being accessed.

func (*Environment) LookupByName

func (env *Environment) LookupByName(name string) Collation

LookupByName returns the collation with the given name. The collation is initialized if it's the first time being accessed.

func (*Environment) LookupID

func (env *Environment) LookupID(name string) (ID, bool)

LookupID returns the collation ID for the given name, and whether the collation is supported by this package.

func (*Environment) MergeCollations

func (env *Environment) MergeCollations(left, right TypedCollation, opt CoercionOptions) (TypedCollation, Coercion, Coercion, error)

MergeCollations returns a Coercion function for a pair of TypedCollation based on their coercibility.

The function takes the typed collations for the two sides of a text operation (namely, a comparison or concatenation of two textual expressions). These typed collations include the actual collation for the expression on each side, their coercibility values (see: Coercibility) and their respective repertoires. The function returns the target collation (i.e. the collation into which the two expressions must be coerced) and a Coercion function. The Coercion function can be called repeatedly with the different values for the two expressions and will transcode either the left-hand or right-hand value to the appropriate charset so it can be collated against the other value.

If the collations for both sides of the expressions are the same, the returned Coercion function will be a no-op. Likewise, if the two collations are not the same, but they are compatible and have the same charset, the Coercion function will also be a no-op.

If the collations for both sides of the expression are not compatible, an error will be returned and the returned TypedCollation and Coercion will be nil.

func (*Environment) ParseConnectionCharset

func (env *Environment) ParseConnectionCharset(csname string) (uint8, error)

ParseConnectionCharset parses the given charset name and returns its numerical identifier to be used in a MySQL connection handshake. The charset name can be:

  • the name of a character set, in which case the default collation ID for the character set is returned.
  • the name of a collation, in which case the ID for the collation is returned, UNLESS the collation itself has an ID greater than 255; such collations are not supported because they cannot be negotiated in a single byte in our connection handshake.
  • empty, in which case the default connection charset for this MySQL version is returned.
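
The three cases above can be sketched with two lookup tables. Everything in this block is hypothetical: the tables are tiny illustrative subsets, the IDs 33, 63 and 255 come from the constants at the top of this page, and the IDs 45 and 308 are assumed from MySQL's collation table rather than taken from this package.

```go
package main

import "fmt"

// Illustrative subsets only; the real environment holds many more entries.
var (
	charsetDefaults = map[string]uint16{
		"utf8mb4": 255,
		"utf8mb3": 33,
		"binary":  63,
	}
	collationIDs = map[string]uint16{
		"utf8mb4_0900_ai_ci":    255,
		"utf8mb4_general_ci":    45,  // assumed ID
		"utf8mb4_zh_0900_as_cs": 308, // assumed ID, does not fit in one byte
	}
)

const defaultCharset uint8 = 255

// parseConnectionCharset is a hypothetical helper (not part of the collations
// package) that follows the three documented cases.
func parseConnectionCharset(name string) (uint8, error) {
	if name == "" {
		return defaultCharset, nil
	}
	id, ok := charsetDefaults[name]
	if !ok {
		id, ok = collationIDs[name]
	}
	if !ok {
		return 0, fmt.Errorf("unknown charset or collation %q", name)
	}
	if id > 255 {
		return 0, fmt.Errorf("collation %q (ID %d) cannot be negotiated in one byte", name, id)
	}
	return uint8(id), nil
}

func main() {
	for _, name := range []string{"", "binary", "utf8mb4_general_ci", "utf8mb4_zh_0900_as_cs"} {
		id, err := parseConnectionCharset(name)
		fmt.Println(id, err)
	}
}
```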

type HashCode

type HashCode = uintptr
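
The Hash method of the Collation interface is documented as functionally equivalent to hashing the weight string. A minimal standalone sketch of that idea follows, using a toy ASCII case-insensitive weight function and FNV-1a from the standard library. Both choices are assumptions for illustration; the hash function the package actually uses is not stated here.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// weights is a toy weight function: ASCII letters are folded to uppercase.
func weights(src []byte) []byte {
	out := make([]byte, 0, len(src))
	for _, b := range src {
		if 'a' <= b && b <= 'z' {
			b = b - 'a' + 'A'
		}
		out = append(out, b)
	}
	return out
}

// collationHash hashes the weight string rather than the raw bytes, so
// strings that collate as equal always share a hash.
func collationHash(src []byte) uint64 {
	h := fnv.New64a()
	h.Write(weights(src)) // Write on an FNV hash never returns an error
	return h.Sum64()
}

func main() {
	fmt.Println(collationHash([]byte("Vitess")) == collationHash([]byte("VITESS")))
	fmt.Println(collationHash([]byte("a")) == collationHash([]byte("b")))
}
```

Different hashes prove the strings differ under the collation, while equal hashes still need a Collate call to rule out a collision.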

type ID

type ID uint16

ID is a numeric identifier for a collation. These identifiers are defined by MySQL, not by Vitess.

const Unknown ID = 0

Unknown is the default ID for an unknown collation.

func Default

func Default() ID

Default returns the default collation for this Vitess process. This is based on the local collation environment, which is based on the user's configured MySQL version for this Vitess deployment.

type Repertoire

type Repertoire byte

Repertoire is a constant that defines the collection of characters in an expression. MySQL only distinguishes between an ASCII repertoire (i.e. an expression where all the contained codepoints are < 128), or a Unicode repertoire (an expression that can contain any possible codepoint).

See: https://dev.mysql.com/doc/refman/8.0/en/charset-repertoire.html

const (
	RepertoireASCII Repertoire = iota
	RepertoireUnicode
)
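
For a UTF-8 encoded value, the repertoire distinction reduces to a byte scan, since every codepoint below 128 is encoded as a single byte below 0x80. The sketch below is standalone; the lowercase `repertoire` type and `repertoireOf` are hypothetical names, and how the package itself computes repertoires is not shown on this page.

```go
package main

import "fmt"

// repertoire is a local copy of the two-valued Repertoire type.
type repertoire byte

const (
	repertoireASCII repertoire = iota
	repertoireUnicode
)

// repertoireOf is a hypothetical helper that classifies a UTF-8 encoded
// value: ASCII if every byte is below 0x80, Unicode otherwise.
func repertoireOf(value []byte) repertoire {
	for _, b := range value {
		if b >= 0x80 {
			return repertoireUnicode
		}
	}
	return repertoireASCII
}

func main() {
	fmt.Println(repertoireOf([]byte("hello")) == repertoireASCII)
	fmt.Println(repertoireOf([]byte("héllo")) == repertoireUnicode)
}
```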

type TypedCollation

type TypedCollation struct {
	Collation    ID
	Coercibility Coercibility
	Repertoire   Repertoire
}

TypedCollation is the Collation of a SQL expression, including its coercibility and repertoire.

func (TypedCollation) Valid

func (tc TypedCollation) Valid() bool

type UnicaseChar

type UnicaseChar struct {
	ToUpper, ToLower, Sort rune
}

type UnicaseInfo

type UnicaseInfo struct {
	MaxChar   rune
	Page      []*[]UnicaseChar
	LowerSort bool
}

type WildcardPattern

type WildcardPattern interface {
	// Match returns whether the given string matches this pattern
	Match(in []byte) bool
}

WildcardPattern is a matcher for a wildcard pattern, constructed from a given collation
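
To illustrate the matching semantics described for Collation.Wildcard, here is a standalone backtracking matcher. It is a sketch, not the package's implementation: `likePattern` and `newLikePattern` are hypothetical, case-insensitivity is approximated with `unicode.ToLower` instead of real collation weights, each zero special character is defaulted individually, and the naive backtracking can be slow on patterns with many match-many characters.

```go
package main

import (
	"fmt"
	"unicode"
)

// likePattern is a hypothetical type with the same Match method as WildcardPattern.
type likePattern struct {
	pat                         []rune
	matchOne, matchMany, escape rune
}

// newLikePattern builds a matcher; zero special characters fall back to the
// SQL LIKE defaults '_', '%' and '\\'.
func newLikePattern(pat string, matchOne, matchMany, escape rune) likePattern {
	if matchOne == 0 {
		matchOne = '_'
	}
	if matchMany == 0 {
		matchMany = '%'
	}
	if escape == 0 {
		escape = '\\'
	}
	return likePattern{pat: []rune(pat), matchOne: matchOne, matchMany: matchMany, escape: escape}
}

// Match reports whether the whole input matches the pattern.
func (p likePattern) Match(in []byte) bool {
	return p.match(p.pat, []rune(string(in)))
}

func (p likePattern) match(pat, s []rune) bool {
	for len(pat) > 0 {
		c := pat[0]
		switch {
		case c == p.matchMany:
			// Try every possible length for the run consumed by matchMany.
			for i := 0; i <= len(s); i++ {
				if p.match(pat[1:], s[i:]) {
					return true
				}
			}
			return false
		case c == p.matchOne:
			if len(s) == 0 {
				return false
			}
		default:
			// An escape makes the next pattern character literal.
			if c == p.escape && len(pat) > 1 {
				pat = pat[1:]
				c = pat[0]
			}
			if len(s) == 0 || unicode.ToLower(c) != unicode.ToLower(s[0]) {
				return false
			}
		}
		pat, s = pat[1:], s[1:]
	}
	return len(s) == 0
}

func main() {
	like := newLikePattern("v_t%", 0, 0, 0)
	fmt.Println(like.Match([]byte("VITESS")), like.Match([]byte("mysql")))

	shell := newLikePattern("*.go", '?', '*', '\\')
	fmt.Println(shell.Match([]byte("main.go")))
}
```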

Directories

Path Synopsis
internal
charset/korean
Package korean provides Korean encodings such as EUC-KR.
charset/simplifiedchinese
Package simplifiedchinese provides Simplified Chinese encodings such as GBK.
uca
tools
