collations

package
v0.15.10
Published: Jul 31, 2024 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Constants

const (
	CollationUtf8mb3ID     = 33
	CollationUtf8mb4ID     = 255
	CollationBinaryID      = 63
	CollationUtf8mb4BinID  = 46
	CollationLatin1Swedish = 8
)

A few interesting character set values. See http://dev.mysql.com/doc/internals/en/character-set.html#packet-Protocol::CharacterSet

Variables

var SystemCollation = TypedCollation{
	Collation:    CollationUtf8mb3ID,
	Coercibility: CoerceCoercible,
	Repertoire:   RepertoireUnicode,
}

SystemCollation is the default collation for system tables such as the information schema. It is still utf8mb3, to match MySQL's behavior. This means that you can't use utf8mb4 in table names or column names without running into significant issues.

Functions

This section is empty.

Types

type Coercibility

type Coercibility byte

Coercibility is a numeric value that represents the precedence of a collation when applied to a SQL expression. When trying to coerce the collations of two different expressions so that they can be compared, the expression with the lowest coercibility value will win and its collation will be forced upon the other expression.

The rules for assigning a Coercibility value to an expression are as follows:

  • An explicit COLLATE clause has a coercibility of 0 (not coercible at all).
  • The concatenation of two strings with different collations has a coercibility of 1.
  • The collation of a column or a stored routine parameter or local variable has a coercibility of 2.
  • A “system constant” (the string returned by functions such as USER() or VERSION()) has a coercibility of 3.
  • The collation of a literal has a coercibility of 4.
  • The collation of a numeric or temporal value has a coercibility of 5.
  • NULL or an expression that is derived from NULL has a coercibility of 6.

According to the MySQL documentation, Coercibility is an actual word of the English language, although the Vitess maintainers disagree with this assessment.

See: https://dev.mysql.com/doc/refman/8.0/en/charset-collation-coercibility.html

const (
	CoerceExplicit Coercibility = iota
	CoerceNone
	CoerceImplicit
	CoerceSysconst
	CoerceCoercible
	CoerceNumeric
	CoerceIgnorable
)
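
The precedence rule described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the `typed` struct and the `pickWinner` helper are hypothetical names introduced here, and the sketch leaves ties between different collations unresolved, whereas MySQL applies further rules in that case.

```go
package main

import "fmt"

// Coercibility is redeclared here so the sketch is self-contained;
// the values mirror the constants documented above.
type Coercibility byte

const (
	CoerceExplicit Coercibility = iota
	CoerceNone
	CoerceImplicit
	CoerceSysconst
	CoerceCoercible
	CoerceNumeric
	CoerceIgnorable
)

// typed is a hypothetical stand-in for TypedCollation that carries a
// collation name instead of a numeric ID.
type typed struct {
	Collation    string
	Coercibility Coercibility
}

// pickWinner is a hypothetical helper. It returns the operand with the
// lowest coercibility value. When both values are equal it succeeds only
// if the collations already match; otherwise it reports ok == false.
func pickWinner(left, right typed) (winner typed, ok bool) {
	switch {
	case left.Coercibility < right.Coercibility:
		return left, true
	case right.Coercibility < left.Coercibility:
		return right, true
	case left.Collation == right.Collation:
		return left, true
	default:
		return typed{}, false
	}
}

func main() {
	column := typed{Collation: "utf8mb4_0900_ai_ci", Coercibility: CoerceImplicit}
	literal := typed{Collation: "utf8mb4_general_ci", Coercibility: CoerceCoercible}

	winner, ok := pickWinner(column, literal)
	fmt.Println(winner.Collation, ok) // prints "utf8mb4_0900_ai_ci true"
}
```

In this example the column (coercibility 2) wins over the literal (coercibility 4), so the comparison would use the column's collation.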

func (Coercibility) String

func (ci Coercibility) String() string

type Environment

type Environment struct {
	// contains filtered or unexported fields
}

Environment is a collation environment for a MySQL version, which contains a database of collations and defaults for that specific version.

func MySQL8

func MySQL8() *Environment

MySQL8 is the collation Environment for MySQL 8. This should only be used for testing where we know it's safe to use this version, and we don't need a specific other version.

func NewEnvironment

func NewEnvironment(serverVersion string) *Environment

NewEnvironment creates a collation Environment for the given MySQL version string. The version string must be in the format that the server sends as the version packet when opening a new MySQL connection.

func (*Environment) AllCollationIDs

func (env *Environment) AllCollationIDs() []ID

func (*Environment) BinaryCollationForCharset

func (env *Environment) BinaryCollationForCharset(charset string) ID

BinaryCollationForCharset returns the default binary collation for a charset.

func (*Environment) CachedSize

func (cached *Environment) CachedSize(alloc bool) int64

func (*Environment) CharsetAlias

func (env *Environment) CharsetAlias(charset string) (alias string, ok bool)

CharsetAlias returns the internal charset name for the given charset. For now, this only maps `utf8` to `utf8mb3`; in future versions of MySQL, this mapping will change, so it's important to use this helper so that Vitess code has a consistent mapping for the active collations environment.

func (*Environment) CollationAlias

func (env *Environment) CollationAlias(collation string) (string, bool)

CollationAlias returns the internal collation name for the given collation. For now, this maps all `utf8` collation names to their `utf8mb3` equivalents; in future versions of MySQL, this mapping will change, so it's important to use this helper so that Vitess code has a consistent mapping for the active collations environment.
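
The two alias mappings described for CharsetAlias and CollationAlias can be sketched in isolation. The functions below are hypothetical stand-ins written for illustration, not the package's code; in particular, returning an empty string and `false` when no alias applies is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// charsetAlias is a hypothetical stand-in for Environment.CharsetAlias.
// It only knows the single mapping named in the documentation.
func charsetAlias(charset string) (alias string, ok bool) {
	if charset == "utf8" {
		return "utf8mb3", true
	}
	return "", false // assumption: no alias is reported as ("", false)
}

// collationAlias is a hypothetical stand-in for Environment.CollationAlias.
// It rewrites any collation name with the "utf8_" prefix to "utf8mb3_".
func collationAlias(collation string) (alias string, ok bool) {
	const oldPrefix, newPrefix = "utf8_", "utf8mb3_"
	if strings.HasPrefix(collation, oldPrefix) {
		return newPrefix + strings.TrimPrefix(collation, oldPrefix), true
	}
	return "", false
}

func main() {
	fmt.Println(charsetAlias("utf8"))              // prints "utf8mb3 true"
	fmt.Println(collationAlias("utf8_general_ci")) // prints "utf8mb3_general_ci true"
	fmt.Println(collationAlias("utf8mb4_bin"))     // prints " false"
}
```

Callers that go through the Environment methods rather than hard-coding a mapping like this one pick up whatever mapping is correct for the configured MySQL version.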

func (*Environment) DefaultCollationForCharset

func (env *Environment) DefaultCollationForCharset(charset string) ID

DefaultCollationForCharset returns the default collation for a charset.

func (*Environment) DefaultConnectionCharset

func (env *Environment) DefaultConnectionCharset() ID

DefaultConnectionCharset is the default charset that Vitess will use when negotiating a charset in a MySQL connection handshake. Note that in this context, a 'charset' is equivalent to a Collation ID, with the exception that it can only fit in 1 byte. For MySQL 8.0+ environments, the default charset is `utf8mb4_0900_ai_ci`. For older MySQL environments, the default charset is `utf8mb4_general_ci`.

func (*Environment) EnsureCollate

func (env *Environment) EnsureCollate(fromID, toID ID) error

func (*Environment) IsSupported

func (env *Environment) IsSupported(coll ID) bool

func (*Environment) LookupByCharset

func (env *Environment) LookupByCharset(name string) *colldefaults

func (*Environment) LookupByName

func (env *Environment) LookupByName(name string) ID

LookupByName returns the collation with the given name.

func (*Environment) LookupCharsetName

func (env *Environment) LookupCharsetName(coll ID) string

func (*Environment) LookupID

func (env *Environment) LookupID(name string) (ID, bool)

LookupID returns the collation ID for the given name, and whether the collation is supported by this package.

func (*Environment) LookupName

func (env *Environment) LookupName(id ID) string

LookupName returns the collation name for the given ID.

func (*Environment) ParseConnectionCharset

func (env *Environment) ParseConnectionCharset(csname string) (ID, error)

ParseConnectionCharset parses the given charset name and returns its numerical identifier to be used in a MySQL connection handshake. The charset name can be:

  • the name of a character set, in which case the default collation ID for the character set is returned.
  • the name of a collation, in which case the ID for the collation is returned, UNLESS the collation itself has an ID greater than 255; such collations are not supported because they cannot be negotiated in a single byte in our connection handshake.
  • empty, in which case the default connection charset for this MySQL version is returned.
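
The three cases can be sketched as a small standalone function. Everything here is an assumption made for illustration: `parseConnectionCharset`, the two lookup tables, and the `example_wide_collation` entry are hypothetical, and the real method consults the Environment's collation database instead. The numeric IDs for utf8mb4, latin1, binary, and utf8mb4_bin are taken from the constants listed at the top of this page.

```go
package main

import "fmt"

// ID mirrors the documented collation identifier type.
type ID uint16

// Illustrative tables only; they are not the package's data.
var defaultCollationByCharset = map[string]ID{
	"utf8mb4": 255,
	"latin1":  8,
	"binary":  63,
}

var collationIDByName = map[string]ID{
	"utf8mb4_bin":            46,
	"latin1_swedish_ci":      8,
	"example_wide_collation": 300, // hypothetical collation with an ID above 255
}

// defaultConnectionCharset assumes a MySQL 8.0+ environment.
const defaultConnectionCharset ID = 255

// parseConnectionCharset is a hypothetical sketch of the documented rules.
func parseConnectionCharset(name string) (ID, error) {
	// Empty name: fall back to the default connection charset.
	if name == "" {
		return defaultConnectionCharset, nil
	}
	// Character set name: use its default collation.
	if id, ok := defaultCollationByCharset[name]; ok {
		return id, nil
	}
	// Collation name: accept it only if the ID fits in a single byte.
	id, ok := collationIDByName[name]
	if !ok {
		return 0, fmt.Errorf("unknown charset or collation %q", name)
	}
	if id > 255 {
		return 0, fmt.Errorf("collation %q (ID %d) does not fit in one handshake byte", name, id)
	}
	return id, nil
}

func main() {
	for _, name := range []string{"", "latin1", "utf8mb4_bin", "example_wide_collation"} {
		id, err := parseConnectionCharset(name)
		fmt.Printf("%q -> id=%d err=%v\n", name, id, err)
	}
}
```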

type ID

type ID uint16

ID is a numeric identifier for a collation. These identifiers are defined by MySQL, not by Vitess.

const Unknown ID = 0

Unknown is the default ID for an unknown collation.

func CollationForType

func CollationForType(t sqltypes.Type, fallback ID) ID

type Repertoire

type Repertoire byte

Repertoire is a constant that defines the collection of characters in an expression. MySQL only distinguishes between an ASCII repertoire (i.e. an expression where all the contained codepoints are < 128) and a Unicode repertoire (an expression that can contain any possible codepoint).

See: https://dev.mysql.com/doc/refman/8.0/en/charset-repertoire.html

const (
	RepertoireASCII Repertoire = iota
	RepertoireUnicode
)
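
The distinction between the two repertoires can be illustrated with a small check. This is a sketch under assumptions: `repertoireOf` is a hypothetical helper, not part of the package, and it assumes the input is UTF-8 encoded, so any byte with the high bit set belongs to a codepoint of 128 or above.

```go
package main

import "fmt"

// Repertoire is redeclared here so the sketch is self-contained.
type Repertoire byte

const (
	RepertoireASCII Repertoire = iota
	RepertoireUnicode
)

// repertoireOf is a hypothetical helper that classifies a UTF-8 string.
// It returns RepertoireASCII only if every codepoint is below 128.
func repertoireOf(s string) Repertoire {
	for i := 0; i < len(s); i++ {
		if s[i] >= 0x80 {
			return RepertoireUnicode
		}
	}
	return RepertoireASCII
}

func main() {
	fmt.Println(repertoireOf("hello") == RepertoireASCII)   // prints "true"
	fmt.Println(repertoireOf("héllo") == RepertoireUnicode) // prints "true"
}
```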

type TypedCollation

type TypedCollation struct {
	Collation    ID
	Coercibility Coercibility
	Repertoire   Repertoire
}

TypedCollation is the Collation of a SQL expression, including its coercibility and repertoire.

func (TypedCollation) Valid

func (tc TypedCollation) Valid() bool

Directories

Path Synopsis
korean
Package korean provides Korean encodings such as EUC-KR.
simplifiedchinese
Package simplifiedchinese provides Simplified Chinese encodings such as GBK.
internal
uca
tools
text is a repository of text-related packages related to internationalization (i18n) and localization (l10n), such as character encodings, text transformations, and locale-specific text handling.
collate
Package collate contains types for comparing and sorting Unicode strings according to a given collation order.
unicode
unicode holds packages with implementations of Unicode standards that are mostly used as building blocks for other packages in github.com/estuary/vitess/go/mysql/collations/vindex, layout engines, or are otherwise more low-level in nature.
unicode/norm
Package norm contains types and functions for normalizing Unicode strings.
