Documentation ¶
Overview ¶
Package language implements BCP 47 language tags and related functionality.
The Tag type, which is used to represent language tags, is agnostic to the meaning of its subtags. To preserve information that may be valuable in certain contexts, tags are not fully canonicalized. As a consequence, two different tags may represent the same language in some contexts.
To determine equivalence between tags, a user should typically use a Matcher that is aware of the intricacies of equivalence within the given context. The default Matcher implementation provided in this package takes into account things such as deprecated subtags, legacy tags, and mutual intelligibility between scripts and languages.
See http://tools.ietf.org/html/bcp47 for more details.
NOTE: This package is still under development. Parts of it are not yet implemented, and the API is subject to change.
Index ¶
- Constants
- Variables
- func CompactIndex(t Tag) (index int, ok bool)
- type Base
- type CanonType
- type Confidence
- type Coverage
- type Currency
- type Extension
- type Matcher
- type Region
- func (r Region) Canonicalize() Region
- func (r Region) Contains(c Region) bool
- func (r Region) ISO3() string
- func (r Region) IsCountry() bool
- func (r Region) IsGroup() bool
- func (r Region) IsPrivateUse() bool
- func (r Region) M49() int
- func (r Region) String() string
- func (r Region) TLD() (Region, error)
- type Script
- type Tag
- func (t Tag) Base() (Base, Confidence)
- func (t Tag) ComprehensibleTo(speaker Tag) Confidence
- func (t Tag) Extension(x byte) (ext Extension, ok bool)
- func (t Tag) Extensions() []Extension
- func (t Tag) IsRoot() bool
- func (t Tag) Parent() Tag
- func (t Tag) Raw() (b Base, s Script, r Region)
- func (t Tag) Region() (Region, Confidence)
- func (t Tag) Script() (Script, Confidence)
- func (t Tag) SetTypeForKey(key, value string) (Tag, error)
- func (t Tag) String() string
- func (t Tag) TypeForKey(key string) string
- func (t Tag) Variants() []Variant
- type ValueError
- type Variant
Constants ¶
const (
	// Replace deprecated base languages with their preferred replacements.
	DeprecatedBase CanonType = 1 << iota
	// Replace deprecated scripts with their preferred replacements.
	DeprecatedScript
	// Replace deprecated regions with their preferred replacements.
	DeprecatedRegion
	// Remove redundant scripts.
	SuppressScript
	// Normalize legacy encodings. This includes legacy languages defined in
	// CLDR as well as bibliographic codes defined in ISO-639.
	Legacy
	// Map the dominant language of a macro language group to the macro language
	// subtag. For example cmn -> zh.
	Macro
	// The CLDR flag should be used if full compatibility with CLDR is required.
	// There are a few cases where language.Tag may differ from CLDR. To follow all
	// of CLDR's suggestions, use All|CLDR.
	CLDR
	// Raw can be used to Compose or Parse without Canonicalization.
	Raw CanonType = 0
	// Replace all deprecated tags with their preferred replacements.
	Deprecated = DeprecatedBase | DeprecatedScript | DeprecatedRegion
	// All canonicalizations recommended by BCP 47.
	BCP47 = Deprecated | SuppressScript
	// All canonicalizations.
	All = BCP47 | Legacy | Macro
	// Default is the canonicalization used by Parse, Make and Compose. To
	// preserve as much information as possible, canonicalizations that remove
	// potentially valuable information are not included. The Matcher is
	// designed to recognize similar tags that would be the same if
	// they were canonicalized using All.
	Default = Deprecated | Legacy
)
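The constants above are bit flags, so composite values such as Default and All are plain bitwise ORs of the individual canonicalizations. The following is a minimal standalone sketch of that composition: it copies the constant block into a local program rather than importing the package, and the helper has is a hypothetical name introduced only for illustration.

```go
package main

import "fmt"

// CanonType is a local copy of the flag type, for illustration only.
type CanonType int

const (
	DeprecatedBase CanonType = 1 << iota
	DeprecatedScript
	DeprecatedRegion
	SuppressScript
	Legacy
	Macro
	CLDR

	Raw        CanonType = 0
	Deprecated           = DeprecatedBase | DeprecatedScript | DeprecatedRegion
	BCP47                = Deprecated | SuppressScript
	All                  = BCP47 | Legacy | Macro
	Default              = Deprecated | Legacy
)

// has reports whether every bit of flag is set in c.
// It is a hypothetical helper, not part of the documented API.
func has(c, flag CanonType) bool {
	return flag != 0 && c&flag == flag
}

func main() {
	fmt.Println(has(Default, Legacy))         // true
	fmt.Println(has(Default, SuppressScript)) // false
	fmt.Println(has(All|CLDR, CLDR))          // true
}
```

Under this reading, Default omits SuppressScript and Macro, which matches the stated goal of not discarding potentially valuable information.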
const NumCompactTags = 409
NumCompactTags is the number of common tags. The maximum tag is NumCompactTags-1.
const Version = "27.0.1"
Version is the version of CLDR used to generate the data in this package.
Variables ¶
var ErrMissingLikelyTagsData = errors.New("missing likely tags data")
ErrMissingLikelyTagsData indicates no information was available to compute likely values of missing tags.
Functions ¶
func CompactIndex ¶
CompactIndex returns an index, where 0 <= index < NumCompactTags, for tags for which data exists in the text repository. The index will change over time and should not be stored in persistent storage. Extensions other than 'x' are not considered for determining the index. It will return 0, false if no compact tag exists, where 0 is the index for the root language (Und).
Types ¶
type Base ¶
type Base struct {
// contains filtered or unexported fields
}
Base is an ISO 639 language code, used for encoding the base language of a language tag.
func MustParseBase ¶
MustParseBase is like ParseBase, but panics if the given base cannot be parsed. It simplifies safe initialization of Base values.
func ParseBase ¶
ParseBase parses a 2- or 3-letter ISO 639 code. It returns a ValueError if s is a well-formed but unknown language identifier, or a different error if parsing fails for any other reason.
func (Base) IsPrivateUse ¶
func (b Base) IsPrivateUse() bool
IsPrivateUse reports whether this language code is reserved for private use.
type CanonType ¶
type CanonType int
CanonType can be used to enable or disable various types of canonicalization.
func (CanonType) Canonicalize ¶
Canonicalize returns the canonicalized equivalent of the tag.
func (CanonType) Compose ¶
Compose creates a Tag from individual parts, which may be of type Tag, Base, Script, Region, Variant, []Variant, Extension, []Extension or error. If a Base, Script or Region or slice of type Variant or Extension is passed more than once, the latter will overwrite the former. Variants and Extensions are accumulated, but if two extensions of the same type are passed, the latter will replace the former. A Tag overwrites all former values and typically only makes sense as the first argument. The resulting tag is returned after canonicalizing using CanonType c. If one or more errors are encountered, one of the errors is returned.
func (CanonType) Make ¶
Make is a convenience wrapper for c.Parse that omits the error. In case of an error, a sensible default is returned.
func (CanonType) MustParse ¶
MustParse is like Parse, but panics if the given BCP 47 tag cannot be parsed. It simplifies safe initialization of Tag values.
func (CanonType) Parse ¶
Parse parses the given BCP 47 string and returns a valid Tag. If parsing failed it returns an error and any part of the tag that could be parsed. If parsing succeeded but an unknown value was found, it returns ValueError. The Tag returned in this case is just stripped of the unknown value. All other values are preserved. It accepts tags in the BCP 47 format and extensions to this standard defined in http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers. The resulting tag is canonicalized using the canonicalization type c.
type Confidence ¶
type Confidence int
Confidence indicates the level of certainty for a given return value. For example, Serbian may be written in Cyrillic or Latin script. The confidence level indicates whether a value was explicitly specified, whether it is typically the only possible value, or whether there is an ambiguity.
const (
	No    Confidence = iota // full confidence that there was no match
	Low                     // most likely value picked out of a set of alternatives
	High                    // value is generally assumed to be the correct match
	Exact                   // exact match or explicitly specified value
)
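Because the Confidence values are declared in increasing order of certainty, a caller can compare them against a minimum level. The sketch below is a standalone copy of the constants, not an import of the package, and acceptable is a hypothetical helper name.

```go
package main

import "fmt"

// Confidence is a local copy of the type, for illustration only.
type Confidence int

const (
	No Confidence = iota
	Low
	High
	Exact
)

// acceptable reports whether c is at least as certain as threshold.
// It is a hypothetical helper, not part of the documented API.
func acceptable(c, threshold Confidence) bool {
	return c >= threshold
}

func main() {
	fmt.Println(acceptable(Exact, High)) // true
	fmt.Println(acceptable(Low, High))   // false
}
```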
func (Confidence) String ¶
func (c Confidence) String() string
type Coverage ¶
type Coverage interface {
	// Tags returns the list of supported tags.
	Tags() []Tag
	// BaseLanguages returns the list of supported base languages.
	BaseLanguages() []Base
	// Scripts returns the list of supported scripts.
	Scripts() []Script
	// Regions returns the list of supported regions.
	Regions() []Region
	// Currencies returns the list of supported currencies.
	Currencies() []Currency
}
The Coverage interface is used to define the level of coverage of an internationalization service. Note that not all types are supported by all services. As lists may be generated on the fly, it is recommended that users of a Coverage cache the results.
var (
	// Supported defines a Coverage that lists all supported subtags. Tags
	// always returns nil.
	Supported Coverage = allSubtags{}
)
func NewCoverage ¶
func NewCoverage(list ...interface{}) Coverage
NewCoverage returns a Coverage for the given lists. It is typically used by packages providing internationalization services to define their level of coverage. A list may be of type []T or func() []T, where T is either Tag, Base, Script or Region. The returned Coverage derives the value for Bases from Tags if no func or slice for []Base is specified. For other unspecified types the returned Coverage will return nil for the respective methods.
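Accepting either a slice or a function that produces one is commonly handled with a type switch. The following is a minimal sketch of that pattern only; it uses string as a stand-in for the element type T, and resolveList is a hypothetical name that does not reflect how NewCoverage is actually implemented.

```go
package main

import "fmt"

// resolveList accepts either a ready-made slice or a function that builds
// one on demand, and reports whether the argument had a supported type.
// Hypothetical helper; string stands in for Tag, Base, Script or Region.
func resolveList(list interface{}) ([]string, bool) {
	switch v := list.(type) {
	case []string:
		return v, true
	case func() []string:
		// The list is generated on the fly, so callers may want to
		// cache the result instead of calling this repeatedly.
		return v(), true
	}
	return nil, false
}

func main() {
	static, _ := resolveList([]string{"en", "fr"})
	lazy, _ := resolveList(func() []string { return []string{"Latn", "Cyrl"} })
	_, ok := resolveList(42)

	fmt.Println(static) // [en fr]
	fmt.Println(lazy)   // [Latn Cyrl]
	fmt.Println(ok)     // false
}
```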
type Currency ¶
type Currency struct {
// contains filtered or unexported fields
}
Currency is deprecated. Use package golang.org/x/text/currency.
func MustParseCurrency ¶
MustParseCurrency is like ParseCurrency, but panics if the given currency cannot be parsed. It simplifies safe initialization of Currency values.
func ParseCurrency ¶
ParseCurrency parses a 3-letter ISO 4217 code. It returns a ValueError if s is a well-formed but unknown currency identifier, or a different error if parsing fails for any other reason.
type Extension ¶
type Extension struct {
// contains filtered or unexported fields
}
Extension is a single BCP 47 extension.
func ParseExtension ¶
ParseExtension parses s as an extension and returns it on success.
func (Extension) String ¶
String returns the string representation of the extension, including the type tag.
type Matcher ¶
type Matcher interface {
Match(t ...Tag) (tag Tag, index int, c Confidence)
}
Matcher is the interface that wraps the Match method.
Match returns the best match for any of the given tags, along with a unique index associated with the returned tag and a confidence score.
func NewMatcher ¶
NewMatcher returns a Matcher that finds the best match for a tag based on written intelligibility. The index returned by the Match method corresponds to the index of the matched tag in the given list. The first element is used as the default value in case no good match is found.
Its Match method matches the first of the given Tags to reach a certain confidence threshold. The tags passed to Match should therefore be specified in order of preference. Various factors such as deprecated variants of tags, legacy mappings and information based on mutual intelligibility defined in CLDR are considered to determine equivalence.
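The contract described here (preferences tried in order, an index into the supported list, the first supported tag as the fallback) can be illustrated with a deliberately naive stand-in. The sketch below compares tags only as case-insensitive strings and returns a bool instead of a Confidence; simpleMatcher is a hypothetical type, and none of the CLDR-based equivalence logic of the real Matcher is reproduced.

```go
package main

import (
	"fmt"
	"strings"
)

// simpleMatcher is a hypothetical stand-in that only knows exact,
// case-insensitive string equality between tags.
type simpleMatcher struct {
	supported []string
}

// Match walks the preferred tags in order and returns the first one that is
// also supported, together with its index in the supported list. If nothing
// matches, it falls back to the first supported tag and reports ok == false.
func (m simpleMatcher) Match(preferred ...string) (tag string, index int, ok bool) {
	if len(m.supported) == 0 {
		return "", 0, false
	}
	for _, p := range preferred {
		for i, s := range m.supported {
			if strings.EqualFold(p, s) {
				return s, i, true
			}
		}
	}
	return m.supported[0], 0, false
}

func main() {
	m := simpleMatcher{supported: []string{"en", "fr", "nl"}}

	fmt.Println(m.Match("de", "FR")) // fr 1 true
	fmt.Println(m.Match("ja"))       // en 0 false
}
```

Returning the index lets a caller keep per-language resources in a slice parallel to the supported list and select from it directly.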
type Region ¶
type Region struct {
// contains filtered or unexported fields
}
Region is an ISO 3166-1 or UN M.49 code for representing countries and regions.
func EncodeM49 ¶
EncodeM49 returns the Region for the given UN M.49 code. It returns an error if r is not a valid code.
func MustParseRegion ¶
MustParseRegion is like ParseRegion, but panics if the given region cannot be parsed. It simplifies safe initialization of Region values.
func ParseRegion ¶
ParseRegion parses a 2- or 3-letter ISO 3166-1 or a UN M.49 code. It returns a ValueError if s is a well-formed but unknown region identifier, or a different error if parsing fails for any other reason.
func (Region) Canonicalize ¶
Canonicalize returns the region or a possible replacement if the region is deprecated. It will not return a replacement for deprecated regions that are split into multiple regions.
func (Region) Contains ¶
Contains returns whether Region c is contained by Region r. It returns true if c == r.
func (Region) ISO3 ¶
func (r Region) ISO3() string
ISO3 returns the 3-letter ISO code of r. Note that not all regions have a 3-letter ISO code. In such cases this method returns "ZZZ".
func (Region) IsCountry ¶
IsCountry returns whether this region is a country or autonomous area. This includes non-standard definitions from CLDR.
func (Region) IsGroup ¶
IsGroup returns whether this region defines a collection of regions. This includes non-standard definitions from CLDR.
func (Region) IsPrivateUse ¶
func (r Region) IsPrivateUse() bool
IsPrivateUse reports whether r has the ISO 3166 User-assigned status. This may include private-use tags that are assigned by CLDR and used in this implementation. So IsPrivateUse and IsCountry can be simultaneously true.
func (Region) M49 ¶
func (r Region) M49() int
M49 returns the UN M.49 encoding of r, or 0 if this encoding is not defined for r.
func (Region) String ¶
func (r Region) String() string
String returns the BCP 47 representation for the region. It returns "ZZ" for an unspecified region.
func (Region) TLD ¶
TLD returns the country code top-level domain (ccTLD). UK is returned for GB. In all other cases it returns either the region itself or an error.
This method may return an error for a region for which there exists a canonical form with a ccTLD. To get that ccTLD, canonicalize r first. The region will already be canonicalized if it was obtained from a Tag that was obtained using any of the default methods.
type Script ¶
type Script struct {
// contains filtered or unexported fields
}
Script is a 4-letter ISO 15924 code for representing scripts. It is idiomatically represented in title case.
func MustParseScript ¶
MustParseScript is like ParseScript, but panics if the given script cannot be parsed. It simplifies safe initialization of Script values.
func ParseScript ¶
ParseScript parses a 4-letter ISO 15924 code. It returns a ValueError if s is a well-formed but unknown script identifier, or a different error if parsing fails for any other reason.
func (Script) IsPrivateUse ¶
func (s Script) IsPrivateUse() bool
IsPrivateUse reports whether this script code is reserved for private use.
type Tag ¶
type Tag struct {
// contains filtered or unexported fields
}
Tag represents a BCP 47 language tag. It is used to specify an instance of a specific language or locale. All language tag values are guaranteed to be well-formed.
var (
	Und Tag = Tag{}
	Afrikaans Tag = Tag{/* contains filtered or unexported fields */} // af
	Amharic Tag = Tag{/* contains filtered or unexported fields */} // am
	Arabic Tag = Tag{/* contains filtered or unexported fields */} // ar
	ModernStandardArabic Tag = Tag{/* contains filtered or unexported fields */} // ar-001
	Azerbaijani Tag = Tag{/* contains filtered or unexported fields */} // az
	Bulgarian Tag = Tag{/* contains filtered or unexported fields */} // bg
	Bengali Tag = Tag{/* contains filtered or unexported fields */} // bn
	Catalan Tag = Tag{/* contains filtered or unexported fields */} // ca
	Czech Tag = Tag{/* contains filtered or unexported fields */} // cs
	Danish Tag = Tag{/* contains filtered or unexported fields */} // da
	German Tag = Tag{/* contains filtered or unexported fields */} // de
	Greek Tag = Tag{/* contains filtered or unexported fields */} // el
	English Tag = Tag{/* contains filtered or unexported fields */} // en
	AmericanEnglish Tag = Tag{/* contains filtered or unexported fields */} // en-US
	BritishEnglish Tag = Tag{/* contains filtered or unexported fields */} // en-GB
	Spanish Tag = Tag{/* contains filtered or unexported fields */} // es
	EuropeanSpanish Tag = Tag{/* contains filtered or unexported fields */} // es-ES
	LatinAmericanSpanish Tag = Tag{/* contains filtered or unexported fields */} // es-419
	Estonian Tag = Tag{/* contains filtered or unexported fields */} // et
	Persian Tag = Tag{/* contains filtered or unexported fields */} // fa
	Finnish Tag = Tag{/* contains filtered or unexported fields */} // fi
	Filipino Tag = Tag{/* contains filtered or unexported fields */} // fil
	French Tag = Tag{/* contains filtered or unexported fields */} // fr
	CanadianFrench Tag = Tag{/* contains filtered or unexported fields */} // fr-CA
	Gujarati Tag = Tag{/* contains filtered or unexported fields */} // gu
	Hebrew Tag = Tag{/* contains filtered or unexported fields */} // he
	Hindi Tag = Tag{/* contains filtered or unexported fields */} // hi
	Croatian Tag = Tag{/* contains filtered or unexported fields */} // hr
	Hungarian Tag = Tag{/* contains filtered or unexported fields */} // hu
	Armenian Tag = Tag{/* contains filtered or unexported fields */} // hy
	Indonesian Tag = Tag{/* contains filtered or unexported fields */} // id
	Icelandic Tag = Tag{/* contains filtered or unexported fields */} // is
	Italian Tag = Tag{/* contains filtered or unexported fields */} // it
	Japanese Tag = Tag{/* contains filtered or unexported fields */} // ja
	Georgian Tag = Tag{/* contains filtered or unexported fields */} // ka
	Kazakh Tag = Tag{/* contains filtered or unexported fields */} // kk
	Khmer Tag = Tag{/* contains filtered or unexported fields */} // km
	Kannada Tag = Tag{/* contains filtered or unexported fields */} // kn
	Korean Tag = Tag{/* contains filtered or unexported fields */} // ko
	Kirghiz Tag = Tag{/* contains filtered or unexported fields */} // ky
	Lao Tag = Tag{/* contains filtered or unexported fields */} // lo
	Lithuanian Tag = Tag{/* contains filtered or unexported fields */} // lt
	Latvian Tag = Tag{/* contains filtered or unexported fields */} // lv
	Macedonian Tag = Tag{/* contains filtered or unexported fields */} // mk
	Malayalam Tag = Tag{/* contains filtered or unexported fields */} // ml
	Mongolian Tag = Tag{/* contains filtered or unexported fields */} // mn
	Marathi Tag = Tag{/* contains filtered or unexported fields */} // mr
	Malay Tag = Tag{/* contains filtered or unexported fields */} // ms
	Burmese Tag = Tag{/* contains filtered or unexported fields */} // my
	Nepali Tag = Tag{/* contains filtered or unexported fields */} // ne
	Dutch Tag = Tag{/* contains filtered or unexported fields */} // nl
	Norwegian Tag = Tag{/* contains filtered or unexported fields */} // no
	Punjabi Tag = Tag{/* contains filtered or unexported fields */} // pa
	Polish Tag = Tag{/* contains filtered or unexported fields */} // pl
	Portuguese Tag = Tag{/* contains filtered or unexported fields */} // pt
	BrazilianPortuguese Tag = Tag{/* contains filtered or unexported fields */} // pt-BR
	EuropeanPortuguese Tag = Tag{/* contains filtered or unexported fields */} // pt-PT
	Romanian Tag = Tag{/* contains filtered or unexported fields */} // ro
	Russian Tag = Tag{/* contains filtered or unexported fields */} // ru
	Sinhala Tag = Tag{/* contains filtered or unexported fields */} // si
	Slovak Tag = Tag{/* contains filtered or unexported fields */} // sk
	Slovenian Tag = Tag{/* contains filtered or unexported fields */} // sl
	Albanian Tag = Tag{/* contains filtered or unexported fields */} // sq
	Serbian Tag = Tag{/* contains filtered or unexported fields */} // sr
	SerbianLatin Tag = Tag{/* contains filtered or unexported fields */} // sr-Latn
	Swedish Tag = Tag{/* contains filtered or unexported fields */} // sv
	Swahili Tag = Tag{/* contains filtered or unexported fields */} // sw
	Tamil Tag = Tag{/* contains filtered or unexported fields */} // ta
	Telugu Tag = Tag{/* contains filtered or unexported fields */} // te
	Thai Tag = Tag{/* contains filtered or unexported fields */} // th
	Turkish Tag = Tag{/* contains filtered or unexported fields */} // tr
	Ukrainian Tag = Tag{/* contains filtered or unexported fields */} // uk
	Urdu Tag = Tag{/* contains filtered or unexported fields */} // ur
	Uzbek Tag = Tag{/* contains filtered or unexported fields */} // uz
	Vietnamese Tag = Tag{/* contains filtered or unexported fields */} // vi
	Chinese Tag = Tag{/* contains filtered or unexported fields */} // zh
	SimplifiedChinese Tag = Tag{/* contains filtered or unexported fields */} // zh-Hans
	TraditionalChinese Tag = Tag{/* contains filtered or unexported fields */} // zh-Hant
	Zulu Tag = Tag{/* contains filtered or unexported fields */} // zu
)
func Compose ¶
Compose creates a Tag from individual parts, which may be of type Tag, Base, Script, Region, Variant, []Variant, Extension, []Extension or error. If a Base, Script or Region or slice of type Variant or Extension is passed more than once, the latter will overwrite the former. Variants and Extensions are accumulated, but if two extensions of the same type are passed, the latter will replace the former. A Tag overwrites all former values and typically only makes sense as the first argument. The resulting tag is returned after canonicalizing using the Default CanonType. If one or more errors are encountered, one of the errors is returned.
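The overwrite rule for repeated parts can be pictured as a type switch over a variadic argument list. The sketch below is a simplified, standalone illustration: the types Base, Script, Region and tag are local stand-ins built on strings, variants and extensions are left out, no canonicalization is applied, and returning the first error seen is an assumption (the text above only promises that one of the errors is returned).

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-in types, for illustration only.
type (
	Base   string
	Script string
	Region string
)

// tag is a hypothetical, simplified result type.
type tag struct {
	base   Base
	script Script
	region Region
}

// compose applies the parts from left to right, so a later Base, Script or
// Region overwrites an earlier one and a whole tag replaces everything
// collected so far. The first error seen is returned.
func compose(parts ...interface{}) (tag, error) {
	var t tag
	var firstErr error
	for _, p := range parts {
		switch v := p.(type) {
		case nil:
			// A nil error passed by the caller is ignored.
		case tag:
			t = v
		case Base:
			t.base = v
		case Script:
			t.script = v
		case Region:
			t.region = v
		case error:
			if firstErr == nil {
				firstErr = v
			}
		default:
			if firstErr == nil {
				firstErr = fmt.Errorf("unsupported part type %T", p)
			}
		}
	}
	return t, firstErr
}

func main() {
	t, err := compose(Base("en"), Region("US"), Region("GB"))
	fmt.Println(t.base, t.region, err) // en GB <nil>

	_, err = compose(Base("en"), errors.New("bad region"))
	fmt.Println(err) // bad region
}
```

Allowing error values in the argument list lets a caller pass the results of several parse calls straight through and check for failure once at the end.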
func Make ¶
Make is a convenience wrapper for Parse that omits the error. In case of an error, a sensible default is returned.
func MustParse ¶
MustParse is like Parse, but panics if the given BCP 47 tag cannot be parsed. It simplifies safe initialization of Tag values.
func Parse ¶
Parse parses the given BCP 47 string and returns a valid Tag. If parsing failed it returns an error and any part of the tag that could be parsed. If parsing succeeded but an unknown value was found, it returns ValueError. The Tag returned in this case is just stripped of the unknown value. All other values are preserved. It accepts tags in the BCP 47 format and extensions to this standard defined in http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers. The resulting tag is canonicalized using the default canonicalization type.
func ParseAcceptLanguage ¶
ParseAcceptLanguage parses the contents of an Accept-Language header as defined in http://www.ietf.org/rfc/rfc2616.txt and returns a list of Tags and a list of corresponding quality weights. The Tags will be sorted by highest weight first and then by first occurrence. Tags with a weight of zero will be dropped. An error will be returned if the input could not be parsed.
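The ordering rules (highest weight first, ties broken by first occurrence, zero weights dropped) amount to a filter followed by a stable sort. The sketch below illustrates them with a hypothetical, simplified parser that returns plain strings instead of Tags, accepts only an optional q parameter, and does not validate the tags themselves; it is not the package's implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseAcceptLanguage is a hypothetical, simplified parser for headers of
// the form "da, en-gb;q=0.8, en;q=0.7".
func parseAcceptLanguage(header string) (tags []string, weights []float64, err error) {
	type entry struct {
		tag string
		q   float64
	}
	var entries []entry
	for _, item := range strings.Split(header, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		fields := strings.SplitN(item, ";", 2)
		e := entry{tag: strings.TrimSpace(fields[0]), q: 1.0} // weight defaults to 1
		if e.tag == "" {
			return nil, nil, fmt.Errorf("missing tag in %q", item)
		}
		if len(fields) == 2 {
			param := strings.TrimSpace(fields[1])
			if !strings.HasPrefix(param, "q=") {
				return nil, nil, fmt.Errorf("malformed parameter %q", param)
			}
			q, perr := strconv.ParseFloat(strings.TrimPrefix(param, "q="), 64)
			if perr != nil || !(q >= 0 && q <= 1) {
				return nil, nil, fmt.Errorf("invalid weight in %q", item)
			}
			e.q = q
		}
		if e.q == 0 {
			continue // zero-weight tags are dropped
		}
		entries = append(entries, e)
	}
	// A stable sort keeps first occurrence order among equal weights.
	sort.SliceStable(entries, func(i, j int) bool { return entries[i].q > entries[j].q })
	for _, e := range entries {
		tags = append(tags, e.tag)
		weights = append(weights, e.q)
	}
	return tags, weights, nil
}

func main() {
	tags, weights, err := parseAcceptLanguage("da, en-gb;q=0.8, en;q=0.7, fr;q=0")
	fmt.Println(tags, weights, err) // [da en-gb en] [1 0.8 0.7] <nil>
}
```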
func (Tag) Base ¶
func (t Tag) Base() (Base, Confidence)
Base returns the base language of the language tag. If the base language is unspecified, an attempt will be made to infer it from the context. It uses a variant of CLDR's Add Likely Subtags algorithm. This is subject to change.
func (Tag) ComprehensibleTo ¶
func (t Tag) ComprehensibleTo(speaker Tag) Confidence
ComprehensibleTo returns the confidence score for speaker being able to comprehend the (written) language t. It uses a Matcher under the hood.
func (Tag) Extension ¶
Extension returns the extension of type x for tag t. It will return false for ok if t does not have the requested extension. The returned extension will be invalid in this case.
func (Tag) Extensions ¶
Extensions returns all extensions of t.
func (Tag) Parent ¶
Parent returns the CLDR parent of t. In CLDR, missing fields in data for a specific language are substituted with fields from the parent language. The parent for a language may change for newer versions of CLDR.
func (Tag) Raw ¶
Raw returns the raw base language, script and region, without making an attempt to infer their values.
func (Tag) Region ¶
func (t Tag) Region() (Region, Confidence)
Region returns the region for the language tag. If it was not explicitly given, it will infer a most likely candidate from the context. It uses a variant of CLDR's Add Likely Subtags algorithm. This is subject to change.
func (Tag) Script ¶
func (t Tag) Script() (Script, Confidence)
Script infers the script for the language tag. If it was not explicitly given, it will infer a most likely candidate. If more than one script is commonly used for a language, the most likely one is returned with a low confidence indication. For example, it returns (Cyrl, Low) for Serbian.
If a script cannot be inferred, (Zzzz, No) is returned. We do not use Zyyy (undetermined) as one would suspect from the IANA registry for BCP 47. In a Unicode context Zyyy marks common characters (like 1, 2, 3, '.', etc.) and is therefore more like multiple scripts. See http://www.unicode.org/reports/tr24/#Values for more details. Zzzz is also used for unknown values in CLDR. (Zzzz, Exact) is returned if Zzzz was explicitly specified.
Note that an inferred script is never guaranteed to be the correct one. Latin is almost exclusively used for Afrikaans, but Arabic has been used for some texts in the past. Also, the script that is commonly used may change over time.
It uses a variant of CLDR's Add Likely Subtags algorithm. This is subject to change.
func (Tag) SetTypeForKey ¶
SetTypeForKey returns a new Tag with the key set to type, where key and type are of the allowed values defined for the Unicode locale extension ('u') in http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers. An empty value removes an existing pair with the same key.
func (Tag) TypeForKey ¶
TypeForKey returns the type associated with the given key, where key and type are of the allowed values defined for the Unicode locale extension ('u') in http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers. TypeForKey will traverse the inheritance chain to get the correct value.
type ValueError ¶
type ValueError struct {
// contains filtered or unexported fields
}
ValueError is returned by any of the parsing functions when the input is well-formed but the respective subtag is not recognized as a valid value.
func (ValueError) Subtag ¶
func (e ValueError) Subtag() string
Subtag returns the subtag for which the error occurred.
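A caller typically wants to tell "well-formed but unknown" apart from "malformed". The sketch below shows one way to make that distinction with errors.As; valueError, parseBase and the known table are hypothetical stand-ins defined locally, and the well-formedness check is a simplification that assumes two or three lowercase ASCII letters.

```go
package main

import (
	"errors"
	"fmt"
)

// valueError is a hypothetical stand-in for the documented ValueError.
type valueError struct{ subtag string }

func (e valueError) Error() string  { return fmt.Sprintf("unknown value %q", e.subtag) }
func (e valueError) Subtag() string { return e.subtag }

// known is a tiny illustrative table, not real registry data.
var known = map[string]bool{"en": true, "fr": true, "nl": true}

// parseBase returns a valueError for well-formed but unknown codes and a
// plain error for malformed input.
func parseBase(s string) (string, error) {
	if len(s) < 2 || len(s) > 3 {
		return "", errors.New("malformed base language")
	}
	for _, r := range s {
		if r < 'a' || r > 'z' {
			return "", errors.New("malformed base language")
		}
	}
	if !known[s] {
		return "", valueError{subtag: s}
	}
	return s, nil
}

func main() {
	for _, in := range []string{"fr", "xx", "x"} {
		_, err := parseBase(in)
		var ve valueError
		switch {
		case err == nil:
			fmt.Println(in, "ok")
		case errors.As(err, &ve):
			fmt.Println(in, "unknown subtag:", ve.Subtag())
		default:
			fmt.Println(in, "error:", err)
		}
	}
}
```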
type Variant ¶
type Variant struct {
// contains filtered or unexported fields
}
Variant represents a registered variant of a language as defined by BCP 47.
func ParseVariant ¶
ParseVariant parses and returns a Variant. An error is returned if s is not a valid variant.