collate

v1.0.0
Published: Nov 15, 2019 License: MIT Imports: 5 Imported by: 2

README

Collate

Collate is a simple collation library for Go that compares strings in various languages. It's designed for the BuntDB project and is similar to the collation found in traditional database systems.

The idea is that you call a function with a collation name, and it returns a less(a, b string) bool function that can be used for sorting with the sort package or with B-tree style databases.

Install

go get -u github.com/tidwall/collate

Example

// Create a case-insensitive collation for Spanish.
less := collate.IndexString("SPANISH_CI")
println(less("Hola", "hola"))
println(less("hola", "Hola"))
// Output:
// false
// false
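
To show how a function of this shape plugs into the sort package, here is a minimal sketch that uses only the standard library. The names caseInsensitiveLess and sortWith are hypothetical, and caseInsensitiveLess is merely a lowercase-folding stand-in for the function that collate.IndexString("SPANISH_CI") would return; it is not a real collation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// caseInsensitiveLess is a hypothetical stand-in with the same signature as
// the function returned by collate.IndexString. It only lowercases both
// strings, so it is an assumption for illustration, not a real collation.
func caseInsensitiveLess(a, b string) bool {
	return strings.ToLower(a) < strings.ToLower(b)
}

// sortWith returns a sorted copy of words, ordered by the given less function.
// A stable sort keeps strings that compare as equal in their input order.
func sortWith(words []string, less func(a, b string) bool) []string {
	out := append([]string(nil), words...)
	sort.SliceStable(out, func(i, j int) bool { return less(out[i], out[j]) })
	return out
}

func main() {
	words := []string{"hola", "Adios", "Hola", "adios"}
	fmt.Println(sortWith(words, caseInsensitiveLess))
	// Prints: [Adios adios hola Hola]
}
```

In real use, the second argument to sortWith would presumably be the function returned by collate.IndexString rather than the stand-in.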

Options

Case Sensitivity

Add _CI to the collation name for case-insensitive comparison.
Add _CS for case-sensitive comparison; this is the default.

collate.IndexString("SPANISH_CI") // Case-insensitive collation for Spanish
collate.IndexString("SPANISH_CS") // Case-sensitive collation for Spanish

Loose Compares

Add _LOOSE to ignore diacritics, case, and weight.

Numeric Compares

Add _NUM to specify that numbers should sort numerically ("2" < "12").
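
The difference between numeric and plain string ordering can be illustrated with a short standard-library sketch. The helper numericLess is hypothetical and is not how collate implements _NUM; it only shows why "2" sorts after "12" byte-wise but before it numerically.

```go
package main

import (
	"fmt"
	"strconv"
)

// numericLess is a hypothetical helper, not part of collate. When both
// strings parse as integers it compares their values; otherwise it falls
// back to plain byte order, so it is only meant for the all-digits case.
func numericLess(a, b string) bool {
	x, errA := strconv.Atoi(a)
	y, errB := strconv.Atoi(b)
	if errA == nil && errB == nil {
		return x < y
	}
	return a < b
}

func main() {
	a, b := "2", "12"
	fmt.Println(a < b)             // false: '2' sorts after '1' byte-wise
	fmt.Println(numericLess(a, b)) // true: 2 is less than 12
}
```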

JSON

You can also compare fields in JSON documents using the IndexJSON function. GJSON is used under the hood.

var jsonA = `{"name":{"last":"Miller"}}`
var jsonB = `{"name":{"last":"anderson"}}`
less := collate.IndexJSON("ENGLISH_CI", "name.last")
println(less(jsonA, jsonB))
println(less(jsonB, jsonA))
// Output:
// false
// true
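
As a rough illustration of what a field-based less function does, the following sketch sorts JSON documents by a dotted path using only the standard library. The helpers field and lessByField are hypothetical: field is a simplified stand-in for a GJSON lookup that handles only nested objects, and lessByField folds case with strings.ToLower instead of applying a real collation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// field returns the string found at a dotted path such as "name.last".
// It is a hypothetical stand-in for a GJSON lookup and handles only nested
// objects; a missing or non-string value yields "".
func field(doc, path string) string {
	var cur interface{}
	if err := json.Unmarshal([]byte(doc), &cur); err != nil {
		return ""
	}
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return ""
		}
		cur = obj[key]
	}
	s, _ := cur.(string)
	return s
}

// lessByField has the same signature shape as IndexJSON's result, but it
// compares with a simple lowercase fold (an assumption for illustration).
func lessByField(path string) func(a, b string) bool {
	return func(a, b string) bool {
		return strings.ToLower(field(a, path)) < strings.ToLower(field(b, path))
	}
}

func main() {
	docs := []string{
		`{"name":{"last":"Miller"}}`,
		`{"name":{"last":"anderson"}}`,
		`{"name":{"last":"Baker"}}`,
	}
	less := lessByField("name.last")
	sort.Slice(docs, func(i, j int) bool { return less(docs[i], docs[j]) })
	for _, d := range docs {
		fmt.Println(field(d, "name.last")) // anderson, Baker, Miller
	}
}
```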

Supported Languages

Afrikaans
Albanian
AmericanEnglish
Amharic
Arabic
Armenian
Azerbaijani
Bengali
BrazilianPortuguese
BritishEnglish
Bulgarian
Burmese
CanadianFrench
Catalan
Chinese
Croatian
Czech
Danish
Dutch
English
Estonian
EuropeanPortuguese
EuropeanSpanish
Filipino
Finnish
French
Georgian
German
Greek
Gujarati
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Italian
Japanese
Kannada
Kazakh
Khmer
Kirghiz
Korean
Lao
LatinAmericanSpanish
Latvian
Lithuanian
Macedonian
Malay
Malayalam
Marathi
ModernStandardArabic
Mongolian
Nepali
Norwegian
Persian
Polish
Portuguese
Punjabi
Romanian
Russian
Serbian
SerbianLatin
SimplifiedChinese
Sinhala
Slovak
Slovenian
Spanish
Swahili
Swedish
Tamil
Telugu
Thai
TraditionalChinese
Turkish
Ukrainian
Urdu
Uzbek
Vietnamese
Zulu

Contact

Josh Baker @tidwall

License

Collate source code is available under the MIT License.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func IndexJSON

func IndexJSON(name, path string) (less func(a, b string) bool)

IndexJSON is like IndexString, except for JSON. The "name" parameter should be a valid collate definition. The "path" parameter should be a valid gjson path.

Example
var jsonA = `{"name":{"last":"Miller"}}`
var jsonB = `{"name":{"last":"anderson"}}`
less := IndexJSON("ENGLISH_CI", "name.last")
fmt.Printf("%t\n", less(jsonA, jsonB))
fmt.Printf("%t\n", less(jsonB, jsonA))
Output:

false
true

func IndexString

func IndexString(name string) (less func(a, b string) bool)

IndexString returns a Less function that can be used to compare if string "a" is less than string "b". The "name" parameter should be a valid collate definition.

Examples of collation names
--------------------------------------------------------------------
ENGLISH, EN                -- English
AMERICANENGLISH, EN-US     -- English US
FRENCH, FR                 -- French
CHINESE, ZH                -- Chinese
SIMPLIFIEDCHINESE, ZH-HANS -- Simplified Chinese
...

Case insensitive: add the CI tag to the name
--------------------------------------------------------------------
ENGLISH_CI
FR_CI
ZH-HANS_CI
...

Case sensitive: add the CS tag to the name
--------------------------------------------------------------------
ENGLISH_CS
FR_CS
ZH-HANS_CS
...

For numerics: add the NUM tag to the name
Specifies that numbers should sort numerically ("2" < "12")
--------------------------------------------------------------------
DUTCH_NUM
JAPANESE_NUM
...

For looseness: add the LOOSE tag to the name
Ignores diacritics, case and weight
--------------------------------------------------------------------
JA_LOOSE
CHINESE_LOOSE
...

Example
var nameA = "Miller"
var nameB = "anderson"
less := IndexString("ENGLISH_CI")
fmt.Printf("%t\n", less(nameA, nameB))
fmt.Printf("%t\n", less(nameB, nameA))
Output:

false
true

func SupportedLangs

func SupportedLangs() []string

SupportedLangs returns all of the languages that IndexString and IndexJSON support.

Types

This section is empty.
