Documentation ¶
Overview ¶
Package rbnf is a Go implementation of the Unicode Locale Data Markup Language (LDML) [Rule-Based Number Format (RBNF)].
RBNF can be used for complicated number formatting tasks, such as formatting a number of seconds as hours, minutes and seconds; spelling out a number like 123 as "one hundred twenty-three"; adding an ordinal suffix to a numeral, as in "123rd"; or formatting numbers in a non-decimal number system such as Roman numerals or traditional Tamil numerals.
This package does not implement any mapping from locale to specific rules. This must be handled at a higher layer.
This package does not store any rules directly. You will have to obtain these from the Unicode Common Locale Data Repository (CLDR), or other sources, or define your own. Some CLDR rules for non-decimal number systems are implemented at golib/v2/text/number/algorithmic.
Rule-Based Number Format (RBNF): https://unicode.org/reports/tr35/tr35-numbers.html#6-rule-based-number-formatting
golib/v2/text/number/algorithmic: https://github.com/tawesoft/golib/v2/text/number/algorithmic
## Security model
It is assumed that the input rules come from a trusted author (e.g. the CLDR itself, or a trusted provider of localisation rules).
## Note of caution
Quoting from the linked reference:
"Where... CLDR plurals or ordinals can be used, their usage is recommended in preference to the RBNF data. First, the RBNF data is not completely fleshed out over all languages that otherwise have modern coverage. Secondly, the alternate forms are neither complete, nor useful without additional information. For example, for German there is spellout-cardinal-masculine, and spellout-cardinal-feminine. But a complete solution would have all genders (masculine/feminine/neuter), all cases (nominative, accusative, dative, genitive), plus context (with strong or weak determiner or none). Moreover, even for the alternate forms that do exist, CLDR does not supply any data for when to use one vs another (eg, when to use spellout-cardinal-masculine vs spellout-cardinal-feminine). So these data are inappropriate for general purpose software."
Example (Fictional) ¶
Example using custom time factors from the Battlestar Galactica 1978 TV series.
package main

import (
    "fmt"

    "github.com/tawesoft/golib/v2/must"
    "github.com/tawesoft/golib/v2/text/number/rbnf"
)

func main() {
    g := must.Result(rbnf.New(nil, `
        %%s:
            0: s;
            1: ;
            2/1: s;
        %%es:
            0: es;
            1: ;
            2/1: es;
        %%timecomma:
            0: =%time=;
            1: , =%time=;
        %%microns:
            0: =%%spellout-cardinal= microns;
            1: =%%spellout-cardinal= micron;
            2/1: =%%spellout-cardinal= microns;
        %%hyphen-microns:
            0: ' microns;
            1: -=%%spellout-cardinal= micron;
            2/1: -=%%spellout-cardinal= microns;
        %time:
            -x: minus →→;
            0: =%%microns=;
            1: =%%microns=;
            2: =%%microns=;
            20: twenty→%%hyphen-microns→;
            30: thirty→%%hyphen-microns→;
            40: forty→%%hyphen-microns→;
            50: fifty→%%hyphen-microns→;
            60: sixty→%%hyphen-microns→;
            70: seventy→%%hyphen-microns→;
            80: eighty→%%hyphen-microns→;
            90: ninety→%%hyphen-microns→;
            100: ←%%spellout-cardinal← centon←%%s←[→%%timecomma→];
            6000/6000: ←%%spellout-cardinal← centar←%%es←[→%%timecomma→];
            144000/144000: ←%%spellout-cardinal← cycle←%%s←[→%%timecomma→];
            1008000/1008000: ←%%spellout-cardinal← secton←%%s←[→%%timecomma→];
            4032000/4032000: ←%%spellout-cardinal← quatron←%%s←[→%%timecomma→];
            48384000/48384000: ←%%spellout-cardinal-verbose← yahren←%%s←[→%%timecomma→];
        %%spellout-cardinal:
            0: zero; 1: one; 2: two; 3: three; 4: four;
            5: five; 6: six; 7: seven; 8: eight; 9: nine;
            10: ten; 11: eleven; 12: twelve; 13: thirteen; 14: fourteen;
            15: fifteen; 16: sixteen; 17: seventeen; 18: eighteen; 19: nineteen;
            20: twenty[-→→];
            30: thirty[-→→];
            40: forty[-→→];
            50: fifty[-→→];
            60: sixty[-→→];
            70: seventy[-→→];
            80: eighty[-→→];
            90: ninety[-→→];
            100: ←← hundred[ →→];
            1000: ←← thousand[ →→];
            1000000: ←← million[ →→];
            1000000000: ←← billion[ →→];
            1000000000000: ←← trillion[ →→];
            1000000000000000: ←← quadrillion[ →→];
            1000000000000000000: =#,##0=;
        %%spellout-cardinal-verbose:
            0: =%%spellout-numbering=;
            100: ←← hundred[→%%and→];
            1000: ←← thousand[→%%and→];
            100000/1000: ←← thousand[→%%commas→];
            1000000: ←← million[→%%commas→];
            1000000000: ←← billion[→%%commas→];
            1000000000000: ←← trillion[→%%commas→];
            1000000000000000: ←← quadrillion[→%%commas→];
            1000000000000000000: =#,##0=;
        %%spellout-numbering:
            0: =%%spellout-cardinal=;
        %%and:
            1: ' and =%%spellout-cardinal-verbose=;
            100: ' =%%spellout-cardinal-verbose=;
        %%commas:
            1:' and =%%spellout-cardinal-verbose=;
            100: ' =%%spellout-cardinal-verbose=;
            1000: ' ←%%spellout-cardinal-verbose← thousand[→%%commas→];
            1000000: ' =%%spellout-cardinal-verbose=;
    `))

    type microns int64

    printTime := func(v microns) {
        fmt.Printf("printTime(microns(%d)): %s\n",
            v, must.Result(g.FormatInteger("%time", int64(v))))
    }

    const centon = 100           // in microns, ~= 1 minute.
    const centar = 60 * centon   // ~= 1 hour, plural "centares"
    const cycle = 24 * centar    // ~= 1 day
    const secton = 7 * cycle     // ~= 1 week
    const quatron = 4 * secton   // ~= 1 month
    const yahren = 12 * quatron  // ~= 1 year

    printTime(microns(0))
    printTime(microns(1))
    printTime(microns(5))
    printTime(microns(1 * centar))
    printTime(microns(2 * centar))
    printTime(microns((1 * centon) + 95))
    printTime(microns((2 * centar) + (5 * centon) + 1))
    printTime(microns((1 * cycle) + (1 * centar) + 5))
    printTime(microns(1 * secton))
    printTime(microns(1 * quatron))
    printTime(microns((3 * quatron) + (2 * secton)))
    printTime(microns(1 * yahren))
    printTime(microns(2 * yahren))
    printTime(microns(150 * yahren))
    printTime(microns((101 * yahren) + (6 * quatron) + (3 * secton) + (4 * cycle) + (2 * centar) + 50))
}
Output:
printTime(microns(0)): zero microns
printTime(microns(1)): one micron
printTime(microns(5)): five microns
printTime(microns(6000)): one centar
printTime(microns(12000)): two centares
printTime(microns(195)): one centon, ninety-five microns
printTime(microns(12501)): two centares, five centons, one micron
printTime(microns(150005)): one cycle, one centar, five microns
printTime(microns(1008000)): one secton
printTime(microns(4032000)): one quatron
printTime(microns(14112000)): three quatrons, two sectons
printTime(microns(48384000)): one yahren
printTime(microns(96768000)): two yahrens
printTime(microns(7257600000)): one hundred and fifty yahrens
printTime(microns(4914588050)): one hundred and one yahrens, six quatrons, three sectons, four cycles, two centares, fifty microns
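The unit arithmetic behind this output can be checked independently of the rules. The following is a minimal sketch using only the standard library, not part of the package: the helper name `decompose` is hypothetical, and the unit sizes are copied from the constants in the example. It assumes that a rule written as `144000/144000` uses 144000 as both base value and radix, so that the left substitution receives the quotient and the right substitution the remainder, which is what repeated division reproduces here.

```go
package main

import "fmt"

// Unit sizes in microns, copied from the constants in the example.
const (
	centon  = 100
	centar  = 60 * centon
	cycle   = 24 * centar
	secton  = 7 * cycle
	quatron = 4 * secton
	yahren  = 12 * quatron
)

// decompose is a hypothetical helper (not part of rbnf). It splits a
// non-negative number of microns into counts of each unit, largest first:
// yahrens, quatrons, sectons, cycles, centares, centons, microns.
func decompose(v int64) [7]int64 {
	units := [7]int64{yahren, quatron, secton, cycle, centar, centon, 1}
	var counts [7]int64
	for i, u := range units {
		counts[i] = v / u // quotient: what a left substitution would format
		v %= u            // remainder: what a right substitution would format
	}
	return counts
}

func main() {
	fmt.Println(decompose(4914588050)) // [101 6 3 4 2 0 50]
	fmt.Println(decompose(150005))     // [0 0 0 1 1 0 5]
}
```

The zero count for centons in the first result corresponds to the formatted output skipping from "two centares" straight to "fifty microns".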
Example (SpelloutCardinal) ¶
package main

import (
    "fmt"

    "github.com/tawesoft/golib/v2/must"
    "github.com/tawesoft/golib/v2/text/number/rbnf"
)

func main() {
    g := must.Result(rbnf.New(nil, `
        %spellout-cardinal:
            -x: minus →→;
            x.x: ←← point →→;
            Inf: infinite;
            NaN: not a number;
            0: zero; 1: one; 2: two; 3: three; 4: four;
            5: five; 6: six; 7: seven; 8: eight; 9: nine;
            10: ten; 11: eleven; 12: twelve; 13: thirteen; 14: fourteen;
            15: fifteen; 16: sixteen; 17: seventeen; 18: eighteen; 19: nineteen;
            20: twenty[-→→];
            30: thirty[-→→];
            40: forty[-→→];
            50: fifty[-→→];
            60: sixty[-→→];
            70: seventy[-→→];
            80: eighty[-→→];
            90: ninety[-→→];
            100: ←← hundred[ →→];
            1000: ←← thousand[ →→];
            1000000: ←← million[ →→];
            1000000000: ←← billion[ →→];
            1000000000000: ←← trillion[ →→];
            1000000000000000: ←← quadrillion[ →→];
            1000000000000000000: =#,##0=;
    `))

    spellout := func(x int64) {
        fmt.Printf("spellout(%d): %s\n",
            x, must.Result(g.FormatInteger("%spellout-cardinal", x)))
    }

    spellout(0)
    spellout(1)
    spellout(2)
    spellout(-5)
    spellout(25)
    spellout(-325)
}
Output:
spellout(0): zero
spellout(1): one
spellout(2): two
spellout(-5): minus five
spellout(25): twenty-five
spellout(-325): minus three hundred twenty-five
Constants ¶
This section is empty.
Variables ¶
var (
    ErrRange          = errors.New("value out of range")
    ErrNoRule         = errors.New("no rule for this input")
    ErrNotImplemented = errors.New("rule logic not implemented for this input")
    ErrInvalidState   = errors.New("invalid rule state")
)
Errors returned by the Format methods.
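One common way to handle sentinel errors like these is `errors.Is`, which matches whether an error is returned directly or wrapped with extra context. The sketch below is an illustration under stated assumptions rather than the package's own code: the two error variables are local stand-ins declared so that the example compiles with the standard library alone, and `format` and `describe` are hypothetical helpers. In real code you would compare against `rbnf.ErrRange`, `rbnf.ErrNoRule`, and so on. Whether the package wraps its errors is not stated in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for two of the package's exported error variables, so this
// sketch builds without the rbnf package (an assumption for illustration).
var (
	ErrRange  = errors.New("value out of range")
	ErrNoRule = errors.New("no rule for this input")
)

// format is a hypothetical stand-in for a Format method. It fails for
// negative input (wrapping ErrNoRule) and for input above 999 (returning
// ErrRange directly), purely to exercise both paths.
func format(x int64) (string, error) {
	switch {
	case x < 0:
		return "", fmt.Errorf("format %d: %w", x, ErrNoRule)
	case x > 999:
		return "", ErrRange
	}
	return fmt.Sprintf("%d", x), nil
}

// describe shows the errors.Is pattern: pick a fallback per sentinel error.
func describe(x int64) string {
	s, err := format(x)
	switch {
	case errors.Is(err, ErrNoRule):
		return "fallback: no rule"
	case errors.Is(err, ErrRange):
		return "fallback: out of range"
	case err != nil:
		return "error: " + err.Error()
	}
	return s
}

func main() {
	fmt.Println(describe(5))
	fmt.Println(describe(-1))
	fmt.Println(describe(1000))
}
```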
Functions ¶
This section is empty.
Types ¶
type Formatter ¶ added in v2.8.5
type Formatter struct {
// contains filtered or unexported fields
}
type Group ¶
type Group struct {
// contains filtered or unexported fields
}
Group defines a group of rule sets. Rule sets may refer to other rule sets in a Group by name, so think of a Group like a lexical scope in a programming language.
func New ¶
New returns a new rule-based number formatter formed from the group of rule sets described by the rules string.
The plurals argument controls formatting of certain plural forms (cardinals and ordinals) used e.g. in spelling out "1st", "2nd", "3rd" or "1 cat", "2 cats", etc. If the ruleset does not contain any rules that use the cardinal syntax ("$(cardinal,plural syntax)$") or ordinal syntax ("$(ordinal,plural syntax)$"), then you may simply pass nil. If specified, the methods implemented by the plurals argument should usually match the same locale that the ruleset applies to.
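To make the role of plural categories concrete, here is a minimal standard-library sketch of how English ordinal categories select a suffix. It does not use or reproduce the interface that the plurals argument must implement, which is not shown in this documentation; the function names `ordinalCategory` and `ordinal` are hypothetical. The category conditions follow the CLDR ordinal plural rules for English, and the suffix table mirrors a rule body of the form `$(ordinal,one{st}two{nd}few{rd}other{th})$`.

```go
package main

import "fmt"

// ordinalCategory returns the CLDR ordinal plural category for a
// non-negative integer in English. Hypothetical helper, not the rbnf API.
func ordinalCategory(n int64) string {
	switch {
	case n%10 == 1 && n%100 != 11:
		return "one"
	case n%10 == 2 && n%100 != 12:
		return "two"
	case n%10 == 3 && n%100 != 13:
		return "few"
	}
	return "other"
}

// suffixes maps each category to the text a rule body would select for it.
var suffixes = map[string]string{
	"one":   "st",
	"two":   "nd",
	"few":   "rd",
	"other": "th",
}

// ordinal appends the selected suffix to the decimal digits of n.
func ordinal(n int64) string {
	return fmt.Sprintf("%d%s", n, suffixes[ordinalCategory(n)])
}

func main() {
	for _, n := range []int64{1, 2, 3, 4, 11, 12, 13, 21, 123} {
		fmt.Println(ordinal(n))
	}
}
```

Because the category conditions differ between languages, a helper like this only suits English rules, which is why the plurals argument should match the locale of the ruleset.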
The rules string contains one or more rule sets in the format described by the International Components for Unicode (ICU) software implementations ([ICU4C RuleBasedNumberFormat] and [ICU4J RuleBasedNumberFormat]), e.g. "%rulesetName: ruleDescriptor: ruleBody; anotherRuleDescriptor: ruleBody;", with some differences:
- In the ICU implementations, if a formatter only has one rule set, the name may be omitted. In this implementation, the name is always required.
- In the ICU implementations, a rule descriptor may be left out and have an implicit meaning depending on the previous rule. In this implementation, rule descriptors are always required (omitted descriptors do not seem to appear in the data files in any case).
- The ICU API documentation does not specify if a rule set name may appear twice. In this implementation, this is treated as an error.
- Only the following rule descriptors are supported (those not supported do not seem to appear in the data files, regardless): "bv", "bv/rad", "-x", "x.x", "0.x", "x.0", "Inf", "NaN".
- For "x.x", "0.x", "x.0" rules, replacing the dot with a comma is not supported (this does not seem to appear in the data files, regardless). Note that this does not mean numbers cannot be *formatted* using commas, only that a comma cannot appear in a rule descriptor.
Also note that a rule set is an ordered set.
ICU4C RuleBasedNumberFormat: https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classicu_1_1RuleBasedNumberFormat.html
ICU4J RuleBasedNumberFormat: https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/RuleBasedNumberFormat.html
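For readers new to the rule format, the following standard-library sketch shows how plain base-value rules such as "20: twenty[-→→];" and "100: ←← hundred[ →→];" are generally evaluated. It is an illustration under stated assumptions, not this package's implementation: rules are given as pre-parsed structs rather than a rules string, only non-negative integers below one million are handled, the radix is always the default of 10, and the names `rule`, `buildRules`, `divisor` and `spell` are hypothetical. The idea is to pick the rule with the greatest base value not exceeding the number, divide by the largest power of the radix not exceeding that base value, format the quotient for ←← and the remainder for →→, and omit bracketed text when the remainder is zero.

```go
package main

import (
	"fmt"
	"sort"
)

// rule is a pre-parsed base-value rule (hypothetical type for illustration).
type rule struct {
	base     int64  // base value from the rule descriptor
	left     bool   // body begins with ←← (the quotient)
	text     string // literal text of the body
	hasRight bool   // body ends with [sep→→] (the remainder, if non-zero)
	sep      string // literal text inside the brackets before →→
}

// buildRules returns English cardinal rules in ascending base-value order.
func buildRules() []rule {
	small := []string{"zero", "one", "two", "three", "four", "five", "six",
		"seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
		"fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"}
	tens := []string{"twenty", "thirty", "forty", "fifty", "sixty",
		"seventy", "eighty", "ninety"}
	var rules []rule
	for i, s := range small {
		rules = append(rules, rule{base: int64(i), text: s})
	}
	for i, s := range tens {
		rules = append(rules, rule{base: int64(20 + 10*i), text: s, hasRight: true, sep: "-"})
	}
	return append(rules,
		rule{base: 100, left: true, text: " hundred", hasRight: true, sep: " "},
		rule{base: 1000, left: true, text: " thousand", hasRight: true, sep: " "},
	)
}

// divisor returns the largest power of ten not exceeding base (1 if base < 10).
func divisor(base int64) int64 {
	d := int64(1)
	for base >= 10 {
		base /= 10
		d *= 10
	}
	return d
}

// spell formats n (0 <= n < 1000000) using rules sorted by ascending base.
func spell(rules []rule, n int64) string {
	// Index of the first rule whose base exceeds n; the one before it applies.
	i := sort.Search(len(rules), func(i int) bool { return rules[i].base > n })
	r := rules[i-1]
	d := divisor(r.base)
	out := ""
	if r.left {
		out += spell(rules, n/d)
	}
	out += r.text
	if r.hasRight && n%d != 0 {
		out += r.sep + spell(rules, n%d)
	}
	return out
}

func main() {
	rules := buildRules()
	fmt.Println(spell(rules, 25))  // twenty-five
	fmt.Println(spell(rules, 325)) // three hundred twenty-five
}
```

The sketch leaves out negative-number, fraction, Inf and NaN rules, explicit radixes, references to other rule sets, and number-format substitutions such as "=#,##0=".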
func (*Group) FormatInteger ¶
func (*Group) Formatter ¶ added in v2.8.5
Formatter returns a Formatter that uses a specific named ruleset from a group to format numbers.
func (*Group) RulesetNames ¶ added in v2.8.5
RulesetNames returns a slice of the names of the public rulesets in a group, excluding the leading "%".
Directories ¶
Path | Synopsis |
---|---|
internal/body | Package body implements parsing of an RBNF rule body. |
internal/descriptor | Package descriptor parses an RBNF rule descriptor. |