Documentation
Overview
Package fallback implements Unicode Character Fallback Substitutions using the Unicode CLDR 41.0 supplemental data file characters.xml, Go's packaged Unicode normalisation rules for canonical decomposition, and rules from a matching version of Unicode for compatibility decomposition.
This can be useful for robustly parsing Unicode strings where, for practical reasons (e.g. missing keyboard keys or missing font support), certain fallbacks have been used, or for picking a sensible default when certain Unicode strings cannot be displayed (e.g. because of missing font support).
Note that care must be taken not to change the meaning of a text. For example, superscript two '²' has a (last resort) Character Fallback Substitution to the digit '2' via NFKC normalisation, but the two have entirely different meanings. See the (withdrawn draft) Unicode Technical Report 30: CHARACTER FOLDINGS, as well as the earlier draft Unicode Technical Report 25: CHARACTER FOLDINGS, for commentary.
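The general idea of fallback substitution, and the risk it carries, can be sketched in a few lines of Go. This is a minimal illustration only and does not use this package's API: the fallbacks table, the asciiOnly predicate and the substitute function are hypothetical names, and the three table entries are hand-picked for the example rather than read from characters.xml.

```go
package main

import (
	"fmt"
	"strings"
)

// fallbacks is a hypothetical, hand-written table used only for this
// sketch. A real implementation would derive its entries from CLDR data
// and Unicode decomposition rules.
var fallbacks = map[rune]string{
	'\u00B2': "2",   // superscript two
	'\u2026': "...", // horizontal ellipsis
	'\uFB01': "fi",  // latin small ligature fi
}

// asciiOnly stands in for "can this rune be displayed?". Here it
// pretends that only ASCII can be rendered.
func asciiOnly(r rune) bool { return r < 0x80 }

// substitute replaces each rune that cannot be displayed with its
// fallback, if the table has one, and leaves every other rune unchanged.
func substitute(s string, displayable func(rune) bool) string {
	var b strings.Builder
	for _, r := range s {
		if !displayable(r) {
			if fb, ok := fallbacks[r]; ok {
				b.WriteString(fb)
				continue
			}
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	// The ligature and ellipsis degrade harmlessly, but "m²" becomes
	// "m2", which no longer reads as "square metres".
	fmt.Println(substitute("\uFB01le\u2026 3 m\u00B2", asciiOnly))
}
```

Running this prints "file... 3 m2". The first two substitutions preserve the meaning of the text, while the last one shows why a last-resort fallback should only be applied when losing the distinction is acceptable.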