Overview
tablegen is a helper CLI to create Go source files from Unicode Character Data files.
tablegen recognizes the following flags:
-p <package name> : package name of output package
-f <n>            : field index of character category
-x <prefix>       : prefix to categories, used for table naming
-o <filename>     : name of output source file
-u <URL>          : UCD file URL, e.g. http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
tablegen will download the UCD file, iterate over its character code/range entries and write a Go source file. The generated file defines *unicode.RangeTable variables, which may be queried with functions of the Go standard library (package unicode).
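To illustrate what iterating over code/range entries involves, here is a minimal sketch of parsing a single UCD line. It is not tablegen's actual implementation: the entry type, the parseLine function and the 1-based interpretation of the field index are assumptions made for this example, and the sample lines are only shaped like those in EastAsianWidth.txt.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// entry is a hypothetical record for one UCD line: a code point range plus
// the property value read from the selected field.
type entry struct {
	lo, hi   rune
	category string
}

// parseLine parses one line in UCD format. field is the 1-based index of the
// property field (an assumption about how the -f flag counts fields).
// ok is false for blank lines, comment-only lines and malformed input.
func parseLine(line string, field int) (e entry, ok bool) {
	if i := strings.IndexByte(line, '#'); i >= 0 {
		line = line[:i] // strip the trailing comment
	}
	fields := strings.Split(line, ";")
	if field < 2 || len(fields) < field {
		return entry{}, false
	}
	// Field 1 is either a single code point ("0020") or a range ("3400..4DBF").
	loStr, hiStr, isRange := strings.Cut(strings.TrimSpace(fields[0]), "..")
	if !isRange {
		hiStr = loStr
	}
	lo, err := strconv.ParseUint(loStr, 16, 32)
	if err != nil {
		return entry{}, false
	}
	hi, err := strconv.ParseUint(hiStr, 16, 32)
	if err != nil {
		return entry{}, false
	}
	return entry{rune(lo), rune(hi), strings.TrimSpace(fields[field-1])}, true
}

func main() {
	samples := []string{
		"# a comment-only line is skipped",
		"0020;Na # Zs SPACE",
		"3400..4DBF;W # Lo CJK ideographs",
	}
	for _, l := range samples {
		if e, ok := parseLine(l, 2); ok {
			fmt.Printf("%04X..%04X %s\n", e.lo, e.hi, e.category)
		}
	}
}
```

Entries sharing a category could then be collected into one table per category, named by joining the prefix and the category, as in EAW_Na.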
For example, after creating tables from the UAX#11 East Asian Width data (see the URL above), clients may query whether a Unicode character is contained in a UAX#11 range by means of unicode.Is(…). After a call to
tablegen -f 2 -p mypackage -o uax11tables.go -x EAW -u http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
a file named uax11tables.go will contain (among others) a range table called `EAW_Na` (indicating a "narrow" East Asian character), which can be queried with
isnarrow := unicode.Is(EAW_Na, '梨')
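Since the generated file is not reproduced here, the following sketch uses a hand-written stand-in table to show the same query pattern end to end. The names eawWideSample and isWideSample are hypothetical, and the table covers only the CJK Unified Ideographs block (U+4E00..U+9FFF), which is an assumption for illustration rather than the content of a generated table.

```go
package main

import (
	"fmt"
	"unicode"
)

// eawWideSample is a hand-written stand-in for a generated "wide" table.
// It covers only U+4E00..U+9FFF; a generated table would hold many more ranges.
var eawWideSample = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: 0x4e00, Hi: 0x9fff, Stride: 1},
	},
}

// isWideSample reports whether r falls into the sample table.
func isWideSample(r rune) bool {
	return unicode.Is(eawWideSample, r)
}

func main() {
	fmt.Println(isWideSample('梨')) // true: U+68A8 lies inside the sample range
	fmt.Println(isWideSample('a'))  // false: U+0061 lies below the sample range
}
```

Because '梨' is a CJK ideograph, a query against a "narrow" table such as EAW_Na would be expected to report false for it; the matching table for such characters is the "wide" one.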
Unicode Standard Annex #44 is a starting point for UCD information: http://www.unicode.org/reports/tr44/. An overview of the Unicode Character Data files can be found here: https://www.unicode.org/versions/components-13.0.0.html.
___________________________________________________________________________
License
Governed by a 3-Clause BSD license. The license file may be found in the root folder of this module.
Copyright © 2021 Norbert Pillmayer <norbert@pillmayer.com>