Documentation ¶
Overview ¶
Package gnparser implements the main use-case of the project: parsing scientific names. It provides methods to parse one name at a time, a slice of names, or a stream of names. Unless the WithNoOrder option is set, all methods return results in the same order as the input. Because parsing runs concurrently, the original order is restored after the parsing step finishes.
Example ¶
package main

import (
	"fmt"

	"github.com/gnames/gnparser"
	"github.com/gnames/gnparser/ent/parsed"
)

func main() {
	names := []string{"Pardosa moesta Banks, 1892", "Bubo bubo"}
	cfg := gnparser.NewConfig()
	gnp := gnparser.New(cfg)
	res := gnp.ParseNames(names)
	fmt.Println(res[0].Authorship.Normalized)
	fmt.Println(res[1].Canonical.Simple)
	fmt.Println(parsed.HeaderCSV())
	fmt.Println(res[0].Output(gnp.Format()))
}

Output:

Banks 1892
Bubo bubo
Id,Verbatim,Cardinality,CanonicalStem,CanonicalSimple,CanonicalFull,Authorship,Year,Quality
e2fdf10b-6a36-5cc7-b6ca-be4d3b34b21f,"Pardosa moesta Banks, 1892",2,Pardosa moest,Pardosa moesta,Pardosa moesta,Banks 1892,1892,1
Index ¶
- Variables
- type Config
- type GNparser
- type Option
- func OptBatchSize(i int) Option
- func OptDebug(b bool) Option
- func OptFormat(s string) Option
- func OptIgnoreHTMLTags(b bool) Option
- func OptIsTest(b bool) Option
- func OptJobsNum(i int) Option
- func OptPort(i int) Option
- func OptWithCapitaliation(b bool) Option
- func OptWithCultivars(b bool) Option
- func OptWithDetails(b bool) Option
- func OptWithNoOrder(b bool) Option
- func OptWithStream(b bool) Option
Examples ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// Version is the version of the gnparser package. When Makefile is
	// used, the version is calculated out of Git tags.
	Version = "v1.3.0+"

	// Build is a timestamp of when Makefile was used to compile
	// the gnparser code. If go build was used, Build stays empty.
	Build string
)
Functions ¶
This section is empty.
Types ¶
type Config ¶ added in v1.0.5
type Config struct {
	// Format sets the output format for CLI and Web interfaces.
	// There are 3 formats available: 'CSV', 'CompactJSON' and
	// 'PrettyJSON'.
	Format gnfmt.Format

	// JobsNum sets a level of parallelism used during parsing of
	// a stream of name-strings.
	JobsNum int

	// BatchSize sets the maximum number of elements in a name-strings slice.
	BatchSize int

	// WithStream changes from parsing batch by batch to parsing one name
	// at a time. When WithStream is true, the BatchSize setting is ignored.
	WithStream bool

	// IgnoreHTMLTags can be set to true when it is desirable to clean up names
	// from a few HTML tags often present in name-strings that were planned to
	// be presented via an HTML page.
	IgnoreHTMLTags bool

	// WithDetails can be set to true when a simplified output is not sufficient
	// for obtaining the required information.
	WithDetails bool

	// WithNoOrder flag, when true, output and input are in different order.
	WithNoOrder bool

	// WithCapitalization flag, when true, the first letter of a name-string
	// is capitalized, if appropriate.
	WithCapitalization bool

	// WithCultivars flag, when true, cultivar names will be parsed and
	// modify cardinality, normalized and canonical output.
	WithCultivars bool

	// Port to run web-service.
	Port int

	// IsTest can be set to true when parsing functionality is used for tests.
	// In such cases the `ParserVersion` field is presented as `test_version`
	// instead of displaying the actual version of `gnparser`.
	IsTest bool

	// Debug sets a "debug" state for parsing. The debug state forces the
	// output format to show the parsed AST tree.
	Debug bool
}

Config keeps settings that might affect how parsing is done, or change the parsing output.
type GNparser ¶ added in v1.0.3
type GNparser interface {
	// GetVersion provides a version and a build timestamp of gnparser.
	GetVersion() gnvers.Version

	// ParseName takes a name-string, and returns parsed results for the name.
	ParseName(string) parsed.Parsed

	// ParseNames takes a slice of name-strings, and returns a slice of
	// parsed results in the same order as the input.
	ParseNames([]string) []parsed.Parsed

	// ParseNameStream takes a context, an input channel that supplies
	// name-strings together with their positions in the input, and an
	// output channel. Parsed results are sent to the output channel in
	// the same order as the input.
	ParseNameStream(context.Context, <-chan nameidx.NameIdx, chan<- parsed.Parsed)

	// Format returns the currently chosen output format (JSON or CSV).
	Format() gnfmt.Format

	// ChangeConfig allows to modify settings of GNparser. Changing settings
	// might modify the parsing process and the final output of results.
	ChangeConfig(opts ...Option) GNparser

	// Debug parses a string and outputs the raw AST tree from the PEG engine.
	Debug(s string) []byte
}
GNparser is the main use-case interface. It provides methods required for parsing scientific names.
type Option ¶ added in v1.0.5
type Option func(*Config)
Option is a type that has to be returned by all Option functions. Such functions are able to modify the settings of a Config object.
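The `func(*Config)` shape is commonly known as the functional options pattern. A minimal, self-contained sketch follows; `config`, `option`, `optJobsNum`, `optWithDetails`, and `newConfig` are reduced, hypothetical stand-ins for the package's types, and the default value is an assumption made for illustration.

```go
package main

import "fmt"

// config is a reduced, hypothetical stand-in for gnparser.Config.
type config struct {
	jobsNum     int
	withDetails bool
}

// option mirrors the shape of gnparser.Option: a function that mutates a config.
type option func(*config)

// optJobsNum returns an option that sets the jobsNum field.
func optJobsNum(i int) option {
	return func(c *config) { c.jobsNum = i }
}

// optWithDetails returns an option that sets the withDetails field.
func optWithDetails(b bool) option {
	return func(c *config) { c.withDetails = b }
}

// newConfig starts from assumed defaults and applies the options in order,
// so a later option overrides an earlier one for the same field.
func newConfig(opts ...option) config {
	c := config{jobsNum: 1}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(optJobsNum(8), optWithDetails(true))
	fmt.Printf("%+v\n", c)
}
```

With this pattern, callers name only the settings they want to change, and new settings can be added later without breaking existing call sites.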
func OptBatchSize ¶ added in v1.0.5
OptBatchSize sets the max number of names in a batch.
func OptFormat ¶ added in v1.0.5
OptFormat takes a string (one of 'csv', 'compact', 'pretty') that sets the output format for the CLI or Web presentation. Any other string sets the default 'CSV' format and produces a warning.
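The fallback behavior described here can be sketched as a simple string-to-format mapping. The code below is an illustration only: `format`, its constants, and `formatFromString` are hypothetical and do not reproduce the gnfmt package; how the real option reports its warning is not shown in this documentation.

```go
package main

import (
	"fmt"
	"os"
)

// format is a hypothetical stand-in for gnfmt.Format.
type format int

const (
	fmtCSV format = iota
	fmtCompactJSON
	fmtPrettyJSON
)

// formatFromString maps an option string to a format. Unknown strings fall
// back to CSV; the bool reports whether the input was recognized.
func formatFromString(s string) (format, bool) {
	switch s {
	case "csv":
		return fmtCSV, true
	case "compact":
		return fmtCompactJSON, true
	case "pretty":
		return fmtPrettyJSON, true
	default:
		return fmtCSV, false
	}
}

func main() {
	f, ok := formatFromString("yaml")
	if !ok {
		// The warning goes to stderr so it does not mix with parsed output.
		fmt.Fprintln(os.Stderr, "warning: unknown format, using CSV")
	}
	fmt.Println(f == fmtCSV)
}
```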
func OptIgnoreHTMLTags ¶ added in v1.0.5
OptIgnoreHTMLTags sets the IgnoreHTMLTags field. When the field is true, a few HTML tags that are often present in name-strings are cleaned up before parsing.
func OptWithCapitaliation ¶ added in v1.2.0
OptWithCapitaliation sets the WithCapitalization field.
func OptWithCultivars ¶ added in v1.3.0
OptWithCultivars sets the WithCultivars field.
func OptWithDetails ¶ added in v1.0.5
OptWithDetails sets the WithDetails field.
func OptWithNoOrder ¶ added in v1.0.9
OptWithNoOrder sets the WithNoOrder field.
func OptWithStream ¶ added in v1.0.5
OptWithStream sets the WithStream field.
Directories ¶
Path | Synopsis |
---|---|
 | Package main provides C-binding functionality to use parser in other languages. |
ent | |
internal/preprocess | Package preprocess performs preparsing filtering and modification of a scientific-name. |
nameidx | Package nameidx provides a structure that preserves original position of a name-string in an input slice. |
parsed | Package parsed provides a user-friendly output of parsing result, as well as functions to convert the result to CSV or JSON-encoded strings. |
parser | Package parser provides entities and methods to perform Parsing Expression Grammar parsing on scientific names. |
stemmer | http://snowballstem.org/otherapps/schinke/ http://caio.ueberalles.net/a_stemming_algorithm_for_latin_text_databases-schinke_et_al.pdf The Schinke Latin stemming algorithm is described in Schinke R, Greengrass M, Robertson AM and Willett P (1996) A stemming algorithm for Latin text databases. |
str | Package str provides functions for manipulating scientific name-strings. |
cmd | Package cmd creates a command line application for parsing scientific names. |
io | |
dict | Package dict provides lookup data for gnparser. |
web | Package web provides RESTful API service and a website for gnparser. |