output

package
v1.1.6
Warning

This package is not in the latest version of its module.

Published: Jun 11, 2024 License: MIT Imports: 13 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CSVHeader

func CSVHeader(withVerification bool, sep rune) string

CSVHeader returns the header string for CSV output format.

Types

type Kingdom

type Kingdom struct {
	NamesNumber     int     `json:"namesNumber"`
	Kingdom         string  `json:"kingdom"`
	NamesPercentage float32 `json:"namesPercentage"`
}

Kingdom contains the number of names resolved to it and their percentage.
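
As a rough illustration of how the struct tags above shape the JSON, the sketch below redeclares Kingdom locally so that it runs without importing the package. The helper kingdomJSON and the sample values are hypothetical; in particular, whether NamesPercentage is stored as a fraction or as a value out of 100 is not stated in the documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Kingdom mirrors the struct shown above. It is redeclared here only so
// the example compiles on its own.
type Kingdom struct {
	NamesNumber     int     `json:"namesNumber"`
	Kingdom         string  `json:"kingdom"`
	NamesPercentage float32 `json:"namesPercentage"`
}

// kingdomJSON (hypothetical helper) serializes a Kingdom with encoding/json.
func kingdomJSON(k Kingdom) string {
	b, err := json.Marshal(k)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Sample values are made up for illustration.
	k := Kingdom{NamesNumber: 12, Kingdom: "Plantae", NamesPercentage: 0.75}
	fmt.Println(kingdomJSON(k))
	// prints {"namesNumber":12,"kingdom":"Plantae","namesPercentage":0.75}
}
```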

type Meta

type Meta struct {
	// Documentation
	Documentation string `json:"documentation"`

	// Date represents time when output was generated.
	Date time.Time `json:"date"`

	// FinderVersion is the version of gnfinder.
	FinderVersion string `json:"gnfinderVersion"`

	// InputFile is the name of the source file.
	InputFile string `json:"inputFile,omitempty"`

	// TextExtractionSec is the time spent on converting the file
	// into UTF8-encoded text.
	TextExtractionSec float32 `json:"textExtractSec,omitempty"`

	// NameFindingSec is the time spent on name-finding.
	NameFindingSec float32 `json:"nameFindingSec"`

	// NameVerifSec is the time spent on name-verification.
	NameVerifSec float32 `json:"nameVerifSec,omitempty"`

	// TotalSec is the time spent on the whole process.
	TotalSec float32 `json:"totalSec"`

	// WordsAround shows the number of tokens preserved before and after
	// a name-string candidate.
	WordsAround int `json:"wordsAround"`

	// Language setting used by the name-finding algorithm.
	Language string `json:"language"`

	// LanguageDetected automatically for the text.
	LanguageDetected string `json:"languageDetected,omitempty"`

	// WithAllMatches is true if all verification results are shown.
	WithAllMatches bool `json:"withAllMatches,omitempty"`

	// WithAmbiguousNames is true if ambiguous uninomials are preserved.
	// Examples of ambiguous uninomial names are `Cancer`, `America`.
	WithAmbiguousNames bool `json:"withAmbiguousNames,omitempty"`

	// WithUniqueNames is true when unique names are returned instead
	// of every occurrence of a name.
	WithUniqueNames bool `json:"withUniqueNames,omitempty"`

	// WithBayes is true if Bayes is used during name-finding.
	WithBayes bool `json:"withBayes,omitempty"`

	// WithOddsAdjustment to adjust prior odds according to the density of
	// scientific names in the text.
	WithOddsAdjustment bool `json:"withOddsAdjustment,omitempty"`

	// WithPositionInBytes is true if names get start/end positions in bytes
	// instead of UTF-8 chars.
	WithPositionInBytes bool `json:"withPositionInBytes,omitempty"`

	// WithVerification is true if results are checked by verification service.
	WithVerification bool `json:"withVerification,omitempty"`

	// WithLanguageDetection sets automatic language determination.
	WithLanguageDetection bool `json:"withLanguageDetection,omitempty"`

	// TotalWords is the number of 'normalized' words in the text.
	TotalWords int `json:"totalWords"`

	// TotalNameCandidates is the number of words that might be the start of
	// a scientific name.
	TotalNameCandidates int `json:"totalNameCandidates"`

	// TotalNames is the number of scientific names found.
	TotalNames int `json:"totalNames"`

	// Kingdoms are the kingdoms in which the names resolved by
	// the Catalogue of Life are placed.
	// Kingdoms are sorted by percentage in descending order.
	// The first kingdom contains the largest number of names.
	Kingdoms []Kingdom `json:"kingdoms,omitempty"`

	// MainTaxon is the taxon containing the majority of names resolved by
	// the Catalogue of Life.
	MainTaxon string `json:"mainTaxon,omitempty"`

	// MainTaxonRank is the rank of the MainTaxon.
	MainTaxonRank string `json:"mainTaxonRank,omitempty"`

	// MainTaxonPercentage is the percentage of names in MainTaxon.
	MainTaxonPercentage float32 `json:"mainTaxonPercentage,omitempty"`

	// StatsNamesNum is the number of names used for calculating statistics.
	// It includes names that are genus and lower and are verified to
	// Catalogue of Life.
	StatsNamesNum int `json:"statsNamesNum,omitempty"`
}

Meta contains meta-information of name-finding result.
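
Many Meta fields carry `omitempty`, so a consumer of the JSON cannot assume every key is present. The sketch below copies four fields and their tags into a hypothetical metaSubset type (not part of the package) to show which keys disappear when the values are zero.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metaSubset copies four fields and their tags from Meta. The type is
// hypothetical and exists only to keep the example self-contained.
type metaSubset struct {
	InputFile        string  `json:"inputFile,omitempty"`
	NameFindingSec   float32 `json:"nameFindingSec"`
	WithVerification bool    `json:"withVerification,omitempty"`
	TotalNames       int     `json:"totalNames"`
}

// toJSON (hypothetical helper) serializes the subset with encoding/json.
func toJSON(m metaSubset) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Zero values: fields tagged omitempty are dropped, the rest stay.
	fmt.Println(toJSON(metaSubset{}))
	// prints {"nameFindingSec":0,"totalNames":0}

	// Non-zero values: the optional keys appear.
	fmt.Println(toJSON(metaSubset{InputFile: "paper.txt", WithVerification: true}))
	// prints {"inputFile":"paper.txt","nameFindingSec":0,"withVerification":true,"totalNames":0}
}
```

When decoding such output, a missing `withVerification` key is therefore indistinguishable from an explicit `false`.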

type Name

type Name struct {
	// Cardinality depicts number of elements in a name. 0 - Cannot determine
	// cardinality, 1 - Uninomial, 2 - Binomial, 3 - Trinomial.
	Cardinality int `json:"cardinality"`

	// Verbatim shows name the way it was in the text.
	Verbatim string `json:"verbatim,omitempty"`

	// Name is a normalized version of a name.
	Name string `json:"name"`

	// Decision about the quality of name detection.
	Decision token.Decision `json:"-"`

	// Odds shows how likely it is that name detection was correct.
	Odds float64 `json:"-"`

	// OddsLog10 shows the Log10 of Odds.
	OddsLog10 float64 `json:"oddsLog10,omitempty"`

	// OddsDetails describes how Odds were calculated.
	OddsDetails boutput.OddsDetails `json:"oddsDetails,omitempty"`

	// OffsetStart is the start of the name on a page.
	OffsetStart int `json:"start"`

	// OffsetEnd is the end of the name on a page.
	OffsetEnd int `json:"end"`

	// AnnotNomen is a nomenclatural annotation for new species or combination.
	AnnotNomen string `json:"annotationNomen,omitempty"`

	// AnnotNomenType is normalized nomenclatural annotation.
	AnnotNomenType string `json:"annotationNomenType,omitempty"`

	// WordsBefore are the words that occurred before the name.
	WordsBefore []string `json:"wordsBefore,omitempty"`

	// WordsAfter are the words that occurred right after the name.
	WordsAfter []string `json:"wordsAfter,omitempty"`

	// Verification gives the results of the verification process for the name.
	Verification *vlib.Name `json:"verification,omitempty"`
}

Name represents one found name.
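
The `start` and `end` offsets are counted in UTF-8 characters unless WithPositionInBytes is set, in which case they are counted in bytes. The sketch below shows why the distinction matters for text containing multi-byte characters. The helpers sliceByRunes and sliceByBytes are hypothetical, and the example assumes the end offset is exclusive, which the documentation does not state.

```go
package main

import "fmt"

// sliceByRunes treats start and end as character (rune) offsets.
// Assumption: end is exclusive.
func sliceByRunes(text string, start, end int) string {
	r := []rune(text)
	if start < 0 || end > len(r) || start > end {
		return ""
	}
	return string(r[start:end])
}

// sliceByBytes treats start and end as byte offsets.
// Assumption: end is exclusive.
func sliceByBytes(text string, start, end int) string {
	if start < 0 || end > len(text) || start > end {
		return ""
	}
	return text[start:end]
}

func main() {
	// "é" occupies two bytes, so byte offsets after it are shifted by one.
	text := "Née: Pardosa moesta is common"

	fmt.Printf("%q\n", sliceByRunes(text, 5, 19)) // prints "Pardosa moesta"
	fmt.Printf("%q\n", sliceByBytes(text, 6, 20)) // prints "Pardosa moesta"

	// Applying character offsets to bytes cuts the name in the wrong place.
	fmt.Printf("%q\n", sliceByBytes(text, 5, 19)) // prints " Pardosa moest"
}
```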

func FilterNames

func FilterNames(names []Name, genera map[string]struct{}) []Name

type OddsDatum

type OddsDatum struct {
	Name bool
	Odds float64
}

OddsDatum is a simplified version of a name, that stores boolean decision (Name/NotName), and corresponding odds of the name.

type Output

type Output struct {
	Meta      `json:"metadata"`
	InputText string `json:"inputText,omitempty"`
	Names     []Name `json:"names"`
}

Output type is the result of name-finding.
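
Output embeds Meta but gives it an explicit JSON tag. With encoding/json, a tagged embedded struct is serialized as a nested object under that name rather than being flattened, while Go code can still reach its fields directly through promotion. The sketch below uses trimmed-down local copies of Meta, Name and Output (one field each) to show both effects; the helper outputJSON is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down local copies of the package types, for illustration only.
type Meta struct {
	TotalNames int `json:"totalNames"`
}

type Name struct {
	Name string `json:"name"`
}

type Output struct {
	Meta      `json:"metadata"`
	InputText string `json:"inputText,omitempty"`
	Names     []Name `json:"names"`
}

// outputJSON (hypothetical helper) serializes an Output with encoding/json.
func outputJSON(o Output) string {
	b, err := json.Marshal(o)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	o := Output{
		Meta:  Meta{TotalNames: 1},
		Names: []Name{{Name: "Pardosa moesta"}},
	}

	// Field promotion: TotalNames is reachable without naming Meta.
	fmt.Println(o.TotalNames) // prints 1

	// In JSON the embedded struct is nested under "metadata".
	fmt.Println(outputJSON(o))
	// prints {"metadata":{"totalNames":1},"names":[{"name":"Pardosa moesta"}]}
}
```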

func TokensToOutput

func TokensToOutput(
	ts []token.TokenSN,
	text []rune,
	version string,
	cfg config.Config) Output

TokensToOutput takes tagged tokens and assembles output out of them.

func (*Output) Format

func (o *Output) Format(f gnfmt.Format) string

func (*Output) MergeVerification

func (o *Output) MergeVerification(
	v map[string]vlib.Name,
	st stats.Stats,
	dur float32,
)

MergeVerification takes a map with verified names and incorporates them into the output.

func (*Output) UniqueNameStrings

func (o *Output) UniqueNameStrings() []string

UniqueNameStrings takes a list of names and returns a list of unique name-strings.
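
A minimal sketch of this kind of deduplication, not the package's own implementation: it assumes uniqueness is decided on the normalized Name field and returns names in first-seen order. The documentation does not specify the ordering of the real method, so callers should not rely on it. The local Name type and the function uniqueNameStrings are hypothetical stand-ins.

```go
package main

import "fmt"

// Name is a one-field stand-in for the package's Name type.
type Name struct {
	Name string
}

// uniqueNameStrings returns each distinct Name value once, in the order
// it first appears. The ordering is an assumption of this sketch.
func uniqueNameStrings(names []Name) []string {
	seen := make(map[string]struct{}, len(names))
	res := make([]string, 0, len(names))
	for _, n := range names {
		if _, ok := seen[n.Name]; ok {
			continue // already recorded
		}
		seen[n.Name] = struct{}{}
		res = append(res, n.Name)
	}
	return res
}

func main() {
	names := []Name{
		{Name: "Pardosa moesta"},
		{Name: "Bubo bubo"},
		{Name: "Pardosa moesta"},
	}
	fmt.Println(uniqueNameStrings(names))
	// prints [Pardosa moesta Bubo bubo]
}
```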
