Documentation ¶
Index ¶
- type Candidate
- type Document
- type PosPat
- type Profiler
- func (p Profiler) Close() error
- func (p Profiler) GetAdaptive() bool
- func (p Profiler) GetIterations() int
- func (p Profiler) GetTypes() bool
- func (p Profiler) Profile(doc Document) error
- func (p Profiler) ReadConfig(config string) error
- func (p Profiler) SetAdaptive(val bool)
- func (p Profiler) SetIterations(val int)
- func (p Profiler) SetTypes(val bool)
- type Token
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Candidate ¶
type Candidate struct {
	HistPatterns []PosPat // Historical patterns.
	OCRPatterns  []PosPat // OCR patterns/errors.
	Suggestion   string   // Correction suggestion.
	Modern       string   // Modern lexicon entry.
	Dictionary   string   // Name of the dictionary.
	Weight       float64  // Weight of the candidate.
	Distance     int      // Levenshtein distance.
}
Candidate represents a profiler correction candidate.
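As an illustration of working with these fields, the following minimal sketch ranks a slice of candidates. It uses local copies of the documented structs so that it compiles without the cgo-backed package. The helper `rankCandidates` is hypothetical, the sample values are made up, and the ordering rests on the assumption that a higher Weight marks a better candidate, which the documentation above does not state.

```go
package main

import (
	"fmt"
	"sort"
)

// PosPat and Candidate are local copies of the documented structs,
// declared here only so that the sketch is self-contained.
type PosPat struct {
	Left, Right string
	Pos         int
}

type Candidate struct {
	HistPatterns []PosPat
	OCRPatterns  []PosPat
	Suggestion   string
	Modern       string
	Dictionary   string
	Weight       float64
	Distance     int
}

// rankCandidates (hypothetical helper) returns a sorted copy of cs.
// Assumption: a higher Weight means a better candidate; ties are
// broken by the smaller Levenshtein distance.
func rankCandidates(cs []Candidate) []Candidate {
	out := make([]Candidate, len(cs))
	copy(out, cs)
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].Weight != out[j].Weight {
			return out[i].Weight > out[j].Weight
		}
		return out[i].Distance < out[j].Distance
	})
	return out
}

func main() {
	// Illustrative values, not real profiler output.
	cs := []Candidate{
		{Suggestion: "theil", Weight: 0.2, Distance: 1},
		{Suggestion: "teil", Weight: 0.7, Distance: 2},
	}
	for _, c := range rankCandidates(cs) {
		fmt.Printf("%s\t%.2f\t%d\n", c.Suggestion, c.Weight, c.Distance)
	}
}
```

Sorting a copy leaves the token's original candidate order untouched, which may matter if that order carries meaning in the real package.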
type Document ¶
type Document struct {
// contains filtered or unexported fields
}
Document wraps a cgo document.
func (Document) AddTokenWithCorrection ¶
AddTokenWithCorrection appends a token with its correction to the document.
type PosPat ¶
type PosPat struct {
Left, Right string // Left and right parts of the pattern.
Pos int // Position where the pattern applies.
}
PosPat represents an error or historic rewrite pattern.
type Profiler ¶
type Profiler struct {
// contains filtered or unexported fields
}
Profiler wraps an underlying cgo profiler.
func (Profiler) GetAdaptive ¶
GetAdaptive reports whether adaptive profiling is enabled.
func (Profiler) GetIterations ¶
GetIterations returns the number of iterations.
func (Profiler) Profile ¶
Profile profiles the given document. ReadConfig must have been called before Profile is called.
func (Profiler) ReadConfig ¶
ReadConfig reads the profiler configuration file. It must be called before any call to Profile.
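The required call order can be sketched as follows. Because the real Profiler is backed by cgo, this example uses a hypothetical `stubProfiler` and a local placeholder `Document` type. Only the method signatures of ReadConfig, Profile and Close are taken from the index above. The file name `profiler.ini` and the stub's behaviour of returning an error when Profile is called first are assumptions made for illustration, not documented behaviour of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// Document is a local placeholder for the package's opaque Document type.
type Document struct{}

// profiler lists the documented Profiler methods that this sketch uses.
type profiler interface {
	ReadConfig(config string) error
	Profile(doc Document) error
	Close() error
}

// stubProfiler is a hypothetical stand-in for the cgo-backed Profiler.
// It only records whether a configuration has been read.
type stubProfiler struct{ configured bool }

func (s *stubProfiler) ReadConfig(config string) error {
	if config == "" {
		return errors.New("empty configuration path")
	}
	s.configured = true
	return nil
}

// Profile fails in this stub if ReadConfig was not called first.
// Assumption: the real profiler may react differently to a wrong order.
func (s *stubProfiler) Profile(doc Document) error {
	if !s.configured {
		return errors.New("Profile called before ReadConfig")
	}
	return nil
}

func (s *stubProfiler) Close() error { return nil }

// run performs the documented order: ReadConfig first, then Profile.
func run(p profiler, config string, doc Document) error {
	if err := p.ReadConfig(config); err != nil {
		return fmt.Errorf("read config: %w", err)
	}
	if err := p.Profile(doc); err != nil {
		return fmt.Errorf("profile: %w", err)
	}
	return nil
}

func main() {
	p := &stubProfiler{}
	defer p.Close()
	// "profiler.ini" is a hypothetical configuration file name.
	if err := run(p, "profiler.ini", Document{}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("profiled")
}
```

Wrapping both calls in one helper keeps the ordering in a single place, so callers cannot accidentally profile with an unconfigured profiler.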
func (Profiler) SetAdaptive ¶
SetAdaptive enables or disables adaptive profiling.
func (Profiler) SetIterations ¶
SetIterations sets the number of iterations.
type Token ¶
type Token struct {
	OCR        string      // Recognized token from the OCR.
	Cor        string      // Correction for this token (if any).
	GT         string      // Ground-truth for this token (if any).
	Candidates []Candidate // List of correction candidates (if any).
}
Token represents a token in the document. Every token in the document has its OCR value set. If ground-truth or correction information is available, the corresponding fields are not empty. After the document has been profiled, suspicious tokens have a list of correction candidates. Note that lexicon entries also have one candidate in their Candidates slice (the corresponding lexicon entry, with a Levenshtein distance of 0).
func (Token) IsLexiconEntry ¶
IsLexiconEntry returns true if the token is a lexicon entry.
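The Token description suggests one way such a check could work: a lexicon entry has a single candidate with a Levenshtein distance of 0. The sketch below applies that rule to local, trimmed copies of the documented structs. The function `looksLikeLexiconEntry` is hypothetical and is not necessarily how IsLexiconEntry is implemented in the package; the sample tokens are made up.

```go
package main

import "fmt"

// Candidate and Token are local copies of the documented structs.
// Candidate is trimmed to the fields this sketch needs.
type Candidate struct {
	Suggestion string
	Distance   int
}

type Token struct {
	OCR        string
	Cor        string
	GT         string
	Candidates []Candidate
}

// looksLikeLexiconEntry (hypothetical) applies the rule stated in the
// Token description: one candidate with a Levenshtein distance of 0.
func looksLikeLexiconEntry(t Token) bool {
	return len(t.Candidates) == 1 && t.Candidates[0].Distance == 0
}

func main() {
	// Illustrative tokens, not real profiler output.
	tokens := []Token{
		{OCR: "und", Candidates: []Candidate{{Suggestion: "und", Distance: 0}}},
		{OCR: "vnd", Candidates: []Candidate{{Suggestion: "und", Distance: 1}}},
	}
	for _, t := range tokens {
		fmt.Printf("%s lexicon=%t\n", t.OCR, looksLikeLexiconEntry(t))
	}
}
```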