gotess

v0.5.0
Published: Feb 5, 2022 License: MIT Imports: 5 Imported by: 0

README

gotess

A simple API wrapper for the Tesseract OCR engine.

Build

Gotess requires cgo to be built. Make sure that the tesseract development headers, the tesseract library, the leptonica development headers and the leptonica library are installed before attempting to build gotess.

Testing

Run go test to run the tests. Make sure that the two environment variables TESSDATA_PREFIX and TESSDATA_LANGUAGE are set accordingly.
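The check below is a minimal sketch of how a test setup could read these two variables before touching Tesseract. The helper name tessdataConfig is hypothetical, and the README does not say how the tests consume the values; the sketch only verifies that both are present and non-empty.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// tessdataConfig reads the two variables named in the Testing section.
// The lookup parameter has the same shape as os.LookupEnv, so the function
// can be exercised without touching the real environment.
// Hypothetical helper: it is not part of gotess.
func tessdataConfig(lookup func(string) (string, bool)) (string, string, error) {
	datapath, ok := lookup("TESSDATA_PREFIX")
	if !ok || datapath == "" {
		return "", "", errors.New("TESSDATA_PREFIX is not set")
	}
	language, ok := lookup("TESSDATA_LANGUAGE")
	if !ok || language == "" {
		return "", "", errors.New("TESSDATA_LANGUAGE is not set")
	}
	return datapath, language, nil
}

func main() {
	datapath, language, err := tessdataConfig(os.LookupEnv)
	if err != nil {
		fmt.Println("cannot run the tests:", err)
		return
	}
	fmt.Printf("datapath=%s language=%s\n", datapath, language)
}
```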

Documentation

Types

type API

type API struct {
	// contains filtered or unexported fields
}

API wraps a tesseract api handle.

func New

func New(datapath, language string) (API, error)

New creates a new tesseract api handle using the given datapath and language. The handle's page segmentation mode is set to line mode (PSM_SINGLE_LINE) and its OCR engine mode to default (OEM_DEFAULT).

Used function(s):
TessBaseAPI* TessBaseAPICreate();
int TessBaseAPIInit2(TessBaseAPI* handle, const char* datapath, const char* language, TessOcrEngineMode oem);
void TessBaseAPISetPageSegMode(TessBaseAPI* handle, TessPageSegMode mode);

func (API) Close

func (api API) Close() error

Close cleans up the api handle. It never returns an error.

Used function(s): void TessBaseAPIDelete(TessBaseAPI* handle);

func (API) GetBoolVariable added in v0.2.0

func (api API) GetBoolVariable(key string) (bool, bool)

GetBoolVariable gets a configuration variable from the api. It returns the value and whether the lookup was successful.

Used function(s): BOOL TessBaseAPIGetBoolVariable(TessBaseAPI* handle, const char* name, BOOL* value);

func (API) GetDoubleVariable added in v0.2.0

func (api API) GetDoubleVariable(key string) (float64, bool)

GetDoubleVariable gets a configuration variable from the api. It returns the value and whether the lookup was successful.

Used function(s): BOOL TessBaseAPIGetDoubleVariable(TessBaseAPI* handle, const char* name, double* value);

func (API) GetIntVariable added in v0.2.0

func (api API) GetIntVariable(key string) (int, bool)

GetIntVariable gets a configuration variable from the api. It returns the value and whether the lookup was successful.

Used function(s): BOOL TessBaseAPIGetIntVariable(TessBaseAPI* handle, const char* name, int* value);

func (API) GetVariable added in v0.2.0

func (api API) GetVariable(key string) (string, bool)

GetVariable gets a configuration variable from the api. It returns the value and whether the lookup was successful.

Used function(s): const char* TessBaseAPIGetStringVariable(TessBaseAPI* handle, const char* name);

func (API) Line

func (api API) Line() string

Line returns the recognized line as a UTF-8 encoded string.

Deprecated: use Text() instead.

func (*API) Recognize added in v0.3.0

func (api *API) Recognize() error

Recognize runs the recognition over the line. It has to be called after the image has been set.

Used function(s): char* TessBaseAPIGetUTF8Text(TessBaseAPI* handle);
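The required call order (set the image, then recognize, then read the text) can be sketched as below. To keep the example runnable without cgo or Tesseract, it is written against a small interface, recognizer, whose method signatures are copied from this page; recognizer, readPNG and fakeRecognizer are hypothetical names, and the fake only imitates the documented ordering rule.

```go
package main

import (
	"errors"
	"fmt"
)

// recognizer lists the documented API methods this sketch relies on.
// The signatures are copied from this page; a *API has all four methods.
type recognizer interface {
	SetImagePNG(path string) error
	Recognize() error
	Text() string
	Close() error
}

// readPNG follows the documented order: set the image first, then run the
// recognition, then read the recognized text.
func readPNG(r recognizer, path string) (string, error) {
	if err := r.SetImagePNG(path); err != nil {
		return "", fmt.Errorf("set image %q: %w", path, err)
	}
	if err := r.Recognize(); err != nil {
		return "", fmt.Errorf("recognize %q: %w", path, err)
	}
	return r.Text(), nil
}

// fakeRecognizer is a hypothetical stand-in used only to make the sketch
// runnable. It rejects Recognize when no image has been set.
type fakeRecognizer struct {
	image string
	text  string
}

func (f *fakeRecognizer) SetImagePNG(path string) error {
	if path == "" {
		return errors.New("empty path")
	}
	f.image = path
	return nil
}

func (f *fakeRecognizer) Recognize() error {
	if f.image == "" {
		return errors.New("no image set")
	}
	f.text = "text of " + f.image
	return nil
}

func (f *fakeRecognizer) Text() string { return f.text }

func (f *fakeRecognizer) Close() error { return nil }

func main() {
	r := &fakeRecognizer{}
	defer r.Close()

	text, err := readPNG(r, "line.png")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

With the real package, the value passed to readPNG would presumably be a pointer to the API returned by New, since Recognize is declared on *API.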

func (API) ScanVariable added in v0.2.0

func (api API) ScanVariable(key string, val interface{}) bool

ScanVariable scans a variable into a value of type *string, *int, *float32, *float64 or *bool. It returns whether the lookup was successful.
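One way such a scan can be built on top of the typed getters is a type switch over the pointer argument. The following is a sketch under that assumption, not the package's actual implementation: scanVariable, variableSource and fakeSource are hypothetical names, the getter signatures are copied from this page, and the sample values in main are made up.

```go
package main

import "fmt"

// variableSource lists the four typed getters documented for API.
type variableSource interface {
	GetVariable(key string) (string, bool)
	GetIntVariable(key string) (int, bool)
	GetDoubleVariable(key string) (float64, bool)
	GetBoolVariable(key string) (bool, bool)
}

// scanVariable stores the variable named key in the value val points to.
// It reports false for a failed lookup or an unsupported pointer type.
func scanVariable(src variableSource, key string, val interface{}) bool {
	switch p := val.(type) {
	case *string:
		v, ok := src.GetVariable(key)
		if ok {
			*p = v
		}
		return ok
	case *int:
		v, ok := src.GetIntVariable(key)
		if ok {
			*p = v
		}
		return ok
	case *float32:
		v, ok := src.GetDoubleVariable(key)
		if ok {
			*p = float32(v) // narrowed from the double getter
		}
		return ok
	case *float64:
		v, ok := src.GetDoubleVariable(key)
		if ok {
			*p = v
		}
		return ok
	case *bool:
		v, ok := src.GetBoolVariable(key)
		if ok {
			*p = v
		}
		return ok
	default:
		return false
	}
}

// fakeSource is a hypothetical in-memory stand-in for an API handle.
type fakeSource struct {
	strs    map[string]string
	ints    map[string]int
	doubles map[string]float64
	bools   map[string]bool
}

func (f fakeSource) GetVariable(key string) (string, bool) {
	v, ok := f.strs[key]
	return v, ok
}

func (f fakeSource) GetIntVariable(key string) (int, bool) {
	v, ok := f.ints[key]
	return v, ok
}

func (f fakeSource) GetDoubleVariable(key string) (float64, bool) {
	v, ok := f.doubles[key]
	return v, ok
}

func (f fakeSource) GetBoolVariable(key string) (bool, bool) {
	v, ok := f.bools[key]
	return v, ok
}

func main() {
	src := fakeSource{ints: map[string]int{"user_defined_dpi": 300}}

	var dpi int
	if scanVariable(src, "user_defined_dpi", &dpi) {
		fmt.Println("dpi:", dpi)
	}
}
```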

func (API) SetImagePNG

func (api API) SetImagePNG(path string) error

SetImagePNG sets the input PNG image file.

Used function(s): void TessBaseAPISetImage2(TessBaseAPI* handle, struct Pix* pix);

func (API) SetPSM added in v0.5.0

func (api API) SetPSM(mode PSMType)

SetPSM sets the page segmentation mode.

Used function(s): void TessBaseAPISetPageSegMode(TessBaseAPI* handle, TessPageSegMode mode);

func (API) SetVariable added in v0.2.0

func (api API) SetVariable(key, val string) bool

SetVariable sets an internal configuration variable in the api. Returns false if the lookup failed.

Used function(s): BOOL TessBaseAPISetVariable(TessBaseAPI* handle, const char* name, const char* value);

func (API) SymbolIterator

func (api API) SymbolIterator() *SymbolIterator

SymbolIterator returns a new result iterator.

Used function(s): TessResultIterator* TessBaseAPIGetIterator(TessBaseAPI* handle);

func (API) Text added in v0.5.0

func (api API) Text() string

Text returns the recognized text as a UTF-8 encoded string. This function should be preferred over the deprecated Line() function.

type ChoiceIterator

type ChoiceIterator struct {
	// contains filtered or unexported fields
}

ChoiceIterator wraps a choice iterator to iterate over alternative recognition results. Unlike tesseract's choice iterator, this one skips the result iterator's own result and reports only true alternatives.

func (*ChoiceIterator) Close

func (it *ChoiceIterator) Close() error

Close cleans up the choice iterator. Close never returns an error.

func (*ChoiceIterator) Err

func (it *ChoiceIterator) Err() error

Err returns the active error if any. EOF is ignored.

func (*ChoiceIterator) Get added in v0.3.0

func (it *ChoiceIterator) Get() (string, float32)

Get returns the current symbol and its confidence. The confidence lies between 0 and 1 (unlike tesseract's confidences, which lie between 0 and 100).

func (*ChoiceIterator) Next

func (it *ChoiceIterator) Next() bool

Next advances the iterator.
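As an illustration of how the alternatives might be consumed, the sketch below picks the alternative with the highest confidence. It assumes a bufio.Scanner-style loop in which Next is called before the first Get; this page does not state where the iterator starts, so treat that as an assumption. The names alternatives, bestAlternative and fakeChoices are hypothetical, and the fake replaces the real iterator so the code runs without Tesseract.

```go
package main

import "fmt"

// alternatives lists the documented ChoiceIterator methods used here.
type alternatives interface {
	Next() bool
	Get() (string, float32)
	Err() error
}

// bestAlternative returns the alternative with the highest confidence.
// found is false when the iterator yields nothing.
func bestAlternative(it alternatives) (symbol string, conf float32, found bool, err error) {
	for it.Next() { // assumption: Next must be called before the first Get
		s, c := it.Get()
		if !found || c > conf {
			symbol, conf, found = s, c, true
		}
	}
	return symbol, conf, found, it.Err()
}

// choice is one alternative with a confidence between 0 and 1.
type choice struct {
	symbol string
	conf   float32
}

// fakeChoices is a hypothetical in-memory iterator used to run the sketch.
type fakeChoices struct {
	items []choice
	pos   int // number of elements consumed so far
}

func (f *fakeChoices) Next() bool {
	if f.pos >= len(f.items) {
		return false
	}
	f.pos++
	return true
}

func (f *fakeChoices) Get() (string, float32) {
	c := f.items[f.pos-1]
	return c.symbol, c.conf
}

func (f *fakeChoices) Err() error { return nil }

func main() {
	it := &fakeChoices{items: []choice{{"O", 0.25}, {"0", 0.5}, {"Q", 0.125}}}
	symbol, conf, found, err := bestAlternative(it)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if found {
		fmt.Printf("best alternative: %s (%.3f)\n", symbol, conf)
	}
}
```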

type PSMType added in v0.5.0

type PSMType int
const (
	PSMOSDOnly             PSMType = C.PSM_OSD_ONLY               // Orientation and script detection only (OSD).
	PSMAutoOSD             PSMType = C.PSM_AUTO_OSD               // Automatic page segmentation with orientation and script detection.
	PSMAutoOnly            PSMType = C.PSM_AUTO_ONLY              // Automatic page segmentation, but no OSD, or OCR.
	PSMAuto                PSMType = C.PSM_AUTO                   // Fully automatic page segmentation, but no OSD.
	PSMSingleColumn        PSMType = C.PSM_SINGLE_COLUMN          // Assume a single column of text of variable sizes.
	PSMSingleBlockVertText PSMType = C.PSM_SINGLE_BLOCK_VERT_TEXT // Assume a single uniform block of vertically aligned text.
	PSMSingleBlock         PSMType = C.PSM_SINGLE_BLOCK           // Assume a single uniform block of text. (Default.)
	PSMSingleLine          PSMType = C.PSM_SINGLE_LINE            // Treat the image as a single text line.
	PSMSingleWord          PSMType = C.PSM_SINGLE_WORD            // Treat the image as a single word.
	PSMCircleWord          PSMType = C.PSM_CIRCLE_WORD            // Treat the image as a single word in a circle.
	PSMSingleChar          PSMType = C.PSM_SINGLE_CHAR            // Treat the image as a single character.
	PSMSparseText          PSMType = C.PSM_SPARSE_TEXT            // Find as much text as possible in no particular order.
	PSMSparseTextOSD       PSMType = C.PSM_SPARSE_TEXT_OSD        // Sparse text with orientation and script det.
	PSMRawLine             PSMType = C.PSM_RAW_LINE               // Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
)

type SymbolIterator

type SymbolIterator struct {
	// contains filtered or unexported fields
}

SymbolIterator is used to iterate over the symbols in a line.

func (*SymbolIterator) ChoiceIterator

func (it *SymbolIterator) ChoiceIterator() *ChoiceIterator

ChoiceIterator returns a choice iterator that iterates over recognition results.

func (*SymbolIterator) Close

func (it *SymbolIterator) Close() error

Close cleans up the result iterator. It never returns an error.

Used function(s): void TessResultIteratorDelete(TessResultIterator* handle);

func (*SymbolIterator) Err

func (it *SymbolIterator) Err() error

Err returns the active error or nil.

func (*SymbolIterator) Get added in v0.3.0

func (it *SymbolIterator) Get() (string, float32)

Get returns the current symbol and its confidence. The confidence lies between 0 and 1 (unlike tesseract's confidences, which lie between 0 and 100).

func (*SymbolIterator) Next

func (it *SymbolIterator) Next() bool

Next advances the result iterator to the next level instance. Returns false if at the end of the line or if an error was encountered. Use Err() to check for any error after the iteration.

Used function(s): BOOL TessResultIteratorNext(TessResultIterator* handle, TessPageIteratorLevel level);
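The iterate-then-check-Err pattern described for Next can be sketched as follows, here collecting the symbols whose confidence falls below a threshold. The method signatures are copied from this page, but symbols, lowConfidence and fakeSymbols are hypothetical names, and the loop assumes that Next has to be called before the first Get, which the documentation does not spell out.

```go
package main

import "fmt"

// symbols lists the documented SymbolIterator methods used here.
type symbols interface {
	Next() bool
	Get() (string, float32)
	Err() error
	Close() error
}

// lowConfidence collects every symbol whose confidence (0 to 1) is below
// threshold. Err is checked once, after the loop has ended.
func lowConfidence(it symbols, threshold float32) ([]string, error) {
	var out []string
	for it.Next() { // assumption: Next must be called before the first Get
		s, conf := it.Get()
		if conf < threshold {
			out = append(out, s)
		}
	}
	if err := it.Err(); err != nil {
		return nil, err
	}
	return out, nil
}

// item is one recognized symbol with its confidence.
type item struct {
	symbol string
	conf   float32
}

// fakeSymbols is a hypothetical in-memory iterator used to run the sketch.
type fakeSymbols struct {
	items []item
	pos   int // number of elements consumed so far
}

func (f *fakeSymbols) Next() bool {
	if f.pos >= len(f.items) {
		return false
	}
	f.pos++
	return true
}

func (f *fakeSymbols) Get() (string, float32) {
	it := f.items[f.pos-1]
	return it.symbol, it.conf
}

func (f *fakeSymbols) Err() error { return nil }

func (f *fakeSymbols) Close() error { return nil }

func main() {
	it := &fakeSymbols{items: []item{{"H", 0.875}, {"e", 0.25}, {"y", 0.5}}}
	defer it.Close() // the caller that obtained the iterator closes it

	doubtful, err := lowConfidence(it, 0.5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("symbols below 0.5:", doubtful)
}
```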
