classer

package
v0.0.0-...-73e6ce1
Warning

This package is not in the latest version of its module.

Published: May 27, 2023 License: AGPL-3.0, AGPL-3.0-only Imports: 6 Imported by: 0

Documentation

Overview

Package classer contains a Leptonica-like implementation of the JBIG2 classifier.

Constants

const (
	MaxDiffWidth  = 2
	MaxDiffHeight = 2
)

For PixHausTest, PixRankHausTest and PixCorrelationScore these values should be 2 or greater.

const (
	// MaxConnCompWidth is the default max cc width.
	MaxConnCompWidth = 350
	// MaxCharCompWidth is the default max char width.
	MaxCharCompWidth = 350
	// MaxWordCompWidth is the default max word width.
	MaxWordCompWidth = 1000
	// MaxCompHeight is the default max component height.
	MaxCompHeight = 120
)
const JbAddedPixels = 6

JbAddedPixels is the size, in pixels, of the border added around the bitmap of each connected component (c.c.) for further processing.
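The effect of the border on component dimensions can be sketched with simple arithmetic. This is a minimal illustration that assumes the border is added on all four sides, as Leptonica does; the helper `borderedSize` is hypothetical and not part of this package.

```go
package main

import "fmt"

// jbAddedPixels mirrors the value of classer.JbAddedPixels.
const jbAddedPixels = 6

// borderedSize is a hypothetical helper. It returns the dimensions of a
// component after a border of jbAddedPixels is added on every side
// (assumption: the border is symmetric).
func borderedSize(w, h int) (int, int) {
	return w + 2*jbAddedPixels, h + 2*jbAddedPixels
}

func main() {
	w, h := borderedSize(20, 30)
	fmt.Println(w, h) // prints "32 42"
}
```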

Variables

var TwoByTwoWalk = []int{
	0, 0,
	0, 1,
	-1, 0,
	0, -1,
	1, 0,
	-1, 1,
	1, 1,
	-1, -1,
	1, -1,
	0, -2,
	2, 0,
	0, 2,
	-2, 0,
	-1, -2,
	1, -2,
	2, -1,
	2, 1,
	1, 2,
	-1, 2,
	-2, 1,
	-2, -1,
	-2, -2,
	2, -2,
	2, 2,
	-2, 2,
}

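TwoByTwoWalk is a flat slice of 25 offset pairs used for classified encoding. The pairs cover every combination of values in [-2, 2] and are ordered by non-decreasing distance from (0, 0).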
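Because the slice is flat, consumers read it two values at a time. The sketch below copies the values into a local variable and decodes them into pairs; the `offset` type and `walkOffsets` helper are hypothetical names introduced for illustration. In Leptonica the equivalent table holds (width, height) size differences tried when searching for similar-sized templates; that this package uses the pairs the same way is an assumption.

```go
package main

import "fmt"

// twoByTwoWalk mirrors the values of classer.TwoByTwoWalk.
var twoByTwoWalk = []int{
	0, 0, 0, 1, -1, 0, 0, -1, 1, 0,
	-1, 1, 1, 1, -1, -1, 1, -1,
	0, -2, 2, 0, 0, 2, -2, 0,
	-1, -2, 1, -2, 2, -1, 2, 1, 1, 2, -1, 2, -2, 1, -2, -1,
	-2, -2, 2, -2, 2, 2, -2, 2,
}

// offset is a hypothetical type holding one pair from the flat slice.
type offset struct{ A, B int }

// walkOffsets decodes a flat slice into consecutive pairs.
func walkOffsets(flat []int) []offset {
	out := make([]offset, 0, len(flat)/2)
	for i := 0; i+1 < len(flat); i += 2 {
		out = append(out, offset{flat[i], flat[i+1]})
	}
	return out
}

func main() {
	for i, o := range walkOffsets(twoByTwoWalk) {
		// The squared distance from the origin never decreases along the walk.
		fmt.Printf("%2d: (%d, %d) dist^2=%d\n", i, o.A, o.B, o.A*o.A+o.B*o.B)
	}
}
```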

Types

type Classer

type Classer struct {
	// BaseIndex is number of components already processed on fully processed pages.
	BaseIndex int
	// Settings are current classer settings.
	Settings Settings

	// ComponentsNumber is the number of components on each page - 'nacomps'.
	// For each page added to the classer, a new entry holding that page's component count is appended.
	ComponentsNumber *basic.IntSlice
	// Width * Height of each template without extra border pixels - 'naarea'.
	TemplateAreas *basic.IntSlice

	// Widths is max width of original src images.
	Widths map[int]int
	// Heights is max height of original src images.
	Heights map[int]int

	// NumberOfClasses is the current number of classes - 'nclass'.
	NumberOfClasses int
	// ClassInstances is the slice of bitmaps for each class. Unbordered - 'pixaa'.
	ClassInstances *bitmap.BitmapsArray
	// UndilatedTemplates for each class. Bordered and not dilated - 'pixat'.
	UndilatedTemplates *bitmap.Bitmaps
	// DilatedTemplates for each class. Bordered and dilated - 'pixatd'.
	DilatedTemplates *bitmap.Bitmaps

	// Hash table to find templates by their size - 'dahash'.
	TemplatesSize basic.IntsMap
	// FgTemplates - foreground areas of undilated templates. Used for rank < 1.0 - 'nafgt'.
	FgTemplates *basic.NumSlice

	// CentroidPoints are the centroids of all bordered connected components.
	CentroidPoints *bitmap.Points
	// CentroidPointsTemplates are the centroids of all bordered template connected components.
	CentroidPointsTemplates *bitmap.Points
	// ClassIDs is the slice of class ids for each component - 'naclass'.
	ClassIDs *basic.IntSlice
	// ComponentPageNumbers is the slice of page numbers for each component - 'napage'.
	// The index is the component id.
	ComponentPageNumbers *basic.IntSlice
	// PtaUL is the slice of UL corners at which the template
	// is to be placed for each component.
	PtaUL *bitmap.Points
	// PtaLL is the slice of LL corners at which the template
	// is to be placed for each component.
	PtaLL *bitmap.Points
}

Classer holds all the data accumulated during the classification process that can be used for a compressed jbig2-type representation of a set of images.
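Several Classer fields are parallel per-component slices. The sketch below uses plain `[]int` stand-ins for `ClassIDs` and `ComponentPageNumbers` to show how such slices can be read together. It assumes both are indexed by component id (the source states this only for `ComponentPageNumbers`); the helper names are hypothetical and the real fields are `*basic.IntSlice` values.

```go
package main

import "fmt"

// componentsByClass groups component ids by their class id.
// classIDs[i] is assumed to be the class id of component i.
func componentsByClass(classIDs []int) map[int][]int {
	out := make(map[int][]int)
	for compID, classID := range classIDs {
		out[classID] = append(out[classID], compID)
	}
	return out
}

// componentsPerPage counts the components on each page.
// pageNumbers[i] is the page number of component i.
func componentsPerPage(pageNumbers []int) map[int]int {
	out := make(map[int]int)
	for _, page := range pageNumbers {
		out[page]++
	}
	return out
}

func main() {
	classIDs := []int{0, 1, 0, 2, 1} // made-up sample data
	pages := []int{0, 0, 0, 1, 1}    // made-up sample data

	fmt.Println(componentsByClass(classIDs)) // prints "map[0:[0 2] 1:[1 4] 2:[3]]"
	fmt.Println(componentsPerPage(pages))    // prints "map[0:3 1:2]"
}
```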

func Init

func Init(settings Settings) (*Classer, error)

Init initializes the classer with the provided settings.

func (*Classer) AddPage

func (c *Classer) AddPage(inputPage *bitmap.Bitmap, pageNumber int, method Method) (err error)

AddPage adds the 'inputPage' to the classer 'c'.

func (*Classer) ComputeLLCorners

func (c *Classer) ComputeLLCorners() (err error)

ComputeLLCorners computes the position of the LL (lower left) corners.

type Method

type Method int

Method is the enum of the encoding methods.

const (
	RankHaus Method = iota
	Correlation
)

Enum definitions of the encoding methods.
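Since the constants are declared with `iota`, `RankHaus` is 0 and `Correlation` is 1, which also makes `RankHaus` the zero value of `Method`. The sketch below redeclares the enum locally to show this; the `String` method is a hypothetical addition for readable output and is not documented as part of this package.

```go
package main

import "fmt"

// Method is a local copy of the enum declared in the package.
type Method int

const (
	RankHaus Method = iota
	Correlation
)

// String is a hypothetical helper added only for this example.
func (m Method) String() string {
	switch m {
	case RankHaus:
		return "RankHaus"
	case Correlation:
		return "Correlation"
	default:
		return fmt.Sprintf("Method(%d)", int(m))
	}
}

func main() {
	var zero Method // the zero value selects RankHaus
	fmt.Println(int(zero), zero)               // prints "0 RankHaus"
	fmt.Println(int(Correlation), Correlation) // prints "1 Correlation"
}
```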

type Settings

type Settings struct {
	// MaxCompWidth is max component width allowed.
	MaxCompWidth int
	// MaxCompHeight is max component height allowed.
	MaxCompHeight int
	// SizeHaus is the size of the square structuring element for the Hausdorff method.
	SizeHaus int
	// RankHaus is the rank value of the Hausdorff method match.
	RankHaus float64
	// Thresh is the threshold value for the correlation score.
	Thresh float64
	// Corrects thresh value for heavier components; 0 for no correction.
	WeightFactor float64
	// KeepClassInstances is a flag that defines if the class instances should be stored
	// in the 'ClassInstances' BitmapsArray.
	KeepClassInstances bool
	// Components is the component type used for the classification.
	Components bitmap.Component
	// Method is the encoding method.
	Method Method
}

Settings keeps the settings for the classer.

func DefaultSettings

func DefaultSettings() Settings

DefaultSettings returns default settings struct.

func (*Settings) SetDefault

func (s *Settings) SetDefault()

SetDefault sets the default value for the settings.

func (Settings) Validate

func (s Settings) Validate() error

Validate validates the settings input.
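The source does not list the rules Validate applies. As a rough sketch of what range checks on these fields can look like, the example below uses the bounds that Leptonica's `jbRankHausInit` and `jbCorrelationInit` enforce. Whether this package applies the same bounds is an assumption, and the local `settings` type, its `validate` method, and the sample values are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// settings is a local stand-in holding a subset of the Settings fields.
type settings struct {
	SizeHaus     int
	RankHaus     float64
	Thresh       float64
	WeightFactor float64
}

// validate applies Leptonica's parameter ranges (assumed, not confirmed
// for this package) and reports the first violation found.
func (s settings) validate() error {
	if s.SizeHaus < 1 || s.SizeHaus > 10 {
		return errors.New("SizeHaus not in range [1, 10]")
	}
	if s.RankHaus < 0.5 || s.RankHaus > 1.0 {
		return errors.New("RankHaus not in range [0.5, 1.0]")
	}
	if s.Thresh < 0.4 || s.Thresh > 0.98 {
		return errors.New("Thresh not in range [0.4, 0.98]")
	}
	if s.WeightFactor < 0.0 || s.WeightFactor > 1.0 {
		return errors.New("WeightFactor not in range [0.0, 1.0]")
	}
	return nil
}

func main() {
	// Arbitrary sample values, not the package defaults.
	good := settings{SizeHaus: 2, RankHaus: 0.97, Thresh: 0.85, WeightFactor: 0.75}
	fmt.Println(good.validate()) // prints "<nil>"

	bad := good
	bad.Thresh = 1.5
	fmt.Println(bad.validate())
}
```

With the real package, the documented flow would be to obtain a Settings value from DefaultSettings, adjust fields as needed, and pass it to Init, which returns an error alongside the classer.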
