Documentation ¶
Constants ¶
const (
	// MIMEType defines the mime-type of page XML files.
	// See: https://github.com/PRImA-Research-Lab/PAGE-XML
	MIMEType = "application/alto+xml"
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Line ¶ added in v0.4.0
type Line struct {
// contains filtered or unexported fields
}
Line represents a line of text in the page XML file.
func (Line) FindWordByID ¶ added in v0.4.0
FindWordByID searches for a word with the given ID.
func (Line) TextEquivUnicodeAt ¶ added in v0.4.0
TextEquivUnicodeAt returns the i-th TextEquiv/Unicode entry (indexing is zero-based).
type Match ¶ added in v0.5.0
type Match struct {
RegionID, LineID, WordID string
}
Match is used to match text regions. If any of the IDs is the empty string, the corresponding region is ignored.
type Page ¶
type Page struct {
// contains filtered or unexported fields
}
Page represents an open page XML file.
func (Page) Find ¶ added in v0.5.0
func (p Page) Find(m Match) (TextRegion, bool)
Find searches for a given {region,line,word}-ID in the PAGE-XML (IDs are assumed to be unique).
func (Page) FindRegionByID ¶ added in v0.4.0
FindRegionByID returns the region with the given ID.
type Polygon ¶ added in v0.5.0
Polygon represents the polygon described by the points attribute of a <Coords points='...'/> element in the PAGE-XML.
type Region ¶
type Region struct {
// contains filtered or unexported fields
}
Region defines a text region in the page XML file.
func (Region) FindLineByID ¶
FindLineByID searches for a line with the given ID.
type TextRegion ¶ added in v0.5.0
type TextRegion interface {
	ID() string
	TextEquivUnicodeAt(int) (string, bool)
	Polygon() (Polygon, error)
}
TextRegion defines an interface for abstract text regions in a PAGE-XML document.
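To illustrate how code might be written against this interface, here is a small self-contained sketch. The interface is redeclared locally so the example compiles on its own. The Polygon type shown ([]image.Point), the stubRegion type, and the describe function are all hypothetical stand-ins; the package's real types may look different.

```go
package main

import (
	"errors"
	"fmt"
	"image"
)

// Polygon is an assumed stand-in; the package's real definition is not shown here.
type Polygon []image.Point

// TextRegion mirrors the interface documented above.
type TextRegion interface {
	ID() string
	TextEquivUnicodeAt(int) (string, bool)
	Polygon() (Polygon, error)
}

// stubRegion is a hypothetical in-memory implementation used for illustration.
type stubRegion struct {
	id     string
	texts  []string
	coords Polygon
}

func (s stubRegion) ID() string { return s.id }

// TextEquivUnicodeAt returns the i-th text entry (zero-based) and
// false if i is out of range.
func (s stubRegion) TextEquivUnicodeAt(i int) (string, bool) {
	if i < 0 || i >= len(s.texts) {
		return "", false
	}
	return s.texts[i], true
}

// Polygon returns the stored coordinates or an error if there are none.
func (s stubRegion) Polygon() (Polygon, error) {
	if len(s.coords) == 0 {
		return nil, errors.New("no coordinates")
	}
	return s.coords, nil
}

// describe works with any TextRegion, whether it is a region, line, or word.
func describe(r TextRegion) string {
	text, ok := r.TextEquivUnicodeAt(0)
	if !ok {
		text = "(no text)"
	}
	return fmt.Sprintf("%s: %s", r.ID(), text)
}

func main() {
	var r TextRegion = stubRegion{id: "w1", texts: []string{"Hello"}}
	fmt.Println(describe(r))
}
```

Because callers depend only on the three methods, the same function can handle whatever concrete value a lookup such as Find returns.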