Documentation
Types
type Extractor
type Extractor struct {
// contains filtered or unexported fields
}
Extractor stores and offers functionality for extracting content from PDF pages.
func New
func New(contents string, f model.FontsByNames) *Extractor
New returns an Extractor instance for extracting content from the input PDF page.
func (*Extractor) ExtractText
ExtractText processes all text data in the content streams and returns it as a string. It takes character encoding into account via the CMaps in the PDF file. The text is processed linearly, i.e. in the order in which it appears, and spaces and newlines are added on a best-effort basis.
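To make "processed linearly" concrete, here is a minimal, self-contained sketch of walking a content stream from left to right and emitting text in the order it is encountered. It is not this package's implementation: extractTextSketch and readLiteral are hypothetical helpers written for illustration. They handle only literal strings shown with the Tj operator and line breaks from the T* and ' operators, and they ignore fonts, CMaps, TJ arrays, and space insertion entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// readLiteral (hypothetical helper) reads a PDF literal string that starts at
// the '(' at index start. It returns the string contents and the index just
// past the matching ')'. Simplification: an escaped byte is kept as-is, which
// is only right for \(, \) and \\.
func readLiteral(s string, start int) (string, int) {
	var b strings.Builder
	depth := 0
	for i := start; i < len(s); i++ {
		switch c := s[i]; c {
		case '\\':
			if i+1 < len(s) {
				i++
				b.WriteByte(s[i])
			}
		case '(':
			depth++
			if depth > 1 {
				b.WriteByte(c) // nested parenthesis is part of the text
			}
		case ')':
			depth--
			if depth == 0 {
				return b.String(), i + 1
			}
			b.WriteByte(c)
		default:
			b.WriteByte(c)
		}
	}
	return b.String(), len(s) // unterminated string: return what was read
}

// extractTextSketch (hypothetical, not the package's ExtractText) scans the
// content stream once, in order, and appends text as operators are seen.
func extractTextSketch(contents string) string {
	var out strings.Builder
	pending := "" // most recent string operand, waiting for its operator
	for i := 0; i < len(contents); {
		switch c := contents[i]; {
		case c == '(':
			pending, i = readLiteral(contents, i)
		case c == ' ' || c == '\n' || c == '\r' || c == '\t':
			i++
		default:
			// Read one token up to the next whitespace or string start.
			j := i
			for j < len(contents) && !strings.ContainsRune(" \n\r\t(", rune(contents[j])) {
				j++
			}
			switch contents[i:j] {
			case "Tj": // show the pending string
				out.WriteString(pending)
			case "T*": // move to the start of the next line
				out.WriteByte('\n')
			case "'": // move to the next line, then show the string
				out.WriteByte('\n')
				out.WriteString(pending)
			}
			pending = ""
			i = j
		}
	}
	return out.String()
}

func main() {
	stream := "BT /F1 12 Tf (Hello) Tj T* (World) Tj ET"
	fmt.Printf("%q\n", extractTextSketch(stream))
}
```

Because the output order follows the stream order, text that is drawn out of reading order on the page will also come out of order here, which is the trade-off that linear processing implies.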