Documentation ¶
Overview ¶
Package reader leverages a PDF file reader to read a file, analyze its structure, and build a high-level, in-memory representation as a `model.Document`.
Index ¶
- func DateTime(s string) (time.Time, bool)
- func DecodeTextString(s string) string
- func ParsePDFFile(filename string, options Options) (model.Document, *model.Encrypt, error)
- func ParsePDFReader(source io.ReadSeeker, options Options) (model.Document, *model.Encrypt, error)
- func ProcessContext(ctx file.PDFFile) (model.Document, *model.Encrypt, error)
- type CustomObjectResolver
- type Fl
- type Options
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func DateTime ¶
DateTime decodes s into a time.Time. It returns false if the string is not a valid date. See section 7.9.4, Dates, of the PDF specification.
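To illustrate the kind of input involved, here is a minimal sketch of parsing the PDF date form `D:YYYYMMDDHHmmSSOHH'mm'`, in which every field after the year is optional. The helper `parsePDFDate` is hypothetical and uses only the standard library; it is not the package's `DateTime` and may accept or reject inputs differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parsePDFDate is a hypothetical helper (not part of the reader package).
// It parses D:YYYYMMDDHHmmSSOHH'mm' and reports whether s was a valid date.
func parsePDFDate(s string) (time.Time, bool) {
	s = strings.TrimPrefix(s, "D:")
	// Split the digits of the date from the optional time zone suffix.
	zone := ""
	if i := strings.IndexAny(s, "+-Z"); i >= 0 {
		s, zone = s[:i], s[i:]
	}
	if len(s) < 4 || len(s) > 14 || len(s)%2 != 0 {
		return time.Time{}, false
	}
	// Year, month, day, hour, minute, second, preset to their defaults.
	f := [6]int{0, 1, 1, 0, 0, 0}
	width := 4 // the year has four digits, every later field has two
	for i, pos := 0, 0; pos < len(s); i++ {
		v, err := strconv.Atoi(s[pos : pos+width])
		if err != nil {
			return time.Time{}, false
		}
		f[i] = v
		pos += width
		width = 2
	}
	loc := time.UTC
	if zone != "" && zone[0] != 'Z' {
		digits := strings.ReplaceAll(zone[1:], "'", "")
		if len(digits) != 2 && len(digits) != 4 {
			return time.Time{}, false
		}
		hh, err := strconv.Atoi(digits[:2])
		if err != nil {
			return time.Time{}, false
		}
		mm := 0
		if len(digits) == 4 {
			if mm, err = strconv.Atoi(digits[2:]); err != nil {
				return time.Time{}, false
			}
		}
		offset := hh*3600 + mm*60
		if zone[0] == '-' {
			offset = -offset
		}
		loc = time.FixedZone("", offset)
	}
	t := time.Date(f[0], time.Month(f[1]), f[2], f[3], f[4], f[5], 0, loc)
	// time.Date normalizes out-of-range fields, so a changed field means invalid input.
	if t.Year() != f[0] || int(t.Month()) != f[1] || t.Day() != f[2] ||
		t.Hour() != f[3] || t.Minute() != f[4] || t.Second() != f[5] {
		return time.Time{}, false
	}
	return t, true
}

func main() {
	for _, s := range []string{"D:20230115103000+01'00'", "D:1999", "not a date"} {
		t, ok := parsePDFDate(s)
		fmt.Println(s, "->", t, ok)
	}
}
```

The boolean result mirrors the `(time.Time, bool)` shape of `DateTime`, so callers can fall back to a default when a document carries a malformed date.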
func DecodeTextString ¶
DecodeTextString expects a "text string" as defined in the PDF specification, that is, either a PDFDocEncoded string or a UTF-16BE string, and returns the corresponding UTF-8 string. Note that decryption, unescaping, and hex-decoding should already have been taken care of.
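The distinction between the two encodings can be sketched as follows. The helper `decodeTextStringSketch` is hypothetical and is not the package's implementation: it detects UTF-16BE by its byte order mark (0xFE 0xFF) and, as a deliberate simplification, treats every other input as Latin-1. PDFDocEncoding differs from Latin-1 for some byte values, notably in the 0x80-0x9F range, so a faithful decoder needs a full lookup table.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// decodeTextStringSketch is a hypothetical helper (not part of the reader package).
func decodeTextStringSketch(s string) string {
	b := []byte(s)
	// A UTF-16BE text string starts with the byte order mark 0xFE 0xFF.
	if len(b) >= 2 && b[0] == 0xFE && b[1] == 0xFF {
		b = b[2:]
		units := make([]uint16, 0, len(b)/2)
		for i := 0; i+1 < len(b); i += 2 {
			units = append(units, binary.BigEndian.Uint16(b[i:]))
		}
		// utf16.Decode also combines surrogate pairs into single runes.
		return string(utf16.Decode(units))
	}
	// Assumption: approximate PDFDocEncoding by Latin-1 (one byte, one code point).
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

func main() {
	fmt.Println(decodeTextStringSketch("\xfe\xff\x00H\x00i"))
	fmt.Println(decodeTextStringSketch("caf\xe9"))
}
```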
func ParsePDFFile ¶
ParsePDFFile opens a file and calls `ParsePDFReader`; see the latter for details.
func ParsePDFReader ¶
ParsePDFReader reads a PDF file and builds a model. This is done in two steps:
- a first parsing step (involving lexing and parsing) builds a tree object
- this tree is then interpreted according to the PDF specification, resolving indirect objects and transforming dynamic, opaque types into statically typed objects, building the returned `Document`.
Information about encryption is returned separately, and will be needed if you want to re-encrypt the document.
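The second step can be pictured with a toy model. In the sketch below, all types and functions (`object`, `ref`, `dict`, `info`, `resolve`, `interpret`) are hypothetical stand-ins rather than the package's own: a dynamic tree, as a parser might produce, is walked, indirect references are followed through an object table, and the result is stored in a statically typed struct.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical dynamic types standing in for the tree built by the first step.
type object interface{}
type ref int // indirect reference into the object table
type dict map[string]object

// info is a hypothetical statically typed result of the second step.
type info struct {
	Title string
	Pages int
}

// resolve follows indirect references until it reaches a direct object.
// The depth limit guards against reference cycles.
func resolve(table map[ref]object, o object) (object, error) {
	for depth := 0; depth < 32; depth++ {
		r, isRef := o.(ref)
		if !isRef {
			return o, nil
		}
		next, ok := table[r]
		if !ok {
			return nil, fmt.Errorf("missing object %d", r)
		}
		o = next
	}
	return nil, errors.New("reference chain too long")
}

// interpret turns the opaque root object into a typed value,
// checking the dynamic type of every entry it reads.
func interpret(table map[ref]object, root object) (info, error) {
	var out info
	o, err := resolve(table, root)
	if err != nil {
		return out, err
	}
	d, ok := o.(dict)
	if !ok {
		return out, errors.New("root is not a dictionary")
	}
	title, err := resolve(table, d["Title"])
	if err != nil {
		return out, err
	}
	if out.Title, ok = title.(string); !ok {
		return out, errors.New("Title is not a string")
	}
	pages, err := resolve(table, d["Pages"])
	if err != nil {
		return out, err
	}
	if out.Pages, ok = pages.(int); !ok {
		return out, errors.New("Pages is not an integer")
	}
	return out, nil
}

func main() {
	table := map[ref]object{
		1: dict{"Title": ref(2), "Pages": 3},
		2: "Hello",
	}
	fmt.Println(interpret(table, ref(1)))
}
```

Keeping the type checks in the interpretation step means code that consumes the typed result no longer needs type assertions or reference lookups.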
Types ¶
type CustomObjectResolver ¶
type CustomObjectResolver interface {
Resolve(f *file.PDFFile, obj model.Object) (model.Object, error)
}
CustomObjectResolver provides a way to override the default reading behaviour for custom objects.
type Options ¶
type Options struct {
	CustomObjectResolver CustomObjectResolver
	UserPassword         string
}
Options enables greater control over the processing. The zero value is a valid default configuration.
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
 | Package file builds upon a parser to read an existing PDF file, producing a tree of PDF objects. |
 | Implements a PDF object parser, mapping a list of tokens (see the tokenizer package) into a tree-like structure. |
filters | Package filters provides logic to handle binary data encoded with PDF filters, such as inline data images. |