types

package
v0.0.4
Published: Sep 21, 2017 License: MIT Imports: 12 Imported by: 0

Documentation

Overview

Package types provides the PDFContext, representing an ecosystem for PDF processing.

It implements the PDF specification, PDF 32000-1:2008.

Please refer to the spec for any documentation of PDFContext's internals.

Constants

const (

	// ValidationStrict ensures 100% compliance with the spec (PDF 32000-1:2008).
	ValidationStrict = 0

	// ValidationRelaxed ensures PDF compliance based on frequently encountered validation errors.
	ValidationRelaxed = 1

	// StatsFileNameDefault is the standard stats filename.
	StatsFileNameDefault = "stats.csv"
)
const (
	RootVersion = iota
	RootExtensions
	RootPageLabels
	RootNames
	RootDests
	RootViewerPrefs
	RootPageLayout
	RootPageMode
	RootOutlines
	RootThreads
	RootOpenAction
	RootAA
	RootURI
	RootAcroForm
	RootMetadata
	RootStructTreeRoot
	RootMarkInfo
	RootLang
	RootSpiderInfo
	RootOutputIntents
	RootPieceInfo
	RootOCProperties
	RootPerms
	RootLegal
	RootRequirements
	RootCollection
	RootNeedsRendering
)

The PDF root object fields.

const (
	PageLastModified = iota
	PageResources
	PageMediaBox
	PageCropBox
	PageBleedBox
	PageTrimBox
	PageArtBox
	PageBoxColorInfo
	PageContents
	PageRotate
	PageGroup
	PageThumb
	PageB
	PageDur
	PageTrans
	PageAnnots
	PageAA
	PageMetadata
	PagePieceInfo
	PageStructParents
	PageID
	PagePZ
	PageSeparationInfo
	PageTabs
	PageTemplateInstantiated
	PagePresSteps
	PageUserUnit
	PageVP
)

The PDF page object fields.

const (
	EolLF   = "\x0A"
	EolCR   = "\x0D"
	EolCRLF = "\x0D\x0A"

	FreeHeadGeneration = 65535
)

Supported line delimiters

Variables

This section is empty.

Functions

func DecodeUTF16String

func DecodeUTF16String(s string) (string, error)

DecodeUTF16String decodes a UTF16BE string from a hex string.
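
As an illustration of what decoding a UTF-16BE hex string involves, here is a minimal standard-library sketch. The function name decodeUTF16BEHex is hypothetical, and the requirement that the input starts with the byte order mark FE FF is an assumption; the package's own DecodeUTF16String may validate its input differently.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"unicode/utf16"
)

// decodeUTF16BEHex is a hypothetical stand-in, not the package's implementation.
// It decodes a hex string whose bytes are UTF-16BE text preceded by the BOM FE FF.
func decodeUTF16BEHex(s string) (string, error) {
	b, err := hex.DecodeString(s)
	if err != nil {
		return "", err
	}
	// Assumption: the big endian byte order mark is mandatory.
	if len(b) < 2 || b[0] != 0xFE || b[1] != 0xFF {
		return "", errors.New("missing UTF-16BE byte order mark")
	}
	b = b[2:]
	if len(b)%2 != 0 {
		return "", errors.New("odd number of bytes after byte order mark")
	}
	// Combine each big endian byte pair into one UTF-16 code unit.
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i < len(b); i += 2 {
		units = append(units, uint16(b[i])<<8|uint16(b[i+1]))
	}
	return string(utf16.Decode(units)), nil
}

func main() {
	s, err := decodeUTF16BEHex("FEFF00480069")
	fmt.Println(s, err)
}
```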

func HexLiteralToString

func HexLiteralToString(hexString string) (s string, err error)

HexLiteralToString returns a possibly UTF16 encoded string for a hex string.

func IsStringUTF16BE

func IsStringUTF16BE(s string) bool

IsStringUTF16BE checks for Big Endian byte order BOM in octal.

func IsUTF16BE

func IsUTF16BE(b []byte) (ok bool, err error)

IsUTF16BE checks for Big Endian byte order mark.

func StringLiteralToString

func StringLiteralToString(str string) (s string, err error)

StringLiteralToString returns the best possible string rep for a string literal.

func Verbose

func Verbose(verbose bool)

Verbose controls logging output.

func VersionString

func VersionString(version PDFVersion) string

VersionString returns a string representation for a given PDFVersion.

Types

type ByteSize

type ByteSize float64

ByteSize represents the various terms for storage space.

const (
	KB ByteSize = 1 << (10 * iota)
	MB
	GB
	TB
	PB
	EB
	ZB
	YB
)

Storage space terms.
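
The listing above shows KB as the first constant, which taken literally would make KB equal to 1 << 0. Rendered documentation omits blank identifiers, so the source presumably starts the block with a blank entry, as in the well-known Effective Go ByteSize idiom. The sketch below assumes exactly that; the lower-case names and the output format of String are illustrative, not the package's.

```go
package main

import "fmt"

// byteSize mirrors the documented ByteSize type in this sketch.
type byteSize float64

const (
	_           = iota // assumed blank entry, so that kb is 1 << 10 rather than 1 << 0
	kb byteSize = 1 << (10 * iota)
	mb
	gb
)

// String picks the largest unit that does not exceed the value.
// The float64 conversions avoid calling String recursively.
func (b byteSize) String() string {
	switch {
	case b >= gb:
		return fmt.Sprintf("%.2fGB", float64(b/gb))
	case b >= mb:
		return fmt.Sprintf("%.2fMB", float64(b/mb))
	case b >= kb:
		return fmt.Sprintf("%.2fKB", float64(b/kb))
	}
	return fmt.Sprintf("%.2fB", float64(b))
}

func main() {
	fmt.Println(byteSize(1536), mb)
}
```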

func (ByteSize) String

func (b ByteSize) String() string

type Configuration

type Configuration struct {

	// Enables PDF V1.5 compatible processing of object streams, xref streams, hybrid PDF files.
	Reader15 bool

	// Enables decoding of all streams (fontfiles, images..) for logging purposes.
	DecodeAllStreams bool

	// Validate against ISO-32000: strict or relaxed
	ValidationMode int

	// End of line char sequence for writing.
	Eol string

	// Turns on object stream generation.
	// A signal for compressing any new non-stream-object into an object stream.
	// true enforces WriteXRefStream to true.
	// false does not prevent xRefStream generation.
	WriteObjectStream bool

	// Switches between xRefSection (<=V1.4) and objectStream/xRefStream (>=V1.5) writing.
	WriteXRefStream bool

	// Turns on stats collection.
	CollectStats bool

	// A CSV-filename holding the statistics.
	StatsFileName string
}

Configuration of a PDFContext.

func NewDefaultConfiguration

func NewDefaultConfiguration() *Configuration

NewDefaultConfiguration returns the default pdfcpu configuration.

func (*Configuration) SetValidationRelaxed

func (c *Configuration) SetValidationRelaxed()

SetValidationRelaxed sets relaxed validation.

func (*Configuration) SetValidationStrict

func (c *Configuration) SetValidationStrict()

SetValidationStrict sets strict validation.

func (*Configuration) ValidationModeString

func (c *Configuration) ValidationModeString() string

ValidationModeString returns a string rep for the validation mode in effect.
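
The interplay between the validation constants and the stream-writing flags can be sketched as follows. This is a reduced, hypothetical model of Configuration: the strings returned by validationModeString and the normalize helper are assumptions made for illustration. Only the rule that WriteObjectStream enforces WriteXRefStream comes from the field comments above.

```go
package main

import "fmt"

const (
	validationStrict  = 0
	validationRelaxed = 1
)

// configuration is a reduced model of the documented Configuration struct.
type configuration struct {
	ValidationMode    int
	WriteObjectStream bool
	WriteXRefStream   bool
}

func (c *configuration) setValidationStrict()  { c.ValidationMode = validationStrict }
func (c *configuration) setValidationRelaxed() { c.ValidationMode = validationRelaxed }

// validationModeString returns a label for the mode; the exact words are assumed.
func (c *configuration) validationModeString() string {
	if c.ValidationMode == validationStrict {
		return "strict"
	}
	return "relaxed"
}

// normalize is a hypothetical helper applying the documented rule:
// writing object streams requires writing an xref stream.
func (c *configuration) normalize() {
	if c.WriteObjectStream {
		c.WriteXRefStream = true
	}
}

func main() {
	c := &configuration{WriteObjectStream: true}
	c.setValidationRelaxed()
	c.normalize()
	fmt.Println(c.validationModeString(), c.WriteXRefStream)
}
```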

type FontObject

type FontObject struct {
	ResourceNames []string
	Prefix        string
	FontName      string
	FontDict      *PDFDict
}

FontObject represents a font used in a PDF file.

func (*FontObject) AddResourceName

func (fo *FontObject) AddResourceName(resourceName string)

AddResourceName adds a resourceName referring to this font.

func (FontObject) BaseFont

func (fo FontObject) BaseFont() (*string, error)

BaseFont returns a string representation for the BaseFont of this font.

func (FontObject) Embedded

func (fo FontObject) Embedded() (embedded bool)

Embedded returns true if the font is embedded into this PDF file.

func (FontObject) Encoding

func (fo FontObject) Encoding() string

Encoding returns the Encoding of this font.

func (FontObject) ResourceNamesString

func (fo FontObject) ResourceNamesString() string

ResourceNamesString returns a string representation of all the resource names of this font.

func (FontObject) String

func (fo FontObject) String() string

func (FontObject) SubType

func (fo FontObject) SubType() string

SubType returns the SubType of this font.

type ImageObject

type ImageObject struct {
	ResourceNames []string
	ImageDict     *PDFStreamDict
}

ImageObject represents an image used in a PDF file.

func (*ImageObject) AddResourceName

func (io *ImageObject) AddResourceName(resourceName string)

AddResourceName adds a resourceName to this imageObject's ResourceNames dict.

func (ImageObject) ResourceNamesString

func (io ImageObject) ResourceNamesString() string

ResourceNamesString returns a string representation of the ResourceNames for this image.

type IntSet

type IntSet map[int]bool

IntSet is a set of integers.
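
A map[int]bool works as a set because looking up a missing key yields false. The sketch below shows that behavior on a local copy of the type; the sorted helper is hypothetical and only there because map iteration order is not defined.

```go
package main

import (
	"fmt"
	"sort"
)

// intSet mirrors the documented IntSet type.
type intSet map[int]bool

// sorted is a hypothetical helper returning the members in ascending order.
func (s intSet) sorted() []int {
	keys := make([]int, 0, len(s))
	for k, member := range s {
		if member {
			keys = append(keys, k)
		}
	}
	sort.Ints(keys)
	return keys
}

func main() {
	duplicates := intSet{}
	duplicates[12] = true
	duplicates[7] = true
	// A missing key reads as false, so no separate "contains" check is needed.
	fmt.Println(duplicates[7], duplicates[8], duplicates.sorted())
}
```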

type OptimizationContext

type OptimizationContext struct {

	// Font section
	PageFonts         []IntSet
	FontObjects       map[int]*FontObject
	Fonts             map[string]*[]int
	DuplicateFontObjs IntSet
	DuplicateFonts    map[int]*PDFDict

	// Image section
	PageImages         []IntSet
	ImageObjects       map[int]*ImageObject
	DuplicateImageObjs IntSet
	DuplicateImages    map[int]*PDFStreamDict

	DuplicateInfoObjects IntSet // Really a possible result of manual info dict modification.

	NonReferencedObjs []int // Objects that are not referenced.
}

OptimizationContext represents the context for the optimization of a PDF file.

func (*OptimizationContext) DuplicateFontObjectsString

func (oc *OptimizationContext) DuplicateFontObjectsString() (int, string)

DuplicateFontObjectsString returns a formatted string and the number of objs.

func (*OptimizationContext) DuplicateImageObjectsString

func (oc *OptimizationContext) DuplicateImageObjectsString() (int, string)

DuplicateImageObjectsString returns a formatted string and the number of objs.

func (*OptimizationContext) DuplicateInfoObjectsString

func (oc *OptimizationContext) DuplicateInfoObjectsString() (int, string)

DuplicateInfoObjectsString returns a formatted string and the number of objs.

func (*OptimizationContext) IsDuplicateFontObject

func (oc *OptimizationContext) IsDuplicateFontObject(i int) bool

IsDuplicateFontObject returns true if object #i is a duplicate font object.

func (*OptimizationContext) IsDuplicateImageObject

func (oc *OptimizationContext) IsDuplicateImageObject(i int) bool

IsDuplicateImageObject returns true if object #i is a duplicate image object.

func (*OptimizationContext) IsDuplicateInfoObject

func (oc *OptimizationContext) IsDuplicateInfoObject(i int) bool

IsDuplicateInfoObject returns true if object #i is a duplicate info object.

func (*OptimizationContext) NonReferencedObjsString

func (oc *OptimizationContext) NonReferencedObjsString() (int, string)

NonReferencedObjsString returns a formatted string and the number of objs.

type PDFArray

type PDFArray []interface{}

PDFArray represents a PDF array object.

func (PDFArray) PDFString

func (array PDFArray) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFArray) String

func (array PDFArray) String() string

type PDFBoolean

type PDFBoolean bool

PDFBoolean represents a PDF boolean object.

func (PDFBoolean) PDFString

func (boolean PDFBoolean) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFBoolean) String

func (boolean PDFBoolean) String() string

func (PDFBoolean) Value

func (boolean PDFBoolean) Value() bool

Value returns a bool value for this PDF object.

type PDFContext

type PDFContext struct {
	*Configuration
	*XRefTable
	Read     *ReadContext
	Optimize *OptimizationContext
	Write    *WriteContext
}

PDFContext represents the context for processing PDF files with pdfcpu.

func NewPDFContext

func NewPDFContext(fileName string, file *os.File, config *Configuration) (ctx *PDFContext, err error)

NewPDFContext initializes a new PDF context.

func (*PDFContext) ResetWriteContext

func (ctx *PDFContext) ResetWriteContext()

ResetWriteContext prepares an existing WriteContext for a new file to be written.

func (*PDFContext) String

func (ctx *PDFContext) String() string

type PDFDict

type PDFDict struct {
	Dict map[string]interface{}
}

PDFDict represents a PDF dict object.

func NewPDFDict

func NewPDFDict() PDFDict

NewPDFDict returns a new PDFDict object.

func (PDFDict) BooleanEntry

func (d PDFDict) BooleanEntry(key string) (b bool)

BooleanEntry expects and returns a BooleanEntry for given key.

func (*PDFDict) Delete

func (d *PDFDict) Delete(key string) (value interface{})

Delete deletes the PDFObject for given key.

func (PDFDict) Find

func (d PDFDict) Find(key string) (value interface{}, found bool)

Find returns the PDFObject for given key and PDFDict.

func (PDFDict) First

func (d PDFDict) First() *int

First returns a *int for key "First".

func (PDFDict) Index

func (d PDFDict) Index() *PDFArray

Index returns a *PDFArray for key "Index".

func (PDFDict) IndirectRefEntry

func (d PDFDict) IndirectRefEntry(key string) *PDFIndirectRef

IndirectRefEntry returns an indirectRefEntry for given key for this dictionary.

func (*PDFDict) Insert

func (d *PDFDict) Insert(key string, value interface{}) (ok bool)

Insert adds a new entry(key,value) to this PDFDict.

func (PDFDict) Int64Entry

func (d PDFDict) Int64Entry(key string) *int64

Int64Entry expects and returns a PDFInteger entry representing an int64 value for given key.

func (PDFDict) IntEntry

func (d PDFDict) IntEntry(key string) *int

IntEntry expects and returns a PDFInteger entry for given key.
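
The typed accessors on PDFDict return pointers, which lets a caller tell a missing or mistyped entry apart from a zero value. The following sketch models that pattern on a local dict type. It assumes that Insert reports false when the key already exists and that a typed accessor returns nil for a missing or mistyped entry; neither behavior is stated in the documentation above. Plain int stands in for PDFInteger.

```go
package main

import "fmt"

// dict is a reduced model of the documented PDFDict.
type dict struct {
	Dict map[string]interface{}
}

func newDict() dict {
	return dict{Dict: map[string]interface{}{}}
}

// find returns the value stored for key and whether it was present.
func (d dict) find(key string) (interface{}, bool) {
	v, found := d.Dict[key]
	return v, found
}

// insert adds a new entry. Assumption: it refuses to overwrite an existing key.
func (d *dict) insert(key string, value interface{}) bool {
	if _, found := d.find(key); found {
		return false
	}
	d.Dict[key] = value
	return true
}

// intEntry returns a pointer to the int stored for key.
// Assumption: nil signals a missing entry or an entry of another type.
func (d dict) intEntry(key string) *int {
	v, found := d.find(key)
	if !found {
		return nil
	}
	i, ok := v.(int)
	if !ok {
		return nil
	}
	return &i
}

func main() {
	d := newDict()
	fmt.Println(d.insert("Size", 42), d.insert("Size", 43))
	if size := d.intEntry("Size"); size != nil {
		fmt.Println(*size)
	}
	fmt.Println(d.intEntry("Missing") == nil)
}
```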

func (PDFDict) IsLinearizationParmDict

func (d PDFDict) IsLinearizationParmDict() bool

IsLinearizationParmDict returns true if this dict has an int entry for key "Linearized".

func (PDFDict) IsObjStm

func (d PDFDict) IsObjStm() bool

IsObjStm returns true if given PDFDict is an object stream.

func (*PDFDict) Len

func (d *PDFDict) Len() int

Len returns the length of this PDFDict.

func (PDFDict) Length

func (d PDFDict) Length() (*int64, *int)

Length returns a *int64 for entry with key "Length". Stream length may be referring to an indirect object.

func (PDFDict) N

func (d PDFDict) N() *int

N returns a *int for key "N".

func (PDFDict) NameEntry

func (d PDFDict) NameEntry(key string) *string

NameEntry expects and returns a PDFName entry for given key.

func (PDFDict) PDFArrayEntry

func (d PDFDict) PDFArrayEntry(key string) *PDFArray

PDFArrayEntry expects and returns a PDFArray entry for given key.

func (PDFDict) PDFDictEntry

func (d PDFDict) PDFDictEntry(key string) *PDFDict

PDFDictEntry expects and returns a PDFDict entry for given key.

func (PDFDict) PDFNameEntry

func (d PDFDict) PDFNameEntry(key string) *PDFName

PDFNameEntry returns a PDFName object for given key.

func (PDFDict) PDFStreamDictEntry

func (d PDFDict) PDFStreamDictEntry(key string) *PDFStreamDict

PDFStreamDictEntry expects and returns a PDFStreamDict entry for given key. Unused.

func (PDFDict) PDFString

func (d PDFDict) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFDict) PDFStringLiteralEntry

func (d PDFDict) PDFStringLiteralEntry(key string) *PDFStringLiteral

PDFStringLiteralEntry returns a PDFStringLiteral object for given key.

func (PDFDict) Prev

func (d PDFDict) Prev() *int64

Prev returns the previous offset.

func (PDFDict) Size

func (d PDFDict) Size() *int

Size returns the value of the int entry for key "Size".

func (PDFDict) String

func (d PDFDict) String() string

func (PDFDict) StringEntry

func (d PDFDict) StringEntry(key string) *string

StringEntry expects and returns a PDFStringLiteral entry for given key. Unused.

func (PDFDict) Subtype

func (d PDFDict) Subtype() *string

Subtype returns the value of the name entry for key "Subtype".

func (PDFDict) Type

func (d PDFDict) Type() *string

Type returns the value of the name entry for key "Type".

func (*PDFDict) Update

func (d *PDFDict) Update(key string, value interface{})

Update modifies an existing entry of this PDFDict.

func (PDFDict) W

func (d PDFDict) W() *PDFArray

W returns a *PDFArray for key "W".

type PDFFilter

type PDFFilter struct {
	Name        string
	DecodeParms *PDFDict
}

PDFFilter represents a PDF stream filter object.

type PDFFloat

type PDFFloat float64

PDFFloat represents a PDF float object.

func (PDFFloat) PDFString

func (f PDFFloat) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFFloat) String

func (f PDFFloat) String() string

func (PDFFloat) Value

func (f PDFFloat) Value() float64

Value returns a float64 value for this PDF object.

type PDFHexLiteral

type PDFHexLiteral string

PDFHexLiteral represents a PDF hex literal object.

func (PDFHexLiteral) PDFString

func (hexliteral PDFHexLiteral) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFHexLiteral) String

func (hexliteral PDFHexLiteral) String() string

func (PDFHexLiteral) Value

func (hexliteral PDFHexLiteral) Value() string

Value returns a string value for this PDF object.

type PDFIndirectRef

type PDFIndirectRef struct {
	ObjectNumber     PDFInteger
	GenerationNumber PDFInteger
}

PDFIndirectRef represents a PDF indirect reference, that is, a reference to an indirect object.

func NewPDFIndirectRef

func NewPDFIndirectRef(objectNumber, generationNumber int) PDFIndirectRef

NewPDFIndirectRef returns a new PDFIndirectRef object.

func (PDFIndirectRef) Equals

func (indirectRef PDFIndirectRef) Equals(indRef PDFIndirectRef) bool

Equals returns true if two indirect References refer to the same object.

func (PDFIndirectRef) PDFString

func (indirectRef PDFIndirectRef) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.
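
In PDF syntax an indirect reference is written as the object number, the generation number, and the keyword R, for example "12 0 R". The sketch below models that on a local type. It assumes that Equals compares both numbers, and it uses plain int fields in place of PDFInteger.

```go
package main

import "fmt"

// indirectRef is a reduced model of the documented PDFIndirectRef.
type indirectRef struct {
	ObjectNumber     int
	GenerationNumber int
}

// pdfString renders the reference in PDF syntax.
func (r indirectRef) pdfString() string {
	return fmt.Sprintf("%d %d R", r.ObjectNumber, r.GenerationNumber)
}

// equals reports whether both references point at the same object.
// Assumption: object number and generation number must both match.
func (r indirectRef) equals(other indirectRef) bool {
	return r.ObjectNumber == other.ObjectNumber &&
		r.GenerationNumber == other.GenerationNumber
}

func main() {
	a := indirectRef{ObjectNumber: 12}
	b := indirectRef{ObjectNumber: 12, GenerationNumber: 1}
	fmt.Println(a.pdfString(), a.equals(b))
}
```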

func (PDFIndirectRef) String

func (indirectRef PDFIndirectRef) String() string

type PDFInteger

type PDFInteger int

PDFInteger represents a PDF integer object.

func (PDFInteger) PDFString

func (i PDFInteger) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFInteger) String

func (i PDFInteger) String() string

func (PDFInteger) Value

func (i PDFInteger) Value() int

Value returns an int value for this PDF object.

type PDFName

type PDFName string

PDFName represents a PDF name object.

func (PDFName) PDFString

func (nameObject PDFName) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFName) String

func (nameObject PDFName) String() string

func (PDFName) Value

func (nameObject PDFName) Value() string

Value returns a string value for this PDF object.

type PDFObjectStreamDict

type PDFObjectStreamDict struct {
	PDFStreamDict
	Prolog         []byte
	ObjCount       int
	FirstObjOffset int
	ObjArray       PDFArray
}

PDFObjectStreamDict represents an object stream dictionary.

func NewPDFObjectStreamDict

func NewPDFObjectStreamDict() *PDFObjectStreamDict

NewPDFObjectStreamDict creates a new PDFObjectStreamDict object.

func (*PDFObjectStreamDict) AddObject

func (oStreamDict *PDFObjectStreamDict) AddObject(objNumber int, entry *XRefTableEntry) (err error)

AddObject adds another object to this object stream. Relies on decoded content!

func (*PDFObjectStreamDict) Finalize

func (oStreamDict *PDFObjectStreamDict) Finalize()

Finalize prepares the final content of the objectstream.

func (*PDFObjectStreamDict) GetIndexedObject

func (oStreamDict *PDFObjectStreamDict) GetIndexedObject(index int) (interface{}, error)

GetIndexedObject returns the object at given index from a PDFObjectStreamDict.
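
Per PDF 32000-1:2008, the decoded content of an object stream starts with N pairs of integers (object number and byte offset relative to the first object), followed by the objects themselves. The sketch below shows how a prolog and a body can be accumulated and joined. The type and method names are hypothetical, objects are passed in as already serialized strings, and the single-space separator is a simplification; the package's AddObject works on XRefTableEntry values instead.

```go
package main

import "fmt"

// objStream is a reduced, hypothetical model of an object stream under construction.
type objStream struct {
	prolog         []byte // "objNr offset" pairs
	content        []byte // serialized objects
	objCount       int
	firstObjOffset int
}

// addObject records the pair for objNr and appends its serialized form.
func (o *objStream) addObject(objNr int, pdfString string) {
	offset := len(o.content) // offset relative to the first object
	o.prolog = append(o.prolog, fmt.Sprintf("%d %d ", objNr, offset)...)
	o.content = append(o.content, (pdfString + " ")...)
	o.objCount++
}

// finalize joins prolog and objects and notes where the first object starts.
func (o *objStream) finalize() []byte {
	o.firstObjOffset = len(o.prolog)
	out := append([]byte{}, o.prolog...)
	return append(out, o.content...)
}

func main() {
	os := &objStream{}
	os.addObject(11, "<</A 1>>")
	os.addObject(12, "42")
	fmt.Printf("%q N=%d First=%d\n", os.finalize(), os.objCount, os.firstObjOffset)
}
```

The values of objCount and firstObjOffset correspond to the N and First entries that an object stream dictionary carries.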

type PDFStats

type PDFStats struct {
	// contains filtered or unexported fields
}

PDFStats is a container for stats.

func NewPDFStats

func NewPDFStats() PDFStats

NewPDFStats returns a new PDFStats object.

func (PDFStats) AddPageAttr

func (stats PDFStats) AddPageAttr(name int)

AddPageAttr adds the occurrence of a field with given name to the pageAttrs set.

func (PDFStats) AddRootAttr

func (stats PDFStats) AddRootAttr(name int)

AddRootAttr adds the occurrence of a field with given name to the rootAttrs set.

func (PDFStats) UsesPageAttr

func (stats PDFStats) UsesPageAttr(name int) bool

UsesPageAttr returns true if a field with given name is contained in the pageAttrs set.

func (PDFStats) UsesRootAttr

func (stats PDFStats) UsesRootAttr(name int) bool

UsesRootAttr returns true if a field with given name is contained in the rootAttrs set.
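
The Add and Uses methods take the integer constants declared at the top of the package (RootVersion, PageResources, and so on). All four methods have value receivers, which still allows mutation if the unexported fields are maps. The sketch below assumes that layout; the field names and the shortened constant list are hypothetical.

```go
package main

import "fmt"

// A shortened copy of the root field constants, for illustration only.
const (
	rootVersion = iota
	rootExtensions
	rootPageLabels
	rootNames
)

// stats is a hypothetical model of PDFStats, assumed to be backed by two sets.
type stats struct {
	rootAttrs map[int]bool
	pageAttrs map[int]bool
}

func newStats() stats {
	return stats{rootAttrs: map[int]bool{}, pageAttrs: map[int]bool{}}
}

// addRootAttr has a value receiver; the write is still visible to the caller
// because the copied struct shares the same underlying map.
func (s stats) addRootAttr(name int) { s.rootAttrs[name] = true }

func (s stats) usesRootAttr(name int) bool { return s.rootAttrs[name] }

func main() {
	s := newStats()
	s.addRootAttr(rootNames)
	fmt.Println(s.usesRootAttr(rootNames), s.usesRootAttr(rootPageLabels))
}
```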

type PDFStreamDict

type PDFStreamDict struct {
	PDFDict
	StreamOffset      int64
	StreamLength      *int64
	StreamLengthObjNr *int
	FilterPipeline    []PDFFilter
	Raw               []byte // Encoded
	Content           []byte // Decoded
	IsPageContent     bool
}

PDFStreamDict represents a PDF stream dict object.

func NewPDFStreamDict

func NewPDFStreamDict(pdfDict PDFDict, streamOffset int64, streamLength *int64, streamLengthObjNr *int,
	filterPipeline []PDFFilter) PDFStreamDict

NewPDFStreamDict creates a new PDFStreamDict for given PDFDict, stream offset and length.

func (PDFStreamDict) HasSoleFilterNamed

func (streamDict PDFStreamDict) HasSoleFilterNamed(filterName string) bool

HasSoleFilterNamed returns true if exactly one filter is defined for the stream dict and that filter has the given name.
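
A check of this kind is useful when a stream can only be handled if a single, known filter was applied to it. The sketch below is a guess at the logic based on the method name and signature, written against reduced local types; it is not the package's code.

```go
package main

import "fmt"

// filter and streamDict are reduced models of PDFFilter and PDFStreamDict.
type filter struct {
	Name string
}

type streamDict struct {
	FilterPipeline []filter
}

// hasSoleFilterNamed reports whether the pipeline consists of exactly one
// filter and that filter carries the given name.
func (sd streamDict) hasSoleFilterNamed(name string) bool {
	return len(sd.FilterPipeline) == 1 && sd.FilterPipeline[0].Name == name
}

func main() {
	single := streamDict{FilterPipeline: []filter{{Name: "FlateDecode"}}}
	chained := streamDict{FilterPipeline: []filter{{Name: "ASCII85Decode"}, {Name: "FlateDecode"}}}
	fmt.Println(single.hasSoleFilterNamed("FlateDecode"), chained.hasSoleFilterNamed("FlateDecode"))
}
```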

type PDFStringLiteral

type PDFStringLiteral string

PDFStringLiteral represents a PDF string literal object.

func (PDFStringLiteral) PDFString

func (stringliteral PDFStringLiteral) PDFString() string

PDFString returns a string representation as found in and written to a PDF file.

func (PDFStringLiteral) String

func (stringliteral PDFStringLiteral) String() string

func (PDFStringLiteral) Value

func (stringliteral PDFStringLiteral) Value() string

Value returns a string value for this PDF object.

type PDFVersion

type PDFVersion int

PDFVersion is a type for the internal representation of PDF versions.

const (
	V10 PDFVersion = iota
	V11
	V12
	V13
	V14
	V15
	V16
	V17
)

Constants for all PDF versions up to v1.7

func Version

func Version(versionStr string) (PDFVersion, error)

Version returns the PDFVersion for a version string.
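
Because the version constants are consecutive iota values, mapping between a version string such as "1.4" and its constant reduces to arithmetic on the minor digit. The sketch below shows one way to do this; the function names and error text are hypothetical, and the package's Version function may accept or reject different inputs.

```go
package main

import (
	"fmt"
	"strings"
)

type pdfVersion int

const (
	v10 pdfVersion = iota
	v11
	v12
	v13
	v14
	v15
	v16
	v17
)

// parseVersion maps "1.0" through "1.7" to the matching constant.
func parseVersion(s string) (pdfVersion, error) {
	if len(s) != 3 || !strings.HasPrefix(s, "1.") || s[2] < '0' || s[2] > '7' {
		return 0, fmt.Errorf("unsupported PDF version %q", s)
	}
	return pdfVersion(s[2] - '0'), nil
}

// versionString is the inverse mapping for the supported range.
func versionString(v pdfVersion) string {
	return fmt.Sprintf("1.%d", int(v))
}

func main() {
	v, err := parseVersion("1.4")
	fmt.Println(v == v14, versionString(v17), err)
}
```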

type PDFXRefStreamDict

type PDFXRefStreamDict struct {
	PDFStreamDict
	Size           int
	Objects        []int
	W              [3]int
	PreviousOffset *int64
}

PDFXRefStreamDict represents a cross reference stream dictionary.

func NewPDFXRefStreamDict

func NewPDFXRefStreamDict(xRefTable *XRefTable) *PDFXRefStreamDict

NewPDFXRefStreamDict creates a new PDFXRefStreamDict object.

type ReadContext

type ReadContext struct {

	// The PDF-File which gets processed.
	FileName string
	File     *os.File
	FileSize int64

	BinaryTotalSize     int64 // total stream data
	BinaryImageSize     int64 // total image stream data
	BinaryFontSize      int64 // total font stream data (fontfiles)
	BinaryImageDuplSize int64 // total obsolete image stream data after optimization
	BinaryFontDuplSize  int64 // total obsolete font stream data after optimization

	Linearized bool // File is linearized.
	Hybrid     bool // File is a hybrid PDF file.

	UsingObjectStreams bool   // File is using object streams.
	ObjectStreams      IntSet // All object numbers of any object streams found which need to be decoded.

	UsingXRefStreams bool   // File is using xref streams.
	XRefStreams      IntSet // All object numbers of any xref streams found.
}

ReadContext represents the context for reading a PDF file.

func (*ReadContext) IsObjectStreamObject

func (rc *ReadContext) IsObjectStreamObject(i int) bool

IsObjectStreamObject returns true if object #i is an object stream. All compressed objects are object streams.

func (*ReadContext) IsXRefStreamObject

func (rc *ReadContext) IsXRefStreamObject(i int) bool

IsXRefStreamObject returns true if object #i is an xref stream.

func (*ReadContext) LogStats

func (rc *ReadContext) LogStats(log *log.Logger, optimized bool)

LogStats logs stats for read file.

func (*ReadContext) ObjectStreamsString

func (rc *ReadContext) ObjectStreamsString() (int, string)

ObjectStreamsString returns a formatted string and the number of object stream objects.

func (*ReadContext) XRefStreamsString

func (rc *ReadContext) XRefStreamsString() (int, string)

XRefStreamsString returns a formatted string and the number of xref stream objects.

type WriteContext

type WriteContext struct {

	// The PDF-File which gets generated.
	DirName  string
	FileName string
	FileSize int64
	*bufio.Writer

	Command       string // command in effect.
	ExtractPageNr int    // page to be generated for rendering a single-page/PDF.
	ExtractPages  IntSet // pages to be generated for a trimmed PDF.

	BinaryTotalSize int64 // total stream data, counts 100% all stream data written.
	BinaryImageSize int64 // total image stream data written = Read.BinaryImageSize.
	BinaryFontSize  int64 // total font stream data (fontfiles) = copy of Read.BinaryFontSize.

	Table  map[int]int64 // object write offsets
	Offset int64         // current write offset

	WriteToObjectStream bool // if true start to embed objects into object streams and obey ObjectStreamMaxObjects.
	CurrentObjStream    *int // if not nil, any new non-stream-object gets added to the object stream with this object number.

	Eol string // end of line char sequence
}

WriteContext represents the context for writing a PDF file.

func NewWriteContext

func NewWriteContext(eol string) *WriteContext

NewWriteContext returns a new WriteContext.

func (*WriteContext) ExtractPage

func (wc *WriteContext) ExtractPage(i int) bool

ExtractPage returns true if page i needs to be generated.

func (*WriteContext) HasWriteOffset

func (wc *WriteContext) HasWriteOffset(objNumber int) bool

HasWriteOffset returns true if an object has already been written to PDFDestination.

func (*WriteContext) LogStats

func (wc *WriteContext) LogStats(log *log.Logger)

LogStats logs stats for written file.

func (*WriteContext) ReducedFeatureSet

func (wc *WriteContext) ReducedFeatureSet() bool

ReducedFeatureSet returns true for Split, Trim, Merge and ExtractPages. These are internal triggers and should not be confused with pdfcpu commands.

func (*WriteContext) SetWriteOffset

func (wc *WriteContext) SetWriteOffset(objNumber int)

SetWriteOffset saves the current write offset to the PDFDestination.

func (*WriteContext) WriteEol

func (wc *WriteContext) WriteEol() error

WriteEol writes an end of line sequence.
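
The Table and Offset fields suggest that the write context counts bytes as they are written and remembers the offset at which each object starts, which is the information a cross reference section needs. The sketch below models this with an embedded *bufio.Writer, as in the documented struct. The helper writeTracked, the constructor signature, and demo are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// writeContext is a reduced model of the documented WriteContext.
type writeContext struct {
	*bufio.Writer
	Table  map[int]int64 // object write offsets
	Offset int64         // current write offset
	Eol    string
}

func newWriteContext(buf *bytes.Buffer, eol string) *writeContext {
	return &writeContext{Writer: bufio.NewWriter(buf), Table: map[int]int64{}, Eol: eol}
}

// writeTracked is a hypothetical helper that writes s and advances Offset.
func (wc *writeContext) writeTracked(s string) error {
	n, err := wc.WriteString(s)
	wc.Offset += int64(n)
	return err
}

func (wc *writeContext) writeEol() error { return wc.writeTracked(wc.Eol) }

func (wc *writeContext) setWriteOffset(objNr int) { wc.Table[objNr] = wc.Offset }

func (wc *writeContext) hasWriteOffset(objNr int) bool {
	_, found := wc.Table[objNr]
	return found
}

// demo writes a header line and the start of object 1, then returns the
// recorded offset of object 1 together with the bytes written.
func demo() (int64, string, error) {
	var buf bytes.Buffer
	wc := newWriteContext(&buf, "\n")
	if err := wc.writeTracked("%PDF-1.4"); err != nil {
		return 0, "", err
	}
	if err := wc.writeEol(); err != nil {
		return 0, "", err
	}
	wc.setWriteOffset(1)
	if err := wc.writeTracked("1 0 obj"); err != nil {
		return 0, "", err
	}
	if err := wc.writeEol(); err != nil {
		return 0, "", err
	}
	if err := wc.Flush(); err != nil {
		return 0, "", err
	}
	return wc.Table[1], buf.String(), nil
}

func main() {
	offset, out, err := demo()
	fmt.Printf("%d %q %v\n", offset, out, err)
}
```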

type XRefTable

type XRefTable struct {
	Table     map[int]*XRefTableEntry
	Size      *int            // Object count from PDF trailer dict.
	PageCount int             // Number of pages.
	Root      *PDFIndirectRef // Catalog (reference to root object).

	// PDF Version
	HeaderVersion *PDFVersion // The PDF version the source is claiming to us as per its header.
	RootVersion   *PDFVersion // Optional PDF version taking precedence over the header version.

	// Document information section
	Info     *PDFIndirectRef // Infodict (reference to info dict object)
	ID       *PDFArray       // from info dict (or trailer?)
	Author   string
	Creator  string
	Producer string

	// Linearization section (not yet supported)
	OffsetPrimaryHintTable  *int64
	OffsetOverflowHintTable *int64
	LinearizationObjs       IntSet

	// Offspec section
	AdditionalStreams []PDFIndirectRef // trailer, e.g. Oasis "Open Doc"

	// Statistics
	Stats PDFStats

	Tagged bool // File is using tags. This is important for ???

	// Validation
	Valid          bool // true means successfully validated against ISO 32000.
	ValidationMode int  // see Configuration

	Optimized bool
}

XRefTable represents a PDF cross reference table plus stats for a PDF file.

func (*XRefTable) Catalog

func (xRefTable *XRefTable) Catalog() (*PDFDict, error)

Catalog returns a pointer to the root object / catalog.

func (*XRefTable) CatalogHasPieceInfo

func (xRefTable *XRefTable) CatalogHasPieceInfo() (bool, error)

CatalogHasPieceInfo returns true if the root has an entry for "PieceInfo".

func (*XRefTable) DeleteObject

func (xRefTable *XRefTable) DeleteObject(objectNumber int) (err error)

DeleteObject marks an object as free and inserts it into the free list right after the head.

func (*XRefTable) Dereference

func (xRefTable *XRefTable) Dereference(obj interface{}) (interface{}, error)

Dereference resolves an indirect object and returns the resulting PDF object.

func (*XRefTable) DereferenceArray

func (xRefTable *XRefTable) DereferenceArray(obj interface{}) (arrp *PDFArray, err error)

DereferenceArray resolves an indirect object that points to a PDFArray.

func (*XRefTable) DereferenceDict

func (xRefTable *XRefTable) DereferenceDict(obj interface{}) (dictp *PDFDict, err error)

DereferenceDict resolves an indirect object that points to a PDFDict.

func (*XRefTable) DereferenceInteger

func (xRefTable *XRefTable) DereferenceInteger(obj interface{}) (ip *PDFInteger, err error)

DereferenceInteger resolves and validates an integer object, which may be an indirect reference.

func (*XRefTable) DereferenceName

func (xRefTable *XRefTable) DereferenceName(obj interface{}, sinceVersion PDFVersion, validate func(string) bool) (n PDFName, err error)

DereferenceName resolves and validates a name object, which may be an indirect reference.

func (*XRefTable) DereferenceStreamDict

func (xRefTable *XRefTable) DereferenceStreamDict(obj interface{}) (streamDictp *PDFStreamDict, err error)

DereferenceStreamDict resolves an indirect object that points to a PDFStreamDict.

func (*XRefTable) DereferenceStringLiteral

func (xRefTable *XRefTable) DereferenceStringLiteral(obj interface{}, sinceVersion PDFVersion, validate func(string) bool) (s PDFStringLiteral, err error)

DereferenceStringLiteral resolves and validates a string literal object, which may be an indirect reference.

func (*XRefTable) DereferenceStringOrHexLiteral

func (xRefTable *XRefTable) DereferenceStringOrHexLiteral(obj interface{}, sinceVersion PDFVersion, validate func(string) bool) (o interface{}, err error)

DereferenceStringOrHexLiteral resolves and validates a string or hex literal object, which may be an indirect reference.

func (*XRefTable) EnsureValidFreeList

func (xRefTable *XRefTable) EnsureValidFreeList() (err error)

EnsureValidFreeList ensures the integrity of the free list associated with the recorded free objects. See 7.5.4 Cross-Reference Table.

func (*XRefTable) Exists

func (xRefTable *XRefTable) Exists(objNumber int) bool

Exists returns true if xRefTable contains an entry for objNumber.

func (*XRefTable) Find

func (xRefTable *XRefTable) Find(objNumber int) (*XRefTableEntry, bool)

Find returns the XRefTable entry for given object number.

func (*XRefTable) FindTableEntry

func (xRefTable *XRefTable) FindTableEntry(objNumber int, generationNumber int) (*XRefTableEntry, bool)

FindTableEntry returns the XRefTable entry for given object and generation numbers.

func (*XRefTable) FindTableEntryForIndRef

func (xRefTable *XRefTable) FindTableEntryForIndRef(indRef *PDFIndirectRef) (*XRefTableEntry, bool)

FindTableEntryForIndRef returns the XRefTable entry for given indirect reference.

func (*XRefTable) FindTableEntryLight

func (xRefTable *XRefTable) FindTableEntryLight(objNumber int) (*XRefTableEntry, bool)

FindTableEntryLight returns the XRefTable entry for given object number.

func (*XRefTable) Free

func (xRefTable *XRefTable) Free(objNumber int) (entry *XRefTableEntry, err error)

Free returns the cross ref table entry for given number of a free object.

func (*XRefTable) Insert

func (xRefTable *XRefTable) Insert(objNumber int, xRefTableEntry XRefTableEntry) bool

Insert adds given xRefTableEntry at given index objNumber into the cross reference table. Gets called when reading in a PDF file and generating its xRefTable in memory.

func (*XRefTable) InsertAndUseRecycled

func (xRefTable *XRefTable) InsertAndUseRecycled(xRefTableEntry XRefTableEntry) (objNumber int, err error)

InsertAndUseRecycled adds given xRefTableEntry into the cross reference table utilizing the freelist. Called on creation of new xref stream only.

func (*XRefTable) InsertNew

func (xRefTable *XRefTable) InsertNew(xRefTableEntry XRefTableEntry) (objNumber int, ok bool)

InsertNew adds given xRefTableEntry at next new objNumber into the cross reference table. Only to be called once an xRefTable has been generated completely and all trailer dicts have been processed. xRefTable.Size is the size entry of the first trailer dict processed. Called on creation of new object streams. Called by InsertAndUseRecycled.

func (*XRefTable) IsLinearizationObject

func (xRefTable *XRefTable) IsLinearizationObject(i int) bool

IsLinearizationObject returns true if object #i is a linearization object.

func (*XRefTable) LinearizationObjsString

func (xRefTable *XRefTable) LinearizationObjsString() (int, string)

LinearizationObjsString returns a formatted string and the number of objs.

func (*XRefTable) MissingObjects

func (xRefTable *XRefTable) MissingObjects() (int, *string)

MissingObjects returns the number of objects that were not written plus the corresponding comma separated string representation.

func (*XRefTable) NextForFree

func (xRefTable *XRefTable) NextForFree(objNumber int) (next int, err error)

NextForFree returns the number of the object the free object with objNumber links to. This is the successor of this free object in the free list.

func (*XRefTable) Pages

func (xRefTable *XRefTable) Pages() (*PDFIndirectRef, error)

Pages returns the Pages reference contained in the catalog.

func (*XRefTable) ParseRootVersion

func (xRefTable *XRefTable) ParseRootVersion() (*string, error)

ParseRootVersion returns a string representation for an optional Version entry in the root object.

func (*XRefTable) UndeleteObject

func (xRefTable *XRefTable) UndeleteObject(objectNumber int) (err error)

UndeleteObject ensures an object is not recorded in the free list. This is sometimes necessary because of indirect references to free objects in the original PDF file.

func (*XRefTable) Version

func (xRefTable *XRefTable) Version() PDFVersion

Version returns the PDF version of the PDF writer that created this file. Before V1.4 this is the header version. Since V1.4 the catalog may contain a Version entry which takes precedence over the header version.
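
The precedence rule can be expressed with the two optional fields of XRefTable: when a root version is present it wins, otherwise the header version applies. The sketch below uses reduced local types and a hypothetical ptr helper, and it assumes the header version is always set.

```go
package main

import "fmt"

type pdfVersion int

const (
	v10 pdfVersion = iota
	v11
	v12
	v13
	v14
	v15
	v16
	v17
)

// xrefTable holds only the two version fields of the documented XRefTable.
type xrefTable struct {
	HeaderVersion *pdfVersion
	RootVersion   *pdfVersion
}

// version prefers the catalog's Version entry over the header version.
// Assumption: HeaderVersion is never nil once a file has been read.
func (x *xrefTable) version() pdfVersion {
	if x.RootVersion != nil {
		return *x.RootVersion
	}
	return *x.HeaderVersion
}

// ptr is a hypothetical helper for taking the address of a constant.
func ptr(v pdfVersion) *pdfVersion { return &v }

func main() {
	headerOnly := &xrefTable{HeaderVersion: ptr(v13)}
	withRoot := &xrefTable{HeaderVersion: ptr(v14), RootVersion: ptr(v17)}
	fmt.Println(headerOnly.version() == v13, withRoot.version() == v17)
}
```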

func (*XRefTable) VersionString

func (xRefTable *XRefTable) VersionString() string

VersionString returns a string representation of this PDF file's PDF version.

type XRefTableEntry

type XRefTableEntry struct {
	Free            bool
	Offset          *int64
	Generation      *int
	Object          interface{}
	Compressed      bool
	ObjectStream    *int
	ObjectStreamInd *int
}

XRefTableEntry represents an entry in the PDF cross reference table.

This may be a free object, a compressed object or any in use PDF object:

PDFDict, PDFStreamDict, PDFObjectStreamDict, PDFXRefStreamDict, PDFArray, PDFInteger, PDFFloat, PDFName, PDFStringLiteral, PDFHexLiteral, PDFBoolean

func NewXRefTableEntryGen0

func NewXRefTableEntryGen0() *XRefTableEntry

NewXRefTableEntryGen0 creates a cross reference table entry for an object with generation 0.
