Documentation ¶
Overview ¶
Package pdfcpu is a simple PDF processing library written in Go that supports encryption. It provides an API and a command line interface. All PDF versions up to PDF 1.7 (ISO-32000) are supported.
The available commands are:
validate    validate PDF against PDF 32000-1:2008 (PDF 1.7)
optimize    optimize PDF by getting rid of redundant page resources
split       split multi-page PDF into several single-page PDFs
merge       concatenate 2 or more PDFs
extract     extract images, fonts, content, pages or metadata
trim        create trimmed version
stamp       add text or image stamp to selected pages
watermark   add text or image watermark for selected pages
attach      list, add, remove, extract embedded file attachments
perm        list, add user access permissions
encrypt     set password protection
decrypt     remove password protection
changeupw   change user password
changeopw   change owner password
version     print version
Index ¶
- Constants
- Variables
- func AddWatermarks(xRefTable *XRefTable, selectedPages IntSet, wm *Watermark) error
- func AppendStatsFile(ctx *Context) error
- func AttachAdd(xRefTable *XRefTable, files StringSet) (ok bool, err error)
- func AttachExtract(ctx *Context, files StringSet) (err error)
- func AttachList(xRefTable *XRefTable) (list []string, err error)
- func AttachRemove(xRefTable *XRefTable, files StringSet) (ok bool, err error)
- func CreatePDF(xRefTable *XRefTable, dirName, fileName string) error
- func DateString(t time.Time) string
- func DecodeUTF16String(s string) (string, error)
- func Escape(s string) (*string, error)
- func ExtractStreamData(ctx *Context, objNr int) (data []byte, err error)
- func HexLiteralToString(hexString string) (string, error)
- func IntMemberOf(i int, list []int) bool
- func IsStringUTF16BE(s string) bool
- func IsUTF16BE(b []byte) (ok bool, err error)
- func MemberOf(s string, list []string) bool
- func MergeXRefTables(ctxSource, ctxDest *Context) (err error)
- func OptimizeXRefTable(ctx *Context) error
- func Permissions(ctx *Context) (list []string)
- func StringLiteralToString(s string) (string, error)
- func Unescape(s string) ([]byte, error)
- func WriteImage(xRefTable *XRefTable, filename string, sd *StreamDict, objNr int) (string, error)
- func WritePDFFile(ctx *Context) error
- type Array
- type Boolean
- type ByteSize
- type CommandMode
- type Configuration
- type Context
- type Dict
- func (d Dict) ArrayEntry(key string) *Array
- func (d Dict) BooleanEntry(key string) *bool
- func (d Dict) Delete(key string) (value Object)
- func (d Dict) DictEntry(key string) *Dict
- func (d Dict) Entry(dictName, key string, required bool) (Object, error)
- func (d Dict) Find(key string) (value Object, found bool)
- func (d Dict) First() *int
- func (d Dict) HexLiteralEntry(key string) *HexLiteral
- func (d Dict) Index() *Array
- func (d Dict) IndirectRefEntry(key string) *IndirectRef
- func (d Dict) Insert(key string, value Object) (ok bool)
- func (d Dict) InsertFloat(key string, value float32)
- func (d Dict) InsertInt(key string, value int)
- func (d Dict) InsertName(key, value string)
- func (d Dict) InsertString(key, value string)
- func (d Dict) Int64Entry(key string) *int64
- func (d Dict) IntEntry(key string) *int
- func (d Dict) IsLinearizationParmDict() bool
- func (d Dict) IsObjStm() bool
- func (d Dict) Len() int
- func (d Dict) Length() (*int64, *int)
- func (d Dict) N() *int
- func (d Dict) NameEntry(key string) *string
- func (d Dict) PDFString() string
- func (d Dict) Prev() *int64
- func (d Dict) Size() *int
- func (d Dict) StreamDictEntry(key string) *StreamDict
- func (d Dict) String() string
- func (d Dict) StringEntry(key string) *string
- func (d Dict) StringEntryBytes(key string) ([]byte, error)
- func (d Dict) StringLiteralEntry(key string) *StringLiteral
- func (d Dict) Subtype() *string
- func (d Dict) Type() *string
- func (d Dict) Update(key string, value Object)
- func (d Dict) W() *Array
- type Enc
- type Float
- type FontObject
- type HexLiteral
- type ImageObject
- type IndirectRef
- type InheritedPageAttrs
- type IntSet
- type Integer
- type Name
- type Node
- func (n *Node) Add(xRefTable *XRefTable, k string, v Object) error
- func (n *Node) AddToLeaf(k string, v Object)
- func (n Node) KeyList() ([]string, error)
- func (n Node) Process(xRefTable *XRefTable, handler func(*XRefTable, string, Object) error) error
- func (n *Node) Remove(xRefTable *XRefTable, k string) (empty, ok bool, err error)
- func (n Node) String() string
- func (n Node) Value(k string) (Object, bool)
- type Object
- type ObjectStreamDict
- type OptimizationContext
- func (oc *OptimizationContext) DuplicateFontObjectsString() (int, string)
- func (oc *OptimizationContext) DuplicateImageObjectsString() (int, string)
- func (oc *OptimizationContext) DuplicateInfoObjectsString() (int, string)
- func (oc *OptimizationContext) IsDuplicateFontObject(i int) bool
- func (oc *OptimizationContext) IsDuplicateImageObject(i int) bool
- func (oc *OptimizationContext) IsDuplicateInfoObject(i int) bool
- func (oc *OptimizationContext) NonReferencedObjsString() (int, string)
- type PDFFilter
- type PDFImage
- type PDFStats
- type ReadContext
- func (rc *ReadContext) IsObjectStreamObject(i int) bool
- func (rc *ReadContext) IsXRefStreamObject(i int) bool
- func (rc *ReadContext) LogStats(optimized bool)
- func (rc *ReadContext) ObjectStreamsString() (int, string)
- func (rc *ReadContext) ReadFileSize() int
- func (rc *ReadContext) XRefStreamsString() (int, string)
- type StreamDict
- type StringLiteral
- type StringSet
- type Version
- type Watermark
- type WriteContext
- type XRefStreamDict
- type XRefTable
- func (xRefTable *XRefTable) BindNameTrees() error
- func (xRefTable *XRefTable) Catalog() (*Dict, error)
- func (xRefTable *XRefTable) CatalogHasPieceInfo() (bool, error)
- func (xRefTable *XRefTable) DeleteObject(objectNumber int) error
- func (xRefTable *XRefTable) DeleteObjectGraph(obj Object) error
- func (xRefTable *XRefTable) Dereference(obj Object) (Object, error)
- func (xRefTable *XRefTable) DereferenceArray(obj Object) (*Array, error)
- func (xRefTable *XRefTable) DereferenceDict(obj Object) (*Dict, error)
- func (xRefTable *XRefTable) DereferenceDictEntry(dict *Dict, entryName string) (Object, error)
- func (xRefTable *XRefTable) DereferenceInteger(obj Object) (*Integer, error)
- func (xRefTable *XRefTable) DereferenceName(obj Object, sinceVersion Version, validate func(string) bool) (n Name, err error)
- func (xRefTable *XRefTable) DereferenceNumber(obj Object) (f float64)
- func (xRefTable *XRefTable) DereferenceStreamDict(obj Object) (*StreamDict, error)
- func (xRefTable *XRefTable) DereferenceStringLiteral(obj Object, sinceVersion Version, validate func(string) bool) (s StringLiteral, err error)
- func (xRefTable *XRefTable) DereferenceStringOrHexLiteral(obj Object, sinceVersion Version, validate func(string) bool) (o Object, err error)
- func (xRefTable *XRefTable) DereferenceText(obj Object) (string, error)
- func (xRefTable *XRefTable) EncryptDict() (*Dict, error)
- func (xRefTable *XRefTable) EnsureCollection() error
- func (xRefTable *XRefTable) EnsureValidFreeList() error
- func (xRefTable *XRefTable) Exists(objNumber int) bool
- func (xRefTable *XRefTable) Find(objNumber int) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) FindObject(objNumber int) (Object, error)
- func (xRefTable *XRefTable) FindTableEntry(objNumber int, generationNumber int) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) FindTableEntryForIndRef(indRef *IndirectRef) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) FindTableEntryLight(objNumber int) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) Free(objNumber int) (*XRefTableEntry, error)
- func (xRefTable *XRefTable) IDFirstElement() (id []byte, err error)
- func (xRefTable *XRefTable) IndRefForNewObject(obj Object) (*IndirectRef, error)
- func (xRefTable *XRefTable) InsertAndUseRecycled(xRefTableEntry XRefTableEntry) (objNumber int, err error)
- func (xRefTable *XRefTable) InsertNew(xRefTableEntry XRefTableEntry) (objNumber int)
- func (xRefTable *XRefTable) InsertObject(obj Object) (objNumber int, err error)
- func (xRefTable *XRefTable) IsLinearizationObject(i int) bool
- func (xRefTable *XRefTable) LinearizationObjsString() (int, string)
- func (xRefTable *XRefTable) LocateNameTree(nameTreeName string, ensure bool) error
- func (xRefTable *XRefTable) MissingObjects() (int, *string)
- func (xRefTable *XRefTable) NamesDict() (*Dict, error)
- func (xRefTable *XRefTable) NewEmbeddedFileStreamDict(filename string) (*StreamDict, error)
- func (xRefTable *XRefTable) NewFileSpecDict(filename string, indRefStreamDict IndirectRef) (*Dict, error)
- func (xRefTable *XRefTable) NewSoundStreamDict(filename string, samplingRate int, fileSpecDict *Dict) (*StreamDict, error)
- func (xRefTable *XRefTable) NewStreamDict(filename string) (*StreamDict, error)
- func (xRefTable *XRefTable) NextForFree(objNumber int) (int, error)
- func (xRefTable *XRefTable) PageDict(page int) (*Dict, *InheritedPageAttrs, error)
- func (xRefTable *XRefTable) Pages() (*IndirectRef, error)
- func (xRefTable *XRefTable) ParseRootVersion() (v *string, err error)
- func (xRefTable *XRefTable) RemoveCollection() error
- func (xRefTable *XRefTable) RemoveEmbeddedFilesNameTree() error
- func (xRefTable *XRefTable) RemoveNameTree(nameTreeName string) error
- func (xRefTable *XRefTable) UndeleteObject(objectNumber int) error
- func (xRefTable *XRefTable) ValidateVersion(element string, sinceVersion Version) error
- func (xRefTable *XRefTable) Version() Version
- func (xRefTable *XRefTable) VersionString() string
- type XRefTableEntry
Constants ¶
const (
	DeviceGrayCS = "DeviceGray"
	DeviceRGBCS  = "DeviceRGB"
	DeviceCMYKCS = "DeviceCMYK"
	CalGrayCS    = "CalGray"
	CalRGBCS     = "CalRGB"
	LabCS        = "Lab"
	ICCBasedCS   = "ICCBased"
	IndexedCS    = "Indexed"
	PatternCS    = "Pattern"
	SeparationCS = "Separation"
	DeviceNCS    = "DeviceN"
)
PDF defines the following Color Spaces:
const (
	// ValidationStrict ensures 100% compliance with the spec (PDF 32000-1:2008).
	ValidationStrict = 0

	// ValidationRelaxed ensures PDF compliance based on frequently encountered validation errors.
	ValidationRelaxed = 1

	// StatsFileNameDefault is the standard stats filename.
	StatsFileNameDefault = "stats.csv"

	// PermissionsAll enables all user access permission bits.
	PermissionsAll int16 = -1 // 0xFFFF

	// PermissionsNone disables all user access permissions bits.
	PermissionsNone int16 = -3901 // 0xF0C3
)
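The hex comments on the two permission constants are the two's complement bit patterns of the signed 16-bit values. A minimal sketch that checks this arithmetic; the lowercase constant names and the helper bits are illustrative stand-ins, not part of the package:

```go
package main

import "fmt"

// Local copies of the two permission constants, for illustration only.
const (
	permissionsAll  int16 = -1
	permissionsNone int16 = -3901
)

// bits reinterprets a signed 16-bit permission value as its raw bit pattern.
func bits(p int16) uint16 {
	return uint16(p)
}

func main() {
	fmt.Printf("0x%04X\n", bits(permissionsAll))
	fmt.Printf("0x%04X\n", bits(permissionsNone))
}
```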
const (
	RootVersion = iota
	RootExtensions
	RootPageLabels
	RootNames
	RootDests
	RootViewerPrefs
	RootPageLayout
	RootPageMode
	RootOutlines
	RootThreads
	RootOpenAction
	RootAA
	RootURI
	RootAcroForm
	RootMetadata
	RootStructTreeRoot
	RootMarkInfo
	RootLang
	RootSpiderInfo
	RootOutputIntents
	RootPieceInfo
	RootOCProperties
	RootPerms
	RootLegal
	RootRequirements
	RootCollection
	RootNeedsRendering
)
The PDF root object fields.
const (
	PageLastModified = iota
	PageResources
	PageMediaBox
	PageCropBox
	PageBleedBox
	PageTrimBox
	PageArtBox
	PageBoxColorInfo
	PageContents
	PageRotate
	PageGroup
	PageThumb
	PageB
	PageDur
	PageTrans
	PageAnnots
	PageAA
	PageMetadata
	PagePieceInfo
	PageStructParents
	PageID
	PagePZ
	PageSeparationInfo
	PageTabs
	PageTemplateInstantiated
	PagePresSteps
	PageUserUnit
	PageVP
)
The PDF page object fields.
const (
	EolLF   = "\x0A"
	EolCR   = "\x0D"
	EolCRLF = "\x0D\x0A"
)
Supported line delimiters.
const (
	// PDFCPUVersion returns the current pdfcpu version.
	PDFCPUVersion = "0.1.15"

	// PDFCPULongVersion returns pdfcpu's signature.
	PDFCPULongVersion = "golang pdfcpu v" + PDFCPUVersion
)
const FreeHeadGeneration = 65535
FreeHeadGeneration is the predefined generation number for the head of the free list.
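In a PDF cross reference table the free entries form a linked list: object 0 is the head and carries generation 65535, and each free entry stores the object number of the next free object, with 0 closing the list. The following is a minimal sketch of walking such a list. The entry type and the freeList function are hypothetical and do not mirror XRefTableEntry or EnsureValidFreeList.

```go
package main

import "fmt"

// freeHeadGeneration mirrors FreeHeadGeneration.
const freeHeadGeneration = 65535

// entry is a hypothetical, simplified cross reference entry.
type entry struct {
	free       bool
	next       int // for free entries: object number of the next free object
	generation int
}

// freeList walks the linked list of free entries starting at object 0
// and returns the free object numbers in list order.
func freeList(table map[int]*entry) ([]int, error) {
	head, ok := table[0]
	if !ok || !head.free || head.generation != freeHeadGeneration {
		return nil, fmt.Errorf("object 0 is not a valid free list head")
	}
	var list []int
	seen := map[int]bool{0: true}
	for n := head.next; n != 0; {
		e, ok := table[n]
		if !ok || !e.free {
			return nil, fmt.Errorf("object %d is linked but not free", n)
		}
		if seen[n] {
			return nil, fmt.Errorf("free list cycle at object %d", n)
		}
		seen[n] = true
		list = append(list, n)
		n = e.next
	}
	return list, nil
}

func main() {
	table := map[int]*entry{
		0: {free: true, next: 3, generation: freeHeadGeneration},
		3: {free: true, next: 5, generation: 1},
		4: {},
		5: {free: true, next: 0, generation: 2},
	}
	fmt.Println(freeList(table))
}
```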
const (
// ObjectStreamMaxObjects limits the number of objects within an object stream written.
ObjectStreamMaxObjects = 100
)
Variables ¶
var (
	ErrUnsupportedColorSpace   = errors.New("unsupported color space")
	ErrUnsupported16BPC        = errors.New("unsupported 16 bits per component")
	ErrUnsupportedTIFFCreation = errors.New("unsupported tiff file creation")
)
Errors to be identified.
Functions ¶
func AddWatermarks ¶ added in v0.1.16
AddWatermarks adds watermarks to all pages selected.
func AppendStatsFile ¶
AppendStatsFile appends a stats line for this xRefTable to the configured csv file name.
func AttachAdd ¶
AttachAdd embeds specified files. Existing attachments are replaced. ok returns true if at least one attachment was added.
func AttachExtract ¶
AttachExtract exports specified embedded files. If no files are specified, all embedded files are extracted.
func AttachList ¶
AttachList returns a list of embedded files.
func AttachRemove ¶
AttachRemove deletes specified embedded files. ok returns true if at least one attachment could be removed.
func DateString ¶ added in v0.1.16
DateString returns a string representation of t.
func DecodeUTF16String ¶
DecodeUTF16String decodes a UTF16BE string from a hex string.
func ExtractStreamData ¶ added in v0.1.16
ExtractStreamData extracts the content of a stream dict for a specific objNr.
func HexLiteralToString ¶
HexLiteralToString returns a possibly UTF16 encoded string for a hex string.
func IntMemberOf ¶ added in v0.1.16
IntMemberOf returns true if list contains i.
func IsStringUTF16BE ¶
IsStringUTF16BE checks a string for Big Endian byte order BOM.
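A big endian byte order mark is the byte pair 0xFE 0xFF at the start of the string. The sketch below shows the check and a matching decode step using only the standard library. The function names are hypothetical and the error handling is an assumption, not the behavior of IsStringUTF16BE or DecodeUTF16String.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf16"
)

// hasUTF16BEBOM reports whether s starts with the big endian BOM 0xFE 0xFF.
func hasUTF16BEBOM(s string) bool {
	return len(s) >= 2 && s[0] == 0xFE && s[1] == 0xFF
}

// decodeUTF16BE converts a BOM-prefixed UTF-16BE byte string to UTF-8.
func decodeUTF16BE(s string) (string, error) {
	if !hasUTF16BEBOM(s) {
		return "", errors.New("missing UTF-16BE byte order mark")
	}
	b := []byte(s[2:])
	if len(b)%2 != 0 {
		return "", errors.New("odd number of bytes after the byte order mark")
	}
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i < len(b); i += 2 {
		// Big endian: high byte first.
		units = append(units, uint16(b[i])<<8|uint16(b[i+1]))
	}
	return string(utf16.Decode(units)), nil
}

func main() {
	s, err := decodeUTF16BE("\xFE\xFF\x00H\x00i")
	fmt.Println(s, err)
}
```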
func MergeXRefTables ¶
MergeXRefTables merges Context ctxSource into ctxDest by appending its page tree.
func OptimizeXRefTable ¶
OptimizeXRefTable optimizes an xRefTable by locating and getting rid of redundant embedded fonts and images.
func Permissions ¶
Permissions returns a list of set permissions.
func StringLiteralToString ¶
StringLiteralToString returns the best possible string rep for a string literal.
func WriteImage ¶ added in v0.1.16
WriteImage writes a PDF image object to disk.
func WritePDFFile ¶
WritePDFFile generates a PDF file for the cross reference table contained in Context.
Types ¶
type Array ¶ added in v0.1.16
type Array []Object
Array represents a PDF array object.
func NewIntegerArray ¶
NewIntegerArray returns a PDFArray with Integer entries.
func NewNameArray ¶
NewNameArray returns a PDFArray with Name entries.
func NewNumberArray ¶
NewNumberArray returns a PDFArray with Float entries.
func NewRectangle ¶
NewRectangle creates a rectangle array.
func NewStringArray ¶
NewStringArray returns a PDFArray with StringLiteral entries.
type Boolean ¶ added in v0.1.16
type Boolean bool
Boolean represents a PDF boolean object.
type CommandMode ¶
type CommandMode int
CommandMode specifies the operation being executed.
const (
	VALIDATE CommandMode = iota
	OPTIMIZE
	SPLIT
	MERGE
	EXTRACTIMAGES
	EXTRACTFONTS
	EXTRACTPAGES
	EXTRACTCONTENT
	EXTRACTMETADATA
	TRIM
	ADDATTACHMENTS
	REMOVEATTACHMENTS
	EXTRACTATTACHMENTS
	LISTATTACHMENTS
	ADDPERMISSIONS
	LISTPERMISSIONS
	ENCRYPT
	DECRYPT
	CHANGEUPW
	CHANGEOPW
	STAMP
	ADDWATERMARKS
)
The available commands.
type Configuration ¶
type Configuration struct {
	// Enables PDF V1.5 compatible processing of object streams, xref streams, hybrid PDF files.
	Reader15 bool

	// Enables decoding of all streams (fontfiles, images..) for logging purposes.
	DecodeAllStreams bool

	// Validate against ISO-32000: strict or relaxed
	ValidationMode int

	// End of line char sequence for writing.
	Eol string

	// Turns on object stream generation.
	// A signal for compressing any new non-stream-object into an object stream.
	// true enforces WriteXRefStream to true.
	// false does not prevent xRefStream generation.
	WriteObjectStream bool

	// Switches between xRefSection (<=V1.4) and objectStream/xRefStream (>=V1.5) writing.
	WriteXRefStream bool

	// Turns on stats collection.
	CollectStats bool

	// A CSV-filename holding the statistics.
	StatsFileName string

	// Supplied user password
	UserPW    string
	UserPWNew *string

	// Supplied owner password
	OwnerPW    string
	OwnerPWNew *string

	// EncryptUsingAES ensures AES encryption.
	// true: AES encryption
	// false: RC4 encryption.
	EncryptUsingAES bool

	// EncryptUsing128BitKey ensures 128 bit key length.
	// true: use 128 bit key
	// false: use 40 bit key
	EncryptUsing128BitKey bool

	// Supplied user access permissions, see Table 22
	UserAccessPermissions int16

	// Command being executed.
	Mode CommandMode
}
Configuration of a Context.
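The comments on WriteObjectStream describe a one-way dependency: turning on object streams forces xref stream writing, while turning them off leaves WriteXRefStream untouched. A minimal sketch of that rule follows. The config type and the normalize method are hypothetical, and where pdfcpu applies the rule is not stated here.

```go
package main

import "fmt"

// config is a hypothetical, trimmed-down stand-in for Configuration.
type config struct {
	writeObjectStream bool
	writeXRefStream   bool
}

// normalize applies the documented rule: object streams require an xref stream.
// A false writeObjectStream leaves writeXRefStream as it is.
func (c *config) normalize() {
	if c.writeObjectStream {
		c.writeXRefStream = true
	}
}

func main() {
	a := config{writeObjectStream: true}
	a.normalize()
	b := config{writeXRefStream: true}
	b.normalize()
	fmt.Println(a.writeXRefStream, b.writeXRefStream)
}
```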
func NewDefaultConfiguration ¶
func NewDefaultConfiguration() *Configuration
NewDefaultConfiguration returns the default pdfcpu configuration.
func (*Configuration) ValidationModeString ¶
func (c *Configuration) ValidationModeString() string
ValidationModeString returns a string rep for the validation mode in effect.
type Context ¶ added in v0.1.16
type Context struct {
	*Configuration
	*XRefTable
	Read     *ReadContext
	Optimize *OptimizationContext
	Write    *WriteContext
}
Context represents an environment for processing PDF files.
func NewContext ¶ added in v0.1.16
NewContext initializes a new Context.
func ReadPDFFile ¶
func ReadPDFFile(fileName string, config *Configuration) (*Context, error)
ReadPDFFile reads in a PDF file and generates a Context, an in-memory representation containing a cross reference table.
func (*Context) ResetWriteContext ¶ added in v0.1.16
func (ctx *Context) ResetWriteContext()
ResetWriteContext prepares an existing WriteContext for a new file to be written.
type Dict ¶ added in v0.1.16
Dict represents a PDF dict object.
func (Dict) ArrayEntry ¶ added in v0.1.16
ArrayEntry expects and returns an Array entry for given key.
func (Dict) BooleanEntry ¶ added in v0.1.16
BooleanEntry expects and returns a Boolean entry for given key.
func (Dict) DictEntry ¶ added in v0.1.16
DictEntry expects and returns a PDFDict entry for given key.
func (Dict) HexLiteralEntry ¶ added in v0.1.16
func (d Dict) HexLiteralEntry(key string) *HexLiteral
HexLiteralEntry returns a HexLiteral object for given key.
func (Dict) IndirectRefEntry ¶ added in v0.1.16
func (d Dict) IndirectRefEntry(key string) *IndirectRef
IndirectRefEntry returns an IndirectRef entry for the given key of this dictionary.
func (Dict) InsertFloat ¶ added in v0.1.16
InsertFloat adds a new float entry to this PDFDict.
func (Dict) InsertName ¶ added in v0.1.16
InsertName adds a new name entry to this PDFDict.
func (Dict) InsertString ¶ added in v0.1.16
InsertString adds a new string entry to this PDFDict.
func (Dict) Int64Entry ¶ added in v0.1.16
Int64Entry expects and returns an Integer entry representing an int64 value for given key.
func (Dict) IsLinearizationParmDict ¶ added in v0.1.16
IsLinearizationParmDict returns true if this dict has an int entry for key "Linearized".
func (Dict) Length ¶ added in v0.1.16
Length returns a *int64 for the entry with key "Length". The stream length may instead refer to an indirect object.
func (Dict) PDFString ¶ added in v0.1.16
PDFString returns a string representation as found in and written to a PDF file.
func (Dict) StreamDictEntry ¶ added in v0.1.16
func (d Dict) StreamDictEntry(key string) *StreamDict
StreamDictEntry expects and returns a StreamDict entry for given key. Unused.
func (Dict) StringEntry ¶ added in v0.1.16
StringEntry expects and returns a StringLiteral entry for given key. Unused.
func (Dict) StringEntryBytes ¶ added in v0.1.16
StringEntryBytes returns the byte slice representing the string value for key.
func (Dict) StringLiteralEntry ¶ added in v0.1.16
func (d Dict) StringLiteralEntry(key string) *StringLiteral
StringLiteralEntry returns a StringLiteral object for given key.
func (Dict) Subtype ¶ added in v0.1.16
Subtype returns the value of the name entry for key "Subtype".
type Float ¶ added in v0.1.16
type Float float64
Float represents a PDF float object.
type FontObject ¶
type FontObject struct {
	ResourceNames []string
	Prefix        string
	FontName      string
	FontDict      *Dict
	Data          []byte
	Extension     string
}
FontObject represents a font used in a PDF file.
func ExtractFontData ¶
func ExtractFontData(ctx *Context, objNr int) (*FontObject, error)
ExtractFontData extracts font data (the "fontfile") for objNr. Supported fontTypes: TrueType
func (*FontObject) AddResourceName ¶
func (fo *FontObject) AddResourceName(resourceName string)
AddResourceName adds a resourceName referring to this font.
func (FontObject) Embedded ¶
func (fo FontObject) Embedded() (embedded bool)
Embedded returns true if the font is embedded into this PDF file.
func (FontObject) Encoding ¶
func (fo FontObject) Encoding() string
Encoding returns the Encoding of this font.
func (FontObject) ResourceNamesString ¶
func (fo FontObject) ResourceNamesString() string
ResourceNamesString returns a string representation of all the resource names of this font.
func (FontObject) String ¶
func (fo FontObject) String() string
func (FontObject) SubType ¶
func (fo FontObject) SubType() string
SubType returns the SubType of this font.
type HexLiteral ¶ added in v0.1.16
type HexLiteral string
HexLiteral represents a PDF hex literal object.
func (HexLiteral) Bytes ¶ added in v0.1.16
func (hexliteral HexLiteral) Bytes() ([]byte, error)
Bytes returns the byte representation.
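For reference, the PDF specification defines a hex string as pairs of hexadecimal digits in which white space is ignored and a missing final digit is read as 0. The sketch below decodes a hex literal under those rules with the standard library. The function hexLiteralBytes is hypothetical, and whether HexLiteral.Bytes handles white space and odd lengths the same way is an assumption.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// hexLiteralBytes decodes the digits of a PDF hex string (without the
// enclosing angle brackets) into bytes.
func hexLiteralBytes(h string) ([]byte, error) {
	// White space between digits is not significant.
	h = strings.Join(strings.Fields(h), "")
	// An odd number of digits means the last digit is followed by an implied 0.
	if len(h)%2 == 1 {
		h += "0"
	}
	return hex.DecodeString(h)
}

func main() {
	b, err := hexLiteralBytes("48 656C6C6F")
	fmt.Printf("%s %v\n", b, err)
}
```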
func (HexLiteral) PDFString ¶ added in v0.1.16
func (hexliteral HexLiteral) PDFString() string
PDFString returns the string representation as found in and written to a PDF file.
func (HexLiteral) String ¶ added in v0.1.16
func (hexliteral HexLiteral) String() string
func (HexLiteral) Value ¶ added in v0.1.16
func (hexliteral HexLiteral) Value() string
Value returns a string value for this PDF object.
type ImageObject ¶
type ImageObject struct {
	ResourceNames []string
	ImageDict     *StreamDict
}
ImageObject represents an image used in a PDF file.
func ExtractImageData ¶
func ExtractImageData(ctx *Context, objNr int) (*ImageObject, error)
ExtractImageData extracts image data for objNr. Supported imgTypes: FlateDecode, DCTDecode, JPXDecode. TODO: Implementation and usage of these filters: DCTDecode and JPXDecode.
func (*ImageObject) AddResourceName ¶
func (io *ImageObject) AddResourceName(resourceName string)
AddResourceName adds a resourceName to this imageObject's ResourceNames dict.
func (ImageObject) ResourceNamesString ¶
func (io ImageObject) ResourceNamesString() string
ResourceNamesString returns a string representation of the ResourceNames for this image.
type IndirectRef ¶ added in v0.1.16
IndirectRef represents a PDF indirect object.
func NewIndirectRef ¶ added in v0.1.16
func NewIndirectRef(objectNumber, generationNumber int) *IndirectRef
NewIndirectRef returns a new PDFIndirectRef object.
func (IndirectRef) Equals ¶ added in v0.1.16
func (ir IndirectRef) Equals(indRef IndirectRef) bool
Equals returns true if two indirect references refer to the same object.
func (IndirectRef) PDFString ¶ added in v0.1.16
func (ir IndirectRef) PDFString() string
PDFString returns a string representation as found in and written to a PDF file.
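In PDF syntax an indirect reference is written as the object number, the generation number and the keyword R, for example 12 0 R. The sketch below uses a hypothetical indirectRef type to show that notation and an equality check on both numbers. Treating equal object and generation numbers as "the same object" is an assumption about Equals.

```go
package main

import "fmt"

// indirectRef is a hypothetical stand-in for IndirectRef.
type indirectRef struct {
	objectNumber     int
	generationNumber int
}

// equals compares object number and generation number.
func (ir indirectRef) equals(other indirectRef) bool {
	return ir.objectNumber == other.objectNumber &&
		ir.generationNumber == other.generationNumber
}

// pdfString renders the reference the way it appears in a PDF file.
func (ir indirectRef) pdfString() string {
	return fmt.Sprintf("%d %d R", ir.objectNumber, ir.generationNumber)
}

func main() {
	a := indirectRef{objectNumber: 12}
	b := indirectRef{objectNumber: 12, generationNumber: 1}
	fmt.Println(a.pdfString(), a.equals(a), a.equals(b))
}
```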
func (IndirectRef) String ¶ added in v0.1.16
func (ir IndirectRef) String() string
type InheritedPageAttrs ¶ added in v0.1.16
type InheritedPageAttrs struct {
// contains filtered or unexported fields
}
InheritedPageAttrs represents all inherited page attributes.
type Integer ¶ added in v0.1.16
type Integer int
Integer represents a PDF integer object.
type Name ¶ added in v0.1.16
type Name string
Name represents a PDF name object.
type Node ¶
type Node struct {
	Kids       []*Node      // Mirror of the name tree's Kids array.
	Names      []entry      // Mirror of the name tree's Names array.
	Kmin, Kmax string       // Mirror of the name tree's Limit array[Kmin,Kmax].
	IndRef     *IndirectRef // Pointer to the PDF object representing this name tree node.
}
Node is an opinionated implementation of the PDF name tree. pdfcpu caches all name trees found in the PDF catalog with this data structure. The PDF spec does not impose any rules regarding a strategy for the creation of nodes. A binary tree was chosen where each leaf node has a limited number of entries (maxEntries). Once maxEntries has been reached, a leaf node turns into an intermediary node with two kids, which are leaf nodes, each of them holding half of the sorted entries of the original leaf node.
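The splitting strategy described above can be sketched as follows. This is a simplified illustration, not the package's code: the node type omits IndRef, the value type is a plain string, and the maxEntries value of 4 is a made-up limit.

```go
package main

import (
	"fmt"
	"sort"
)

// maxEntries is a hypothetical limit; the value used by pdfcpu is not stated here.
const maxEntries = 4

type entry struct{ k, v string }

// node is a simplified stand-in for Node.
type node struct {
	kids       []*node
	names      []entry
	kmin, kmax string
}

func (n *node) leaf() bool { return len(n.kids) == 0 }

func newLeaf(names []entry) *node {
	c := append([]entry(nil), names...)
	return &node{names: c, kmin: c[0].k, kmax: c[len(c)-1].k}
}

// add inserts or updates k, splitting a leaf once it holds maxEntries entries.
func (n *node) add(k, v string) {
	if !n.leaf() {
		// Intermediary node: descend, then refresh the limits from the kids.
		kid := n.kids[1]
		if k <= n.kids[0].kmax {
			kid = n.kids[0]
		}
		kid.add(k, v)
		n.kmin, n.kmax = n.kids[0].kmin, n.kids[1].kmax
		return
	}
	// Leaf node: keep the entries sorted by key.
	i := sort.Search(len(n.names), func(i int) bool { return n.names[i].k >= k })
	if i < len(n.names) && n.names[i].k == k {
		n.names[i].v = v
		return
	}
	n.names = append(n.names, entry{})
	copy(n.names[i+1:], n.names[i:])
	n.names[i] = entry{k, v}
	n.kmin, n.kmax = n.names[0].k, n.names[len(n.names)-1].k
	if len(n.names) < maxEntries {
		return
	}
	// Split: each kid takes half of the sorted entries.
	mid := len(n.names) / 2
	n.kids = []*node{newLeaf(n.names[:mid]), newLeaf(n.names[mid:])}
	n.names = nil
}

// keys returns all keys in sorted order.
func (n *node) keys() []string {
	var ks []string
	for _, e := range n.names {
		ks = append(ks, e.k)
	}
	for _, kid := range n.kids {
		ks = append(ks, kid.keys()...)
	}
	return ks
}

func main() {
	root := &node{}
	for _, k := range []string{"d", "b", "a", "c", "e"} {
		root.add(k, "value of "+k)
	}
	fmt.Println(root.leaf(), root.keys())
}
```

Keeping Kmin and Kmax current on every insert is what lets a lookup pick the right kid without scanning both subtrees.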
func (Node) Process ¶
Process traverses the nametree applying a handler to each entry (key-value pair).
type ObjectStreamDict ¶ added in v0.1.16
type ObjectStreamDict struct {
	StreamDict
	Prolog         []byte
	ObjCount       int
	FirstObjOffset int
	ObjArray       Array
}
ObjectStreamDict represents an object stream dictionary.
func NewObjectStreamDict ¶ added in v0.1.16
func NewObjectStreamDict() *ObjectStreamDict
NewObjectStreamDict creates a new ObjectStreamDict object.
func (*ObjectStreamDict) AddObject ¶ added in v0.1.16
func (oStreamDict *ObjectStreamDict) AddObject(objNumber int, entry *XRefTableEntry) error
AddObject adds another object to this object stream. Relies on decoded content!
func (*ObjectStreamDict) Finalize ¶ added in v0.1.16
func (oStreamDict *ObjectStreamDict) Finalize()
Finalize prepares the final content of the objectstream.
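The decoded content of a PDF object stream starts with a run of integer pairs (object number, byte offset relative to the first object), followed by the serialized objects themselves. The sketch below builds such content with a hypothetical objectStream type. The field names loosely follow Prolog, ObjCount and FirstObjOffset, but how AddObject and Finalize work internally is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// objectStreamMaxObjects mirrors ObjectStreamMaxObjects.
const objectStreamMaxObjects = 100

// objectStream is a hypothetical, simplified stand-in for ObjectStreamDict.
type objectStream struct {
	prolog   []byte // "objNumber offset" pairs
	content  []byte // serialized objects
	objCount int
}

// addObject records the offset of the new object and appends its bytes.
func (s *objectStream) addObject(objNumber int, serialized string) error {
	if s.objCount >= objectStreamMaxObjects {
		return errors.New("object stream is full")
	}
	s.prolog = append(s.prolog, fmt.Sprintf("%d %d ", objNumber, len(s.content))...)
	s.content = append(s.content, serialized...)
	s.content = append(s.content, ' ')
	s.objCount++
	return nil
}

// finalize joins prolog and content and returns the offset of the first object.
func (s *objectStream) finalize() (data []byte, firstObjOffset int) {
	data = append(append([]byte{}, s.prolog...), s.content...)
	return data, len(s.prolog)
}

func main() {
	s := &objectStream{}
	_ = s.addObject(11, "<</A 1>>")
	_ = s.addObject(12, "(hi)")
	data, first := s.finalize()
	fmt.Printf("%q first=%d\n", data, first)
}
```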
func (*ObjectStreamDict) IndexedObject ¶ added in v0.1.16
func (oStreamDict *ObjectStreamDict) IndexedObject(index int) (Object, error)
IndexedObject returns the object at given index from an ObjectStreamDict.
type OptimizationContext ¶
type OptimizationContext struct {
	// Font section
	PageFonts         []IntSet
	FontObjects       map[int]*FontObject
	Fonts             map[string][]int
	DuplicateFontObjs IntSet
	DuplicateFonts    map[int]*Dict

	// Image section
	PageImages         []IntSet
	ImageObjects       map[int]*ImageObject
	DuplicateImageObjs IntSet
	DuplicateImages    map[int]*StreamDict

	DuplicateInfoObjects IntSet // Possible result of manual info dict modification.
	NonReferencedObjs    []int  // Objects that are not referenced.
}
OptimizationContext represents the context for the optimization of a PDF file.
func (*OptimizationContext) DuplicateFontObjectsString ¶
func (oc *OptimizationContext) DuplicateFontObjectsString() (int, string)
DuplicateFontObjectsString returns a formatted string and the number of objs.
func (*OptimizationContext) DuplicateImageObjectsString ¶
func (oc *OptimizationContext) DuplicateImageObjectsString() (int, string)
DuplicateImageObjectsString returns a formatted string and the number of objs.
func (*OptimizationContext) DuplicateInfoObjectsString ¶
func (oc *OptimizationContext) DuplicateInfoObjectsString() (int, string)
DuplicateInfoObjectsString returns a formatted string and the number of objs.
func (*OptimizationContext) IsDuplicateFontObject ¶
func (oc *OptimizationContext) IsDuplicateFontObject(i int) bool
IsDuplicateFontObject returns true if object #i is a duplicate font object.
func (*OptimizationContext) IsDuplicateImageObject ¶
func (oc *OptimizationContext) IsDuplicateImageObject(i int) bool
IsDuplicateImageObject returns true if object #i is a duplicate image object.
func (*OptimizationContext) IsDuplicateInfoObject ¶
func (oc *OptimizationContext) IsDuplicateInfoObject(i int) bool
IsDuplicateInfoObject returns true if object #i is a duplicate info object.
func (*OptimizationContext) NonReferencedObjsString ¶
func (oc *OptimizationContext) NonReferencedObjsString() (int, string)
NonReferencedObjsString returns a formatted string and the number of objs.
type PDFImage ¶ added in v0.1.16
type PDFImage struct {
// contains filtered or unexported fields
}
PDFImage represents an XObject of subtype image.
type PDFStats ¶
type PDFStats struct {
// contains filtered or unexported fields
}
PDFStats is a container for stats.
func (PDFStats) AddPageAttr ¶
AddPageAttr adds the occurrence of a field with given name to the pageAttrs set.
func (PDFStats) AddRootAttr ¶
AddRootAttr adds the occurrence of a field with given name to the rootAttrs set.
func (PDFStats) UsesPageAttr ¶
UsesPageAttr returns true if a field with given name is contained in the pageAttrs set.
func (PDFStats) UsesRootAttr ¶
UsesRootAttr returns true if a field with given name is contained in the rootAttrs set.
type ReadContext ¶
type ReadContext struct {
	// The PDF-File which gets processed.
	FileName string
	File     *os.File
	FileSize int64

	BinaryTotalSize     int64 // total stream data
	BinaryImageSize     int64 // total image stream data
	BinaryFontSize      int64 // total font stream data (fontfiles)
	BinaryImageDuplSize int64 // total obsolete image stream data after optimization
	BinaryFontDuplSize  int64 // total obsolete font stream data after optimization

	Linearized bool // File is linearized.
	Hybrid     bool // File is a hybrid PDF file.

	UsingObjectStreams bool   // File is using object streams.
	ObjectStreams      IntSet // All object numbers of any object streams found which need to be decoded.

	UsingXRefStreams bool   // File is using xref streams.
	XRefStreams      IntSet // All object numbers of any xref streams found.
}
ReadContext represents the context for reading a PDF file.
func (*ReadContext) IsObjectStreamObject ¶
func (rc *ReadContext) IsObjectStreamObject(i int) bool
IsObjectStreamObject returns true if object i is an object stream. All compressed objects are object streams.
func (*ReadContext) IsXRefStreamObject ¶
func (rc *ReadContext) IsXRefStreamObject(i int) bool
IsXRefStreamObject returns true if object #i is an xref stream.
func (*ReadContext) LogStats ¶
func (rc *ReadContext) LogStats(optimized bool)
LogStats logs stats for read file.
func (*ReadContext) ObjectStreamsString ¶
func (rc *ReadContext) ObjectStreamsString() (int, string)
ObjectStreamsString returns a formatted string and the number of object stream objects.
func (*ReadContext) ReadFileSize ¶ added in v0.1.16
func (rc *ReadContext) ReadFileSize() int
ReadFileSize returns the size of the input file, if there is one.
func (*ReadContext) XRefStreamsString ¶
func (rc *ReadContext) XRefStreamsString() (int, string)
XRefStreamsString returns a formatted string and the number of xref stream objects.
type StreamDict ¶ added in v0.1.16
type StreamDict struct {
	Dict
	StreamOffset      int64
	StreamLength      *int64
	StreamLengthObjNr *int
	FilterPipeline    []PDFFilter
	Raw               []byte // Encoded
	Content           []byte // Decoded
	IsPageContent     bool
}
StreamDict represents a PDF stream dict object.
func NewStreamDict ¶ added in v0.1.16
func NewStreamDict(dict Dict, streamOffset int64, streamLength *int64, streamLengthObjNr *int, filterPipeline []PDFFilter) StreamDict
NewStreamDict creates a new PDFStreamDict for given PDFDict, stream offset and length.
func ReadPNGFile ¶ added in v0.1.16
func ReadPNGFile(xRefTable *XRefTable, fileName string) (*StreamDict, error)
ReadPNGFile generates a PDF image object for a PNG file and appends this object to the cross reference table.
func ReadTIFFFile ¶ added in v0.1.16
func ReadTIFFFile(xRefTable *XRefTable, fileName string) (*StreamDict, error)
ReadTIFFFile generates a PDF image object for a TIFF file and appends this object to the cross reference table.
func (StreamDict) HasSoleFilterNamed ¶ added in v0.1.16
func (streamDict StreamDict) HasSoleFilterNamed(filterName string) bool
HasSoleFilterNamed returns true if there is exactly one filter defined for a stream dict and that filter is named filterName.
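A minimal sketch of such a check over a filter pipeline. The filter type is a hypothetical stand-in, because the fields of PDFFilter are not shown here, and matching on the filter name is inferred from the method name and its filterName parameter.

```go
package main

import "fmt"

// filter is a hypothetical stand-in for PDFFilter; only the name matters here.
type filter struct{ name string }

// hasSoleFilterNamed reports whether the pipeline consists of exactly one
// filter and that filter carries the given name.
func hasSoleFilterNamed(pipeline []filter, filterName string) bool {
	return len(pipeline) == 1 && pipeline[0].name == filterName
}

func main() {
	flate := []filter{{name: "FlateDecode"}}
	chain := []filter{{name: "ASCII85Decode"}, {name: "FlateDecode"}}
	fmt.Println(hasSoleFilterNamed(flate, "FlateDecode"), hasSoleFilterNamed(chain, "FlateDecode"))
}
```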
type StringLiteral ¶ added in v0.1.16
type StringLiteral string
StringLiteral represents a PDF string literal object.
func (StringLiteral) PDFString ¶ added in v0.1.16
func (stringliteral StringLiteral) PDFString() string
PDFString returns a string representation as found in and written to a PDF file.
func (StringLiteral) String ¶ added in v0.1.16
func (stringliteral StringLiteral) String() string
func (StringLiteral) Value ¶ added in v0.1.16
func (stringliteral StringLiteral) Value() string
Value returns a string value for this PDF object.
type Version ¶
type Version int
Version is a type for the internal representation of PDF versions.
func PDFVersion ¶
PDFVersion returns the PDFVersion for a version string.
type Watermark ¶ added in v0.1.16
type Watermark struct {
// contains filtered or unexported fields
}
Watermark represents the basic structure and command details for the commands "Stamp" and "Watermark".
func ParseWatermarkDetails ¶ added in v0.1.16
ParseWatermarkDetails parses a Watermark/Stamp command string into an internal structure.
func (Watermark) IsImage ¶ added in v0.1.16
IsImage returns whether the watermark content is an image or text.
func (Watermark) OnTopString ¶ added in v0.1.16
OnTopString returns "watermark" or "stamp" whichever applies.
type WriteContext ¶
type WriteContext struct {
	// The PDF-File which gets generated.
	DirName  string
	FileName string
	FileSize int64
	*bufio.Writer

	Command       string // command in effect.
	ExtractPageNr int    // page to be generated for rendering a single-page/PDF.
	ExtractPages  IntSet // pages to be generated for a trimmed PDF.

	BinaryTotalSize int64 // total stream data, counts 100% all stream data written.
	BinaryImageSize int64 // total image stream data written = Read.BinaryImageSize.
	BinaryFontSize  int64 // total font stream data (fontfiles) = copy of Read.BinaryFontSize.

	Table  map[int]int64 // object write offsets
	Offset int64         // current write offset

	WriteToObjectStream bool // if true start to embed objects into object streams and obey ObjectStreamMaxObjects.
	CurrentObjStream    *int // if not nil, any new non-stream-object gets added to the object stream with this object number.

	Eol string // end of line char sequence
}
WriteContext represents the context for writing a PDF file.
func NewWriteContext ¶
func NewWriteContext(eol string) *WriteContext
NewWriteContext returns a new WriteContext.
func (*WriteContext) ExtractPage ¶
func (wc *WriteContext) ExtractPage(i int) bool
ExtractPage returns true if page i needs to be generated.
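A minimal sketch of such a page selection lookup, assuming the set of pages is a map from page number to bool. The `intSet` and `writeCtx` types below are hypothetical and only mirror the relevant field of WriteContext.

```go
package main

import "fmt"

// intSet is a hypothetical stand-in for a set of page numbers.
type intSet map[int]bool

// writeCtx is a hypothetical, trimmed-down write context.
type writeCtx struct {
	extractPages intSet // pages selected for output; nil means none selected
}

// extractPage reports whether page i was selected for generation.
// Indexing a nil map is safe in Go and yields false.
func (wc *writeCtx) extractPage(i int) bool {
	return wc.extractPages[i]
}

func main() {
	wc := &writeCtx{extractPages: intSet{1: true, 3: true}}
	for page := 1; page <= 3; page++ {
		fmt.Printf("page %d selected: %v\n", page, wc.extractPage(page))
	}
}
```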
func (*WriteContext) HasWriteOffset ¶
func (wc *WriteContext) HasWriteOffset(objNumber int) bool
HasWriteOffset returns true if an object has already been written to PDFDestination.
func (*WriteContext) LogStats ¶
func (wc *WriteContext) LogStats()
LogStats logs stats for written file.
func (*WriteContext) ReducedFeatureSet ¶
func (wc *WriteContext) ReducedFeatureSet() bool
ReducedFeatureSet returns true for Split, Trim, Merge, and ExtractPages. These are internal triggers, not to be confused with pdfcpu commands.
func (*WriteContext) SetWriteOffset ¶
func (wc *WriteContext) SetWriteOffset(objNumber int)
SetWriteOffset saves the current write offset to the PDFDestination.
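The interplay of HasWriteOffset and SetWriteOffset can be sketched as follows, assuming the bookkeeping is a plain map from object number to byte offset as the Table and Offset fields suggest. The `writeCtx` type and the `writeObject` helper are hypothetical and only illustrate the "write each object once" pattern.

```go
package main

import "fmt"

// writeCtx is a hypothetical, trimmed-down write context.
type writeCtx struct {
	table  map[int]int64 // object number -> byte offset where it was written
	offset int64         // current write offset
}

// setWriteOffset records the current offset as the start of object objNumber.
func (wc *writeCtx) setWriteOffset(objNumber int) {
	wc.table[objNumber] = wc.offset
}

// hasWriteOffset reports whether object objNumber has already been written.
func (wc *writeCtx) hasWriteOffset(objNumber int) bool {
	_, ok := wc.table[objNumber]
	return ok
}

// writeObject simulates writing size bytes for an object exactly once.
func (wc *writeCtx) writeObject(objNumber int, size int64) {
	if wc.hasWriteOffset(objNumber) {
		return // already written, keep the first offset
	}
	wc.setWriteOffset(objNumber)
	wc.offset += size
}

func main() {
	wc := &writeCtx{table: map[int]int64{}, offset: 15}
	wc.writeObject(1, 40)
	wc.writeObject(2, 25)
	wc.writeObject(1, 40) // ignored: object 1 is already recorded
	fmt.Println(wc.table[1], wc.table[2], wc.offset)
}
```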
func (*WriteContext) WriteEol ¶
func (wc *WriteContext) WriteEol() error
WriteEol writes an end of line sequence.
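Because WriteContext embeds a *bufio.Writer and carries an Eol string, writing a line terminator amounts to one buffered write. The sketch below uses a hypothetical `lineWriter` type and `render` helper to show the idea; it is not the library's implementation.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// lineWriter is a hypothetical writer that embeds *bufio.Writer and
// remembers the end-of-line sequence chosen for the output file.
type lineWriter struct {
	*bufio.Writer
	eol string
}

// writeEol writes the configured end-of-line sequence.
func (w *lineWriter) writeEol() error {
	_, err := w.WriteString(w.eol)
	return err
}

// render writes each line followed by eol and returns the result as a string.
func render(eol string, lines ...string) (string, error) {
	var buf bytes.Buffer
	w := &lineWriter{Writer: bufio.NewWriter(&buf), eol: eol}
	for _, line := range lines {
		if _, err := w.WriteString(line); err != nil {
			return "", err
		}
		if err := w.writeEol(); err != nil {
			return "", err
		}
	}
	// Buffered data only reaches buf after Flush.
	if err := w.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render("\r\n", "%PDF-1.7", "%%EOF")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", out)
}
```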
type XRefStreamDict ¶ added in v0.1.16
type XRefStreamDict struct {
	StreamDict
	Size           int
	Objects        []int
	W              [3]int
	PreviousOffset *int64
}
XRefStreamDict represents a cross reference stream dictionary.
func NewXRefStreamDict ¶ added in v0.1.16
func NewXRefStreamDict(ctx *Context) *XRefStreamDict
NewXRefStreamDict creates a new XRefStreamDict object.
type XRefTable ¶
type XRefTable struct {
	Table               map[int]*XRefTableEntry
	Size                *int             // Object count from PDF trailer dict.
	PageCount           int              // Number of pages, set during validation.
	Root                *IndirectRef     // Pointer to catalog (reference to root object).
	RootDict            *Dict            // Catalog
	Names               map[string]*Node // Cache for name trees as found in catalog.
	Encrypt             *IndirectRef     // Encrypt dict.
	E                   *Enc
	EncKey              []byte // Encrypt key.
	AES4Strings         bool
	AES4Streams         bool
	AES4EmbeddedStreams bool

	// PDF Version
	HeaderVersion *Version // The PDF version the source is claiming to us as per its header.
	RootVersion   *Version // Optional PDF version taking precedence over the header version.

	// Document information section
	Info     *IndirectRef // Infodict (reference to info dict object)
	ID       *Array       // from trailer
	Author   string
	Creator  string
	Producer string

	// Linearization section (not yet supported)
	OffsetPrimaryHintTable  *int64
	OffsetOverflowHintTable *int64
	LinearizationObjs       IntSet

	// Offspec section
	AdditionalStreams *Array // array of IndirectRef - trailer :e.g., Oasis "Open Doc"

	// Statistics
	Stats  PDFStats
	Tagged bool // File is using tags. This is important for ???

	// Validation
	Valid          bool // true means successful validated against ISO 32000.
	ValidationMode int  // see Configuration
	Optimized      bool
}
XRefTable represents a PDF cross reference table plus stats for a PDF file.
func CreateAcroFormDemoXRef ¶
CreateAcroFormDemoXRef creates a PDF file with an AcroForm example.
func CreateAnnotationDemoXRef ¶
CreateAnnotationDemoXRef creates a PDF file with examples of annotations and actions.
func CreateDemoXRef ¶ added in v0.1.16
CreateDemoXRef creates a minimal PDF file for demo purposes.
func (*XRefTable) BindNameTrees ¶
BindNameTrees syncs up the internal name tree cache with the xreftable.
func (*XRefTable) CatalogHasPieceInfo ¶
CatalogHasPieceInfo returns true if the root has an entry for "PieceInfo".
func (*XRefTable) DeleteObject ¶
DeleteObject marks an object as free and inserts it into the free list right after the head.
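The free list insertion described above can be sketched with a simplified table. In a PDF cross reference table, each free entry stores the number of the next free object and object 0 is the head of that chain. The `entry` and `table` types are hypothetical, and incrementing the generation number on deletion is an assumption based on the PDF convention for reusable object numbers.

```go
package main

import "fmt"

// entry is a hypothetical, simplified cross reference table entry.
type entry struct {
	free bool
	next int // for free entries: object number of the next free object
	gen  int // generation number
}

// table maps object numbers to entries; object 0 is the free list head.
type table map[int]*entry

// newTable returns a table holding only the free list head, which links to itself.
func newTable() table {
	return table{0: {free: true, next: 0, gen: 65535}}
}

// deleteObject marks object n as free and links it in right after the head.
// Incrementing the generation number here is an assumption of this sketch.
func (t table) deleteObject(n int) {
	e, ok := t[n]
	if !ok || e.free {
		return // unknown or already free: nothing to do
	}
	head := t[0]
	e.free = true
	e.gen++
	e.next = head.next
	head.next = n
}

// freeList walks the chain from the head and returns the free object numbers.
func (t table) freeList() []int {
	var list []int
	for n := t[0].next; n != 0; n = t[n].next {
		list = append(list, n)
	}
	return list
}

func main() {
	t := newTable()
	t[3] = &entry{}
	t[5] = &entry{}
	t.deleteObject(3)
	t.deleteObject(5)
	fmt.Println(t.freeList()) // most recently deleted object comes first
}
```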
func (*XRefTable) DeleteObjectGraph ¶
DeleteObjectGraph deletes all objects reachable by indRef.
func (*XRefTable) Dereference ¶
Dereference resolves an indirect object and returns the resulting PDF object.
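A minimal sketch of the resolution step, using hypothetical `object`, `indirectRef`, `entry` and `table` types. Returning nil for references to missing or free entries is an assumption of this sketch, not a documented guarantee of the library.

```go
package main

import "fmt"

// object is a hypothetical stand-in for any PDF object.
type object interface{}

// indirectRef is a hypothetical reference to another object by number.
type indirectRef struct{ objNr int }

// entry is a simplified table entry.
type entry struct {
	free bool
	obj  object
}

// table maps object numbers to entries.
type table map[int]*entry

// dereference returns obj unchanged unless it is an indirect reference, in
// which case it returns the referenced object. References to missing or
// free entries resolve to nil (an assumption of this sketch).
func (t table) dereference(obj object) object {
	ref, ok := obj.(indirectRef)
	if !ok {
		return obj
	}
	e, found := t[ref.objNr]
	if !found || e.free {
		return nil
	}
	return e.obj
}

func main() {
	t := table{
		4: {obj: "catalog dict"},
		7: {free: true},
	}
	fmt.Println(t.dereference(42))                    // direct object passes through
	fmt.Println(t.dereference(indirectRef{objNr: 4})) // resolved via the table
	fmt.Println(t.dereference(indirectRef{objNr: 7})) // free entry
}
```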
func (*XRefTable) DereferenceArray ¶
DereferenceArray resolves and validates an array object, which may be an indirect reference.
func (*XRefTable) DereferenceDict ¶
DereferenceDict resolves and validates a dictionary object, which may be an indirect reference.
func (*XRefTable) DereferenceDictEntry ¶ added in v0.1.16
DereferenceDictEntry returns a dereferenced dict entry.
func (*XRefTable) DereferenceInteger ¶
DereferenceInteger resolves and validates an integer object, which may be an indirect reference.
func (*XRefTable) DereferenceName ¶
func (xRefTable *XRefTable) DereferenceName(obj Object, sinceVersion Version, validate func(string) bool) (n Name, err error)
DereferenceName resolves and validates a name object, which may be an indirect reference.
func (*XRefTable) DereferenceNumber ¶ added in v0.1.16
DereferenceNumber resolves a number object, which may be an indirect reference, and returns a float64. It is assumed this func is called on a validated xRefTable.
func (*XRefTable) DereferenceStreamDict ¶
func (xRefTable *XRefTable) DereferenceStreamDict(obj Object) (*StreamDict, error)
DereferenceStreamDict resolves and validates a stream dictionary object, which may be an indirect reference.
func (*XRefTable) DereferenceStringLiteral ¶
func (xRefTable *XRefTable) DereferenceStringLiteral(obj Object, sinceVersion Version, validate func(string) bool) (s StringLiteral, err error)
DereferenceStringLiteral resolves and validates a string literal object, which may be an indirect reference.
func (*XRefTable) DereferenceStringOrHexLiteral ¶
func (xRefTable *XRefTable) DereferenceStringOrHexLiteral(obj Object, sinceVersion Version, validate func(string) bool) (o Object, err error)
DereferenceStringOrHexLiteral resolves and validates a string or hex literal object, which may be an indirect reference.
func (*XRefTable) DereferenceText ¶ added in v0.1.16
DereferenceText resolves and validates a string or hex literal object to a string.
func (*XRefTable) EncryptDict ¶
EncryptDict returns a pointer to the encryption dictionary.
func (*XRefTable) EnsureCollection ¶
EnsureCollection makes sure there is a Collection entry in the catalog. Needed for portfolios / portable collections, e.g., for file attachments.
func (*XRefTable) EnsureValidFreeList ¶
EnsureValidFreeList ensures the integrity of the free list associated with the recorded free objects. See 7.5.4 Cross-Reference Table
func (*XRefTable) Find ¶
func (xRefTable *XRefTable) Find(objNumber int) (*XRefTableEntry, bool)
Find returns the XRefTable entry for given object number.
func (*XRefTable) FindObject ¶
FindObject returns the object of the XRefTableEntry for a specific object number.
func (*XRefTable) FindTableEntry ¶
func (xRefTable *XRefTable) FindTableEntry(objNumber int, generationNumber int) (*XRefTableEntry, bool)
FindTableEntry returns the XRefTable entry for given object and generation numbers.
func (*XRefTable) FindTableEntryForIndRef ¶
func (xRefTable *XRefTable) FindTableEntryForIndRef(indRef *IndirectRef) (*XRefTableEntry, bool)
FindTableEntryForIndRef returns the XRefTable entry for given indirect reference.
func (*XRefTable) FindTableEntryLight ¶
func (xRefTable *XRefTable) FindTableEntryLight(objNumber int) (*XRefTableEntry, bool)
FindTableEntryLight returns the XRefTable entry for given object number.
func (*XRefTable) Free ¶
func (xRefTable *XRefTable) Free(objNumber int) (*XRefTableEntry, error)
Free returns the cross reference table entry for the free object with the given number.
func (*XRefTable) IDFirstElement ¶
IDFirstElement returns the first element of ID.
func (*XRefTable) IndRefForNewObject ¶
func (xRefTable *XRefTable) IndRefForNewObject(obj Object) (*IndirectRef, error)
IndRefForNewObject inserts an object into the xRefTable and returns an indirect reference to it.
func (*XRefTable) InsertAndUseRecycled ¶
func (xRefTable *XRefTable) InsertAndUseRecycled(xRefTableEntry XRefTableEntry) (objNumber int, err error)
InsertAndUseRecycled adds given xRefTableEntry into the cross reference table utilizing the freelist.
func (*XRefTable) InsertNew ¶
func (xRefTable *XRefTable) InsertNew(xRefTableEntry XRefTableEntry) (objNumber int)
InsertNew adds given xRefTableEntry at next new objNumber into the cross reference table. Only to be called once an xRefTable has been generated completely and all trailer dicts have been processed. xRefTable.Size is the size entry of the first trailer dict processed. Called on creation of new object streams. Called by InsertAndUseRecycled.
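Since the trailer's Size entry is one greater than the highest object number in use, it doubles as the next unused object number. The following sketch shows that bookkeeping with hypothetical `entry` and `table` types; it is an illustration under that assumption, not the library's code.

```go
package main

import "fmt"

// entry is a hypothetical, simplified table entry.
type entry struct {
	obj string
}

// table is a hypothetical cross reference table.
type table struct {
	entries map[int]*entry
	size    int // one greater than the highest object number in use
}

// insertNew stores e under the next unused object number and returns it.
func (t *table) insertNew(e *entry) int {
	objNr := t.size
	t.entries[objNr] = e
	t.size++
	return objNr
}

func main() {
	// A table that already holds objects 0..2, so size is 3.
	t := &table{
		entries: map[int]*entry{0: {obj: "free head"}, 1: {obj: "catalog"}, 2: {obj: "pages"}},
		size:    3,
	}
	fmt.Println(t.insertNew(&entry{obj: "object stream"}))
	fmt.Println(t.insertNew(&entry{obj: "info dict"}))
	fmt.Println(t.size)
}
```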
func (*XRefTable) InsertObject ¶
InsertObject inserts an object into the xRefTable.
func (*XRefTable) IsLinearizationObject ¶
IsLinearizationObject returns true if object #i is a linearization object.
func (*XRefTable) LinearizationObjsString ¶
LinearizationObjsString returns a formatted string and the number of objs.
func (*XRefTable) LocateNameTree ¶
LocateNameTree locates/ensures a specific name tree.
func (*XRefTable) MissingObjects ¶
MissingObjects returns the number of objects that were not written plus the corresponding comma separated string representation.
func (*XRefTable) NewEmbeddedFileStreamDict ¶
func (xRefTable *XRefTable) NewEmbeddedFileStreamDict(filename string) (*StreamDict, error)
NewEmbeddedFileStreamDict creates and returns an embeddedFileStreamDict containing the file "filename".
func (*XRefTable) NewFileSpecDict ¶
func (xRefTable *XRefTable) NewFileSpecDict(filename string, indRefStreamDict IndirectRef) (*Dict, error)
NewFileSpecDict creates and returns a new fileSpec dictionary.
func (*XRefTable) NewSoundStreamDict ¶
func (xRefTable *XRefTable) NewSoundStreamDict(filename string, samplingRate int, fileSpecDict *Dict) (*StreamDict, error)
NewSoundStreamDict returns a new sound stream dict.
func (*XRefTable) NewStreamDict ¶ added in v0.1.16
func (xRefTable *XRefTable) NewStreamDict(filename string) (*StreamDict, error)
NewStreamDict creates a streamDict for the contents of the file "filename".
func (*XRefTable) NextForFree ¶
NextForFree returns the number of the object the free object with objNumber links to. This is the successor of this free object in the free list.
func (*XRefTable) PageDict ¶
func (xRefTable *XRefTable) PageDict(page int) (*Dict, *InheritedPageAttrs, error)
PageDict returns a specific page dict along with the resources, MediaBox and CropBox in effect.
func (*XRefTable) Pages ¶
func (xRefTable *XRefTable) Pages() (*IndirectRef, error)
Pages returns the Pages reference contained in the catalog.
func (*XRefTable) ParseRootVersion ¶
ParseRootVersion returns a string representation for an optional Version entry in the root object.
func (*XRefTable) RemoveCollection ¶
RemoveCollection removes an existing Collection entry from the catalog.
func (*XRefTable) RemoveEmbeddedFilesNameTree ¶
RemoveEmbeddedFilesNameTree removes both the embedded files name tree and the Collection dict.
func (*XRefTable) RemoveNameTree ¶
RemoveNameTree removes a specific name tree. Also removes a resulting empty names dict.
func (*XRefTable) UndeleteObject ¶
UndeleteObject ensures an object is not recorded in the free list. This is sometimes needed because of indirect references to free objects in the original PDF file.
func (*XRefTable) ValidateVersion ¶
ValidateVersion validates against the xRefTable's version.
func (*XRefTable) Version ¶
Version returns the PDF version of the PDF writer that created this file. Before V1.4 this is the header version. Since V1.4 the catalog may contain a Version entry which takes precedence over the header version.
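The precedence rule can be sketched as follows. The `version` and `doc` types are hypothetical, and the sketch simply prefers the catalog version whenever one is present, which is a simplification of the rule described above.

```go
package main

import "fmt"

// version is a hypothetical stand-in that stores the minor version of PDF 1.x.
type version int

// String renders the version in the usual "1.x" form.
func (v version) String() string { return fmt.Sprintf("1.%d", int(v)) }

// doc is a hypothetical holder for the two version sources.
type doc struct {
	headerVersion *version // always set after reading the file header
	rootVersion   *version // optional Version entry of the catalog
}

// effectiveVersion returns the catalog version if present,
// otherwise the header version.
func (d doc) effectiveVersion() version {
	if d.rootVersion != nil {
		return *d.rootVersion
	}
	return *d.headerVersion
}

func main() {
	header, root := version(4), version(7)
	plain := doc{headerVersion: &header}
	upgraded := doc{headerVersion: &header, rootVersion: &root}
	fmt.Println(plain.effectiveVersion(), upgraded.effectiveVersion())
}
```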
func (*XRefTable) VersionString ¶
VersionString returns a string representation of this PDF file's PDF version.
type XRefTableEntry ¶
type XRefTableEntry struct {
	Free            bool
	Offset          *int64
	Generation      *int
	Object          Object // maybe *Object ??
	Compressed      bool
	ObjectStream    *int
	ObjectStreamInd *int
}
XRefTableEntry represents an entry in the PDF cross reference table.
This may wrap a free object, a compressed object or any in-use PDF object:
Dict, StreamDict, ObjectStreamDict, XRefStreamDict, Array, Integer, Float, Name, StringLiteral, HexLiteral, Boolean
func NewFreeHeadXRefTableEntry ¶
func NewFreeHeadXRefTableEntry() *XRefTableEntry
NewFreeHeadXRefTableEntry returns the xref table entry for object 0, which is by definition the head of the free list (list of free objects).
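In a classic cross reference table, object 0 is always free and carries generation number 65535. The sketch below builds such a head entry and renders it in the fixed-width layout used by cross reference sections, without the trailing end-of-line bytes. The `entry` type and both helpers are hypothetical and not part of the library.

```go
package main

import "fmt"

// entry is a hypothetical, simplified cross reference table entry.
type entry struct {
	free  bool
	field int64 // byte offset for in-use entries, next free object number for free ones
	gen   int   // generation number
}

// newFreeHead returns the entry for object 0, the head of the free list.
func newFreeHead() entry {
	return entry{free: true, field: 0, gen: 65535}
}

// xrefLine renders the entry as a 10-digit field, a 5-digit generation
// number and the keyword "f" (free) or "n" (in use).
func (e entry) xrefLine() string {
	kind := "n"
	if e.free {
		kind = "f"
	}
	return fmt.Sprintf("%010d %05d %s", e.field, e.gen, kind)
}

func main() {
	fmt.Println(newFreeHead().xrefLine())
	fmt.Println(entry{field: 15, gen: 0}.xrefLine())
}
```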
func NewXRefTableEntryGen0 ¶
func NewXRefTableEntryGen0(obj Object) *XRefTableEntry
NewXRefTableEntryGen0 returns a cross reference table entry for an object with generation 0.
Source Files ¶
- array.go
- attach.go
- colorSpace.go
- configuration.go
- context.go
- createAnnotations.go
- createRenditions.go
- createTestPDF.go
- crypto.go
- dict.go
- doc.go
- equal.go
- extract.go
- filter.go
- iccProfile.go
- imageRead.go
- imageWrite.go
- info.go
- merge.go
- nameTree.go
- optimize.go
- parse.go
- read.go
- resources.go
- slice.go
- stamp.go
- stats.go
- streamdict.go
- string.go
- types.go
- utf16.go
- version.go
- write.go
- writeObjects.go
- writePages.go
- writeStats.go
- xreftable.go