Documentation ¶
Overview ¶
Package pdf provides support for reading and writing PDF files.
The package treats PDF files as containers holding a sequence of objects (typically Dictionaries and Streams). Objects are written sequentially, but can be read in any order. These objects represent pages of text, fonts, images and so on. Subpackages implement support for producing PDF files representing pages of text and images.
A Reader can be used to read objects from an existing PDF file:
r, err := pdf.Open("in.pdf")
if err != nil {
	log.Fatal(err)
}
defer r.Close()
... use r.Catalog to locate objects in the file ...
A Writer can be used to write objects to a new PDF file:
w, err := pdf.Create("out.pdf")
if err != nil {
	log.Fatal(err)
}
... add objects to the document using w.Write() and w.OpenStream() ...
w.Catalog.Pages = ... // set the page tree
err = w.Close()
if err != nil {
	log.Fatal(err)
}
The following types represent the native PDF object types. All of them implement the Object interface: Array, Bool, Dict, Integer, Name, Real, Reference, Stream, String.
Index ¶
- type Array
- type AuthenticationError
- type Bool
- type Catalog
- type Dict
- type FilterInfo
- type Info
- type Integer
- type MalformedFileError
- type Name
- type Number
- type Object
- type PageRotation
- type Perm
- type Placeholder
- type ReadPwdFunc
- type Reader
- func (r *Reader) AuthenticateOwner() error
- func (r *Reader) Close() error
- func (r *Reader) GetArray(obj Object) (Array, error)
- func (r *Reader) GetDict(obj Object) (Dict, error)
- func (r *Reader) GetInfo() (*Info, error)
- func (r *Reader) GetInt(obj Object) (Integer, error)
- func (r *Reader) GetRectangle(obj Object) (*Rectangle, error)
- func (r *Reader) GetStream(obj Object) (*Stream, error)
- func (r *Reader) ReadSequential() (Object, *Reference, error)
- func (r *Reader) Resolve(obj Object) (Object, error)
- type Real
- type Rectangle
- func (rect *Rectangle) Extend(other *Rectangle)
- func (rect Rectangle) IsZero() bool
- func (rect *Rectangle) NearlyEqual(other *Rectangle, eps float64) bool
- func (rect *Rectangle) PDF(w io.Writer) error
- func (rect *Rectangle) String() string
- func (rect *Rectangle) XPos(rel float64) float64
- func (rect *Rectangle) YPos(rel float64) float64
- type Reference
- type Resource
- type Resources
- type Stream
- type String
- type Version
- type VersionError
- type Writer
- func (pdf *Writer) Alloc() *Reference
- func (pdf *Writer) CheckVersion(operation string, minVersion Version) error
- func (pdf *Writer) Close() error
- func (pdf *Writer) NewPlaceholder(size int) *Placeholder
- func (pdf *Writer) OnClose(callback func(*Writer) error)
- func (pdf *Writer) OpenStream(dict Dict, ref *Reference, filters ...*FilterInfo) (io.WriteCloser, *Reference, error)
- func (pdf *Writer) SetInfo(info *Info)
- func (pdf *Writer) Write(obj Object, ref *Reference) (*Reference, error)
- func (pdf *Writer) WriteCompressed(refs []*Reference, objects ...Object) ([]*Reference, error)
- type WriterOptions
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Array ¶
type Array []Object
Array represents an array of objects in a PDF file.
func (Array) AsRectangle ¶
AsRectangle converts an array of 4 numbers to a Rectangle object. If the array does not have the correct format, an error is returned.
type AuthenticationError ¶
type AuthenticationError struct {
ID []byte
}
AuthenticationError indicates that authentication failed because the correct password has not been supplied.
func (*AuthenticationError) Error ¶
func (err *AuthenticationError) Error() string
type Catalog ¶
type Catalog struct {
	Version           Version      `pdf:"optional"`
	Extensions        Object       `pdf:"optional"`
	Pages             *Reference
	PageLabels        Object       `pdf:"optional"`
	Names             Object       `pdf:"optional"`
	Dests             Object       `pdf:"optional"`
	ViewerPreferences Object       `pdf:"optional"`
	PageLayout        Name         `pdf:"optional"`
	PageMode          Name         `pdf:"optional"`
	Outlines          *Reference   `pdf:"optional"`
	Threads           *Reference   `pdf:"optional"`
	OpenAction        Object       `pdf:"optional"`
	AA                Object       `pdf:"optional"`
	URI               Object       `pdf:"optional"`
	AcroForm          Object       `pdf:"optional"`
	MetaData          *Reference   `pdf:"optional"`
	StructTreeRoot    Object       `pdf:"optional"`
	MarkInfo          Object       `pdf:"optional"`
	Lang              language.Tag `pdf:"optional"`
	SpiderInfo        Object       `pdf:"optional"`
	OutputIntents     Object       `pdf:"optional"`
	PieceInfo         Object       `pdf:"optional"`
	OCProperties      Object       `pdf:"optional"`
	Perms             Object       `pdf:"optional"`
	Legal             Object       `pdf:"optional"`
	Requirements      Object       `pdf:"optional"`
	Collection        Object       `pdf:"optional"`
	NeedsRendering    bool         `pdf:"optional"`
	// contains filtered or unexported fields
}
Catalog represents a PDF Document Catalog. The only required field in this structure is Pages, which specifies the root of the page tree.
The Document Catalog is documented in section 7.7.2 of PDF 32000-1:2008.
type Dict ¶
Dict represents a Dictionary object in a PDF file.
func AsDict ¶
func AsDict(s interface{}) Dict
AsDict creates a PDF Dict object, encoding the fields of a Go struct.
func (Dict) Decode ¶
Decode initialises a tagged struct using the data from a PDF dictionary. The argument s must be a pointer to a struct, or the function will panic. The function get, if non-nil, is used to resolve references to indirect objects, where needed; the Reader.Resolve method can be used for this argument.
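The struct tags used by AsDict and Decode follow the usual Go convention of a comma-separated option list under the key pdf, as seen in the Catalog and Info types. The following is a minimal sketch of how such tags can be read with the standard reflect package. The type exampleDoc and the helper tagOptions are hypothetical, and the sketch makes no claim about how the package itself interprets the options.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// exampleDoc is a hypothetical tagged struct, using tag values of the
// kind shown in this documentation.
type exampleDoc struct {
	Title string `pdf:"text string,optional"`
	Pages int
	Lang  string `pdf:"optional"`
}

// tagOptions returns, for every field of the struct s (or of the struct
// that s points to), the comma-separated options of its pdf tag.
// Fields without a pdf tag map to a nil slice.
func tagOptions(s interface{}) map[string][]string {
	t := reflect.TypeOf(s)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	out := make(map[string][]string)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag := f.Tag.Get("pdf")
		if tag == "" {
			out[f.Name] = nil
			continue
		}
		out[f.Name] = strings.Split(tag, ",")
	}
	return out
}

func main() {
	opts := tagOptions(&exampleDoc{})
	fmt.Println(opts["Title"]) // [text string optional]
	fmt.Println(len(opts["Pages"]))
}
```

A field without a tag, such as Pages in the sketch, yields no options, which matches the way Catalog marks its only required field.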
type FilterInfo ¶
FilterInfo describes one PDF stream filter.
type Info ¶
type Info struct {
	Title    string `pdf:"text string,optional"`
	Author   string `pdf:"text string,optional"`
	Subject  string `pdf:"text string,optional"`
	Keywords string `pdf:"text string,optional"`

	// Creator gives the name of the application that created the original
	// document, if the document was converted to PDF from another format.
	Creator string `pdf:"text string,optional"`

	// Producer gives the name of the application that converted the document,
	// if the document was converted to PDF from another format.
	Producer string `pdf:"text string,optional"`

	// CreationDate gives the date and time the document was created.
	CreationDate time.Time `pdf:"optional"`

	// ModDate gives the date and time the document was most recently modified.
	ModDate time.Time `pdf:"optional"`

	// Trapped indicates whether the document has been modified to include
	// trapping information. (A trap is an overlap between adjacent areas
	// of different colours, used to avoid visual problems caused by imprecise
	// alignment of different layers of ink.) Possible values are:
	//   * "True": The document has been fully trapped. No further trapping is
	//     necessary.
	//   * "False": The document has not been trapped.
	//   * "Unknown" (default): Either it is unknown whether the document has
	//     been trapped, or the document has been partially trapped. Further
	//     trapping may be necessary.
	Trapped Name `pdf:"optional,allowstring"`

	// Custom contains all non-standard fields in the Info dictionary.
	Custom map[string]string `pdf:"extra"`
}
Info represents a PDF Document Information Dictionary. All fields in this structure are optional.
The Document Information Dictionary is documented in section 14.3.3 of PDF 32000-1:2008.
type MalformedFileError ¶
MalformedFileError indicates that the PDF file could not be parsed.
func (*MalformedFileError) Error ¶
func (err *MalformedFileError) Error() string
func (*MalformedFileError) Unwrap ¶
func (err *MalformedFileError) Unwrap() error
type Name ¶
type Name string
Name represents a name object in a PDF file.
type Object ¶
type Object interface {
	// PDF writes the PDF file representation of the object to w.
	PDF(w io.Writer) error
}
Object represents an object in a PDF file. There are nine native types of PDF objects, which implement this interface: Array, Bool, Dict, Integer, Name, Real, Reference, Stream, and String. Custom types can be constructed from these basic types by implementing the Object interface.
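As an illustration of a custom type, the following minimal sketch implements an interface of the same shape as Object. The interface is redeclared locally so that the example is self-contained; the type Point, its serialisation as a two-element array, and the helper format are hypothetical and not part of the package.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Object mirrors the interface documented above. It is redeclared here
// only to keep the sketch self-contained.
type Object interface {
	PDF(w io.Writer) error
}

// Point is a hypothetical custom type, written out as a two-element
// PDF array of integers.
type Point struct {
	X, Y int
}

// PDF writes the point in PDF array syntax, e.g. "[10 20]".
func (p Point) PDF(w io.Writer) error {
	_, err := fmt.Fprintf(w, "[%d %d]", p.X, p.Y)
	return err
}

// format renders any Object into a string.
func format(obj Object) (string, error) {
	var buf bytes.Buffer
	if err := obj.PDF(&buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	s, err := format(Point{X: 10, Y: 20})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // [10 20]
}
```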
type PageRotation ¶
type PageRotation int
PageRotation describes how a page shall be rotated when displayed or printed. The possible values are RotateInherit, Rotate0, Rotate90, Rotate180, Rotate270.
const (
	RotateInherit PageRotation = iota // use inherited value

	Rotate0   // don't rotate
	Rotate90  // rotate 90 degrees clockwise
	Rotate180 // rotate 180 degrees clockwise
	Rotate270 // rotate 270 degrees clockwise
)
Valid values for PageRotation.
The PDF integer values cannot be used directly, because a rotation of 0 degrees would then be indistinguishable from an unspecified rotation.
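The reason for the separate RotateInherit value can be seen in the following sketch, which redeclares the constants locally. The helper degrees and the struct page are hypothetical; in particular, the sketch does not reproduce DecodeRotation or ToPDF, whose exact behaviour is not described here.

```go
package main

import "fmt"

// PageRotation and its constants are redeclared from the documentation
// above so that the sketch is self-contained.
type PageRotation int

const (
	RotateInherit PageRotation = iota
	Rotate0
	Rotate90
	Rotate180
	Rotate270
)

// degrees is a hypothetical helper. It returns the clockwise rotation in
// degrees, and false if no rotation was specified explicitly.
func degrees(r PageRotation) (int, bool) {
	if r < Rotate0 || r > Rotate270 {
		return 0, false
	}
	return int(r-Rotate0) * 90, true
}

// page is a hypothetical struct with a rotation field.
type page struct {
	Rotate PageRotation
}

func main() {
	var p page // the zero value means "inherit", not "0 degrees"
	fmt.Println(degrees(p.Rotate))
	fmt.Println(degrees(Rotate0))
	fmt.Println(degrees(Rotate270))
}
```

Because RotateInherit is the zero value, a struct field that was never set reports an unspecified rotation rather than an explicit rotation of 0 degrees.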
func DecodeRotation ¶
func DecodeRotation(rot Integer) (PageRotation, error)
func (PageRotation) ToPDF ¶
func (r PageRotation) ToPDF() Integer
type Perm ¶
type Perm int
Perm describes which operations are permitted when accessing the document with User access (but not Owner access). This library just reports the permissions as specified in the PDF file. It is up to the caller to enforce the permissions.
const (
	// PermCopy allows extracting text and graphics.
	PermCopy Perm = 1 << iota

	// PermPrintDegraded allows printing of a low-level representation of the
	// appearance, possibly of degraded quality.
	PermPrintDegraded

	// PermPrint allows printing a representation from which a faithful digital
	// copy of the PDF content could be generated. This implies
	// PermPrintDegraded.
	PermPrint

	// PermForms allows filling in form fields, including signature fields.
	PermForms

	// PermAnnotate allows adding or modifying text annotations. This implies
	// PermForms.
	PermAnnotate

	// PermAssemble allows inserting, rotating, or deleting pages and creating
	// bookmarks or thumbnail images.
	PermAssemble

	// PermModify allows modifying the document. This implies PermAssemble.
	PermModify

	// PermAll gives the user all permissions, making User access equivalent to
	// Owner access.
	PermAll = permNext - 1
)
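Since the library only reports permissions, callers need to test the flags themselves. The following sketch shows the usual bit-mask test. The constants are redeclared locally; the placement of the unexported permNext directly after PermModify is an assumption, and the helper allows is hypothetical.

```go
package main

import "fmt"

// Perm and its flags are redeclared from the documentation above.
type Perm int

const (
	PermCopy Perm = 1 << iota
	PermPrintDegraded
	PermPrint
	PermForms
	PermAnnotate
	PermAssemble
	PermModify
	permNext // assumption: sentinel one bit past the last flag
	PermAll  = permNext - 1
)

// allows is a hypothetical helper which reports whether every flag in
// want is set in p.
func allows(p, want Perm) bool {
	return p&want == want
}

func main() {
	p := PermCopy | PermPrintDegraded
	fmt.Println(allows(p, PermCopy))                   // true
	fmt.Println(allows(p, PermPrint))                  // false
	fmt.Println(allows(PermAll, PermModify|PermForms)) // true
}
```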
type Placeholder ¶
type Placeholder struct {
// contains filtered or unexported fields
}
A Placeholder can be used to reserve space in a PDF file where some value can be filled in later. This is, for example, used to store the content length of a compressed stream in a PDF stream dictionary. Placeholder objects are created using Writer.NewPlaceholder.
func (*Placeholder) PDF ¶
func (x *Placeholder) PDF(w io.Writer) error
PDF implements the Object interface.
func (*Placeholder) Set ¶
func (x *Placeholder) Set(val Object) error
Set fills in the value of the placeholder object. This should be called as soon as possible after the value becomes known.
type ReadPwdFunc ¶
ReadPwdFunc describes a function which can be used to query the user for a password for the document with the given ID. The first call for each authentication attempt has try == 0. If the returned password was wrong, the function is called again, repeatedly, with sequentially increasing values of try. If the ReadPwdFunc returns the empty string, the authentication attempt is aborted and an AuthenticationError is reported to the caller.
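The retry protocol described above can be sketched as a simple loop. The signature func(ID []byte, try int) string is an assumption based on the description, since the type declaration is not shown here; the function authenticate, the callback check and the error value errAuth are hypothetical stand-ins for the package's internal logic.

```go
package main

import (
	"errors"
	"fmt"
)

// readPwdFunc uses an assumed signature, derived from the description
// of ReadPwdFunc.
type readPwdFunc func(ID []byte, try int) string

// errAuth stands in for the AuthenticationError type of the package.
var errAuth = errors.New("authentication failed")

// authenticate asks readPwd for passwords, with try counting up from 0,
// until check accepts one or readPwd returns the empty string.
func authenticate(id []byte, readPwd readPwdFunc, check func(pwd string) bool) error {
	for try := 0; ; try++ {
		pwd := readPwd(id, try)
		if pwd == "" {
			return errAuth // the caller gave up
		}
		if check(pwd) {
			return nil
		}
	}
}

func main() {
	candidates := []string{"guess", "secret"}
	readPwd := func(ID []byte, try int) string {
		if try >= len(candidates) {
			return "" // abort once all candidates have been tried
		}
		return candidates[try]
	}
	check := func(pwd string) bool { return pwd == "secret" }

	err := authenticate([]byte{0x01}, readPwd, check)
	fmt.Println(err) // <nil>
}
```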
type Reader ¶
type Reader struct {
	// Version is the PDF version used in this file. This is specified in
	// the initial comment at the start of the file, and may be overridden by
	// the /Version entry in the document catalog.
	Version Version

	// The ID of the file. This is either a slice of two byte slices (the
	// original ID of the file, and the ID of the current version), or nil if
	// the file does not specify an ID.
	ID [][]byte

	Catalog *Catalog
	// contains filtered or unexported fields
}
Reader represents a PDF file opened for reading. Use the functions Open or NewReader to create a new Reader.
func Open ¶
Open opens the named PDF file for reading. After use, Reader.Close must be called to close the file the Reader is reading from.
func (*Reader) AuthenticateOwner ¶
AuthenticateOwner tries to authenticate the owner of a document. If a password is required, this calls the readPwd function specified in the call to NewReader. The return value is nil if the owner was authenticated (or if no authentication is required), and an object of type AuthenticationError if the required password was not supplied.
func (*Reader) Close ¶
Close closes the file underlying the reader. This call only has an effect if the io.ReaderAt passed to NewReader has a Close method, or if the Reader was created using Open. Otherwise, Close has no effect and returns nil.
func (*Reader) GetArray ¶
GetArray resolves references to indirect objects and makes sure the resulting object is an array.
func (*Reader) GetDict ¶
GetDict resolves references to indirect objects and makes sure the resulting object is a dictionary.
func (*Reader) GetInfo ¶
GetInfo reads the PDF /Info dictionary for the file. If no Info dictionary is present, nil is returned.
func (*Reader) GetInt ¶
GetInt resolves references to indirect objects and makes sure the resulting object is an Integer.
func (*Reader) GetRectangle ¶
GetRectangle resolves references to indirect objects and makes sure the resulting object is a PDF rectangle object. If the object is null, nil is returned.
func (*Reader) GetStream ¶
GetStream resolves references to indirect objects and makes sure the resulting object is a stream.
func (*Reader) ReadSequential ¶
ReadSequential returns the objects in a PDF file in the order they are stored in the file. When the end of file has been reached, io.EOF is returned.
The function returns the next object in the file, together with a Reference which can be used to read the object using Reader.Resolve. The read position is not affected by other methods of the Reader, so sequential access can safely be interspersed with calls to Reader.Resolve.
ReadSequential makes some effort to repair problems in corrupted or malformed PDF files. In particular, it may still work when the Reader.Resolve method fails with errors.
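A typical way to consume ReadSequential is a loop which stops at io.EOF. The sketch below shows this pattern against a local interface with the same method signature as listed in the index. The types object, reference and fakeReader are hypothetical stand-ins which keep the example runnable without a PDF file.

```go
package main

import (
	"fmt"
	"io"
)

// object and reference are stand-ins for pdf.Object and pdf.Reference.
// The field Number is hypothetical.
type object interface{}
type reference struct{ Number int }

// sequentialReader has the method shape of (*Reader).ReadSequential.
type sequentialReader interface {
	ReadSequential() (object, *reference, error)
}

// fakeReader returns a fixed list of objects, followed by io.EOF.
type fakeReader struct {
	objs []object
	next int
}

func (f *fakeReader) ReadSequential() (object, *reference, error) {
	if f.next >= len(f.objs) {
		return nil, nil, io.EOF
	}
	obj := f.objs[f.next]
	f.next++
	return obj, &reference{Number: f.next}, nil
}

// countObjects reads until io.EOF and returns the number of objects seen.
// Any other error is passed on to the caller.
func countObjects(r sequentialReader) (int, error) {
	n := 0
	for {
		_, _, err := r.ReadSequential()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

func main() {
	r := &fakeReader{objs: []object{"catalog", "pages", "page"}}
	n, err := countObjects(r)
	fmt.Println(n, err) // 3 <nil>
}
```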
type Rectangle ¶
type Rectangle struct {
LLx, LLy, URx, URy float64
}
Rectangle represents a PDF rectangle.
func (*Rectangle) NearlyEqual ¶
NearlyEqual reports whether the corner coordinates of two rectangles differ by less than `eps`.
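A coordinate-wise comparison of this kind might look like the following sketch. The struct is redeclared locally, and the free function nearlyEqual is a hypothetical re-implementation based only on the description above, not the package's code.

```go
package main

import (
	"fmt"
	"math"
)

// Rectangle is redeclared from the documentation above.
type Rectangle struct {
	LLx, LLy, URx, URy float64
}

// nearlyEqual reports whether all four corner coordinates of a and b
// differ by less than eps.
func nearlyEqual(a, b *Rectangle, eps float64) bool {
	return math.Abs(a.LLx-b.LLx) < eps &&
		math.Abs(a.LLy-b.LLy) < eps &&
		math.Abs(a.URx-b.URx) < eps &&
		math.Abs(a.URy-b.URy) < eps
}

func main() {
	a := &Rectangle{LLx: 0, LLy: 0, URx: 100, URy: 200}
	b := &Rectangle{LLx: 0.001, LLy: 0, URx: 100, URy: 199.999}
	fmt.Println(nearlyEqual(a, b, 0.01))   // true
	fmt.Println(nearlyEqual(a, b, 0.0001)) // false
}
```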
type Reference ¶
Reference represents a reference to an indirect object in a PDF file. TODO(voss): use the struct directly, rather than pointers to the struct? TODO(voss): use a fixed-size type for Number?
type Resources ¶
type Resources struct {
	ExtGState  Dict  `pdf:"optional"` // maps resource names to graphics state parameter dictionaries
	ColorSpace Dict  `pdf:"optional"` // maps each resource name to either the name of a device-dependent colour space or an array describing a colour space
	Pattern    Dict  `pdf:"optional"` // maps resource names to pattern objects
	Shading    Dict  `pdf:"optional"` // maps resource names to shading dictionaries
	XObject    Dict  `pdf:"optional"` // maps resource names to external objects
	Font       Dict  `pdf:"optional"` // maps resource names to font dictionaries
	ProcSet    Array `pdf:"optional"` // predefined procedure set names
	Properties Dict  `pdf:"optional"` // maps resource names to property list dictionaries for marked content
}
Resources describes a PDF Resource Dictionary. See section 7.8.3 of PDF 32000-1:2008 for details. TODO(voss): use []*font.Font for the .Font field?
type Stream ¶
Stream represents a stream object in a PDF file.
func (*Stream) Decode ¶
Decode returns a reader for the decoded stream data.
TODO(voss): allow to decode only the first few filters?
type String ¶
type String []byte
String represents a raw string in a PDF file. The character set encoding, if any, is determined by the context.
func ParseString ¶
ParseString parses a string from the given buffer. The buffer must include the surrounding parentheses or angle brackets.
func TextString ¶
TextString creates a String object using the "text string" encoding, i.e. using either UTF-16BE encoding (with a BOM) or PDFDocEncoding.
func (String) AsDate ¶
AsDate converts a PDF date string to a time.Time object. If the string does not have the correct format, an error is returned.
func (String) AsTextString ¶
AsTextString interprets x as a PDF "text string" and returns the corresponding UTF-8 encoded string.
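The UTF-16BE variant of the "text string" encoding can be sketched with the standard unicode/utf16 package. The helpers encodeUTF16BE and decodeUTF16BE are hypothetical, and the sketch covers only the UTF-16BE branch; the alternative single-byte encoding is not shown.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// encodeUTF16BE returns s as UTF-16BE bytes, preceded by the byte order
// mark 0xFE 0xFF.
func encodeUTF16BE(s string) []byte {
	units := utf16.Encode([]rune(s))
	out := make([]byte, 0, 2+2*len(units))
	out = append(out, 0xFE, 0xFF)
	for _, u := range units {
		out = append(out, byte(u>>8), byte(u))
	}
	return out
}

// decodeUTF16BE reverses encodeUTF16BE. The second return value is false
// if b does not start with the byte order mark or has odd length.
func decodeUTF16BE(b []byte) (string, bool) {
	if len(b) < 2 || len(b)%2 != 0 || b[0] != 0xFE || b[1] != 0xFF {
		return "", false
	}
	units := make([]uint16, 0, len(b)/2-1)
	for i := 2; i < len(b); i += 2 {
		units = append(units, uint16(b[i])<<8|uint16(b[i+1]))
	}
	return string(utf16.Decode(units)), true
}

func main() {
	enc := encodeUTF16BE("π")
	fmt.Printf("% X\n", enc) // FE FF 03 C0

	dec, ok := decodeUTF16BE(enc)
	fmt.Println(dec, ok)
}
```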
type Version ¶
type Version int
Version represents the version of the PDF standard used in a file.
const (
	V1_0 Version
	V1_1
	V1_2
	V1_3
	V1_4
	V1_5
	V1_6
	V1_7
)
PDF versions supported by this library.
func ParseVersion ¶
ParseVersion parses a PDF version string.
type VersionError ¶
VersionError is returned when trying to use a feature in a PDF file which is not supported by the PDF version used. Use Writer.CheckVersion to create VersionError objects.
func (*VersionError) Error ¶
func (err *VersionError) Error() string
type Writer ¶
type Writer struct {
	// Version is the PDF version used in this file. This field is
	// read-only. Use the opt argument of NewWriter to set the PDF version for
	// a new file.
	Version Version

	// The Document Catalog is documented in section 7.7.2 of PDF 32000-1:2008.
	Catalog *Catalog

	Resources map[interface{}]Resource
	// contains filtered or unexported fields
}
Writer represents a PDF file open for writing. Use the functions Create or NewWriter to create a new Writer.
func Create ¶
Create creates the named PDF file and opens it for output. If a previous file with the same name exists, it is overwritten. After writing is complete, Writer.Close must be called to write the trailer and to close the underlying file.
If non-default settings are required, NewWriter can be used to set options.
func NewWriter ¶
func NewWriter(w io.Writer, opt *WriterOptions) (*Writer, error)
NewWriter prepares a PDF file for writing.
The Writer.Close method must be called after the file contents have been written, to add the trailer and the cross reference table to the PDF file. It is the caller's responsibility to close the writer w after the pdf.Writer has been closed.
func (*Writer) CheckVersion ¶
CheckVersion checks whether the PDF file being written has version minVersion or later. If the version is new enough, nil is returned. Otherwise a VersionError for the given operation is returned.
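The version check can be pictured as a simple comparison of ordered constants. The sketch below redeclares Version locally, assuming that the constants are consecutive values starting at V1_0; the type versionError, its fields and the function checkVersion are hypothetical stand-ins for VersionError and Writer.CheckVersion.

```go
package main

import "fmt"

// Version is redeclared locally. Assumption: the constants are
// consecutive integers, so that later versions compare greater.
type Version int

const (
	V1_0 Version = iota
	V1_1
	V1_2
	V1_3
	V1_4
	V1_5
	V1_6
	V1_7
)

// versionError is a hypothetical stand-in for VersionError.
type versionError struct {
	Operation string
	Earliest  Version
}

func (e *versionError) Error() string {
	return fmt.Sprintf("%s requires PDF version 1.%d or newer",
		e.Operation, int(e.Earliest))
}

// checkVersion returns nil if fileVersion is at least minVersion, and
// a versionError for the given operation otherwise.
func checkVersion(fileVersion Version, operation string, minVersion Version) error {
	if fileVersion >= minVersion {
		return nil
	}
	return &versionError{Operation: operation, Earliest: minVersion}
}

func main() {
	fmt.Println(checkVersion(V1_7, "object streams", V1_5)) // <nil>
	fmt.Println(checkVersion(V1_4, "object streams", V1_5))
}
```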
func (*Writer) Close ¶
Close closes the Writer, flushing any unwritten data to the underlying io.Writer.
func (*Writer) NewPlaceholder ¶
func (pdf *Writer) NewPlaceholder(size int) *Placeholder
NewPlaceholder creates a new placeholder for a value which is not yet known. The argument size must be an upper bound to the length of the replacement text. Once the value becomes known, it can be filled in using the Placeholder.Set method.
func (*Writer) OnClose ¶
OnClose registers a callback function which is called before the writer is closed. Callbacks are executed in reverse order of registration, i.e. the last callback registered is the first one to run.
TODO(voss): remove?
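The reverse ordering of OnClose callbacks can be sketched with a slice which is traversed backwards. The type writer below is a hypothetical stand-in which only models the callback list; it does not write any PDF data, and stopping at the first callback error is an assumption of the sketch.

```go
package main

import "fmt"

// writer is a hypothetical stand-in for pdf.Writer, reduced to the
// callback mechanism.
type writer struct {
	onClose []func(*writer) error
	log     []string
}

// OnClose registers a callback to run when the writer is closed.
func (w *writer) OnClose(callback func(*writer) error) {
	w.onClose = append(w.onClose, callback)
}

// Close runs the callbacks, last registered first. In this sketch, the
// first callback error aborts the remaining callbacks.
func (w *writer) Close() error {
	for i := len(w.onClose) - 1; i >= 0; i-- {
		if err := w.onClose[i](w); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	w := &writer{}
	for _, name := range []string{"fonts", "images", "pages"} {
		name := name // capture the loop variable for the closure
		w.OnClose(func(w *writer) error {
			w.log = append(w.log, name)
			return nil
		})
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	fmt.Println(w.log) // [pages images fonts]
}
```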
func (*Writer) OpenStream ¶
func (pdf *Writer) OpenStream(dict Dict, ref *Reference, filters ...*FilterInfo) (io.WriteCloser, *Reference, error)
OpenStream adds a PDF Stream to the file and returns an io.WriteCloser which can be used to add the stream's data. No other objects can be added to the file until the stream is closed.
func (*Writer) Write ¶
Write writes an object to the PDF file, as an indirect object. The returned reference can be used to refer to this object from other parts of the file.
func (*Writer) WriteCompressed ¶
WriteCompressed writes a number of objects to the file as a compressed object stream. Object streams are only available for PDF version 1.5 and newer; in case the file version is too low, the objects are written directly into the PDF file, without compression.
Directories ¶
Path | Synopsis |
---|---|
color | Package color implements different PDF color spaces. |
demo | |
cff-glyphs | Read a CFF font and display a magnified version of each glyph in a PDF file. |
font | Package font implements the PDF font handling. |
builtin | Package builtin implements support for the 14 built-in PDF fonts. |
cid | Package cid provides support for embedding CID fonts into PDF documents. |
simple | Package simple provides support for embedding simple fonts into PDF documents. |
type3 | Package type3 provides support for embedding type 3 fonts into PDF documents. |
graphics | Package graphics allows to draw on a PDF page. |
image | Package image provides functions for embedding images in PDF files. |
internal | |
lzw | Package lzw implements the Lempel-Ziv-Welch compressed data format. |
pages | Package pages implements PDF page trees. |