pdf

module
v0.0.4 Latest
Published: Jul 28, 2023 License: MIT

README

Golang PDF toolbox

Why yet another PDF processing library?

There are already numerous good PDF libraries for Go, and this one deliberately takes inspiration from them. However, it is based on a slightly different approach: instead of working with a PDF as a tree of dynamic objects, it starts by modeling the whole spec (or at least a good portion of it) with static types: see the package model.
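To make the contrast concrete, here is a minimal sketch of the two approaches. Every name in it (Object, Dict, Rectangle, Page, mediaBoxWidthDynamic) is hypothetical and greatly simplified; none of it is taken from the package model itself.

```go
package main

import "fmt"

// Dynamic approach: every PDF value is an untyped object,
// and dictionaries are maps from names to such objects.
// (Hypothetical types, for illustration only.)
type Object interface{}
type Dict map[string]Object

// Static approach: a simplified page type with typed fields.
// (Hypothetical; not the library's actual model.)
type Rectangle struct{ Llx, Lly, Urx, Ury float64 }

type Page struct {
	MediaBox Rectangle
	Rotate   int
}

// mediaBoxWidthDynamic has to check key presence, array length
// and element types at run time before it can compute anything.
func mediaBoxWidthDynamic(page Dict) (float64, error) {
	arr, ok := page["MediaBox"].([]Object)
	if !ok || len(arr) != 4 {
		return 0, fmt.Errorf("invalid or missing MediaBox")
	}
	llx, ok1 := arr[0].(float64)
	urx, ok2 := arr[2].(float64)
	if !ok1 || !ok2 {
		return 0, fmt.Errorf("MediaBox entries must be numbers")
	}
	return urx - llx, nil
}

// Width needs no run-time checks: the field types are
// guaranteed by the compiler.
func (p Page) Width() float64 { return p.MediaBox.Urx - p.MediaBox.Llx }

func main() {
	dyn := Dict{"MediaBox": []Object{0.0, 0.0, 595.0, 842.0}}
	w, err := mediaBoxWidthDynamic(dyn)
	fmt.Println(w, err)

	static := Page{MediaBox: Rectangle{0, 0, 595, 842}}
	fmt.Println(static.Width())
}
```

With static types, the validation happens once, when the file is read, rather than at every access; the price is that the types have to anticipate the structures allowed by the spec.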

Overview

The package model is the cornerstone of this library. The other packages may be divided into two parts:

Scope

The long-term goal is to support the PDF spec completely, but more importantly to expose the different layers (such as the parser or the content stream operators) so that they can be reused by other libraries. As such, the first target of this library is higher-level libraries (such as pdfcpu, gofpdf, oksvg, etc.).

Code example

A standard workflow to modify an existing PDF would look like this:

// load the existing file in memory
fi, _, err := reader.ParsePDFFile(filePath, reader.Options{})
// error handling ...

// process the document model as you wish

err = fi.WriteFile(output, nil)
// error handling ...

See decompress and api for more examples.

Directories

Path Synopsis
cmd
decompress
    This script decodes the streams of a PDF file.
    This package defines the commands used in PDF content stream objects.
    This package provides tooling for exploiting the fonts defined (and embedded) in a PDF file and (TODO:) to add new ones.
cmaps
    Implements a CMap parser (both for ToUnicode and CID CMaps).
glyphsnames
    Copied from https://git.maze.io/go/unipdf/src/branch/master/internal/textencoding
psinterpreter
    Package psinterpreter implements a PostScript interpreter required to parse .CFF files, and Type1 and Type2 Charstrings.
simpleencodings
    Simple encodings map a subset of the Unicode characters (at most 256) to a set of single bytes.
standardcmaps
    Adobe predefined ToUnicode CMaps.
standardfonts/generate
    Tool to generate the metrics for the standard Adobe Type1 fonts.
type1
    Package type1 implements a parser for Adobe Type1 fonts, defined by .afm files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5004.AFM_Spec.pdf) and .pdf files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/T1_SPEC.pdf)
type1C
    Package type1c provides a parser for the CFF font format defined at https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf.
formfill
    Package formfill provides support for filling forms found in PDF files (aka AcroForm), reading form input either from an FDF file or directly from memory.
model
    Implements the in-memory structure of a PDF document, using static types.
reader
    Package reader leverages a PDF file reader to read a file, analyze its structure and build a high-level, in-memory representation as a `model.Document`.
file
    Package file builds upon a parser to read an existing PDF file, producing a tree of PDF objects.
parser
    Implements a PDF object parser, mapping a list of tokens (see the tokenizer package) into a tree-like structure.
parser/filters
    Package filters provides logic to handle binary data encoded with PDF filters, such as inline data images.
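
The simpleencodings entry above describes a mapping between at most 256 characters and single bytes. A minimal sketch of that idea follows; the Encoding type, its RuneToByte method and the sample assignments are hypothetical, and are not the package's actual API or a real encoding table.

```go
package main

import "fmt"

// Encoding is a hypothetical simple encoding: the index is the
// byte value, the element is the rune it stands for (0 means
// the byte is unassigned).
type Encoding [256]rune

// RuneToByte builds the reverse table, from rune to byte,
// skipping unassigned entries.
func (e Encoding) RuneToByte() map[rune]byte {
	out := make(map[rune]byte)
	for b, r := range e {
		if r != 0 {
			out[r] = byte(b)
		}
	}
	return out
}

func main() {
	var enc Encoding
	// Illustrative assignments only, not a real encoding table.
	enc['A'] = 'A'
	enc[0x80] = '€'

	rev := enc.RuneToByte()
	fmt.Println(rev['A'], rev['€'])
}
```

Because the table has exactly 256 slots, any character outside the chosen subset simply has no byte and is absent from the reverse map.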
