package fonts

Version: v0.0.14 (latest) | Published: May 28, 2024 | License: MIT | Imports: 10 | Imported by: 1

Repository

github.com/benoitkugler/pdf

Documentation ¶

Overview ¶

This package provides tooling for exploiting the fonts defined (and embedded) in a PDF file and (TODO) for adding new ones.

PDF supports 4 kinds of fonts: the Simple ones (Type1, TrueType and Type3) and the Composite one (Type0). It divides the text representation into 3 different objects:

1- Glyph (selector): it is either a name (for Simple fonts) or an integer called CID (for Composite fonts)
2- Chars (character code): it is a slice of bytes (1 byte for Simple fonts, 1 to 4 bytes for Composite fonts)
3- Unicode (point): the Unicode point of a character, coded in Go as a rune.

The Glyphs are mapped to Chars (which are the bytes written in the PDF content streams) by an Encoding entry (or by the 'builtin' encoding of a font). Going from Chars to Glyphs is well-defined, but in general there is no clear mapping from Unicode to Glyph (or Chars). Thus, to write a Unicode string (such as the UTF-8 strings that are the default in Go), a writer needs to build a mapping between Unicode and Glyph. This is possible (and automatic) for many fonts thanks to predefined encodings, but some custom fonts may require user input.
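
The three layers can be illustrated with a minimal, self-contained sketch for a Simple font. It does not use this package: the two tables and the `encodeText` helper are hypothetical, and the glyph names and codes are only a tiny hand-written sample. It shows a writer going from Unicode to Glyph name, then from Glyph name to the 1-byte Char written in the content stream, and reporting the runes for which no mapping could be built.

```go
package main

import "fmt"

// glyphNames is a hypothetical Unicode -> glyph name table.
// A real writer would derive it from a predefined glyph list.
var glyphNames = map[rune]string{
	'A':      "A",
	'\u00e9': "eacute", // é
	'\u20ac': "Euro",   // €
}

// encoding is a hypothetical Simple font encoding:
// glyph name -> 1-byte character code.
var encoding = map[string]byte{
	"A":      0x41,
	"eacute": 0xE9,
	"Euro":   0x80,
}

// encodeText converts a UTF-8 string to character codes.
// Runes that cannot be mapped to a glyph of the font are returned
// in missing instead of being silently dropped.
func encodeText(s string) (codes []byte, missing []rune) {
	for _, r := range s {
		name, ok := glyphNames[r]
		if !ok {
			missing = append(missing, r)
			continue
		}
		code, ok := encoding[name]
		if !ok {
			missing = append(missing, r)
			continue
		}
		codes = append(codes, code)
	}
	return codes, missing
}

func main() {
	// "Aé€ß": the last rune has no entry in the sample tables.
	codes, missing := encodeText("A\u00e9\u20ac\u00df")
	fmt.Printf("% X\n", codes)   // 41 E9 80
	fmt.Println(string(missing)) // ß
}
```

The `missing` slice is where the "user input" case mentioned above would surface: the caller has to decide how to handle runes the font cannot represent.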

Index ¶

func ResolveSimpleEncoding(font model.FontSimple) simpleencodings.Encoding
type BuiltFont
- func BuildFont(f *model.FontDict) (BuiltFont, error)
type Fl
type Font
type TextSpaced

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ResolveSimpleEncoding ¶

func ResolveSimpleEncoding(font model.FontSimple) simpleencodings.Encoding

We follow here the logic from poppler, which itself is based on the PDF spec. Encodings start with a base encoding, which can come from (in order of priority):

1- FontDict.Encoding or FontDict.Encoding.BaseEncoding
   - MacRoman / MacExpert / WinAnsi / Standard
2- embedded or external font file
3- default:
   - builtin --> builtin encoding
   - TrueType --> WinAnsiEncoding
   - others --> StandardEncoding

and then add a list of differences (if any) from FontDict.Encoding.Differences.
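
The priority order and the final Differences step can be sketched as follows. This is not the package's implementation: `Encoding`, `difference` and `resolve` are hypothetical stand-ins, and the sample table holds only two entries. It only illustrates the "pick the first available base, then overlay the differences" logic.

```go
package main

import "fmt"

// Encoding maps each of the 256 character codes to a glyph name
// ("" when the code is unused). Hypothetical stand-in type.
type Encoding [256]string

// difference is one override taken from a Differences array,
// assumed here to be already flattened to (code, name) pairs.
type difference struct {
	Code byte
	Name string
}

// resolve picks the base encoding in order of priority:
// the one named by the font dictionary, then the one found in the
// font file, then the fallback for the font kind. It then applies
// the differences on top of a copy of that base.
func resolve(fromDict, fromFontFile *Encoding, fallback Encoding, diffs []difference) Encoding {
	var out Encoding
	switch {
	case fromDict != nil:
		out = *fromDict
	case fromFontFile != nil:
		out = *fromFontFile
	default:
		out = fallback
	}
	for _, d := range diffs {
		out[d.Code] = d.Name
	}
	return out
}

func main() {
	// Heavily truncated sample table, for illustration only.
	var winAnsi Encoding
	winAnsi['A'] = "A"
	winAnsi[0x80] = "Euro"

	// No encoding in the dictionary or font file: use the fallback,
	// then override code 0x80.
	enc := resolve(nil, nil, winAnsi, []difference{{Code: 0x80, Name: "bullet"}})
	fmt.Println(enc['A'], enc[0x80]) // A bullet
}
```

Because `Encoding` is an array, the base is copied before the differences are applied, so shared predefined tables are never mutated.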

Types ¶

type BuiltFont ¶

type BuiltFont struct {
	Font
	Meta *model.FontDict
}

BuiltFont associates the built font with its original data.

func BuildFont ¶

func BuildFont(f *model.FontDict) (BuiltFont, error)

BuildFont compiles an existing FontDictionary, as found in a PDF, into useful font metrics. When needed, the font's builtin encoding is parsed and used.

type Fl ¶

type Fl = model.Fl

type Font ¶

type Font interface {
	// GetWidth returns the size, in points, needed to display the character `c`
	// using the font size `size`.
	// Note that this method can't handle kerning.
	GetWidth(c rune, size Fl) Fl

	// Encode transforms a slice of Unicode points into a
	// slice of bytes that conforms to what the font expects.
	// See `EncodeKern` for kerning support.
	Encode(cs []rune) []byte

	// Desc returns the font descriptor.
	Desc() model.FontDescriptor
}

Font provides metrics related to a font, and a way to encode UTF-8 strings into a compatible byte string. Since fetching such information has a cost, a font should be built once and reused as often as possible.
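
As a rough illustration of the "build once, reuse" advice, the sketch below measures several lines of text with a single font value. It is self-contained and does not import this package: `Fl` is declared locally as a stand-in for the package's alias, `widther` mirrors only the `GetWidth` method of `Font`, and `fakeFont` with its widths is entirely hypothetical. In real code the font value would come from `BuildFont`.

```go
package main

import "fmt"

// Fl is a local stand-in for the package's Fl alias (assumption).
type Fl = float64

// widther mirrors the GetWidth method of the Font interface.
type widther interface {
	GetWidth(c rune, size Fl) Fl
}

// fakeFont is a hypothetical font whose advance widths are stored
// as a fraction of the font size.
type fakeFont struct {
	em map[rune]Fl
}

func (f fakeFont) GetWidth(c rune, size Fl) Fl { return f.em[c] * size }

// textWidth sums the width of each rune. Like GetWidth itself,
// it ignores kerning.
func textWidth(f widther, s string, size Fl) Fl {
	var total Fl
	for _, r := range s {
		total += f.GetWidth(r, size)
	}
	return total
}

func main() {
	// The font is created once and reused for every line.
	font := fakeFont{em: map[rune]Fl{'a': 0.5, 'b': 0.25}}
	for _, line := range []string{"aab", "ba"} {
		fmt.Println(line, textWidth(font, line, 12))
	}
}
```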

type TextSpaced ¶

type TextSpaced struct {
	// unescaped content; a font is required to interpret the codes
	CharCodes            []byte
	SpaceSubtractedAfter int // value in thousandths of a text space unit
}

TextSpaced subtracts space after showing the text. See 9.4.3 - Text-Showing Operators.
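
The effect of the subtracted space can be sketched with a small computation. The example is self-contained: `textSpaced` is a local copy of the struct above, while `advance` and the glyph widths are hypothetical. It assumes 1-byte character codes, widths given in thousandths of a text space unit, zero character and word spacing, and 100% horizontal scaling, which is a simplification of the full displacement formula in the PDF spec.

```go
package main

import "fmt"

// textSpaced is a local copy of the TextSpaced struct.
type textSpaced struct {
	CharCodes            []byte
	SpaceSubtractedAfter int // thousandths of a text space unit
}

// advance returns the total horizontal displacement after showing
// all items at the given font size. Glyph widths and adjustments
// are accumulated in thousandths, then scaled by the font size.
func advance(items []textSpaced, widths map[byte]int, fontSize float64) float64 {
	total := 0
	for _, it := range items {
		for _, c := range it.CharCodes {
			total += widths[c]
		}
		// A positive value moves the next glyph closer.
		total -= it.SpaceSubtractedAfter
	}
	return float64(total) * fontSize / 1000
}

func main() {
	// Hypothetical widths for two 1-byte codes.
	widths := map[byte]int{'A': 750, 'V': 750}

	// Roughly what `[(A) 100 (V)] TJ` describes.
	items := []textSpaced{
		{CharCodes: []byte("A"), SpaceSubtractedAfter: 100},
		{CharCodes: []byte("V")},
	}
	fmt.Println(advance(items, widths, 10)) // 14
}
```

Without the adjustment the same text would advance by 15 units at size 10, so the value 100 tightens the pair by one unit.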

Directories ¶

Path	Synopsis
cmaps	Implements a CMap parser (both for ToUnicode and CID CMaps)
glyphsnames	copied from https://git.maze.io/go/unipdf/src/branch/master/internal/textencoding
psinterpreter	Package psinterpreter implements a Postscript interpreter required to parse .CFF files, and Type1 and Type2 Charstrings.
simpleencodings	Simple encodings map a subset of the Unicode characters (at most 256) to a set of single bytes.
standardcmaps	Adobe predefined ToUnicode cmaps
standardcmaps/generate
standardfonts
standardfonts/generate	Tool to generate the metrics for the standard Adobe Type1 fonts.
type1	Package type1 implements a parser for Adobe Type1 fonts, defined by .afm files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5004.AFM_Spec.pdf) and .pdf files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/T1_SPEC.pdf)
type1C	Package type1c provides a parser for the CFF font format defined at https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf.