fonts

package
v0.0.11 Latest
Published: May 7, 2024 License: MIT Imports: 10 Imported by: 1

Documentation

Overview

This package provides tooling to work with the fonts defined (and embedded) in a PDF file and (TODO) to add new ones.

PDF supports 4 kinds of fonts, the Simple ones (Type1, TrueType and Type3) and the Composite one (Type0), and divides the text representation into 3 different objects:

1- Glyph (selector): either a name (for Simple fonts) or an integer called CID (for Composite fonts)
2- Chars (character code): a slice of bytes (1 byte for Simple fonts, 1 to 4 bytes for Composite fonts)
3- Unicode (point): the Unicode code point of a character, represented in Go as a rune.

The Glyphs are mapped to Chars (which are the bytes written in the PDF content streams) by an Encoding entry (and also by the 'builtin' encoding of a font). Going from Chars to Glyphs is well-defined, but in general, there is no clear mapping from Unicode to Glyphs (or Chars). Thus, to be able to write a Unicode string (such as UTF-8 strings, which are the default in Go), a writer needs to build a mapping between Unicode and Glyphs. This is possible (and automatic) for many fonts (thanks to predefined encodings), but some custom fonts may require user input.
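The chain described above (Chars to Glyph, then Glyph to Unicode) can be inverted to obtain a Unicode to Chars mapping for a Simple font. The following is only a minimal sketch of that idea: the names simpleEncoding, buildRuneToByte, encode and demoMapping are hypothetical and do not belong to this package, and the glyph name table is a tiny made-up excerpt.

```go
package main

import "fmt"

// simpleEncoding maps each 1-byte character code to a glyph name
// ("" means the code is unused). Hypothetical type, for illustration only.
type simpleEncoding [256]string

// buildRuneToByte inverts the Chars -> Glyph -> Unicode chain: for each
// code, it looks up the glyph name, then the rune known for that name.
func buildRuneToByte(enc simpleEncoding, glyphToRune map[string]rune) map[rune]byte {
	out := make(map[rune]byte)
	for code, name := range enc {
		if name == "" {
			continue
		}
		r, ok := glyphToRune[name]
		if !ok {
			continue // custom glyph name: this is where user input would be needed
		}
		if _, seen := out[r]; !seen { // keep the lowest code for a given rune
			out[r] = byte(code)
		}
	}
	return out
}

// encode converts a UTF-8 string to character codes and reports the runes
// that have no code in the mapping.
func encode(s string, m map[rune]byte) (codes []byte, missing []rune) {
	for _, r := range s {
		if b, ok := m[r]; ok {
			codes = append(codes, b)
		} else {
			missing = append(missing, r)
		}
	}
	return codes, missing
}

// demoMapping builds a mapping from a small, made-up encoding.
func demoMapping() map[rune]byte {
	var enc simpleEncoding
	enc[0x41] = "A"
	enc[0x42] = "B"
	enc[0xE9] = "eacute"
	enc[0xF0] = "g123" // glyph name with no known Unicode value
	names := map[string]rune{"A": 'A', "B": 'B', "eacute": 'é'}
	return buildRuneToByte(enc, names)
}

func main() {
	codes, missing := encode("ABé€", demoMapping())
	fmt.Printf("codes: % x, unmapped runes: %q\n", codes, missing)
}
```

The glyph "g123" illustrates the case mentioned above: without a known Unicode value for a glyph name, no automatic mapping can be built for it.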

Constants

This section is empty.

Variables

This section is empty.

Functions

func ResolveSimpleEncoding

func ResolveSimpleEncoding(font model.FontSimple) simpleencodings.Encoding

We follow here the logic from poppler, which itself is based on the PDF spec. Encodings start with a base encoding, which can come from (in order of priority):

  1. FontDict.Encoding or FontDict.Encoding.BaseEncoding - MacRoman / MacExpert / WinAnsi / Standard
  2. embedded or external font file
  3. default:
     - builtin --> builtin encoding
     - TrueType --> WinAnsiEncoding
     - others --> StandardEncoding

and then add a list of differences (if any) from FontDict.Encoding.Differences.
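The priority order above, followed by the differences, can be sketched as below. This is not the implementation of ResolveSimpleEncoding: the types encoding, difference and sources and the function resolve are hypothetical stand-ins, and the winAnsi and standard tables only contain a couple of entries instead of the full predefined encodings.

```go
package main

import "fmt"

// encoding maps a 1-byte character code to a glyph name.
// Hypothetical stand-in, not the type used by the package.
type encoding [256]string

// difference overrides the glyph name of one character code.
type difference struct {
	Code byte
	Name string
}

// sources gathers the candidate base encodings; nil means "not available".
type sources struct {
	FromDict     *encoding // 1. Encoding or Encoding.BaseEncoding
	FromFontFile *encoding // 2. embedded or external font file
	Builtin      *encoding // 3. default for fonts with a builtin encoding
	IsTrueType   bool      // 3. default for TrueType fonts
	Differences  []difference
}

// Tiny excerpts of the predefined encodings, for the demo only.
var winAnsi, standard encoding

func init() {
	winAnsi[0x27] = "quotesingle"
	winAnsi[0x80] = "Euro"
	standard[0x27] = "quoteright"
}

// resolve picks the base encoding by priority, then applies the differences.
func resolve(s sources) encoding {
	var out encoding // arrays are copied on assignment, so bases stay untouched
	switch {
	case s.FromDict != nil:
		out = *s.FromDict
	case s.FromFontFile != nil:
		out = *s.FromFontFile
	case s.Builtin != nil:
		out = *s.Builtin
	case s.IsTrueType:
		out = winAnsi
	default:
		out = standard
	}
	for _, d := range s.Differences {
		out[d.Code] = d.Name
	}
	return out
}

func main() {
	enc := resolve(sources{
		IsTrueType:  true,
		Differences: []difference{{Code: 0x27, Name: "quoteright"}},
	})
	fmt.Println(enc[0x27], enc[0x80])
}
```

Using a fixed-size array for the encoding makes the base tables copy-on-assign, so applying differences for one font cannot leak into another.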

Types

type BuiltFont

type BuiltFont struct {
	Font
	Meta *model.FontDict
}

BuiltFont associates the built font with its original data.

func BuildFont

func BuildFont(f *model.FontDict) (BuiltFont, error)

BuildFont compiles an existing FontDictionary, as found in a PDF, into useful font metrics. When needed, the font builtin encoding is parsed and used.

type Fl

type Fl = model.Fl

type Font

type Font interface {
	// GetWidth returns the size, in points, needed to display the character `c`
	// using the font size `size`.
	// Note that this method can't handle kerning.
	GetWidth(c rune, size Fl) Fl

	// Encode transforms a slice of Unicode points to a
	// slice of bytes, conforming to the font expectation.
	// See `EncodeKern` for kerning support.
	Encode(cs []rune) []byte

	// Desc returns the font descriptor
	Desc() model.FontDescriptor
}

Font provides metrics related to a font, and a way to encode UTF-8 strings to a compatible byte string. Since fetching such information has a cost, a font should be built once and reused as often as possible.
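As an illustration of how the GetWidth method may be used, here is a minimal sketch that measures a whole string by summing per-character widths. It does not import this package: the widther interface only mirrors the GetWidth signature shown above, Fl is assumed to be float64, and fixedFont is a made-up font in which every glyph is half the font size wide.

```go
package main

import "fmt"

// Fl mirrors the package alias; float64 is an assumption made for this sketch.
type Fl = float64

// widther is the subset of the Font interface used below.
type widther interface {
	GetWidth(c rune, size Fl) Fl
}

// fixedFont is a hypothetical font where every glyph is 500/1000 em wide.
type fixedFont struct{}

func (fixedFont) GetWidth(c rune, size Fl) Fl { return 0.5 * size }

// textWidth sums the width of each rune of s at the given font size.
// Kerning is ignored, since GetWidth can't handle it.
func textWidth(f widther, s string, size Fl) Fl {
	var w Fl
	for _, r := range s {
		w += f.GetWidth(r, size)
	}
	return w
}

func main() {
	// The font value is created once and reused for every measurement.
	var f widther = fixedFont{}
	for _, line := range []string{"Hello", "PDF fonts"} {
		fmt.Printf("%q: %.1f pt\n", line, textWidth(f, line, 10))
	}
}
```

With a real font, the value passed to textWidth would be the result of a single BuildFont call, kept around for the lifetime of the document.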

type TextSpaced

type TextSpaced struct {
	// unescaped content; a font is required to interpret the codes
	CharCodes            []byte
	SpaceSubtractedAfter int // value in thousandths of a text space unit
}

TextSpaced subtracts space after showing the text. See 9.4.3 - Text-Showing Operators.
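To make the unit of SpaceSubtractedAfter concrete, the sketch below computes the horizontal advance produced by a sequence of such chunks. It is only an illustration under simplifying assumptions: a Simple font (one byte per character code), no character spacing, no word spacing and 100% horizontal scaling. The textSpaced type is a local copy of the struct above, and the advance function and the widths table (glyph widths in thousandths of a text space unit) are hypothetical.

```go
package main

import "fmt"

// textSpaced is a local copy of the TextSpaced struct, for this sketch only.
type textSpaced struct {
	CharCodes            []byte
	SpaceSubtractedAfter int // thousandths of a text space unit
}

// advance returns the horizontal displacement after showing all chunks at
// the given font size. widths[code] is the glyph width, in thousandths of a
// text space unit, for each 1-byte character code.
func advance(chunks []textSpaced, widths [256]int, fontSize float64) float64 {
	total := 0 // accumulated in thousandths to stay in integer arithmetic
	for _, c := range chunks {
		for _, code := range c.CharCodes {
			total += widths[code]
		}
		total -= c.SpaceSubtractedAfter // subtracted, hence the field name
	}
	return float64(total) / 1000 * fontSize
}

func main() {
	var widths [256]int
	for i := range widths {
		widths[i] = 500 // made-up uniform width
	}
	chunks := []textSpaced{
		{CharCodes: []byte("AB"), SpaceSubtractedAfter: 120},
		{CharCodes: []byte("C")},
	}
	fmt.Printf("%.2f\n", advance(chunks, widths, 10))
}
```

A positive SpaceSubtractedAfter tightens the text, while a negative value adds extra space after the chunk.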

Directories

Path Synopsis
Implements a CMap parser (both for ToUnicode and CID CMaps)
copied from https://git.maze.io/go/unipdf/src/branch/master/internal/textencoding
Package psinterpreter implements a Postscript interpreter required to parse .CFF files, and Type1 and Type2 Charstrings.
Simple encodings map a subset of the Unicode characters (at most 256) to a set of single bytes.
Adobe predefined ToUnicode cmaps
generate
Tool to generate the metrics for the standard Adobe Type1 fonts.
Package type1 implements a parser for Adobe Type1 fonts, defined by .afm files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5004.AFM_Spec.pdf) and .pdf files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/T1_SPEC.pdf)
Package type1c provides a parser for the CFF font format defined at https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf.
