Documentation ¶
Overview ¶
This package provides tooling for using the fonts defined (and embedded) in a PDF file and (TODO) for adding new ones.
PDF supports 4 kinds of fonts: the simple fonts (Type1, TrueType and Type3) and the composite fonts (Type0). It divides the text representation into 3 different objects:
1- Glyph (selector): either a name (for simple fonts) or an integer called CID (for composite fonts).
2- Chars (character code): a slice of bytes (1 byte for simple fonts, 1 to 4 bytes for composite fonts).
3- Unicode (code point): the Unicode code point of a character, represented in Go as a rune.
The Glyphs are mapped to Chars (which are the bytes written in the content streams of the PDF) by an Encoding entry (and also by the 'builtin' encoding of a font). Going from Chars to Glyphs is well-defined but, in general, there is no clear mapping from Unicode to Glyphs (or Chars). Thus, to be able to write a Unicode string (such as a UTF-8 string, which is the default in Go), a writer needs to build a mapping between Unicode and Glyphs. This is possible (and automatic) for many fonts (thanks to predefined encodings), but some custom fonts may require user input.
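To make the three representations concrete, here is a minimal sketch for a simple font. It inverts a Chars-to-Glyph table into a Unicode-to-Chars mapping with the help of a glyph-name-to-rune table. The names `simpleEncoding`, `buildRuneToByte`, `encode` and `demoEncoding` are hypothetical and are not part of this package; the three-entry encoding is only an illustration.

```go
package main

import "fmt"

// simpleEncoding is a hypothetical stand-in for a simple font encoding:
// the index is the character code (one byte), the value is the glyph name
// ("" when the code is unused).
type simpleEncoding [256]string

// buildRuneToByte inverts the encoding, using a glyph-name-to-rune table,
// to obtain a Unicode -> character code mapping.
func buildRuneToByte(enc simpleEncoding, glyphToRune map[string]rune) map[rune]byte {
	out := make(map[rune]byte)
	for code, name := range enc {
		if name == "" {
			continue
		}
		r, ok := glyphToRune[name]
		if !ok {
			continue // glyph without a known Unicode value
		}
		if _, seen := out[r]; !seen {
			out[r] = byte(code) // keep the lowest code for a given rune
		}
	}
	return out
}

// encode converts a UTF-8 string to character codes and reports the runes
// the font cannot represent.
func encode(s string, m map[rune]byte) (codes []byte, missing []rune) {
	for _, r := range s {
		if b, ok := m[r]; ok {
			codes = append(codes, b)
		} else {
			missing = append(missing, r)
		}
	}
	return codes, missing
}

// demoEncoding returns a tiny, made-up encoding and glyph name table.
func demoEncoding() (simpleEncoding, map[string]rune) {
	var enc simpleEncoding
	enc[0x41] = "A"
	enc[0x42] = "B"
	enc[0xE9] = "eacute"
	return enc, map[string]rune{"A": 'A', "B": 'B', "eacute": 'é'}
}

func main() {
	enc, names := demoEncoding()
	m := buildRuneToByte(enc, names)
	codes, missing := encode("ABé€", m)
	fmt.Printf("codes: % x, missing: %q\n", codes, missing)
}
```

Runes that have no entry in the mapping are returned separately, which is where a custom font may need user input.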
Functions ¶
func ResolveSimpleEncoding ¶
func ResolveSimpleEncoding(font model.FontSimple) simpleencodings.Encoding
We follow here the logic from poppler, which is itself based on the PDF spec. Encodings start with a base encoding, which can come from (in order of priority):
- FontDict.Encoding or FontDict.Encoding.BaseEncoding
  - MacRoman / MacExpert / WinAnsi / Standard
- embedded or external font file
- default:
  - builtin --> builtin encoding
  - TrueType --> WinAnsiEncoding
  - others --> StandardEncoding

and then add a list of differences (if any) from FontDict.Encoding.Differences.
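The priority order described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `encoding`, `fontInfo`, `resolve` and `demoBases` are hypothetical names, and `fontInfo` is a simplified stand-in for a simple font dictionary rather than `model.FontSimple`.

```go
package main

import "fmt"

// encoding maps each single-byte character code to a glyph name.
type encoding [256]string

// fontInfo is a hypothetical, simplified view of a simple font.
type fontInfo struct {
	dictEncoding *encoding       // FontDict.Encoding or its BaseEncoding, nil if absent
	fileEncoding *encoding       // encoding read from the font file, nil if absent
	builtin      *encoding       // builtin encoding of the font, nil if none
	isTrueType   bool            // selects the WinAnsi default
	differences  map[byte]string // FontDict.Encoding.Differences
}

// resolve picks the base encoding in priority order, then applies the differences.
func resolve(f fontInfo, winAnsi, standard *encoding) encoding {
	var base *encoding
	switch {
	case f.dictEncoding != nil:
		base = f.dictEncoding
	case f.fileEncoding != nil:
		base = f.fileEncoding
	case f.builtin != nil:
		base = f.builtin
	case f.isTrueType:
		base = winAnsi
	default:
		base = standard
	}
	out := *base // arrays are values in Go: this copies the base encoding
	for code, name := range f.differences {
		out[code] = name
	}
	return out
}

// demoBases returns two partially filled base encodings (WinAnsi, Standard).
// Only two codes are set; code 0x27 differs between the two encodings.
func demoBases() (winAnsi, standard encoding) {
	winAnsi[0x41], standard[0x41] = "A", "A"
	winAnsi[0x27], standard[0x27] = "quotesingle", "quoteright"
	return winAnsi, standard
}

func main() {
	winAnsi, standard := demoBases()
	f := fontInfo{isTrueType: true, differences: map[byte]string{0x41: "Alpha"}}
	enc := resolve(f, &winAnsi, &standard)
	fmt.Println(enc[0x27], enc[0x41])
}
```

Copying the base encoding before applying the differences keeps the shared predefined tables unmodified, so they can be reused across fonts.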
Types ¶
type Font ¶
type Font interface {
	// GetWidth returns the size, in points, needed to display the character `c`
	// using the font size `size`.
	// Note that this method can't handle kerning.
	GetWidth(c rune, size Fl) Fl

	// Encode transforms a slice of Unicode points to a
	// slice of bytes, conforming to the font's expectations.
	// See `EncodeKern` for kerning support.
	Encode(cs []rune) []byte

	// Desc returns the font descriptor.
	Desc() model.FontDescriptor
}
Font provides metrics related to a font, and a way to encode UTF-8 strings to a compatible byte string. Since fetching this information has a cost, a font should be built once and reused as often as possible.
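As an illustration of how such an interface might be consumed, the sketch below measures a string by summing per-character widths. Several things here are assumptions: `Fl` is taken to be a floating-point type, the local `font` interface copies only `GetWidth` and `Encode` (leaving out `Desc` to stay self-contained), and `monoFont` is a made-up font where every glyph is 600/1000 of the font size wide.

```go
package main

import "fmt"

// Fl is assumed here to be a floating-point type.
type Fl = float64

// font mirrors GetWidth and Encode from the Font interface; Desc is omitted
// so that the example does not depend on the model package.
type font interface {
	GetWidth(c rune, size Fl) Fl
	Encode(cs []rune) []byte
}

// monoFont is a hypothetical fixed-width font.
type monoFont struct{}

// GetWidth returns the same advance for every character.
func (monoFont) GetWidth(c rune, size Fl) Fl { return 0.6 * size }

// Encode maps runes below 256 to one byte each, and anything else to '?'.
func (monoFont) Encode(cs []rune) []byte {
	out := make([]byte, len(cs))
	for i, r := range cs {
		if r < 256 {
			out[i] = byte(r)
		} else {
			out[i] = '?'
		}
	}
	return out
}

// textWidth sums the widths of the characters of s, without kerning.
func textWidth(f font, s string, size Fl) Fl {
	var w Fl
	for _, r := range s {
		w += f.GetWidth(r, size)
	}
	return w
}

func main() {
	var f font = monoFont{} // build the font once, then reuse it
	for _, line := range []string{"abc", "Hello"} {
		fmt.Printf("%q: %.2f pt, % x\n", line, textWidth(f, line, 10), f.Encode([]rune(line)))
	}
}
```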
type TextSpaced ¶
type TextSpaced struct {
	// unescaped content; a font is required to interpret the codes
	CharCodes []byte
	// value in thousandths of a text space unit
	SpaceSubtractedAfter int
}
TextSpaced subtracts space after showing the text. See 9.4.3 - Text-Showing Operators.
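The effect of the adjustment on the text position can be sketched as below. This is a simplified reading of the PDF text-positioning rules, assuming a simple font (one byte per character code), glyph widths given in thousandths of the font size, zero character and word spacing, and 100% horizontal scaling. `textSpaced` copies the two fields of TextSpaced; `advance` and the width function are hypothetical.

```go
package main

import "fmt"

// textSpaced mirrors the two fields of TextSpaced.
type textSpaced struct {
	CharCodes            []byte
	SpaceSubtractedAfter int // thousandths of a text space unit
}

// advance returns the horizontal displacement, in text space units, produced
// by showing ts at the given font size. width returns the glyph width for a
// character code, in thousandths of the font size.
func advance(ts textSpaced, width func(code byte) float64, fontSize float64) float64 {
	var total float64
	for _, c := range ts.CharCodes {
		total += width(c) / 1000 * fontSize
	}
	// The adjustment is subtracted: a positive value moves the next glyph
	// back (to the left in horizontal writing mode).
	total -= float64(ts.SpaceSubtractedAfter) / 1000 * fontSize
	return total
}

func main() {
	fixed := func(byte) float64 { return 500 } // made-up constant width
	ts := textSpaced{CharCodes: []byte("AB"), SpaceSubtractedAfter: 120}
	fmt.Printf("%.2f\n", advance(ts, fixed, 10))
}
```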
Directories ¶
| Path | Synopsis |
|---|---|
| | Implements a CMap parser (both for ToUnicode and CID CMaps) |
| | copied from https://git.maze.io/go/unipdf/src/branch/master/internal/textencoding |
| psinterpreter | Package psinterpreter implements a PostScript interpreter required to parse .CFF files, and Type1 and Type2 Charstrings. |
| | Simple encodings map a subset of the Unicode characters (at most 256) to a set of single bytes. |
| | Adobe predefined ToUnicode cmaps |
| generate | Tool to generate the metrics for the standard Adobe Type1 fonts. |
| type1 | Package type1 implements a parser for Adobe Type1 fonts, defined by .afm files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5004.AFM_Spec.pdf) and .pfb files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/T1_SPEC.pdf) |
| type1c | Package type1c provides a parser for the CFF font format defined at https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf. |