cmap

package

v0.4.0 (latest) Published: Feb 4, 2024 License: GPL-3.0 Imports: 22 Imported by: 0

Repository

github.com/seehuhn/go-pdf

Documentation ¶

Overview ¶

Package cmap implements CMap files for embedding in PDF files.

When composite fonts are used in PDF files, glyphs are selected using a two-step process: first, character codes are mapped to character identifiers (CIDs), and then CIDs are mapped to glyph identifiers (GIDs). A CMap file describes the first step of this process, i.e. the mapping from character codes to CIDs.
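The two-step selection described above can be sketched as two table lookups. This is only an illustration under stated assumptions: the types `CID` and `GID`, the function `lookupGlyph`, and the map-based tables are hypothetical stand-ins and are not part of this package.

```go
package main

import "fmt"

// CID and GID are hypothetical stand-ins for the library's CID and
// glyph ID types; they are not part of package cmap.
type CID uint32
type GID uint16

// lookupGlyph performs the two-step selection: code -> CID -> GID.
// It returns GID 0 (the missing-glyph slot) if either step fails.
func lookupGlyph(codeToCID map[string]CID, cidToGID map[CID]GID, code []byte) GID {
	cid, ok := codeToCID[string(code)]
	if !ok {
		return 0
	}
	gid, ok := cidToGID[cid]
	if !ok {
		return 0
	}
	return gid
}

func main() {
	// Step 1 is what a CMap describes; step 2 is font-specific.
	codeToCID := map[string]CID{"\x00\x41": 34}
	cidToGID := map[CID]GID{34: 7}
	fmt.Println(lookupGlyph(codeToCID, cidToGID, []byte{0x00, 0x41}))
}
```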

Index ¶

type CIDEncoder
- func NewCIDEncoderIdentity(g2c GIDToCID) CIDEncoder
- func NewCIDEncoderUTF8(g2c GIDToCID) CIDEncoder
type GIDToCID
- func NewGIDToCIDIdentity() GIDToCID
- func NewGIDToCIDSequential() GIDToCID
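type Info
- func Extract(r pdf.Getter, obj pdf.Object) (*Info, error)
- func New(ROS *cid.SystemInfo, cs charcode.CodeSpaceRange, m map[charcode.CharCode]cid.CID) *Info
- func Read(r io.Reader, other map[string]*Info) (*Info, error)
- func (info *Info) Embed(w pdf.Putter, ref pdf.Reference, other map[string]pdf.Reference) error
- func (info *Info) GetMapping() map[charcode.CharCode]cid.CID
- func (info *Info) IsIdentity() bool
- func (info *Info) IsPredefined() bool
- func (info *Info) MaxCID() cid.CID
- func (info *Info) SetMapping(m map[charcode.CharCode]cid.CID)
- func (info *Info) Write(w io.Writer) error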
type RangeEntry
type RangeTUEntry
- func (r RangeTUEntry) String() string
type SingleEntry
type SingleTUEntry
- func (s SingleTUEntry) String() string
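type ToUnicode
- func ExtractToUnicode(r pdf.Getter, obj pdf.Object, cs charcode.CodeSpaceRange) (*ToUnicode, error)
- func NewToUnicode(cs charcode.CodeSpaceRange, m map[charcode.CharCode][]rune) *ToUnicode
- func NewToUnicodeNew(cs charcode.CodeSpaceRange, m map[string][]rune) *ToUnicode
- func ReadToUnicode(r io.Reader, cs charcode.CodeSpaceRange) (*ToUnicode, error)
- func (info *ToUnicode) Decode(s pdf.String) ([]rune, int)
- func (info *ToUnicode) Embed(w pdf.Putter, ref pdf.Reference) error
- func (info *ToUnicode) GetMapping() map[charcode.CharCode][]rune
- func (info *ToUnicode) GetMappingNew() map[string][]rune
- func (info *ToUnicode) GetSimpleMapping() [][]rune
- func (info *ToUnicode) SetMapping(m map[charcode.CharCode][]rune)
- func (info *ToUnicode) Write(w io.Writer) error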

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type CIDEncoder ¶

type CIDEncoder interface {
	// AppendEncoded appends the character code for the given glyph ID
	// to the given PDF string (allocating new codes as needed).
	// It also records that the character code corresponds to the
	// given Unicode string.
	AppendEncoded(pdf.String, glyph.ID, []rune) pdf.String

	CodeAndCID(pdf.String, glyph.ID, []rune) (pdf.String, pscid.CID)

	CS() charcode.CodeSpaceRange

	Lookup(c charcode.CharCode) (pscid.CID, bool)

	// CMap returns the mapping from character codes to CID values.
	CMap() *Info

	// ToUnicode returns a PDF ToUnicode CMap.
	ToUnicode() *ToUnicode

	// Subset is the set of all GIDs which have been used with AppendEncoded.
	// The returned slice is sorted and always starts with GID 0.
	Subset() []glyph.ID

	AsText(pdf.String) []rune

	AllCIDs(pdf.String) func(yield func([]byte, pscid.CID) bool) bool
}

CIDEncoder constructs and stores mappings from character codes to CID values and from character codes to Unicode strings.
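To make the bookkeeping behind AppendEncoded and Subset more concrete, here is a toy encoder. It is a sketch under simplifying assumptions, not the library's implementation: `toyEncoder` is a hypothetical type, it uses one-byte codes allocated in order of first use, and it keys codes on the glyph ID alone.

```go
package main

import (
	"fmt"
	"sort"
)

// toyEncoder is a hypothetical, simplified encoder (not part of package cmap).
type toyEncoder struct {
	codeOf map[uint16]byte // GID -> allocated one-byte code
	text   map[byte][]rune // code -> Unicode text
}

func newToyEncoder() *toyEncoder {
	return &toyEncoder{codeOf: map[uint16]byte{}, text: map[byte][]rune{}}
}

// AppendEncoded appends the code for gid to s, allocating a new code on
// first use and recording the Unicode text for that code.
func (e *toyEncoder) AppendEncoded(s []byte, gid uint16, text []rune) []byte {
	code, ok := e.codeOf[gid]
	if !ok {
		if len(e.codeOf) >= 256 {
			panic("toy encoder: out of one-byte codes")
		}
		code = byte(len(e.codeOf))
		e.codeOf[gid] = code
		e.text[code] = text
	}
	return append(s, code)
}

// Subset returns the sorted list of GIDs seen so far, always including GID 0.
func (e *toyEncoder) Subset() []uint16 {
	gids := []uint16{0}
	for gid := range e.codeOf {
		if gid != 0 {
			gids = append(gids, gid)
		}
	}
	sort.Slice(gids, func(i, j int) bool { return gids[i] < gids[j] })
	return gids
}

func main() {
	e := newToyEncoder()
	s := e.AppendEncoded(nil, 5, []rune("a"))
	s = e.AppendEncoded(s, 3, []rune("b"))
	s = e.AppendEncoded(s, 5, []rune("a")) // reuses the code allocated for GID 5
	fmt.Println(s, e.Subset())
}
```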

func NewCIDEncoderIdentity ¶ added in v0.3.6

func NewCIDEncoderIdentity(g2c GIDToCID) CIDEncoder

NewCIDEncoderIdentity returns an encoder where two-byte codes are used directly as CID values.
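As an illustration of what "two-byte codes are used directly as CID values" means, the following sketch converts between a CID and its two-byte code. The helper names are hypothetical, and the big-endian byte order is an assumption based on how PDF reads multi-byte character codes.

```go
package main

import "fmt"

// appendIdentityCode appends the two-byte, big-endian code for cid to s.
// Under an identity mapping the code, read as a 16-bit number, equals the CID.
// (Hypothetical helper, not part of package cmap.)
func appendIdentityCode(s []byte, cid uint16) []byte {
	return append(s, byte(cid>>8), byte(cid))
}

// decodeIdentityCode reads one two-byte code from the start of s.
func decodeIdentityCode(s []byte) (cid uint16, ok bool) {
	if len(s) < 2 {
		return 0, false
	}
	return uint16(s[0])<<8 | uint16(s[1]), true
}

func main() {
	s := appendIdentityCode(nil, 0x1234)
	cid, ok := decodeIdentityCode(s)
	fmt.Printf("% x %#x %v\n", s, cid, ok)
}
```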

func NewCIDEncoderUTF8 ¶ added in v0.3.6

func NewCIDEncoderUTF8(g2c GIDToCID) CIDEncoder

NewCIDEncoderUTF8 returns an encoder where character codes equal the UTF-8 encoding of the text content, where possible.
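The idea of using the UTF-8 encoding of the text as the character code can be sketched as follows. The function `utf8Code` is hypothetical, and the sketch assumes that "where possible" means the text is a single valid rune; the conditions the library actually applies are not stated here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// utf8Code returns the UTF-8 encoding of text for use as a character code.
// It reports false if the text does not qualify, in which case a caller
// would have to allocate some other code. (Hypothetical helper.)
func utf8Code(text []rune) ([]byte, bool) {
	if len(text) != 1 || !utf8.ValidRune(text[0]) {
		return nil, false
	}
	return utf8.AppendRune(nil, text[0]), true
}

func main() {
	code, ok := utf8Code([]rune("é"))
	fmt.Printf("% x %v\n", code, ok)

	_, ok = utf8Code([]rune("ffi")) // multi-rune text, e.g. for a ligature glyph
	fmt.Println(ok)
}
```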

type GIDToCID ¶ added in v0.3.5

type GIDToCID interface {
	CID(glyph.ID, []rune) pscid.CID
	GID(pscid.CID) glyph.ID

	ROS() *pscid.SystemInfo

	GIDToCID(numGlyph int) []pscid.CID
}

GIDToCID encodes a mapping from Glyph Identifier (GID) values to Character Identifier (CID) values.

func NewGIDToCIDIdentity ¶ added in v0.4.0

func NewGIDToCIDIdentity() GIDToCID

NewGIDToCIDIdentity returns a GIDToCID which uses the GID values directly as CID values.

func NewGIDToCIDSequential ¶ added in v0.4.0

func NewGIDToCIDSequential() GIDToCID

NewGIDToCIDSequential returns a GIDToCID which assigns CID values sequentially, starting with 1.
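Sequential allocation can be pictured with the small sketch below. It assumes that CIDs are handed out in order of first use, which is not stated above; `sequentialCIDs` is a hypothetical type, and plain integers stand in for the library's glyph ID and CID types.

```go
package main

import "fmt"

// sequentialCIDs is a hypothetical allocator, not the library's type.
type sequentialCIDs struct {
	cidOf map[uint16]uint32 // GID -> CID
	gidOf map[uint32]uint16 // CID -> GID
}

func newSequentialCIDs() *sequentialCIDs {
	return &sequentialCIDs{cidOf: map[uint16]uint32{}, gidOf: map[uint32]uint16{}}
}

// CID returns the CID for gid, allocating the next free CID on first use.
func (s *sequentialCIDs) CID(gid uint16) uint32 {
	if cid, ok := s.cidOf[gid]; ok {
		return cid
	}
	cid := uint32(len(s.cidOf) + 1) // the first allocated CID is 1
	s.cidOf[gid] = cid
	s.gidOf[cid] = gid
	return cid
}

// GID returns the glyph for cid, or 0 if cid has not been allocated.
func (s *sequentialCIDs) GID(cid uint32) uint16 {
	return s.gidOf[cid]
}

func main() {
	a := newSequentialCIDs()
	fmt.Println(a.CID(40), a.CID(12), a.CID(40), a.GID(2))
}
```

With an identity mapping, by contrast, the CID for GID 40 would simply be 40.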

type Info ¶ added in v0.3.3

type Info struct {
	Name string
	ROS  *cid.SystemInfo
	charcode.CodeSpaceRange
	CSFile  charcode.CodeSpaceRange // TODO(voss): remove this
	WMode   int
	UseCMap string
	Singles []SingleEntry
	Ranges  []RangeEntry
}

Info holds the information for a PDF CMap.

func Extract ¶ added in v0.3.5

func Extract(r pdf.Getter, obj pdf.Object) (*Info, error)

Extract reads a CMap from a PDF file.

func New ¶ added in v0.3.5

func New(ROS *cid.SystemInfo, cs charcode.CodeSpaceRange, m map[charcode.CharCode]cid.CID) *Info

New allocates a new CMap object.

func Read ¶ added in v0.3.5

func Read(r io.Reader, other map[string]*Info) (*Info, error)

func (*Info) Embed ¶ added in v0.3.5

func (info *Info) Embed(w pdf.Putter, ref pdf.Reference, other map[string]pdf.Reference) error

func (*Info) GetMapping ¶ added in v0.3.5

func (info *Info) GetMapping() map[charcode.CharCode]cid.CID

GetMapping returns the mapping information from info.

func (*Info) IsIdentity ¶ added in v0.3.5

func (info *Info) IsIdentity() bool

IsIdentity returns true if all codes are equal to the corresponding CID.

func (*Info) IsPredefined ¶ added in v0.3.6

func (info *Info) IsPredefined() bool

IsPredefined returns true if the CMap is one of the CMaps predefined in PDF.

func (*Info) MaxCID ¶ added in v0.3.5

func (info *Info) MaxCID() cid.CID

MaxCID returns the largest CID used by this CMap.

func (*Info) SetMapping ¶ added in v0.3.5

func (info *Info) SetMapping(m map[charcode.CharCode]cid.CID)

SetMapping replaces the mapping information in info with the given mapping.

To make efficient use of range entries, the generated mapping may be a superset of the original mapping, i.e. it may contain entries for charcodes which were not mapped in the original mapping.
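The following sketch shows one way such a superset can arise. It is not the library's algorithm: `buildRanges`, `rangeEntry`, and the `maxGap` parameter are hypothetical, plain integers stand in for character codes and CIDs, and restrictions a real CMap writer has to respect (such as code space boundaries) are ignored.

```go
package main

import (
	"fmt"
	"sort"
)

// rangeEntry mirrors the shape of RangeEntry with plain integer fields.
type rangeEntry struct {
	First, Last uint32
	Value       uint32
}

// buildRanges merges codes with consecutive CIDs into ranges. Gaps of up
// to maxGap unmapped codes are bridged, so the result may map codes which
// are not present in m.
func buildRanges(m map[uint32]uint32, maxGap uint32) []rangeEntry {
	codes := make([]uint32, 0, len(m))
	for code := range m {
		codes = append(codes, code)
	}
	sort.Slice(codes, func(i, j int) bool { return codes[i] < codes[j] })

	var res []rangeEntry
	for _, code := range codes {
		cid := m[code]
		if n := len(res); n > 0 {
			r := &res[n-1]
			sameOffset := int64(cid)-int64(code) == int64(r.Value)-int64(r.First)
			if sameOffset && code-r.Last-1 <= maxGap {
				r.Last = code
				continue
			}
		}
		res = append(res, rangeEntry{First: code, Last: code, Value: cid})
	}
	return res
}

func main() {
	m := map[uint32]uint32{1: 10, 2: 11, 4: 13, 5: 14, 9: 3}
	fmt.Println(buildRanges(m, 0)) // exact: codes 1-2 and 4-5 stay separate
	fmt.Println(buildRanges(m, 1)) // superset: code 3 is now covered as well
}
```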

func (*Info) Write ¶ added in v0.3.5

func (info *Info) Write(w io.Writer) error

type RangeEntry ¶ added in v0.3.6

type RangeEntry struct {
	First charcode.CharCode
	Last  charcode.CharCode
	Value cid.CID
}

RangeEntry describes a range of character codes with consecutive CIDs. First and Last are the first and last character codes in the range. Value is the CID of the first character code in the range.
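Looking up a code in a list of such ranges amounts to adding the offset within the range to Value. The sketch below uses a hypothetical `rangeEntry` type with plain integer fields in place of charcode.CharCode and cid.CID.

```go
package main

import "fmt"

// rangeEntry mirrors the shape of RangeEntry with plain integer fields.
type rangeEntry struct {
	First, Last uint32
	Value       uint32
}

// lookupCID returns the CID for code, or false if no range contains code.
func lookupCID(ranges []rangeEntry, code uint32) (uint32, bool) {
	for _, r := range ranges {
		if r.First <= code && code <= r.Last {
			return r.Value + (code - r.First), true
		}
	}
	return 0, false
}

func main() {
	ranges := []rangeEntry{{First: 0x20, Last: 0x7e, Value: 1}}
	fmt.Println(lookupCID(ranges, 0x41))
	fmt.Println(lookupCID(ranges, 0x80))
}
```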

type RangeTUEntry ¶ added in v0.3.6

type RangeTUEntry struct {
	First  charcode.CharCode
	Last   charcode.CharCode
	Values [][]rune
}

RangeTUEntry describes a range of character codes. First and Last are the first and last character codes in the range. Values is a list of Unicode strings. If the list has length one, the single string is the value for First, and it is incremented by one for each subsequent character code in the range. Otherwise, the list must have length Last-First+1 and specifies the value for each character code in the range.
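The two forms of Values can be expanded into an explicit mapping as sketched below. The type `rangeTU` and the function `expandRange` are hypothetical, and the sketch assumes that "incremented by one" applies to the last rune of the string; the library may handle this detail differently.

```go
package main

import "fmt"

// rangeTU mirrors the shape of RangeTUEntry with plain integer codes.
type rangeTU struct {
	First, Last uint32
	Values      [][]rune
}

// expandRange lists the Unicode text for every code in the range.
func expandRange(r rangeTU) (map[uint32][]rune, error) {
	if r.Last < r.First {
		return nil, fmt.Errorf("invalid range %d..%d", r.First, r.Last)
	}
	n := int(r.Last-r.First) + 1
	out := make(map[uint32][]rune, n)
	switch {
	case len(r.Values) == 1 && len(r.Values[0]) > 0:
		base := r.Values[0]
		for i := 0; i < n; i++ {
			text := append([]rune(nil), base...)
			text[len(text)-1] += rune(i) // assumption: the last rune is incremented
			out[r.First+uint32(i)] = text
		}
	case len(r.Values) == n:
		for i, text := range r.Values {
			out[r.First+uint32(i)] = text
		}
	default:
		return nil, fmt.Errorf("need 1 or %d values, got %d", n, len(r.Values))
	}
	return out, nil
}

func main() {
	m, err := expandRange(rangeTU{First: 0x10, Last: 0x12, Values: [][]rune{[]rune("A")}})
	fmt.Println(string(m[0x10]), string(m[0x11]), string(m[0x12]), err)
}
```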

func (RangeTUEntry) String ¶ added in v0.3.6

func (r RangeTUEntry) String() string

type SingleEntry ¶ added in v0.3.6

type SingleEntry struct {
	Code  charcode.CharCode
	Value cid.CID
}

SingleEntry specifies that character code Code represents the given CID.

type SingleTUEntry ¶ added in v0.3.6

type SingleTUEntry struct {
	Code  charcode.CharCode
	Value []rune
}

SingleTUEntry specifies that character code Code represents the given Unicode string.

func (SingleTUEntry) String ¶ added in v0.3.6

func (s SingleTUEntry) String() string

type ToUnicode ¶ added in v0.3.6

type ToUnicode struct {
	CS      charcode.CodeSpaceRange
	Singles []SingleTUEntry
	Ranges  []RangeTUEntry
}

ToUnicode holds the information for a PDF ToUnicode CMap.

func ExtractToUnicode ¶ added in v0.3.6

func ExtractToUnicode(r pdf.Getter, obj pdf.Object, cs charcode.CodeSpaceRange) (*ToUnicode, error)

ExtractToUnicode extracts a ToUnicode CMap from a PDF file. If cs is not nil, it overrides the code space range given inside the CMap.

func NewToUnicode ¶ added in v0.3.6

func NewToUnicode(cs charcode.CodeSpaceRange, m map[charcode.CharCode][]rune) *ToUnicode

NewToUnicode constructs a ToUnicode CMap from the given mapping.

func NewToUnicodeNew ¶ added in v0.4.0

func NewToUnicodeNew(cs charcode.CodeSpaceRange, m map[string][]rune) *ToUnicode

NewToUnicodeNew constructs a ToUnicode CMap from the given mapping.

func ReadToUnicode ¶ added in v0.3.6

func ReadToUnicode(r io.Reader, cs charcode.CodeSpaceRange) (*ToUnicode, error)

ReadToUnicode reads a ToUnicode CMap. If cs is not nil, it overrides the code space range given inside the CMap.

func (*ToUnicode) Decode ¶ added in v0.3.6

func (info *ToUnicode) Decode(s pdf.String) ([]rune, int)

Decode decodes the first character code from the given string. It returns the corresponding Unicode text and the number of bytes consumed. If the character code cannot be decoded, unicode.ReplacementChar is returned, and the length is either 0 (if the string is empty) or 1. If a valid character code is found but the code is not mapped by the ToUnicode CMap, unicode.ReplacementChar is returned as well.

func (*ToUnicode) Embed ¶ added in v0.3.6

func (info *ToUnicode) Embed(w pdf.Putter, ref pdf.Reference) error

Embed adds the ToUnicode CMap to a PDF file.

func (*ToUnicode) GetMapping ¶ added in v0.3.6

func (info *ToUnicode) GetMapping() map[charcode.CharCode][]rune

GetMapping returns the mapping information from info.

func (*ToUnicode) GetMappingNew ¶ added in v0.4.0

func (info *ToUnicode) GetMappingNew() map[string][]rune

GetMappingNew returns the mapping information from info.

func (*ToUnicode) GetSimpleMapping ¶ added in v0.4.0

func (info *ToUnicode) GetSimpleMapping() [][]rune

func (*ToUnicode) SetMapping ¶ added in v0.3.6

func (info *ToUnicode) SetMapping(m map[charcode.CharCode][]rune)

SetMapping replaces the mapping information in info with the given mapping.

To make efficient use of range entries, the generated mapping may be a superset of the original mapping, i.e. it may contain entries for charcodes which were not mapped in the original mapping.

func (*ToUnicode) Write ¶ added in v0.3.6

func (info *ToUnicode) Write(w io.Writer) error
