Documentation ¶
Index ¶
- Constants
- Variables
- func IsPredefinedCMap(name string) bool
- type CIDSystemInfo
- type CMap
- func (cmap *CMap) Bytes() []byte
- func (cmap *CMap) BytesToCharcodes(data []byte) ([]CharCode, bool)
- func (cmap *CMap) CIDToCharcode(cid CharCode) (CharCode, bool)
- func (cmap *CMap) CharcodeBytesToUnicode(data []byte) (string, int)
- func (cmap *CMap) CharcodeToCID(code CharCode) (CharCode, bool)
- func (cmap *CMap) CharcodeToUnicode(code CharCode) (rune, bool)
- func (cmap *CMap) NBits() int
- func (cmap *CMap) Name() string
- func (cmap *CMap) RuneToCID(r rune) (CharCode, bool)
- func (cmap *CMap) String() string
- func (cmap *CMap) Type() int
- type CharCode
- type Codespace
Constants ¶
const (
	// MissingCodeRune replaces runes that can't be decoded. '\ufffd' = �. Was '?'.
	MissingCodeRune = '\ufffd' // �
)
Variables ¶
var (
	ErrBadCMap        = errors.New("bad cmap")
	ErrBadCMapComment = errors.New("comment should start with %")
	ErrBadCMapDict    = errors.New("invalid dict")
)
CMap parser errors.
Functions ¶
func IsPredefinedCMap ¶
IsPredefinedCMap returns true if the specified CMap name is a predefined CJK CMap. The predefined CMaps are bundled with the package and can be loaded using the LoadPredefinedCMap function. See section 9.7.5.2 "Predefined CMaps" (page 273, Table 118).
Types ¶
type CIDSystemInfo ¶
CIDSystemInfo contains information for identifying the character collection used by a CID font.
CIDSystemInfo=Dict("Registry": Adobe, "Ordering": Korea1, "Supplement": 0, )
func NewCIDSystemInfo ¶
func NewCIDSystemInfo(obj core.PdfObject) (info CIDSystemInfo, err error)
NewCIDSystemInfo returns the CIDSystemInfo encoded in the core.PdfObject `obj`.
func (*CIDSystemInfo) String ¶
func (info *CIDSystemInfo) String() string
String returns a human-readable description of `info`, for example "Adobe-Japan2-000".
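The "Adobe-Japan2-000" shape suggests a registry, an ordering, and a zero-padded supplement joined by hyphens. The following is a minimal sketch of that formatting only; `cidSystemInfo` and its field names are hypothetical stand-ins derived from the dictionary keys in the type description, not the package's actual definition.

```go
package main

import "fmt"

// cidSystemInfo is a hypothetical stand-in for CIDSystemInfo. The field
// names are assumed from the "Registry", "Ordering" and "Supplement" keys.
type cidSystemInfo struct {
	Registry   string
	Ordering   string
	Supplement int
}

// String joins the three fields with hyphens and zero-pads the supplement
// to three digits, which reproduces the "Adobe-Japan2-000" shape.
func (info cidSystemInfo) String() string {
	return fmt.Sprintf("%s-%s-%03d", info.Registry, info.Ordering, info.Supplement)
}
```

With the values from the dictionary example (`Adobe`, `Korea1`, `0`), this sketch yields a string of the same shape as the one quoted in the description.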
type CMap ¶
type CMap struct {
// contains filtered or unexported fields
}
CMap represents a character code to Unicode mapping used in PDF files. References:
https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/5411.ToUnicode.pdf
https://github.com/adobe-type-tools/cmap-resources/releases
func LoadCmapFromData ¶
LoadCmapFromData parses the in-memory cmap `data` and returns the resulting CMap. If `isSimple` is true, it uses 1-byte encodings, otherwise it uses the codespaces in the cmap.
9.10.3 ToUnicode CMaps (page 293).
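To give a feel for the kind of data a ToUnicode CMap carries, the sketch below extracts the `beginbfchar` ... `endbfchar` pairs from cmap text. It is not the package's parser: `parseBfChars` and `parseHexToken` are hypothetical helpers, the sketch assumes whitespace-separated hex tokens and source codes of at most four bytes, and it ignores codespace ranges, `bfrange` sections, and comments.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
	"unicode/utf16"
)

// parseHexToken decodes a token of the form "<0041>" into raw bytes.
func parseHexToken(tok string) ([]byte, error) {
	if len(tok) < 2 || tok[0] != '<' || tok[len(tok)-1] != '>' {
		return nil, fmt.Errorf("not a hex string: %q", tok)
	}
	return hex.DecodeString(tok[1 : len(tok)-1])
}

// parseBfChars collects the code -> Unicode pairs found inside every
// beginbfchar/endbfchar section. Destination strings are read as UTF-16BE.
func parseBfChars(data string) (map[uint32]string, error) {
	out := map[uint32]string{}
	toks := strings.Fields(data)
	inSection := false
	for i := 0; i < len(toks); i++ {
		switch {
		case toks[i] == "beginbfchar":
			inSection = true
		case toks[i] == "endbfchar":
			inSection = false
		case inSection:
			if i+1 >= len(toks) {
				return nil, fmt.Errorf("source code %q has no destination", toks[i])
			}
			src, err := parseHexToken(toks[i])
			if err != nil {
				return nil, err
			}
			dst, err := parseHexToken(toks[i+1])
			if err != nil {
				return nil, err
			}
			if len(dst)%2 != 0 {
				return nil, fmt.Errorf("destination %q is not UTF-16BE", toks[i+1])
			}
			i++ // the destination token has been consumed

			// Interpret the source bytes as a big-endian character code.
			var code uint32
			for _, b := range src {
				code = code<<8 | uint32(b)
			}
			// Interpret the destination bytes as UTF-16BE code units.
			units := make([]uint16, 0, len(dst)/2)
			for j := 0; j+1 < len(dst); j += 2 {
				units = append(units, uint16(dst[j])<<8|uint16(dst[j+1]))
			}
			out[code] = string(utf16.Decode(units))
		}
	}
	return out, nil
}
```

Calling `parseBfChars` on a section containing the pair `<01> <0041>` maps code 1 to "A". A real parser would also honor the codespace ranges, which is what the `isSimple` flag bypasses in favor of 1-byte encodings.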
func LoadCmapFromDataCID ¶
LoadCmapFromDataCID parses the in-memory cmap `data` and returns the resulting CMap. It is a convenience function.
func LoadPredefinedCMap ¶
LoadPredefinedCMap loads a predefined CJK CMap by name. See section 9.7.5.2 "Predefined CMaps" (page 273, Table 118).
func NewToUnicodeCMap ¶
NewToUnicodeCMap returns an identity CMap with codeToUnicode matching the `codeToUnicode` arg.
func (*CMap) BytesToCharcodes ¶
BytesToCharcodes attempts to convert the entire byte array `data` to a list of character codes from the ranges specified by `cmap`'s codespaces. Returns:
- the character code sequence (complete only if every byte was matched)
- whether the match was complete
NOTE: A partial list of character codes will be returned if a complete match is not possible.
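The matching behavior described above can be sketched as follows. This is an illustration under assumptions, not the package's code: `codespace` is a hypothetical stand-in for the Codespace type, each range is treated as a plain numeric interval for a fixed byte length, and at each position the shortest matching code length wins.

```go
package main

// codespace is a hypothetical stand-in for the package's Codespace type:
// codes that are numBytes long and lie in the interval [low, high].
type codespace struct {
	numBytes  int
	low, high uint32
}

// inCodespace reports whether an n-byte code falls in any codespace range.
func inCodespace(spaces []codespace, n int, code uint32) bool {
	for _, cs := range spaces {
		if cs.numBytes == n && cs.low <= code && code <= cs.high {
			return true
		}
	}
	return false
}

// bytesToCharcodes splits data into character codes. At each position it
// tries code lengths of 1 to 4 bytes and accepts the first one that lies
// in a codespace. If no length matches, it returns the codes found so far
// together with false.
func bytesToCharcodes(spaces []codespace, data []byte) ([]uint32, bool) {
	var codes []uint32
	for i := 0; i < len(data); {
		matched := false
		var code uint32
		for n := 1; n <= 4 && i+n <= len(data); n++ {
			code = code<<8 | uint32(data[i+n-1])
			if inCodespace(spaces, n, code) {
				codes = append(codes, code)
				i += n
				matched = true
				break
			}
		}
		if !matched {
			return codes, false // partial result
		}
	}
	return codes, true
}
```

With a 1-byte range `[0x00, 0x80]` and a 2-byte range `[0x8140, 0x9FFC]`, the bytes `41 81 40 42` split into the codes `0x41`, `0x8140`, `0x42`, while `41 FF` stops after `0x41` and reports an incomplete match.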
func (*CMap) CIDToCharcode ¶
CIDToCharcode maps the specified character identifier (CID) to a character code. If the provided CID has no available mapping, the second return value is false.
func (*CMap) CharcodeBytesToUnicode ¶
CharcodeBytesToUnicode converts a byte array of charcodes to a Unicode string representation. Its second return value is an int (the signature returns `(string, int)`), not a bool, and reports on the success of the conversion. NOTE: This only works for ToUnicode cmaps.
func (*CMap) CharcodeToCID ¶
CharcodeToCID maps the specified character code to a character identifier. If the provided charcode has no available mapping, the second return value is false. The returned CID can be mapped to a Unicode character using a Unicode conversion CMap.
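CharcodeToCID and CIDToCharcode follow Go's usual "value, ok" lookup convention in opposite directions. A minimal sketch with plain maps is shown below; `cidMap` and its methods are hypothetical, and the rule of keeping the smallest code when several codes share a CID is an assumption made only to keep the inverse deterministic.

```go
package main

// cidMap is a hypothetical pair of lookup tables: character code -> CID
// and the derived inverse, CID -> character code.
type cidMap struct {
	codeToCID map[uint32]uint32
	cidToCode map[uint32]uint32
}

// newCIDMap builds the inverse table from the forward one. When several
// codes map to the same CID, the smallest code is kept (an assumption).
func newCIDMap(codeToCID map[uint32]uint32) cidMap {
	inv := make(map[uint32]uint32, len(codeToCID))
	for code, cid := range codeToCID {
		if prev, ok := inv[cid]; !ok || code < prev {
			inv[cid] = code
		}
	}
	return cidMap{codeToCID: codeToCID, cidToCode: inv}
}

// charcodeToCID returns the CID for code; ok is false if there is none.
func (m cidMap) charcodeToCID(code uint32) (cid uint32, ok bool) {
	cid, ok = m.codeToCID[code]
	return cid, ok
}

// cidToCharcode returns a code for cid; ok is false if there is none.
func (m cidMap) cidToCharcode(cid uint32) (code uint32, ok bool) {
	code, ok = m.cidToCode[cid]
	return code, ok
}
```

Callers are expected to check the boolean before using the returned value, since the zero value is itself a valid code or CID.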
func (*CMap) CharcodeToUnicode ¶
CharcodeToUnicode converts a single character code `code` to a Unicode rune. If `code` is not in the Unicode map, MissingCodeRune ('�') is returned. NOTE: CharcodeBytesToUnicode is typically more efficient.
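The fallback to '�' can be sketched as a map lookup that substitutes the replacement character. The names below are hypothetical; returning false alongside the replacement rune, and counting the misses in `decodeAll`, are assumptions of this sketch rather than documented behavior.

```go
package main

// missingCodeRune mirrors the documented MissingCodeRune constant.
const missingCodeRune = '\ufffd'

// charcodeToUnicode looks code up in table. If there is no entry, it
// returns the replacement rune and false (assumed convention).
func charcodeToUnicode(table map[uint32]rune, code uint32) (rune, bool) {
	if r, ok := table[code]; ok {
		return r, true
	}
	return missingCodeRune, false
}

// decodeAll converts a sequence of codes to a string and counts how many
// codes had to be replaced because they were not in the table.
func decodeAll(table map[uint32]rune, codes []uint32) (string, int) {
	runes := make([]rune, 0, len(codes))
	missing := 0
	for _, code := range codes {
		r, ok := charcodeToUnicode(table, code)
		if !ok {
			missing++
		}
		runes = append(runes, r)
	}
	return string(runes), missing
}
```

Converting code by code like this makes the substitution visible, which is useful when diagnosing why extracted text contains '�'.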
func (*CMap) RuneToCID ¶
RuneToCID maps the specified rune to a character identifier. If the provided rune has no available mapping, the second return value is false.
type CharCode ¶
type CharCode uint32
CharCode is a character code or Unicode code point. Note that a Go rune is an int32 (see https://golang.org/doc/go1#rune), whereas CharCode is a uint32.