Documentation ¶
Index ¶
- Constants
- Variables
- func NewRunID() int
- func ReadZipReader(r *zip.Reader, o *Options) (map[string][]byte, error)
- func ResetRunIdCounter()
- func ValidatePositions(document []byte, runs []*Run) error
- type DocumentRuns
- type File
- type Options
- type Position
- type Reader
- type Run
- type RunParser
- type TagPair
Constants ¶
const (
	// RunElementName is the local name of the XML tag for runs (<w:r>, </w:r> and <w:r/>)
	RunElementName = "r"
	// TextElementName is the local name of the XML tag for text-runs (<w:t> and </w:t>)
	TextElementName = "t"
)
const (
UnzipSizeLimit = 1000 << 24
)
Variables ¶
var (
	// RunOpenTagRegex matches all OpenTags for runs, including eventually set attributes
	RunOpenTagRegex = regexp.MustCompile(`(<w:r).*>`)
	// RunCloseTagRegex matches the close tag of runs
	RunCloseTagRegex = regexp.MustCompile(`(</w:r>)`)
	// RunSingletonTagRegex matches a singleton run tag
	RunSingletonTagRegex = regexp.MustCompile(`(<w:r/>)`)
	// TextOpenTagRegex matches all OpenTags for text-runs, including eventually set attributes
	TextOpenTagRegex = regexp.MustCompile(`(<w:t).*>`)
	// TextCloseTagRegex matches the close tag of text-runs
	TextCloseTagRegex = regexp.MustCompile(`(</w:t>)`)
	// ErrTagsInvalid is returned if the parsing failed and the result cannot be used.
	// Typically this means that one or more tag-offsets were not parsed correctly which
	// would cause the document to become corrupted as soon as replacing starts.
	ErrTagsInvalid = errors.New("one or more tags are invalid and will cause the XML to be corrupt")
)
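To illustrate what the patterns above accept, the following sketch copies them into a standalone program and classifies isolated tags. The helper `tagKind` and the order of its checks are assumptions made for this illustration; they are not part of the package and do not show how the package itself applies the patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns copied verbatim from the Variables block; the lower-case names
// are local to this sketch.
var (
	runOpenTagRegex      = regexp.MustCompile(`(<w:r).*>`)
	runCloseTagRegex     = regexp.MustCompile(`(</w:r>)`)
	runSingletonTagRegex = regexp.MustCompile(`(<w:r/>)`)
	textOpenTagRegex     = regexp.MustCompile(`(<w:t).*>`)
	textCloseTagRegex    = regexp.MustCompile(`(</w:t>)`)
)

// tagKind is a hypothetical helper that reports which pattern a single,
// isolated tag matches. The most specific patterns are tested first because
// the open-tag patterns are broad enough to match singleton tags as well.
func tagKind(tag string) string {
	switch {
	case runSingletonTagRegex.MatchString(tag):
		return "run-singleton"
	case runCloseTagRegex.MatchString(tag):
		return "run-close"
	case textCloseTagRegex.MatchString(tag):
		return "text-close"
	case textOpenTagRegex.MatchString(tag):
		return "text-open"
	case runOpenTagRegex.MatchString(tag):
		return "run-open"
	default:
		return "other"
	}
}

func main() {
	tags := []string{
		`<w:r w:rsidR="00A1">`,
		`</w:r>`,
		`<w:r/>`,
		`<w:t xml:space="preserve">`,
		`</w:t>`,
		`<w:p>`,
	}
	for _, tag := range tags {
		fmt.Printf("%-28s %s\n", tag, tagKind(tag))
	}
}
```

Note that the two open-tag patterns are permissive: taken on their own they also match other elements whose local names begin with `r` or `t`, such as `<w:rPr>`.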
Functions ¶
func ReadZipReader ¶
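func ReadZipReader(r *zip.Reader, o *Options) (map[string][]byte, error)
ReadZipReader reads a zip file held in memory.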
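As a rough illustration of reading a zip archive from memory, the sketch below uses only the standard library. The helpers `readZipBytes` and `buildZip` are hypothetical and written for this example; the sketch does not use the package's Options type and makes no claim about how ReadZipReader is implemented.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// readZipBytes is a hypothetical helper, not part of the package. It opens a
// zip archive held in memory and returns each file's contents keyed by name.
func readZipBytes(data []byte) (map[string][]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, fmt.Errorf("open zip: %w", err)
	}
	files := make(map[string][]byte, len(zr.File))
	for _, f := range zr.File {
		rc, err := f.Open()
		if err != nil {
			return nil, fmt.Errorf("open %s: %w", f.Name, err)
		}
		content, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, fmt.Errorf("read %s: %w", f.Name, err)
		}
		files[f.Name] = content
	}
	return files, nil
}

// buildZip creates a small in-memory archive so the example is self-contained.
func buildZip(entries map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, body := range entries {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := buildZip(map[string]string{"word/document.xml": "<w:document/>"})
	if err != nil {
		panic(err)
	}
	files, err := readZipBytes(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", files["word/document.xml"])
}
```

A production reader would normally also cap the number of bytes it decompresses, for example with `io.LimitReader`; whether and how the package applies UnzipSizeLimit is not stated here.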
func ResetRunIdCounter ¶
func ResetRunIdCounter()
ResetRunIdCounter will reset the runId counter to 0
func ValidatePositions ¶
func ValidatePositions(document []byte, runs []*Run) error
ValidatePositions iterates over all runs and their texts (if any) and ensures that the bytes at each recorded tag position match the respective regex. If validation fails, replacement will not work because the offsets are wrong.
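The idea of checking offsets against tag patterns can be sketched as follows. The types `position` and `tagPair`, the anchored patterns, and the convention that `End` is exclusive are all assumptions for this illustration; they are stand-ins rather than the package's own definitions.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// position and tagPair are hypothetical stand-ins for the package's types.
// End is treated as exclusive here, which is an assumption of this sketch.
type position struct{ Start, End int }

type tagPair struct{ Open, Close position }

var (
	// Simplified, anchored patterns written for this sketch.
	openRe     = regexp.MustCompile(`^<w:r(\s[^>]*)?>$`)
	closeRe    = regexp.MustCompile(`^</w:r>$`)
	errInvalid = errors.New("tag offsets do not point at the expected tags")
)

// slice returns the bytes covered by p, or false if p is out of bounds.
func slice(doc []byte, p position) ([]byte, bool) {
	if p.Start < 0 || p.End > len(doc) || p.Start > p.End {
		return nil, false
	}
	return doc[p.Start:p.End], true
}

// validate checks that every recorded offset pair really covers a run tag.
func validate(doc []byte, pairs []tagPair) error {
	for i, tp := range pairs {
		open, okOpen := slice(doc, tp.Open)
		cl, okClose := slice(doc, tp.Close)
		if !okOpen || !okClose || !openRe.Match(open) || !closeRe.Match(cl) {
			return fmt.Errorf("pair %d: %w", i, errInvalid)
		}
	}
	return nil
}

func main() {
	doc := []byte(`<w:r><w:t>Hi</w:t></w:r>`)
	good := []tagPair{{Open: position{0, 5}, Close: position{18, 24}}}
	bad := []tagPair{{Open: position{1, 5}, Close: position{18, 24}}}

	fmt.Println("good offsets:", validate(doc, good))
	err := validate(doc, bad)
	fmt.Println("bad offsets rejected:", errors.Is(err, errInvalid))
}
```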
Types ¶
type DocumentRuns ¶
type DocumentRuns []*Run
DocumentRuns is a convenience type used to describe a slice of runs. It also implements Push() and Pop(), which allow it to be used as a LIFO stack.
func (*DocumentRuns) Pop ¶
func (dr *DocumentRuns) Pop() *Run
Pop will return the last Run added to the stack and remove it.
func (*DocumentRuns) Push ¶
func (dr *DocumentRuns) Push(run *Run)
Push will push a new Run onto the DocumentRuns stack
func (DocumentRuns) WithText ¶
func (dr DocumentRuns) WithText() DocumentRuns
WithText returns all runs with the HasText flag set
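The stack behaviour described for DocumentRuns can be sketched with a slice type and pointer receivers. The lower-case types below are local stand-ins, and returning nil from Pop on an empty stack is an assumption of this sketch; the package's behaviour in that case is not documented here.

```go
package main

import "fmt"

// run is a reduced, hypothetical stand-in for the package's Run type.
type run struct {
	ID      int
	HasText bool
}

// documentRuns mirrors the idea of a slice of runs usable as a LIFO stack.
type documentRuns []*run

// Push appends a run to the top of the stack.
func (dr *documentRuns) Push(r *run) {
	*dr = append(*dr, r)
}

// Pop removes and returns the most recently pushed run.
// Returning nil for an empty stack is an assumption of this sketch.
func (dr *documentRuns) Pop() *run {
	old := *dr
	if len(old) == 0 {
		return nil
	}
	last := old[len(old)-1]
	*dr = old[:len(old)-1]
	return last
}

// WithText returns only the runs whose HasText flag is set.
func (dr documentRuns) WithText() documentRuns {
	var out documentRuns
	for _, r := range dr {
		if r.HasText {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	var stack documentRuns
	stack.Push(&run{ID: 1, HasText: true})
	stack.Push(&run{ID: 2})
	stack.Push(&run{ID: 3, HasText: true})

	fmt.Println("runs with text:", len(stack.WithText()))
	fmt.Println("popped ID:", stack.Pop().ID)
	fmt.Println("remaining:", len(stack))
}
```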
type Position ¶
Position is a generic position of a tag, represented by byte offsets
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader is a very basic io.Reader implementation which is capable of returning the current position.
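A position-aware reader of this kind can be sketched as a thin wrapper that counts the bytes it hands out. The type `posReader`, its `Pos` method and the constructor `newPosReader` are hypothetical names chosen for this example; the package's Reader keeps its fields unexported, so its actual layout and method names may differ.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// posReader is a hypothetical wrapper that tracks how many bytes have been
// read from the underlying reader so far.
type posReader struct {
	r   io.Reader
	pos int64
}

// newPosReader wraps a string in a posReader for demonstration purposes.
func newPosReader(s string) *posReader {
	return &posReader{r: strings.NewReader(s)}
}

// Read implements io.Reader and advances the position by the bytes read.
func (p *posReader) Read(b []byte) (int, error) {
	n, err := p.r.Read(b)
	p.pos += int64(n)
	return n, err
}

// Pos returns the current byte offset into the source.
func (p *posReader) Pos() int64 {
	return p.pos
}

func main() {
	pr := newPosReader(`<w:r><w:t>Hi</w:t></w:r>`)
	buf := make([]byte, 5)
	if _, err := io.ReadFull(pr, buf); err != nil {
		panic(err)
	}
	fmt.Printf("read %q, now at offset %d\n", buf, pr.Pos())
}
```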
type Run ¶
type Run struct {
	TagPair
	ID      int
	Text    TagPair // Text is the <w:t> tag pair which is always within a run and cannot be standalone.
	HasText bool
}
Run defines a non-block region of text with a common set of properties. It is specified with the <w:r> element. In this package a run is described by four byte positions: the start and end of its opening tag and the start and end of its closing tag.
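To make the byte-position idea concrete, the sketch below records the offsets of an opening and a closing tag and slices out what lies between them. The types `position` and `tagPair`, the helper names, and the exclusive `End` convention are assumptions for this illustration, and the literal tag search is far simpler than real parsing.

```go
package main

import (
	"bytes"
	"fmt"
)

// position and tagPair are hypothetical stand-ins; End is exclusive here.
type position struct{ Start, End int }

type tagPair struct{ OpenTag, CloseTag position }

// locate finds the first literal openTag and the first closeTag after it,
// returning the four byte offsets that delimit the element.
func locate(doc []byte, openTag, closeTag string) (tagPair, bool) {
	o := bytes.Index(doc, []byte(openTag))
	if o < 0 {
		return tagPair{}, false
	}
	oEnd := o + len(openTag)
	c := bytes.Index(doc[oEnd:], []byte(closeTag))
	if c < 0 {
		return tagPair{}, false
	}
	cStart := oEnd + c
	return tagPair{
		OpenTag:  position{Start: o, End: oEnd},
		CloseTag: position{Start: cStart, End: cStart + len(closeTag)},
	}, true
}

// inner returns the bytes between the opening and the closing tag.
func inner(doc []byte, tp tagPair) []byte {
	return doc[tp.OpenTag.End:tp.CloseTag.Start]
}

func main() {
	doc := []byte(`<w:p><w:r><w:t>Hi</w:t></w:r></w:p>`)

	run, ok := locate(doc, "<w:r>", "</w:r>")
	if !ok {
		panic("run not found")
	}
	text, ok := locate(doc, "<w:t>", "</w:t>")
	if !ok {
		panic("text not found")
	}
	fmt.Printf("run offsets:  %+v\n", run)
	fmt.Printf("text offsets: %+v\n", text)
	fmt.Printf("run body: %s\n", inner(doc, run))
	fmt.Printf("text:     %s\n", inner(doc, text))
}
```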
func NewEmptyRun ¶
func NewEmptyRun() *Run
NewEmptyRun returns a new, empty run which has only an ID set.
type RunParser ¶
type RunParser struct {
// contains filtered or unexported fields
}
RunParser can parse a list of Runs from a given byte slice.
func NewRunParser ¶
NewRunParser returns an initialized RunParser given the source-bytes.
func (*RunParser) Execute ¶
Execute starts the parser. The parser makes two passes over the given document. First, all <w:r> tags are located and marked. Then, inside those run tags, the <w:t> tags are located.
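The two passes can be sketched as follows. This is a simplified illustration, not the parser's actual implementation: the patterns are stricter variants written for this example, all type and function names are hypothetical, and the sketch assumes runs are not nested and ignores singleton run tags.

```go
package main

import (
	"fmt"
	"regexp"
)

// Simplified patterns for this sketch; they are not the package's regexes.
var (
	runOpen   = regexp.MustCompile(`<w:r(?:\s[^>]*)?>`)
	runClose  = regexp.MustCompile(`</w:r>`)
	textOpen  = regexp.MustCompile(`<w:t(?:\s[^>]*)?>`)
	textClose = regexp.MustCompile(`</w:t>`)
)

// span holds byte offsets into the document; End is exclusive.
type span struct{ Start, End int }

// parsedRun is a hypothetical record of one run and its optional text tags.
type parsedRun struct {
	Open, Close         span
	TextOpen, TextClose span
	HasText             bool
}

// findRuns is pass one: locate every run's opening and closing tag.
func findRuns(doc []byte) []parsedRun {
	var runs []parsedRun
	offset := 0
	for {
		o := runOpen.FindIndex(doc[offset:])
		if o == nil {
			break
		}
		openStart, openEnd := offset+o[0], offset+o[1]
		c := runClose.FindIndex(doc[openEnd:])
		if c == nil {
			break
		}
		closeStart, closeEnd := openEnd+c[0], openEnd+c[1]
		runs = append(runs, parsedRun{
			Open:  span{openStart, openEnd},
			Close: span{closeStart, closeEnd},
		})
		offset = closeEnd
	}
	return runs
}

// findTexts is pass two: inside each run body, locate the text tag pair.
func findTexts(doc []byte, runs []parsedRun) {
	for i := range runs {
		base := runs[i].Open.End
		body := doc[base:runs[i].Close.Start]
		o := textOpen.FindIndex(body)
		if o == nil {
			continue
		}
		c := textClose.FindIndex(body[o[1]:])
		if c == nil {
			continue
		}
		runs[i].TextOpen = span{base + o[0], base + o[1]}
		runs[i].TextClose = span{base + o[1] + c[0], base + o[1] + c[1]}
		runs[i].HasText = true
	}
}

// textOf returns the text between the text tags, or "" if the run has none.
func textOf(doc []byte, r parsedRun) string {
	if !r.HasText {
		return ""
	}
	return string(doc[r.TextOpen.End:r.TextClose.Start])
}

func main() {
	doc := []byte(`<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Hello</w:t></w:r>` +
		`<w:r><w:t xml:space="preserve"> World</w:t></w:r><w:r><w:br/></w:r></w:p>`)
	runs := findRuns(doc)
	findTexts(doc, runs)
	for i, r := range runs {
		fmt.Printf("run %d: hasText=%v text=%q\n", i, r.HasText, textOf(doc, r))
	}
}
```

Separating the passes keeps each step simple: the second pass only ever searches within the byte range of a single run, so a text tag cannot be attributed to the wrong run.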
func (*RunParser) Runs ¶
func (parser *RunParser) Runs() DocumentRuns
Runs returns all runs found by the parser.