docxlib

v0.0.0-...-d8f39ce
Published: May 17, 2021 License: MIT Imports: 7 Imported by: 2

README

Docx library

Yet another library to read and write .docx (a.k.a. Microsoft Word documents or ECMA-376 Office Open XML) files in Go.

Introduction

As part of my work for Basement Crowd and FromCounsel, I needed a basic library to manipulate (both read and write) Microsoft Word documents.

It differs from other projects as follows:

  • UniOffice is probably the most complete, but it is commercial (you need to pay) and does far more than I need.

  • gingfrederik/docx only supports writing.

There are also a couple of other projects: kingzbauer/docx and nguyenthenguyen/docx.

gingfrederik/docx was a heavy influence (the original structures and the main method come from that project).

However, those original structures didn't handle reading, and extending them was particularly difficult because Go's XML parser is somewhat limited (including a six-year-old bug).

Additionally, my requirements go beyond the original structure and a hard fork seemed more sensible.

The plan is to evolve the library, so the API is likely to change according to my company's needs. But please do feel free to send patches, reports and PRs (or fork).

In the meantime, it is shared as an example in case somebody finds it useful.

Getting Started

Install

Go modules supported

go get github.com/gonfva/docxlib

Usage

See main for an example

$ go build -o docxlib ./main
$ ./docxlib
Preparing new document to write at /tmp/new-file.docx
Document writen.
Now trying to read it
	We've found a new run with the text ->test
	We've found a new run with the text ->test font size
	We've found a new run with the text ->test color
	We've found a new run with the text ->test font size and color
	We've found a new hyperlink with ref http://google.com and the text google
End of main

You can also increase the log level (-logtostderr=true -v=0) and just dump a specific file (-file /tmp/new-file.docx). See getstructure/main

$ go build -o docxlib ./getstructure/ && ./docxlib -logtostderr=true -v=0 -file /tmp/new-file.docx
I0511 12:37:40.898493   18466 unpack.go:69] Relations: [...]
I0511 12:37:40.898787   18466 unpack.go:47] Doc: [...]
I0511 12:37:40.899330   18466 unpack.go:58] Paragraph [0xc000026d40 0xc000027d00 0xc000172340]
I0511 12:37:40.899369   18466 main.go:31] There is a new paragraph [...]
	We've found a new run with the text ->test
	We've found a new run with the text ->test font size
	We've found a new run with the text ->test color
I0511 12:37:40.899389   18466 main.go:31] There is a new paragraph [...]
	We've found a new run with the text ->test font size and color
I0511 12:37:40.899396   18466 main.go:31] There is a new paragraph [...]
	We've found a new hyperlink with ref http://google.com and the text google
End of main

Build

$ go build ./...

License

MIT. See LICENSE

Documentation

Constants

const (
	TEMP_REL = `` /* 601-byte string literal not displayed */

	TEMP_DOCPROPS_APP  = `` /* 276-byte string literal not displayed */
	TEMP_DOCPROPS_CORE = `` /* 363-byte string literal not displayed */
	TEMP_CONTENT       = `` /* 934-byte string literal not displayed */

	TEMP_WORD_STYLE = `` /* 1743-byte string literal not displayed */

	TEMP_WORD_THEME_THEME = `` /* 9767-byte string literal not displayed */

)

const (
	XMLNS_W = `http://schemas.openxmlformats.org/wordprocessingml/2006/main`
	XMLNS_R = `http://schemas.openxmlformats.org/officeDocument/2006/relationships`
)

const (
	XMLNS         = `http://schemas.openxmlformats.org/package/2006/relationships`
	REL_HYPERLINK = `http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink`

	REL_TARGETMODE = "External"
)

const (
	HYPERLINK_STYLE = "a1"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Body

type Body struct {
	XMLName    xml.Name     `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main body"`
	Paragraphs []*Paragraph `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main p"`
}

type Color

type Color struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main color"`
	Val     string   `xml:"w:val,attr"`
}

Color contains the sound of music. :D I'm kidding. It contains the color

type Document

type Document struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main document"`
	XMLW    string   `xml:"xmlns:w,attr"`
	XMLR    string   `xml:"xmlns:r,attr"`
	Body    *Body
}

type DocxLib

type DocxLib struct {
	Document    Document
	DocRelation Relationships
	// contains filtered or unexported fields
}

DocxLib is the structure that gives access to the in-memory representation of the doc (either read or about to be written)

func New

func New() *DocxLib

New generates a new empty docx file that we can manipulate and, later on, save

func Parse

func Parse(reader io.ReaderAt, size int64) (doc *DocxLib, err error)

Parse generates a new docx file in memory from a reader. You can invoke it from a file:

readFile, err := os.Open(FILE_PATH)
if err != nil {
	panic(err)
}
fileinfo, err := readFile.Stat()
if err != nil {
	panic(err)
}
size := fileinfo.Size()
doc, err := docxlib.Parse(readFile, int64(size))

You can also invoke it from a web form (BEWARE of trusting user data!):

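func uploadFile(w http.ResponseWriter, r *http.Request) {
	// Keep up to 10 MiB of the upload in memory; the rest goes to temporary files.
	if err := r.ParseMultipartForm(10 << 20); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	file, handler, err := r.FormFile("file")
	if err != nil {
		fmt.Println("Error retrieving the file")
		fmt.Println(err)
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	defer file.Close()

	// multipart.File implements io.ReaderAt, so it can be passed directly.
	if _, err := docxlib.Parse(file, handler.Size); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
}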
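
Parse takes an io.ReaderAt plus a size, which is the same pair that archive/zip's NewReader expects (a .docx file is a ZIP package). The sketch below uses only the standard library and does not call docxlib; it shows one way to obtain that pair from bytes already in memory. The helper names readerAtFromBytes, buildPackage and partNames are hypothetical.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// readerAtFromBytes wraps in-memory data in the (io.ReaderAt, size) pair.
func readerAtFromBytes(data []byte) (io.ReaderAt, int64) {
	return bytes.NewReader(data), int64(len(data))
}

// buildPackage creates a tiny ZIP archive with a single part.
// It is only a stand-in for a real .docx file.
func buildPackage(name, content string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, content); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// partNames lists the parts of a ZIP package read through an io.ReaderAt.
func partNames(r io.ReaderAt, size int64) ([]string, error) {
	zr, err := zip.NewReader(r, size)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(zr.File))
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	data, err := buildPackage("word/document.xml", "<w:document/>")
	if err != nil {
		panic(err)
	}
	r, size := readerAtFromBytes(data)
	names, err := partNames(r, size)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

With real document bytes, the same pair could be handed to docxlib.Parse(r, size).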

func (*DocxLib) AddParagraph

func (f *DocxLib) AddParagraph() *Paragraph

AddParagraph adds a new paragraph

func (*DocxLib) Paragraphs

func (f *DocxLib) Paragraphs() []*Paragraph

func (*DocxLib) References

func (f *DocxLib) References(id string) (href string, err error)

References gets the URL for a reference
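
A hyperlink in the document body carries only a relationship ID; the URL itself lives in the relationships part. The sketch below illustrates that kind of lookup with the standard library alone, using local copies of the Relationship and Relationships structs documented further down. The parseRels and resolve helpers are hypothetical and are not claimed to be how References is implemented.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local copies of the documented structs, so the sketch compiles on its own.
type Relationship struct {
	XMLName    xml.Name `xml:"Relationship"`
	ID         string   `xml:"Id,attr"`
	Type       string   `xml:"Type,attr"`
	Target     string   `xml:"Target,attr"`
	TargetMode string   `xml:"TargetMode,attr,omitempty"`
}

type Relationships struct {
	XMLName       xml.Name        `xml:"Relationships"`
	Xmlns         string          `xml:"xmlns,attr"`
	Relationships []*Relationship `xml:"Relationship"`
}

// relsXML is a made-up relationships part with one external hyperlink.
const relsXML = `<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">` +
	`<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Target="http://google.com" TargetMode="External"/>` +
	`</Relationships>`

// parseRels decodes a relationships part.
func parseRels(src string) (*Relationships, error) {
	var rels Relationships
	if err := xml.Unmarshal([]byte(src), &rels); err != nil {
		return nil, err
	}
	return &rels, nil
}

// resolve returns the target of the relationship with the given ID.
func resolve(rels *Relationships, id string) (string, error) {
	for _, rel := range rels.Relationships {
		if rel != nil && rel.ID == id {
			return rel.Target, nil
		}
	}
	return "", fmt.Errorf("no relationship with id %q", id)
}

func main() {
	rels, err := parseRels(relsXML)
	if err != nil {
		panic(err)
	}
	href, err := resolve(rels, "rId1")
	if err != nil {
		panic(err)
	}
	fmt.Println(href)
}
```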

func (*DocxLib) Write

func (f *DocxLib) Write(writer io.Writer) (err error)

Write saves a docx to a writer

type Hyperlink

type Hyperlink struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main hyperlink,omitempty"`
	ID      string   `xml:"http://schemas.openxmlformats.org/officeDocument/2006/relationships id,attr"`
	Run     Run
}

The hyperlink element contains links

func (*Hyperlink) UnmarshalXML

func (r *Hyperlink) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

type Paragraph

type Paragraph struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main p"`
	Data    []ParagraphChild
	// contains filtered or unexported fields
}

func (*Paragraph) AddLink

func (p *Paragraph) AddLink(text string, link string) *Hyperlink

AddLink adds a hyperlink to the paragraph

func (*Paragraph) AddText

func (p *Paragraph) AddText(text string) *Run

AddText adds text to paragraph

func (*Paragraph) Children

func (p *Paragraph) Children() (ret []ParagraphChild)

func (*Paragraph) UnmarshalXML

func (p *Paragraph) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

type ParagraphChild

type ParagraphChild struct {
	Link       *Hyperlink     `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main hyperlink,omitempty"`
	Run        *Run           `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main r,omitempty"`
	Properties *RunProperties `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main rPr,omitempty"`
}

type Relationship

type Relationship struct {
	XMLName    xml.Name `xml:"Relationship"`
	ID         string   `xml:"Id,attr"`
	Type       string   `xml:"Type,attr"`
	Target     string   `xml:"Target,attr"`
	TargetMode string   `xml:"TargetMode,attr,omitempty"`
}

type Relationships

type Relationships struct {
	XMLName       xml.Name        `xml:"Relationships"`
	Xmlns         string          `xml:"xmlns,attr"`
	Relationships []*Relationship `xml:"Relationship"`
}

type Run

type Run struct {
	XMLName       xml.Name       `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main r,omitempty"`
	RunProperties *RunProperties `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main rPr,omitempty"`
	InstrText     string         `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main instrText,omitempty"`
	Text          *Text
}

A Run is part of a paragraph that has its own style. It could be a piece of text in bold, or a link

func (*Run) Color

func (r *Run) Color(color string) *Run

Color sets the run color

func (*Run) Size

func (r *Run) Size(size int) *Run

Size sets the run size

func (*Run) UnmarshalXML

func (r *Run) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

type RunProperties

type RunProperties struct {
	XMLName  xml.Name  `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main rPr,omitempty"`
	Color    *Color    `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main color,omitempty"`
	Size     *Size     `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main sz,omitempty"`
	RunStyle *RunStyle `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main rStyle,omitempty"`
	Style    *Style    `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main pStyle,omitempty"`
}

RunProperties encapsulates visual properties of a run

type RunStyle

type RunStyle struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main rStyle,omitempty"`
	Val     string   `xml:"w:val,attr"`
}

RunStyle contains styling for a run

func (*RunStyle) UnmarshalXML

func (r *RunStyle) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

type Size

type Size struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main sz"`
	Val     int      `xml:"w:val,attr"`
}

Size contains the font size
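
In WordprocessingML the w:sz value is expressed in half-points, so a 12 pt font is written as 24. Assuming the library writes Val through unchanged (an assumption; the documentation here does not say), hypothetical helpers like the ones below convert between points and the stored value.

```go
package main

import (
	"fmt"
	"math"
)

// halfPoints converts a font size in points to the half-point unit used by w:sz.
func halfPoints(points float64) int {
	return int(math.Round(points * 2))
}

// points converts a w:sz half-point value back to points.
func points(halfPts int) float64 {
	return float64(halfPts) / 2
}

func main() {
	fmt.Println(halfPoints(12), halfPoints(10.5), points(21))
}
```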

type Style

type Style struct {
	XMLName xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main pStyle,omitempty"`
	Val     string   `xml:"w:val,attr"`
}

Style contains styling for a paragraph

type Text

type Text struct {
	XMLName  xml.Name `xml:"http://schemas.openxmlformats.org/wordprocessingml/2006/main t"`
	XMLSpace string   `xml:"xml:space,attr,omitempty"`
	Text     string   `xml:",chardata"`
}

The Text object contains the actual text

func (*Text) UnmarshalXML

func (r *Text) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error
