docx

v1.0.0 Latest
Published: Sep 5, 2021 License: MIT Imports: 4 Imported by: 2

README

go-read-docx

A simple way to read parts of a docx file in Go.

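package main

import (
	"fmt"

	docx "github.com/khnom5000/go-read-docx"
)

func main() {
	// GetDocument returns (Document, error), as listed in the reference below.
	d, err := docx.GetDocument("./TestDocument.docx")
	if err != nil {
		panic(err)
	}
	// Use d so the program compiles; the examples below read more from it.
	fmt.Println("Paragraphs read:", len(d.Body.Paragraphs))
}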
Example - Read text in a docx:

Code:

...
ps := d.Body.Paragraphs
for i, p := range ps {
	fmt.Println("Para:", i, p)
}
...
Example - Read a table that's inside a docx:

Input:

+-----+---+---+----+
|   1 | 2 | 3 |  4 |
|   8 | 8 | 8 | 66 |
| 123 | 1 | 1 |  1 |
|     |   |   |    |
+-----+---+---+----+

Code:

...
t := d.Body.Tables[0].TableRows
var table [][]string
for _, r := range t {
	var row []string
	for _, c := range r.TableColumns {
		row = append(row, c.Cell)
	}
	table = append(table, row)
}
fmt.Println(table)
...

Output:

[[1 2 3 4] [8 8 8 66] [123 1 1 1] [   ]]
Example - Read more than one table in the same docx:

Input:

+-----+-----+-----+-----+
|   1 |   2 |   3 |   4 |
|   8 |   8 |   8 |  66 |
| 123 |   1 |   1 |   1 |
|     |     |     |     |
+-----+-----+-----+-----+
...
+-----+-----+-----+-----+
|   7 |   8 |   9 |   0 |
|   0 |  33 |  66 |  99 |
| 123 | 100 | 100 | 100 |
|     |     |     |     |
+-----+-----+-----+-----+

Code:

...
ts := d.Body.Tables
for _, t := range ts {
	var table [][]string
	for _, tr := range t.TableRows {
		var row []string
		for _, tc := range tr.TableColumns {
			row = append(row, tc.Cell)
		}
		table = append(table, row)
	}
	fmt.Println(table)
}
...

Output:

[[1 2 3 4] [8 8 8 66] [123 1 1 1] [   ]]
[[7 8 9 0] [0 33 66 99] [123 100 100 100] [   ]]
Example - Get the Headers:
...
h, err := docx.GetHeader("./TestDocument.docx")
if err != nil {
	panic(err)
}
fmt.Println(h.Text)
...

The same approach works for footers: call GetFooter() instead of GetHeader().

Examples

Run the example code with the test document

From the go-read-docx folder, run:

go run ./examples/docWrapper.go

Output:

Show all paragraphs
Para: 0 Start of page one.
Para: 1 This is a table.
Para: 2 This is the second table.
Para: 3 This is the end of the doc!
Show first table
[[1 2 3 4] [8 8 8 66] [123 1 1 1] [   ]]
Show all tables
[[1 2 3 4] [8 8 8 66] [123 1 1 1] [   ]]
[[7 8 9 0] [0 33 66 99] [123 100 100 100] [   ]]
Show Header
This is a header.
Show Footer
This is a footer.


Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Body

type Body struct {
	Paragraphs []string `xml:"p>r>t"`
	Tables     []Table  `xml:"tbl"`
}

Body : Elements found within the body section of a Word docx

type Document

type Document struct {
	XMLName xml.Name `xml:"document"`
	Body    Body     `xml:"body"`
}

Document : Holds the elements read from a docx document part (document.xml)

func GetDocument added in v0.2.0

func GetDocument(f string) (Document, error)

GetDocument : takes a file path and returns the Document{} struct

type Footer

type Footer struct {
	XMLName xml.Name `xml:"ftr"`
	Text    string   `xml:"p>r>t"`
}

Footer : Holds the elements read from a docx footer part (footer1.xml)

func GetFooter added in v0.3.0

func GetFooter(f string) (Footer, error)

GetFooter : takes a file path and returns the Footer{} struct

type Header

type Header struct {
	XMLName xml.Name `xml:"hdr"`
	Text    string   `xml:"p>r>t"`
}

Header : Holds the elements read from a docx header part (header1.xml)

func GetHeader added in v0.3.0

func GetHeader(f string) (Header, error)

GetHeader : takes a file path and returns the Header{} struct

type Table

type Table struct {
	TableRows []TableRow `xml:"tr"`
}

Table : Holds the elements read from a tbl element

type TableColumn

type TableColumn struct {
	Cell string `xml:"p>r>t"`
}

TableColumn : Holds the elements read from a tc element

type TableRow

type TableRow struct {
	TableColumns []TableColumn `xml:"tc"`
}

TableRow : Holds the elements read from a tr element
