docxology

package module
v0.0.0-...-b130db7
Warning

This package is not in the latest version of its module.

Published: Apr 6, 2019 License: MIT Imports: 10 Imported by: 2

README

Docxology

Golang Word Doc (.docx) file extractor and manipulator.

In progress and open to contributions, suggestions, etc.

How To

Info

.docx files are ZIP archives ("application/zip") that contain XML files. This package is intended to assist in extracting the XML and manipulating the data as you need. Go has everything you need built in to handle this, so this package aims not to replace that functionality but to make it more immediately user-friendly.
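
To illustrate the point that a .docx is an ordinary ZIP archive, here is a minimal sketch that uses only the standard library, not this package. It builds a small archive in memory with docx-like entry names and lists them. The helper names `buildSampleArchive` and `listEntries` are hypothetical, and the entry bodies are placeholders rather than valid Word XML.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// buildSampleArchive writes a tiny in-memory ZIP whose entry names mimic
// a .docx layout. The entry bodies are placeholders, not valid Word XML.
func buildSampleArchive() ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range []string{"[Content_Types].xml", "word/document.xml"} {
		f, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := f.Write([]byte("<placeholder/>")); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listEntries returns the name of every file stored in a ZIP archive.
func listEntries(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(r.File))
	for _, f := range r.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	data, err := buildSampleArchive()
	if err != nil {
		log.Fatal(err)
	}
	names, err := listEntries(data)
	if err != nil {
		log.Fatal(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

For a real document, the same `listEntries` logic should work on the bytes of any .docx file read from disk.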

Get
go get github.com/jwhittle933/docxology

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func DocxOnDiscUnzip

func DocxOnDiscUnzip(pathToFile string) error

DocxOnDiscUnzip is the entry function for reading Word .docx files. The pathToFile parameter is the path to a file stored on disk.

Types

type Document

type Document struct {
	Doc *zip.File
}

Document type for storing word/document.xml extracted from docx zip.

func (*Document) CopyToOS

func (d *Document) CopyToOS(filePath string) error

CopyToOS writes the contents of d *Document to disk.

func (*Document) XMLExtractText

func (d *Document) XMLExtractText() XMLDocMacroData

XMLExtractText extracts text from the document's XML and returns it as XMLDocMacroData.

type LambdaNoReturn

type LambdaNoReturn func(string) error

LambdaNoReturn func type

type LambdaReturn

type LambdaReturn func(string) (string error)

LambdaReturn func type

type Mapfunc

type Mapfunc func(string) error

Mapfunc type for Lambda to MapFiles

type UnZip

type UnZip struct {
	Reader *zip.Reader
	Files  []*zip.File
}

UnZip struct for handling extension methods on unzipped files.

func ExtractFileHTTP

func ExtractFileHTTP(fi *multipart.FileHeader) *UnZip

ExtractFileHTTP returns *UnZip.
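
How ExtractFileHTTP reads the upload is not shown in the source. One plausible approach, sketched here with the standard library only, relies on multipart.File implementing io.ReaderAt, which is what zip.NewReader requires. The names `zipFromUpload` and `sampleUpload` are hypothetical, and unlike ExtractFileHTTP this sketch returns an error and a close function.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
	"mime/multipart"
)

// zipFromUpload opens an uploaded file and wraps it in a *zip.Reader.
// The returned close function releases the upload once the caller is done.
func zipFromUpload(fh *multipart.FileHeader) (*zip.Reader, func() error, error) {
	f, err := fh.Open()
	if err != nil {
		return nil, nil, err
	}
	r, err := zip.NewReader(f, fh.Size)
	if err != nil {
		f.Close()
		return nil, nil, err
	}
	return r, f.Close, nil
}

// sampleUpload simulates a form upload of a small archive (scaffolding only).
func sampleUpload() (*multipart.FileHeader, error) {
	var archive bytes.Buffer
	zw := zip.NewWriter(&archive)
	if _, err := zw.Create("word/document.xml"); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}

	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	part, err := mw.CreateFormFile("file", "sample.docx")
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(archive.Bytes()); err != nil {
		return nil, err
	}
	if err := mw.Close(); err != nil {
		return nil, err
	}

	form, err := multipart.NewReader(&body, mw.Boundary()).ReadForm(1 << 20)
	if err != nil {
		return nil, err
	}
	return form.File["file"][0], nil
}

func main() {
	fh, err := sampleUpload()
	if err != nil {
		log.Fatal(err)
	}
	r, closeUpload, err := zipFromUpload(fh)
	if err != nil {
		log.Fatal(err)
	}
	defer closeUpload()
	for _, f := range r.File {
		fmt.Println(f.Name)
	}
}
```

In an HTTP handler, the *multipart.FileHeader would typically come from `(*http.Request).FormFile`.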

func ExtractLocalDocx

func ExtractLocalDocx(pathToFile string) (*UnZip, error)

ExtractLocalDocx returns *UnZip. The pathToFile parameter is the path to a file stored on disk.

func (*UnZip) FindDoc

func (f *UnZip) FindDoc(searchDoc string) (file *Document)

FindDoc locates a file by filename and returns a Document. The search name must include directories, e.g., word/document.xml, word/theme/theme1.xml.
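
The requirement to include directories follows from how ZIP entries are named: each entry's name is its full path inside the archive. The sketch below shows a lookup of that kind using the standard library only. `findEntry` and `sampleReader` are hypothetical names, and this is not necessarily how FindDoc itself is written.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// findEntry returns the archive entry whose full path equals name, or nil
// when no entry matches.
func findEntry(r *zip.Reader, name string) *zip.File {
	for _, f := range r.File {
		if f.Name == name {
			return f
		}
	}
	return nil
}

// sampleReader builds an in-memory archive with docx-like entry names.
func sampleReader() (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range []string{"word/document.xml", "word/theme/theme1.xml"} {
		if _, err := w.Create(name); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	data := buf.Bytes()
	return zip.NewReader(bytes.NewReader(data), int64(len(data)))
}

func main() {
	r, err := sampleReader()
	if err != nil {
		log.Fatal(err)
	}
	// The bare file name does not match; the full path does.
	fmt.Println(findEntry(r, "document.xml") != nil)
	fmt.Println(findEntry(r, "word/document.xml") != nil)
}
```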

func (*UnZip) MapFiles

func (f *UnZip) MapFiles(saveLocation string) error

MapFiles iterates through the zip.File slice and performs an operation on each file. TODO: Pass in Lambda to map over files.

type UnZipedFile

type UnZipedFile struct {
	File *zip.File
}

UnZipedFile struct for extending *zip.File

type XMLDocMacroData

type XMLDocMacroData struct {
	DocumentMeta xml.Name `xml:"document"`
	Text         string   `xml:"body>p>r>t"`
}

XMLDocMacroData struct for Unmarshalling xml.

https://www.loc.gov/preservation/digital/formats/fdd/fdd000397.shtml
http://officeopenxml.com/anatomyofOOXML.php
https://docs.microsoft.com/en-us/office/open-xml/structure-of-a-wordprocessingml-document

Notes from Microsoft: A WordprocessingML document is organized around the concept of stories. A story is a region of content in a WordprocessingML document.

The main document story of the simplest WordprocessingML document consists of the following XML elements:

document – The root element
body – The container for the collection of block-level structures
p – A paragraph
r – A run
t – A range of text
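
The document, body, p, r, t nesting maps directly onto an encoding/xml path tag such as `body>p>r>t`. As a minimal sketch, the hypothetical `docText` type below is a variant of XMLDocMacroData that collects every text run in a slice rather than a single string. The sample XML is a hand-written fragment, not output from Word.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"strings"
)

// docText is a hypothetical variant of XMLDocMacroData. The path tag walks
// document > body > p > r > t and appends each matching element to Runs.
type docText struct {
	XMLName xml.Name `xml:"document"`
	Runs    []string `xml:"body>p>r>t"`
}

// extractRuns unmarshals WordprocessingML and returns the text of each
// t element in document order.
func extractRuns(data []byte) ([]string, error) {
	var d docText
	if err := xml.Unmarshal(data, &d); err != nil {
		return nil, err
	}
	return d.Runs, nil
}

// sample is a hand-written stand-in for word/document.xml.
const sample = `<?xml version="1.0" encoding="UTF-8"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>
  </w:body>
</w:document>`

func main() {
	runs, err := extractRuns([]byte(sample))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(strings.Join(runs, ""))
}
```

Because the tags carry no namespace, encoding/xml matches on the local element names, so the `w:` prefix in the sample does not need to appear in the struct tags.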
