docparser

package
v2.3.0 Latest
Warning

This package is not in the latest version of its module.

Published: Aug 28, 2024 | License: GPL-3.0 | Imports: 11 | Imported by: 0

Documentation

Overview

Package docparser implements a parser for legacy MS Word documents (.doc). It depends on the wvWare tool, which must be installed.

The metadata parser is mainly taken from https://github.com/sajari/docconv/blob/master/doc.go

Constants

This section is empty.

Variables

var Initialized bool

Initialized reports whether the package is usable, which depends on the wvWare tool being present.
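The page does not show how Initialized is computed. As a minimal sketch of the general idea, a program can check for an external tool on PATH before trying to parse anything. The helper toolAvailable and the binary name "wvWare" are assumptions made for illustration, not part of this package.

```go
package main

import (
	"fmt"
	"os/exec"
)

// toolAvailable reports whether an executable with the given name can be
// found on PATH. It is a hypothetical helper, not part of docparser.
func toolAvailable(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	// "wvWare" is an assumed binary name; the page does not state which
	// executable the package actually looks for.
	if !toolAvailable("wvWare") {
		fmt.Println("wvWare not found; .doc parsing would be unavailable")
		return
	}
	fmt.Println("wvWare found")
}
```

With the real package, callers would presumably read the exported Initialized variable instead of repeating such a check.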

Functions

This section is empty.

Types

type DocMetadata

type DocMetadata struct {
	Author   string `json:"author,omitempty"`
	Category string `json:"category,omitempty"`
	Comment  string `json:"comment,omitempty"`
	Company  string `json:"company,omitempty"`
	Keywords string `json:"keywords,omitempty"`
	Manager  string `json:"manager,omitempty"`
	Subject  string `json:"subject,omitempty"`
	Title    string `json:"title,omitempty"`

	Created   *time.Time `json:"created,omitempty"`
	Modified  *time.Time `json:"modified,omitempty"`
	PageCount int32      `json:"page_count,omitempty"`
	CharCount int32      `json:"char_count,omitempty"`
	WordCount int32      `json:"word_count,omitempty"`
}
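Every field carries an omitempty JSON tag, so empty strings, zero counts, and nil time pointers are left out of the encoded output. The sketch below copies the struct locally so it compiles on its own. The encode helper and the sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// DocMetadata is a local copy of the type documented above, repeated here
// only so that the example is self-contained.
type DocMetadata struct {
	Author   string `json:"author,omitempty"`
	Category string `json:"category,omitempty"`
	Comment  string `json:"comment,omitempty"`
	Company  string `json:"company,omitempty"`
	Keywords string `json:"keywords,omitempty"`
	Manager  string `json:"manager,omitempty"`
	Subject  string `json:"subject,omitempty"`
	Title    string `json:"title,omitempty"`

	Created   *time.Time `json:"created,omitempty"`
	Modified  *time.Time `json:"modified,omitempty"`
	PageCount int32      `json:"page_count,omitempty"`
	CharCount int32      `json:"char_count,omitempty"`
	WordCount int32      `json:"word_count,omitempty"`
}

// encode marshals the metadata to JSON and returns it as a string.
// It returns an empty string if marshalling fails.
func encode(m DocMetadata) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	created := time.Date(2024, time.August, 28, 0, 0, 0, 0, time.UTC)
	m := DocMetadata{Author: "Jane Doe", Title: "Report", Created: &created, PageCount: 3}
	// Only the four populated fields appear in the output.
	fmt.Println(encode(m))
	// prints {"author":"Jane Doe","title":"Report","created":"2024-08-28T00:00:00Z","page_count":3}
}
```

One consequence of omitempty is that a document with a genuine count of zero is indistinguishable from one whose count is unknown.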

type WordDoc

type WordDoc struct {
	// contains filtered or unexported fields
}

func NewFromBytes

func NewFromBytes(data []byte) (doc *WordDoc, err error)

func NewFromStream

func NewFromStream(stream io.ReadCloser) (doc *WordDoc, err error)
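The two constructors differ only in their input: a byte slice or an io.ReadCloser. A caller holding bytes can obtain an io.ReadCloser with the standard library, as sketched below. The helpers asStream and roundTrip are hypothetical, and the page does not say whether NewFromStream closes the stream it receives.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// asStream wraps an in-memory byte slice as an io.ReadCloser, the input
// type NewFromStream expects. Close on the result does nothing.
func asStream(data []byte) io.ReadCloser {
	return io.NopCloser(bytes.NewReader(data))
}

// roundTrip reads the whole stream back, showing the opposite conversion
// (stream to bytes) that a caller would need before using NewFromBytes.
func roundTrip(data []byte) (string, error) {
	rc := asStream(data)
	defer rc.Close()
	b, err := io.ReadAll(rc)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := roundTrip([]byte("raw .doc bytes"))
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(s))
}
```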

func (*WordDoc) Close

func (d *WordDoc) Close()

Close is a no-op.

func (*WordDoc) Metadata

func (d *WordDoc) Metadata() DocMetadata

func (*WordDoc) MetadataMap

func (d *WordDoc) MetadataMap() map[string]string

func (*WordDoc) StreamText

func (d *WordDoc) StreamText(w io.Writer)

func (*WordDoc) Text

func (d *WordDoc) Text() string
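Code that consumes extracted text can depend on a small interface rather than on *WordDoc directly, which makes it testable without wvWare installed. In the sketch below, textSource mirrors the documented signatures of Text, StreamText, and Close. The fakeDoc type and countWords function are hypothetical stand-ins, not part of the package.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// textSource lists the text-related methods documented for *WordDoc.
// A *WordDoc should satisfy it, given the signatures shown above.
type textSource interface {
	Text() string
	StreamText(w io.Writer)
	Close()
}

// fakeDoc is a hypothetical stand-in used so the example runs without
// the real package or the wvWare tool.
type fakeDoc struct{ body string }

func (f *fakeDoc) Text() string           { return f.body }
func (f *fakeDoc) StreamText(w io.Writer) { io.WriteString(w, f.body) }
func (f *fakeDoc) Close()                 {}

// countWords streams the document text into a buffer and counts
// whitespace-separated words.
func countWords(d textSource) int {
	defer d.Close()
	var buf bytes.Buffer
	d.StreamText(&buf)
	return len(strings.Fields(buf.String()))
}

func main() {
	d := &fakeDoc{body: "legacy Word documents"}
	fmt.Println(countWords(d)) // prints 3
}
```

StreamText suits large documents or direct output to a file or network connection, while Text is convenient when the whole string is needed in memory.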
