lib

package
v1.0.0
Published: Sep 24, 2024 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func DOC2Text

func DOC2Text(r io.Reader) (io.Reader, error)

DOC2Text reads a Microsoft Word .doc binary file from a standard io.Reader and returns a reader (actually a bytes.Buffer) that yields the plain text found in the .doc file.

func ExtractText

func ExtractText(r io.Reader) (string, error)

ExtractText parses the PPT file represented by the Reader r and extracts the text from it.

func IsFileDOC

func IsFileDOC(data []byte) bool

IsFileDOC checks whether the data indicates a DOC file. DOC has multiple signatures according to https://filesignatures.net/index.php?search=doc&mode=EXT, including D0 CF 11 E0 A1 B1 1A E1.

func IsFileXLS

func IsFileXLS(data []byte) bool

IsFileXLS checks whether the data indicates an XLS file. XLS has a signature of D0 CF 11 E0 A1 B1 1A E1.
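The signature test described for IsFileDOC and IsFileXLS amounts to a prefix comparison against the eight bytes quoted above. A minimal sketch follows, assuming a plain prefix match; ole2Magic and hasOLE2Signature are hypothetical names, and the package's actual implementation may differ (for example, it may check additional DOC signatures).

```go
package main

import (
	"bytes"
	"fmt"
)

// ole2Magic holds the 8-byte signature quoted in the documentation:
// D0 CF 11 E0 A1 B1 1A E1. (Hypothetical variable name.)
var ole2Magic = []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}

// hasOLE2Signature reports whether data starts with the signature.
// Hypothetical helper, not the package's own code.
func hasOLE2Signature(data []byte) bool {
	return bytes.HasPrefix(data, ole2Magic)
}

func main() {
	// A buffer that begins with the signature, followed by arbitrary bytes.
	sample := append(append([]byte{}, ole2Magic...), 0x00, 0x01)

	fmt.Println(hasOLE2Signature(sample))               // true
	fmt.Println(hasOLE2Signature([]byte("PK\x03\x04"))) // false
}
```

Since the documentation lists the same bytes for both DOC and XLS, a match on this signature alone cannot tell the two formats apart.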

func XLS2Cells

func XLS2Cells(reader io.ReadSeeker) (cells []string, err error)

XLS2Cells converts an XLS file to its individual cells, returned as a slice of strings.

func XLS2Text

func XLS2Text(reader io.ReadSeeker) (string, error)

XLS2Text extracts text from an Excel sheet and returns it as a string. The whole Excel file is required even for partial text extraction. The original comment also describes a size parameter (the maximum number of bytes, not characters, to write out) and a bytes-written return value; neither appears in the signature shown here. According to that comment, a corrupted or invalid file produces no error and no output.

Types

This section is empty.

Directories

Path Synopsis
pdf
Package pdf implements reading of PDF files.
The xls package is used to parse Microsoft Excel 97-2004 files (".xls" suffix, NOT ".xlsx" suffix).
