textify

Version: v1.0.1
Published: Sep 24, 2017 License: Apache-2.0 Imports: 2 Imported by: 0

README

Textify

Turns popular document formats into plain text for searching, indexing, etc.

Purpose

I began working on LockedArchive and really wanted a golang package that could extract text from PDF files.

There are some interesting packages available (such as UniDoc and docconv), but I prefer avoiding any non-golang dependencies and like the freedom of the Apache-2.0 license.

Install

go get -u github.com/jonathan-robertson/textify

Acknowledgements

  • Russ Cox for his incredibly helpful pdf package. I forked it so that it returns spaces, which the original excluded from its output.

Documentation

Overview

Package textify extracts usable text from PDF files. github.com/jonathan-robertson/pdf is a fork of rsc.io/pdf.

Constants

This section is empty.

Variables

This section is empty.

Functions

func PDF

func PDF(filename string, newlineDelimiter string) (text string, err error)

PDF reads a PDF file and returns its text.
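Since PDF takes a file path, code built on top of it can be easier to test if the extractor is passed in as a function value. The sketch below is only an illustration under stated assumptions: the Extractor type, the ExtractLines helper, and the fakePDF stand-in are hypothetical names that are not part of this package, and it assumes newlineDelimiter is the string placed between lines of the extracted text. In real use, textify.PDF could be passed in place of fakePDF, since it has the same signature.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// Extractor mirrors the signature of textify.PDF so that either the real
// function or a stand-in can be supplied. The type name is hypothetical.
type Extractor func(filename string, newlineDelimiter string) (text string, err error)

// ExtractLines is a hypothetical helper: it calls extract with "\n" as the
// delimiter and returns the non-blank lines with surrounding whitespace trimmed.
func ExtractLines(extract Extractor, filename string) ([]string, error) {
	text, err := extract(filename, "\n")
	if err != nil {
		return nil, fmt.Errorf("extracting %s: %w", filename, err)
	}
	var lines []string
	for _, line := range strings.Split(text, "\n") {
		if trimmed := strings.TrimSpace(line); trimmed != "" {
			lines = append(lines, trimmed)
		}
	}
	return lines, nil
}

// fakePDF stands in for textify.PDF so the sketch runs without a PDF file.
// It ignores filename and returns fixed text joined by the delimiter.
func fakePDF(filename string, newlineDelimiter string) (string, error) {
	return strings.Join([]string{"First line ", "", "  Second line"}, newlineDelimiter), nil
}

func main() {
	lines, err := ExtractLines(fakePDF, "testing/test.pdf")
	if err != nil {
		log.Fatal(err)
	}
	for i, line := range lines {
		fmt.Printf("%d: %s\n", i+1, line)
	}
}
```

Passing the extractor as a parameter keeps the line-splitting logic independent of any particular PDF file, which is a design choice of this sketch rather than something the package prescribes.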

Example
package main

import (
	"log"

	"github.com/jonathan-robertson/textify"
)

func main() {
	text, err := textify.PDF("testing/test.pdf", "\n")
	if err != nil {
		log.Fatal(err)
	}
	log.Println(text)
}
Output:

Types

This section is empty.
