documentloaders (package)

Version: v0.0.0-...-56865d5. Published: Aug 21, 2023. License: MIT. Imports: 13. Imported by: 0.

Repository

github.com/csims-gr8/langchaingo

Documentation

Overview

Package documentloaders includes a standard interface for loading documents from a source and implementations of this interface.

Index

type CSV
- func NewCSV(r io.Reader, columns ...string) CSV
- func (c CSV) Load(_ context.Context) ([]schema.Document, error)
- func (c CSV) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
type HTML
- func NewHTML(r io.Reader) HTML
- func (h HTML) Load(_ context.Context) ([]schema.Document, error)
- func (h HTML) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
type Loader
type PDF
- func NewPDF(r io.ReaderAt, size int64, opts ...PDFOptions) PDF
- func (p PDF) Load(_ context.Context) ([]schema.Document, error)
- func (p PDF) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
type PDFOptions
- func WithPassword(password string) PDFOptions
type Text
- func NewText(r io.Reader) Text
- func (l Text) Load(_ context.Context) ([]schema.Document, error)
- func (l Text) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)

Types

type CSV

type CSV struct {
	// contains filtered or unexported fields
}

CSV represents a CSV document loader.

func NewCSV

func NewCSV(r io.Reader, columns ...string) CSV

NewCSV creates a new CSV loader from an io.Reader and an optional list of column names used to filter the loaded data.
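As a rough illustration of what filtering by column name could look like, the sketch below uses only the standard library's encoding/csv package. The helper names rowsWithColumns and rowsFromString, and the "name: value" output layout, are assumptions made for this example; they are not taken from the package's implementation.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// rowsWithColumns is a hypothetical helper. It treats the first CSV record as
// a header and renders every following record as "name: value" lines. When
// columns is non-empty, only the named columns are kept.
func rowsWithColumns(r io.Reader, columns ...string) ([]string, error) {
	rd := csv.NewReader(r)
	header, err := rd.Read()
	if err != nil {
		return nil, err
	}
	keep := make(map[string]bool, len(columns))
	for _, c := range columns {
		keep[c] = true
	}
	var rows []string
	for {
		rec, err := rd.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		var lines []string
		for i, v := range rec {
			// An empty filter keeps every column.
			if len(keep) > 0 && !keep[header[i]] {
				continue
			}
			lines = append(lines, fmt.Sprintf("%s: %s", header[i], v))
		}
		rows = append(rows, strings.Join(lines, "\n"))
	}
	return rows, nil
}

// rowsFromString is a convenience wrapper for in-memory data.
func rowsFromString(data string, columns ...string) ([]string, error) {
	return rowsWithColumns(strings.NewReader(data), columns...)
}

func main() {
	rows, err := rowsFromString("name,age,city\nAda,36,London\n", "name", "city")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, row := range rows {
		fmt.Println(row)
	}
}
```

With the real package, the equivalent call would pass the same reader and column names to NewCSV and then call Load.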

func (CSV) Load

func (c CSV) Load(_ context.Context) ([]schema.Document, error)

Load reads from the io.Reader and returns a single document with the data.

func (CSV) LoadAndSplit

func (c CSV) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)

LoadAndSplit reads text data from the io.Reader and splits it into multiple documents using a text splitter.

type HTML

type HTML struct {
	// contains filtered or unexported fields
}

HTML loads, parses, and sanitizes HTML content from an io.Reader.

func NewHTML

func NewHTML(r io.Reader) HTML

NewHTML creates a new HTML loader with an io.Reader.

func (HTML) Load

func (h HTML) Load(_ context.Context) ([]schema.Document, error)

Load reads from the io.Reader and returns a single document with the data.

func (HTML) LoadAndSplit

func (h HTML) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)

LoadAndSplit reads text data from the io.Reader and splits it into multiple documents using a text splitter.

type Loader

type Loader interface {
	// Load loads from a source and returns documents.
	Load(context.Context) ([]schema.Document, error)
	// LoadAndSplit loads from a source and splits the documents using a text splitter.
	LoadAndSplit(context.Context, textsplitter.TextSplitter) ([]schema.Document, error)
}

Loader is the interface for loading and splitting documents from a source.
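To make the contract concrete, here is a minimal sketch of a type that satisfies a Loader-shaped interface. To stay self-contained it declares local stand-ins for schema.Document and textsplitter.TextSplitter; the field names PageContent and Metadata and the SplitText method are assumptions about those packages, and StaticLoader, paragraphSplitter, and loadChunks are hypothetical names invented for this example.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Document is a local stand-in for schema.Document (assumed shape).
type Document struct {
	PageContent string
	Metadata    map[string]any
}

// TextSplitter is a local stand-in for textsplitter.TextSplitter (assumed shape).
type TextSplitter interface {
	SplitText(text string) ([]string, error)
}

// Loader mirrors the documented method set, using the stand-in types.
type Loader interface {
	Load(context.Context) ([]Document, error)
	LoadAndSplit(context.Context, TextSplitter) ([]Document, error)
}

// StaticLoader is a hypothetical loader that serves a fixed string.
type StaticLoader struct{ Text string }

// Load returns the whole text as a single document.
func (l StaticLoader) Load(_ context.Context) ([]Document, error) {
	return []Document{{PageContent: l.Text, Metadata: map[string]any{}}}, nil
}

// LoadAndSplit loads first, then turns every chunk into its own document.
// Chunks share the metadata map of the document they came from.
func (l StaticLoader) LoadAndSplit(ctx context.Context, s TextSplitter) ([]Document, error) {
	docs, err := l.Load(ctx)
	if err != nil {
		return nil, err
	}
	var out []Document
	for _, d := range docs {
		chunks, err := s.SplitText(d.PageContent)
		if err != nil {
			return nil, err
		}
		for _, c := range chunks {
			out = append(out, Document{PageContent: c, Metadata: d.Metadata})
		}
	}
	return out, nil
}

// paragraphSplitter is a toy splitter that cuts on blank lines.
type paragraphSplitter struct{}

func (paragraphSplitter) SplitText(text string) ([]string, error) {
	return strings.Split(text, "\n\n"), nil
}

// Compile-time check that StaticLoader satisfies Loader.
var _ Loader = StaticLoader{}

// loadChunks depends only on the interface, so any loader can be passed in.
func loadChunks(l Loader, s TextSplitter) ([]Document, error) {
	return l.LoadAndSplit(context.Background(), s)
}

func main() {
	docs, err := loadChunks(StaticLoader{Text: "first\n\nsecond"}, paragraphSplitter{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, d := range docs {
		fmt.Printf("chunk %d: %q\n", i, d.PageContent)
	}
}
```

Because callers such as loadChunks see only the interface, the CSV, HTML, PDF, and Text loaders can be swapped for one another without changing the calling code.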

type PDF

type PDF struct {
	// contains filtered or unexported fields
}

PDF loads text data from an io.ReaderAt.

func NewPDF

func NewPDF(r io.ReaderAt, size int64, opts ...PDFOptions) PDF

NewPDF creates a new PDF loader with an io.ReaderAt, the size of the PDF data in bytes, and optional PDFOptions.
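Unlike the other constructors, NewPDF takes an io.ReaderAt plus a size rather than a plain io.Reader. A minimal sketch of producing that pair from in-memory bytes, using only the standard library; readerAtAndSize is a hypothetical helper and not part of the package:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readerAtAndSize wraps a byte slice so it can be handed to a constructor
// that expects an io.ReaderAt and the total size in bytes.
func readerAtAndSize(data []byte) (io.ReaderAt, int64) {
	r := bytes.NewReader(data) // *bytes.Reader implements io.ReaderAt
	return r, r.Size()
}

func main() {
	r, size := readerAtAndSize([]byte("%PDF-1.7"))

	// Random access: read the first four bytes without consuming a stream.
	buf := make([]byte, 4)
	if _, err := r.ReadAt(buf, 0); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes, header %q\n", size, buf)
}
```

For a file on disk, an *os.File also satisfies io.ReaderAt, and the size can be taken from the FileInfo returned by its Stat method.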

func (PDF) Load

func (p PDF) Load(_ context.Context) ([]schema.Document, error)

Load reads the PDF data from the io.ReaderAt and returns the documents, each with metadata recording its page number and the total number of pages in the PDF.

func (PDF) LoadAndSplit

func (p PDF) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)

LoadAndSplit reads PDF data from the io.ReaderAt and splits it into multiple documents using a text splitter.

type PDFOptions

type PDFOptions func(pdf *PDF)

PDFOptions are options for the PDF loader.

func WithPassword

func WithPassword(password string) PDFOptions

WithPassword sets the password for the PDF.
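PDFOptions has the shape of Go's functional options pattern: each option is a function that mutates the loader while it is being constructed. The sketch below shows the pattern with stand-in types, since the real PDF struct's fields are unexported and not shown in this documentation. The names pdfLoader, option, withPassword, and newPDFLoader are hypothetical.

```go
package main

import "fmt"

// pdfLoader is a stand-in for the PDF loader; its field is an assumption.
type pdfLoader struct {
	password string
}

// option mirrors the documented shape func(pdf *PDF).
type option func(*pdfLoader)

// withPassword returns an option that records the password on the loader.
func withPassword(password string) option {
	return func(l *pdfLoader) {
		l.password = password
	}
}

// newPDFLoader builds a loader with defaults, then applies each option in order.
func newPDFLoader(opts ...option) pdfLoader {
	l := pdfLoader{}
	for _, opt := range opts {
		opt(&l)
	}
	return l
}

func main() {
	plain := newPDFLoader()
	protected := newPDFLoader(withPassword("s3cret"))
	fmt.Println(plain.password == "", protected.password == "s3cret")
}
```

A variadic options parameter lets callers omit it entirely for unencrypted files, and lets new options be added later without changing the constructor's signature.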

type Text

type Text struct {
	// contains filtered or unexported fields
}

Text loads text data from an io.Reader.

func NewText

func NewText(r io.Reader) Text

NewText creates a new text loader with an io.Reader.

func (Text) Load

func (l Text) Load(_ context.Context) ([]schema.Document, error)

Load reads from the io.Reader and returns a single document with the data.

func (Text) LoadAndSplit

func (l Text) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)

LoadAndSplit reads text data from the io.Reader and splits it into multiple documents using a text splitter.
