Documentation ¶
Overview ¶
Package documentloader provides functionality for loading and processing documents.
Index ¶
- Variables
- type CSV
- type CSVOptions
- type FileFilter
- type Git
- func NewGit(r *git.Repository, optFns ...func(o *GitOptions)) *Git
- func NewGitFromCloneURL(url string, optFns ...func(o *GitCloneURLOptions)) (*Git, error)
- func NewGitFromCodeCommitURL(url string, creds aws.Credentials, optFns ...func(o *GitOptions)) (*Git, error)
- func NewGitFromPath(path string, optFns ...func(o *GitOptions)) (*Git, error)
- type GitCloneURLOptions
- type GitOptions
- type Text
Variables ¶
var DefaultGitOptions = GitOptions{
	Branch:     "main",
	FileFilter: func(f *object.File) bool { return true },
}
DefaultGitOptions provides the default Git options: the "main" branch and a file filter that accepts every file.
Types ¶
type CSV ¶ added in v0.0.39
type CSV struct {
// contains filtered or unexported fields
}
CSV represents a CSV document loader.
func NewCSV ¶ added in v0.0.39
func NewCSV(r io.Reader, optFns ...func(o *CSVOptions)) *CSV
NewCSV creates a CSV loader that reads from the given io.Reader and returns a pointer to it. Optional functions in optFns adjust the CSVOptions.
func (*CSV) LoadAndSplit ¶ added in v0.0.39
func (l *CSV) LoadAndSplit(ctx context.Context, splitter schema.TextSplitter) ([]schema.Document, error)
LoadAndSplit loads CSV documents from the provided reader and splits them using the specified text splitter.
type CSVOptions ¶ added in v0.0.41
type CSVOptions struct {
	// Separator is the rune used to separate fields in the CSV file.
	Separator rune
	// LazyQuotes controls whether the CSV reader should use lazy quotes mode.
	LazyQuotes bool
	// Columns is a list of column names to filter and include in the loaded documents.
	Columns []string
}
CSVOptions contains options for configuring the CSV loader.
type FileFilter ¶ added in v0.0.52
type FileFilter func(f *object.File) bool
FileFilter is a function that filters files based on specific criteria.
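A filter of this shape can be built from a closure. The sketch below assumes, based on DefaultGitOptions, that returning true means the file is accepted. The file type is a local stand-in for go-git's object.File that models only its Name field, and byExtension is a hypothetical helper, not part of this package.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// file is a stand-in for go-git's *object.File; only Name is modeled.
type file struct {
	Name string
}

// fileFilter mirrors the shape of FileFilter, using the stand-in type.
type fileFilter func(f *file) bool

// byExtension returns a filter that accepts files whose extension
// (compared case-insensitively) is one of exts.
func byExtension(exts ...string) fileFilter {
	allowed := make(map[string]bool, len(exts))
	for _, e := range exts {
		allowed[strings.ToLower(e)] = true
	}
	return func(f *file) bool {
		return allowed[strings.ToLower(filepath.Ext(f.Name))]
	}
}

func main() {
	filter := byExtension(".go", ".md")
	for _, name := range []string{"main.go", "README.md", "logo.png"} {
		fmt.Println(name, filter(&file{Name: name}))
	}
}
```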
type Git ¶ added in v0.0.52
type Git struct {
// contains filtered or unexported fields
}
Git is a Git-based implementation of the DocumentLoader interface.
func NewGit ¶ added in v0.0.52
func NewGit(r *git.Repository, optFns ...func(o *GitOptions)) *Git
NewGit creates a Git document loader from an existing Git repository and returns it. Pass functional options to customize the loader.
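The functional-options pattern used by these constructors can be sketched as follows. This is a minimal illustration with local stand-in types (gitOptions, loader, newLoader), and it assumes the constructor starts from a copy of the defaults before applying each option function. The package's actual implementation may differ.

```go
package main

import "fmt"

// gitOptions is a stand-in for GitOptions with only the Branch field.
type gitOptions struct {
	Branch string
}

// defaultGitOptions plays the role of DefaultGitOptions in this sketch.
var defaultGitOptions = gitOptions{Branch: "main"}

// loader is a hypothetical stand-in for the Git loader.
type loader struct {
	opts gitOptions
}

// newLoader copies the defaults, then applies each option function in order,
// so later functions override earlier ones.
func newLoader(optFns ...func(o *gitOptions)) *loader {
	opts := defaultGitOptions // value copy; the package-level defaults stay untouched
	for _, fn := range optFns {
		fn(&opts)
	}
	return &loader{opts: opts}
}

func main() {
	a := newLoader()
	b := newLoader(func(o *gitOptions) { o.Branch = "develop" })
	fmt.Println(a.opts.Branch, b.opts.Branch)
}
```

Callers that pass no option functions get the defaults, and each function only needs to touch the fields it cares about.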
func NewGitFromCloneURL ¶ added in v0.0.52
func NewGitFromCloneURL(url string, optFns ...func(o *GitCloneURLOptions)) (*Git, error)
NewGitFromCloneURL clones a Git repository from a URL and returns a Git document loader. Pass functional options to customize the clone and the loader.
func NewGitFromCodeCommitURL ¶ added in v0.0.52
func NewGitFromCodeCommitURL(url string, creds aws.Credentials, optFns ...func(o *GitOptions)) (*Git, error)
NewGitFromCodeCommitURL clones a Git repository from an AWS CodeCommit URL using the provided AWS credentials and returns a Git document loader. Pass functional options to customize the loader.
func NewGitFromPath ¶ added in v0.0.52
func NewGitFromPath(path string, optFns ...func(o *GitOptions)) (*Git, error)
NewGitFromPath opens an existing Git repository at a local path and returns a Git document loader. Pass functional options to customize the loader.
func (*Git) Load ¶ added in v0.0.52
Load retrieves documents from the Git repository and returns them as a slice of schema.Document.
func (*Git) LoadAndSplit ¶ added in v0.0.52
func (l *Git) LoadAndSplit(ctx context.Context, splitter schema.TextSplitter) ([]schema.Document, error)
LoadAndSplit retrieves documents from the Git repository, splits them using the provided TextSplitter, and returns the split documents as a slice of schema.Document.
type GitCloneURLOptions ¶ added in v0.0.52
type GitCloneURLOptions struct { GitOptions Auth transport.AuthMethod }
GitCloneURLOptions holds options for Git repositories cloned from a URL.
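Because GitOptions is embedded, its fields are promoted and an option function can set them directly on the outer struct. The sketch below shows that Go behavior with local stand-in types. The Token field is a hypothetical placeholder for Auth, and withBranch is an illustrative helper, not part of this package.

```go
package main

import "fmt"

// gitOptions is a stand-in for GitOptions.
type gitOptions struct {
	Branch string
}

// cloneOptions embeds gitOptions, as GitCloneURLOptions embeds GitOptions.
// Token is a hypothetical placeholder for the Auth field.
type cloneOptions struct {
	gitOptions
	Token string
}

// withBranch returns an option function that sets the promoted Branch field.
func withBranch(branch string) func(o *cloneOptions) {
	return func(o *cloneOptions) {
		o.Branch = branch // same field as o.gitOptions.Branch
	}
}

func main() {
	opts := cloneOptions{gitOptions: gitOptions{Branch: "main"}}
	withBranch("release")(&opts)
	fmt.Println(opts.Branch, opts.gitOptions.Branch)
}
```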
type GitOptions ¶ added in v0.0.52
type GitOptions struct { Branch string FileFilter FileFilter }
GitOptions holds options for the Git document loader.
type Text ¶ added in v0.0.22
type Text struct {
// contains filtered or unexported fields
}
func (*Text) Load ¶ added in v0.0.22
Load reads the content from the reader and returns it as a single document.
func (*Text) LoadAndSplit ¶ added in v0.0.22
func (l *Text) LoadAndSplit(ctx context.Context, splitter schema.TextSplitter) ([]schema.Document, error)
LoadAndSplit reads the content from the reader and splits it into multiple documents using the provided splitter.