markdown_basic

package
v0.4.14-rc.24
Warning

This package is not in the latest version of its module.

Published: Oct 11, 2024 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type MarkdownTextSplitter

type MarkdownTextSplitter struct {
	ChunkSize    int
	ChunkOverlap int
	// SecondSplitter splits paragraphs
	SecondSplitter lcgosplitter.TextSplitter

	MaxHeadingLevel int

	IgnoreHeadingOnly bool
}

MarkdownTextSplitter is a text splitter that splits Markdown text on its headings.

func NewMarkdownTextSplitter

func NewMarkdownTextSplitter(opts ...Option) *MarkdownTextSplitter

NewMarkdownTextSplitter creates a new Markdown text splitter.

func (MarkdownTextSplitter) SplitText

func (sp MarkdownTextSplitter) SplitText(text string) ([]string, error)

SplitText splits a text into multiple chunks.
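
The page does not describe how headings drive the split, so the following is only a minimal sketch of one plausible reading of `MaxHeadingLevel` and `IgnoreHeadingOnly`, based on the field comments in `Options`. The helpers `headingLevel` and `splitByHeadings` are hypothetical and are not part of this package; the heading detection is deliberately simplified and does not handle code fences or Setext headings.

```go
package main

import (
	"fmt"
	"strings"
)

// headingLevel returns the ATX heading level of a line (1-6), or 0 if the
// line is not a heading. Simplified: this is not a full Markdown parser.
func headingLevel(line string) int {
	level := 0
	for level < len(line) && line[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return 0
	}
	// The hashes must be followed by a space or by the end of the line.
	if level < len(line) && line[level] != ' ' {
		return 0
	}
	return level
}

// splitByHeadings (hypothetical) starts a new chunk at every heading whose
// level is at most maxLevel. Blank lines are dropped. When ignoreHeadingOnly
// is true, chunks that contain no body text are discarded.
func splitByHeadings(text string, maxLevel int, ignoreHeadingOnly bool) []string {
	var chunks, current []string
	hasBody := false

	flush := func() {
		if len(current) > 0 && (hasBody || !ignoreHeadingOnly) {
			chunks = append(chunks, strings.Join(current, "\n"))
		}
		current, hasBody = nil, false
	}

	for _, line := range strings.Split(text, "\n") {
		level := headingLevel(line)
		if level > 0 && level <= maxLevel {
			flush() // a qualifying heading closes the previous chunk
		}
		if strings.TrimSpace(line) == "" {
			continue
		}
		if level == 0 {
			hasBody = true
		}
		current = append(current, line)
	}
	flush()
	return chunks
}

func main() {
	text := "# Title\n\n## Intro\nHello\n\n### Deep\nMore\n\n## Empty\n"
	for i, chunk := range splitByHeadings(text, 2, true) {
		fmt.Printf("chunk %d: %q\n", i, chunk)
	}
}
```

In this sketch a level-3 heading stays inside the surrounding level-2 chunk when `maxLevel` is 2, and the heading-only chunks for `# Title` and `## Empty` are dropped when `ignoreHeadingOnly` is true. The real splitter may behave differently.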

type Option

type Option func(*Options)

Option is a function that can be used to set options for a text splitter.

func WithChunkOverlap

func WithChunkOverlap(chunkOverlap int) Option

WithChunkOverlap sets the chunk overlap for a text splitter.

func WithChunkSize

func WithChunkSize(chunkSize int) Option

WithChunkSize sets the chunk size for a text splitter.
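
To illustrate how a chunk size and a chunk overlap usually interact, here is a small sketch of a sliding window over runes. It is an assumption-laden illustration, not this package's algorithm: `chunkWithOverlap` is a hypothetical helper, and the real splitter may measure size in tokens rather than characters (the `ModelName` and `EncodingName` options hint at tokenization).

```go
package main

import (
	"errors"
	"fmt"
)

// chunkWithOverlap (hypothetical) cuts text into windows of at most chunkSize
// runes, where each window repeats the last chunkOverlap runes of the
// previous one.
func chunkWithOverlap(text string, chunkSize, chunkOverlap int) ([]string, error) {
	if chunkSize <= 0 {
		return nil, errors.New("chunkSize must be positive")
	}
	if chunkOverlap < 0 || chunkOverlap >= chunkSize {
		return nil, errors.New("chunkOverlap must be in [0, chunkSize)")
	}

	runes := []rune(text)
	step := chunkSize - chunkOverlap // how far the window advances each time
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the window reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := chunkWithOverlap("abcdefghij", 4, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", chunks)
}
```

Requiring the overlap to be strictly smaller than the chunk size keeps the window moving forward; an overlap equal to the chunk size would never advance.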

func WithEncodingName

func WithEncodingName(encodingName string) Option

WithEncodingName sets the encoding name for a text splitter.

func WithIgnoreHeadingOnly

func WithIgnoreHeadingOnly(ignoreHeadingOnly bool) Option

WithIgnoreHeadingOnly sets whether chunks that contain only headings are ignored.

func WithKeepSeparator

func WithKeepSeparator(keepSeparator bool) Option

WithKeepSeparator sets whether separators are kept in the resulting split text. When set to true, the separators are included in the split text; when set to false, they are omitted. Defaults to false if not specified.

func WithMaxHeadingLevel

func WithMaxHeadingLevel(maxHeadingLevel int) Option

WithMaxHeadingLevel sets the maximum heading level to split on.

func WithModelName

func WithModelName(modelName string) Option

WithModelName sets the model name for a text splitter.

func WithSecondSplitter

func WithSecondSplitter(secondSplitter lcgosplitter.TextSplitter) Option

WithSecondSplitter sets the second splitter for a text splitter.

type Options

type Options struct {
	ChunkSize      int
	ChunkOverlap   int
	Separators     []string
	KeepSeparator  bool
	ModelName      string
	EncodingName   string
	SecondSplitter lcgosplitter.TextSplitter

	MaxHeadingLevel   int  // Maximum heading level to split on
	IgnoreHeadingOnly bool // Ignore chunks that only contain headings
}

Options is a struct that contains options for a text splitter.

func DefaultOptions

func DefaultOptions() Options

DefaultOptions returns the default options for all text splitters.
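
`Option`, `Options` and `DefaultOptions` follow the functional options pattern. The sketch below re-declares trimmed-down local stand-ins with the same names so that it compiles on its own; it does not import this package. The default values are placeholders chosen for the example (the page does not list the real defaults), and `resolveOptions` is a hypothetical helper showing how a constructor such as `NewMarkdownTextSplitter` might apply its arguments.

```go
package main

import "fmt"

// Options is a local stand-in with a subset of the fields listed above.
type Options struct {
	ChunkSize         int
	ChunkOverlap      int
	MaxHeadingLevel   int
	IgnoreHeadingOnly bool
}

// Option mutates an Options value, mirroring the signature shown above.
type Option func(*Options)

// DefaultOptions returns placeholder defaults; the real values are not
// documented on this page.
func DefaultOptions() Options {
	return Options{ChunkSize: 512, ChunkOverlap: 64, MaxHeadingLevel: 2}
}

// WithChunkSize overrides the chunk size.
func WithChunkSize(chunkSize int) Option {
	return func(o *Options) { o.ChunkSize = chunkSize }
}

// WithMaxHeadingLevel overrides the maximum heading level to split on.
func WithMaxHeadingLevel(maxHeadingLevel int) Option {
	return func(o *Options) { o.MaxHeadingLevel = maxHeadingLevel }
}

// resolveOptions (hypothetical) starts from the defaults and applies each
// option in order, so later options win over earlier ones.
func resolveOptions(opts ...Option) Options {
	options := DefaultOptions()
	for _, opt := range opts {
		opt(&options)
	}
	return options
}

func main() {
	options := resolveOptions(WithChunkSize(256), WithMaxHeadingLevel(3))
	fmt.Printf("%+v\n", options)
}
```

Fields that no option touches keep their default values, which lets callers override only what they care about.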
