Documentation ¶
Index ¶
- Variables
- func Cli(args []string, config *CliConfig) (rc int, err error)
- func CodeVersion() string
- func EditFile(fn string) (err error)
- func InitTokenizer() (err error)
- func OutfilesRegex(files []FileLang) string
- type ChatHistory
- type ChatMsg
- type Chunk
- type CliConfig
- type Document
- type FileLang
- type GrokkerInternal
- func Init(rootdir, model string) (g *GrokkerInternal, err error)
- func InitNamed(rootdir, name, model string) (g *GrokkerInternal, err error)
- func Load(readonly bool) (g *GrokkerInternal, migrated bool, oldver, newver string, lock *flock.Flock, ...)
- func LoadFrom(grokpath string, readonly bool) (g *GrokkerInternal, migrated bool, oldver, newver string, lock *flock.Flock, ...)
- func (g *GrokkerInternal) AddDocument(path string) (err error)
- func (g *GrokkerInternal) Answer(question string, withHeaders, withLineNumbers, global bool) (resp string, err error)
- func (g *GrokkerInternal) Backup() (backpath string, err error)
- func (g *GrokkerInternal) Chat(sysmsg, prompt, fileName string, level util.ContextLevel, infiles []string, ...) (resp string, err error)
- func (g *GrokkerInternal) Context(text string, tokenLimit int, withHeaders, withLineNumbers bool) (context string, err error)
- func (g *GrokkerInternal) Continue(in string, global bool) (out, sysmsg string, err error)
- func (g *GrokkerInternal) DBVersion() string
- func (g *GrokkerInternal) Embed(text string) (jsonEmbedding string, err error)
- func (g *GrokkerInternal) ForgetDocument(path string) (err error)
- func (g *GrokkerInternal) GetModel() (model string, m *Model, err error)
- func (g *GrokkerInternal) GitCommitMessage(diff string) (msg string, err error)
- func (g *GrokkerInternal) ListDocuments() (paths []string)
- func (g *GrokkerInternal) ListModels() (models []*Model, err error)
- func (g *GrokkerInternal) Msg(sysmsg, txt string) (resp string, err error)
- func (g *GrokkerInternal) OpenChatHistory(sysmsg, relPath string) (history *ChatHistory, err error)
- func (g *GrokkerInternal) RefreshEmbeddings() (err error)
- func (g *GrokkerInternal) Revise(in string, global, sysmsgin bool) (out, sysmsg string, err error)
- func (g *GrokkerInternal) Save() (err error)
- func (g *GrokkerInternal) SetModel(model string) (oldModel string, err error)
- func (g *GrokkerInternal) Setup(model string) (err error)
- func (g *GrokkerInternal) Similarity(reftext string, texts ...string) (sims []float64, err error)
- func (g *GrokkerInternal) TokenCount(text string) (count int, err error)
- func (g *GrokkerInternal) UpdateEmbeddings() (update bool, err error)
- type Model
- type Models
Constants ¶
This section is empty.
Variables ¶
var DefaultModel = "gpt-4"
var GitDiffPrompt = `` /* 275-byte string literal not displayed */
var GitSummaryPrompt = `` /* 207-byte string literal not displayed */
var SysMsgChat = "" /* 227-byte string literal not displayed */
var SysMsgContinue = "" /* 319-byte string literal not displayed */
var SysMsgRevise = "" /* 317-byte string literal not displayed */
var SysMsgSummarizeChat = `` /* 230-byte string literal not displayed */
var Tokenizer tokenizer.Codec
Functions ¶
func Cli ¶
Cli parses the given arguments and then executes the appropriate subcommand.
We use this function instead of kong.Parse() so that we can pass in the arguments to parse. This allows us to more easily test the cli subcommands, and could later ease e.g. WASM usage.
XXX: note how gitea/tea does this; it also uses urfave/cli instead of kong.
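The testability argument above can be illustrated with a minimal sketch. It does not use kong or grokker's real subcommands: `config`, `run`, `capture`, and the `-upper` flag are hypothetical names invented for this example, and the standard library `flag` package stands in for the real parser. The point is only that a function taking `args` and explicit output streams can be exercised without touching `os.Args` or the process exit path.

```go
package main

import (
	"bytes"
	"flag"
	"fmt"
	"io"
	"strings"
)

// config is a hypothetical stand-in for CliConfig: it carries the program
// name and the output streams so that callers can capture output.
type config struct {
	Name   string
	Stdout io.Writer
	Stderr io.Writer
}

// run parses the given args (not os.Args) and writes to the configured
// streams. It returns an exit code instead of calling os.Exit.
func run(args []string, cfg *config) (rc int, err error) {
	fs := flag.NewFlagSet(cfg.Name, flag.ContinueOnError)
	fs.SetOutput(cfg.Stderr)
	upper := fs.Bool("upper", false, "hypothetical flag: upper-case the greeting")
	if err = fs.Parse(args); err != nil {
		return 2, err
	}
	msg := "hello " + fs.Arg(0)
	if *upper {
		msg = strings.ToUpper(msg)
	}
	fmt.Fprintln(cfg.Stdout, msg)
	return 0, nil
}

// capture runs the command with in-memory streams and returns what it
// wrote to stdout, which is what a unit test would inspect.
func capture(args ...string) (rc int, stdout string, err error) {
	var out bytes.Buffer
	rc, err = run(args, &config{Name: "demo", Stdout: &out, Stderr: io.Discard})
	return rc, out.String(), err
}

func main() {
	rc, out, err := capture("-upper", "world")
	fmt.Printf("rc=%d err=%v out=%q\n", rc, err, out)
}
```

Returning the exit code rather than exiting lets a test assert on both the code and the captured output in the same process.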
func OutfilesRegex ¶
OutfilesRegex returns a regular expression that matches the format of output files embedded in chat responses. The Language field of each FileLang struct is used to generate a repeating regex that matches multiple files. If the files argument is nil, the regex matches a single file.
Types ¶
type ChatHistory ¶
func (*ChatHistory) Save ¶
func (history *ChatHistory) Save(addToDb bool) (err error)
Save saves the chat history file.
type Chunk ¶
type Chunk struct {
	// The document that this chunk is from.
	Document *Document
	// The offset of the chunk in the document.
	Offset int
	// The length of the chunk in the document.
	Length int
	// sha256 hash of the text of the chunk.
	Hash string
	// The embedding of the chunk.
	Embedding []float64
	// contains filtered or unexported fields
}
Chunk is a single chunk of text from a document.
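As a rough illustration of how the Offset, Length, and Hash fields relate, the sketch below slices a chunk out of a document and hashes it. It rests on assumptions the documentation does not confirm: that offsets and lengths are byte counts, and that the hash is stored hex-encoded. `chunkText` and `hashText` are hypothetical helpers, not part of the package.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chunkText returns the part of doc covered by offset and length
// (assumed here to be byte counts), or an error if the range is invalid.
func chunkText(doc string, offset, length int) (string, error) {
	if offset < 0 || length < 0 || offset+length > len(doc) {
		return "", fmt.Errorf("chunk [%d,%d) out of range for %d-byte document",
			offset, offset+length, len(doc))
	}
	return doc[offset : offset+length], nil
}

// hashText returns the sha256 of text, hex-encoded (the encoding is an
// assumption; the docs only say "sha256 hash").
func hashText(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

func main() {
	doc := "first paragraph\n\nsecond paragraph\n"
	text, err := chunkText(doc, 17, 16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("text=%q hash=%s\n", text, hashText(text))
}
```

Storing a hash per chunk makes it cheap to tell whether a chunk's text has changed without comparing the text itself.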
type CliConfig ¶
type CliConfig struct {
	// Name is the name of the program
	Name string
	// Description is a short description of the program
	Description string
	// Version is the version of the program
	Version string
	// Exit is the function to call to exit the program
	Exit   func(int)
	Stdin  io.Reader
	Stdout io.Writer
	Stderr io.Writer
}
CliConfig contains the configuration for grokker's cli
func NewCliConfig ¶
func NewCliConfig() *CliConfig
NewCliConfig returns a new CliConfig struct with default values populated.
type Document ¶
type Document struct {
	// XXX deprecated because we weren't precise about what it meant.
	Path string
	// The path to the document file, relative to g.Root
	RelPath string
}
Document is a single document in a document repository.
type GrokkerInternal ¶
type GrokkerInternal struct {
	// The grokker version number this db was last updated with.
	Version string
	// The absolute path of the root directory of the document
	// repository. This is passed in from cli based on where we
	// found the db.
	Root string
	// The list of documents in the database.
	Documents []*Document
	// The list of chunks in the database.
	Chunks []*Chunk
	Model  string
	// contains filtered or unexported fields
}
func Init ¶
func Init(rootdir, model string) (g *GrokkerInternal, err error)
Init creates a Grokker database in the given root directory.
func InitNamed ¶
func InitNamed(rootdir, name, model string) (g *GrokkerInternal, err error)
InitNamed creates a named Grokker database in the given root directory.
func Load ¶
func Load(readonly bool) (g *GrokkerInternal, migrated bool, oldver, newver string, lock *flock.Flock, err error)
Load loads a Grokker database from the current or any parent directory.
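The "current or any parent directory" lookup can be sketched as a walk up the directory tree. This is an illustration of the general technique, not the package's actual code: `findUp`, `demo`, and the file name `example.db` are hypothetical, and the documentation does not state what the database file is called.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// findUp looks for a file called name in dir, then in each parent
// directory in turn, and returns the first path that exists.
func findUp(dir, name string) (string, error) {
	dir, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	for {
		candidate := filepath.Join(dir, name)
		if _, err := os.Stat(candidate); err == nil {
			return candidate, nil
		} else if !errors.Is(err, os.ErrNotExist) {
			return "", err
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			// Reached the filesystem root without finding the file.
			return "", fmt.Errorf("%s not found in any parent directory: %w", name, os.ErrNotExist)
		}
		dir = parent
	}
}

// demo builds a temporary tree root/a/b, puts the file in root, and
// searches for it starting from b.
func demo() (found, want string, err error) {
	root, err := os.MkdirTemp("", "findup")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(root)
	nested := filepath.Join(root, "a", "b")
	if err = os.MkdirAll(nested, 0o755); err != nil {
		return "", "", err
	}
	want = filepath.Join(root, "example.db")
	if err = os.WriteFile(want, []byte("{}"), 0o644); err != nil {
		return "", "", err
	}
	found, err = findUp(nested, "example.db")
	return found, want, err
}

func main() {
	found, want, err := demo()
	fmt.Println("found in ancestor:", err == nil && found == want)
}
```

The directory where the file is found would then play the role of the repository root, which matches the description of the Root field as being derived from where the db was found.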
func LoadFrom ¶
func LoadFrom(grokpath string, readonly bool) (g *GrokkerInternal, migrated bool, oldver, newver string, lock *flock.Flock, err error)
LoadFrom loads a Grokker database from a given path.
XXX: replace the JSON db with a kv store, and store vectors as binary floating-point values.
func (*GrokkerInternal) AddDocument ¶
func (g *GrokkerInternal) AddDocument(path string) (err error)
AddDocument adds a document to the Grokker database. It creates the embeddings for the document and adds them to the database.
func (*GrokkerInternal) Answer ¶
func (g *GrokkerInternal) Answer(question string, withHeaders, withLineNumbers, global bool) (resp string, err error)
Answer returns the answer to a question.
func (*GrokkerInternal) Backup ¶
func (g *GrokkerInternal) Backup() (backpath string, err error)
Backup backs up the Grokker database to a time-stamped backup and returns the path.
func (*GrokkerInternal) Chat ¶
func (g *GrokkerInternal) Chat(sysmsg, prompt, fileName string, level util.ContextLevel, infiles []string, outfiles []FileLang, extract, promptTokenLimit int, extractToStdout, addToDb, edit bool) (resp string, err error)
Chat uses the given sysmsg and prompt along with context from the knowledge base and message history file to generate a response.
func (*GrokkerInternal) Context ¶
func (g *GrokkerInternal) Context(text string, tokenLimit int, withHeaders, withLineNumbers bool) (context string, err error)
Context returns the context for a given text, limited by the tokenLimit.
func (*GrokkerInternal) Continue ¶
func (g *GrokkerInternal) Continue(in string, global bool) (out, sysmsg string, err error)
Continue returns a continuation of the input text.
func (*GrokkerInternal) DBVersion ¶
func (g *GrokkerInternal) DBVersion() string
DBVersion returns the version of the grokker database.
func (*GrokkerInternal) Embed ¶
func (g *GrokkerInternal) Embed(text string) (jsonEmbedding string, err error)
Embed returns the embedding for a given text as a JSON string.
func (*GrokkerInternal) ForgetDocument ¶
func (g *GrokkerInternal) ForgetDocument(path string) (err error)
ForgetDocument removes a document from the Grokker database.
func (*GrokkerInternal) GetModel ¶
func (g *GrokkerInternal) GetModel() (model string, m *Model, err error)
GetModel returns the current model name and the corresponding Model struct from the db.
func (*GrokkerInternal) GitCommitMessage ¶
func (g *GrokkerInternal) GitCommitMessage(diff string) (msg string, err error)
GitCommitMessage generates a git commit message given a diff. It appends a reasonable prompt, and then uses the result as a grokker query.
func (*GrokkerInternal) ListDocuments ¶
func (g *GrokkerInternal) ListDocuments() (paths []string)
ListDocuments returns a list of all documents in the knowledge base.
XXX: this is a bit of a hack, since we're using the document name as the document ID.
XXX: this is also a bit of a hack since we're trying to make this work for multiple versions; we should be able to simplify this once migration is automatic during Load().
func (*GrokkerInternal) ListModels ¶
func (g *GrokkerInternal) ListModels() (models []*Model, err error)
ListModels lists the available models.
func (*GrokkerInternal) Msg ¶
func (g *GrokkerInternal) Msg(sysmsg, txt string) (resp string, err error)
Msg sends sysmsg and txt to OpenAI and returns the response.
func (*GrokkerInternal) OpenChatHistory ¶
func (g *GrokkerInternal) OpenChatHistory(sysmsg, relPath string) (history *ChatHistory, err error)
OpenChatHistory opens a chat history file and returns a ChatHistory object. The chat history file uses a special format that is amenable to context chunking and summarization. The first line of the file is a JSON string that contains a ChatHistory struct. The rest of the file is the chat history in the following format:
<role>:\n<message>\n\n
...where <role> is either "USER" or "AI", and <message> is the text of the message. The last message in the file is the most recent message.
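To make the file layout concrete, here is a minimal parsing sketch based only on the description above. `message` and `parseHistory` are hypothetical names, the JSON header is kept as a raw string because the fields of ChatHistory are not documented here, and the real parser may differ (for example in how it treats a message line that happens to read `USER:` or `AI:`).

```go
package main

import (
	"fmt"
	"strings"
)

// message is a hypothetical in-memory form of one chat entry.
type message struct {
	Role string // "USER" or "AI"
	Text string
}

// parseHistory splits a chat history file into its first-line JSON header
// (returned unparsed) and the messages that follow it.
func parseHistory(data string) (header string, msgs []message, err error) {
	header, body, _ := strings.Cut(data, "\n")
	for _, line := range strings.Split(body, "\n") {
		// A line consisting of "<role>:" starts a new message.
		if line == "USER:" || line == "AI:" {
			msgs = append(msgs, message{Role: strings.TrimSuffix(line, ":")})
			continue
		}
		if len(msgs) == 0 {
			if strings.TrimSpace(line) != "" {
				return "", nil, fmt.Errorf("text before first role line: %q", line)
			}
			continue
		}
		msgs[len(msgs)-1].Text += line + "\n"
	}
	// Drop the blank separator lines that follow each message.
	for i := range msgs {
		msgs[i].Text = strings.TrimSpace(msgs[i].Text)
	}
	return header, msgs, nil
}

func main() {
	file := "{}\nUSER:\nhello\n\nAI:\nhi there\nsecond line\n\n"
	header, msgs, err := parseHistory(file)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("header=%s messages=%d last=%q\n", header, len(msgs), msgs[len(msgs)-1].Text)
}
```

Because the most recent message is last in the file, the final element of the returned slice is the latest turn.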
func (*GrokkerInternal) RefreshEmbeddings ¶
func (g *GrokkerInternal) RefreshEmbeddings() (err error)
RefreshEmbeddings refreshes the embeddings for all documents in the database.
func (*GrokkerInternal) Revise ¶
func (g *GrokkerInternal) Revise(in string, global, sysmsgin bool) (out, sysmsg string, err error)
Revise returns revised text based on input text.
func (*GrokkerInternal) Save ¶
func (g *GrokkerInternal) Save() (err error)
Save saves the Grokker database to the stored path.
func (*GrokkerInternal) SetModel ¶
func (g *GrokkerInternal) SetModel(model string) (oldModel string, err error)
SetModel sets the default chat completion model for queries.
func (*GrokkerInternal) Setup ¶
func (g *GrokkerInternal) Setup(model string) (err error)
Setup sets up the model and OpenAI clients. This function needs to be idempotent because it might be called multiple times during the lifetime of a Grokker object.
func (*GrokkerInternal) Similarity ¶
func (g *GrokkerInternal) Similarity(reftext string, texts ...string) (sims []float64, err error)
Similarity returns the similarity between two or more texts. Each text is compared to the reference text, and the similarities are returned as a float64 slice.
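The shape of the result (one score per text, each measured against the reference) can be sketched as follows. The documentation does not say which metric is used; cosine similarity over embedding vectors is assumed here purely for illustration, and `cosine` and `similarities` are hypothetical helpers that take vectors directly rather than calling an embedding API.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("vectors must be non-empty and of equal length")
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0, errors.New("cosine similarity is undefined for a zero vector")
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb)), nil
}

// similarities compares each vector to ref and returns the scores in the
// same order as the inputs.
func similarities(ref []float64, vecs ...[]float64) ([]float64, error) {
	sims := make([]float64, 0, len(vecs))
	for i, v := range vecs {
		s, err := cosine(ref, v)
		if err != nil {
			return nil, fmt.Errorf("vector %d: %w", i, err)
		}
		sims = append(sims, s)
	}
	return sims, nil
}

func main() {
	ref := []float64{1, 0}
	sims, err := similarities(ref, []float64{1, 0}, []float64{0, 1}, []float64{-1, 0})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sims)
}
```

Under this assumption a score near 1 means the text points in the same direction as the reference, a score near 0 means it is unrelated, and a negative score means it points the opposite way.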
func (*GrokkerInternal) TokenCount ¶
func (g *GrokkerInternal) TokenCount(text string) (count int, err error)
TokenCount returns the number of tokens in a string.
func (*GrokkerInternal) UpdateEmbeddings ¶
func (g *GrokkerInternal) UpdateEmbeddings() (update bool, err error)
UpdateEmbeddings updates the embeddings for any documents that have changed since the last time the embeddings were updated. It returns true if any embeddings were updated.