eval

command
v0.4.19
Published: Jan 28, 2025 License: MIT Imports: 11 Imported by: 0

README

I want an eval tool for my geppetto prompts:

Input:

  • eval dataset JSON file
  • prompt template

Output:

  • set of eval metrics

dataset + template -> llm calls -> compute accuracy -> eval results

step 0

  • create a glazed command for evals
  • generate mock rows for eval results
  • wrap as command line tool

step 1

  • load an eval dataset from eval.json

    • array of objects
    • each object:
      • input: map[string]interface{}
      • golden answer: interface{}
  • iterate over each entry in eval.json

  • load a prompt from complaint.yaml

  • interpolate the complaint.yaml command

Running the actual LLM inference
  • run it
    • load the API key, etc...
    • create the chat step
    • get the step result
    • store the metadata in the result json
Postprocessing the LLM response
  • store the answer
    • store the LLM metadata
    • store the date
    • give it a unique UUID
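
A minimal sketch of the postprocessing step, assuming a result record shaped like the hypothetical `EvalResult` below. The struct, its field names, and the `newUUID` helper are illustrative and not taken from the actual command. The UUID is generated with the standard library only, so the sketch has no external dependencies.

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
	"time"
)

// EvalResult is a hypothetical record for one stored LLM answer.
type EvalResult struct {
	ID       string                 `json:"id"`
	Date     time.Time              `json:"date"`
	Answer   string                 `json:"answer"`
	Metadata map[string]interface{} `json:"metadata"`
}

// newUUID returns a random (version 4) UUID string built from crypto/rand.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewEvalResult wraps an answer with its metadata, the current UTC time, and a fresh ID.
func NewEvalResult(answer string, metadata map[string]interface{}) (EvalResult, error) {
	id, err := newUUID()
	if err != nil {
		return EvalResult{}, fmt.Errorf("generate id: %w", err)
	}
	return EvalResult{
		ID:       id,
		Date:     time.Now().UTC(),
		Answer:   answer,
		Metadata: metadata,
	}, nil
}

func main() {
	// "example-model" is a placeholder, not a real model name.
	res, err := NewEvalResult("billing", map[string]interface{}{"model": "example-model"})
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(res, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```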

go run ./cmd/eval --dataset eval.json --command complaint.yaml

step 2

  • run a grading function against the LLM answer
    • take a JavaScript grading script
  • compute an accuracy score
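
The accuracy computation could look roughly like the sketch below. Note the substitution: a plain Go function (`ExactMatch`) stands in for the JavaScript grading script, since running score.js would need an embedded JavaScript interpreter that is not shown here. The `Grader`, `GradedCase`, and `Accuracy` names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Grader reports whether an LLM answer matches the golden answer.
// In this sketch it replaces the JavaScript grading script.
type Grader func(answer string, golden interface{}) bool

// ExactMatch is a sample grader: a case-insensitive comparison after
// trimming whitespace. Non-string golden answers never match.
func ExactMatch(answer string, golden interface{}) bool {
	g, ok := golden.(string)
	if !ok {
		return false
	}
	return strings.EqualFold(strings.TrimSpace(answer), strings.TrimSpace(g))
}

// GradedCase pairs an LLM answer with its golden answer.
type GradedCase struct {
	Answer string
	Golden interface{}
}

// Accuracy returns the fraction of cases the grader accepts.
// It returns 0 when there are no cases.
func Accuracy(cases []GradedCase, grade Grader) float64 {
	if len(cases) == 0 {
		return 0
	}
	correct := 0
	for _, c := range cases {
		if grade(c.Answer, c.Golden) {
			correct++
		}
	}
	return float64(correct) / float64(len(cases))
}

func main() {
	cases := []GradedCase{
		{Answer: "billing", Golden: "billing"},
		{Answer: " Shipping ", Golden: "shipping"},
		{Answer: "refund", Golden: "billing"},
		{Answer: "billing", Golden: "billing"},
	}
	fmt.Printf("accuracy: %.2f\n", Accuracy(cases, ExactMatch)) // prints "accuracy: 0.75"
}
```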

go run ./cmd/eval --dataset eval.json --command complaint.yaml --scoring score.js

step 3

  • REST API

  • web UI (Braintrust-inspired)

    • make it cancellable when pressing Ctrl-C

    • show full conversation when expanding

    • rerun a single conversation and get streaming completion

    • import/export datasets

    • import/export/manage prompts

    • logging and monitoring of test runs

    • streaming display of running datasets

    • edit prompt and save new revisions

    • switch between different versions and compare results and metrics and accuracy

features

  • caching of inference
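
A minimal sketch of inference caching, assuming an in-memory cache keyed by the SHA-256 hash of the rendered prompt. The `Cache` and `InferFunc` names are hypothetical. A real cache key would likely also need to cover the model and its parameters, and persistence is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// InferFunc stands in for the real LLM call.
type InferFunc func(prompt string) (string, error)

// Cache memoizes answers by the SHA-256 hash of the prompt.
type Cache struct {
	mu      sync.Mutex
	answers map[string]string
}

// NewCache returns an empty in-memory cache.
func NewCache() *Cache {
	return &Cache{answers: make(map[string]string)}
}

// Get returns the cached answer for prompt and calls infer only on a miss.
// The lock is held during infer, so concurrent calls are serialized;
// that keeps the sketch simple at the cost of parallelism.
func (c *Cache) Get(prompt string, infer InferFunc) (string, error) {
	sum := sha256.Sum256([]byte(prompt))
	key := hex.EncodeToString(sum[:])

	c.mu.Lock()
	defer c.mu.Unlock()
	if answer, ok := c.answers[key]; ok {
		return answer, nil
	}
	answer, err := infer(prompt)
	if err != nil {
		return "", err // errors are not cached
	}
	c.answers[key] = answer
	return answer, nil
}

func main() {
	calls := 0
	infer := func(prompt string) (string, error) {
		calls++
		return "answer to: " + prompt, nil
	}

	cache := NewCache()
	for i := 0; i < 3; i++ {
		if _, err := cache.Get("Classify this complaint", infer); err != nil {
			panic(err)
		}
	}
	fmt.Println("inference calls:", calls) // prints "inference calls: 1"
}
```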
