# PikoBrain

PikoBrain is a function-calling API for LLMs from multiple providers.

Key project features:

- lets you define model configuration
- provides a universal API regardless of the LLM provider
- provides actual function calling (currently via OpenAPI)
- (optionally) supports different models for vision and text
- basic UI

It allows you to set up functions (RAG) without vendor lock-in.

The project is licensed under MPL-2.0 with Exhibit A, which promotes collaboration (requires sharing changes) but does not
restrict commercial or any other usage.
## Roadmap

- Providers
- State
- Integration
    - Webhooks
    - NATS notifications
- Functions
    - OpenAPI (including automatic reload)
    - Internal functions (threads)
    - Scripting functions
- Libraries
## Installation

- Source (requires Go 1.22.5+):

  ```bash
  go run github.com/pikocloud/pikobrain@latest <args>
  ```

- Binary in releases
- Docker: `ghcr.io/pikocloud/pikobrain`
## Usage

### Binary

```bash
pikobrain --config examples/brain.yaml --tools examples/tools.yaml
```

### Docker

```bash
docker run --rm -v $(pwd):/data -v $(pwd)/examples:/config:ro -p 8080:8080 ghcr.io/pikocloud/pikobrain
```

- Define the model and tools as in `examples/`.
- Run the service.
- Call the service (see the examples below).
### Basic UI

Available at http://127.0.0.1:8080.

> [!NOTE]
> The UI is designed primarily for admin tasks. For a user-friendly chat experience, use something
> like LibreChat.
## Request

```
POST http://127.0.0.1:8080
```
Input can be:

- `multipart/form-data` payload (preferred), where:
    - each part can be `text/plain` (default if not set), `application/x-www-form-urlencoded`, `application/json`,
      `image/png`, `image/jpeg`, `image/webp`, or `image/gif`
    - each part may contain an `X-User` header, which maps to the user field in providers
    - each part may contain an `X-Role` header, where the value can be `user` (default) or `assistant`
    - the multipart part name doesn't matter
- `application/x-www-form-urlencoded`; content will be decoded
- `text/plain`, `application/json`, `image/png`, `image/jpeg`, `image/webp`, `image/gif`; content will be used as-is
- without a content type, the payload should be a valid UTF-8 string and will be used as a single payload
The request may contain the query parameter `user`, which maps to the user field, and/or the query parameter `role`
(`user` or `assistant`).

A multipart payload allows the caller to provide full history context messages. For multipart, the `X-User` and `X-Role`
part headers may override the query parameters.

Output is the response from the LLM.
> [!NOTE]
> The user field is not used for inference, only for audit.
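
For example, a minimal single-payload call with query parameters (a sketch using the third-party `requests` library; the prompt and user name are placeholders):

```python
import requests

# Plain text body without multipart: the whole payload is a single UTF-8 message.
# `user` maps to the provider's user field (audit only); `role` defaults to 'user'.
res = requests.post(
    'http://127.0.0.1:8080',
    params={'user': 'reddec', 'role': 'user'},
    data='Why is the sky blue?'.encode('utf-8'),
    headers={'Content-Type': 'text/plain'},
)
res.raise_for_status()
print(res.text)                               # LLM response body
print(res.headers.get('X-Run-Total-Tokens'))  # usage headers (see the Python client below)
```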
## Threads

In addition to normal usage, it's possible to keep a stateful chat context within a "thread".
For every request, historical messages will be fetched (up to `depth`).

### Add and run

```
POST http://127.0.0.1:8080/<thread name>
```

Content can be empty (just run).

### Just add

```
PUT http://127.0.0.1:8080/<thread name>
```
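
As a sketch of both endpoints (again using `requests`; the thread name `demo` is arbitrary):

```python
import requests

BASE = 'http://127.0.0.1:8080'

# PUT only appends a message to the thread history, without running the model.
requests.put(f'{BASE}/demo', data=b'My name is RedDec.').raise_for_status()

# POST appends the message (if any) and runs the model with the stored history.
res = requests.post(f'{BASE}/demo', data=b'What is my name?')
res.raise_for_status()
print(res.text)

# POST with an empty body just runs the thread as-is.
print(requests.post(f'{BASE}/demo').text)
```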
## Clients

### Python with aiohttp
```python
import asyncio
import io
from dataclasses import dataclass
from datetime import timedelta
from typing import Literal, Iterable

import aiohttp


@dataclass(frozen=True, slots=True)
class Message:
    content: str | bytes | io.BytesIO
    mime: str | None = None
    role: Literal['assistant', 'user'] | None = None
    user: str | None = None


@dataclass(frozen=True, slots=True)
class Response:
    content: bytes
    mime: str
    duration: timedelta
    input_messages: int
    input_tokens: int
    output_tokens: int
    total_tokens: int


async def request(url: str, messages: Iterable[Message]) -> Response:
    with aiohttp.MultipartWriter('form-data') as mpwriter:
        for message in messages:
            headers = {}
            if message.mime:
                headers[aiohttp.hdrs.CONTENT_TYPE] = message.mime
            if message.role:
                headers['X-Role'] = message.role
            if message.user:
                headers['X-User'] = message.user
            mpwriter.append(message.content, headers)
        async with aiohttp.ClientSession() as session, session.post(url, data=mpwriter) as res:
            assert res.ok, await res.text()
            return Response(
                content=await res.read(),
                mime=res.headers.get(aiohttp.hdrs.CONTENT_TYPE),
                duration=timedelta(seconds=float(res.headers.get('X-Run-Duration'))),
                input_messages=int(res.headers.get('X-Run-Context')),
                input_tokens=int(res.headers.get('X-Run-Input-Tokens')),
                output_tokens=int(res.headers.get('X-Run-Output-Tokens')),
                total_tokens=int(res.headers.get('X-Run-Total-Tokens')),
            )


async def example():
    res = await request('http://127.0.0.1:8080', messages=[
        Message('My name is RedDec. Your name is Bot.'),
        Message('What is your and my name?'),
    ])
    print(res)
```
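
The same client can also send images: pass raw bytes and set `mime` on the part. A sketch building on the definitions above (reusing the `eifeltower.jpeg` file from the cURL example below):

```python
async def describe_image():
    # Read the image and send it together with a text prompt in one multipart request.
    with open('eifeltower.jpeg', 'rb') as f:
        picture = f.read()
    res = await request('http://127.0.0.1:8080', messages=[
        Message(picture, mime='image/jpeg'),
        Message('Describe the picture'),
    ])
    print(res.content.decode())


asyncio.run(describe_image())
```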
### cURL

#### Simple

```bash
curl --data 'Why is the sky blue?' http://127.0.0.1:8080
```

#### Text multipart

```bash
curl -F '_=my name is RedDec' -F '_=What is my name?' -v http://127.0.0.1:8080
```

#### Image and text

```bash
curl -F '_=@eifeltower.jpeg' -F '_=Describe the picture' -v http://127.0.0.1:8080
```
## CLI

```
Application Options:
      --timeout=                  LLM timeout (default: 30s) [$TIMEOUT]
      --refresh=                  Refresh interval for tools (default: 30s) [$REFRESH]
      --config=                   Config file (default: brain.yaml) [$CONFIG]
      --tools=                    Tool file [$TOOLS]

Debug:
      --debug.enable              Enable debug mode [$DEBUG_ENABLE]

Database configuration:
      --db.url=                   Database URL (default: sqlite://data.sqlite?cache=shared&_fk=1&_pragma=foreign_keys(1)) [$DB_URL]
      --db.max-conn=              Maximum number of opened connections to database (default: 10) [$DB_MAX_CONN]
      --db.idle-conn=             Maximum number of idle connections to database (default: 1) [$DB_IDLE_CONN]
      --db.idle-timeout=          Maximum amount of time a connection may be idle (default: 0) [$DB_IDLE_TIMEOUT]
      --db.conn-life-time=        Maximum amount of time a connection may be reused (default: 0) [$DB_CONN_LIFE_TIME]

HTTP server configuration:
      --http.bind=                Bind address (default: :8080) [$HTTP_BIND]
      --http.tls                  Enable TLS [$HTTP_TLS]
      --http.ca=                  Path to CA files. Optional unless IGNORE_SYSTEM_CA set (default: ca.pem) [$HTTP_CA]
      --http.cert=                Server certificate (default: cert.pem) [$HTTP_CERT]
      --http.key=                 Server private key (default: key.pem) [$HTTP_KEY]
      --http.mutual               Enable mutual TLS [$HTTP_MUTUAL]
      --http.ignore-system-ca     Do not load system-wide CA [$HTTP_IGNORE_SYSTEM_CA]
      --http.read-header-timeout= How long to read header from the request (default: 3s) [$HTTP_READ_HEADER_TIMEOUT]
      --http.graceful=            Graceful shutdown timeout (default: 5s) [$HTTP_GRACEFUL]
      --http.timeout=             Any request timeout (default: 30s) [$HTTP_TIMEOUT]
      --http.max-body-size=       Maximum payload size in bytes (default: 1048576) [$HTTP_MAX_BODY_SIZE]
```
## Providers

### OpenAI

First-class support; everything works just fine.

### Google

Good support. Known limitations:

- `date-time` is not supported in tools
- empty object (aka any JSON) is not supported
- for complex schemas, `gemini-1.5-flash` may hallucinate and call tools with incorrect arguments; use `gemini-1.5-pro`
### Ollama

Requires Ollama 0.3.3+.

Recommended models: `llava` for vision and `mistral:instruct` for general messages (including function calling).

```yaml
model: 'mistral:instruct'
vision:
  model: 'llava'
```

> [!TIP]
> Check https://ollama.com/library for models with 'tools' and 'vision' features. The bigger the model, the better it
> generally performs.
> For non-vision models, the `instruct` kind is usually better.
### AWS Bedrock

> [!WARNING]
> Due to multiple limitations, only Claude 3+ models work properly. The recommended multi-modal model for AWS
> Bedrock is Anthropic Claude 3.5.

Initial support:

- Some models may not support a system prompt.
- Some models may not support tools.
- Authorization is ignored (use AWS environment variables).
- `forceJSON` is not supported (workaround: use tools).

Required minimal set of environment variables:

```
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION=
```

Please refer to the AWS environment variables cheatsheet for configuration.

Based on the function-calling feature, the recommended models are:

- Anthropic Claude 3 models
- Mistral AI Mistral Large and Mistral Small
- Cohere Command R and Command R+

See the list of compatibilities.