bricksllm

module

v1.0.4, published Nov 20, 2023. License: MIT.

Repository

github.com/bricks-cloud/bricksllm

BricksLLM: AI Gateway For Putting LLM In Production

BricksLLM is a cloud native AI gateway written in Go. Currently, it serves as a proxy to OpenAI. It lets you create API keys with rate limits, cost limits and TTLs. These keys can be used in both development and production to achieve fine-grained access control that OpenAI does not provide at the moment. The proxy is compatible with the OpenAI API and its SDKs.

The vision of BricksLLM is to support many more large language models, such as Llama 2, Claude and PaLM 2, and to streamline LLM operations.

Roadmap

Access control via API key with rate limit, cost limit and ttl
Logging integration
Statsd integration 🚧
Routes configuration 🚧
PII detection and masking 🚧

Getting Started

The easiest way to get started with BricksLLM is through BricksLLM-Docker.

Step 1 - Clone BricksLLM-Docker repository

git clone https://github.com/bricks-cloud/BricksLLM-Docker

Step 2 - Change to BricksLLM-Docker directory

cd BricksLLM-Docker

Step 3 - Deploy BricksLLM locally with PostgreSQL and Redis

docker-compose up

To run this in detached mode, add the -d flag: docker-compose up -d

Step 4 - Create a provider setting

curl -X PUT http://localhost:8001/api/provider-settings \
   -H "Content-Type: application/json" \
   -d '{
    "provider":"openai",
    "setting": {
        "apikey": "YOUR_OPENAI_KEY"
    }
}'

Copy the id from the response.

Step 5 - Create a Bricks API key

Use id from the previous step as settingId to create a key with a rate limit of 2 req/min and a spend limit of 25 cents.

curl -X PUT http://localhost:8001/api/key-management/keys \
   -H "Content-Type: application/json" \
   -d '{
        "name": "My Secret Key",
        "key": "my-secret-key",
        "tags": ["mykey"],
        "settingId": "ID_FROM_STEP_FOUR",
        "rateLimitOverTime": 2,
        "rateLimitUnit": "m",
        "costLimitInUsd": 0.25
      }'
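Steps 4 and 5 can also be scripted. The following is a minimal Python sketch that mirrors the two curl calls above using only the standard library. The helper names (`build_put`, `provider_setting_request`, `key_request`, `send`) are hypothetical and not part of BricksLLM; the URLs and JSON fields are taken from the steps above.

```python
import json
import urllib.request

ADMIN_URL = "http://localhost:8001"  # configuration server used in Steps 4 and 5


def build_put(path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON PUT request to the configuration server."""
    return urllib.request.Request(
        ADMIN_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def provider_setting_request(openai_key: str) -> urllib.request.Request:
    # Step 4: register the OpenAI key as a provider setting.
    return build_put(
        "/api/provider-settings",
        {"provider": "openai", "setting": {"apikey": openai_key}},
    )


def key_request(setting_id: str) -> urllib.request.Request:
    # Step 5: a key limited to 2 requests per minute and 25 cents of spend.
    return build_put(
        "/api/key-management/keys",
        {
            "name": "My Secret Key",
            "key": "my-secret-key",
            "tags": ["mykey"],
            "settingId": setting_id,
            "rateLimitOverTime": 2,
            "rateLimitUnit": "m",
            "costLimitInUsd": 0.25,
        },
    )


def send(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (requires the stack to be running)."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the Docker stack running, `setting = send(provider_setting_request("YOUR_OPENAI_KEY"))` followed by `send(key_request(setting["id"]))` would perform both steps in order.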

Congratulations, you are done!

Then, just redirect your requests to us and use OpenAI as you would normally. For example:

curl -X POST http://localhost:8002/api/providers/openai/v1/chat/completions \
   -H "Authorization: Bearer my-secret-key" \
   -H "Content-Type: application/json" \
   -d '{
          "model": "gpt-3.5-turbo",
          "messages": [
              {
                  "role": "system",
                  "content": "hi"
              }
          ]
      }'

Or if you're using an SDK, you could change its baseURL to point to us. For example:

// OpenAI Node SDK v4
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: "my-secret-key", // Bricks API key created in Step 5
  baseURL: "http://localhost:8002/api/providers/openai/v1", // redirect to us
});
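For clients that do not use an SDK, the same redirection can be sketched with plain HTTP. The Python example below builds the chat completion request shown in the curl example, sending the Bricks API key as the bearer token. The function name `chat_request` is hypothetical; the URL, headers and body come from the examples above.

```python
import json
import urllib.request

# Proxy base URL from the examples above.
PROXY_BASE_URL = "http://localhost:8002/api/providers/openai/v1"


def chat_request(bricks_key: str, messages: list,
                 model: str = "gpt-3.5-turbo") -> urllib.request.Request:
    """Build (but do not send) a chat completion request routed through the proxy."""
    body = {"model": model, "messages": messages}
    return urllib.request.Request(
        PROXY_BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The Bricks API key replaces the OpenAI key on the client side.
            "Authorization": f"Bearer {bricks_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; the response body is expected to have the same shape as an OpenAI response, since the gateway acts as a proxy.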

Documentation

Environment variables

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `POSTGRESQL_HOSTS` | required | Hosts for the PostgreSQL DB, separated by commas. | `localhost` |
| `POSTGRESQL_DB_NAME` | optional | Name of the PostgreSQL DB. | |
| `POSTGRESQL_USERNAME` | required | PostgreSQL DB username. | |
| `POSTGRESQL_PASSWORD` | required | PostgreSQL DB password. | |
| `POSTGRESQL_SSL_MODE` | optional | PostgreSQL SSL mode. | `disable` |
| `POSTGRESQL_PORT` | optional | The port that the PostgreSQL DB runs on. | `5432` |
| `POSTGRESQL_READ_TIME_OUT` | optional | Timeout for PostgreSQL read operations. | `2s` |
| `POSTGRESQL_WRITE_TIME_OUT` | optional | Timeout for PostgreSQL write operations. | `1s` |
| `REDIS_HOSTS` | required | Hosts for Redis, separated by commas. | `localhost` |
| `REDIS_PASSWORD` | required | Redis password. | |
| `REDIS_PORT` | optional | The port that Redis runs on. | `6379` |
| `REDIS_READ_TIME_OUT` | optional | Timeout for Redis read operations. | `1s` |
| `REDIS_WRITE_TIME_OUT` | optional | Timeout for Redis write operations. | `500ms` |
| `IN_MEMORY_DB_UPDATE_INTERVAL` | optional | The interval at which the BricksLLM API gateway polls the PostgreSQL DB for the latest key configurations. | `1s` |
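A deployment script may want to check these variables before starting the gateway. The sketch below is a hypothetical pre-flight check, not how BricksLLM itself loads its configuration: it applies the defaults listed above, rejects missing required values, and converts ports and timeouts into numbers. The duration parser assumes values of the form `<integer><unit>`; only `s` and `ms` appear in the defaults, so support for `m` and `h` is an assumption.

```python
import re
from typing import Mapping

# Defaults copied from the environment variable list above.
DEFAULTS = {
    "POSTGRESQL_HOSTS": "localhost",
    "POSTGRESQL_SSL_MODE": "disable",
    "POSTGRESQL_PORT": "5432",
    "POSTGRESQL_READ_TIME_OUT": "2s",
    "POSTGRESQL_WRITE_TIME_OUT": "1s",
    "REDIS_HOSTS": "localhost",
    "REDIS_PORT": "6379",
    "REDIS_READ_TIME_OUT": "1s",
    "REDIS_WRITE_TIME_OUT": "500ms",
    "IN_MEMORY_DB_UPDATE_INTERVAL": "1s",
}
REQUIRED = ("POSTGRESQL_HOSTS", "POSTGRESQL_USERNAME", "POSTGRESQL_PASSWORD",
            "REDIS_HOSTS", "REDIS_PASSWORD")
HOST_VARS = ("POSTGRESQL_HOSTS", "REDIS_HOSTS")
PORT_VARS = ("POSTGRESQL_PORT", "REDIS_PORT")
DURATION_VARS = ("POSTGRESQL_READ_TIME_OUT", "POSTGRESQL_WRITE_TIME_OUT",
                 "REDIS_READ_TIME_OUT", "REDIS_WRITE_TIME_OUT",
                 "IN_MEMORY_DB_UPDATE_INTERVAL")
KNOWN = set(DEFAULTS) | set(REQUIRED) | {"POSTGRESQL_DB_NAME"}

# Assumption: "m" and "h" are accepted in addition to the "ms" and "s" seen above.
_MS_PER_UNIT = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}


def parse_duration_ms(text: str) -> int:
    """Convert a value such as '500ms' or '2s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _MS_PER_UNIT[match.group(2)]


def load_settings(env: Mapping[str, str]) -> dict:
    """Apply defaults, check required variables, and convert value types."""
    settings = dict(DEFAULTS)
    settings.update({k: v for k, v in env.items() if k in KNOWN})
    missing = [name for name in REQUIRED if not settings.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    for name in HOST_VARS:
        settings[name] = [h.strip() for h in settings[name].split(",") if h.strip()]
    for name in PORT_VARS:
        settings[name] = int(settings[name])
    for name in DURATION_VARS:
        settings[name] = parse_duration_ms(settings[name])
    return settings
```

Calling `load_settings(os.environ)` (after `import os`) validates the current process environment; timeouts come back in milliseconds.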

Configuration Endpoints

The configuration server runs on Port 8001.

Get keys: GET /api/key-management/keys?tag={tag}

Description

This endpoint is set up for retrieving key configurations using a query param called tag.

Parameters

| Name | Required | Data type | Description |
| --- | --- | --- | --- |
| `tag` | required | `string` | Identifier attached to a key configuration. |

Error Response

| HTTP code | Content-Type |
| --- | --- |
| 400, 500 | application/json |

| Field | Type | Example |
| --- | --- | --- |
| `status` | `int` | 400 |
| `title` | `string` | request body reader error |
| `type` | `string` | /errors/request-body-read |
| `detail` | `string` | something is wrong |
| `instance` | `string` | /api/key-management/keys |
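Every configuration endpoint documents the same error body, so a client can decode it once. The sketch below parses such a body into a small Python dataclass; the names `GatewayError` and `parse_error` are hypothetical, and the field list is taken from the error response fields above.

```python
import json
from dataclasses import dataclass


@dataclass
class GatewayError:
    """Error body fields as listed in the error response above."""
    status: int
    title: str
    type: str
    detail: str
    instance: str


def parse_error(body: str) -> GatewayError:
    """Decode a JSON error body; missing fields fall back to empty values."""
    data = json.loads(body)
    return GatewayError(
        status=int(data.get("status", 0)),
        title=data.get("title", ""),
        type=data.get("type", ""),
        detail=data.get("detail", ""),
        instance=data.get("instance", ""),
    )
```

When using `urllib`, a 400 or 500 response surfaces as `urllib.error.HTTPError`, and `parse_error(err.read().decode("utf-8"))` would decode its body.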

Response

Response Body

[]KeyConfiguration

Fields of KeyConfiguration

| Field | Type | Example | Description |
| --- | --- | --- | --- |
| `name` | `string` | spike's developer key | Name of the API key. |
| `createdAt` | `int64` | 1257894000 | Key configuration creation time as a Unix timestamp. |
| `updatedAt` | `int64` | 1257894000 | Key configuration update time as a Unix timestamp. |
| `revoked` | `boolean` | true | Indicator for whether the key is revoked. |
| `revokedReason` | `string` | The key has expired | Reason why the key is revoked. |
| `tags` | `[]string` | ["org-tag-12345"] | Identifiers associated with the key. |
| `keyId` | `string` | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
| `costLimitInUsd` | `float64` | 5.5 | Total spend limit of the API key. |
| `costLimitInUsdOverTime` | `float64` | 2 | Spend limit within a period of time. This field is required if `costLimitInUsdUnit` is specified. |
| `costLimitInUsdUnit` | `enum` | d | Time unit for `costLimitInUsdOverTime`. Possible values are [h, m, s, d]. |
| `rateLimitOverTime` | `int` | 2 | Rate limit over a period of time. This field is required if `rateLimitUnit` is specified. |
| `rateLimitUnit` | `string` | m | Time unit for `rateLimitOverTime`. Possible values are [h, m, s, d]. |
| `ttl` | `string` | 2h | Time to live. Available units are [s, m, h]. |
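As an illustration of the lookup above, the following Python sketch builds the query URL for a tag and filters a decoded `[]KeyConfiguration` payload down to keys that are not revoked. The helper names are hypothetical, and the base URL assumes the local setup from Getting Started.

```python
from urllib.parse import urlencode

ADMIN_URL = "http://localhost:8001"  # configuration server (assumed local setup)


def keys_url(tag: str) -> str:
    """URL for GET /api/key-management/keys with the tag query parameter encoded."""
    return f"{ADMIN_URL}/api/key-management/keys?{urlencode({'tag': tag})}"


def usable_key_ids(configs: list) -> list:
    """Return the keyId of every key configuration that is not revoked."""
    return [c["keyId"] for c in configs if not c.get("revoked", False)]
```

`urlencode` takes care of escaping, so tags containing spaces or other reserved characters are still sent as a single `tag` parameter.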

Create key: PUT /api/key-management/keys

Description

This endpoint is set up for creating a key configuration.

Request

| Field | Required | Type | Example | Description |
| --- | --- | --- | --- | --- |
| `name` | required | `string` | spike's developer key | Name of the API key. |
| `tags` | optional | `[]string` | ["org-tag-12345"] | Identifiers associated with the key. |
| `key` | required | `string` | abcdef12345 | API key. |
| `settingId` | required | `string` | 98daa3ae-961d-4253-bf6a-322a32fdca3d | Identifier of the provider setting that the key uses. |
| `costLimitInUsd` | optional | `float64` | 5.5 | Total spend limit of the API key. |
| `costLimitInUsdOverTime` | optional | `float64` | 2 | Spend limit within a period of time. This field is required if `costLimitInUsdUnit` is specified. |
| `costLimitInUsdUnit` | optional | `enum` | d | Time unit for `costLimitInUsdOverTime`. Possible values are [h, d]. |
| `rateLimitOverTime` | optional | `int` | 2 | Rate limit over a period of time. This field is required if `rateLimitUnit` is specified. |
| `rateLimitUnit` | optional | `enum` | m | Time unit for `rateLimitOverTime`. Possible values are [h, m, s, d]. |
| `ttl` | optional | `string` | 2h | Time to live. Available units are [s, m, h]. |
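The dependencies between these fields can be checked on the client before a request is sent. The sketch below is a hypothetical client-side validator based only on the rules listed in the request fields above; the server may enforce additional or different rules.

```python
# Allowed units as listed in the request fields above.
COST_UNITS = ("h", "d")
RATE_UNITS = ("h", "m", "s", "d")


def validate_key_request(body: dict) -> list:
    """Return a list of problems found in a create-key request body."""
    problems = []
    for field in ("name", "key", "settingId"):
        if not body.get(field):
            problems.append(f"{field} is required")
    # Each unit field needs its matching amount field.
    if "costLimitInUsdUnit" in body and "costLimitInUsdOverTime" not in body:
        problems.append("costLimitInUsdOverTime is required when costLimitInUsdUnit is set")
    if "rateLimitUnit" in body and "rateLimitOverTime" not in body:
        problems.append("rateLimitOverTime is required when rateLimitUnit is set")
    # Units must come from the documented sets.
    if "costLimitInUsdUnit" in body and body["costLimitInUsdUnit"] not in COST_UNITS:
        problems.append("costLimitInUsdUnit must be one of h, d")
    if "rateLimitUnit" in body and body["rateLimitUnit"] not in RATE_UNITS:
        problems.append("rateLimitUnit must be one of h, m, s, d")
    return problems
```

An empty list means the body satisfies the documented rules; for example, a body that sets `rateLimitUnit` without `rateLimitOverTime` yields one problem.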

Error Response

| HTTP code | Content-Type |
| --- | --- |
| 400, 500 | application/json |

| Field | Type | Example |
| --- | --- | --- |
| `status` | `int` | 400 |
| `title` | `string` | request body reader error |
| `type` | `string` | /errors/request-body-read |
| `detail` | `string` | something is wrong |
| `instance` | `string` | /api/key-management/keys |

Response

| Field | Type | Example | Description |
| --- | --- | --- | --- |
| `name` | `string` | spike's developer key | Name of the API key. |
| `createdAt` | `int64` | 1257894000 | Key configuration creation time as a Unix timestamp. |
| `updatedAt` | `int64` | 1257894000 | Key configuration update time as a Unix timestamp. |
| `revoked` | `boolean` | true | Indicator for whether the key is revoked. |
| `revokedReason` | `string` | The key has expired | Reason why the key is revoked. |
| `tags` | `[]string` | ["org-tag-12345"] | Identifiers associated with the key. |
| `keyId` | `string` | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
| `costLimitInUsd` | `float64` | 5.5 | Total spend limit of the API key. |
| `costLimitInUsdOverTime` | `float64` | 2 | Spend limit within a period of time. This field is required if `costLimitInUsdUnit` is specified. |
| `costLimitInUsdUnit` | `enum` | d | Time unit for `costLimitInUsdOverTime`. Possible values are [h, d]. |
| `rateLimitOverTime` | `int` | 2 | Rate limit over a period of time. This field is required if `rateLimitUnit` is specified. |
| `rateLimitUnit` | `string` | m | Time unit for `rateLimitOverTime`. Possible values are [h, m, s, d]. |
| `ttl` | `string` | 2h | Time to live. Available units are [s, m, h]. |

Update key: PATCH /api/key-management/keys/{keyId}

Description

This endpoint is set up for updating key configurations using key id.

Parameters

| Name | Required | Data type | Description |
| --- | --- | --- | --- |
| `keyId` | required | `string` | Unique key configuration identifier. |

Request

| Field | Required | Type | Example | Description |
| --- | --- | --- | --- | --- |
| `name` | optional | `string` | spike's developer key | Name of the API key. |
| `tags` | optional | `[]string` | ["org-tag-12345"] | Identifiers associated with the key. |
| `revoked` | optional | `boolean` | true | Indicator for whether the key is revoked. |
| `revokedReason` | optional | `string` | The key has expired | Reason why the key is revoked. |
| `costLimitInUsdOverTime` | optional | `float64` | 2 | Spend limit within a period of time. This field is required if `costLimitInUsdUnit` is specified. |
| `costLimitInUsdUnit` | optional | `enum` | d | Time unit for `costLimitInUsdOverTime`. Possible values are [h, d]. |
| `rateLimitOverTime` | optional | `int` | 2 | Rate limit over a period of time. This field is required if `rateLimitUnit` is specified. |
| `rateLimitUnit` | optional | `enum` | m | Time unit for `rateLimitOverTime`. Possible values are [h, m, s, d]. |
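Because every field in the update request is optional, a common use is revoking a key by sending only `revoked` and `revokedReason`. The Python sketch below builds such a PATCH request; the function name `revoke_request` is hypothetical, and the base URL assumes the local setup from Getting Started.

```python
import json
import urllib.request
from urllib.parse import quote

ADMIN_URL = "http://localhost:8001"  # configuration server (assumed local setup)


def revoke_request(key_id: str, reason: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that revokes a single key."""
    body = {"revoked": True, "revokedReason": reason}
    return urllib.request.Request(
        # quote() keeps unexpected characters in the id from altering the path.
        f"{ADMIN_URL}/api/key-management/keys/{quote(key_id, safe='')}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
```

Fields left out of the body, such as `name` or the rate limit settings, are simply not part of the update request.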

Error Response

| HTTP code | Content-Type |
| --- | --- |
| 400, 500 | application/json |

| Field | Type | Example |
| --- | --- | --- |
| `status` | `int` | 400 |
| `title` | `string` | request body reader error |
| `type` | `string` | /errors/request-body-read |
| `detail` | `string` | something is wrong |
| `instance` | `string` | /api/key-management/keys |

Response

| Field | Type | Example | Description |
| --- | --- | --- | --- |
| `name` | `string` | spike's developer key | Name of the API key. |
| `createdAt` | `int64` | 1257894000 | Key configuration creation time as a Unix timestamp. |
| `updatedAt` | `int64` | 1257894000 | Key configuration update time as a Unix timestamp. |
| `revoked` | `boolean` | true | Indicator for whether the key is revoked. |
| `revokedReason` | `string` | The key has expired | Reason why the key is revoked. |
| `tags` | `[]string` | ["org-tag-12345"] | Identifiers associated with the key. |
| `keyId` | `string` | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
| `costLimitInUsd` | `float64` | 5.5 | Total spend limit of the API key. |
| `costLimitInUsdOverTime` | `float64` | 2 | Spend limit within a period of time. This field is required if `costLimitInUsdUnit` is specified. |
| `costLimitInUsdUnit` | `enum` | d | Time unit for `costLimitInUsdOverTime`. Possible values are [h, d]. |
| `rateLimitOverTime` | `int` | 2 | Rate limit over a period of time. This field is required if `rateLimitUnit` is specified. |
| `rateLimitUnit` | `string` | m | Time unit for `rateLimitOverTime`. Possible values are [h, m, s, d]. |
| `ttl` | `string` | 2h | Time to live. Available units are [s, m, h]. |

Create a provider setting: PUT /api/provider-settings

Description

This endpoint creates a provider setting.

Request

| Field | Required | Type | Example | Description |
| --- | --- | --- | --- | --- |
| `provider` | required | `enum` | openai | For now, this value can only be `openai`. |
| `setting` | required | `object` | `{ "apikey": "YOUR_OPENAI_KEY" }` | A map of values used for authenticating with the selected provider. |
| `setting.apikey` | required | `string` | xx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | This field is required if `provider` is `openai`. |

Error Response

| HTTP code | Content-Type |
| --- | --- |
| 400, 500 | application/json |

| Field | Type | Example |
| --- | --- | --- |
| `status` | `int` | 400 |
| `title` | `string` | request body reader error |
| `type` | `string` | /errors/request-body-read |
| `detail` | `string` | something is wrong |
| `instance` | `string` | /api/provider-settings |

Response

| Field | Type | Example | Description |
| --- | --- | --- | --- |
| `createdAt` | `int64` | 1699933571 | Unix timestamp for creation time. |
| `updatedAt` | `int64` | 1699933571 | Unix timestamp for update time. |
| `provider` | `enum` | openai | For now, this value can only be `openai`. |
| `id` | `string` | 98daa3ae-961d-4253-bf6a-322a32fdca3d | Unique identifier of the provider setting. |

Update a provider setting: PATCH /api/provider-settings/{id}

Description

This endpoint updates a provider setting.

Parameters

| Name | Required | Data type | Description |
| --- | --- | --- | --- |
| `id` | required | `string` | Unique identifier for the provider setting that you want to update. |

Request

| Field | Required | Type | Example | Description |
| --- | --- | --- | --- | --- |
| `provider` | required | `enum` | openai | For now, this value can only be `openai`. |
| `setting` | required | `object` | `{ "apikey": "YOUR_OPENAI_KEY" }` | A map of values used for authenticating with the selected provider. |
| `setting.apikey` | required | `string` | xx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | This field is required if `provider` is `openai`. |

Error Response

| HTTP code | Content-Type |
| --- | --- |
| 400, 500 | application/json |

| Field | Type | Example |
| --- | --- | --- |
| `status` | `int` | 400 |
| `title` | `string` | request body reader error |
| `type` | `string` | /errors/request-body-read |
| `detail` | `string` | something is wrong |
| `instance` | `string` | /api/provider-settings |

Response

| Field | Type | Example | Description |
| --- | --- | --- | --- |
| `createdAt` | `int64` | 1699933571 | Unix timestamp for creation time. |
| `updatedAt` | `int64` | 1699933571 | Unix timestamp for update time. |
| `provider` | `enum` | openai | For now, this value can only be `openai`. |
| `id` | `string` | 98daa3ae-961d-4253-bf6a-322a32fdca3d | Unique identifier of the provider setting. |

Retrieve Metrics: POST /api/reporting/events

Description

This endpoint retrieves aggregated metrics for a given array of key IDs and tags.

Request

| Field | Required | Type | Example | Description |
| --- | --- | --- | --- | --- |
| `keyIds` | required | `[]string` | `["key-1", "key-2", "key-3"]` | Array of IDs that specify the keys that you want to aggregate stats from. |
| `tags` | required | `[]string` | `["tag-1", "tag-2"]` | Array of tags that specify the keys that you want to aggregate stats from. |
| `start` | required | `int64` | 1699933571 | Start timestamp for the requested time series data. |
| `end` | required | `int64` | 1699933571 | End timestamp for the requested time series data. |
| `increment` | required | `int` | 60 | Increment in seconds for the requested time series data. |
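A request body for this endpoint can be assembled as in the sketch below. The function name `events_query` is hypothetical. The timestamps are assumed to be Unix seconds, matching the examples above, and the sanity checks are client-side conveniences rather than documented server rules.

```python
def events_query(key_ids, tags, start: int, end: int, increment: int = 60) -> dict:
    """Build the JSON body for POST /api/reporting/events."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    if increment <= 0:
        raise ValueError("increment must be a positive number of seconds")
    return {
        "keyIds": list(key_ids),
        "tags": list(tags),
        "start": int(start),   # assumed Unix seconds
        "end": int(end),       # assumed Unix seconds
        "increment": int(increment),
    }
```

For example, `events_query(["key-1"], ["tag-1"], now - 3600, now)` with `now = int(time.time())` describes the last hour in one-minute increments.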

Error Response

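| HTTP code | Content-Type |
| --- | --- |
| 500 | application/json |

| Field | Type | Example |
| --- | --- | --- |
| `status` | `int` | 400 |
| `title` | `string` | request body reader error |
| `type` | `string` | /errors/request-body-read |
| `detail` | `string` | something is wrong |
| `instance` | `string` | /api/reporting/events |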

Response

| Field | Type | Example | Description |
| --- | --- | --- | --- |
| `dataPoints` | `[]dataPoint` | `[{ "timeStamp": 1699933571, "numberOfRequests": 1, "costInUsd": 0.8, "latencyInMs": 600, "promptTokenCount": 0, "completionTokenCount": 0, "successCount": 1 }]` | Time series of aggregated data points. |
| `latencyInMsMedian` | `float64` | 656.7 | Median latency for the given time period. |
| `latencyInMs99th` | `float64` | 555.7 | 99th percentile latency for the given time period. |
| `dataPoints.[].timeStamp` | `int64` | 1699933571 | Timestamp of the data point. |
| `dataPoints.[].numberOfRequests` | `int64` | 1 | Aggregated number of HTTP requests over the given time increment. |
| `dataPoints.[].costInUsd` | `float64` | 0.8 | Aggregated cost of HTTP requests over the given time increment. |
| `dataPoints.[].latencyInMs` | `float64` | 600 | Aggregated latency of HTTP requests over the given time increment. |
| `dataPoints.[].promptTokenCount` | `int` | 0 | Aggregated prompt token count over the given time increment. |
| `dataPoints.[].completionTokenCount` | `int` | 0 | Aggregated completion token count over the given time increment. |
| `dataPoints.[].successCount` | `int` | 1 | Aggregated number of successful HTTP requests over the given time increment. |
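Once decoded, the response can be reduced to a few totals. The sketch below sums the per-increment data points into overall request, cost, token and success figures; the function name `summarize` and the shape of its result are hypothetical, while the field names come from the response fields above.

```python
def summarize(report: dict) -> dict:
    """Aggregate the dataPoints of a reporting response into overall totals."""
    points = report.get("dataPoints") or []
    requests = sum(p["numberOfRequests"] for p in points)
    successes = sum(p["successCount"] for p in points)
    return {
        "requests": requests,
        "costInUsd": sum(p["costInUsd"] for p in points),
        "tokens": sum(p["promptTokenCount"] + p["completionTokenCount"] for p in points),
        # No requests means there is no meaningful success rate.
        "successRate": successes / requests if requests else None,
    }
```

The median and 99th percentile latencies are already provided at the top level of the response, so they are not recomputed here.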

OpenAI Proxy

The OpenAI proxy runs on Port 8002.

Call OpenAI chat completions: POST /api/providers/openai/v1/chat/completions

Description

This endpoint is set up for proxying OpenAI API requests. The request and response formats are described in the OpenAI API reference for chat completions.

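Directories

Path
cmd/bricksllm
cmd/tool
internal/config
internal/encrypter
internal/errors
internal/event
internal/key
internal/logger
internal/logger/zap
internal/manager
internal/provider
internal/provider/openai
internal/recorder
internal/server/web
internal/stats
internal/storage
internal/storage/memdb
internal/storage/postgresql
internal/storage/redis
internal/util
internal/validator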
