BricksLLM: AI Gateway For Putting LLM In Production
BricksLLM is a cloud native AI gateway written in Go. Currently, it serves as a proxy only for OpenAI. Its main feature is letting you create API keys that have a rate limit, a cost limit, and a ttl (time to live). You can use these keys in both development and production to achieve fine-grained access control that OpenAI does not provide at the moment. The proxy is compatible with the OpenAI API and its SDKs.
The vision of BricksLLM is to support many more large language models, such as Llama 2, Claude, and PaLM 2, and to streamline LLM operations.
Roadmap
- Access control via API key with rate limit, cost limit and ttl
- Statsd integration 🚧
- Logging integration 🚧
- Routes configuration 🚧
- PII detection and masking 🚧
Getting Started
The easiest way to get started with BricksLLM is through BricksLLM-Docker.
Step 1 - Clone BricksLLM-Docker repository
git clone https://github.com/bricks-cloud/BricksLLM-Docker
Step 2 - Change to BricksLLM-Docker directory
cd BricksLLM-Docker
Step 3 - Export your OpenAI API key as an environment variable
export OPENAI_API_KEY=YOUR_OPENAI_API_KEY
Step 4 - Deploy BricksLLM with PostgreSQL and Redis
docker-compose up
To run this in detached mode, use the -d flag: docker-compose up -d
Congratulations, you are done!
Use the following command to create your first API key, my-secret-key, with a rate limit of 2 requests per minute and a total spend limit of 25 cents:
curl -X PUT http://localhost:8001/api/key-management/keys \
-H "Content-Type: application/json" \
-d '{
"name": "My Development Key",
"key": "my-secret-key",
"tags": ["mykey"],
"rateLimitOverTime": 2,
"rateLimitUnit": "m",
"costLimitInUsd": 0.25
}'
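The same key-creation request can be sketched in Python using only the standard library. This is a minimal illustration, not part of BricksLLM: the function names and the CONFIG_SERVER constant are hypothetical, and the address assumes the Docker setup above. The request is built separately from being sent so that it can be inspected without a running deployment.

```python
import json
import urllib.request

# Assumption: the configuration server listens here, as in the Docker setup.
CONFIG_SERVER = "http://localhost:8001"


def build_create_key_request(name, key, tags, rate_limit, rate_unit, cost_limit_usd):
    """Build (but do not send) the PUT request that creates a key."""
    payload = {
        "name": name,
        "key": key,
        "tags": tags,
        "rateLimitOverTime": rate_limit,
        "rateLimitUnit": rate_unit,
        "costLimitInUsd": cost_limit_usd,
    }
    return urllib.request.Request(
        url=f"{CONFIG_SERVER}/api/key-management/keys",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request, timeout=10):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With BricksLLM running, `send(build_create_key_request("My Development Key", "my-secret-key", ["mykey"], 2, "m", 0.25))` would issue the same call as the curl command.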
You can test your newly created API key by calling the BricksLLM OpenAI proxy:
curl -X POST http://localhost:8002/api/providers/openai/v1/chat/completions \
-H "Authorization: Bearer my-secret-key" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "system",
"content": "hi"
}
]
}'
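For callers that prefer Python over curl, the proxy request above might look like the following sketch. It uses only the standard library; the function names and the PROXY constant are hypothetical, and the address assumes the Docker setup. Note that the bearer token is the key created through BricksLLM, not the OpenAI API key itself.

```python
import json
import urllib.request

# Assumption: the OpenAI proxy listens here, as in the Docker setup.
PROXY = "http://localhost:8002"


def build_chat_request(bricks_key, model, messages):
    """Build (but do not send) a chat completion request for the proxy."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{PROXY}/api/providers/openai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {bricks_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the gateway running, `send(build_chat_request("my-secret-key", "gpt-3.5-turbo", [{"role": "system", "content": "hi"}]))` mirrors the curl command.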
Documentation
Environment variables
| Name | type | description | default |
|---|---|---|---|
| OPENAI_API_KEY | required | OpenAI API key | |
| POSTGRESQL_HOSTS | optional | Hosts for the PostgreSQL DB, separated by commas | localhost |
| POSTGRESQL_DB_NAME | optional | Name of the PostgreSQL DB | |
| POSTGRESQL_USERNAME | required | PostgreSQL DB username | |
| POSTGRESQL_PASSWORD | required | PostgreSQL DB password | |
| POSTGRESQL_SSL_MODE | optional | PostgreSQL SSL mode | disable |
| POSTGRESQL_PORT | optional | The port that the PostgreSQL DB runs on | 5432 |
| POSTGRESQL_READ_TIME_OUT | optional | Timeout for PostgreSQL read operations | 2s |
| POSTGRESQL_WRITE_TIME_OUT | optional | Timeout for PostgreSQL write operations | 1s |
| REDIS_HOSTS | required | Hosts for Redis, separated by commas | localhost |
| REDIS_PASSWORD | required | Redis password | |
| REDIS_PORT | optional | The port that Redis runs on | 6379 |
| REDIS_READ_TIME_OUT | optional | Timeout for Redis read operations | 1s |
| REDIS_WRITE_TIME_OUT | optional | Timeout for Redis write operations | 500ms |
| IN_MEMORY_DB_UPDATE_INTERVAL | optional | The interval at which the BricksLLM API gateway polls the PostgreSQL DB for the latest key configurations | 10s |
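To make the defaults in the table concrete, here is a rough Python sketch of how a deployment script might resolve these variables before starting the gateway. It is illustrative only and does not reflect BricksLLM's own Go implementation: load_settings and parse_duration are hypothetical helpers, and the duration format (a whole number followed by ms, s, m, or h) is an assumption inferred from the default values.

```python
import re

# Multipliers to milliseconds for the suffixes seen in the defaults (assumed).
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}

# Variables marked "required" in the table that have no default value.
REQUIRED = (
    "OPENAI_API_KEY",
    "POSTGRESQL_USERNAME",
    "POSTGRESQL_PASSWORD",
    "REDIS_PASSWORD",
)


def parse_duration(text):
    """Convert a string such as "500ms" or "2s" to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)] / 1000


def load_settings(env):
    """Resolve settings from a mapping such as os.environ."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing required variables: " + ", ".join(missing))
    return {
        "postgresql_hosts": env.get("POSTGRESQL_HOSTS", "localhost").split(","),
        "postgresql_port": int(env.get("POSTGRESQL_PORT", "5432")),
        "postgresql_read_timeout_s": parse_duration(env.get("POSTGRESQL_READ_TIME_OUT", "2s")),
        "postgresql_write_timeout_s": parse_duration(env.get("POSTGRESQL_WRITE_TIME_OUT", "1s")),
        "redis_hosts": env.get("REDIS_HOSTS", "localhost").split(","),
        "redis_port": int(env.get("REDIS_PORT", "6379")),
        "redis_read_timeout_s": parse_duration(env.get("REDIS_READ_TIME_OUT", "1s")),
        "redis_write_timeout_s": parse_duration(env.get("REDIS_WRITE_TIME_OUT", "500ms")),
        "poll_interval_s": parse_duration(env.get("IN_MEMORY_DB_UPDATE_INTERVAL", "10s")),
    }
```

Calling `load_settings(os.environ)` (after `import os`) fails fast when a required variable is missing, which can be easier to debug than a gateway that fails at startup.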
Configuration Endpoints
The configuration server runs on port 8001.
GET /api/key-management/keys?tag={tag}
Description
This endpoint is set up for retrieving key configurations using a query param called tag.
Parameters
| name | type | data type | description |
|---|---|---|---|
| tag | required | string | Identifier attached to a key configuration |
Error Response
| http code | content-type |
|---|---|
| 400, 500 | application/json |

| Field | type | example |
|---|---|---|
| status | number | 400 |
| title | string | request body reader error |
| type | string | /errors/request-body-read |
| detail | string | something is wrong |
| instance | string | /api/key-management/keys |
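A client can decode this error body into a small typed structure. The sketch below is a hypothetical helper, not part of BricksLLM; it assumes only the five fields listed in the table and fills in empty defaults when a field is absent.

```python
import json
from dataclasses import dataclass


@dataclass
class ErrorResponse:
    status: int
    title: str
    type: str
    detail: str
    instance: str


def parse_error(body):
    """Decode a JSON error body (str or bytes) into an ErrorResponse."""
    data = json.loads(body)
    return ErrorResponse(
        status=int(data.get("status", 0)),
        title=data.get("title", ""),
        type=data.get("type", ""),
        detail=data.get("detail", ""),
        instance=data.get("instance", ""),
    )
```

When calling the API with urllib, the body of a 400 or 500 response is available from `urllib.error.HTTPError.read()` and can be passed to `parse_error` directly.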
Response
| Response Body |
|---|
| []KeyConfiguration |
Fields of KeyConfiguration
| Field | type | example | description |
|---|---|---|---|
| name | string | spike's developer key | Name of the API key. |
| createdAt | number | 1257894000 | Key configuration creation time as a Unix timestamp. |
| updatedAt | number | 1257894000 | Key configuration update time as a Unix timestamp. |
| revoked | boolean | true | Indicator for whether the key is revoked. |
| revokedReason | string | The key has expired | Reason why the key is revoked. |
| tags | []string | ["org-tag-12345"] | Identifiers associated with the key. |
| keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
| costLimitInUsd | number | 5.5 | Total spend limit of the API key. |
| costLimitInUsdOverTime | string | 2 | Total spend allowed within a period of time. This field is required if costLimitInUsdUnit is specified. |
| costLimitInUsdUnit | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, m, s, d]. |
| rateLimitOverTime | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
| rateLimitUnit | string | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
| ttl | string | 2d | Time to live. Available units are [s, m, h]. |
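As a rough illustration of consuming this endpoint, the Python sketch below builds the query URL for a tag and filters a decoded response down to keys that are not revoked. Both helper names are hypothetical, and the default base URL assumes the Docker setup.

```python
import json
import urllib.parse


def keys_by_tag_url(tag, base_url="http://localhost:8001"):
    """Return the GET URL for key configurations with the given tag."""
    query = urllib.parse.urlencode({"tag": tag})
    return f"{base_url}/api/key-management/keys?{query}"


def usable_keys(response_body):
    """Decode a []KeyConfiguration body and keep the non-revoked entries."""
    configs = json.loads(response_body)
    return [config for config in configs if not config.get("revoked", False)]
```

Using `urlencode` rather than string concatenation keeps tags that contain spaces or other reserved characters from producing a malformed URL.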
PUT /api/key-management/keys
Description
This endpoint is set up for creating a key configuration.
Request
| Field | required | type | example | description |
|---|---|---|---|---|
| name | required | string | spike's developer key | Name of the API key. |
| tags | optional | []string | ["org-tag-12345"] | Identifiers associated with the key. |
| key | required | string | abcdef12345 | API key. |
| costLimitInUsd | optional | number | 5.5 | Total spend limit of the API key. |
| costLimitInUsdOverTime | optional | string | 2 | Total spend allowed within a period of time. This field is required if costLimitInUsdUnit is specified. |
| costLimitInUsdUnit | optional | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
| rateLimitOverTime | optional | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
| rateLimitUnit | optional | enum | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
| ttl | optional | string | 2d | Time to live. Available units are [s, m, h]. |
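The dependencies between these fields can be checked on the client before a request is sent. The following Python sketch encodes only the rules stated in the table above (required fields, allowed units, and the pairing of each unit with its amount). It is a hypothetical pre-check, and the server's own validation may be stricter or differ in detail.

```python
# Allowed units as listed in the request table.
COST_UNITS = {"h", "d"}
RATE_UNITS = {"h", "m", "s", "d"}


def _check_pair(payload, amount_field, unit_field, allowed, errors):
    """Check that a unit is valid and that its amount field is present."""
    if unit_field not in payload:
        return
    if payload[unit_field] not in allowed:
        errors.append(f"{unit_field} must be one of {sorted(allowed)}")
    if amount_field not in payload:
        errors.append(f"{amount_field} is required when {unit_field} is set")


def validate_create_key(payload):
    """Return a list of problems found in a create-key payload."""
    errors = []
    for field in ("name", "key"):
        if not payload.get(field):
            errors.append(f"{field} is required")
    _check_pair(payload, "costLimitInUsdOverTime", "costLimitInUsdUnit", COST_UNITS, errors)
    _check_pair(payload, "rateLimitOverTime", "rateLimitUnit", RATE_UNITS, errors)
    return errors
```

An empty list means the payload passed these checks; anything else can be shown to the caller before the request leaves the machine.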
Error Response
| http code | content-type |
|---|---|
| 400, 500 | application/json |

| Field | type | example |
|---|---|---|
| status | number | 400 |
| title | string | request body reader error |
| type | string | /errors/request-body-read |
| detail | string | something is wrong |
| instance | string | /api/key-management/keys |
Response
| Field | type | example | description |
|---|---|---|---|
| name | string | spike's developer key | Name of the API key. |
| createdAt | number | 1257894000 | Key configuration creation time as a Unix timestamp. |
| updatedAt | number | 1257894000 | Key configuration update time as a Unix timestamp. |
| revoked | boolean | true | Indicator for whether the key is revoked. |
| revokedReason | string | The key has expired | Reason why the key is revoked. |
| tags | []string | ["org-tag-12345"] | Identifiers associated with the key. |
| keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
| costLimitInUsd | number | 5.5 | Total spend limit of the API key. |
| costLimitInUsdOverTime | string | 2 | Total spend allowed within a period of time. This field is required if costLimitInUsdUnit is specified. |
| costLimitInUsdUnit | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
| rateLimitOverTime | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
| rateLimitUnit | string | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
| ttl | string | 2d | Time to live. Available units are [s, m, h]. |
PATCH /api/key-management/keys/{keyId}
Description
This endpoint is set up for updating key configurations using key id.
Parameters
| name | type | data type | description |
|---|---|---|---|
| keyId | required | string | Unique key configuration identifier. |
Request
| Field | required | type | example | description |
|---|---|---|---|---|
| name | optional | string | spike's developer key | Name of the API key. |
| tags | optional | []string | ["org-tag-12345"] | Identifiers associated with the key. |
| revoked | optional | boolean | true | Indicator for whether the key is revoked. |
| revokedReason | optional | string | The key has expired | Reason why the key is revoked. |
| costLimitInUsdOverTime | optional | string | 2 | Total spend allowed within a period of time. This field is required if costLimitInUsdUnit is specified. |
| costLimitInUsdUnit | optional | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
| rateLimitOverTime | optional | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
| rateLimitUnit | optional | enum | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
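One common use of this endpoint is revoking a key. The Python sketch below builds such a PATCH request from the revoked and revokedReason fields in the table. The function name is hypothetical and the default base URL assumes the Docker setup; the request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_revoke_request(key_id, reason, base_url="http://localhost:8001"):
    """Build (but do not send) a PATCH request that revokes a key."""
    payload = {"revoked": True, "revokedReason": reason}
    # Percent-encode the id so it is safe to place in the URL path.
    encoded_id = urllib.parse.quote(key_id, safe="")
    return urllib.request.Request(
        url=f"{base_url}/api/key-management/keys/{encoded_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
```

Passing the result to `urllib.request.urlopen` against a running configuration server would apply the update. The keyId value comes from the keyId field returned when the key was created or listed.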
Error Response
| http code | content-type |
|---|---|
| 400, 500 | application/json |

| Field | type | example |
|---|---|---|
| status | number | 400 |
| title | string | request body reader error |
| type | string | /errors/request-body-read |
| detail | string | something is wrong |
| instance | string | /api/key-management/keys |
Response
| Field | type | example | description |
|---|---|---|---|
| name | string | spike's developer key | Name of the API key. |
| createdAt | number | 1257894000 | Key configuration creation time as a Unix timestamp. |
| updatedAt | number | 1257894000 | Key configuration update time as a Unix timestamp. |
| revoked | boolean | true | Indicator for whether the key is revoked. |
| revokedReason | string | The key has expired | Reason why the key is revoked. |
| tags | []string | ["org-tag-12345"] | Identifiers associated with the key. |
| keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
| costLimitInUsd | number | 5.5 | Total spend limit of the API key. |
| costLimitInUsdOverTime | string | 2 | Total spend allowed within a period of time. This field is required if costLimitInUsdUnit is specified. |
| costLimitInUsdUnit | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
| rateLimitOverTime | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
| rateLimitUnit | string | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
| ttl | string | 2d | Time to live. Available units are [s, m, h]. |
OpenAI Proxy
The OpenAI proxy runs on port 8002.
POST /api/providers/openai/v1/chat/completions
Description
This endpoint is set up for proxying OpenAI API requests. Documentation for this endpoint can be found in the OpenAI API reference for chat completions.