BricksLLM: AI Gateway For Putting LLM In Production
BricksLLM is a cloud-native AI gateway written in Go. Currently, it serves as a proxy to OpenAI. It lets you create API keys that have rate limits, cost limits and TTLs. These API keys can be used in both development and production to achieve fine-grained access control that OpenAI does not provide at the moment. The proxy is compatible with the OpenAI API and its SDKs.
The vision of BricksLLM is to support many more large language models, such as Llama 2, Claude and PaLM 2, and to streamline LLM operations.
Roadmap
- Access control via API key with rate limit, cost limit and TTL
- Logging integration
- Statsd integration 🚧
- Routes configuration 🚧
- PII detection and masking 🚧
Getting Started
The easiest way to get started with BricksLLM is through BricksLLM-Docker.
Step 1 - Clone BricksLLM-Docker repository
git clone https://github.com/bricks-cloud/BricksLLM-Docker
Step 2 - Change to BricksLLM-Docker directory
cd BricksLLM-Docker
Step 3 - Export your OpenAI API key as an environment variable
export OPENAI_API_KEY=YOUR_OPENAI_API_KEY
Step 4 - Deploy BricksLLM with PostgreSQL and Redis
docker-compose up
You can run this in detached mode using the -d flag: docker-compose up -d
Congratulations, you are done!
Create an API key through the create key endpoint. For example, create a key with a rate limit of 2 req/min and a spend limit of 25 cents.
curl -X PUT http://localhost:8001/api/key-management/keys \
  -H "Content-Type: application/json" \
  -d '{
        "name": "My Development Key",
        "key": "my-secret-key",
        "tags": ["mykey"],
        "rateLimitOverTime": 2,
        "rateLimitUnit": "m",
        "costLimitInUsd": 0.25
      }'
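The same request can also be issued from a script. The following is a minimal Python sketch using only the standard library, assuming the configuration server from the Docker setup is reachable at http://localhost:8001. The helper names build_create_key_request and create_key are illustrative and not part of BricksLLM.

```python
import json
import urllib.request

# Assumption: the configuration server from the Docker setup listens here.
CONFIG_BASE_URL = "http://localhost:8001"


def build_create_key_request(base_url: str, key_config: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates a key."""
    body = json.dumps(key_config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/key-management/keys",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def create_key(base_url: str, key_config: dict) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_create_key_request(base_url, key_config)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the curl example: 2 requests per minute, 25 cent spend limit.
example_key = {
    "name": "My Development Key",
    "key": "my-secret-key",
    "tags": ["mykey"],
    "rateLimitOverTime": 2,
    "rateLimitUnit": "m",
    "costLimitInUsd": 0.25,
}
```

With the gateway running, create_key(CONFIG_BASE_URL, example_key) would send the request; build_create_key_request alone is enough to inspect what goes over the wire.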
Then, just redirect your requests to us and use OpenAI as you would normally. For example:
curl -X POST http://localhost:8002/api/providers/openai/v1/chat/completions \
  -H "Authorization: Bearer my-secret-key" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
          {
            "role": "system",
            "content": "hi"
          }
        ]
      }'
Or, if you're using an SDK, you could change its baseURL to point to us. For example:
// OpenAI Node SDK v4
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: "my-secret-key", // key created earlier
  baseURL: "http://localhost:8002/api/providers/openai/v1", // redirect to us
});
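If you are not using an SDK, the proxy can be called with any HTTP client. Below is a hedged Python sketch using only the standard library, assuming the proxy runs at http://localhost:8002 and that my-secret-key is the key created earlier. The function names are illustrative only.

```python
import json
import urllib.request

# Assumption: the OpenAI proxy from the Docker setup listens here.
PROXY_BASE_URL = "http://localhost:8002/api/providers/openai/v1"


def build_chat_request(base_url: str, bricks_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build a chat completions request authenticated with a BricksLLM key."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # The BricksLLM key goes where the OpenAI key would normally go.
            "Authorization": f"Bearer {bricks_key}",
            "Content-Type": "application/json",
        },
    )


def chat(base_url: str, bricks_key: str, model: str, messages: list) -> dict:
    """Send the request through the proxy and return the decoded JSON body."""
    request = build_chat_request(base_url, bricks_key, model, messages)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, chat(PROXY_BASE_URL, "my-secret-key", "gpt-3.5-turbo", [{"role": "system", "content": "hi"}]) mirrors the curl call above.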
Documentation
Environment variables
Name | type | description | default |
OPENAI_API_KEY | required | OpenAI API key | |
POSTGRESQL_HOSTS | optional | Hosts for the PostgreSQL DB, separated by , | localhost |
POSTGRESQL_DB_NAME | optional | Name of the PostgreSQL DB | |
POSTGRESQL_USERNAME | required | PostgreSQL DB username | |
POSTGRESQL_PASSWORD | required | PostgreSQL DB password | |
POSTGRESQL_SSL_MODE | optional | PostgreSQL SSL mode | disable |
POSTGRESQL_PORT | optional | The port that the PostgreSQL DB runs on | 5432 |
POSTGRESQL_READ_TIME_OUT | optional | Timeout for PostgreSQL read operations | 2s |
POSTGRESQL_WRITE_TIME_OUT | optional | Timeout for PostgreSQL write operations | 1s |
REDIS_HOSTS | required | Hosts for Redis, separated by , | localhost |
REDIS_PASSWORD | required | Redis password | |
REDIS_PORT | optional | The port that the Redis DB runs on | 6379 |
REDIS_READ_TIME_OUT | optional | Timeout for Redis read operations | 1s |
REDIS_WRITE_TIME_OUT | optional | Timeout for Redis write operations | 500ms |
IN_MEMORY_DB_UPDATE_INTERVAL | optional | The interval at which the BricksLLM API gateway polls the PostgreSQL DB for the latest key configurations | 10s |
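To make the defaults above concrete, here is a small Python sketch of how a deployment script might check these variables before starting the gateway. It is not how BricksLLM itself reads its configuration. It assumes timeouts are written as an integer followed by ms, s, m or h, which is inferred from the defaults shown (2s, 500ms, 10s). All function and key names are hypothetical.

```python
import os
import re

# Defaults copied from the table above.
DEFAULTS = {
    "POSTGRESQL_HOSTS": "localhost",
    "POSTGRESQL_SSL_MODE": "disable",
    "POSTGRESQL_PORT": "5432",
    "POSTGRESQL_READ_TIME_OUT": "2s",
    "POSTGRESQL_WRITE_TIME_OUT": "1s",
    "REDIS_HOSTS": "localhost",
    "REDIS_PORT": "6379",
    "REDIS_READ_TIME_OUT": "1s",
    "REDIS_WRITE_TIME_OUT": "500ms",
    "IN_MEMORY_DB_UPDATE_INTERVAL": "10s",
}

# Required variables that have no default in the table.
REQUIRED_WITHOUT_DEFAULT = (
    "OPENAI_API_KEY",
    "POSTGRESQL_USERNAME",
    "POSTGRESQL_PASSWORD",
    "REDIS_PASSWORD",
)

_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}
_DURATION = re.compile(r"^(\d+)(ms|s|m|h)$")


def parse_duration_ms(text: str) -> int:
    """Convert a duration such as '500ms' or '2s' to whole milliseconds."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_MS[unit]


def load_config(env) -> dict:
    """Validate required variables and apply the documented defaults."""
    missing = [name for name in REQUIRED_WITHOUT_DEFAULT if not env.get(name)]
    if missing:
        raise KeyError("missing required variables: " + ", ".join(missing))

    def get(name: str) -> str:
        return env.get(name) or DEFAULTS[name]

    return {
        "postgresql_hosts": get("POSTGRESQL_HOSTS").split(","),
        "postgresql_port": int(get("POSTGRESQL_PORT")),
        "postgresql_read_timeout_ms": parse_duration_ms(get("POSTGRESQL_READ_TIME_OUT")),
        "redis_hosts": get("REDIS_HOSTS").split(","),
        "redis_port": int(get("REDIS_PORT")),
        "redis_write_timeout_ms": parse_duration_ms(get("REDIS_WRITE_TIME_OUT")),
        "update_interval_ms": parse_duration_ms(get("IN_MEMORY_DB_UPDATE_INTERVAL")),
    }
```

Calling load_config(os.environ) fails fast with the list of missing variables instead of letting the containers start with an incomplete environment.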
Configuration Endpoints
The configuration server runs on port 8001.
Get keys: GET /api/key-management/keys?tag={tag}
Description
This endpoint is set up for retrieving key configurations using a query param called tag.
Parameters
name | type | data type | description |
tag | required | string | Identifier attached to a key configuration |
Error Response
http code | content-type |
400, 500 | application/json |

Field | type | example |
status | number | 400 |
title | string | request body reader error |
type | string | /errors/request-body-read |
detail | string | something is wrong |
instance | string | /api/key-management/keys |
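Because every configuration endpoint documents the same error fields, a client can decode them once. The Python sketch below is an illustration under the assumption that the error body is a flat JSON object with exactly the five fields listed above. GatewayError and parse_gateway_error are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayError:
    """Error body fields as documented in the table above."""
    status: int
    title: str
    type: str
    detail: str
    instance: str


def parse_gateway_error(body: bytes) -> GatewayError:
    """Decode a JSON error body, tolerating missing fields."""
    data = json.loads(body)
    return GatewayError(
        status=int(data.get("status", 0)),
        title=str(data.get("title", "")),
        type=str(data.get("type", "")),
        detail=str(data.get("detail", "")),
        instance=str(data.get("instance", "")),
    )
```

With urllib, the body of a failed call is available from the HTTPError exception via its read() method and can be passed straight to parse_gateway_error.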
Response
Response Body | []KeyConfiguration |
Fields of KeyConfiguration
Field | type | example | description |
name | string | spike's developer key | Name of the API key. |
createdAt | number | 1257894000 | Key configuration creation time in unix. |
updatedAt | number | 1257894000 | Key configuration update time in unix. |
revoked | boolean | true | Indicator for whether the key is revoked. |
revokedReason | string | The key has expired | Reason why the key is revoked. |
tags | []string | ["org-tag-12345"] | Identifiers associated with the key. |
keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
costLimitInUsd | number | 5.5 | Total spend limit of the API key. |
costLimitInUsdOverTime | string | 2 | Total spend within a period of time. This field is required if costLimitInUsdUnit is specified. |
costLimitInUsdUnit | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, m, s, d]. |
rateLimitOverTime | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
rateLimitUnit | string | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
ttl | string | 2d | Time to live. Available units are [s, m, h]. |
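As a rough illustration of this endpoint, the Python sketch below builds the query URL for a tag and filters the returned list down to keys that are not revoked. It assumes the configuration server runs at http://localhost:8001. The helper names are made up for this example.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: the configuration server from the Docker setup listens here.
CONFIG_BASE_URL = "http://localhost:8001"


def build_get_keys_url(base_url: str, tag: str) -> str:
    """Return the Get keys URL with the tag safely URL-encoded."""
    return f"{base_url}/api/key-management/keys?{urlencode({'tag': tag})}"


def get_keys(base_url: str, tag: str) -> list:
    """Fetch all key configurations that carry the given tag."""
    with urllib.request.urlopen(build_get_keys_url(base_url, tag)) as response:
        return json.loads(response.read().decode("utf-8"))


def active_keys(key_configs: list) -> list:
    """Keep only key configurations whose revoked flag is not set."""
    return [config for config in key_configs if not config.get("revoked", False)]
```

For example, active_keys(get_keys(CONFIG_BASE_URL, "org-tag-12345")) would list the keys of that tag that can still be used.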
Create key: PUT /api/key-management/keys
Description
This endpoint is set up for creating a key configuration.
Request
Field | required | type | example | description |
name | required | string | spike's developer key | Name of the API key. |
tags | optional | []string | ["org-tag-12345"] | Identifiers associated with the key. |
key | required | string | abcdef12345 | API key |
costLimitInUsd | optional | number | 5.5 | Total spend limit of the API key. |
costLimitInUsdOverTime | optional | string | 2 | Total spend within a period of time. This field is required if costLimitInUsdUnit is specified. |
costLimitInUsdUnit | optional | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
rateLimitOverTime | optional | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
rateLimitUnit | optional | enum | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
ttl | optional | string | 2d | Time to live. Available units are [s, m, h]. |
Error Response
http code | content-type |
400, 500 | application/json |

Field | type | example |
status | number | 400 |
title | string | request body reader error |
type | string | /errors/request-body-read |
detail | string | something is wrong |
instance | string | /api/key-management/keys |
Response
Field | type | example | description |
name | string | spike's developer key | Name of the API key. |
createdAt | number | 1257894000 | Key configuration creation time in unix. |
updatedAt | number | 1257894000 | Key configuration update time in unix. |
revoked | boolean | true | Indicator for whether the key is revoked. |
revokedReason | string | The key has expired | Reason why the key is revoked. |
tags | []string | ["org-tag-12345"] | Identifiers associated with the key. |
keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
costLimitInUsd | number | 5.5 | Total spend limit of the API key. |
costLimitInUsdOverTime | string | 2 | Total spend within a period of time. This field is required if costLimitInUsdUnit is specified. |
costLimitInUsdUnit | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
rateLimitOverTime | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
rateLimitUnit | string | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
ttl | string | 2d | Time to live. Available units are [s, m, h]. |
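The request rules for creating a key (name and key are required, and each ...OverTime field is required whenever its unit is given) can be checked on the client before calling the endpoint. The following Python sketch encodes only the rules stated in the Create key request table. It is a hypothetical helper and not the validation that BricksLLM performs server-side.

```python
# Allowed units as listed in the Create key request table.
RATE_LIMIT_UNITS = {"h", "m", "s", "d"}
COST_LIMIT_UNITS = {"h", "d"}


def validate_create_key(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []

    # name and key are the only required fields.
    for field in ("name", "key"):
        if not payload.get(field):
            errors.append(f"{field} is required")

    # Each unit must be a known value and must come with its over-time field.
    pairs = (
        ("rateLimitOverTime", "rateLimitUnit", RATE_LIMIT_UNITS),
        ("costLimitInUsdOverTime", "costLimitInUsdUnit", COST_LIMIT_UNITS),
    )
    for over_time, unit, allowed in pairs:
        if unit in payload:
            if payload[unit] not in allowed:
                errors.append(f"{unit} must be one of {sorted(allowed)}")
            if over_time not in payload:
                errors.append(f"{over_time} is required when {unit} is specified")

    return errors
```

A caller would typically refuse to send the PUT request whenever the returned list is not empty.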
Update key: PATCH /api/key-management/keys/{keyId}
Description
This endpoint is set up for updating key configurations using key id.
Parameters
name | type | data type | description |
keyId | required | string | Unique key configuration identifier. |
Request
Field | required | type | example | description |
name | optional | string | spike's developer key | Name of the API key. |
tags | optional | []string | ["org-tag-12345"] | Identifiers associated with the key. |
revoked | optional | boolean | true | Indicator for whether the key is revoked. |
revokedReason | optional | string | The key has expired | Reason why the key is revoked. |
costLimitInUsdOverTime | optional | string | 2 | Total spend within a period of time. This field is required if costLimitInUsdUnit is specified. |
costLimitInUsdUnit | optional | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
rateLimitOverTime | optional | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
rateLimitUnit | optional | enum | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
Error Response
http code | content-type |
400, 500 | application/json |

Field | type | example |
status | number | 400 |
title | string | request body reader error |
type | string | /errors/request-body-read |
detail | string | something is wrong |
instance | string | /api/key-management/keys |
Response
Field | type | example | description |
name | string | spike's developer key | Name of the API key. |
createdAt | number | 1257894000 | Key configuration creation time in unix. |
updatedAt | number | 1257894000 | Key configuration update time in unix. |
revoked | boolean | true | Indicator for whether the key is revoked. |
revokedReason | string | The key has expired | Reason why the key is revoked. |
tags | []string | ["org-tag-12345"] | Identifiers associated with the key. |
keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
costLimitInUsd | number | 5.5 | Total spend limit of the API key. |
costLimitInUsdOverTime | string | 2 | Total spend within a period of time. This field is required if costLimitInUsdUnit is specified. |
costLimitInUsdUnit | enum | d | Time unit for costLimitInUsdOverTime. Possible values are [h, d]. |
rateLimitOverTime | string | 2 | Rate limit over a period of time. This field is required if rateLimitUnit is specified. |
rateLimitUnit | string | m | Time unit for rateLimitOverTime. Possible values are [h, m, s, d]. |
ttl | string | 2d | Time to live. Available units are [s, m, h]. |
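A common use of the update endpoint is revoking a key. The Python sketch below shows one way to build that PATCH request with the standard library, assuming the configuration server runs at http://localhost:8001. build_revoke_request and revoke_key are illustrative names, and the reason text is a placeholder.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the configuration server from the Docker setup listens here.
CONFIG_BASE_URL = "http://localhost:8001"


def build_revoke_request(base_url: str, key_id: str, reason: str) -> urllib.request.Request:
    """Build a PATCH request that marks the key as revoked."""
    body = {"revoked": True, "revokedReason": reason}
    return urllib.request.Request(
        # quote() keeps the key id from breaking the URL path.
        url=f"{base_url}/api/key-management/keys/{quote(key_id, safe='')}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


def revoke_key(base_url: str, key_id: str, reason: str) -> dict:
    """Send the request and return the updated key configuration."""
    request = build_revoke_request(base_url, key_id, reason)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Only the fields being changed are sent, since every field in the update request is optional.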
Get Key Reporting: GET /api/reporting/keys/{keyId}
Description
This endpoint is set up for retrieving the cumulative OpenAI cost of an API key.
Parameters
name | type | data type | description |
keyId | required | string | Unique key configuration identifier. |
Error Response
http code | content-type |
400, 500, 404 | application/json |

Field | type | example |
status | number | 400 |
title | string | request body reader error |
type | string | /errors/request-body-read |
detail | string | something is wrong |
instance | string | /api/key-management/keys |
Response
Field | type | example | description |
keyId | string | 550e8400-e29b-41d4-a716-446655440000 | Unique identifier for the key. |
costInMicroDollars | number | 55 | Cumulative spend of the API key in micro dollars. |
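Since the report is expressed in micro dollars while cost limits are expressed in USD, a small conversion helps when comparing the two. The Python sketch below assumes 1 USD equals 1,000,000 micro dollars and uses Decimal to avoid floating-point rounding. The function names are hypothetical.

```python
from decimal import Decimal

MICRO_DOLLARS_PER_USD = Decimal(1_000_000)


def micro_dollars_to_usd(cost_in_micro_dollars: int) -> Decimal:
    """Convert the reported costInMicroDollars value to USD."""
    return Decimal(cost_in_micro_dollars) / MICRO_DOLLARS_PER_USD


def remaining_budget_usd(cost_limit_in_usd, cost_in_micro_dollars: int) -> Decimal:
    """Return how much of a key's costLimitInUsd is still unspent."""
    # str() keeps values such as 5.5 exact when converting from float.
    limit = Decimal(str(cost_limit_in_usd))
    return limit - micro_dollars_to_usd(cost_in_micro_dollars)
```

For the example values in this document, a key with costLimitInUsd of 5.5 and a reported costInMicroDollars of 55 has spent 0.000055 USD.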
OpenAI Proxy
The OpenAI proxy runs on port 8002.
Call OpenAI chat completions: POST /api/providers/openai/v1/chat/completions
Description
This endpoint is set up for proxying OpenAI API requests. Documentation for this endpoint can be found here.