# AIKit ✨
![](https://github.com/sozercan/aikit/raw/v0.10.0/website/static/img/logo.png)
AIKit is a one-stop shop to quickly get started with hosting, deploying, building, and fine-tuning large language models (LLMs).

AIKit offers two main capabilities:

- **Inference**: AIKit uses LocalAI, which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as Kubectl AI, Chatbot-UI, and many more, to send requests to open LLMs (see the sketch after this list)!
- **Fine-Tuning**: AIKit offers an extensible fine-tuning interface. It supports Unsloth for a fast, memory-efficient, and easy fine-tuning experience.
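As a minimal illustration of that OpenAI-compatible REST API (a sketch, assuming a model container from the Quick Start below is already running on port 8080), even plain `curl` can act as the client:

```bash
# List the models the local server exposes through the standard
# OpenAI-style /v1/models endpoint.
curl http://localhost:8080/v1/models
```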
👉 For full documentation, please see the AIKit website!
## Features
## Quick Start
You can get started with AIKit quickly on your local machine without a GPU!
```bash
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b
```

Once the container is up, send an OpenAI-style chat completion request:

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```
Output should be similar to:
```json
{
  // ...
  "model": "llama-3-8b-instruct",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
      }
    }
  ],
  // ...
}
```
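The same endpoint also supports the standard OpenAI streaming flag. A hedged variant of the request above (LocalAI streams responses as server-sent events):

```bash
# With "stream": true the response arrives as incremental "data:" chunks
# instead of a single JSON body.
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3-8b-instruct",
    "stream": true,
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```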
That's it! 🎉 The API is OpenAI compatible, so this is a drop-in replacement for any OpenAI API compatible client.
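Because of that compatibility, many clients can be repointed without code changes. A minimal sketch, assuming the client honors the official OpenAI SDK environment variables (an SDK convention, not an AIKit requirement):

```bash
# Point any OpenAI-SDK-based client at the local AIKit container.
export OPENAI_BASE_URL=http://localhost:8080/v1
export OPENAI_API_KEY=not-needed   # placeholder; LocalAI does not require a real key by default
```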
## Pre-made Models
AIKit comes with pre-made models that you can use out-of-the-box!
If it doesn't include a specific model, you can always create your own images and host them in a container registry of your choice!
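As a rough sketch of what building your own image looks like (AIKit images are built with Docker BuildKit from a YAML configuration; the file name, schema, and model URL below are illustrative assumptions, so check the AIKit website for the actual format):

```bash
# Hypothetical aikitfile: the #syntax directive tells BuildKit to use the
# AIKit frontend; the field names below are assumptions for illustration.
cat > aikitfile.yaml <<'EOF'
#syntax=ghcr.io/sozercan/aikit:latest
apiVersion: v1alpha1
models:
  - name: my-model                                   # hypothetical model name
    source: https://example.com/path/to/model.gguf   # hypothetical source URL
EOF

# Build the image locally, then push it to a registry of your choice.
docker buildx build . -f aikitfile.yaml -t registry.example.com/my-model:latest --load
docker push registry.example.com/my-model:latest
```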
### CPU
| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
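Every command above starts a detached container (`-d`) that is removed when it stops (`--rm`), so standard Docker housekeeping applies while you try these out, for example:

```bash
docker ps                      # find the running model container's ID
docker logs -f <container-id>  # follow startup and model-loading output
docker stop <container-id>     # stop the container (auto-removed due to --rm)
```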
### NVIDIA CUDA
> [!NOTE]
> To enable GPU acceleration, please see GPU Acceleration.
> Please note that the only difference between the CPU and GPU sections is the `--gpus all` flag in the command, which enables GPU acceleration.
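Before pulling a large model, it may help to confirm that Docker can see the GPU at all. A quick sanity check (an assumption about the host setup: this requires the NVIDIA Container Toolkit and is not an AIKit-specific step):

```bash
# If this prints the GPU table, the --gpus all flag used below will work.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```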
| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
## What's next?
👉 For more information, including how to fine-tune models or create your own images, please see the AIKit website!