# AIKit ✨
![](https://github.com/sozercan/aikit/raw/v0.10.0/website/static/img/logo.png)
AIKit is a one-stop shop to quickly get started with hosting, deploying, building, and fine-tuning large language models (LLMs).

AIKit offers two main capabilities:

- **Inference**: AIKit uses LocalAI, which supports a wide range of inference capabilities and formats. LocalAI provides a drop-in replacement REST API that is OpenAI API compatible, so you can use any OpenAI API compatible client, such as Kubectl AI, Chatbot-UI, and many more, to send requests to open LLMs (see the sketch after this list)!
- **Fine-Tuning**: AIKit offers an extensible fine-tuning interface. It supports Unsloth for a fast, memory-efficient, and easy fine-tuning experience.
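As a minimal illustration of that OpenAI-compatible REST API (a sketch, assuming a model container from the Quick Start below is already running on port 8080), even plain `curl` can act as the client:

```bash
# List the models the local server exposes through the standard
# OpenAI-style /v1/models endpoint.
curl http://localhost:8080/v1/models
```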
👉 For full documentation, please see the AIKit website!
## Features
## Quick Start
You can get started with AIKit quickly on your local machine without a GPU!
```bash
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b
```

Once the container is up, send an OpenAI-style chat completion request:

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3-8b-instruct",
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```
Output should be similar to:
```json
{
  // ...
  "model": "llama-3-8b-instruct",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of applications and services, allowing developers to focus on writing code rather than managing infrastructure."
      }
    }
  ],
  // ...
}
```
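The same endpoint also supports the standard OpenAI streaming flag. A hedged variant of the request above (LocalAI streams responses as server-sent events):

```bash
# With "stream": true the response arrives as incremental "data:" chunks
# instead of a single JSON body.
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "llama-3-8b-instruct",
    "stream": true,
    "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}]
  }'
```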
That's it! 🎉 The API is OpenAI compatible, so this is a drop-in replacement for any OpenAI API compatible client.
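Because of that compatibility, many clients can be repointed without code changes. A minimal sketch, assuming the client honors the official OpenAI SDK environment variables (an SDK convention, not an AIKit requirement):

```bash
# Point any OpenAI-SDK-based client at the local AIKit container.
export OPENAI_BASE_URL=http://localhost:8080/v1
export OPENAI_API_KEY=not-needed   # placeholder; LocalAI does not require a real key by default
```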
## Pre-made Models
AIKit comes with pre-made models that you can use out-of-the-box!
If it doesn't include a specific model, you can always create your own images and host them in a container registry of your choice!
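As a rough sketch of what building your own image looks like (AIKit images are built with Docker BuildKit from a YAML configuration; the file name, schema, and model URL below are illustrative assumptions, so check the AIKit website for the actual format):

```bash
# Hypothetical aikitfile: the #syntax directive tells BuildKit to use the
# AIKit frontend; the field names below are assumptions for illustration.
cat > aikitfile.yaml <<'EOF'
#syntax=ghcr.io/sozercan/aikit:latest
apiVersion: v1alpha1
models:
  - name: my-model                                   # hypothetical model name
    source: https://example.com/path/to/model.gguf   # hypothetical source URL
EOF

# Build the image locally, then push it to a registry of your choice.
docker buildx build . -f aikitfile.yaml -t registry.example.com/my-model:latest --load
docker push registry.example.com/my-model:latest
```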
### CPU
| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
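Every command above starts a detached container (`-d`) that is removed when it stops (`--rm`), so standard Docker housekeeping applies while you try these out, for example:

```bash
docker ps                      # find the running model container's ID
docker logs -f <container-id>  # follow startup and model-loading output
docker stop <container-id>     # stop the container (auto-removed due to --rm)
```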
### NVIDIA CUDA
> [!NOTE]
> To enable GPU acceleration, please see GPU Acceleration.
> Please note that the only difference between the CPU and GPU sections is the `--gpus all` flag in the command, which enables GPU acceleration.
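Before pulling a large model, it may help to confirm that Docker can see the GPU at all. A quick sanity check (an assumption about the host setup: this requires the NVIDIA Container Toolkit and is not an AIKit-specific step):

```bash
# If this prints the GPU table, the --gpus all flag used below will work.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```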
| Model | Optimization | Parameters | Command | Model Name | License |
|-------|--------------|------------|---------|------------|---------|
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
## What's next?
👉 For more information, including how to fine-tune models or create your own images, please see the AIKit website!