Serge

04 May, 2024
  Svelte

What is Serge?

Serge is a chat interface built on llama.cpp for running GGUF models. It needs no API keys and is entirely self-hosted.

  • 🌐 SvelteKit frontend

  • 💾 Redis for storing chat history & parameters

  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings



⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/
volumes:
  weights:
  datadb:

Then visit http://localhost:8008. The API documentation is available at http://localhost:8008/api/docs.
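
The container may need a moment after `docker run` or `docker compose up` before the web UI answers. As a minimal sketch (not part of Serge itself), the following Python helper polls the address above until it gets any HTTP response. The function name `wait_for_serge` and the default timeout and interval values are assumptions chosen for illustration; only the URL comes from the instructions above.

```python
import time
import urllib.error
import urllib.request


def wait_for_serge(base_url="http://localhost:8008", timeout=60.0, interval=2.0):
    """Poll base_url until it answers over HTTP or `timeout` seconds elapse.

    Hypothetical helper: returns True as soon as the server sends any HTTP
    response, False if nothing answered before the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(base_url, timeout=interval):
                return True  # server answered with a success status
        except urllib.error.HTTPError:
            # An HTTP error status still proves the server is listening.
            return True
        except (urllib.error.URLError, OSError):
            pass  # connection refused or timed out: not ready yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_serge()` with no arguments checks the default address from the quick start. If it returns `False`, `docker logs serge` shows the container output, assuming the container name `serge` used above.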

🖥️ Windows

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

☁️ Kubernetes

Instructions for setting up Serge on Kubernetes can be found in the wiki.


🧠 Supported Models

Category        Models
Alfred          40B-1023
Code            13B, 33B
CodeLLaMA       7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python
Falcon          7B, 7B-Instruct, 40B, 40B-Instruct
LLaMA 2         7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST
Med42           70B
Medalpaca       13B
Medicine-LLM    13B
Meditron        7B, 7B-Chat, 70B
Mistral         7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca
MistralLite     7B
Mixtral         8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1
Neural-Chat     7B-v3.3
Notus           7B-v1
Notux           8x7b-v1
OpenChat        7B-v3.5-1210
OpenLLaMA       3B-v2, 7B-v2, 13B-v2
Orca 2          7B, 13B
Phi 2           2.7B
Python Code     13B, 33B
PsyMedRP        13B-v1, 20B-v1
Starling LM     7B-Alpha
Vicuna          7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder
WizardLM        7B-v1.0, 13B-v1.2, 70B-v1.0
Zephyr          3B, 7B-Alpha, 7B-Beta

Additional models can be requested by opening a GitHub issue.