

Lite LLM
8.5k 936What is LiteLLM ?
Lite LLM allow you to call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, etc.]
LiteLLM manages
- Translate inputs to provider’s
completion
,embedding
, andimage_generation
endpoints - Consistent output, text responses will always be available at
['choices'][0]['message']['content']
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router
- Set Budgets & Rate limits per project, api key, model OpenAI Proxy Server
Jump to OpenAI Proxy Docs Jump to Supported LLM Providers
Supported Providers (Docs)
Provider | Completion | Streaming | Async Completion | Async Streaming | Async Embedding | Async Image Generation |
---|---|---|---|---|---|---|
openai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
azure | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
aws - sagemaker | ✅ | ✅ | ✅ | ✅ | ✅ | |
aws - bedrock | ✅ | ✅ | ✅ | ✅ | ✅ | |
google - vertex_ai [Gemini] | ✅ | ✅ | ✅ | ✅ | ||
google - palm | ✅ | ✅ | ✅ | ✅ | ||
google AI Studio - gemini | ✅ | ✅ | ✅ | ✅ | ||
mistral ai api | ✅ | ✅ | ✅ | ✅ | ✅ | |
cloudflare AI Workers | ✅ | ✅ | ✅ | ✅ | ||
cohere | ✅ | ✅ | ✅ | ✅ | ✅ | |
anthropic | ✅ | ✅ | ✅ | ✅ | ||
huggingface | ✅ | ✅ | ✅ | ✅ | ✅ | |
replicate | ✅ | ✅ | ✅ | ✅ | ||
together_ai | ✅ | ✅ | ✅ | ✅ | ||
openrouter | ✅ | ✅ | ✅ | ✅ | ||
ai21 | ✅ | ✅ | ✅ | ✅ | ||
baseten | ✅ | ✅ | ✅ | ✅ | ||
vllm | ✅ | ✅ | ✅ | ✅ | ||
nlp_cloud | ✅ | ✅ | ✅ | ✅ | ||
aleph alpha | ✅ | ✅ | ✅ | ✅ | ||
petals | ✅ | ✅ | ✅ | ✅ | ||
ollama | ✅ | ✅ | ✅ | ✅ | ✅ | |
deepinfra | ✅ | ✅ | ✅ | ✅ | ||
perplexity-ai | ✅ | ✅ | ✅ | ✅ | ||
Groq AI | ✅ | ✅ | ✅ | ✅ | ||
anyscale | ✅ | ✅ | ✅ | ✅ | ||
IBM - watsonx.ai | ✅ | ✅ | ✅ | ✅ | ✅ | |
voyage ai | ✅ | |||||
xinference [Xorbits Inference] | ✅ |