forked from phoenix/litellm-mirror
LiteLLM fork
🚅 LiteLLM
Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, TogetherAI, Azure, OpenAI, etc.]
100+ Supported Models | Docs | Demo Website
📣 1-click deploy your own LLM proxy server. Grab time if you're interested!
LiteLLM manages
- Translating inputs to the provider's completion and embedding endpoints
- Guaranteeing consistent output: text responses will always be available at ['choices'][0]['message']['content']
- Mapping exceptions: common exceptions across providers are mapped to the OpenAI exception types
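Because the reply text always sits at the same path, one small helper can read it regardless of provider. The following is a minimal sketch, assuming the response supports dict-style indexing; `first_message_text` is a hypothetical helper name (not part of LiteLLM), and `sample_response` is hand-written illustration data rather than real model output.

```python
def first_message_text(response):
    """Return the text of the first choice from an OpenAI-format response.

    Hypothetical helper, not part of LiteLLM. Assumes `response` supports
    dict-style indexing: response['choices'][0]['message']['content'].
    """
    return response["choices"][0]["message"]["content"]


# Hand-written stand-in for a real response, for illustration only.
sample_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "I'm doing well, thanks!"}}
    ]
}

print(first_message_text(sample_response))
```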
Usage
pip install litellm
from litellm import completion
import os
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages=messages)
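Since every provider is called with the same arguments, the two calls above can be folded into a loop. This is a rough sketch under stated assumptions: `ask_each_model` and `fake_completion` are hypothetical names introduced here, and the completion function is passed in as a parameter so the example runs offline without API keys.

```python
def ask_each_model(models, messages, completion_fn):
    """Send the same messages to every model and collect the reply text.

    `completion_fn` is assumed to accept model= and messages= keyword
    arguments and to return an OpenAI-format response, which is how
    `completion` is called in the usage snippet.
    """
    replies = {}
    for model in models:
        response = completion_fn(model=model, messages=messages)
        replies[model] = response["choices"][0]["message"]["content"]
    return replies


def fake_completion(model, messages):
    """Offline stand-in for a real completion call (illustration only)."""
    return {
        "choices": [
            {"message": {"role": "assistant", "content": f"reply from {model}"}}
        ]
    }


messages = [{"content": "Hello, how are you?", "role": "user"}]
replies = ask_each_model(
    ["gpt-3.5-turbo", "command-nightly"], messages, fake_completion
)
print(replies)
```

With real API keys set, passing `completion` imported from `litellm` in place of `fake_completion` should give the same dictionary shape with actual model replies.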
Stable version
pip install litellm==0.1.424
Streaming
LiteLLM supports streaming the model response back. Pass stream=True to get a streaming iterator in the response.
Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])
# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
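Printing each delta shows the raw pieces; to rebuild the full reply, the pieces can be joined as they arrive. The sketch below assumes each delta is dict-like and carries its text under an optional `content` key, as in the OpenAI streaming format. `collect_stream_text` is a hypothetical helper, and `sample_chunks` is hand-written data standing in for a real stream.

```python
def collect_stream_text(chunks):
    """Join the text fragments from an iterable of streaming chunks.

    Assumes each chunk exposes chunk['choices'][0]['delta'] and that the
    delta is dict-like with an optional 'content' key.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        text = delta.get("content")
        # Chunks without text (role-only or empty deltas) are skipped.
        if text:
            parts.append(text)
    return "".join(parts)


# Hand-written stand-in for a streaming iterator, for illustration only.
sample_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},
]

full_text = collect_stream_text(sample_chunks)
print(full_text)
```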
Support / talk with founders
- Schedule Demo 👋
- Community Discord 💭
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
Why did we build this
- Need for simplicity: Our code started to get extremely complicated managing and translating calls between Azure, OpenAI, and Cohere.