🚅 LiteLLM
Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, TogetherAI, Azure, OpenAI, etc.]
LiteLLM manages:
- Translating inputs to the provider's completion and embedding endpoints
- Guaranteeing [consistent output](https://docs.litellm.ai/docs/completion/output): text responses are always available at `['choices'][0]['message']['content']`
- Mapping exceptions: common exceptions across providers are mapped to the OpenAI exception types
**🚨 Seeing errors?** Reach us on [WhatsApp](https://wa.link/huol9n) or [Discord](https://discord.gg/wuPM9dRgDw)
**10/05/2023:** LiteLLM is adopting Semantic Versioning for all commits. [Learn more](https://github.com/BerriAI/litellm/issues/532)
**10/16/2023:** **Self-hosted OpenAI-proxy server** [Learn more](https://docs.litellm.ai/docs/proxy_server#deploy-proxy)
# Usage
```
pip install litellm
```
```python
from litellm import completion
import os
## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["COHERE_API_KEY"] = "your-cohere-key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)
```
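Because text responses are always available at `['choices'][0]['message']['content']`, one small helper can read the reply from any provider. The following is a minimal sketch, not part of LiteLLM: `first_text`, `ask_models`, and `fake_completion` are hypothetical names, and the completion function is passed in as a parameter so the example runs offline with a stub. In real use you would presumably pass `litellm.completion` instead.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


def first_text(response: Any) -> str:
    """Return the text of the first choice from an OpenAI-format response."""
    return response["choices"][0]["message"]["content"]


def ask_models(
    models: List[str],
    messages: List[Message],
    completion_fn: Callable[..., Any],
) -> Dict[str, str]:
    """Send the same messages to each model and collect the replies by model name."""
    return {
        model: first_text(completion_fn(model=model, messages=messages))
        for model in models
    }


def fake_completion(model: str, messages: List[Message]) -> Dict[str, Any]:
    """Offline stand-in (hypothetical) that mimics the OpenAI response shape."""
    reply = f"{model} received: {messages[-1]['content']}"
    return {"choices": [{"message": {"role": "assistant", "content": reply}}]}


if __name__ == "__main__":
    messages = [{"content": "Hello, how are you?", "role": "user"}]
    replies = ask_models(["gpt-3.5-turbo", "command-nightly"], messages, fake_completion)
    for model, text in replies.items():
        print(f"{model}: {text}")
```

Passing the completion function as an argument keeps the helper testable without API keys or network access.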
## Streaming ([Docs](https://docs.litellm.ai/docs/completion/stream))
LiteLLM supports streaming the model response back. Pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
```python
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
```
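To rebuild the full reply from a stream, the `delta` pieces can be concatenated as they arrive. This is a minimal sketch under two assumptions: each `delta` behaves like a mapping with an optional `content` key, and some chunks may carry no text at all. `collect_stream` is a hypothetical helper name, and the chunks in the usage example are hand-written stand-ins rather than real provider output.

```python
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Concatenate the `content` pieces of a streamed response into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        # Assumption: a delta without text has no "content" key or a None value.
        piece = delta.get("content")
        if piece:
            parts.append(piece)
    return "".join(parts)


if __name__ == "__main__":
    # Hand-written stand-ins for streamed chunks (not real provider output).
    fake_chunks = [
        {"choices": [{"delta": {"role": "assistant"}}]},
        {"choices": [{"delta": {"content": "Hello"}}]},
        {"choices": [{"delta": {"content": ", world"}}]},
        {"choices": [{"delta": {}}]},
    ]
    print(collect_stream(fake_chunks))
```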
## OpenAI Proxy Server ([Docs](https://docs.litellm.ai/docs/proxy_server))
Spin up a local server to translate OpenAI API calls to any non-OpenAI model (e.g. Huggingface, TogetherAI, Ollama, etc.).
This works for async + streaming as well.
```shell
litellm --model <model-name>
```
Running your model locally or on a custom endpoint? Set the `--api-base` parameter ([see how](https://docs.litellm.ai/docs/proxy_server)).
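Once the proxy is running, a client can send it an OpenAI-format request. The sketch below uses only the Python standard library and rests on assumptions: the proxy listens on `http://0.0.0.0:8000`, it serves the OpenAI-style `/chat/completions` route, and the `model` value shown is only a placeholder. `build_chat_request` is a hypothetical helper name; check the proxy docs for the exact routes.

```python
import json
import urllib.request
from typing import Dict, List


def build_chat_request(
    base_url: str, model: str, messages: List[Dict[str, str]]
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-format chat completion request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        # Assumption: the proxy exposes the OpenAI-style /chat/completions route.
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_chat_request(
        "http://0.0.0.0:8000",  # assumed proxy address
        "gpt-3.5-turbo",  # placeholder model name
        [{"role": "user", "content": "Hello, how are you?"}],
    )
    # Sending the request only works while the proxy is running.
    with urllib.request.urlopen(request) as reply:
        body = json.load(reply)
    print(body["choices"][0]["message"]["content"])
```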
### Multiple LLMs ([Docs](https://docs.litellm.ai/docs/proxy_server#multiple-llms))
```shell
$ litellm
#INFO: litellm proxy running on http://0.0.0.0:8000
```
### Self-host server ([Docs](https://docs.litellm.ai/docs/proxy_server#deploy-proxy))
1. Clone the repo
```shell
git clone https://github.com/BerriAI/litellm.git
```
2. Modify `template_secrets.toml`
```toml
[keys]
OPENAI_API_KEY="sk-..."
[general]
default_model = "gpt-3.5-turbo"
```
3. Deploy
```shell
docker build -t litellm . && docker run -p 8000:8000 litellm
```
## Supported Providers ([Docs](https://docs.litellm.ai/docs/providers))
| Provider | [Completion](https://docs.litellm.ai/docs/#basic-usage) | [Streaming](https://docs.litellm.ai/docs/completion/stream#streaming-responses) | [Async Completion](https://docs.litellm.ai/docs/completion/stream#async-completion) | [Async Streaming](https://docs.litellm.ai/docs/completion/stream#async-streaming) |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| [openai](https://docs.litellm.ai/docs/providers/openai) | ✅ | ✅ | ✅ | ✅ |
| [cohere](https://docs.litellm.ai/docs/providers/cohere) | ✅ | ✅ | ✅ | ✅ |
| [anthropic](https://docs.litellm.ai/docs/providers/anthropic) | ✅ | ✅ | ✅ | ✅ |
| [replicate](https://docs.litellm.ai/docs/providers/replicate) | ✅ | ✅ | ✅ | ✅ |
| [huggingface](https://docs.litellm.ai/docs/providers/huggingface) | ✅ | ✅ | ✅ | ✅ |
| [together_ai](https://docs.litellm.ai/docs/providers/togetherai) | ✅ | ✅ | ✅ | ✅ |
| [openrouter](https://docs.litellm.ai/docs/providers/openrouter) | ✅ | ✅ | ✅ | ✅ |
| [vertex_ai](https://docs.litellm.ai/docs/providers/vertex) | ✅ | ✅ | ✅ | ✅ |
| [palm](https://docs.litellm.ai/docs/providers/palm) | ✅ | ✅ | ✅ | ✅ |
| [ai21](https://docs.litellm.ai/docs/providers/ai21) | ✅ | ✅ | ✅ | ✅ |
| [baseten](https://docs.litellm.ai/docs/providers/baseten) | ✅ | ✅ | ✅ | ✅ |
| [azure](https://docs.litellm.ai/docs/providers/azure) | ✅ | ✅ | ✅ | ✅ |
| [sagemaker](https://docs.litellm.ai/docs/providers/aws_sagemaker) | ✅ | ✅ | ✅ | ✅ |
| [bedrock](https://docs.litellm.ai/docs/providers/bedrock) | ✅ | ✅ | ✅ | ✅ |
| [vllm](https://docs.litellm.ai/docs/providers/vllm) | ✅ | ✅ | ✅ | ✅ |
| [nlp_cloud](https://docs.litellm.ai/docs/providers/nlp_cloud) | ✅ | ✅ | ✅ | ✅ |
| [aleph alpha](https://docs.litellm.ai/docs/providers/aleph_alpha) | ✅ | ✅ | ✅ | ✅ |
| [petals](https://docs.litellm.ai/docs/providers/petals) | ✅ | ✅ | ✅ | ✅ |
| [ollama](https://docs.litellm.ai/docs/providers/ollama) | ✅ | ✅ | ✅ | ✅ |
| [deepinfra](https://docs.litellm.ai/docs/providers/deepinfra) | ✅ | ✅ | ✅ | ✅ |
[**Read the Docs**](https://docs.litellm.ai/docs/)
# Contributing
To contribute: Clone the repo locally -> Make a change -> Submit a PR with the change.
Here's how to modify the repo locally:
Step 1: Clone the repo
```
git clone https://github.com/BerriAI/litellm.git
```
Step 2: Navigate into the project, and install dependencies:
```
cd litellm
poetry install
```
Step 3: Test your change:
```
cd litellm/tests # pwd: Documents/litellm/litellm/tests
pytest .
```
Step 4: Submit a PR with your changes! 🚀
- push your fork to your GitHub repo
- submit a PR from there
[Learn more on how to make a PR](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request)
# Support / talk with founders
- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
# Why did we build this
- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI and Cohere.