# 🚅 LiteLLM

Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, TogetherAI, Azure, OpenAI, etc.]

Bug Report · Feature Request


[Docs](https://docs.litellm.ai/docs/) · [100+ Supported Models](https://docs.litellm.ai/docs/providers) · Demo Video
LiteLLM manages:

- Translating inputs to the provider's completion and embedding endpoints
- Guaranteeing [consistent output](https://docs.litellm.ai/docs/completion/output): text responses are always available at `['choices'][0]['message']['content']`
- Exception mapping: common exceptions across providers are mapped to the OpenAI exception types (see the sketch after the proxy section below)

**🚨 Seeing errors?** [![Chat on WhatsApp](https://img.shields.io/static/v1?label=Chat%20on&message=WhatsApp&color=success&logo=WhatsApp&style=flat-square)](https://wa.link/huol9n) [![Chat on Discord](https://img.shields.io/static/v1?label=Chat%20on&message=Discord&color=blue&logo=Discord&style=flat-square)](https://discord.gg/wuPM9dRgDw)

**05/10/2023:** LiteLLM is adopting Semantic Versioning for all commits. [Learn more](https://github.com/BerriAI/litellm/issues/532)

# Usage

```
pip install litellm
```

```python
from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["COHERE_API_KEY"] = "your-cohere-key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)
```

## Streaming ([Docs](https://docs.litellm.ai/docs/completion/stream))

LiteLLM supports streaming the model response back: pass `stream=True` to get a streaming iterator in the response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.

```python
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
```

## OpenAI Proxy Server ([Docs](https://docs.litellm.ai/docs/proxy_server))

Spin up a local server to translate OpenAI API calls to any non-OpenAI model (e.g. Huggingface, TogetherAI, Ollama, etc.). This works for async + streaming as well.

```
litellm --model
```

Running your model locally or on a custom endpoint? Set the `--api-base` parameter ([see how](https://docs.litellm.ai/docs/proxy_server)).
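Once the proxy is up, any OpenAI-compatible client can talk to it. A minimal sketch using the pre-v1 OpenAI SDK (the host, port, and placeholder key below are assumptions, not documented defaults; match them to your `litellm` server per the [proxy docs](https://docs.litellm.ai/docs/proxy_server)):

```python
import openai  # pre-v1 OpenAI SDK, as used elsewhere in this README

# Point the SDK at the local LiteLLM proxy instead of api.openai.com.
openai.api_base = "http://0.0.0.0:8000"  # assumed host/port; adjust to your server
openai.api_key = "anything"              # placeholder; the proxy holds the real provider credentials

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the proxy routes this to the model it was started with
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response["choices"][0]["message"]["content"])
```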
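Async completion is also available at the library level (the table below links to the async docs). A minimal sketch, assuming an `acompletion` coroutine that mirrors the `completion` signature (see the [async docs](https://docs.litellm.ai/docs/completion/stream#async-completion)):

```python
import asyncio
from litellm import acompletion

async def main():
    # non-blocking completion call; same arguments as `completion`
    response = await acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
    )
    print(response["choices"][0]["message"]["content"])

asyncio.run(main())
```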
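And because exceptions are mapped to OpenAI exception types (see the list at the top of this README), one handler can cover every provider. A minimal sketch, assuming the pre-v1 OpenAI SDK's `OpenAIError` base class catches the mapped errors:

```python
import os
from openai.error import OpenAIError  # pre-v1 OpenAI SDK
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "bad-key"  # deliberately invalid

try:
    completion(model="claude-2", messages=[{"role": "user", "content": "Hello"}])
except OpenAIError as e:
    # the Anthropic auth failure surfaces as an OpenAI exception type
    print(f"caught mapped exception: {e}")
```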
## Supported Providers ([Docs](https://docs.litellm.ai/docs/providers))

| Provider | [Completion](https://docs.litellm.ai/docs/#basic-usage) | [Streaming](https://docs.litellm.ai/docs/completion/stream#streaming-responses) | [Async Completion](https://docs.litellm.ai/docs/completion/stream#async-completion) | [Async Streaming](https://docs.litellm.ai/docs/completion/stream#async-streaming) |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| [openai](https://docs.litellm.ai/docs/providers/openai) | ✅ | ✅ | ✅ | ✅ |
| [cohere](https://docs.litellm.ai/docs/providers/cohere) | ✅ | ✅ | ✅ | ✅ |
| [anthropic](https://docs.litellm.ai/docs/providers/anthropic) | ✅ | ✅ | ✅ | ✅ |
| [replicate](https://docs.litellm.ai/docs/providers/replicate) | ✅ | ✅ | ✅ | ✅ |
| [huggingface](https://docs.litellm.ai/docs/providers/huggingface) | ✅ | ✅ | ✅ | ✅ |
| [together_ai](https://docs.litellm.ai/docs/providers/togetherai) | ✅ | ✅ | ✅ | ✅ |
| [openrouter](https://docs.litellm.ai/docs/providers/openrouter) | ✅ | ✅ | ✅ | ✅ |
| [vertex_ai](https://docs.litellm.ai/docs/providers/vertex) | ✅ | ✅ | ✅ | ✅ |
| [palm](https://docs.litellm.ai/docs/providers/palm) | ✅ | ✅ | ✅ | ✅ |
| [ai21](https://docs.litellm.ai/docs/providers/ai21) | ✅ | ✅ | ✅ | ✅ |
| [baseten](https://docs.litellm.ai/docs/providers/baseten) | ✅ | ✅ | ✅ | ✅ |
| [azure](https://docs.litellm.ai/docs/providers/azure) | ✅ | ✅ | ✅ | ✅ |
| [sagemaker](https://docs.litellm.ai/docs/providers/aws_sagemaker) | ✅ | ✅ | ✅ | ✅ |
| [bedrock](https://docs.litellm.ai/docs/providers/bedrock) | ✅ | ✅ | ✅ | ✅ |
| [vllm](https://docs.litellm.ai/docs/providers/vllm) | ✅ | ✅ | ✅ | ✅ |
| [nlp_cloud](https://docs.litellm.ai/docs/providers/nlp_cloud) | ✅ | ✅ | ✅ | ✅ |
| [aleph alpha](https://docs.litellm.ai/docs/providers/aleph_alpha) | ✅ | ✅ | ✅ | ✅ |
| [petals](https://docs.litellm.ai/docs/providers/petals) | ✅ | ✅ | ✅ | ✅ |
| [ollama](https://docs.litellm.ai/docs/providers/ollama) | ✅ | ✅ | ✅ | ✅ |
| [deepinfra](https://docs.litellm.ai/docs/providers/deepinfra) | ✅ | ✅ | ✅ | ✅ |

[**Read the Docs**](https://docs.litellm.ai/docs/)

# Contributing

To contribute: clone the repo locally -> make a change -> submit a PR with the change.

Here's how to modify the repo locally:

Step 1: Clone the repo

```
git clone https://github.com/BerriAI/litellm.git
```

Step 2: Navigate into the project, and install dependencies:

```
cd litellm
poetry install
```

Step 3: Test your change:

```
cd litellm/tests # pwd: Documents/litellm/litellm/tests
pytest .
```

Step 4: Submit a PR with your changes! 🚀

- push your fork to your GitHub repo
- submit a PR from there

[Learn more on how to make a PR](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request)

# Support / talk with founders

- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

# Why did we build this

- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, and Cohere.

# Contributors