Merge branch 'main' into fix-streaming-anthropic-2

This commit is contained in:
Ishaan Jaff 2023-08-28 09:05:51 -07:00 committed by GitHub
commit 8c35ffe884
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
89 changed files with 4930 additions and 675 deletions

4
.all-contributorsrc Normal file

@ -0,0 +1,4 @@
{
"projectName": "litellm",
"projectOwner": "BerriAI"
}


@ -3,4 +3,6 @@ openai
python-dotenv
openai
tiktoken
importlib_metadata
importlib_metadata
baseten
gptcache


@ -1,5 +1,6 @@
# OpenAI
OPENAI_API_KEY = ""
OPENAI_API_BASE = ""
# Cohere
COHERE_API_KEY = ""
# OpenRouter
@ -18,4 +19,4 @@ REPLICATE_API_TOKEN = ""
# Anthropic
ANTHROPIC_API_KEY = ""
# Infisical
INFISICAL_TOKEN = ""
INFISICAL_TOKEN = ""

10
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file

@ -0,0 +1,10 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
What's the problem? (if there are multiple - list as bullet points)


@ -0,0 +1,14 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Your feature request in one line**
Describe your request in one line
**Describe the solution you'd like**
A clear and concise description of what you want to happen.


@ -1,23 +1,53 @@
# *🚅 litellm*
[![PyPI Version](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/)
[![PyPI Version](https://img.shields.io/badge/stable%20version-v0.1.424-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/)](https://pypi.org/project/litellm/0.1.1/)
[![CircleCI](https://dl.circleci.com/status-badge/img/gh/BerriAI/litellm/tree/main.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main)
![Downloads](https://img.shields.io/pypi/dm/litellm)
<h1 align="center">
🚅 LiteLLM
</h1>
<p align="center">
<p align="center">Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.]</p>
</p>
[![](https://dcbadge.vercel.app/api/server/wuPM9dRgDw)](https://discord.gg/wuPM9dRgDw)
<h4 align="center">
<a href="https://pypi.org/project/litellm/" target="_blank">
<img src="https://img.shields.io/pypi/v/litellm.svg" alt="PyPI Version">
</a>
<a href="https://pypi.org/project/litellm/0.1.1/" target="_blank">
<img src="https://img.shields.io/badge/stable%20version-v0.1.424-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/" alt="Stable Version">
</a>
<a href="https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main" target="_blank">
<img src="https://dl.circleci.com/status-badge/img/gh/BerriAI/litellm/tree/main.svg?style=svg" alt="CircleCI">
</a>
<img src="https://img.shields.io/pypi/dm/litellm" alt="Downloads">
<a href="https://discord.gg/wuPM9dRgDw" target="_blank">
<img src="https://dcbadge.vercel.app/api/server/wuPM9dRgDw?style=flat">
</a>
<a href="https://www.ycombinator.com/companies/berriai">
<img src="https://img.shields.io/badge/Y%20Combinator-W23-orange?style=flat-square" alt="Y Combinator W23">
</a>
</h4>
a light package to simplify calling OpenAI, Azure, Cohere, Anthropic, Huggingface API Endpoints. It manages:
- translating inputs to the provider's completion and embedding endpoints
- guarantees [consistent output](https://litellm.readthedocs.io/en/latest/output/), text responses will always be available at `['choices'][0]['message']['content']`
- exception mapping - common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance)
# usage
<a href='https://docs.litellm.ai/docs/completion/supported' target="_blank"><img alt='None' src='https://img.shields.io/badge/100+_Supported_LLMs_liteLLM-100000?style=for-the-badge&logo=None&logoColor=000000&labelColor=000000&color=8400EA'/></a>
<h4 align="center">
<a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_OpenAI.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
</h4>
<h4 align="center">
<a href="https://docs.litellm.ai/docs/completion/supported" target="_blank">100+ Supported Models</a> |
<a href="https://docs.litellm.ai/docs/" target="_blank">Docs</a> |
<a href="https://litellm.ai/playground" target="_blank">Demo Website</a>
</h4>
LiteLLM manages
- Translating inputs to the provider's completion and embedding endpoints
- Guarantees [consistent output](https://litellm.readthedocs.io/en/latest/output/), text responses will always be available at `['choices'][0]['message']['content']`
- Exception mapping - common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance)
# Usage
<a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_OpenAI.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
Demo - https://litellm.ai/playground
Docs - https://docs.litellm.ai/docs/
**Free** Dashboard - https://docs.litellm.ai/docs/debugging/hosted_debugging
## quick start
```
pip install litellm
```
@ -28,6 +58,7 @@ from litellm import completion
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
os.environ["ANTHROPIC_API_KEY"] = "anthropic key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
@ -35,16 +66,18 @@ messages = [{ "content": "Hello, how are you?","role": "user"}]
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages)
response = completion(model="command-nightly", messages=messages)
# anthropic
response = completion(model="claude-2", messages=messages)
```
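Since every provider's response follows the OpenAI format, the text is always in the same place (a minimal sketch):

```python
# works the same for the openai, cohere, and anthropic calls above
print(response['choices'][0]['message']['content'])
```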
Code Sample: [Getting Started Notebook](https://colab.research.google.com/drive/1gR3pY-JzDZahzpVdbGBtrNGDBmzUNJaJ?usp=sharing)
Stable version
```
pip install litellm==0.1.424
```
## Streaming Queries
## Streaming
liteLLM supports streaming the model response back: pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
```python
@ -58,11 +91,27 @@ for chunk in result:
print(chunk['choices'][0]['delta'])
```
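For example, a fuller streaming sketch across two providers (assumes the relevant API keys are set as above):

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call, streamed
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# anthropic call, streamed
response = completion(model="claude-2", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])
```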
# support / talk with founders
- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
# Support / talk with founders
- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
# why did we build this
# Why did we build this
- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, Cohere
# Contributors
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->
<a href="https://github.com/BerriAI/litellm/graphs/contributors">
<img src="https://contrib.rocks/image?repo=BerriAI/litellm" />
</a>


@ -0,0 +1,123 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"## LiteLLM Caching Tutorial\n",
"Link to using Caching in Docs:\n",
"https://docs.litellm.ai/docs/caching/"
],
"metadata": {
"id": "Lvj-GI3YQfQx"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "eKSBuuKn99Jm"
},
"outputs": [],
"source": [
"!pip install litellm==0.1.492"
]
},
{
"cell_type": "markdown",
"source": [
"## Set `caching_with_models` to True\n",
"Enables caching on a per-model basis.\n",
"Keys are the input messages + model and values stored in the cache is the corresponding response"
],
"metadata": {
"id": "sFXj4UUnQpyt"
}
},
{
"cell_type": "code",
"source": [
"import os, time, litellm\n",
"from litellm import completion\n",
"litellm.caching_with_models = True # set caching for each model to True\n"
],
"metadata": {
"id": "xCea1EjR99rU"
},
"execution_count": 8,
"outputs": []
},
{
"cell_type": "code",
"source": [
"os.environ['OPENAI_API_KEY'] = \"\""
],
"metadata": {
"id": "VK3kXGXI-dtC"
},
"execution_count": 9,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## Use LiteLLM Cache"
],
"metadata": {
"id": "U_CDCcnjQ7c6"
}
},
{
"cell_type": "code",
"source": [
"question = \"write 1 page about what's LiteLLM\"\n",
"for _ in range(2):\n",
" start_time = time.time()\n",
" response = completion(\n",
" model='gpt-3.5-turbo',\n",
" messages=[\n",
" {\n",
" 'role': 'user',\n",
" 'content': question\n",
" }\n",
" ],\n",
" )\n",
" print(f'Question: {question}')\n",
" print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Efli-J-t-bJH",
"outputId": "cfdb1e14-96b0-48ee-c504-7f567e84c349"
},
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Question: write 1 page about what's LiteLLM\n",
"Time consuming: 13.53s\n",
"Question: write 1 page about what's LiteLLM\n",
"Time consuming: 0.00s\n"
]
}
]
}
]
}


@ -0,0 +1,181 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"# Using GPT Cache x LiteLLM\n",
"- Cut costs 10x, improve speed 100x"
],
"metadata": {
"id": "kBwDrphDDEoO"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_K_4auSgCSjg"
},
"outputs": [],
"source": [
"!pip install litellm gptcache"
]
},
{
"cell_type": "markdown",
"source": [
"# Usage\n",
"* use `from litellm.cache import completion`\n",
"* Init GPT Cache using the following lines:\n",
"```python\n",
"from gptcache import cache\n",
"cache.init()\n",
"cache.set_openai_key()\n",
"```"
],
"metadata": {
"id": "DlZ22IfmDR5L"
}
},
{
"cell_type": "markdown",
"source": [
"## With OpenAI"
],
"metadata": {
"id": "js80pW9PC1KQ"
}
},
{
"cell_type": "code",
"source": [
"from gptcache import cache\n",
"import os\n",
"from litellm.cache import completion # import completion from litellm.cache\n",
"import time\n",
"\n",
"# Set your .env keys\n",
"os.environ['OPENAI_API_KEY'] = \"\"\n",
"\n",
"##### GPT Cache Init\n",
"cache.init()\n",
"cache.set_openai_key()\n",
"#### End of GPT Cache Init\n",
"\n",
"question = \"what's LiteLLM\"\n",
"for _ in range(2):\n",
" start_time = time.time()\n",
" response = completion(\n",
" model='gpt-3.5-turbo',\n",
" messages=[\n",
" {\n",
" 'role': 'user',\n",
" 'content': question\n",
" }\n",
" ],\n",
" )\n",
" print(f'Question: {question}')\n",
" print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "24a-mg1OCWe1",
"outputId": "36130cb6-9bd6-4bc6-8405-b6e19a1e9357"
},
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"start to install package: redis\n",
"successfully installed package: redis\n",
"start to install package: redis_om\n",
"successfully installed package: redis_om\n",
"Question: what's LiteLLM\n",
"Time consuming: 1.18s\n",
"Question: what's LiteLLM\n",
"Time consuming: 0.00s\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"## With Cohere"
],
"metadata": {
"id": "xXPtHamPCy73"
}
},
{
"cell_type": "code",
"source": [
"from gptcache import cache\n",
"import os\n",
"from litellm.cache import completion # import completion from litellm.cache\n",
"import time\n",
"\n",
"# Set your .env keys\n",
"os.environ['COHERE_API_KEY'] = \"\"\n",
"\n",
"##### GPT Cache Init\n",
"cache.init()\n",
"cache.set_openai_key()\n",
"#### End of GPT Cache Init\n",
"\n",
"question = \"what's LiteLLM Github\"\n",
"for _ in range(2):\n",
" start_time = time.time()\n",
" response = completion(\n",
" model='gpt-3.5-turbo',\n",
" messages=[\n",
" {\n",
" 'role': 'user',\n",
" 'content': question\n",
" }\n",
" ],\n",
" )\n",
" print(f'Question: {question}')\n",
" print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "goRtiiAlChRW",
"outputId": "47f473da-5560-4d6f-d9ef-525ff8e60758"
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Question: what's LiteLLM Github\n",
"Time consuming: 1.58s\n",
"Question: what's LiteLLM Github\n",
"Time consuming: 0.00s\n"
]
}
]
}
]
}

File diff suppressed because one or more lines are too long


@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "WemkFEdDAnJL"
@ -13,18 +12,58 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {
"id": "pc6IO4V99O25"
"id": "pc6IO4V99O25",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "2d69da44-010b-41c2-b38b-5b478576bb8b"
},
"outputs": [],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting litellm\n",
" Downloading litellm-0.1.482-py3-none-any.whl (69 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m69.3/69.3 kB\u001b[0m \u001b[31m757.5 kB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: importlib-metadata<7.0.0,>=6.8.0 in /usr/local/lib/python3.10/dist-packages (from litellm) (6.8.0)\n",
"Collecting openai<0.28.0,>=0.27.8 (from litellm)\n",
" Downloading openai-0.27.9-py3-none-any.whl (75 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.5/75.5 kB\u001b[0m \u001b[31m3.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hCollecting python-dotenv<2.0.0,>=1.0.0 (from litellm)\n",
" Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)\n",
"Collecting tiktoken<0.5.0,>=0.4.0 (from litellm)\n",
" Downloading tiktoken-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.7/1.7 MB\u001b[0m \u001b[31m17.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata<7.0.0,>=6.8.0->litellm) (3.16.2)\n",
"Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.8->litellm) (2.31.0)\n",
"Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.8->litellm) (4.66.1)\n",
"Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from openai<0.28.0,>=0.27.8->litellm) (3.8.5)\n",
"Requirement already satisfied: regex>=2022.1.18 in /usr/local/lib/python3.10/dist-packages (from tiktoken<0.5.0,>=0.4.0->litellm) (2023.6.3)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.8->litellm) (3.2.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.8->litellm) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.8->litellm) (2.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai<0.28.0,>=0.27.8->litellm) (2023.7.22)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->litellm) (23.1.0)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->litellm) (6.0.4)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->litellm) (4.0.3)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->litellm) (1.9.2)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->litellm) (1.4.0)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai<0.28.0,>=0.27.8->litellm) (1.3.1)\n",
"Installing collected packages: python-dotenv, tiktoken, openai, litellm\n",
"Successfully installed litellm-0.1.482 openai-0.27.9 python-dotenv-1.0.0 tiktoken-0.4.0\n"
]
}
],
"source": [
"!pip install litellm==0.1.419"
"!pip install litellm"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"metadata": {
"id": "TMI3739_9q97"
},
@ -38,7 +77,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "bEqJ2HHjBJqq"
@ -50,7 +88,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@ -95,7 +133,50 @@
]
},
{
"attachments": {},
"cell_type": "code",
"source": [
"model_name = \"togethercomputer/CodeLlama-34b-Instruct\"\n",
"response = completion(model=model_name, messages=messages, max_tokens=200)\n",
"print(response)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GIUevHlMvPb8",
"outputId": "ad930a12-16e3-4400-fff4-38151e4f6da5"
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\u001b[92mHere's your LiteLLM Dashboard 👉 \u001b[94m\u001b[4mhttps://admin.litellm.ai/6c0f0403-becb-44af-9724-7201c7d381d0\u001b[0m\n",
"{\n",
" \"choices\": [\n",
" {\n",
" \"finish_reason\": \"stop\",\n",
" \"index\": 0,\n",
" \"message\": {\n",
" \"content\": \"\\nI'm in San Francisco, and I'm not sure what the weather is like.\\nI'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and I'm not sure what the weather is like. I'm in San Francisco, and\",\n",
" \"role\": \"assistant\"\n",
" }\n",
" }\n",
" ],\n",
" \"created\": 1692934243.8663018,\n",
" \"model\": \"togethercomputer/CodeLlama-34b-Instruct\",\n",
" \"usage\": {\n",
" \"prompt_tokens\": 9,\n",
" \"completion_tokens\": 178,\n",
" \"total_tokens\": 187\n",
" }\n",
"}\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sfWtgf-mBQcM"
@ -109,10 +190,11 @@
"execution_count": null,
"metadata": {
"colab": {
"background_save": true,
"base_uri": "https://localhost:8080/"
},
"id": "wuBhlZtC6MH5",
"outputId": "1bedc981-4ab1-4abd-9b81-a9727223b66a"
"outputId": "8f4a408c-25eb-4434-cdd4-7b4ae4f6d3aa"
},
"outputs": [
{
@ -649,7 +731,243 @@
"{'choices': [{'delta': {'role': 'assistant', 'content': ' provide'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' fund'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ing'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' to'}}]}\n"
"{'choices': [{'delta': {'role': 'assistant', 'content': ' to'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' its'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' port'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'folio'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' companies'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' but'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' instead'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' focus'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'es'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' on'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' connecting'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' found'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ers'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' with'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' invest'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ors'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' and'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' resources'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' that'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' can'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' help'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' them'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' raise'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' capital'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' This'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' means'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' that'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' if'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' startup'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' is'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' looking'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' for'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' fund'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ing'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' Y'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'C'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' may'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' be'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' a'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' better'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' option'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '\\n'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '\\n'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'So'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' which'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' program'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' is'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' right'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' for'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' startup'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '?'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' It'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' ultimately'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' depends'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' on'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' specific'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' needs'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' and'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' goals'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' If'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' startup'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' is'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' in'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' a'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' non'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '-'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'tech'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' industry'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' l'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ite'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'LL'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'M'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': \"'\"}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 's'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' bro'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ader'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' focus'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' may'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' be'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' a'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' better'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' fit'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' Additionally'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' if'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' you'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': \"'\"}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 're'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' looking'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' for'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' a'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' more'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' personal'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ized'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' flexible'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' approach'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' to'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' acceleration'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' l'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ite'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'LL'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'M'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': \"'\"}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 's'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' program'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' may'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' be'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' a'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' better'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' choice'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' On'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' the'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' other'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' hand'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' if'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' startup'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' is'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' in'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' the'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' software'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' technology'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' or'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' internet'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' space'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' and'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' you'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': \"'\"}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 're'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' looking'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' for'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' seed'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' fund'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ing'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' Y'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'C'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': \"'\"}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 's'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' program'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' may'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' be'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' a'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' better'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' fit'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '\\n'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '\\n'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'In'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' conclusion'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' Y'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'C'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' and'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' l'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ite'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'LL'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'M'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' are'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' both'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' excellent'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' startup'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' acceler'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ators'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' that'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' can'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' provide'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' valuable'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' resources'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' and'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' support'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' to'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' early'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '-'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'stage'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' companies'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' While'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' they'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' share'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' some'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' similar'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 'ities'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' they'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' also'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' have'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' distinct'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' differences'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' that'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' set'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' them'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' apart'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' By'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' considering'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' startup'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': \"'\"}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': 's'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' specific'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' needs'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' and'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' goals'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ','}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' you'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' can'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' determine'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' which'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' program'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' is'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' the'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' best'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' fit'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' for'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' your'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': ' business'}}]}\n",
"{'choices': [{'delta': {'role': 'assistant', 'content': '.'}}]}\n"
]
}
],
@ -686,4 +1004,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -0,0 +1,46 @@
from flask import Flask, request, jsonify, abort, Response
from flask_cors import CORS
from litellm import completion
import os, dotenv
import random

dotenv.load_dotenv()

# TODO: set your keys in .env or here:
# os.environ["OPENAI_API_KEY"] = "" # set your openai key here or in your .env
# see supported models, keys here:

app = Flask(__name__)
CORS(app)

@app.route('/')
def index():
    return 'received!', 200

# Dictionary of LLM functions with their A/B test ratios, should sum to 1 :)
llm_dict = {
    "gpt-4": 0.2,
    "together_ai/togethercomputer/llama-2-70b-chat": 0.4,
    "claude-2": 0.2,
    "claude-1.2": 0.2
}

@app.route('/chat/completions', methods=["POST"])
def api_completion():
    data = request.json
    try:
        # pick a model according to the A/B test weights, then pass the request data to completion()
        selected_llm = random.choices(list(llm_dict.keys()), weights=list(llm_dict.values()))[0]
        data['model'] = selected_llm
        response = completion(**data)
    except Exception as e:
        print(f"got error {e}")
        return jsonify({"error": str(e)}), 500
    return response, 200

if __name__ == "__main__":
    from waitress import serve
    print("starting server")
    serve(app, host="0.0.0.0", port=5000, threads=500)


@ -0,0 +1,171 @@
<h1 align="center">
🚅 LiteLLM - A/B Testing LLMs in Production
</h1>
<p align="center">
<p align="center">Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.]</p>
</p>
<h4 align="center">
<a href="https://pypi.org/project/litellm/" target="_blank">
<img src="https://img.shields.io/pypi/v/litellm.svg" alt="PyPI Version">
</a>
<a href="https://pypi.org/project/litellm/0.1.1/" target="_blank">
<img src="https://img.shields.io/badge/stable%20version-v0.1.424-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/" alt="Stable Version">
</a>
<a href="https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main" target="_blank">
<img src="https://dl.circleci.com/status-badge/img/gh/BerriAI/litellm/tree/main.svg?style=svg" alt="CircleCI">
</a>
<img src="https://img.shields.io/pypi/dm/litellm" alt="Downloads">
<a href="https://discord.gg/wuPM9dRgDw" target="_blank">
<img src="https://dcbadge.vercel.app/api/server/wuPM9dRgDw?style=flat">
</a>
</h4>
<h4 align="center">
<a href="https://docs.litellm.ai/docs/completion/supported" target="_blank">100+ Supported Models</a> |
<a href="https://docs.litellm.ai/docs/" target="_blank">Docs</a> |
<a href="https://litellm.ai/playground" target="_blank">Demo Website</a>
</h4>
LiteLLM allows you to call 100+ LLMs using completion
## This template server allows you to define LLMs with their A/B test ratios
```python
llm_dict = {
"gpt-4": 0.2,
"together_ai/togethercomputer/llama-2-70b-chat": 0.4,
"claude-2": 0.2,
"claude-1.2": 0.2
}
```
All models defined can be called with the same Input/Output format using litellm `completion`
```python
from litellm import completion
# SET API KEYS in .env
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages=messages)
# anthropic
response = completion(model="claude-2", messages=messages)
```
This server allows you to view responses, costs and latency on your LiteLLM dashboard
### LiteLLM Client UI
![pika-1693023669579-1x](https://github.com/BerriAI/litellm/assets/29436595/86633e2f-eda0-4939-a588-84e4c100f36a)
# Using LiteLLM A/B Testing Server
## Setup
### Install LiteLLM
```
pip install litellm
```
Stable version
```
pip install litellm==0.1.424
```
### Clone LiteLLM Git Repo
```
git clone https://github.com/BerriAI/litellm/
```
### Navigate to LiteLLM-A/B Test Server
```
cd litellm/cookbook/llm-ab-test-server
```
### Run the Server
```
python3 main.py
```
### Set your LLM Configs
Set the LLMs, and their weights, that you want to run A/B tests with.
In `main.py`, set the LLMs you want to A/B test in `llm_dict`.
You can A/B test 100+ LLMs using LiteLLM: https://docs.litellm.ai/docs/completion/supported
```python
llm_dict = {
"gpt-4": 0.2,
"together_ai/togethercomputer/llama-2-70b-chat": 0.4,
"claude-2": 0.2,
"claude-1.2": 0.2
}
```
#### Setting your API Keys
Set your LLM API keys in a `.env` file in this directory, or set them as `os.environ` variables.
See https://docs.litellm.ai/docs/completion/supported for the exact key names.
LiteLLM standardizes API key names to the format `PROVIDER_API_KEY`.
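For example, a `.env` for the `llm_dict` above might look like this (values are placeholders; confirm the exact key names in the supported models docs):

```
OPENAI_API_KEY=""
TOGETHERAI_API_KEY=""
ANTHROPIC_API_KEY=""
```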
## Making Requests to the LiteLLM Server Locally
The server follows the Input/Output format of the OpenAI Chat Completions API.
Here is an example request made to the LiteLLM Server:
### Python
```python
import requests
import json
url = "http://localhost:5000/chat/completions"
payload = json.dumps({
"messages": [
{
"content": "who is CTO of litellm",
"role": "user"
}
]
})
headers = {
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
```
### Curl Command
```
curl --location 'http://localhost:5000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"messages": [
{
"content": "who is CTO of litellm",
"role": "user"
}
]
}
'
```
## Viewing Logs
After running your first `completion()` call, litellm autogenerates a new logs dashboard for you. A link to your logs dashboard is printed in the terminal / console.
Example Terminal Output with Log Dashboard
<img width="1280" alt="Screenshot 2023-08-25 at 8 53 27 PM" src="https://github.com/BerriAI/litellm/assets/29436595/8f4cc218-a991-4988-a05c-c8e508da5d18">
# support / talk with founders
- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
# why did we build this
- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, Cohere


@ -1,6 +1,7 @@
# liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching
### Azure, Llama2, OpenAI, Claude, Hugging Face, Replicate Models
[![PyPI Version](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/)
[![PyPI Version](https://img.shields.io/badge/stable%20version-v0.1.345-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/)](https://pypi.org/project/litellm/0.1.1/)
![Downloads](https://img.shields.io/pypi/dm/litellm)
@ -11,34 +12,36 @@
![4BC6491E-86D0-4833-B061-9F54524B2579](https://github.com/BerriAI/litellm/assets/17561003/f5dd237b-db5e-42e1-b1ac-f05683b1d724)
## What does liteLLM proxy do
- Make `/chat/completions` requests for 50+ LLM models **Azure, OpenAI, Replicate, Anthropic, Hugging Face**
Example: for `model` use `claude-2`, `gpt-3.5`, `gpt-4`, `command-nightly`, `stabilityai/stablecode-completion-alpha-3b-4k`
```json
{
"model": "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1",
"messages": [
{
"content": "Hello, whats the weather in San Francisco??",
"role": "user"
}
]
{
"content": "Hello, whats the weather in San Francisco??",
"role": "user"
}
]
}
```
- **Consistent Input/Output** Format
- Call all models using the OpenAI format - `completion(model, messages)`
- Text responses will always be available at `['choices'][0]['message']['content']`
- **Error Handling** Using Model Fallbacks (if `GPT-4` fails, try `llama2`)
- **Logging** - Log Requests, Responses and Errors to `Supabase`, `Posthog`, `Mixpanel`, `Sentry`, `Helicone` (Any of the supported providers here: https://litellm.readthedocs.io/en/latest/advanced/
**Example: Logs sent to Supabase**
- **Consistent Input/Output** Format
- Call all models using the OpenAI format - `completion(model, messages)`
- Text responses will always be available at `['choices'][0]['message']['content']`
- **Error Handling** Using Model Fallbacks (if `GPT-4` fails, try `llama2`)
- **Logging** - Log Requests, Responses and Errors to `Supabase`, `Posthog`, `Mixpanel`, `Sentry`, `LLMonitor`, `Helicone` (any of the supported providers here: https://litellm.readthedocs.io/en/latest/advanced/)
**Example: Logs sent to Supabase**
<img width="1015" alt="Screenshot 2023-08-11 at 4 02 46 PM" src="https://github.com/ishaan-jaff/proxy-server/assets/29436595/237557b8-ba09-4917-982c-8f3e1b2c8d08">
- **Token Usage & Spend** - Track Input + Completion tokens used + Spend/model
- **Caching** - Implementation of Semantic Caching
- **Streaming & Async Support** - Return generators to stream text responses
## API Endpoints
### `/chat/completions` (POST)
@ -46,34 +49,37 @@
This endpoint is used to generate chat completions for 50+ supported LLM API models, e.g. llama2, GPT-4, Claude2.
#### Input
This API endpoint accepts all inputs in raw JSON and expects the following inputs
- `model` (string, required): ID of the model to use for chat completions. See all supported models [here]: (https://litellm.readthedocs.io/en/latest/supported/):
eg `gpt-3.5-turbo`, `gpt-4`, `claude-2`, `command-nightly`, `stabilityai/stablecode-completion-alpha-3b-4k`
- `model` (string, required): ID of the model to use for chat completions. See all supported models [here](https://litellm.readthedocs.io/en/latest/supported/),
e.g. `gpt-3.5-turbo`, `gpt-4`, `claude-2`, `command-nightly`, `stabilityai/stablecode-completion-alpha-3b-4k`
- `messages` (array, required): A list of messages representing the conversation context. Each message should have a `role` (system, user, assistant, or function), `content` (message text), and `name` (for function role).
- Additional Optional parameters: `temperature`, `functions`, `function_call`, `top_p`, `n`, `stream`. See the full list of supported inputs here: https://litellm.readthedocs.io/en/latest/input/
#### Example JSON body
For claude-2
```json
{
"model": "claude-2",
"messages": [
{
"content": "Hello, whats the weather in San Francisco??",
"role": "user"
}
]
"model": "claude-2",
"messages": [
{
"content": "Hello, whats the weather in San Francisco??",
"role": "user"
}
]
}
```
### Making an API request to the Proxy Server
```python
import requests
import json
# TODO: use your URL
# TODO: use your URL
url = "http://localhost:5000/chat/completions"
payload = json.dumps({
@ -94,34 +100,38 @@ print(response.text)
```
### Output [Response Format]
Responses from the server are given in the following format.
Responses from the server are given in the following format.
All responses from the server are returned in the following format (for all LLM models). More info on output here: https://litellm.readthedocs.io/en/latest/output/
```json
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "I'm sorry, but I don't have the capability to provide real-time weather information. However, you can easily check the weather in San Francisco by searching online or using a weather app on your phone.",
"role": "assistant"
}
}
],
"created": 1691790381,
"id": "chatcmpl-7mUFZlOEgdohHRDx2UpYPRTejirzb",
"model": "gpt-3.5-turbo-0613",
"object": "chat.completion",
"usage": {
"completion_tokens": 41,
"prompt_tokens": 16,
"total_tokens": 57
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "I'm sorry, but I don't have the capability to provide real-time weather information. However, you can easily check the weather in San Francisco by searching online or using a weather app on your phone.",
"role": "assistant"
}
}
],
"created": 1691790381,
"id": "chatcmpl-7mUFZlOEgdohHRDx2UpYPRTejirzb",
"model": "gpt-3.5-turbo-0613",
"object": "chat.completion",
"usage": {
"completion_tokens": 41,
"prompt_tokens": 16,
"total_tokens": 57
}
}
```
## Installation & Usage
### Running Locally
1. Clone liteLLM repository to your local machine:
```
git clone https://github.com/BerriAI/liteLLM-proxy
@ -141,24 +151,24 @@ All responses from the server are returned in the following format (for all LLM
python main.py
```
## Deploying
1. Quick Start: Deploy on Railway
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/template/DYqQAW?referralCode=t3ukrU)
2. `GCP`, `AWS`, `Azure`
This project includes a `Dockerfile` allowing you to build and deploy a Docker Project on your providers
2. `GCP`, `AWS`, `Azure`
This project includes a `Dockerfile`, allowing you to build and deploy a Docker image on your provider of choice.
# Support / Talk with founders
- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
## Roadmap
- [ ] Support hosted db (e.g. Supabase)
- [ ] Easily send data to places like posthog and sentry.
- [ ] Add a hot-cache for project spend logs - enables fast checks for user + project limits

BIN
dist/litellm-0.1.486-py3-none-any.whl vendored Normal file

Binary file not shown.

BIN
dist/litellm-0.1.486.tar.gz vendored Normal file

Binary file not shown.

BIN
dist/litellm-0.1.487-py3-none-any.whl vendored Normal file

Binary file not shown.

BIN
dist/litellm-0.1.487.tar.gz vendored Normal file

Binary file not shown.


@ -1,4 +1,4 @@
# Caching Completion() Responses
# LiteLLM - Caching
liteLLM implements exact match caching. It can be enabled by setting:
1. `litellm.caching`: When set to `True`, enables caching for all responses. Keys are the input `messages` and the value stored in the cache is the corresponding `response`
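A minimal sketch of exact match caching (assumes `OPENAI_API_KEY` is set):

```python
import litellm
from litellm import completion

litellm.caching = True  # enable exact match caching for all completion() responses

messages = [{"content": "Hello, how are you?", "role": "user"}]
response1 = completion(model="gpt-3.5-turbo", messages=messages)
response2 = completion(model="gpt-3.5-turbo", messages=messages)  # identical messages, served from the cache
```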


@ -0,0 +1,53 @@
# Using GPTCache with LiteLLM
GPTCache is a library for creating a semantic cache for LLM queries.
GPTCache Docs: https://gptcache.readthedocs.io/en/latest/index.html#
GPTCache Github: https://github.com/zilliztech/GPTCache
## Usage
### Install GPTCache
```
pip install gptcache
```
### Using GPT Cache with Litellm Completion()
#### Using GPTCache
To use GPTCache, instantiate it with the following lines:
```python
from gptcache import cache
# set API keys in .env / os.environ
cache.init()
cache.set_openai_key()
```
#### Full Code using GPTCache and LiteLLM
```python
from gptcache import cache
from litellm.cache import completion # import completion from litellm.cache
import os, time

# Set your .env keys
os.environ['OPENAI_API_KEY'] = ""

cache.init()
cache.set_openai_key()

question = "what's LiteLLM"
for _ in range(2):
    start_time = time.time()
    response = completion(
        model='gpt-3.5-turbo',
        messages=[
            {
                'role': 'user',
                'content': question
            }
        ],
    )
    print(f'Question: {question}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
```


@ -0,0 +1,26 @@
# Reliability for Completions()
LiteLLM supports `completion_with_retries`.
You can use this as a drop-in replacement for the `completion()` function to use tenacity retries - by default we retry the call 3 times.
Here's a quick look at how you can use it:
```python
from litellm import completion_with_retries

user_message = "Hello, whats the weather in San Francisco??"
messages = [{"content": user_message, "role": "user"}]

# normal call
def test_completion_custom_provider_model_name():
    try:
        response = completion_with_retries(
            model="gpt-3.5-turbo",
            messages=messages,
        )
        # Add any assertions here to check the response
        print(response)
    except Exception as e:
        print(f"Error occurred: {e}")
```


@ -17,6 +17,8 @@ liteLLM reads key naming, all keys should be named in the following format:
| gpt-3.5-turbo-16k-0613 | `completion('gpt-3.5-turbo-16k-0613', messages)` | `os.environ['OPENAI_API_KEY']` |
| gpt-4 | `completion('gpt-4', messages)` | `os.environ['OPENAI_API_KEY']` |
These also support the `OPENAI_API_BASE` environment variable, which can be used to specify a custom API endpoint.
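For example, a sketch pointing at a custom OpenAI-compatible endpoint (the base URL below is a placeholder):

```python
import os
from litellm import completion

os.environ['OPENAI_API_KEY'] = ""  # your key
os.environ['OPENAI_API_BASE'] = "https://my-endpoint.example.com/v1"  # placeholder custom endpoint

messages = [{"content": "Hello, how are you?", "role": "user"}]
response = completion('gpt-3.5-turbo', messages)
```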
### Azure OpenAI Chat Completion Models
| Model Name | Function Call | Required OS Variables |
@ -77,6 +79,22 @@ Here are some examples of supported models:
| [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) | `completion(model="bigcode/starcoder", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [google/flan-t5-xxl](https://huggingface.co/google/flan-t5-xxl) | `completion(model="google/flan-t5-xxl", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
| [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) | `completion(model="google/flan-t5-large", messages=messages, custom_llm_provider="huggingface")` | `os.environ['HUGGINGFACE_API_KEY']` |
### Replicate Models
liteLLM supports all Replicate LLMs. For Replicate models, add a `replicate/` prefix to the `model` arg; liteLLM uses this prefix to detect the provider.
Below are examples of how to call Replicate LLMs using liteLLM:
Model Name | Function Call | Required OS Variables |
-----------------------------|----------------------------------------------------------------|--------------------------------------|
replicate/llama-2-70b-chat | `completion('replicate/replicate/llama-2-70b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
a16z-infra/llama-2-13b-chat| `completion('replicate/a16z-infra/llama-2-13b-chat', messages)`| `os.environ['REPLICATE_API_KEY']` |
joehoover/instructblip-vicuna13b | `completion('replicate/joehoover/instructblip-vicuna13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
replicate/dolly-v2-12b | `completion('replicate/replicate/dolly-v2-12b', messages)` | `os.environ['REPLICATE_API_KEY']` |
a16z-infra/llama-2-7b-chat | `completion('replicate/a16z-infra/llama-2-7b-chat', messages)` | `os.environ['REPLICATE_API_KEY']` |
replicate/vicuna-13b | `completion('replicate/replicate/vicuna-13b', messages)` | `os.environ['REPLICATE_API_KEY']` |
daanelson/flan-t5-large | `completion('replicate/daanelson/flan-t5-large', messages)` | `os.environ['REPLICATE_API_KEY']` |
replit/replit-code-v1-3b | `completion('replicate/replit/replit-code-v1-3b', messages)` | `os.environ['REPLICATE_API_KEY']` |
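For example, calling the first row of the table above (a sketch; assumes `REPLICATE_API_KEY` is set):

```python
import os
from litellm import completion

os.environ['REPLICATE_API_KEY'] = ""  # set your Replicate key

messages = [{"content": "Hello, whats the weather in San Francisco??", "role": "user"}]
# the "replicate/" prefix tells liteLLM to route this call to Replicate
response = completion('replicate/replicate/llama-2-70b-chat', messages)
print(response['choices'][0]['message']['content'])
```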
### AI21 Models
| Model Name | Function Call | Required OS Variables |
Example Baseten Usage - Note: liteLLM supports all models deployed on Baseten
| Model Name | Function Call | Required OS Variables |
|------------------|--------------------------------------------|------------------------------------|
| Falcon 7B | `completion(model='<your model version id>', messages=messages, custom_llm_provider="baseten")` | `os.environ['BASETEN_API_KEY']` |
| Wizard LM | `completion(model='<your model version id>', messages=messages, custom_llm_provider="baseten")` | `os.environ['BASETEN_API_KEY']` |
| MPT 7B Base | `completion(model='<your model version id>', messages=messages, custom_llm_provider="baseten")` | `os.environ['BASETEN_API_KEY']` |
| Falcon 7B | `completion(model='baseten/qvv0xeq', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| Wizard LM | `completion(model='baseten/q841o8w', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
| MPT 7B Base | `completion(model='baseten/31dxrj3', messages=messages)` | `os.environ['BASETEN_API_KEY']` |
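For example, calling the Falcon 7B deployment from the table above (a sketch; `qvv0xeq` is the version id shown in the table):

```python
import os
from litellm import completion

os.environ['BASETEN_API_KEY'] = ""  # set your Baseten key

messages = [{"content": "Hello, how are you?", "role": "user"}]
response = completion(model='baseten/qvv0xeq', messages=messages)
```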
### OpenRouter Completion Models


@ -1,40 +1,131 @@
import Image from '@theme/IdealImage';
import QueryParamReader from '../../src/components/queryParamReader.js'
# Debugging Dashboard
LiteLLM offers a free UI to debug your calls + add new models at (https://admin.litellm.ai/). This is useful if you're testing your LiteLLM server and need to see if the API calls were made successfully **or** want to add new models without going into code.
# Debug + Deploy LLMs [UI]
**Needs litellm>=0.1.438**
LiteLLM offers a UI to:
* 1-Click Deploy LLMs - the client stores your API keys + model configurations
* Debug your Call Logs
## Setup
Once created, your dashboard is viewable at - `admin.litellm.ai/<your_email>` [👋 Tell us if you need better privacy controls](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version?month=2023-08)
You can set your user email in 2 ways.
- By setting it on the module - `litellm.email=<your_email>`.
- By setting it as an environment variable - `os.environ["LITELLM_EMAIL"] = "your_email"`.
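For example (the email below is a placeholder):
```python
import os
import litellm

# Option 1: set it on the module
litellm.email = "your_email@example.com"

# Option 2: set it as an environment variable
os.environ["LITELLM_EMAIL"] = "your_email@example.com"
```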
<Image img={require('../../img/dashboard.png')} alt="Dashboard" />
See our live dashboard 👉 [admin.litellm.ai](https://admin.litellm.ai/)
👉 Jump to our sample LiteLLM Dashboard: https://admin.litellm.ai/
## Example Usage
<Image img={require('../../img/alt_dashboard.png')} alt="Dashboard" />
## Debug your first logs
<a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_OpenAI.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
### 1. Make a normal `completion()` call
```
import litellm
from litellm import embedding, completion
## Set your email
litellm.email = "test_email@test.com"
user_message = "Hello, how are you?"
messages = [{ "content": user_message,"role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
# bad request call
response = completion(model="chatgpt-test", messages=[{"role": "user", "content": "Hi 👋 - i'm a bad request"}])
pip install litellm
```
<QueryParamReader/>
### 2. Check request state
All `completion()` calls print a link to your session dashboard
Click on your personal dashboard link. Here's how you can find it 👇
<Image img={require('../../img/dash_output.png')} alt="Dashboard" />
[👋 Tell us if you need better privacy controls](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version?month=2023-08)
### 3. Review request log
Oh! Looks like our request was made successfully. Let's click on it and see exactly what got sent to the LLM provider.
<Image img={require('../../img/dashboard_log_row.png')} alt="Dashboard Log Row" />
Ah! So we can see that this request was made to **Baseten** (see litellm_params > custom_llm_provider) for a model with ID - **7qQNLDB** (see model). The message sent was - `"Hey, how's it going?"` and the response received was - `"As an AI language model, I don't have feelings or emotions, but I can assist you with your queries. How can I assist you today?"`
<Image img={require('../../img/dashboard_log.png')} alt="Dashboard Log Row" />
:::info
🎉 Congratulations! You've successfully debugged your first log!
:::
## Deploy your first LLM
LiteLLM also lets you add a new model to your project - without touching code **or** using a proxy server.
### 1. Add new model
On the same debugger dashboard we just made, just go to the 'Add New LLM' Section:
* Select Provider
* Select your LLM
* Add your LLM Key
<Image img={require('../../img/add_model.png')} alt="Dashboard" />
This works with any model on - Replicate, Together_ai, Baseten, Anthropic, Cohere, AI21, OpenAI, Azure, VertexAI (Google Palm), OpenRouter
After adding your new LLM, LiteLLM securely stores your API key and model configs.
[👋 Tell us if you need to self-host **or** integrate with your key manager](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version?month=2023-08)
### 2. Test new model Using `completion()`
Once you've added your models, LiteLLM `completion()` calls will just work for those models + providers.
```python
import litellm
from litellm import completion
litellm.token = "80888ede-4881-4876-ab3f-765d47282e66" # use your token
messages = [{ "content": "Hello, how are you?" ,"role": "user"}]
# no need to set key, LiteLLM Client reads your set key
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
```
### 3. [Bonus] Get available model list
Get a list of all models you've created through the Dashboard with 1 function call
```python
import litellm
litellm.token = "80888ede-4881-4876-ab3f-765d47282e66" # use your token
litellm.get_model_list()
```
## Persisting your dashboard
If you want to use the same dashboard for your project, set
`litellm.token` in code or set `LITELLM_TOKEN` in your .env
All generated dashboards come with a token
```python
import litellm
litellm.token = "80888ede-4881-4876-ab3f-765d47282e66"
```
## Additional Information
### LiteLLM Dashboard - Debug Logs
All your `completion()` and `embedding()` call logs are available on `admin.litellm.ai/<your-token>`
#### Debug Logs for `completion()` and `embedding()`
<Image img={require('../../img/lite_logs.png')} alt="Dashboard" />
#### Viewing Errors on debug logs
<Image img={require('../../img/lite_logs2.png')} alt="Dashboard" />
### Opt-Out of using LiteLLM Client
If you want to opt out of using the LiteLLM client, you can set
```python
litellm.use_client = False
```

View file

@ -0,0 +1,22 @@
---
displayed_sidebar: tutorialSidebar
---
# Get Started
import QueryParamReader from '../src/components/queryParamReader.js'
import TokenComponent from '../src/components/queryParamToken.js'
:::info
This section assumes you've already added your API keys in <TokenComponent/>
If you want to use the non-hosted version, [go here](https://docs.litellm.ai/docs/#quick-start)
:::
```
pip install litellm
```
<QueryParamReader/>

View file

@ -0,0 +1,52 @@
# Exception Mapping
LiteLLM maps the 3 most common exceptions across all providers.
- Rate Limit Errors
- Context Window Errors
- InvalidAuth errors (useful for key rotation)
Base case - we return the original exception.
For all 3 cases, the exception returned inherits from the original OpenAI Exception but contains 3 additional attributes:
* status_code - the http status code of the exception
* message - the error message
* llm_provider - the provider raising the exception
## usage
```python
import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "bad-key"

try:
    # make a call with an invalid key
    completion(model="claude-instant-1", messages=[{"role": "user", "content": "Hey, how's it going?"}])
except Exception as e:
    print(e.llm_provider)
```
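Because the mapped exceptions inherit from the OpenAI error types, you can also catch them by class. Here's a sketch, assuming the pre-1.0 `openai` SDK where these classes live under `openai.error`:
```python
import os
import openai
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "bad-key"

try:
    completion(model="claude-instant-1", messages=[{"role": "user", "content": "Hey, how's it going?"}])
except openai.error.AuthenticationError as e:
    # the mapped exception carries the extra attributes listed above
    print(e.llm_provider, e.status_code, e.message)
except openai.error.RateLimitError as e:
    print("rate limited by", e.llm_provider)
```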
## details
To see how it's implemented - [check out the code](https://github.com/BerriAI/litellm/blob/a42c197e5a6de56ea576c73715e6c7c6b19fa249/litellm/utils.py#L1217)
[Create an issue](https://github.com/BerriAI/litellm/issues/new) **or** [make a PR](https://github.com/BerriAI/litellm/pulls) if you want to improve the exception mapping.
**Note** For OpenAI and Azure we return the original exception (since they're of the OpenAI Error type). But we add the 'llm_provider' attribute to them. [See code](https://github.com/BerriAI/litellm/blob/a42c197e5a6de56ea576c73715e6c7c6b19fa249/litellm/utils.py#L1221)
| LLM Provider | Initial Status Code / Initial Error Message | Returned Exception | Returned Status Code |
|----------------------|------------------------|-----------------|-----------------|
| Anthropic | 401 | AuthenticationError | 401 |
| Anthropic | Could not resolve authentication method. Expected either api_key or auth_token to be set. | AuthenticationError | 401 |
| Anthropic | 400 | InvalidRequestError | 400 |
| Anthropic | 429 | RateLimitError | 429 |
| Replicate | Incorrect authentication token | AuthenticationError | 401 |
| Replicate | ModelError | InvalidRequestError | 400 |
| Replicate | Request was throttled | RateLimitError | 429 |
| Replicate | ReplicateError | ServiceUnavailableError | 500 |
| Cohere | invalid api token | AuthenticationError | 401 |
| Cohere | too many tokens | InvalidRequestError | 400 |
| Cohere | CohereConnectionError | RateLimitError | 429 |
| Huggingface | 401 | AuthenticationError | 401 |
| Huggingface | 400 | InvalidRequestError | 400 |
| Huggingface | 429 | RateLimitError | 429 |

View file

@ -1,8 +1,10 @@
---
displayed_sidebar: tutorialSidebar
---
# litellm
import QueryParamReader from '../src/components/queryParamReader.js'
[![PyPI Version](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/)
[![PyPI Version](https://img.shields.io/badge/stable%20version-v0.1.345-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/)](https://pypi.org/project/litellm/0.1.1/)
[![CircleCI](https://dl.circleci.com/status-badge/img/gh/BerriAI/litellm/tree/main.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main)
@ -11,46 +13,40 @@ displayed_sidebar: tutorialSidebar
[![](https://dcbadge.vercel.app/api/server/wuPM9dRgDw)](https://discord.gg/wuPM9dRgDw)
a light package to simplify calling OpenAI, Azure, Cohere, Anthropic, Huggingface API Endpoints. It manages:
a light package to simplify calling OpenAI, Azure, Cohere, Anthropic, Huggingface API Endpoints. It manages:
- translating inputs to the provider's completion and embedding endpoints
- guarantees [consistent output](https://litellm.readthedocs.io/en/latest/output/), text responses will always be available at `['choices'][0]['message']['content']`
- exception mapping - common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance)
# usage
<a href='https://docs.litellm.ai/docs/completion/supported' target="_blank"><img alt='None' src='https://img.shields.io/badge/Supported_LLMs-100000?style=for-the-badge&logo=None&logoColor=000000&labelColor=000000&color=8400EA'/></a>
Demo - https://litellm.ai/playground \
Read the docs - https://docs.litellm.ai/docs/
## quick start
```
pip install litellm
```
```python
from litellm import completion
<QueryParamReader/>
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion("command-nightly", messages)
```
Code Sample: [Getting Started Notebook](https://colab.research.google.com/drive/1gR3pY-JzDZahzpVdbGBtrNGDBmzUNJaJ?usp=sharing)
Stable version
```
pip install litellm==0.1.345
```
## Streaming Queries
liteLLM supports streaming the model response back; pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for OpenAI, Azure, Anthropic, Huggingface models
```python
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
@ -63,10 +59,12 @@ for chunk in result:
```
# support / talk with founders
- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
# why did we build this
# why did we build this
- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, Cohere

View file

@ -1,29 +1,31 @@
# Callbacks
## Use Callbacks to send Output Data to Posthog, Sentry etc
liteLLM provides `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses.
liteLLM supports:
liteLLM provides `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses.
liteLLM supports:
- [LLMonitor](https://llmonitor.com/docs)
- [Helicone](https://docs.helicone.ai/introduction)
- [Sentry](https://docs.sentry.io/platforms/python/)
- [Sentry](https://docs.sentry.io/platforms/python/)
- [PostHog](https://posthog.com/docs/libraries/python)
- [Slack](https://slack.dev/bolt-python/concepts)
### Quick Start
```python
import os
import litellm
from litellm import completion

messages = [{"role": "user", "content": "Hi 👋"}]
# set callbacks
litellm.success_callback=["posthog", "helicone"]
litellm.failure_callback=["sentry"]
litellm.success_callback=["posthog", "helicone", "llmonitor"]
litellm.failure_callback=["sentry", "llmonitor"]
## set env variables
os.environ['SENTRY_API_URL'], os.environ['SENTRY_API_TRACE_RATE'] = "", ""
os.environ['POSTHOG_API_KEY'], os.environ['POSTHOG_API_URL'] = "api-key", "api-url"
os.environ["HELICONE_API_KEY"] = ""
os.environ["HELICONE_API_KEY"] = ""
os.environ["LLMONITOR_APP_ID"] = ""
response = completion(model="gpt-3.5-turbo", messages=messages)
response = completion(model="gpt-3.5-turbo", messages=messages)
```

View file

@ -1,12 +1,10 @@
# Logging Integrations
| Integration | Required OS Variables | How to Use with callbacks |
|-----------------|--------------------------------------------|-------------------------------------------|
| Sentry | `SENTRY_API_URL` | `litellm.success_callback=["sentry"]` |
| Posthog | `POSTHOG_API_KEY`,`POSTHOG_API_URL` | `litellm.success_callback=["posthog"]` |
| Slack | `SLACK_API_TOKEN`,`SLACK_API_SECRET`,`SLACK_API_CHANNEL` | `litellm.success_callback=["slack"]` |
| Helicone | `HELICONE_API_TOKEN` | `litellm.success_callback=["helicone"]` |
| Integration | Required OS Variables | How to Use with callbacks |
| ----------- | -------------------------------------------------------- | ---------------------------------------- |
| Promptlayer | `PROMPTLAYER_API_KEY` | `litellm.success_callback=["promptlayer"]` |
| LLMonitor | `LLMONITOR_APP_ID` | `litellm.success_callback=["llmonitor"]` |
| Sentry | `SENTRY_API_URL` | `litellm.success_callback=["sentry"]` |
| Posthog | `POSTHOG_API_KEY`,`POSTHOG_API_URL` | `litellm.success_callback=["posthog"]` |
| Slack | `SLACK_API_TOKEN`,`SLACK_API_SECRET`,`SLACK_API_CHANNEL` | `litellm.success_callback=["slack"]` |
| Helicone | `HELICONE_API_TOKEN` | `litellm.success_callback=["helicone"]` |
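As a quick sketch of how the table maps to code - the key values below are placeholders, and you can enable any combination of callbacks:
```python
import os
import litellm
from litellm import completion

## env variables from the table (placeholder values)
os.environ["LLMONITOR_APP_ID"] = "your-llmonitor-app-id"
os.environ["SENTRY_API_URL"] = "your-sentry-api-url"
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# enable callbacks for successful and failed calls
litellm.success_callback = ["llmonitor"]
litellm.failure_callback = ["sentry", "llmonitor"]

response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋"}])
```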

View file

@ -0,0 +1,49 @@
# LLMonitor Tutorial
[LLMonitor](https://llmonitor.com/) is an open-source observability platform that provides cost tracking, user tracking and powerful agent tracing.
<video controls width='900' >
<source src='https://llmonitor.com/videos/demo-annotated.mp4'/>
</video>
## Use LLMonitor to log requests across all LLM Providers (OpenAI, Azure, Anthropic, Cohere, Replicate, PaLM)
liteLLM provides `callbacks`, making it easy for you to log data depending on the status of your responses.
### Using Callbacks
First, sign up to get an app ID on the [LLMonitor dashboard](https://llmonitor.com).
Use just 2 lines of code to instantly log your responses **across all providers** with llmonitor:
```
litellm.success_callback = ["llmonitor"]
litellm.failure_callback = ["llmonitor"]
```
Complete code
```python
import os
import litellm
from litellm import completion

## set env variables
os.environ["LLMONITOR_APP_ID"] = "your-llmonitor-app-id"
# Optional: os.environ["LLMONITOR_API_URL"] = "self-hosting-url"
os.environ["OPENAI_API_KEY"], os.environ["COHERE_API_KEY"] = "", ""
# set callbacks
litellm.success_callback = ["llmonitor"]
litellm.failure_callback = ["llmonitor"]
#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
#cohere call
response = completion(model="command-nightly", messages=[{"role": "user", "content": "Hi 👋 - i'm cohere"}])
```
## Support
For any question or issue with integration you can reach out to the LLMonitor team on [Discord](http://discord.com/invite/8PafSG58kK) or via [email](mailto:vince@llmonitor.com).

View file

@ -0,0 +1,40 @@
# Promptlayer Tutorial
Promptlayer is a platform for prompt engineers. Log OpenAI requests. Search usage history. Track performance. Visually manage prompt templates.
<Image img={require('../../img/promptlayer.png')} />
## Use Promptlayer to log requests across all LLM Providers (OpenAI, Azure, Anthropic, Cohere, Replicate, PaLM)
liteLLM provides `callbacks`, making it easy for you to log data depending on the status of your responses.
### Using Callbacks
Get your PromptLayer API Key from https://promptlayer.com/
Use just 1 line of code to instantly log your responses **across all providers** with promptlayer:
```python
litellm.success_callback = ["promptlayer"]
```
Complete code
```python
import os
import litellm
from litellm import completion

## set env variables
os.environ["PROMPTLAYER_API_KEY"] = "your-promptlayer-api-key"
os.environ["OPENAI_API_KEY"], os.environ["COHERE_API_KEY"] = "", ""
# set callbacks
litellm.success_callback = ["promptlayer"]
#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
#cohere call
response = completion(model="command-nightly", messages=[{"role": "user", "content": "Hi 👋 - i'm cohere"}])
```

View file

@ -1,4 +1,4 @@
# Streaming Responses & Async Completion
# Streaming + Async
- [Streaming Responses](#streaming-responses)
- [Async Completion](#async-completion)

View file

@ -0,0 +1,139 @@
# A/B Test LLMs
LiteLLM allows you to call 100+ LLMs using completion
## This template server allows you to define LLMs with their A/B test ratios
```python
llm_dict = {
    "gpt-4": 0.2,
    "together_ai/togethercomputer/llama-2-70b-chat": 0.4,
    "claude-2": 0.2,
    "claude-1.2": 0.2
}
```
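To make the ratios concrete, here is a minimal sketch of weighted model selection over `llm_dict` - an illustration of the idea, not necessarily the template server's exact implementation:
```python
import random
from litellm import completion

llm_dict = {
    "gpt-4": 0.2,
    "together_ai/togethercomputer/llama-2-70b-chat": 0.4,
    "claude-2": 0.2,
    "claude-1.2": 0.2
}

def completion_with_ab_test(messages):
    # pick a model with probability proportional to its ratio
    models = list(llm_dict.keys())
    weights = list(llm_dict.values())
    selected_model = random.choices(models, weights=weights, k=1)[0]
    return completion(model=selected_model, messages=messages)

response = completion_with_ab_test([{"role": "user", "content": "Hello, how are you?"}])
```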
All models defined can be called with the same Input/Output format using litellm `completion`
```python
from litellm import completion

# SET API KEYS in .env
messages = [{"role": "user", "content": "Hello, how are you?"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion(model="command-nightly", messages=messages)
# anthropic
response = completion(model="claude-2", messages=messages)
```
This server allows you to view responses, costs and latency on your LiteLLM dashboard
### LiteLLM Client UI
<Image img={require('../../img/ab_test_logs.png')} />
# Using LiteLLM A/B Testing Server
## Setup
### Install LiteLLM
```
pip install litellm
```
Stable version
```
pip install litellm==0.1.424
```
### Clone LiteLLM Git Repo
```
git clone https://github.com/BerriAI/litellm/
```
### Navigate to LiteLLM-A/B Test Server
```
cd litellm/cookbook/llm-ab-test-server
```
### Run the Server
```
python3 main.py
```
### Set your LLM Configs
Set the LLMs and the weights you want to run A/B tests with.
In `main.py`, set the LLMs you want to A/B test in `llm_dict`.
You can A/B test 100+ LLMs using LiteLLM https://docs.litellm.ai/docs/completion/supported
```python
llm_dict = {
    "gpt-4": 0.2,
    "together_ai/togethercomputer/llama-2-70b-chat": 0.4,
    "claude-2": 0.2,
    "claude-1.2": 0.2
}
```
#### Setting your API Keys
Set your LLM API keys in a .env file in the directory or set them as `os.environ` variables.
See https://docs.litellm.ai/docs/completion/supported for the format of API keys
LiteLLM standardizes API keys to follow the format
`PROVIDER_API_KEY`
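For example, keys for the models in `llm_dict` above would look roughly like this - the exact variable names are listed in the supported-providers doc, and the values are placeholders:
```python
import os

# PROVIDER_API_KEY convention (placeholder values)
os.environ["OPENAI_API_KEY"] = "sk-..."           # for gpt-4
os.environ["TOGETHERAI_API_KEY"] = "..."          # for together_ai/ models
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."    # for claude-2 / claude-1.2
```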
## Making Requests to the LiteLLM Server Locally
The server follows the Input/Output format set by the OpenAI Chat Completions API
Here is an example request made to the LiteLLM Server
### Python
```python
import requests
import json
url = "http://localhost:5000/chat/completions"
payload = json.dumps({
    "messages": [
        {
            "content": "who is CTO of litellm",
            "role": "user"
        }
    ]
})
headers = {
    'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
```
### Curl Command
```
curl --location 'http://localhost:5000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [
        {
            "content": "who is CTO of litellm",
            "role": "user"
        }
    ]
}'
```
## Viewing Logs
After running your first `completion()` call, litellm autogenerates a new logs dashboard for you. A link to your logs dashboard is printed in the terminal / console.
Example Terminal Output with Log Dashboard
<Image img={require('../../img/term_output.png')} />

View file

@ -0,0 +1,189 @@
# Create your first LLM playground
import Image from '@theme/IdealImage';
Create a playground to **evaluate multiple LLM Providers in less than 10 minutes**. If you want to see this in prod, check out our [website](https://litellm.ai/).
**What will it look like?**
<Image
img={require('../../img/litellm_streamlit_playground.png')}
alt="streamlit_playground"
style={{ maxWidth: '75%', height: 'auto' }}
/>
**How will we do this?**: We'll build <u>the server</u> and connect it to our template frontend, ending up with a working playground UI by the end!
:::info
Before you start, make sure you have followed the [environment-setup](./installation) guide. Please note that this tutorial requires API keys from at least one model provider (e.g. OpenAI).
:::
## 1. Quick start
Let's make sure our keys are working. Run this script in any environment of your choice (e.g. [Google Colab](https://colab.research.google.com/#create=true)).
🚨 Don't forget to replace the placeholder key values with your keys!
```
pip install litellm
```
```python
import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key" ## REPLACE THIS
os.environ["COHERE_API_KEY"] = "cohere key" ## REPLACE THIS
os.environ["AI21_API_KEY"] = "ai21 key" ## REPLACE THIS
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion("command-nightly", messages)
# ai21 call
response = completion("j2-mid", messages)
```
## 2. Set-up Server
Let's build a basic Flask app as our backend server. We'll give it a specific route for our completion calls.
**Notes**:
* 🚨 Don't forget to replace the placeholder key values with your keys!
* `completion_with_retries`: LLM API calls can fail in production. This function wraps the normal litellm completion() call with [tenacity](https://tenacity.readthedocs.io/en/latest/) to retry the call in case it fails.
LiteLLM specific snippet:
```python
import os
from litellm import completion_with_retries
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key" ## REPLACE THIS
os.environ["COHERE_API_KEY"] = "cohere key" ## REPLACE THIS
os.environ["AI21_API_KEY"] = "ai21 key" ## REPLACE THIS
@app.route('/chat/completions', methods=["POST"])
def api_completion():
    data = request.json
    data["max_tokens"] = 256 # By default let's set max_tokens to 256
    try:
        # COMPLETION CALL
        response = completion_with_retries(**data)
    except Exception as e:
        # print the error, then return it so the route always responds
        print(e)
        response = {"error": str(e)}
    return response
```
The complete code:
```python
import os
from flask import Flask, jsonify, request
from litellm import completion_with_retries
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key" ## REPLACE THIS
os.environ["COHERE_API_KEY"] = "cohere key" ## REPLACE THIS
os.environ["AI21_API_KEY"] = "ai21 key" ## REPLACE THIS
app = Flask(__name__)
# Example route
@app.route('/', methods=['GET'])
def hello():
    return jsonify(message="Hello, Flask!")

@app.route('/chat/completions', methods=["POST"])
def api_completion():
    data = request.json
    data["max_tokens"] = 256 # By default let's set max_tokens to 256
    try:
        # COMPLETION CALL
        response = completion_with_retries(**data)
    except Exception as e:
        # print the error, then return it so the route always responds
        print(e)
        response = {"error": str(e)}
    return response

if __name__ == '__main__':
    from waitress import serve
    serve(app, host="0.0.0.0", port=4000, threads=500)
```
### Let's test it
Start the server:
```
python main.py
```
Run this curl command to test it:
```curl
curl -X POST localhost:4000/chat/completions \
-H 'Content-Type: application/json' \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{
"content": "Hello, how are you?",
"role": "user"
}]
}'
```
This is what you should see
<Image img={require('../../img/test_python_server_2.png')} alt="python_code_sample_2" />
## 3. Connect to our frontend template
### 3.1 Download template
For our frontend, we'll use [Streamlit](https://streamlit.io/) - this enables us to build a simple python web-app.
Let's download the playground template we (LiteLLM) have created:
```zsh
git clone https://github.com/BerriAI/litellm_playground_fe_template.git
```
### 3.2 Run it
Make sure our server from [step 2](#2-set-up-server) is still running at port 4000
:::info
If you used another port, no worries - just make sure you change [this line](https://github.com/BerriAI/litellm_playground_fe_template/blob/411bea2b6a2e0b079eb0efd834886ad783b557ef/app.py#L7) in your playground template's app.py
:::
Now let's run our app:
```zsh
cd litellm_playground_fe_template && streamlit run app.py
```
If you're missing Streamlit - just pip install it (or check out their [installation guidelines](https://docs.streamlit.io/library/get-started/installation#install-streamlit-on-macoslinux))
```zsh
pip install streamlit
```
This is what you should see:
<Image img={require('../../img/litellm_streamlit_playground.png')} alt="streamlit_playground" />
# Congratulations 🚀
You've created your first LLM Playground - with the ability to call 50+ LLM APIs.
Next Steps:
* [Check out the full list of LLM Providers you can now add](../completion/supported)
* [Deploy your server using Render](https://render.com/docs/deploy-flask)
* [Deploy your playground using Streamlit](https://docs.streamlit.io/streamlit-community-cloud/deploy-your-app)

View file

@ -0,0 +1,17 @@
---
displayed_sidebar: tutorialSidebar
---
# Set up environment
Let's get the necessary keys to set up our demo environment.
Every LLM provider requires an API key (e.g. `OPENAI_API_KEY`). You can get API keys from OpenAI, Cohere, and AI21 **without a waitlist**.
Let's get them for our demo!
**OpenAI**: https://platform.openai.com/account/api-keys
**Cohere**: https://dashboard.cohere.com/welcome/login?redirect_uri=%2Fapi-keys (no credit card required)
**AI21**: https://studio.ai21.com/account/api-key (no credit card required)

View file

@ -0,0 +1,136 @@
# Reliability test Multiple LLM Providers with LiteLLM
* Quality Testing
* Load Testing
* Duration Testing
```python
!pip install litellm python-dotenv
```
```python
import litellm
from litellm import load_test_model, testing_batch_completion
import time
```
```python
from dotenv import load_dotenv
load_dotenv()
```
# Quality Test endpoint
## Test the same prompt across multiple LLM providers
In this example, let's ask some questions about Paul Graham
```python
models = ["gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-4", "claude-instant-1", "replicate/llama-2-70b-chat:58d078176e02c219e11eb4da5a02a7830a283b14cf8f94537af893ccff5ee781"]
context = """Paul Graham (/ɡræm/; born 1964)[3] is an English computer scientist, essayist, entrepreneur, venture capitalist, and author. He is best known for his work on the programming language Lisp, his former startup Viaweb (later renamed Yahoo! Store), cofounding the influential startup accelerator and seed capital firm Y Combinator, his essays, and Hacker News. He is the author of several computer programming books, including: On Lisp,[4] ANSI Common Lisp,[5] and Hackers & Painters.[6] Technology journalist Steven Levy has described Graham as a "hacker philosopher".[7] Graham was born in England, where he and his family maintain permanent residence. However he is also a citizen of the United States, where he was educated, lived, and worked until 2016."""
prompts = ["Who is Paul Graham?", "What is Paul Graham known for?" , "Is paul graham a writer?" , "Where does Paul Graham live?", "What has Paul Graham done?"]
messages = [[{"role": "user", "content": context + "\n" + prompt}] for prompt in prompts] # pass in a list of messages we want to test
result = testing_batch_completion(models=models, messages=messages)
```
# Load Test endpoint
Run 100+ simultaneous queries across multiple providers to see when they fail + impact on latency
```python
models=["gpt-3.5-turbo", "replicate/llama-2-70b-chat:58d078176e02c219e11eb4da5a02a7830a283b14cf8f94537af893ccff5ee781", "claude-instant-1"]
context = """Paul Graham (/ɡræm/; born 1964)[3] is an English computer scientist, essayist, entrepreneur, venture capitalist, and author. He is best known for his work on the programming language Lisp, his former startup Viaweb (later renamed Yahoo! Store), cofounding the influential startup accelerator and seed capital firm Y Combinator, his essays, and Hacker News. He is the author of several computer programming books, including: On Lisp,[4] ANSI Common Lisp,[5] and Hackers & Painters.[6] Technology journalist Steven Levy has described Graham as a "hacker philosopher".[7] Graham was born in England, where he and his family maintain permanent residence. However he is also a citizen of the United States, where he was educated, lived, and worked until 2016."""
prompt = "Where does Paul Graham live?"
final_prompt = context + prompt
result = load_test_model(models=models, prompt=final_prompt, num_calls=5)
```
## Visualize the data
```python
import matplotlib.pyplot as plt
## calculate avg response time
unique_models = set(result["response"]['model'] for result in result["results"])
model_dict = {model: {"response_time": []} for model in unique_models}
for completion_result in result["results"]:
model_dict[completion_result["response"]["model"]]["response_time"].append(completion_result["response_time"])
avg_response_time = {}
for model, data in model_dict.items():
avg_response_time[model] = sum(data["response_time"]) / len(data["response_time"])
models = list(avg_response_time.keys())
response_times = list(avg_response_time.values())
plt.bar(models, response_times)
plt.xlabel('Model', fontsize=10)
plt.ylabel('Average Response Time')
plt.title('Average Response Times for each Model')
plt.xticks(models, [model[:15]+'...' if len(model) > 15 else model for model in models], rotation=45)
plt.show()
```
![png](litellm_Test_Multiple_Providers_files/litellm_Test_Multiple_Providers_11_0.png)
# Duration Test endpoint
Run load testing for 2 minutes, hitting endpoints with 100+ queries every 15 seconds.
```python
models=["gpt-3.5-turbo", "replicate/llama-2-70b-chat:58d078176e02c219e11eb4da5a02a7830a283b14cf8f94537af893ccff5ee781", "claude-instant-1"]
context = """Paul Graham (/ɡræm/; born 1964)[3] is an English computer scientist, essayist, entrepreneur, venture capitalist, and author. He is best known for his work on the programming language Lisp, his former startup Viaweb (later renamed Yahoo! Store), cofounding the influential startup accelerator and seed capital firm Y Combinator, his essays, and Hacker News. He is the author of several computer programming books, including: On Lisp,[4] ANSI Common Lisp,[5] and Hackers & Painters.[6] Technology journalist Steven Levy has described Graham as a "hacker philosopher".[7] Graham was born in England, where he and his family maintain permanent residence. However he is also a citizen of the United States, where he was educated, lived, and worked until 2016."""
prompt = "Where does Paul Graham live?"
final_prompt = context + prompt
result = load_test_model(models=models, prompt=final_prompt, num_calls=100, interval=15, duration=120)
```
```python
import matplotlib.pyplot as plt
## calculate avg response time
unique_models = set(unique_result["response"]['model'] for unique_result in result[0]["results"])
model_dict = {model: {"response_time": []} for model in unique_models}
for iteration in result:
    for completion_result in iteration["results"]:
        model_dict[completion_result["response"]["model"]]["response_time"].append(completion_result["response_time"])

avg_response_time = {}
for model, data in model_dict.items():
    avg_response_time[model] = sum(data["response_time"]) / len(data["response_time"])
models = list(avg_response_time.keys())
response_times = list(avg_response_time.values())
plt.bar(models, response_times)
plt.xlabel('Model', fontsize=10)
plt.ylabel('Average Response Time')
plt.title('Average Response Times for each Model')
plt.xticks(models, [model[:15]+'...' if len(model) > 15 else model for model in models], rotation=45)
plt.show()
```
![png](litellm_Test_Multiple_Providers_files/litellm_Test_Multiple_Providers_14_0.png)

Binary file not shown.

After

Width:  |  Height:  |  Size: 26 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 28 KiB

View file

@ -0,0 +1,39 @@
# Using Text Completion Format - with Completion()
If you prefer interfacing with the OpenAI Text Completion format, this tutorial covers how to use LiteLLM in that format
```python
import openai

response = openai.Completion.create(
model="text-davinci-003",
prompt='Write a tagline for a traditional bavarian tavern',
temperature=0,
max_tokens=100)
```
## Using LiteLLM in the Text Completion format
### With gpt-3.5-turbo
```python
from litellm import text_completion
response = text_completion(
model="gpt-3.5-turbo",
prompt='Write a tagline for a traditional bavarian tavern',
temperature=0,
max_tokens=100)
```
### With text-davinci-003
```python
response = text_completion(
model="text-davinci-003",
prompt='Write a tagline for a traditional bavarian tavern',
temperature=0,
max_tokens=100)
```
### With llama2
```python
response = text_completion(
model="togethercomputer/llama-2-70b-chat",
prompt='Write a tagline for a traditional bavarian tavern',
temperature=0,
max_tokens=100)
```

View file

@ -31,13 +31,16 @@ const config = {
[
'@docusaurus/plugin-ideal-image',
{
quality: 70,
max: 1030, // max resized image's size.
quality: 100,
max: 1920, // max resized image's size.
min: 640, // min resized image's size. if original is lower, use that size.
steps: 2, // the max number of images generated between min and max (inclusive)
disableInDev: false,
},
],
[ require.resolve('docusaurus-lunr-search'), {
languages: ['en'] // language codes
}]
],
presets: [
@ -60,7 +63,7 @@ const config = {
/** @type {import('@docusaurus/preset-classic').ThemeConfig} */
({
// Replace with your project's social card
image: 'img/docusaurus-social-card.jpg',
image: 'img/docusaurus-social-card.png',
navbar: {
title: '🚅 LiteLLM',
items: [

Binary file not shown.

After

Width:  |  Height:  |  Size: 476 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 66 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 145 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 523 KiB

After

Width:  |  Height:  |  Size: 386 KiB


Binary file not shown.

After

Width:  |  Height:  |  Size: 2.4 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 20 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 429 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 505 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 246 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 467 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 123 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 94 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 254 KiB

View file

@ -13,9 +13,11 @@
"@docusaurus/preset-classic": "2.4.1",
"@mdx-js/react": "^1.6.22",
"clsx": "^1.2.1",
"docusaurus-lunr-search": "^2.4.1",
"prism-react-renderer": "^1.3.5",
"react": "^17.0.2",
"react-dom": "^17.0.2",
"sharp": "^0.32.5",
"uuid": "^9.0.0"
},
"devDependencies": {
@ -2285,6 +2287,33 @@
"node": ">=16.14"
}
},
"node_modules/@docusaurus/lqip-loader/node_modules/node-addon-api": {
"version": "5.1.0",
"resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-5.1.0.tgz",
"integrity": "sha512-eh0GgfEkpnoWDq+VY8OyvYhFEzBk6jIYbRKdIlyTiAXIVJ8PyBaKb0rp7oDtoddbdoHWhq8wwr+XZ81F1rpNdA=="
},
"node_modules/@docusaurus/lqip-loader/node_modules/sharp": {
"version": "0.30.7",
"resolved": "https://registry.npmjs.org/sharp/-/sharp-0.30.7.tgz",
"integrity": "sha512-G+MY2YW33jgflKPTXXptVO28HvNOo9G3j0MybYAHeEmby+QuD2U98dT6ueht9cv/XDqZspSpIhoSW+BAKJ7Hig==",
"hasInstallScript": true,
"dependencies": {
"color": "^4.2.3",
"detect-libc": "^2.0.1",
"node-addon-api": "^5.0.0",
"prebuild-install": "^7.1.1",
"semver": "^7.3.7",
"simple-get": "^4.0.1",
"tar-fs": "^2.1.1",
"tunnel-agent": "^0.6.0"
},
"engines": {
"node": ">=12.13.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@docusaurus/mdx-loader": {
"version": "2.4.1",
"resolved": "https://registry.npmjs.org/@docusaurus/mdx-loader/-/mdx-loader-2.4.1.tgz",
@ -2522,6 +2551,33 @@
}
}
},
"node_modules/@docusaurus/plugin-ideal-image/node_modules/node-addon-api": {
"version": "5.1.0",
"resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-5.1.0.tgz",
"integrity": "sha512-eh0GgfEkpnoWDq+VY8OyvYhFEzBk6jIYbRKdIlyTiAXIVJ8PyBaKb0rp7oDtoddbdoHWhq8wwr+XZ81F1rpNdA=="
},
"node_modules/@docusaurus/plugin-ideal-image/node_modules/sharp": {
"version": "0.30.7",
"resolved": "https://registry.npmjs.org/sharp/-/sharp-0.30.7.tgz",
"integrity": "sha512-G+MY2YW33jgflKPTXXptVO28HvNOo9G3j0MybYAHeEmby+QuD2U98dT6ueht9cv/XDqZspSpIhoSW+BAKJ7Hig==",
"hasInstallScript": true,
"dependencies": {
"color": "^4.2.3",
"detect-libc": "^2.0.1",
"node-addon-api": "^5.0.0",
"prebuild-install": "^7.1.1",
"semver": "^7.3.7",
"simple-get": "^4.0.1",
"tar-fs": "^2.1.1",
"tunnel-agent": "^0.6.0"
},
"engines": {
"node": ">=12.13.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/@docusaurus/plugin-sitemap": {
"version": "2.4.1",
"resolved": "https://registry.npmjs.org/@docusaurus/plugin-sitemap/-/plugin-sitemap-2.4.1.tgz",
@ -3822,6 +3878,11 @@
"resolved": "https://registry.npmjs.org/@xtuc/long/-/long-4.2.2.tgz",
"integrity": "sha512-NuHqBY1PB/D8xU6s/thBgOAiAP7HOYDQ32+BFZILJ8ivkUkAHQnWfn6WhL79Owj1qmUnoN/YPhktdIoucipkAQ=="
},
"node_modules/abbrev": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/abbrev/-/abbrev-1.1.1.tgz",
"integrity": "sha512-nne9/IiQ/hzIhY6pdDnbBtz7DjPTKrY00P/zvPSm5pOFkl6xuGrGnXn/VtTNNfNtAfZ9/1RtehkszU9qcTii0Q=="
},
"node_modules/accepts": {
"version": "1.3.8",
"resolved": "https://registry.npmjs.org/accepts/-/accepts-1.3.8.tgz",
@ -4062,6 +4123,11 @@
"node": ">= 8"
}
},
"node_modules/aproba": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/aproba/-/aproba-2.0.0.tgz",
"integrity": "sha512-lYe4Gx7QT+MKGbDsA+Z+he/Wtef0BiwDOlK/XkBrdfsh9J/jPPXbX0tE9x9cl27Tmu5gg3QUbUrQYa/y+KOHPQ=="
},
"node_modules/arg": {
"version": "5.0.2",
"resolved": "https://registry.npmjs.org/arg/-/arg-5.0.2.tgz",
@ -4098,6 +4164,14 @@
"node": ">= 4.0.0"
}
},
"node_modules/autocomplete.js": {
"version": "0.37.1",
"resolved": "https://registry.npmjs.org/autocomplete.js/-/autocomplete.js-0.37.1.tgz",
"integrity": "sha512-PgSe9fHYhZEsm/9jggbjtVsGXJkPLvd+9mC7gZJ662vVL5CRWEtm/mIrrzCx0MrNxHVwxD5d00UOn6NsmL2LUQ==",
"dependencies": {
"immediate": "^3.2.3"
}
},
"node_modules/autoprefixer": {
"version": "10.4.14",
"resolved": "https://registry.npmjs.org/autoprefixer/-/autoprefixer-10.4.14.tgz",
@ -4138,6 +4212,11 @@
"follow-redirects": "^1.14.7"
}
},
"node_modules/b4a": {
"version": "1.6.4",
"resolved": "https://registry.npmjs.org/b4a/-/b4a-1.6.4.tgz",
"integrity": "sha512-fpWrvyVHEKyeEvbKZTVOeZF3VSKKWtJxFIxX/jaVPf+cLbGUSitjb49pHLqPV2BUNNZ0LcoeEGfE/YCpyDYHIw=="
},
"node_modules/babel-loader": {
"version": "8.3.0",
"resolved": "https://registry.npmjs.org/babel-loader/-/babel-loader-8.3.0.tgz",
@ -4289,6 +4368,15 @@
"resolved": "https://registry.npmjs.org/batch/-/batch-0.6.1.tgz",
"integrity": "sha512-x+VAiMRL6UPkx+kudNvxTl6hB2XNNCG2r+7wixVfIYwu/2HKRXimwQyaumLjMveWvT2Hkd/cAJw+QBMfJ/EKVw=="
},
"node_modules/bcp-47-match": {
"version": "1.0.3",
"resolved": "https://registry.npmjs.org/bcp-47-match/-/bcp-47-match-1.0.3.tgz",
"integrity": "sha512-LggQ4YTdjWQSKELZF5JwchnBa1u0pIQSZf5lSdOHEdbVP55h0qICA/FUp3+W99q0xqxYa1ZQizTUH87gecII5w==",
"funding": {
"type": "github",
"url": "https://github.com/sponsors/wooorm"
}
},
"node_modules/big.js": {
"version": "5.2.2",
"resolved": "https://registry.npmjs.org/big.js/-/big.js-5.2.2.tgz",
@ -4888,6 +4976,14 @@
"simple-swizzle": "^0.2.2"
}
},
"node_modules/color-support": {
"version": "1.1.3",
"resolved": "https://registry.npmjs.org/color-support/-/color-support-1.1.3.tgz",
"integrity": "sha512-qiBjkpbMLO/HL68y+lh4q0/O1MZFj2RX6X/KmMa3+gJD3z+WwI1ZzDHysvqHGS3mP6mznPckpXmw1nI9cJjyRg==",
"bin": {
"color-support": "bin.js"
}
},
"node_modules/colord": {
"version": "2.9.3",
"resolved": "https://registry.npmjs.org/colord/-/colord-2.9.3.tgz",
@ -5016,6 +5112,11 @@
"resolved": "https://registry.npmjs.org/consola/-/consola-2.15.3.tgz",
"integrity": "sha512-9vAdYbHj6x2fLKC4+oPH0kFzY/orMZyG2Aj+kNylHxKGJ/Ed4dpNyAQYwJOdqO4zdM7XpVHmyejQDcQHrnuXbw=="
},
"node_modules/console-control-strings": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/console-control-strings/-/console-control-strings-1.1.0.tgz",
"integrity": "sha512-ty/fTekppD2fIwRvnZAVdeOiGd1c7YXEixbgJTNzqcxJWKQnjJ/V1bNEEE6hygpM3WjwHFUVK6HTjWSzV4a8sQ=="
},
"node_modules/consolidated-events": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/consolidated-events/-/consolidated-events-2.0.2.tgz",
@ -5402,6 +5503,11 @@
"url": "https://github.com/sponsors/fb55"
}
},
"node_modules/css-selector-parser": {
"version": "1.4.1",
"resolved": "https://registry.npmjs.org/css-selector-parser/-/css-selector-parser-1.4.1.tgz",
"integrity": "sha512-HYPSb7y/Z7BNDCOrakL4raGO2zltZkbeXyAd6Tg9obzix6QhzxCotdBl6VT0Dv4vZfJGVz3WL/xaEI9Ly3ul0g=="
},
"node_modules/css-tree": {
"version": "1.1.3",
"resolved": "https://registry.npmjs.org/css-tree/-/css-tree-1.1.3.tgz",
@ -5742,6 +5848,18 @@
"node": ">=8"
}
},
"node_modules/direction": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/direction/-/direction-1.0.4.tgz",
"integrity": "sha512-GYqKi1aH7PJXxdhTeZBFrg8vUBeKXi+cNprXsC1kpJcbcVnV9wBsrOu1cQEdG0WeQwlfHiy3XvnKfIrJ2R0NzQ==",
"bin": {
"direction": "cli.js"
},
"funding": {
"type": "github",
"url": "https://github.com/sponsors/wooorm"
}
},
"node_modules/dns-equal": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/dns-equal/-/dns-equal-1.0.0.tgz",
@ -5758,6 +5876,35 @@
"node": ">=6"
}
},
"node_modules/docusaurus-lunr-search": {
"version": "2.4.1",
"resolved": "https://registry.npmjs.org/docusaurus-lunr-search/-/docusaurus-lunr-search-2.4.1.tgz",
"integrity": "sha512-UOgaAypgO0iLyA1Hk4EThG/ofLm9/JldznzN98ZKr7TMYVjMZbAEaIBKLAUDFdfOPr9D5EswXdLn39/aRkwHMA==",
"dependencies": {
"autocomplete.js": "^0.37.0",
"clsx": "^1.2.1",
"gauge": "^3.0.0",
"hast-util-select": "^4.0.0",
"hast-util-to-text": "^2.0.0",
"hogan.js": "^3.0.2",
"lunr": "^2.3.8",
"lunr-languages": "^1.4.0",
"minimatch": "^3.0.4",
"object-assign": "^4.1.1",
"rehype-parse": "^7.0.1",
"to-vfile": "^6.1.0",
"unified": "^9.0.0",
"unist-util-is": "^4.0.2"
},
"engines": {
"node": ">= 8.10.0"
},
"peerDependencies": {
"@docusaurus/core": "^2.0.0-alpha.60 || ^2.0.0",
"react": "^16.8.4 || ^17",
"react-dom": "^16.8.4 || ^17"
}
},
"node_modules/dom-converter": {
"version": "0.2.0",
"resolved": "https://registry.npmjs.org/dom-converter/-/dom-converter-0.2.0.tgz",
@ -6224,6 +6371,11 @@
"resolved": "https://registry.npmjs.org/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz",
"integrity": "sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q=="
},
"node_modules/fast-fifo": {
"version": "1.3.2",
"resolved": "https://registry.npmjs.org/fast-fifo/-/fast-fifo-1.3.2.tgz",
"integrity": "sha512-/d9sfos4yxzpwkDkuN7k2SqFKtYNmCTzgfEpz82x34IM9/zc8KGxQoXg1liNC/izpRM/MBdt44Nmx41ZWqk+FQ=="
},
"node_modules/fast-glob": {
"version": "3.3.1",
"resolved": "https://registry.npmjs.org/fast-glob/-/fast-glob-3.3.1.tgz",
@ -6619,6 +6771,43 @@
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.1.tgz",
"integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A=="
},
"node_modules/gauge": {
"version": "3.0.2",
"resolved": "https://registry.npmjs.org/gauge/-/gauge-3.0.2.tgz",
"integrity": "sha512-+5J6MS/5XksCuXq++uFRsnUd7Ovu1XenbeuIuNRJxYWjgQbPuFhT14lAvsWfqfAmnwluf1OwMjz39HjfLPci0Q==",
"dependencies": {
"aproba": "^1.0.3 || ^2.0.0",
"color-support": "^1.1.2",
"console-control-strings": "^1.0.0",
"has-unicode": "^2.0.1",
"object-assign": "^4.1.1",
"signal-exit": "^3.0.0",
"string-width": "^4.2.3",
"strip-ansi": "^6.0.1",
"wide-align": "^1.1.2"
},
"engines": {
"node": ">=10"
}
},
"node_modules/gauge/node_modules/emoji-regex": {
"version": "8.0.0",
"resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-8.0.0.tgz",
"integrity": "sha512-MSjYzcWNOA0ewAHpz0MxpYFvwg6yjy1NG3xteoqz644VCo/RPgnr1/GGt+ic3iJTzQ8Eu3TdM14SawnVUmGE6A=="
},
"node_modules/gauge/node_modules/string-width": {
"version": "4.2.3",
"resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz",
"integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==",
"dependencies": {
"emoji-regex": "^8.0.0",
"is-fullwidth-code-point": "^3.0.0",
"strip-ansi": "^6.0.1"
},
"engines": {
"node": ">=8"
}
},
"node_modules/gensync": {
"version": "1.0.0-beta.2",
"resolved": "https://registry.npmjs.org/gensync/-/gensync-1.0.0-beta.2.tgz",
@ -6917,6 +7106,11 @@
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/has-unicode": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/has-unicode/-/has-unicode-2.0.1.tgz",
"integrity": "sha512-8Rf9Y83NBReMnx0gFzA8JImQACstCYWUplepDa9xprwwtmgEZUF0h/i5xSA625zB/I37EtrswSST6OXxwaaIJQ=="
},
"node_modules/has-yarn": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/has-yarn/-/has-yarn-2.1.0.tgz",
@ -6960,6 +7154,24 @@
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-has-property": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/hast-util-has-property/-/hast-util-has-property-1.0.4.tgz",
"integrity": "sha512-ghHup2voGfgFoHMGnaLHOjbYFACKrRh9KFttdCzMCbFoBMJXiNi2+XTrPP8+q6cDJM/RSqlCfVWrjp1H201rZg==",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-is-element": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/hast-util-is-element/-/hast-util-is-element-1.1.0.tgz",
"integrity": "sha512-oUmNua0bFbdrD/ELDSSEadRVtWZOf3iF6Lbv81naqsIV99RnSCieTbWuWCY8BAeEfKJTKl0gRdokv+dELutHGQ==",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-parse-selector": {
"version": "2.2.5",
"resolved": "https://registry.npmjs.org/hast-util-parse-selector/-/hast-util-parse-selector-2.2.5.tgz",
@ -6995,6 +7207,31 @@
"resolved": "https://registry.npmjs.org/parse5/-/parse5-6.0.1.tgz",
"integrity": "sha512-Ofn/CTFzRGTTxwpNEs9PP93gXShHcTq255nzRYSKe8AkVpZY7e1fpmTfOyoIvjP5HG7Z2ZM7VS9PPhQGW2pOpw=="
},
"node_modules/hast-util-select": {
"version": "4.0.2",
"resolved": "https://registry.npmjs.org/hast-util-select/-/hast-util-select-4.0.2.tgz",
"integrity": "sha512-8EEG2//bN5rrzboPWD2HdS3ugLijNioS1pqOTIolXNf67xxShYw4SQEmVXd3imiBG+U2bC2nVTySr/iRAA7Cjg==",
"dependencies": {
"bcp-47-match": "^1.0.0",
"comma-separated-tokens": "^1.0.0",
"css-selector-parser": "^1.0.0",
"direction": "^1.0.0",
"hast-util-has-property": "^1.0.0",
"hast-util-is-element": "^1.0.0",
"hast-util-to-string": "^1.0.0",
"hast-util-whitespace": "^1.0.0",
"not": "^0.1.0",
"nth-check": "^2.0.0",
"property-information": "^5.0.0",
"space-separated-tokens": "^1.0.0",
"unist-util-visit": "^2.0.0",
"zwitch": "^1.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-to-parse5": {
"version": "6.0.0",
"resolved": "https://registry.npmjs.org/hast-util-to-parse5/-/hast-util-to-parse5-6.0.0.tgz",
@ -7011,6 +7248,38 @@
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-to-string": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/hast-util-to-string/-/hast-util-to-string-1.0.4.tgz",
"integrity": "sha512-eK0MxRX47AV2eZ+Lyr18DCpQgodvaS3fAQO2+b9Two9F5HEoRPhiUMNzoXArMJfZi2yieFzUBMRl3HNJ3Jus3w==",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-to-text": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/hast-util-to-text/-/hast-util-to-text-2.0.1.tgz",
"integrity": "sha512-8nsgCARfs6VkwH2jJU9b8LNTuR4700na+0h3PqCaEk4MAnMDeu5P0tP8mjk9LLNGxIeQRLbiDbZVw6rku+pYsQ==",
"dependencies": {
"hast-util-is-element": "^1.0.0",
"repeat-string": "^1.0.0",
"unist-util-find-after": "^3.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/hast-util-whitespace": {
"version": "1.0.4",
"resolved": "https://registry.npmjs.org/hast-util-whitespace/-/hast-util-whitespace-1.0.4.tgz",
"integrity": "sha512-I5GTdSfhYfAPNztx2xJRQpG8cuDSNt599/7YUn7Gx/WxNMsG+a835k97TDkFgk123cwjfwINaZknkKkphx/f2A==",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/hastscript": {
"version": "6.0.0",
"resolved": "https://registry.npmjs.org/hastscript/-/hastscript-6.0.0.tgz",
@ -7048,6 +7317,18 @@
"value-equal": "^1.0.1"
}
},
"node_modules/hogan.js": {
"version": "3.0.2",
"resolved": "https://registry.npmjs.org/hogan.js/-/hogan.js-3.0.2.tgz",
"integrity": "sha512-RqGs4wavGYJWE07t35JQccByczmNUXQT0E12ZYV1VKYu5UiAU9lsos/yBAcf840+zrUQQxgVduCR5/B8nNtibg==",
"dependencies": {
"mkdirp": "0.3.0",
"nopt": "1.0.10"
},
"bin": {
"hulk": "bin/hulk"
}
},
"node_modules/hoist-non-react-statics": {
"version": "3.3.2",
"resolved": "https://registry.npmjs.org/hoist-non-react-statics/-/hoist-non-react-statics-3.3.2.tgz",
@ -7350,6 +7631,11 @@
"node": ">=14.0.0"
}
},
"node_modules/immediate": {
"version": "3.3.0",
"resolved": "https://registry.npmjs.org/immediate/-/immediate-3.3.0.tgz",
"integrity": "sha512-HR7EVodfFUdQCTIeySw+WDRFJlPcLOJbXfwwZ7Oom6tjsvZ3bOkCDJHehQC3nxJrv7+f9XecwazynjU8e4Vw3Q=="
},
"node_modules/immer": {
"version": "9.0.21",
"resolved": "https://registry.npmjs.org/immer/-/immer-9.0.21.tgz",
@ -8059,6 +8345,16 @@
"yallist": "^3.0.2"
}
},
"node_modules/lunr": {
"version": "2.3.9",
"resolved": "https://registry.npmjs.org/lunr/-/lunr-2.3.9.tgz",
"integrity": "sha512-zTU3DaZaF3Rt9rhN3uBMGQD3dD2/vFQqnvZCDv4dl5iOzq2IZQqTxu90r4E5J+nP70J3ilqVCrbho2eWaeW8Ow=="
},
"node_modules/lunr-languages": {
"version": "1.13.0",
"resolved": "https://registry.npmjs.org/lunr-languages/-/lunr-languages-1.13.0.tgz",
"integrity": "sha512-qgTOarcnAtVFKr0aJ2GuiqbBdhKF61jpF8OgFbnlSAb1t6kOiQW67q0hv0UQzzB+5+OwPpnZyFT/L0L9SQG1/A=="
},
"node_modules/make-dir": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/make-dir/-/make-dir-3.1.0.tgz",
@ -8346,6 +8642,15 @@
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/mkdirp": {
"version": "0.3.0",
"resolved": "https://registry.npmjs.org/mkdirp/-/mkdirp-0.3.0.tgz",
"integrity": "sha512-OHsdUcVAQ6pOtg5JYWpCBo9W/GySVuwvP9hueRMW7UqshC0tbfzLv8wjySTPm3tfUZ/21CE9E1pJagOA91Pxew==",
"deprecated": "Legacy versions of mkdirp are no longer supported. Please update to mkdirp 1.x. (Note that the API surface has changed to use Promises in 1.x.)",
"engines": {
"node": "*"
}
},
"node_modules/mkdirp-classic": {
"version": "0.5.3",
"resolved": "https://registry.npmjs.org/mkdirp-classic/-/mkdirp-classic-0.5.3.tgz",
@ -8432,9 +8737,9 @@
}
},
"node_modules/node-addon-api": {
"version": "5.1.0",
"resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-5.1.0.tgz",
"integrity": "sha512-eh0GgfEkpnoWDq+VY8OyvYhFEzBk6jIYbRKdIlyTiAXIVJ8PyBaKb0rp7oDtoddbdoHWhq8wwr+XZ81F1rpNdA=="
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/node-addon-api/-/node-addon-api-6.1.0.tgz",
"integrity": "sha512-+eawOlIgy680F0kBzPUNFhMZGtJ1YmqM6l4+Crf4IkImjYrO/mqPwRMh352g23uIaQKFItcQ64I7KMaJxHgAVA=="
},
"node_modules/node-emoji": {
"version": "1.11.0",
@ -8476,6 +8781,20 @@
"resolved": "https://registry.npmjs.org/node-releases/-/node-releases-2.0.13.tgz",
"integrity": "sha512-uYr7J37ae/ORWdZeQ1xxMJe3NtdmqMC/JZK+geofDrkLUApKRHPd18/TxtBOJ4A0/+uUIliorNrfYV6s1b02eQ=="
},
"node_modules/nopt": {
"version": "1.0.10",
"resolved": "https://registry.npmjs.org/nopt/-/nopt-1.0.10.tgz",
"integrity": "sha512-NWmpvLSqUrgrAC9HCuxEvb+PSloHpqVu+FqcO4eeF2h5qYRhA7ev6KvelyQAKtegUbC6RypJnlEOhd8vloNKYg==",
"dependencies": {
"abbrev": "1"
},
"bin": {
"nopt": "bin/nopt.js"
},
"engines": {
"node": "*"
}
},
"node_modules/normalize-path": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/normalize-path/-/normalize-path-3.0.0.tgz",
@ -8503,6 +8822,11 @@
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/not": {
"version": "0.1.0",
"resolved": "https://registry.npmjs.org/not/-/not-0.1.0.tgz",
"integrity": "sha512-5PDmaAsVfnWUgTUbJ3ERwn7u79Z0dYxN9ErxCpVJJqe2RK0PJ3z+iFUxuqjwtlDDegXvtWoxD/3Fzxox7tFGWA=="
},
"node_modules/npm-run-path": {
"version": "4.0.1",
"resolved": "https://registry.npmjs.org/npm-run-path/-/npm-run-path-4.0.1.tgz",
@ -9746,6 +10070,11 @@
}
]
},
"node_modules/queue-tick": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/queue-tick/-/queue-tick-1.0.1.tgz",
"integrity": "sha512-kJt5qhMxoszgU/62PLP1CJytzd2NKetjSRnyuj31fDd3Rlcz3fzlFdFLD1SItunPwyqEOkca6GbV612BWfaBag=="
},
"node_modules/randombytes": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/randombytes/-/randombytes-2.1.0.tgz",
@ -10240,6 +10569,24 @@
"jsesc": "bin/jsesc"
}
},
"node_modules/rehype-parse": {
"version": "7.0.1",
"resolved": "https://registry.npmjs.org/rehype-parse/-/rehype-parse-7.0.1.tgz",
"integrity": "sha512-fOiR9a9xH+Le19i4fGzIEowAbwG7idy2Jzs4mOrFWBSJ0sNUgy0ev871dwWnbOo371SjgjG4pwzrbgSVrKxecw==",
"dependencies": {
"hast-util-from-parse5": "^6.0.0",
"parse5": "^6.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/rehype-parse/node_modules/parse5": {
"version": "6.0.1",
"resolved": "https://registry.npmjs.org/parse5/-/parse5-6.0.1.tgz",
"integrity": "sha512-Ofn/CTFzRGTTxwpNEs9PP93gXShHcTq255nzRYSKe8AkVpZY7e1fpmTfOyoIvjP5HG7Z2ZM7VS9PPhQGW2pOpw=="
},
"node_modules/relateurl": {
"version": "0.2.7",
"resolved": "https://registry.npmjs.org/relateurl/-/relateurl-0.2.7.tgz",
@ -11029,27 +11376,47 @@
"integrity": "sha512-y0m1JoUZSlPAjXVtPPW70aZWfIL/dSP7AFkRnniLCrK/8MDKog3TySTBmckD+RObVxH0v4Tox67+F14PdED2oQ=="
},
"node_modules/sharp": {
"version": "0.30.7",
"resolved": "https://registry.npmjs.org/sharp/-/sharp-0.30.7.tgz",
"integrity": "sha512-G+MY2YW33jgflKPTXXptVO28HvNOo9G3j0MybYAHeEmby+QuD2U98dT6ueht9cv/XDqZspSpIhoSW+BAKJ7Hig==",
"version": "0.32.5",
"resolved": "https://registry.npmjs.org/sharp/-/sharp-0.32.5.tgz",
"integrity": "sha512-0dap3iysgDkNaPOaOL4X/0akdu0ma62GcdC2NBQ+93eqpePdDdr2/LM0sFdDSMmN7yS+odyZtPsb7tx/cYBKnQ==",
"hasInstallScript": true,
"dependencies": {
"color": "^4.2.3",
"detect-libc": "^2.0.1",
"node-addon-api": "^5.0.0",
"detect-libc": "^2.0.2",
"node-addon-api": "^6.1.0",
"prebuild-install": "^7.1.1",
"semver": "^7.3.7",
"semver": "^7.5.4",
"simple-get": "^4.0.1",
"tar-fs": "^2.1.1",
"tar-fs": "^3.0.4",
"tunnel-agent": "^0.6.0"
},
"engines": {
"node": ">=12.13.0"
"node": ">=14.15.0"
},
"funding": {
"url": "https://opencollective.com/libvips"
}
},
"node_modules/sharp/node_modules/tar-fs": {
"version": "3.0.4",
"resolved": "https://registry.npmjs.org/tar-fs/-/tar-fs-3.0.4.tgz",
"integrity": "sha512-5AFQU8b9qLfZCX9zp2duONhPmZv0hGYiBPJsyUdqMjzq/mqVpy/rEUSeHk1+YitmxugaptgBh5oDGU3VsAJq4w==",
"dependencies": {
"mkdirp-classic": "^0.5.2",
"pump": "^3.0.0",
"tar-stream": "^3.1.5"
}
},
"node_modules/sharp/node_modules/tar-stream": {
"version": "3.1.6",
"resolved": "https://registry.npmjs.org/tar-stream/-/tar-stream-3.1.6.tgz",
"integrity": "sha512-B/UyjYwPpMBv+PaFSWAmtYjwdrlEaZQEhMIBFNC5oEG8lpiW8XjcSdmEaClj28ArfKScKHs2nshz3k2le6crsg==",
"dependencies": {
"b4a": "^1.6.4",
"fast-fifo": "^1.2.0",
"streamx": "^2.15.0"
}
},
"node_modules/shebang-command": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/shebang-command/-/shebang-command-2.0.0.tgz",
@ -11362,6 +11729,15 @@
"resolved": "https://registry.npmjs.org/std-env/-/std-env-3.3.3.tgz",
"integrity": "sha512-Rz6yejtVyWnVjC1RFvNmYL10kgjC49EOghxWn0RFqlCHGFpQx+Xe7yW3I4ceK1SGrWIGMjD5Kbue8W/udkbMJg=="
},
"node_modules/streamx": {
"version": "2.15.1",
"resolved": "https://registry.npmjs.org/streamx/-/streamx-2.15.1.tgz",
"integrity": "sha512-fQMzy2O/Q47rgwErk/eGeLu/roaFWV0jVsogDmrszM9uIw8L5OA+t+V93MgYlufNptfjmYR1tOMWhei/Eh7TQA==",
"dependencies": {
"fast-fifo": "^1.1.0",
"queue-tick": "^1.0.1"
}
},
"node_modules/string_decoder": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/string_decoder/-/string_decoder-1.3.0.tgz",
@ -11783,6 +12159,19 @@
"node": ">=8.0"
}
},
"node_modules/to-vfile": {
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/to-vfile/-/to-vfile-6.1.0.tgz",
"integrity": "sha512-BxX8EkCxOAZe+D/ToHdDsJcVI4HqQfmw0tCkp31zf3dNP/XWIAjU4CmeuSwsSoOzOTqHPOL0KUzyZqJplkD0Qw==",
"dependencies": {
"is-buffer": "^2.0.0",
"vfile": "^4.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/toidentifier": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/toidentifier/-/toidentifier-1.0.1.tgz",
@ -12011,6 +12400,18 @@
"url": "https://opencollective.com/unified"
}
},
"node_modules/unist-util-find-after": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/unist-util-find-after/-/unist-util-find-after-3.0.0.tgz",
"integrity": "sha512-ojlBqfsBftYXExNu3+hHLfJQ/X1jYY/9vdm4yZWjIbf0VuWF6CRufci1ZyoD/wV2TYMKxXUoNuoqwy+CkgzAiQ==",
"dependencies": {
"unist-util-is": "^4.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/unified"
}
},
"node_modules/unist-util-generated": {
"version": "1.1.6",
"resolved": "https://registry.npmjs.org/unist-util-generated/-/unist-util-generated-1.1.6.tgz",
@ -12950,6 +13351,32 @@
"node": ">= 8"
}
},
"node_modules/wide-align": {
"version": "1.1.5",
"resolved": "https://registry.npmjs.org/wide-align/-/wide-align-1.1.5.tgz",
"integrity": "sha512-eDMORYaPNZ4sQIuuYPDHdQvf4gyCF9rEEV/yPxGfwPkRodwEgiMUUXTx/dex+Me0wxx53S+NgUHaP7y3MGlDmg==",
"dependencies": {
"string-width": "^1.0.2 || 2 || 3 || 4"
}
},
"node_modules/wide-align/node_modules/emoji-regex": {
"version": "8.0.0",
"resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-8.0.0.tgz",
"integrity": "sha512-MSjYzcWNOA0ewAHpz0MxpYFvwg6yjy1NG3xteoqz644VCo/RPgnr1/GGt+ic3iJTzQ8Eu3TdM14SawnVUmGE6A=="
},
"node_modules/wide-align/node_modules/string-width": {
"version": "4.2.3",
"resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz",
"integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==",
"dependencies": {
"emoji-regex": "^8.0.0",
"is-fullwidth-code-point": "^3.0.0",
"strip-ansi": "^6.0.1"
},
"engines": {
"node": ">=8"
}
},
"node_modules/widest-line": {
"version": "4.0.1",
"resolved": "https://registry.npmjs.org/widest-line/-/widest-line-4.0.1.tgz",

View file

@ -19,6 +19,7 @@
"@docusaurus/preset-classic": "2.4.1",
"@mdx-js/react": "^1.6.22",
"clsx": "^1.2.1",
"docusaurus-lunr-search": "^2.4.1",
"prism-react-renderer": "^1.3.5",
"react": "^17.0.2",
"react-dom": "^17.0.2",

View file

@ -17,37 +17,67 @@ const sidebars = {
// But you can create a sidebar manually
tutorialSidebar: [
{ type: "doc", id: "index" }, // NEW
{ type: "doc", id: "index" }, // NEW
"tutorials/first_playground",
{
type: 'category',
label: 'Completion()',
items: ['completion/input','completion/output'],
type: "category",
label: "Completion()",
items: ["completion/input", "completion/output", "completion/reliable_completions"],
},
{
type: 'category',
label: 'Embedding()',
items: ['embedding/supported_embedding'],
type: "category",
label: "Embedding()",
items: ["embedding/supported_embedding"],
},
'completion/supported',
'debugging/local_debugging',
"token_usage",
"exception_mapping",
"stream",
'debugging/hosted_debugging',
'debugging/local_debugging',
{
type: 'category',
label: 'Tutorials',
items: ['tutorials/huggingface_tutorial', 'tutorials/TogetherAI_liteLLM', 'tutorials/fallbacks', 'tutorials/finetuned_chat_gpt'],
items: [
'tutorials/huggingface_tutorial',
'tutorials/TogetherAI_liteLLM',
'tutorials/fallbacks',
'tutorials/finetuned_chat_gpt',
'tutorials/text_completion',
'tutorials/ab_test_llms',
'tutorials/litellm_Test_Multiple_Providers'
],
},
{
type: "category",
label: "Logging & Observability",
items: [
"observability/callbacks",
"observability/integrations",
"observability/promptlayer_integration",
"observability/llmonitor_integration",
"observability/helicone_integration",
"observability/supabase_integration",
],
},
{
type: "category",
label: "Caching",
items: [
"caching/caching",
"caching/gpt_cache",
],
},
'token_usage',
'stream',
'secret',
'caching',
{
type: 'category',
label: 'Logging & Observability',
items: ['observability/callbacks', 'observability/integrations', 'observability/helicone_integration', 'observability/supabase_integration'],
label: 'Extras',
items: [
'extras/secret',
],
},
'troubleshoot',
'contributing',
'contact'
"troubleshoot",
"contributing",
"contact",
],
};

View file

@ -0,0 +1,61 @@
import React, { useState, useEffect } from 'react';
const CodeBlock = ({ token }) => {
const codeWithToken = `
import os
from litellm import completion
# set ENV variables
os.environ["LITELLM_TOKEN"] = '${token}'
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion("command-nightly", messages)
`;
const codeWithoutToken = `
import os
from litellm import completion
## set ENV variables
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)
# cohere call
response = completion("command-nightly", messages)
`;
return (
<pre>
{console.log("token: ", token)}
{token ? codeWithToken : codeWithoutToken}
</pre>
)
}
const QueryParamReader = () => {
const [token, setToken] = useState(null);
useEffect(() => {
const urlParams = new URLSearchParams(window.location.search);
console.log("urlParams: ", urlParams)
const token = urlParams.get('token');
setToken(token);
}, []);
return (
<div>
<CodeBlock token={token} />
</div>
);
}
export default QueryParamReader;

View file

@ -0,0 +1,19 @@
import React, { useState, useEffect } from 'react';
const QueryParamToken = () => {
const [token, setToken] = useState(null);
useEffect(() => {
const urlParams = new URLSearchParams(window.location.search);
const token = urlParams.get('token');
setToken(token);
}, []);
return (
<span style={{ padding: 0, margin: 0 }}>
{token ? <a href={`https://admin.litellm.ai/${token}`} target="_blank" rel="noopener noreferrer">admin.litellm.ai</a> : ""}
</span>
);
}
export default QueryParamToken;

View file

@ -1,29 +1,30 @@
# Callbacks
## Use Callbacks to send Output Data to Posthog, Sentry etc
liteLLM provides `success_callbacks` and `failure_callbacks`, making it easy for you to send data to a particular provider depending on the status of your responses.
liteLLM supports:
- [LLMonitor](https://llmonitor.com/docs)
- [Helicone](https://docs.helicone.ai/introduction)
- [Sentry](https://docs.sentry.io/platforms/python/)
- [PostHog](https://posthog.com/docs/libraries/python)
- [Slack](https://slack.dev/bolt-python/concepts)
### Quick Start
```python
import os
import litellm
from litellm import completion
# set callbacks
litellm.success_callback = ["posthog", "helicone", "llmonitor"]
litellm.failure_callback = ["sentry", "llmonitor"]
## set env variables
os.environ["SENTRY_API_URL"], os.environ["SENTRY_API_TRACE_RATE"] = "", ""
os.environ["POSTHOG_API_KEY"], os.environ["POSTHOG_API_URL"] = "api-key", "api-url"
os.environ["HELICONE_API_KEY"] = ""
os.environ["LLMONITOR_APP_ID"] = ""
messages = [{"role": "user", "content": "Hello, how are you?"}]
response = completion(model="gpt-3.5-turbo", messages=messages)
```

View file

@ -1,12 +1,9 @@
# Logging Integrations
| Integration | Required OS Variables | How to Use with callbacks |
| ----------- | -------------------------------------------------------- | ---------------------------------------- |
| LLMonitor | `LLMONITOR_APP_ID` | `litellm.success_callback=["llmonitor"]` |
| Sentry | `SENTRY_API_URL` | `litellm.success_callback=["sentry"]` |
| Posthog | `POSTHOG_API_KEY`,`POSTHOG_API_URL` | `litellm.success_callback=["posthog"]` |
| Slack | `SLACK_API_TOKEN`,`SLACK_API_SECRET`,`SLACK_API_CHANNEL` | `litellm.success_callback=["slack"]` |
| Helicone | `HELICONE_API_TOKEN` | `litellm.success_callback=["helicone"]` |
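For example, the LLMonitor row above translates into roughly the following setup. This is a minimal sketch: the app ID value is a placeholder, and it assumes `OPENAI_API_KEY` is already set in your environment.
```python
import os
import litellm
from litellm import completion

# env var from the table above (placeholder value)
os.environ["LLMONITOR_APP_ID"] = "my-llmonitor-app-id"

# register the callback for both successful and failed calls
litellm.success_callback = ["llmonitor"]
litellm.failure_callback = ["llmonitor"]

# assumes OPENAI_API_KEY is set in your environment
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
```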

View file

@ -1245,7 +1245,7 @@
"@docsearch/css" "3.5.1"
algoliasearch "^4.0.0"
"@docusaurus/core@2.4.1":
"@docusaurus/core@^2.0.0-alpha.60 || ^2.0.0", "@docusaurus/core@2.4.1":
version "2.4.1"
resolved "https://registry.npmjs.org/@docusaurus/core/-/core-2.4.1.tgz"
integrity sha512-SNsY7PshK3Ri7vtsLXVeAJGS50nJN3RgF836zkyUfAD01Fq+sAk5EwWgLw+nnm5KVNGDu7PRR2kRGDsWvqpo0g==
@ -2396,6 +2396,11 @@
resolved "https://registry.npmjs.org/@xtuc/long/-/long-4.2.2.tgz"
integrity sha512-NuHqBY1PB/D8xU6s/thBgOAiAP7HOYDQ32+BFZILJ8ivkUkAHQnWfn6WhL79Owj1qmUnoN/YPhktdIoucipkAQ==
abbrev@1:
version "1.1.1"
resolved "https://registry.npmjs.org/abbrev/-/abbrev-1.1.1.tgz"
integrity sha512-nne9/IiQ/hzIhY6pdDnbBtz7DjPTKrY00P/zvPSm5pOFkl6xuGrGnXn/VtTNNfNtAfZ9/1RtehkszU9qcTii0Q==
accepts@~1.3.4, accepts@~1.3.5, accepts@~1.3.8:
version "1.3.8"
resolved "https://registry.npmjs.org/accepts/-/accepts-1.3.8.tgz"
@ -2557,6 +2562,11 @@ anymatch@~3.1.2:
normalize-path "^3.0.0"
picomatch "^2.0.4"
"aproba@^1.0.3 || ^2.0.0":
version "2.0.0"
resolved "https://registry.npmjs.org/aproba/-/aproba-2.0.0.tgz"
integrity sha512-lYe4Gx7QT+MKGbDsA+Z+he/Wtef0BiwDOlK/XkBrdfsh9J/jPPXbX0tE9x9cl27Tmu5gg3QUbUrQYa/y+KOHPQ==
arg@^5.0.0:
version "5.0.2"
resolved "https://registry.npmjs.org/arg/-/arg-5.0.2.tgz"
@ -2599,6 +2609,13 @@ at-least-node@^1.0.0:
resolved "https://registry.npmjs.org/at-least-node/-/at-least-node-1.0.0.tgz"
integrity sha512-+q/t7Ekv1EDY2l6Gda6LLiX14rU9TV20Wa3ofeQmwPFZbOMo9DXrLbOjFaaclkXKWidIaopwAObQDqwWtGUjqg==
autocomplete.js@^0.37.0:
version "0.37.1"
resolved "https://registry.npmjs.org/autocomplete.js/-/autocomplete.js-0.37.1.tgz"
integrity sha512-PgSe9fHYhZEsm/9jggbjtVsGXJkPLvd+9mC7gZJ662vVL5CRWEtm/mIrrzCx0MrNxHVwxD5d00UOn6NsmL2LUQ==
dependencies:
immediate "^3.2.3"
autoprefixer@^10.4.12, autoprefixer@^10.4.7:
version "10.4.14"
resolved "https://registry.npmjs.org/autoprefixer/-/autoprefixer-10.4.14.tgz"
@ -2618,6 +2635,11 @@ axios@^0.25.0:
dependencies:
follow-redirects "^1.14.7"
b4a@^1.6.4:
version "1.6.4"
resolved "https://registry.npmjs.org/b4a/-/b4a-1.6.4.tgz"
integrity sha512-fpWrvyVHEKyeEvbKZTVOeZF3VSKKWtJxFIxX/jaVPf+cLbGUSitjb49pHLqPV2BUNNZ0LcoeEGfE/YCpyDYHIw==
babel-loader@^8.2.5:
version "8.3.0"
resolved "https://registry.npmjs.org/babel-loader/-/babel-loader-8.3.0.tgz"
@ -2699,6 +2721,11 @@ batch@0.6.1:
resolved "https://registry.npmjs.org/batch/-/batch-0.6.1.tgz"
integrity sha512-x+VAiMRL6UPkx+kudNvxTl6hB2XNNCG2r+7wixVfIYwu/2HKRXimwQyaumLjMveWvT2Hkd/cAJw+QBMfJ/EKVw==
bcp-47-match@^1.0.0:
version "1.0.3"
resolved "https://registry.npmjs.org/bcp-47-match/-/bcp-47-match-1.0.3.tgz"
integrity sha512-LggQ4YTdjWQSKELZF5JwchnBa1u0pIQSZf5lSdOHEdbVP55h0qICA/FUp3+W99q0xqxYa1ZQizTUH87gecII5w==
big.js@^5.2.2:
version "5.2.2"
resolved "https://registry.npmjs.org/big.js/-/big.js-5.2.2.tgz"
@ -3072,6 +3099,11 @@ color-string@^1.9.0:
color-name "^1.0.0"
simple-swizzle "^0.2.2"
color-support@^1.1.2:
version "1.1.3"
resolved "https://registry.npmjs.org/color-support/-/color-support-1.1.3.tgz"
integrity sha512-qiBjkpbMLO/HL68y+lh4q0/O1MZFj2RX6X/KmMa3+gJD3z+WwI1ZzDHysvqHGS3mP6mznPckpXmw1nI9cJjyRg==
color@^4.2.3:
version "4.2.3"
resolved "https://registry.npmjs.org/color/-/color-4.2.3.tgz"
@ -3172,6 +3204,11 @@ consola@^2.15.3:
resolved "https://registry.npmjs.org/consola/-/consola-2.15.3.tgz"
integrity sha512-9vAdYbHj6x2fLKC4+oPH0kFzY/orMZyG2Aj+kNylHxKGJ/Ed4dpNyAQYwJOdqO4zdM7XpVHmyejQDcQHrnuXbw==
console-control-strings@^1.0.0:
version "1.1.0"
resolved "https://registry.npmjs.org/console-control-strings/-/console-control-strings-1.1.0.tgz"
integrity sha512-ty/fTekppD2fIwRvnZAVdeOiGd1c7YXEixbgJTNzqcxJWKQnjJ/V1bNEEE6hygpM3WjwHFUVK6HTjWSzV4a8sQ==
"consolidated-events@^1.1.0 || ^2.0.0":
version "2.0.2"
resolved "https://registry.npmjs.org/consolidated-events/-/consolidated-events-2.0.2.tgz"
@ -3354,6 +3391,11 @@ css-select@^5.1.0:
domutils "^3.0.1"
nth-check "^2.0.1"
css-selector-parser@^1.0.0:
version "1.4.1"
resolved "https://registry.npmjs.org/css-selector-parser/-/css-selector-parser-1.4.1.tgz"
integrity sha512-HYPSb7y/Z7BNDCOrakL4raGO2zltZkbeXyAd6Tg9obzix6QhzxCotdBl6VT0Dv4vZfJGVz3WL/xaEI9Ly3ul0g==
css-tree@^1.1.2, css-tree@^1.1.3:
version "1.1.3"
resolved "https://registry.npmjs.org/css-tree/-/css-tree-1.1.3.tgz"
@ -3551,7 +3593,7 @@ detab@2.0.4:
dependencies:
repeat-string "^1.5.4"
detect-libc@^2.0.0, detect-libc@^2.0.1:
detect-libc@^2.0.0, detect-libc@^2.0.1, detect-libc@^2.0.2:
version "2.0.2"
resolved "https://registry.npmjs.org/detect-libc/-/detect-libc-2.0.2.tgz"
integrity sha512-UX6sGumvvqSaXgdKGUsgZWqcUyIXZ/vZTrlRT/iobiKhGL0zL4d3osHj3uqllWJK+i+sixDS/3COVEOFbupFyw==
@ -3584,6 +3626,11 @@ dir-glob@^3.0.1:
dependencies:
path-type "^4.0.0"
direction@^1.0.0:
version "1.0.4"
resolved "https://registry.npmjs.org/direction/-/direction-1.0.4.tgz"
integrity sha512-GYqKi1aH7PJXxdhTeZBFrg8vUBeKXi+cNprXsC1kpJcbcVnV9wBsrOu1cQEdG0WeQwlfHiy3XvnKfIrJ2R0NzQ==
dns-equal@^1.0.0:
version "1.0.0"
resolved "https://registry.npmjs.org/dns-equal/-/dns-equal-1.0.0.tgz"
@ -3596,6 +3643,26 @@ dns-packet@^5.2.2:
dependencies:
"@leichtgewicht/ip-codec" "^2.0.1"
docusaurus-lunr-search@^2.4.1:
version "2.4.1"
resolved "https://registry.npmjs.org/docusaurus-lunr-search/-/docusaurus-lunr-search-2.4.1.tgz"
integrity sha512-UOgaAypgO0iLyA1Hk4EThG/ofLm9/JldznzN98ZKr7TMYVjMZbAEaIBKLAUDFdfOPr9D5EswXdLn39/aRkwHMA==
dependencies:
autocomplete.js "^0.37.0"
clsx "^1.2.1"
gauge "^3.0.0"
hast-util-select "^4.0.0"
hast-util-to-text "^2.0.0"
hogan.js "^3.0.2"
lunr "^2.3.8"
lunr-languages "^1.4.0"
minimatch "^3.0.4"
object-assign "^4.1.1"
rehype-parse "^7.0.1"
to-vfile "^6.1.0"
unified "^9.0.0"
unist-util-is "^4.0.2"
dom-converter@^0.2.0:
version "0.2.0"
resolved "https://registry.npmjs.org/dom-converter/-/dom-converter-0.2.0.tgz"
@ -3922,6 +3989,11 @@ fast-deep-equal@^3.1.1, fast-deep-equal@^3.1.3:
resolved "https://registry.npmjs.org/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz"
integrity sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q==
fast-fifo@^1.1.0, fast-fifo@^1.2.0:
version "1.3.2"
resolved "https://registry.npmjs.org/fast-fifo/-/fast-fifo-1.3.2.tgz"
integrity sha512-/d9sfos4yxzpwkDkuN7k2SqFKtYNmCTzgfEpz82x34IM9/zc8KGxQoXg1liNC/izpRM/MBdt44Nmx41ZWqk+FQ==
fast-glob@^3.2.11, fast-glob@^3.2.9, fast-glob@^3.3.0:
version "3.3.1"
resolved "https://registry.npmjs.org/fast-glob/-/fast-glob-3.3.1.tgz"
@ -4147,6 +4219,21 @@ function-bind@^1.1.1:
resolved "https://registry.npmjs.org/function-bind/-/function-bind-1.1.1.tgz"
integrity sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A==
gauge@^3.0.0:
version "3.0.2"
resolved "https://registry.npmjs.org/gauge/-/gauge-3.0.2.tgz"
integrity sha512-+5J6MS/5XksCuXq++uFRsnUd7Ovu1XenbeuIuNRJxYWjgQbPuFhT14lAvsWfqfAmnwluf1OwMjz39HjfLPci0Q==
dependencies:
aproba "^1.0.3 || ^2.0.0"
color-support "^1.1.2"
console-control-strings "^1.0.0"
has-unicode "^2.0.1"
object-assign "^4.1.1"
signal-exit "^3.0.0"
string-width "^4.2.3"
strip-ansi "^6.0.1"
wide-align "^1.1.2"
gensync@^1.0.0-beta.1, gensync@^1.0.0-beta.2:
version "1.0.0-beta.2"
resolved "https://registry.npmjs.org/gensync/-/gensync-1.0.0-beta.2.tgz"
@ -4349,6 +4436,11 @@ has-symbols@^1.0.3:
resolved "https://registry.npmjs.org/has-symbols/-/has-symbols-1.0.3.tgz"
integrity sha512-l3LCuF6MgDNwTDKkdYGEihYjt5pRPbEg46rtlmnSPlUbgmB8LOIrKJbYYFBSbnPaJexMKtiPO8hmeRjRz2Td+A==
has-unicode@^2.0.1:
version "2.0.1"
resolved "https://registry.npmjs.org/has-unicode/-/has-unicode-2.0.1.tgz"
integrity sha512-8Rf9Y83NBReMnx0gFzA8JImQACstCYWUplepDa9xprwwtmgEZUF0h/i5xSA625zB/I37EtrswSST6OXxwaaIJQ==
has-yarn@^2.1.0:
version "2.1.0"
resolved "https://registry.npmjs.org/has-yarn/-/has-yarn-2.1.0.tgz"
@ -4386,6 +4478,16 @@ hast-util-from-parse5@^6.0.0:
vfile-location "^3.2.0"
web-namespaces "^1.0.0"
hast-util-has-property@^1.0.0:
version "1.0.4"
resolved "https://registry.npmjs.org/hast-util-has-property/-/hast-util-has-property-1.0.4.tgz"
integrity sha512-ghHup2voGfgFoHMGnaLHOjbYFACKrRh9KFttdCzMCbFoBMJXiNi2+XTrPP8+q6cDJM/RSqlCfVWrjp1H201rZg==
hast-util-is-element@^1.0.0:
version "1.1.0"
resolved "https://registry.npmjs.org/hast-util-is-element/-/hast-util-is-element-1.1.0.tgz"
integrity sha512-oUmNua0bFbdrD/ELDSSEadRVtWZOf3iF6Lbv81naqsIV99RnSCieTbWuWCY8BAeEfKJTKl0gRdokv+dELutHGQ==
hast-util-parse-selector@^2.0.0:
version "2.2.5"
resolved "https://registry.npmjs.org/hast-util-parse-selector/-/hast-util-parse-selector-2.2.5.tgz"
@ -4407,6 +4509,26 @@ hast-util-raw@6.0.1:
xtend "^4.0.0"
zwitch "^1.0.0"
hast-util-select@^4.0.0:
version "4.0.2"
resolved "https://registry.npmjs.org/hast-util-select/-/hast-util-select-4.0.2.tgz"
integrity sha512-8EEG2//bN5rrzboPWD2HdS3ugLijNioS1pqOTIolXNf67xxShYw4SQEmVXd3imiBG+U2bC2nVTySr/iRAA7Cjg==
dependencies:
bcp-47-match "^1.0.0"
comma-separated-tokens "^1.0.0"
css-selector-parser "^1.0.0"
direction "^1.0.0"
hast-util-has-property "^1.0.0"
hast-util-is-element "^1.0.0"
hast-util-to-string "^1.0.0"
hast-util-whitespace "^1.0.0"
not "^0.1.0"
nth-check "^2.0.0"
property-information "^5.0.0"
space-separated-tokens "^1.0.0"
unist-util-visit "^2.0.0"
zwitch "^1.0.0"
hast-util-to-parse5@^6.0.0:
version "6.0.0"
resolved "https://registry.npmjs.org/hast-util-to-parse5/-/hast-util-to-parse5-6.0.0.tgz"
@ -4418,6 +4540,25 @@ hast-util-to-parse5@^6.0.0:
xtend "^4.0.0"
zwitch "^1.0.0"
hast-util-to-string@^1.0.0:
version "1.0.4"
resolved "https://registry.npmjs.org/hast-util-to-string/-/hast-util-to-string-1.0.4.tgz"
integrity sha512-eK0MxRX47AV2eZ+Lyr18DCpQgodvaS3fAQO2+b9Two9F5HEoRPhiUMNzoXArMJfZi2yieFzUBMRl3HNJ3Jus3w==
hast-util-to-text@^2.0.0:
version "2.0.1"
resolved "https://registry.npmjs.org/hast-util-to-text/-/hast-util-to-text-2.0.1.tgz"
integrity sha512-8nsgCARfs6VkwH2jJU9b8LNTuR4700na+0h3PqCaEk4MAnMDeu5P0tP8mjk9LLNGxIeQRLbiDbZVw6rku+pYsQ==
dependencies:
hast-util-is-element "^1.0.0"
repeat-string "^1.0.0"
unist-util-find-after "^3.0.0"
hast-util-whitespace@^1.0.0:
version "1.0.4"
resolved "https://registry.npmjs.org/hast-util-whitespace/-/hast-util-whitespace-1.0.4.tgz"
integrity sha512-I5GTdSfhYfAPNztx2xJRQpG8cuDSNt599/7YUn7Gx/WxNMsG+a835k97TDkFgk123cwjfwINaZknkKkphx/f2A==
hastscript@^6.0.0:
version "6.0.0"
resolved "https://registry.npmjs.org/hastscript/-/hastscript-6.0.0.tgz"
@ -4446,6 +4587,14 @@ history@^4.9.0:
tiny-warning "^1.0.0"
value-equal "^1.0.1"
hogan.js@^3.0.2:
version "3.0.2"
resolved "https://registry.npmjs.org/hogan.js/-/hogan.js-3.0.2.tgz"
integrity sha512-RqGs4wavGYJWE07t35JQccByczmNUXQT0E12ZYV1VKYu5UiAU9lsos/yBAcf840+zrUQQxgVduCR5/B8nNtibg==
dependencies:
mkdirp "0.3.0"
nopt "1.0.10"
hoist-non-react-statics@^3.1.0:
version "3.3.2"
resolved "https://registry.npmjs.org/hoist-non-react-statics/-/hoist-non-react-statics-3.3.2.tgz"
@ -4612,6 +4761,11 @@ image-size@^1.0.1:
dependencies:
queue "6.0.2"
immediate@^3.2.3:
version "3.3.0"
resolved "https://registry.npmjs.org/immediate/-/immediate-3.3.0.tgz"
integrity sha512-HR7EVodfFUdQCTIeySw+WDRFJlPcLOJbXfwwZ7Oom6tjsvZ3bOkCDJHehQC3nxJrv7+f9XecwazynjU8e4Vw3Q==
immer@^9.0.7:
version "9.0.21"
resolved "https://registry.npmjs.org/immer/-/immer-9.0.21.tgz"
@ -5170,6 +5324,16 @@ lru-cache@^6.0.0:
dependencies:
yallist "^4.0.0"
lunr-languages@^1.4.0:
version "1.13.0"
resolved "https://registry.npmjs.org/lunr-languages/-/lunr-languages-1.13.0.tgz"
integrity sha512-qgTOarcnAtVFKr0aJ2GuiqbBdhKF61jpF8OgFbnlSAb1t6kOiQW67q0hv0UQzzB+5+OwPpnZyFT/L0L9SQG1/A==
lunr@^2.3.8:
version "2.3.9"
resolved "https://registry.npmjs.org/lunr/-/lunr-2.3.9.tgz"
integrity sha512-zTU3DaZaF3Rt9rhN3uBMGQD3dD2/vFQqnvZCDv4dl5iOzq2IZQqTxu90r4E5J+nP70J3ilqVCrbho2eWaeW8Ow==
make-dir@^3.0.0, make-dir@^3.0.2, make-dir@^3.1.0:
version "3.1.0"
resolved "https://registry.npmjs.org/make-dir/-/make-dir-3.1.0.tgz"
@ -5364,6 +5528,11 @@ mkdirp-classic@^0.5.2, mkdirp-classic@^0.5.3:
resolved "https://registry.npmjs.org/mkdirp-classic/-/mkdirp-classic-0.5.3.tgz"
integrity sha512-gKLcREMhtuZRwRAfqP3RFW+TK4JqApVBtOIftVgjuABpAtpxhPGaDcfvbhNvD0B8iD1oUr/txX35NjcaY6Ns/A==
mkdirp@0.3.0:
version "0.3.0"
resolved "https://registry.npmjs.org/mkdirp/-/mkdirp-0.3.0.tgz"
integrity sha512-OHsdUcVAQ6pOtg5JYWpCBo9W/GySVuwvP9hueRMW7UqshC0tbfzLv8wjySTPm3tfUZ/21CE9E1pJagOA91Pxew==
mrmime@^1.0.0:
version "1.0.1"
resolved "https://registry.npmjs.org/mrmime/-/mrmime-1.0.1.tgz"
@ -5432,6 +5601,11 @@ node-addon-api@^5.0.0:
resolved "https://registry.npmjs.org/node-addon-api/-/node-addon-api-5.1.0.tgz"
integrity sha512-eh0GgfEkpnoWDq+VY8OyvYhFEzBk6jIYbRKdIlyTiAXIVJ8PyBaKb0rp7oDtoddbdoHWhq8wwr+XZ81F1rpNdA==
node-addon-api@^6.1.0:
version "6.1.0"
resolved "https://registry.npmjs.org/node-addon-api/-/node-addon-api-6.1.0.tgz"
integrity sha512-+eawOlIgy680F0kBzPUNFhMZGtJ1YmqM6l4+Crf4IkImjYrO/mqPwRMh352g23uIaQKFItcQ64I7KMaJxHgAVA==
node-emoji@^1.10.0:
version "1.11.0"
resolved "https://registry.npmjs.org/node-emoji/-/node-emoji-1.11.0.tgz"
@ -5456,6 +5630,13 @@ node-releases@^2.0.13:
resolved "https://registry.npmjs.org/node-releases/-/node-releases-2.0.13.tgz"
integrity sha512-uYr7J37ae/ORWdZeQ1xxMJe3NtdmqMC/JZK+geofDrkLUApKRHPd18/TxtBOJ4A0/+uUIliorNrfYV6s1b02eQ==
nopt@1.0.10:
version "1.0.10"
resolved "https://registry.npmjs.org/nopt/-/nopt-1.0.10.tgz"
integrity sha512-NWmpvLSqUrgrAC9HCuxEvb+PSloHpqVu+FqcO4eeF2h5qYRhA7ev6KvelyQAKtegUbC6RypJnlEOhd8vloNKYg==
dependencies:
abbrev "1"
normalize-path@^3.0.0, normalize-path@~3.0.0:
version "3.0.0"
resolved "https://registry.npmjs.org/normalize-path/-/normalize-path-3.0.0.tgz"
@ -5476,6 +5657,11 @@ normalize-url@^6.0.1:
resolved "https://registry.npmjs.org/normalize-url/-/normalize-url-6.1.0.tgz"
integrity sha512-DlL+XwOy3NxAQ8xuC0okPgK46iuVNAK01YN7RueYBqqFeGsBjV9XmCAzAdgt+667bCl5kPh9EqKKDwnaPG1I7A==
not@^0.1.0:
version "0.1.0"
resolved "https://registry.npmjs.org/not/-/not-0.1.0.tgz"
integrity sha512-5PDmaAsVfnWUgTUbJ3ERwn7u79Z0dYxN9ErxCpVJJqe2RK0PJ3z+iFUxuqjwtlDDegXvtWoxD/3Fzxox7tFGWA==
npm-run-path@^4.0.1:
version "4.0.1"
resolved "https://registry.npmjs.org/npm-run-path/-/npm-run-path-4.0.1.tgz"
@ -5488,7 +5674,7 @@ nprogress@^0.2.0:
resolved "https://registry.npmjs.org/nprogress/-/nprogress-0.2.0.tgz"
integrity sha512-I19aIingLgR1fmhftnbWWO3dXc0hSxqHQHQb3H8m+K3TnEn/iSeTZZOyvKXWqQESMwuUVnatlCnZdLBZZt2VSA==
nth-check@^2.0.1:
nth-check@^2.0.0, nth-check@^2.0.1:
version "2.1.1"
resolved "https://registry.npmjs.org/nth-check/-/nth-check-2.1.1.tgz"
integrity sha512-lqjrjmaOoAnWfMmBPL+XNnynZh2+swxiX3WUE0s4yEHI6m+AwrK2UZOimIRl3X/4QctVqS8AiZjFqyOGrMXb/w==
@ -6208,6 +6394,11 @@ queue-microtask@^1.2.2:
resolved "https://registry.npmjs.org/queue-microtask/-/queue-microtask-1.2.3.tgz"
integrity sha512-NuaNSa6flKT5JaSYQzJok04JzTL1CA6aGhv5rfLW3PgqA+M2ChpZQnAC8h8i4ZFkBS8X5RqkDBHA7r4hej3K9A==
queue-tick@^1.0.1:
version "1.0.1"
resolved "https://registry.npmjs.org/queue-tick/-/queue-tick-1.0.1.tgz"
integrity sha512-kJt5qhMxoszgU/62PLP1CJytzd2NKetjSRnyuj31fDd3Rlcz3fzlFdFLD1SItunPwyqEOkca6GbV612BWfaBag==
queue@6.0.2:
version "6.0.2"
resolved "https://registry.npmjs.org/queue/-/queue-6.0.2.tgz"
@ -6297,7 +6488,7 @@ react-dev-utils@^12.0.1:
strip-ansi "^6.0.1"
text-table "^0.2.0"
react-dom@*, "react-dom@^16.6.0 || ^17.0.0 || ^18.0.0", "react-dom@^16.8.4 || ^17.0.0", "react-dom@^17.0.0 || ^16.3.0 || ^15.5.4", react-dom@^17.0.2, "react-dom@>= 16.8.0 < 19.0.0":
react-dom@*, "react-dom@^16.6.0 || ^17.0.0 || ^18.0.0", "react-dom@^16.8.4 || ^17", "react-dom@^16.8.4 || ^17.0.0", "react-dom@^17.0.0 || ^16.3.0 || ^15.5.4", react-dom@^17.0.2, "react-dom@>= 16.8.0 < 19.0.0":
version "17.0.2"
resolved "https://registry.npmjs.org/react-dom/-/react-dom-17.0.2.tgz"
integrity sha512-s4h96KtLDUQlsENhMn1ar8t2bEa+q/YAtj8pPPdIjPDGBDIVNsrD9aXNWqspUe6AzKCIG0C1HZZLqLV7qpOBGA==
@ -6421,7 +6612,7 @@ react-waypoint@^10.3.0, react-waypoint@>=9.0.2:
prop-types "^15.0.0"
react-is "^17.0.1 || ^18.0.0"
react@*, "react@^15.0.2 || ^16.0.0 || ^17.0.0", "react@^15.3.0 || ^16.0.0 || ^17.0.0 || ^18.0.0", "react@^16.13.1 || ^17.0.0", "react@^16.6.0 || ^17.0.0 || ^18.0.0", "react@^16.8.0 || ^17.0.0 || ^18.0.0", "react@^16.8.4 || ^17.0.0", "react@^17.0.0 || ^16.3.0 || ^15.5.4", react@^17.0.2, "react@>= 16.8.0 < 19.0.0", react@>=0.14.9, react@>=0.14.x, react@>=15, react@17.0.2:
react@*, "react@^15.0.2 || ^16.0.0 || ^17.0.0", "react@^15.3.0 || ^16.0.0 || ^17.0.0 || ^18.0.0", "react@^16.13.1 || ^17.0.0", "react@^16.6.0 || ^17.0.0 || ^18.0.0", "react@^16.8.0 || ^17.0.0 || ^18.0.0", "react@^16.8.4 || ^17", "react@^16.8.4 || ^17.0.0", "react@^17.0.0 || ^16.3.0 || ^15.5.4", react@^17.0.2, "react@>= 16.8.0 < 19.0.0", react@>=0.14.9, react@>=0.14.x, react@>=15, react@17.0.2:
version "17.0.2"
resolved "https://registry.npmjs.org/react/-/react-17.0.2.tgz"
integrity sha512-gnhPt75i/dq/z3/6q/0asP78D0u592D5L1pd7M8P+dck6Fu/jJeL6iVVK23fptSUZj8Vjf++7wXA8UNclGQcbA==
@ -6534,6 +6725,14 @@ regjsparser@^0.9.1:
dependencies:
jsesc "~0.5.0"
rehype-parse@^7.0.1:
version "7.0.1"
resolved "https://registry.npmjs.org/rehype-parse/-/rehype-parse-7.0.1.tgz"
integrity sha512-fOiR9a9xH+Le19i4fGzIEowAbwG7idy2Jzs4mOrFWBSJ0sNUgy0ev871dwWnbOo371SjgjG4pwzrbgSVrKxecw==
dependencies:
hast-util-from-parse5 "^6.0.0"
parse5 "^6.0.0"
relateurl@^0.2.7:
version "0.2.7"
resolved "https://registry.npmjs.org/relateurl/-/relateurl-0.2.7.tgz"
@ -6607,7 +6806,7 @@ renderkid@^3.0.0:
lodash "^4.17.21"
strip-ansi "^6.0.1"
repeat-string@^1.5.4:
repeat-string@^1.0.0, repeat-string@^1.5.4:
version "1.6.1"
resolved "https://registry.npmjs.org/repeat-string/-/repeat-string-1.6.1.tgz"
integrity sha512-PV0dzCYDNfRi1jCDbJzpW7jNNDRuCOG/jI5ctQcGKt/clZD+YcPS3yIlWuTJMmESC8aevCFmWJy5wjAFgNqN6w==
@ -6844,7 +7043,7 @@ semver@^6.3.1:
resolved "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz"
integrity sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==
semver@^7.3.2, semver@^7.3.4, semver@^7.3.5, semver@^7.3.7, semver@^7.3.8:
semver@^7.3.2, semver@^7.3.4, semver@^7.3.5, semver@^7.3.7, semver@^7.3.8, semver@^7.5.4:
version "7.5.4"
resolved "https://registry.npmjs.org/semver/-/semver-7.5.4.tgz"
integrity sha512-1bCSESV6Pv+i21Hvpxp3Dx+pSD8lIPt8uVjRrxAUt/nbswYc+tK6Y2btiULjd4+fnq15PX+nqQDC7Oft7WkwcA==
@ -6941,7 +7140,21 @@ shallowequal@^1.1.0:
resolved "https://registry.npmjs.org/shallowequal/-/shallowequal-1.1.0.tgz"
integrity sha512-y0m1JoUZSlPAjXVtPPW70aZWfIL/dSP7AFkRnniLCrK/8MDKog3TySTBmckD+RObVxH0v4Tox67+F14PdED2oQ==
sharp@*, sharp@^0.30.7:
sharp@*, sharp@^0.32.5:
version "0.32.5"
resolved "https://registry.npmjs.org/sharp/-/sharp-0.32.5.tgz"
integrity sha512-0dap3iysgDkNaPOaOL4X/0akdu0ma62GcdC2NBQ+93eqpePdDdr2/LM0sFdDSMmN7yS+odyZtPsb7tx/cYBKnQ==
dependencies:
color "^4.2.3"
detect-libc "^2.0.2"
node-addon-api "^6.1.0"
prebuild-install "^7.1.1"
semver "^7.5.4"
simple-get "^4.0.1"
tar-fs "^3.0.4"
tunnel-agent "^0.6.0"
sharp@^0.30.7:
version "0.30.7"
resolved "https://registry.npmjs.org/sharp/-/sharp-0.30.7.tgz"
integrity sha512-G+MY2YW33jgflKPTXXptVO28HvNOo9G3j0MybYAHeEmby+QuD2U98dT6ueht9cv/XDqZspSpIhoSW+BAKJ7Hig==
@ -6990,7 +7203,7 @@ side-channel@^1.0.4:
get-intrinsic "^1.0.2"
object-inspect "^1.9.0"
signal-exit@^3.0.2, signal-exit@^3.0.3:
signal-exit@^3.0.0, signal-exit@^3.0.2, signal-exit@^3.0.3:
version "3.0.7"
resolved "https://registry.npmjs.org/signal-exit/-/signal-exit-3.0.7.tgz"
integrity sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==
@ -7145,6 +7358,14 @@ std-env@^3.0.1:
resolved "https://registry.npmjs.org/std-env/-/std-env-3.3.3.tgz"
integrity sha512-Rz6yejtVyWnVjC1RFvNmYL10kgjC49EOghxWn0RFqlCHGFpQx+Xe7yW3I4ceK1SGrWIGMjD5Kbue8W/udkbMJg==
streamx@^2.15.0:
version "2.15.1"
resolved "https://registry.npmjs.org/streamx/-/streamx-2.15.1.tgz"
integrity sha512-fQMzy2O/Q47rgwErk/eGeLu/roaFWV0jVsogDmrszM9uIw8L5OA+t+V93MgYlufNptfjmYR1tOMWhei/Eh7TQA==
dependencies:
fast-fifo "^1.1.0"
queue-tick "^1.0.1"
string_decoder@^1.1.1:
version "1.3.0"
resolved "https://registry.npmjs.org/string_decoder/-/string_decoder-1.3.0.tgz"
@ -7159,6 +7380,15 @@ string_decoder@~1.1.1:
dependencies:
safe-buffer "~5.1.0"
"string-width@^1.0.2 || 2 || 3 || 4":
version "4.2.3"
resolved "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz"
integrity sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==
dependencies:
emoji-regex "^8.0.0"
is-fullwidth-code-point "^3.0.0"
strip-ansi "^6.0.1"
string-width@^4.0.0, string-width@^4.1.0, string-width@^4.2.2:
version "4.2.3"
resolved "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz"
@ -7177,6 +7407,15 @@ string-width@^4.2.0:
is-fullwidth-code-point "^3.0.0"
strip-ansi "^6.0.1"
string-width@^4.2.3:
version "4.2.3"
resolved "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz"
integrity sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==
dependencies:
emoji-regex "^8.0.0"
is-fullwidth-code-point "^3.0.0"
strip-ansi "^6.0.1"
string-width@^5.0.1:
version "5.1.2"
resolved "https://registry.npmjs.org/string-width/-/string-width-5.1.2.tgz"
@ -7308,6 +7547,15 @@ tar-fs@^2.0.0, tar-fs@^2.1.1:
pump "^3.0.0"
tar-stream "^2.1.4"
tar-fs@^3.0.4:
version "3.0.4"
resolved "https://registry.npmjs.org/tar-fs/-/tar-fs-3.0.4.tgz"
integrity sha512-5AFQU8b9qLfZCX9zp2duONhPmZv0hGYiBPJsyUdqMjzq/mqVpy/rEUSeHk1+YitmxugaptgBh5oDGU3VsAJq4w==
dependencies:
mkdirp-classic "^0.5.2"
pump "^3.0.0"
tar-stream "^3.1.5"
tar-stream@^2.1.4:
version "2.2.0"
resolved "https://registry.npmjs.org/tar-stream/-/tar-stream-2.2.0.tgz"
@ -7319,6 +7567,15 @@ tar-stream@^2.1.4:
inherits "^2.0.3"
readable-stream "^3.1.1"
tar-stream@^3.1.5:
version "3.1.6"
resolved "https://registry.npmjs.org/tar-stream/-/tar-stream-3.1.6.tgz"
integrity sha512-B/UyjYwPpMBv+PaFSWAmtYjwdrlEaZQEhMIBFNC5oEG8lpiW8XjcSdmEaClj28ArfKScKHs2nshz3k2le6crsg==
dependencies:
b4a "^1.6.4"
fast-fifo "^1.2.0"
streamx "^2.15.0"
terser-webpack-plugin@^5.3.3, terser-webpack-plugin@^5.3.7:
version "5.3.9"
resolved "https://registry.npmjs.org/terser-webpack-plugin/-/terser-webpack-plugin-5.3.9.tgz"
@ -7377,6 +7634,14 @@ to-regex-range@^5.0.1:
dependencies:
is-number "^7.0.0"
to-vfile@^6.1.0:
version "6.1.0"
resolved "https://registry.npmjs.org/to-vfile/-/to-vfile-6.1.0.tgz"
integrity sha512-BxX8EkCxOAZe+D/ToHdDsJcVI4HqQfmw0tCkp31zf3dNP/XWIAjU4CmeuSwsSoOzOTqHPOL0KUzyZqJplkD0Qw==
dependencies:
is-buffer "^2.0.0"
vfile "^4.0.0"
toidentifier@1.0.1:
version "1.0.1"
resolved "https://registry.npmjs.org/toidentifier/-/toidentifier-1.0.1.tgz"
@ -7485,7 +7750,7 @@ unicode-property-aliases-ecmascript@^2.0.0:
resolved "https://registry.npmjs.org/unicode-property-aliases-ecmascript/-/unicode-property-aliases-ecmascript-2.1.0.tgz"
integrity sha512-6t3foTQI9qne+OZoVQB/8x8rk2k1eVy1gRXhV3oFQ5T6R1dqQ1xtin3XqSlx3+ATBkliTaR/hHyJBm+LVPNM8w==
unified@^9.2.2:
unified@^9.0.0, unified@^9.2.2:
version "9.2.2"
resolved "https://registry.npmjs.org/unified/-/unified-9.2.2.tgz"
integrity sha512-Sg7j110mtefBD+qunSLO1lqOEKdrwBFBrR6Qd8f4uwkhWNlbkaqwHse6e7QvD3AP/MNoJdEDLaf8OxYyoWgorQ==
@ -7521,12 +7786,19 @@ unist-builder@^2.0.0, unist-builder@2.0.3:
resolved "https://registry.npmjs.org/unist-builder/-/unist-builder-2.0.3.tgz"
integrity sha512-f98yt5pnlMWlzP539tPc4grGMsFaQQlP/vM396b00jngsiINumNmsY8rkXjfoi1c6QaM8nQ3vaGDuoKWbe/1Uw==
unist-util-find-after@^3.0.0:
version "3.0.0"
resolved "https://registry.npmjs.org/unist-util-find-after/-/unist-util-find-after-3.0.0.tgz"
integrity sha512-ojlBqfsBftYXExNu3+hHLfJQ/X1jYY/9vdm4yZWjIbf0VuWF6CRufci1ZyoD/wV2TYMKxXUoNuoqwy+CkgzAiQ==
dependencies:
unist-util-is "^4.0.0"
unist-util-generated@^1.0.0:
version "1.1.6"
resolved "https://registry.npmjs.org/unist-util-generated/-/unist-util-generated-1.1.6.tgz"
integrity sha512-cln2Mm1/CZzN5ttGK7vkoGw+RZ8VcUH6BtGbq98DDtRGquAAOXig1mrBQYelOwMXYS8rK+vZDyyojSjp7JX+Lg==
unist-util-is@^4.0.0:
unist-util-is@^4.0.0, unist-util-is@^4.0.2:
version "4.1.0"
resolved "https://registry.npmjs.org/unist-util-is/-/unist-util-is-4.1.0.tgz"
integrity sha512-ZOQSsnce92GrxSqlnEEseX0gi7GH9zTJZ0p9dtu87WRb/37mMPO2Ilx1s/t9vBHrFhbgweUwb+t7cIn5dxPhZg==
@ -7908,6 +8180,13 @@ which@^2.0.1:
dependencies:
isexe "^2.0.0"
wide-align@^1.1.2:
version "1.1.5"
resolved "https://registry.npmjs.org/wide-align/-/wide-align-1.1.5.tgz"
integrity sha512-eDMORYaPNZ4sQIuuYPDHdQvf4gyCF9rEEV/yPxGfwPkRodwEgiMUUXTx/dex+Me0wxx53S+NgUHaP7y3MGlDmg==
dependencies:
string-width "^1.0.2 || 2 || 3 || 4"
widest-line@^3.1.0:
version "3.1.0"
resolved "https://registry.npmjs.org/widest-line/-/widest-line-3.1.0.tgz"

View file

@ -1,11 +1,12 @@
import threading
from typing import Callable, List, Optional
from typing import Callable, List, Optional, Dict
input_callback: List[str] = []
success_callback: List[str] = []
failure_callback: List[str] = []
set_verbose = False
email: Optional[str] = None # for hosted dashboard. Learn more - https://docs.litellm.ai/docs/debugging/hosted_debugging
token: Optional[str] = None # for hosted dashboard. Learn more - https://docs.litellm.ai/docs/debugging/hosted_debugging
telemetry = True
max_tokens = 256 # OpenAI Defaults
retry = True
@ -20,10 +21,23 @@ huggingface_key: Optional[str] = None
vertex_project: Optional[str] = None
vertex_location: Optional[str] = None
togetherai_api_key: Optional[str] = None
baseten_key: Optional[str] = None
use_client = False
logging = True
caching = False
caching_with_models = False # if you want the caching key to be model + prompt
debugger = False
model_alias_map: Dict[str, str] = {}
model_cost = {
"babbage-002": {
"max_tokens": 16384,
"input_cost_per_token": 0.0000004,
"output_cost_per_token": 0.0000004,
},
"davinci-002": {
"max_tokens": 16384,
"input_cost_per_token": 0.000002,
"output_cost_per_token": 0.000002,
},
"gpt-3.5-turbo": {
"max_tokens": 4000,
"input_cost_per_token": 0.0000015,
@ -137,7 +151,7 @@ open_ai_chat_completion_models = [
"gpt-3.5-turbo-0613",
"gpt-3.5-turbo-16k-0613",
]
open_ai_text_completion_models = ["text-davinci-003"]
open_ai_text_completion_models = ["text-davinci-003", "babbage-002", "davinci-002"]
cohere_models = [
"command-nightly",
@ -153,7 +167,7 @@ replicate_models = [
"replicate/",
"replicate/llama-2-70b-chat:58d078176e02c219e11eb4da5a02a7830a283b14cf8f94537af893ccff5ee781",
"a16z-infra/llama-2-13b-chat:2a7f981751ec7fdf87b5b91ad4db53683a98082e9ff7bfd12c8cd5ea85980a52",
"joehoover/instructblip-vicuna13b:c4c54e3c8c97cd50c2d2fec9be3b6065563ccf7d43787fb99f84151b867178fe"
"joehoover/instructblip-vicuna13b:c4c54e3c8c97cd50c2d2fec9be3b6065563ccf7d43787fb99f84151b867178fe",
"replicate/dolly-v2-12b:ef0e1aefc61f8e096ebe4db6b2bacc297daf2ef6899f0f7e001ec445893500e5",
"a16z-infra/llama-2-7b-chat:7b0bfc9aff140d5b75bacbed23e91fd3c34b01a1e958d32132de6e0a19796e2c",
"replicate/vicuna-13b:6282abe6a492de4145d7bb601023762212f9ddbbe78278bd6771c8b3b2f2a13b",
@ -220,7 +234,6 @@ model_list = (
provider_list = [
"openai",
"azure",
"cohere",
"anthropic",
"replicate",
@ -230,6 +243,7 @@ provider_list = [
"vertex_ai",
"ai21",
"baseten",
"azure",
]
models_by_provider = {

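The `model_cost` map above makes it straightforward to estimate spend from a response's token counts. A rough sketch, using the `babbage-002` entry shown above; the helper function here is illustrative, not part of the library:
```python
import litellm

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    # look up per-token prices from litellm.model_cost
    pricing = litellm.model_cost[model]
    return (
        prompt_tokens * pricing["input_cost_per_token"]
        + completion_tokens * pricing["output_cost_per_token"]
    )

# e.g. 1000 prompt tokens + 200 completion tokens on babbage-002
# = 1200 * 0.0000004 = $0.00048
print(estimate_cost("babbage-002", 1000, 200))
```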
31
litellm/cache.py Normal file
View file

@ -0,0 +1,31 @@
###### LiteLLM Integration with GPT Cache #########
import gptcache
# openai.ChatCompletion._llm_handler = litellm.completion
from gptcache.adapter import openai
import litellm
class LiteLLMChatCompletion(gptcache.adapter.openai.ChatCompletion):
@classmethod
def _llm_handler(cls, *llm_args, **llm_kwargs):
return litellm.completion(*llm_args, **llm_kwargs)
completion = LiteLLMChatCompletion.create
###### End of LiteLLM Integration with GPT Cache #########
# ####### Example usage ###############
# from gptcache import cache
# completion = LiteLLMChatCompletion.create
# # set API keys in .env / os.environ
# cache.init()
# cache.set_openai_key()
# result = completion(model="claude-2", messages=[{"role": "user", "content": "cto of litellm"}])
# print(result)
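Roughly how the adapter above is meant to be used, based on the commented example; this sketch assumes `gptcache` is installed and `OPENAI_API_KEY` is available in the environment:
```python
import os
from gptcache import cache
from litellm.cache import completion  # alias for LiteLLMChatCompletion.create above

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder

cache.init()            # initialize GPT Cache
cache.set_openai_key()  # let GPT Cache pick up the OpenAI key from the environment

messages = [{"role": "user", "content": "what is litellm?"}]
# the first call hits the provider; similar follow-up calls can be served from cache
result = completion(model="gpt-3.5-turbo", messages=messages)
print(result)
```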

View file

@ -1,12 +1,12 @@
import requests, traceback, json, os
class LiteDebugger:
user_email = None
dashboard_url = None
def __init__(self, email=None):
self.api_url = "https://api.litellm.ai/debugger"
# self.api_url = "http://0.0.0.0:4000/debugger"
self.validate_environment(email)
pass
@ -14,7 +14,10 @@ class LiteDebugger:
try:
self.user_email = os.getenv("LITELLM_EMAIL") or email
self.dashboard_url = "https://admin.litellm.ai/" + self.user_email
print(f"Here's your free Dashboard 👉 {self.dashboard_url}")
try:
print(f"\033[92mHere's your LiteLLM Dashboard 👉 \033[94m\033[4m{self.dashboard_url}\033[0m")
except:
print(f"Here's your LiteLLM Dashboard 👉 {self.dashboard_url}")
if self.user_email == None:
raise Exception(
"[Non-Blocking Error] LiteLLMDebugger: Missing LITELLM_EMAIL. Set it in your environment. Eg.: os.environ['LITELLM_EMAIL']= <your_email>"
@ -25,12 +28,19 @@ class LiteDebugger:
)
def input_log_event(
self, model, messages, end_user, litellm_call_id, print_verbose
self, model, messages, end_user, litellm_call_id, print_verbose, litellm_params, optional_params
):
try:
print_verbose(
f"LiteLLMDebugger: Logging - Enters input logging function for model {model}"
)
def remove_key_value(dictionary, key):
new_dict = dictionary.copy() # Create a copy of the original dictionary
new_dict.pop(key) # Remove the specified key-value pair from the copy
return new_dict
updated_litellm_params = remove_key_value(litellm_params, "logger_fn")
litellm_data_obj = {
"model": model,
"messages": messages,
@ -38,6 +48,33 @@ class LiteDebugger:
"status": "initiated",
"litellm_call_id": litellm_call_id,
"user_email": self.user_email,
"litellm_params": updated_litellm_params,
"optional_params": optional_params
}
print_verbose(
f"LiteLLMDebugger: Logging - logged data obj {litellm_data_obj}"
)
response = requests.post(
url=self.api_url,
headers={"content-type": "application/json"},
data=json.dumps(litellm_data_obj),
)
print_verbose(f"LiteDebugger: api response - {response.text}")
except:
print_verbose(
f"[Non-Blocking Error] LiteDebugger: Logging Error - {traceback.format_exc()}"
)
pass
def post_call_log_event(
self, original_response, litellm_call_id, print_verbose
):
try:
litellm_data_obj = {
"status": "received",
"additional_details": {"original_response": original_response},
"litellm_call_id": litellm_call_id,
"user_email": self.user_email,
}
response = requests.post(
url=self.api_url,
@ -49,7 +86,6 @@ class LiteDebugger:
print_verbose(
f"[Non-Blocking Error] LiteDebugger: Logging Error - {traceback.format_exc()}"
)
pass
def log_event(
self,
@ -64,7 +100,7 @@ class LiteDebugger:
):
try:
print_verbose(
f"LiteLLMDebugger: Logging - Enters input logging function for model {model}"
f"LiteLLMDebugger: Logging - Enters handler logging function for model {model} with response object {response_obj}"
)
total_cost = 0 # [TODO] implement cost tracking
response_time = (end_time - start_time).total_seconds()
@ -74,7 +110,47 @@ class LiteDebugger:
"model": response_obj["model"],
"total_cost": total_cost,
"messages": messages,
"response": response_obj["choices"][0]["message"]["content"],
"response": response['choices'][0]['message']['content'],
"end_user": end_user,
"litellm_call_id": litellm_call_id,
"status": "success",
"user_email": self.user_email,
}
print_verbose(
f"LiteDebugger: Logging - final data object: {litellm_data_obj}"
)
response = requests.post(
url=self.api_url,
headers={"content-type": "application/json"},
data=json.dumps(litellm_data_obj),
)
elif "data" in response_obj and isinstance(response_obj["data"], list) and len(response_obj["data"]) > 0 and "embedding" in response_obj["data"][0]:
print(f"messages: {messages}")
litellm_data_obj = {
"response_time": response_time,
"model": response_obj["model"],
"total_cost": total_cost,
"messages": messages,
"response": str(response_obj["data"][0]["embedding"][:5]),
"end_user": end_user,
"litellm_call_id": litellm_call_id,
"status": "success",
"user_email": self.user_email,
}
print_verbose(
f"LiteDebugger: Logging - final data object: {litellm_data_obj}"
)
response = requests.post(
url=self.api_url,
headers={"content-type": "application/json"},
data=json.dumps(litellm_data_obj),
)
elif isinstance(response_obj, object) and response_obj.__class__.__name__ == "CustomStreamWrapper":
litellm_data_obj = {
"response_time": response_time,
"total_cost": total_cost,
"messages": messages,
"response": "Streamed response",
"end_user": end_user,
"litellm_call_id": litellm_call_id,
"status": "success",

View file

@ -0,0 +1,124 @@
#### What this does ####
# On success + failure, log events to aispend.io
import datetime
import traceback
import dotenv
import os
import requests
dotenv.load_dotenv() # Loading env variables using dotenv
# convert usage to {completion: xx, prompt: xx}
def parse_usage(usage):
return {
"completion":
usage["completion_tokens"] if "completion_tokens" in usage else 0,
"prompt":
usage["prompt_tokens"] if "prompt_tokens" in usage else 0,
}
def parse_messages(input):
if input is None:
return None
def clean_message(message):
# if it's a string, return as is
if isinstance(message, str):
return message
if "message" in message:
return clean_message(message["message"])
return {
"role": message["role"],
"text": message["content"],
}
if isinstance(input, list):
if len(input) == 1:
return clean_message(input[0])
else:
return [clean_message(msg) for msg in input]
else:
return clean_message(input)
class LLMonitorLogger:
# Class variables or attributes
def __init__(self):
# Instance variables
self.api_url = os.getenv(
"LLMONITOR_API_URL") or "https://app.llmonitor.com"
self.app_id = os.getenv("LLMONITOR_APP_ID")
def log_event(
self,
type,
event,
run_id,
model,
print_verbose,
input=None,
user_id=None,
response_obj=None,
start_time=datetime.datetime.now(),
end_time=datetime.datetime.now(),
error=None,
):
# Method definition
try:
print_verbose(
f"LLMonitor Logging - Logging request for model {model}")
if response_obj:
usage = parse_usage(
response_obj['usage']) if 'usage' in response_obj else None
output = response_obj[
'choices'] if 'choices' in response_obj else None
else:
usage = None
output = None
if error:
error_obj = {'stack': error}
else:
error_obj = None
data = [{
"type": type,
"name": model,
"runId": run_id,
"app": self.app_id,
'event': 'start',
"timestamp": start_time.isoformat(),
"userId": user_id,
"input": parse_messages(input),
}, {
"type": type,
"runId": run_id,
"app": self.app_id,
"event": event,
"error": error_obj,
"timestamp": end_time.isoformat(),
"userId": user_id,
"output": parse_messages(output),
"tokensUsage": usage,
}]
# print_verbose(f"LLMonitor Logging - final data object: {data}")
response = requests.post(
self.api_url + '/api/report',
headers={'Content-Type': 'application/json'},
json={'events': data})
print_verbose(f"LLMonitor Logging - response: {response}")
except:
# traceback.print_exc()
print_verbose(
f"LLMonitor Logging Error - {traceback.format_exc()}")
pass
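For reference, a rough sketch of how `log_event` above might be invoked for a successful call. This is illustrative only: in practice litellm's callback machinery supplies these arguments, and the module path `litellm.integrations.llmonitor`, the `type`/`event` values, and the response object below are assumptions/placeholders.
```python
import datetime
from litellm.integrations.llmonitor import LLMonitorLogger

logger = LLMonitorLogger()  # reads LLMONITOR_APP_ID / LLMONITOR_API_URL from the env

fake_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}],
    "usage": {"prompt_tokens": 9, "completion_tokens": 4},
}

logger.log_event(
    type="llm",            # placeholder event type
    event="end",           # placeholder event name
    run_id="run-123",
    model="gpt-3.5-turbo",
    print_verbose=print,
    input=[{"role": "user", "content": "Hello"}],
    user_id="user-42",
    response_obj=fake_response,
    start_time=datetime.datetime.now(),
    end_time=datetime.datetime.now(),
)
```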

View file

@ -0,0 +1,45 @@
#### What this does ####
# On success, logs events to Promptlayer
import dotenv, os
import requests
dotenv.load_dotenv() # Loading env variables using dotenv
import traceback
class PromptLayerLogger:
# Class variables or attributes
def __init__(self):
# Instance variables
self.key = os.getenv("PROMPTLAYER_API_KEY")
def log_event(self, kwargs, response_obj, start_time, end_time, print_verbose):
# Method definition
try:
print_verbose(
f"Prompt Layer Logging - Enters logging function for model {kwargs}"
)
request_response = requests.post(
"https://api.promptlayer.com/rest/track-request",
json={
"function_name": "openai.ChatCompletion.create",
"kwargs": kwargs,
"tags": ["hello", "world"],
"request_response": dict(response_obj), # TODO: Check if we need a dict
"request_start_time": int(start_time.timestamp()),
"request_end_time": int(end_time.timestamp()),
"api_key": self.key,
# Optional params for PromptLayer
# "prompt_id": "<PROMPT ID>",
# "prompt_input_variables": "<Dictionary of variables for prompt>",
# "prompt_version":1,
},
)
print_verbose(f"Prompt Layer Logging - final response object: {request_response}")
except:
# traceback.print_exc()
print_verbose(f"Prompt Layer Error - {traceback.format_exc()}")
pass

View file

@ -81,16 +81,17 @@ class AnthropicLLM:
api_key=self.api_key,
additional_args={"complete_input_dict": data},
)
# COMPLETION CALL
response = requests.post(
self.completion_url, headers=self.headers, data=json.dumps(data), stream=optional_params["stream"]
)
print(optional_params)
if "stream" in optional_params and optional_params["stream"] is True:
print("IS STREAMING")
## COMPLETION CALL
if "stream" in optional_params and optional_params["stream"] == True:
response = requests.post(
self.completion_url, headers=self.headers, data=json.dumps(data), stream=optional_params["stream"]
)
return response.iter_lines()
else:
# LOGGING
response = requests.post(
self.completion_url, headers=self.headers, data=json.dumps(data)
)
## LOGGING
self.logging_obj.post_call(
input=prompt,
api_key=self.api_key,

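At the user level, the streaming branch above is what makes the following usage pattern work. This is a sketch only: the API key is a placeholder, and chunks are printed as-is rather than assuming a particular delta shape.
```python
import os
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "..."  # placeholder

messages = [{"role": "user", "content": "Hello, how are you?"}]

# stream=True routes through the streaming POST above and returns an iterator
response = completion(model="claude-2", messages=messages, stream=True)
for chunk in response:
    print(chunk)
```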
132
litellm/llms/baseten.py Normal file
View file

@ -0,0 +1,132 @@
import os, json
from enum import Enum
import requests
import time
from typing import Callable
from litellm.utils import ModelResponse
class BasetenError(Exception):
def __init__(self, status_code, message):
self.status_code = status_code
self.message = message
super().__init__(
self.message
) # Call the base class constructor with the parameters it needs
class BasetenLLM:
def __init__(
self, encoding, logging_obj, api_key=None
):
self.encoding = encoding
self.completion_url_fragment_1 = "https://app.baseten.co/models/"
self.completion_url_fragment_2 = "/predict"
self.api_key = api_key
self.logging_obj = logging_obj
self.validate_environment(api_key=api_key)
def validate_environment(
self, api_key
): # set up the environment required to run the model
# set the api key
if self.api_key == None:
raise ValueError(
"Missing Baseten API Key - A call is being made to baseten but no key is set either in the environment variables or via params"
)
self.api_key = api_key
self.headers = {
"accept": "application/json",
"content-type": "application/json",
"Authorization": "Api-Key " + self.api_key,
}
def completion(
self,
model: str,
messages: list,
model_response: ModelResponse,
print_verbose: Callable,
optional_params=None,
litellm_params=None,
logger_fn=None,
): # logic for parsing in - calling - parsing out model completion calls
model = model
prompt = ""
for message in messages:
if "role" in message:
if message["role"] == "user":
prompt += (
f"{message['content']}"
)
else:
prompt += (
f"{message['content']}"
)
else:
prompt += f"{message['content']}"
data = {
"prompt": prompt,
# "instruction": prompt, # some baseten models require the prompt to be passed in via the 'instruction' kwarg
# **optional_params,
}
## LOGGING
self.logging_obj.pre_call(
input=prompt,
api_key=self.api_key,
additional_args={"complete_input_dict": data},
)
## COMPLETION CALL
response = requests.post(
self.completion_url_fragment_1 + model + self.completion_url_fragment_2, headers=self.headers, data=json.dumps(data)
)
if "stream" in optional_params and optional_params["stream"] == True:
return response.iter_lines()
else:
## LOGGING
self.logging_obj.post_call(
input=prompt,
api_key=self.api_key,
original_response=response.text,
additional_args={"complete_input_dict": data},
)
print_verbose(f"raw model_response: {response.text}")
## RESPONSE OBJECT
completion_response = response.json()
if "error" in completion_response:
raise BasetenError(
message=completion_response["error"],
status_code=response.status_code,
)
else:
if "model_output" in completion_response:
if isinstance(completion_response["model_output"], dict) and "data" in completion_response["model_output"] and isinstance(completion_response["model_output"]["data"], list):
model_response["choices"][0]["message"]["content"] = completion_response["model_output"]["data"][0]
elif isinstance(completion_response["model_output"], str):
model_response["choices"][0]["message"]["content"] = completion_response["model_output"]
elif "completion" in completion_response and isinstance(completion_response["completion"], str):
model_response["choices"][0]["message"]["content"] = completion_response["completion"]
else:
raise ValueError(f"Unable to parse response. Original response: {response.text}")
## CALCULATING USAGE - baseten charges on time, not tokens - have some mapping of cost here.
prompt_tokens = len(
self.encoding.encode(prompt)
)
completion_tokens = len(
self.encoding.encode(model_response["choices"][0]["message"]["content"])
)
model_response["created"] = time.time()
model_response["model"] = model
model_response["usage"] = {
"prompt_tokens": prompt_tokens,
"completion_tokens": completion_tokens,
"total_tokens": prompt_tokens + completion_tokens,
}
return model_response
def embedding(
self,
): # logic for parsing in - calling - parsing out model embedding calls
pass
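As a user-facing sketch, the handler above is reached roughly like this. The model ID is a hypothetical Baseten deployed-model version ID and the key is a placeholder; replace both with your own values.
```python
import os
from litellm import completion

os.environ["BASETEN_API_KEY"] = "..."  # placeholder, read by the handler above

messages = [{"role": "user", "content": "Hello, how are you?"}]

# custom_llm_provider="baseten" routes the call to BasetenLLM.completion
response = completion(
    model="abcd1234",  # hypothetical Baseten model version ID
    messages=messages,
    custom_llm_provider="baseten",
)
print(response["choices"][0]["message"]["content"])
```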

View file

@ -1,7 +1,7 @@
import os, openai, sys
from typing import Any
from functools import partial
import dotenv, traceback, random, asyncio, time
import dotenv, traceback, random, asyncio, time, contextvars
from copy import deepcopy
import litellm
from litellm import ( # type: ignore
@ -21,6 +21,7 @@ from litellm.utils import (
)
from .llms.anthropic import AnthropicLLM
from .llms.huggingface_restapi import HuggingfaceRestAPILLM
from .llms.baseten import BasetenLLM
import tiktoken
from concurrent.futures import ThreadPoolExecutor
@ -34,8 +35,6 @@ from litellm.utils import (
)
from litellm.utils import (
get_ollama_response_stream,
stream_to_string,
together_ai_completion_streaming,
)
####### ENVIRONMENT VARIABLES ###################
@ -50,19 +49,23 @@ async def acompletion(*args, **kwargs):
# Use a partial function to pass your keyword arguments
func = partial(completion, *args, **kwargs)
# Add the context to the function
ctx = contextvars.copy_context()
func_with_context = partial(ctx.run, func)
# Call the synchronous function using run_in_executor
return await loop.run_in_executor(None, func)
return await loop.run_in_executor(None, func_with_context)
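For reference, the async wrapper above can be driven like this (a minimal sketch, assuming `acompletion` is re-exported at the package top level like `completion` and an OpenAI key is configured):
```python
import asyncio
from litellm import acompletion

async def main():
    response = await acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
    )
    print(response["choices"][0]["message"]["content"])

asyncio.run(main())
```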
@client
# @retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(2), reraise=True, retry_error_callback=lambda retry_state: setattr(retry_state.outcome, 'retry_variable', litellm.retry)) # retry call, turn this off by setting `litellm.retry = False`
@timeout( # type: ignore
600
) ## set timeouts, in case calls hang (e.g. Azure) - default is 60s, override with `force_timeout`
) ## set timeouts, in case calls hang (e.g. Azure) - default is 600s, override with `force_timeout`
def completion(
model,
messages, # required params
# Optional OpenAI params: see https://platform.openai.com/docs/api-reference/chat/create
messages=[],
functions=[],
function_call="", # optional params
temperature=1,
@ -73,6 +76,7 @@ def completion(
max_tokens=float("inf"),
presence_penalty=0,
frequency_penalty=0,
num_beams=1,
logit_bias={},
user="",
deployment_id=None,
@ -97,6 +101,9 @@ def completion(
try:
if fallbacks != []:
return completion_with_fallbacks(**args)
if litellm.model_alias_map and model in litellm.model_alias_map:
args["model_alias_map"] = litellm.model_alias_map
model = litellm.model_alias_map[model] # update the model to the actual value if an alias has been passed in
model_response = ModelResponse()
if azure: # this flag is deprecated, remove once notebooks are also updated.
custom_llm_provider = "azure"
@ -139,6 +146,7 @@ def completion(
custom_llm_provider=custom_llm_provider,
custom_api_base=custom_api_base,
litellm_call_id=litellm_call_id,
model_alias_map=litellm.model_alias_map
)
logging = Logging(
model=model,
@ -206,12 +214,11 @@ def completion(
): # allow user to make an openai call with a custom base
openai.api_type = "openai"
# note: if a user sets a custom base - we should ensure this works
# allow for the setting of dynamic and stateful api-bases
api_base = (
custom_api_base if custom_api_base is not None else litellm.api_base
) # allow for the setting of dynamic and stateful api-bases
openai.api_base = (
api_base if api_base is not None else "https://api.openai.com/v1"
custom_api_base or litellm.api_base or get_secret("OPENAI_API_BASE") or "https://api.openai.com/v1"
)
openai.api_base = api_base
openai.api_version = None
if litellm.organization:
openai.organization = litellm.organization
@ -248,7 +255,9 @@ def completion(
original_response=response,
additional_args={"headers": litellm.headers},
)
elif model in litellm.open_ai_text_completion_models:
elif (model in litellm.open_ai_text_completion_models or
"ft:babbage-002" in model or # support for finetuned completion models
"ft:davinci-002" in model):
openai.api_type = "openai"
openai.api_base = (
litellm.api_base
@ -521,6 +530,7 @@ def completion(
TOGETHER_AI_TOKEN = (
get_secret("TOGETHER_AI_TOKEN")
or get_secret("TOGETHERAI_API_KEY")
or get_secret("TOGETHER_AI_API_KEY")
or api_key
or litellm.togetherai_api_key
)
@ -532,9 +542,28 @@ def completion(
## LOGGING
logging.pre_call(input=prompt, api_key=TOGETHER_AI_TOKEN)
if stream == True:
return together_ai_completion_streaming(
{
print(f"TOGETHER_AI_TOKEN: {TOGETHER_AI_TOKEN}")
if "stream_tokens" in optional_params and optional_params["stream_tokens"] == True:
res = requests.post(
endpoint,
json={
"model": model,
"prompt": prompt,
"request_type": "language-model-inference",
**optional_params,
},
stream=optional_params["stream_tokens"],
headers=headers,
)
response = CustomStreamWrapper(
res.iter_lines(), model, custom_llm_provider="together_ai"
)
return response
else:
res = requests.post(
endpoint,
json={
"model": model,
"prompt": prompt,
"request_type": "language-model-inference",
@ -542,39 +571,29 @@ def completion(
},
headers=headers,
)
res = requests.post(
endpoint,
json={
"model": model,
"prompt": prompt,
"request_type": "language-model-inference",
**optional_params,
},
headers=headers,
)
## LOGGING
logging.post_call(
input=prompt, api_key=TOGETHER_AI_TOKEN, original_response=res.text
)
# make this safe for reading, if output does not exist raise an error
json_response = res.json()
if "output" not in json_response:
raise Exception(
f"liteLLM: Error Making TogetherAI request, JSON Response {json_response}"
## LOGGING
logging.post_call(
input=prompt, api_key=TOGETHER_AI_TOKEN, original_response=res.text
)
completion_response = json_response["output"]["choices"][0]["text"]
prompt_tokens = len(encoding.encode(prompt))
completion_tokens = len(encoding.encode(completion_response))
## RESPONSE OBJECT
model_response["choices"][0]["message"]["content"] = completion_response
model_response["created"] = time.time()
model_response["model"] = model
model_response["usage"] = {
"prompt_tokens": prompt_tokens,
"completion_tokens": completion_tokens,
"total_tokens": prompt_tokens + completion_tokens,
}
response = model_response
# make this safe for reading, if output does not exist raise an error
json_response = res.json()
if "output" not in json_response:
raise Exception(
f"liteLLM: Error Making TogetherAI request, JSON Response {json_response}"
)
completion_response = json_response["output"]["choices"][0]["text"]
prompt_tokens = len(encoding.encode(prompt))
completion_tokens = len(encoding.encode(completion_response))
## RESPONSE OBJECT
model_response["choices"][0]["message"]["content"] = completion_response
model_response["created"] = time.time()
model_response["model"] = model
model_response["usage"] = {
"prompt_tokens": prompt_tokens,
"completion_tokens": completion_tokens,
"total_tokens": prompt_tokens + completion_tokens,
}
response = model_response
elif model in litellm.vertex_chat_models:
# import vertexai/if it fails then pip install vertexai# import cohere/if it fails then pip install cohere
install_and_import("vertexai")
@ -677,36 +696,31 @@ def completion(
custom_llm_provider == "baseten"
or litellm.api_base == "https://app.baseten.co"
):
import baseten
base_ten_key = get_secret("BASETEN_API_KEY")
baseten.login(base_ten_key)
prompt = " ".join([message["content"] for message in messages])
## LOGGING
logging.pre_call(input=prompt, api_key=base_ten_key)
base_ten__model = baseten.deployed_model_version_id(model)
completion_response = base_ten__model.predict({"prompt": prompt})
if type(completion_response) == dict:
completion_response = completion_response["data"]
if type(completion_response) == dict:
completion_response = completion_response["generated_text"]
## LOGGING
logging.post_call(
input=prompt,
api_key=base_ten_key,
original_response=completion_response,
custom_llm_provider = "baseten"
baseten_key = (
api_key
or litellm.baseten_key
or os.environ.get("BASETEN_API_KEY")
)
## RESPONSE OBJECT
model_response["choices"][0]["message"]["content"] = completion_response
model_response["created"] = time.time()
model_response["model"] = model
baseten_client = BasetenLLM(
encoding=encoding, api_key=baseten_key, logging_obj=logging
)
model_response = baseten_client.completion(
model=model,
messages=messages,
model_response=model_response,
print_verbose=print_verbose,
optional_params=optional_params,
litellm_params=litellm_params,
logger_fn=logger_fn,
)
if "stream" in optional_params and optional_params["stream"] == True:
# don't try to access stream object,
response = CustomStreamWrapper(
model_response, model, custom_llm_provider="huggingface"
)
return response
response = model_response
elif custom_llm_provider == "petals" or (
litellm.api_base and "chat.petals.dev" in litellm.api_base
):
@ -753,6 +767,10 @@ def completion(
model=model, custom_llm_provider=custom_llm_provider, original_exception=e
)
def completion_with_retries(*args, **kwargs):
import tenacity
retryer = tenacity.Retrying(stop=tenacity.stop_after_attempt(3), reraise=True)
return retryer(completion, *args, **kwargs)
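# Usage sketch (model and message are illustrative): completion_with_retries() forwards its
# arguments to completion() and retries the call up to 3 times via tenacity.
#
#   response = completion_with_retries(
#       model="gpt-3.5-turbo",
#       messages=[{"role": "user", "content": "Hey, how's it going?"}],
#   )
#   print(response["choices"][0]["message"]["content"])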
def batch_completion(*args, **kwargs):
batch_messages = args[1] if len(args) > 1 else kwargs.get("messages")
@ -813,7 +831,7 @@ def embedding(
)
## EMBEDDING CALL
response = openai.Embedding.create(input=input, engine=model)
print_verbose(f"response_value: {str(response)[:50]}")
print_verbose(f"response_value: {str(response)[:100]}")
elif model in litellm.open_ai_embedding_models:
openai.api_type = "openai"
openai.api_base = "https://api.openai.com/v1"
@ -831,7 +849,7 @@ def embedding(
)
## EMBEDDING CALL
response = openai.Embedding.create(input=input, model=model)
print_verbose(f"response_value: {str(response)[:50]}")
print_verbose(f"response_value: {str(response)[:100]}")
else:
args = locals()
raise ValueError(f"No valid embedding model args passed in - {args}")
@ -847,6 +865,13 @@ def embedding(
custom_llm_provider="azure" if azure == True else None,
)
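# Usage sketch (model name and input are illustrative; mirrors the embedding test in this changeset):
#
#   response = embedding(model="text-embedding-ada-002", input=["good morning from litellm"])
#   print(str(response)[:100])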
###### Text Completion ################
def text_completion(*args, **kwargs):
if 'prompt' in kwargs:
messages = [{'role': 'system', 'content': kwargs['prompt']}]
kwargs['messages'] = messages
kwargs.pop('prompt')
return completion(*args, **kwargs)
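# Usage sketch: text_completion() maps a raw `prompt` into a single system message and
# defers to completion(); the model and prompt below are illustrative.
#
#   response = text_completion(model="gpt-3.5-turbo", prompt="What's the weather in SF?")
#   print(response["choices"][0]["message"]["content"])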
####### HELPER FUNCTIONS ################
## Set verbose to true -> ```litellm.set_verbose = True```

BIN
litellm/tests/data_map.txt Normal file

Binary file not shown.

View file

@ -14,7 +14,6 @@ from litellm import embedding, completion
messages = [{"role": "user", "content": "who is ishaan Github? "}]
# test if response cached
def test_caching():
try:
@ -36,6 +35,7 @@ def test_caching():
def test_caching_with_models():
litellm.caching_with_models = True
response1 = completion(model="gpt-3.5-turbo", messages=messages)
response2 = completion(model="gpt-3.5-turbo", messages=messages)
response3 = completion(model="command-nightly", messages=messages)
print(f"response2: {response2}")
@ -45,4 +45,32 @@ def test_caching_with_models():
# if models are different, it should not return cached response
print(f"response2: {response2}")
print(f"response3: {response3}")
pytest.fail(f"Error occurred: {e}")
pytest.fail(f"Error occurred:")
if response1 != response2:
print(f"response1: {response1}")
print(f"response2: {response2}")
pytest.fail(f"Error occurred:")
# test_caching_with_models()
def test_gpt_cache():
# INIT GPT Cache #
from gptcache import cache
from litellm.cache import completion
cache.init()
cache.set_openai_key()
messages = [{"role": "user", "content": "what is litellm YC 22?"}]
response2 = completion(model="gpt-3.5-turbo", messages=messages)
response3 = completion(model="command-nightly", messages=messages)
print(f"response2: {response2}")
print(f"response3: {response3}")
if response3['choices'] != response2['choices']:
# if models are different, it should not return cached response
print(f"response2: {response2}")
print(f"response3: {response3}")
pytest.fail(f"Error occurred:")
# test_gpt_cache()
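# Note on the pattern above (a sketch of the intent, not a spec): the gptcache integration swaps in
# the cache-aware wrapper via `from litellm.cache import completion` in place of the usual
# `from litellm import completion`; after cache.init() and cache.set_openai_key(), repeated
# identical prompts are expected to be served from the cache instead of hitting the provider again.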

View file

@ -10,9 +10,7 @@ sys.path.insert(
) # Adds the parent directory to the system path
import pytest
import litellm
from litellm import embedding, completion
litellm.debugger = True
from litellm import embedding, completion, text_completion
# from infisical import InfisicalClient
@ -144,6 +142,17 @@ def test_completion_openai():
except Exception as e:
pytest.fail(f"Error occurred: {e}")
def test_completion_openai_prompt():
try:
response = text_completion(model="gpt-3.5-turbo", prompt="What's the weather in SF?")
response_str = response["choices"][0]["message"]["content"]
response_str_2 = response.choices[0].message.content
print(response)
assert response_str == response_str_2
assert type(response_str) == str
assert len(response_str) > 1
except Exception as e:
pytest.fail(f"Error occurred: {e}")
def test_completion_text_openai():
try:

View file

@ -0,0 +1,86 @@
# import sys, os
# import traceback
# from dotenv import load_dotenv
# load_dotenv()
# import os
# sys.path.insert(
# 0, os.path.abspath("../..")
# ) # Adds the parent directory to the system path
# import pytest
# import litellm
# from litellm import completion_with_retries
# from litellm import (
# AuthenticationError,
# InvalidRequestError,
# RateLimitError,
# ServiceUnavailableError,
# OpenAIError,
# )
# user_message = "Hello, whats the weather in San Francisco??"
# messages = [{"content": user_message, "role": "user"}]
# def logger_fn(user_model_dict):
# # print(f"user_model_dict: {user_model_dict}")
# pass
# # normal call
# def test_completion_custom_provider_model_name():
# try:
# response = completion_with_retries(
# model="together_ai/togethercomputer/llama-2-70b-chat",
# messages=messages,
# logger_fn=logger_fn,
# )
# # Add any assertions here to check the response
# print(response)
# except Exception as e:
# pytest.fail(f"Error occurred: {e}")
# # bad call
# # def test_completion_custom_provider_model_name():
# # try:
# # response = completion_with_retries(
# # model="bad-model",
# # messages=messages,
# # logger_fn=logger_fn,
# # )
# # # Add any assertions here to check the response
# # print(response)
# # except Exception as e:
# # pytest.fail(f"Error occurred: {e}")
# # impact on exception mapping
# def test_context_window():
# sample_text = "how does a court case get to the Supreme Court?" * 5000
# messages = [{"content": sample_text, "role": "user"}]
# try:
# model = "chatgpt-test"
# response = completion_with_retries(
# model=model,
# messages=messages,
# custom_llm_provider="azure",
# logger_fn=logger_fn,
# )
# print(f"response: {response}")
# except InvalidRequestError as e:
# print(f"InvalidRequestError: {e.llm_provider}")
# return
# except OpenAIError as e:
# print(f"OpenAIError: {e.llm_provider}")
# return
# except Exception as e:
# print("Uncaught Error in test_context_window")
# print(f"Error Type: {type(e).__name__}")
# print(f"Uncaught Exception - {e}")
# pytest.fail(f"Error occurred: {e}")
# return
# test_context_window()
# test_completion_custom_provider_model_name()

View file

@ -9,7 +9,7 @@ import litellm
from litellm import embedding, completion
from infisical import InfisicalClient
# # litellm.set_verbose = True
litellm.set_verbose = True
# litellm.secret_manager_client = InfisicalClient(token=os.environ["INFISICAL_TOKEN"])
@ -19,6 +19,7 @@ def test_openai_embedding():
model="text-embedding-ada-002", input=["good morning from litellm"]
)
# Add any assertions here to check the response
print(f"response: {str(response)}")
# print(f"response: {str(response)}")
except Exception as e:
pytest.fail(f"Error occurred: {e}")
test_openai_embedding()

View file

@ -0,0 +1,42 @@
#### What this tests ####
# This tests if logging to the llmonitor integration actually works
# Adds the parent directory to the system path
import sys
import os
sys.path.insert(0, os.path.abspath('../..'))
from litellm import completion, embedding
import litellm
litellm.success_callback = ["llmonitor"]
litellm.failure_callback = ["llmonitor"]
litellm.set_verbose = True
def test_chat_openai():
try:
response = completion(model="gpt-3.5-turbo",
messages=[{
"role": "user",
"content": "Hi 👋 - i'm openai"
}])
print(response)
except Exception as e:
print(e)
def test_embedding_openai():
try:
response = embedding(model="text-embedding-ada-002", input=['test'])
# Add any assertions here to check the response
print(f"response: {str(response)[:50]}")
except Exception as e:
print(e)
test_chat_openai()
test_embedding_openai()

View file

@ -0,0 +1,17 @@
#### What this tests ####
# This tests the model alias mapping - if a user passes in a model alias that has a mapping set, it is resolved to the actual model value
import sys, os
import traceback
sys.path.insert(
0, os.path.abspath("../..")
) # Adds the parent directory to the system path
import litellm
from litellm import embedding, completion
litellm.set_verbose = True
# Test: Check if the alias created via LiteDebugger is mapped correctly
{"top_p": 0.75, "prompt": "What's the meaning of life?", "num_beams": 4, "temperature": 0.1}
print(completion("llama2", messages=[{"role": "user", "content": "Hey, how's it going?"}], top_p=0.1, temperature=0, num_beams=4, max_tokens=60))

View file

@ -0,0 +1,31 @@
# #### What this tests ####
# # This tests if logging to the llmonitor integration actually works
# # Adds the parent directory to the system path
# import sys
# import os
# sys.path.insert(0, os.path.abspath('../..'))
# from litellm import completion, embedding
# import litellm
# litellm.success_callback = ["promptlayer"]
# litellm.set_verbose = True
# def test_chat_openai():
# try:
# response = completion(model="gpt-3.5-turbo",
# messages=[{
# "role": "user",
# "content": "Hi 👋 - i'm openai"
# }])
# print(response)
# except Exception as e:
# print(e)
# # test_chat_openai()

View file

@ -3,19 +3,20 @@
import sys, os
import traceback
import time
sys.path.insert(
0, os.path.abspath("../..")
) # Adds the parent directory to the system path
import litellm
from litellm import completion
litellm.logging = False
litellm.set_verbose = False
score = 0
def logger_fn(model_call_object: dict):
return
print(f"model call details: {model_call_object}")
@ -23,29 +24,41 @@ user_message = "Hello, how are you?"
messages = [{"content": user_message, "role": "user"}]
# test on openai completion call
try:
response = completion(
model="gpt-3.5-turbo", messages=messages, stream=True, logger_fn=logger_fn
)
for chunk in response:
print(chunk["choices"][0]["delta"])
score += 1
except:
print(f"error occurred: {traceback.format_exc()}")
pass
# try:
# response = completion(
# model="gpt-3.5-turbo", messages=messages, stream=True, logger_fn=logger_fn
# )
# complete_response = ""
# start_time = time.time()
# for chunk in response:
# chunk_time = time.time()
# print(f"time since initial request: {chunk_time - start_time:.5f}")
# print(chunk["choices"][0]["delta"])
# complete_response += chunk["choices"][0]["delta"]["content"]
# if complete_response == "":
# raise Exception("Empty response received")
# except:
# print(f"error occurred: {traceback.format_exc()}")
# pass
# test on azure completion call
try:
response = completion(
model="azure/chatgpt-test", messages=messages, stream=True, logger_fn=logger_fn
)
for chunk in response:
print(chunk["choices"][0]["delta"])
score += 1
except:
print(f"error occurred: {traceback.format_exc()}")
pass
# # test on azure completion call
# try:
# response = completion(
# model="azure/chatgpt-test", messages=messages, stream=True, logger_fn=logger_fn
# )
# response = ""
# start_time = time.time()
# for chunk in response:
# chunk_time = time.time()
# print(f"time since initial request: {chunk_time - start_time:.2f}")
# print(chunk["choices"][0]["delta"])
# response += chunk["choices"][0]["delta"]
# if response == "":
# raise Exception("Empty response received")
# except:
# print(f"error occurred: {traceback.format_exc()}")
# pass
# test on anthropic completion call
@ -53,9 +66,15 @@ try:
response = completion(
model="claude-instant-1", messages=messages, stream=True, logger_fn=logger_fn
)
complete_response = ""
start_time = time.time()
for chunk in response:
chunk_time = time.time()
print(f"time since initial request: {chunk_time - start_time:.5f}")
print(chunk["choices"][0]["delta"])
score += 1
complete_response += chunk["choices"][0]["delta"]["content"]
if complete_response == "":
raise Exception("Empty response received")
except:
print(f"error occurred: {traceback.format_exc()}")
pass
@ -63,17 +82,110 @@ except:
# # test on huggingface completion call
# try:
# start_time = time.time()
# response = completion(
# model="meta-llama/Llama-2-7b-chat-hf",
# messages=messages,
# custom_llm_provider="huggingface",
# custom_api_base="https://s7c7gytn18vnu4tw.us-east-1.aws.endpoints.huggingface.cloud",
# stream=True,
# logger_fn=logger_fn,
# model="gpt-3.5-turbo", messages=messages, stream=True, logger_fn=logger_fn
# )
# complete_response = ""
# for chunk in response:
# chunk_time = time.time()
# print(f"time since initial request: {chunk_time - start_time:.2f}")
# print(chunk["choices"][0]["delta"])
# score += 1
# complete_response += chunk["choices"][0]["delta"]["content"] if len(chunk["choices"][0]["delta"].keys()) > 0 else ""
# if complete_response == "":
# raise Exception("Empty response received")
# except:
# print(f"error occurred: {traceback.format_exc()}")
# pass
# test on together ai completion call - replit-code-3b
try:
start_time = time.time()
response = completion(
model="Replit-Code-3B", messages=messages, logger_fn=logger_fn, stream= True
)
complete_response = ""
print(f"returned response object: {response}")
for chunk in response:
chunk_time = time.time()
print(f"time since initial request: {chunk_time - start_time:.2f}")
print(chunk["choices"][0]["delta"])
complete_response += chunk["choices"][0]["delta"]["content"] if len(chunk["choices"][0]["delta"].keys()) > 0 else ""
if complete_response == "":
raise Exception("Empty response received")
except:
print(f"error occurred: {traceback.format_exc()}")
pass
# test on together ai completion call - starcoder
try:
start_time = time.time()
response = completion(
model="together_ai/bigcode/starcoder", messages=messages, logger_fn=logger_fn, stream= True
)
complete_response = ""
print(f"returned response object: {response}")
for chunk in response:
chunk_time = time.time()
complete_response += chunk["choices"][0]["delta"]["content"] if len(chunk["choices"][0]["delta"].keys()) > 0 else ""
if len(complete_response) > 0:
print(complete_response)
if complete_response == "":
raise Exception("Empty response received")
except:
print(f"error occurred: {traceback.format_exc()}")
pass
# # test on azure completion call
# try:
# response = completion(
# model="azure/chatgpt-test", messages=messages, stream=True, logger_fn=logger_fn
# )
# response = ""
# for chunk in response:
# chunk_time = time.time()
# print(f"time since initial request: {chunk_time - start_time:.2f}")
# print(chunk["choices"][0]["delta"])
# response += chunk["choices"][0]["delta"]
# if response == "":
# raise Exception("Empty response received")
# except:
# print(f"error occurred: {traceback.format_exc()}")
# pass
# # test on anthropic completion call
# try:
# response = completion(
# model="claude-instant-1", messages=messages, stream=True, logger_fn=logger_fn
# )
# response = ""
# for chunk in response:
# chunk_time = time.time()
# print(f"time since initial request: {chunk_time - start_time:.2f}")
# print(chunk["choices"][0]["delta"])
# response += chunk["choices"][0]["delta"]
# if response == "":
# raise Exception("Empty response received")
# except:
# print(f"error occurred: {traceback.format_exc()}")
# pass
# # # test on huggingface completion call
# # try:
# # response = completion(
# # model="meta-llama/Llama-2-7b-chat-hf",
# # messages=messages,
# # custom_llm_provider="huggingface",
# # custom_api_base="https://s7c7gytn18vnu4tw.us-east-1.aws.endpoints.huggingface.cloud",
# # stream=True,
# # logger_fn=logger_fn,
# # )
# # for chunk in response:
# # print(chunk["choices"][0]["delta"])
# # score += 1
# # except:
# # print(f"error occurred: {traceback.format_exc()}")
# # pass

File diff suppressed because it is too large

View file

@ -16,6 +16,7 @@ nav:
- 💾 Callbacks - Logging Output:
- Quick Start: advanced.md
- Output Integrations: client_integrations.md
- LLMonitor Tutorial: llmonitor_integration.md
- Helicone Tutorial: helicone_integration.md
- Supabase Tutorial: supabase_integration.md
- BerriSpend Tutorial: berrispend_integration.md

36
poetry.lock generated
View file

@ -338,6 +338,25 @@ files = [
{file = "idna-3.4.tar.gz", hash = "sha256:814f528e8dead7d329833b91c5faa87d60bf71824cd12a7530b5526063d02cb4"},
]
[[package]]
name = "importlib-metadata"
version = "6.8.0"
description = "Read metadata from Python packages"
optional = false
python-versions = ">=3.8"
files = [
{file = "importlib_metadata-6.8.0-py3-none-any.whl", hash = "sha256:3ebb78df84a805d7698245025b975d9d67053cd94c79245ba4b3eb694abe68bb"},
{file = "importlib_metadata-6.8.0.tar.gz", hash = "sha256:dbace7892d8c0c4ac1ad096662232f831d4e64f4c4545bd53016a3e9d4654743"},
]
[package.dependencies]
zipp = ">=0.5"
[package.extras]
docs = ["furo", "jaraco.packaging (>=9)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx (>=3.5)", "sphinx-lint"]
perf = ["ipython"]
testing = ["flufl.flake8", "importlib-resources (>=1.3)", "packaging", "pyfakefs", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-mypy (>=0.9.1)", "pytest-perf (>=0.9.2)", "pytest-ruff"]
[[package]]
name = "multidict"
version = "6.0.4"
@ -744,7 +763,22 @@ files = [
idna = ">=2.0"
multidict = ">=4.0"
[[package]]
name = "zipp"
version = "3.16.2"
description = "Backport of pathlib-compatible object wrapper for zip files"
optional = false
python-versions = ">=3.8"
files = [
{file = "zipp-3.16.2-py3-none-any.whl", hash = "sha256:679e51dd4403591b2d6838a48de3d283f3d188412a9782faadf845f298736ba0"},
{file = "zipp-3.16.2.tar.gz", hash = "sha256:ebc15946aa78bd63458992fc81ec3b6f7b1e92d51c35e6de1c3804e73b799147"},
]
[package.extras]
docs = ["furo", "jaraco.packaging (>=9.3)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx (>=3.5)", "sphinx-lint"]
testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=2.2)", "pytest-ignore-flaky", "pytest-mypy (>=0.9.1)", "pytest-ruff"]
[metadata]
lock-version = "2.0"
python-versions = "^3.8"
content-hash = "fe7d88d91250950917244f8a6ffc8eba7bfc9aa84314ed6498131172ae4ef3cf"
content-hash = "de77e77aaa3ed490ffa159387c8e70e43361d78b095a975fec950336e54758e6"

View file

@ -1,6 +1,7 @@
# liteLLM Proxy Server: 50+ LLM Models, Error Handling, Caching
### Azure, Llama2, OpenAI, Claude, Hugging Face, Replicate Models
[![PyPI Version](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/)
[![PyPI Version](https://img.shields.io/badge/stable%20version-v0.1.345-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/)](https://pypi.org/project/litellm/0.1.1/)
![Downloads](https://img.shields.io/pypi/dm/litellm)
@ -11,34 +12,36 @@
![4BC6491E-86D0-4833-B061-9F54524B2579](https://github.com/BerriAI/litellm/assets/17561003/f5dd237b-db5e-42e1-b1ac-f05683b1d724)
## What does liteLLM proxy do
- Make `/chat/completions` requests for 50+ LLM models **Azure, OpenAI, Replicate, Anthropic, Hugging Face**
Example: for `model` use `claude-2`, `gpt-3.5`, `gpt-4`, `command-nightly`, `stabilityai/stablecode-completion-alpha-3b-4k`
```json
{
  "model": "replicate/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1",
  "messages": [
    {
      "content": "Hello, whats the weather in San Francisco??",
      "role": "user"
    }
  ]
}
```
- **Consistent Input/Output** Format
  - Call all models using the OpenAI format - `completion(model, messages)`
  - Text responses will always be available at `['choices'][0]['message']['content']`
- **Error Handling** Using Model Fallbacks (if `GPT-4` fails, try `llama2`)
- **Logging** - Log Requests, Responses and Errors to `Supabase`, `Posthog`, `Mixpanel`, `Sentry`, `LLMonitor`, `Helicone` (any of the supported providers here: https://litellm.readthedocs.io/en/latest/advanced/)

**Example: Logs sent to Supabase**
<img width="1015" alt="Screenshot 2023-08-11 at 4 02 46 PM" src="https://github.com/ishaan-jaff/proxy-server/assets/29436595/237557b8-ba09-4917-982c-8f3e1b2c8d08">
- **Token Usage & Spend** - Track Input + Completion tokens used + Spend/model
- **Caching** - Implementation of Semantic Caching
- **Streaming & Async Support** - Return generators to stream text responses (see the streaming sketch below)
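A minimal streaming sketch against a locally running proxy (the URL matches the request example further down; the exact chunk format the proxy streams back is an assumption here, not a spec):

```python
import requests

url = "http://localhost:5000/chat/completions"  # same local URL as the request example below
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"content": "Hello, whats the weather in San Francisco??", "role": "user"}],
    "stream": True,
}

# stream=True keeps the HTTP connection open; chunks print as the proxy returns them
with requests.post(url, json=payload, stream=True) as response:
    for line in response.iter_lines():
        if line:
            print(line.decode("utf-8"))
```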
## API Endpoints
### `/chat/completions` (POST)
@ -46,34 +49,37 @@
This endpoint is used to generate chat completions for 50+ supported LLM API Models. Use llama2, GPT-4, Claude2 etc.
#### Input
This API endpoint accepts all inputs in raw JSON and expects the following inputs
- `model` (string, required): ID of the model to use for chat completions. See all supported models [here](https://litellm.readthedocs.io/en/latest/supported/):
  eg `gpt-3.5-turbo`, `gpt-4`, `claude-2`, `command-nightly`, `stabilityai/stablecode-completion-alpha-3b-4k`
- `messages` (array, required): A list of messages representing the conversation context. Each message should have a `role` (system, user, assistant, or function), `content` (message text), and `name` (for function role).
- Additional Optional parameters: `temperature`, `functions`, `function_call`, `top_p`, `n`, `stream`. See the full list of supported inputs here: https://litellm.readthedocs.io/en/latest/input/
#### Example JSON body
For claude-2
```json
{
  "model": "claude-2",
  "messages": [
    {
      "content": "Hello, whats the weather in San Francisco??",
      "role": "user"
    }
  ]
}
```
### Making an API request to the Proxy Server
```python
import requests
import json
# TODO: use your URL
url = "http://localhost:5000/chat/completions"
payload = json.dumps({
@ -94,34 +100,38 @@ print(response.text)
```
### Output [Response Format]
Responses from the server are given in the following format.
All responses from the server are returned in the following format (for all LLM models). More info on output here: https://litellm.readthedocs.io/en/latest/output/
```json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "I'm sorry, but I don't have the capability to provide real-time weather information. However, you can easily check the weather in San Francisco by searching online or using a weather app on your phone.",
        "role": "assistant"
      }
    }
  ],
  "created": 1691790381,
  "id": "chatcmpl-7mUFZlOEgdohHRDx2UpYPRTejirzb",
  "model": "gpt-3.5-turbo-0613",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 41,
    "prompt_tokens": 16,
    "total_tokens": 57
  }
}
```
## Installation & Usage
### Running Locally
1. Clone liteLLM repository to your local machine:
```
git clone https://github.com/BerriAI/liteLLM-proxy
@ -141,24 +151,24 @@ All responses from the server are returned in the following format (for all LLM
python main.py
```
## Deploying
1. Quick Start: Deploy on Railway
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/template/DYqQAW?referralCode=t3ukrU)
2. `GCP`, `AWS`, `Azure`
This project includes a `Dockerfile` allowing you to build and deploy a Docker Project on your providers
# Support / Talk with founders
- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
## Roadmap
- [ ] Support hosted db (e.g. Supabase)
- [ ] Easily send data to places like posthog and sentry.
- [ ] Add a hot-cache for project spend logs - enables fast checks for user + project limits

View file

@ -1,6 +1,6 @@
[tool.poetry]
name = "litellm"
version = "0.1.457"
version = "0.1.494"
description = "Library to easily interface with LLM API providers"
authors = ["BerriAI"]
license = "MIT License"

View file

@ -1 +0,0 @@