Merge branch 'main' into litellm_aioboto3_sagemaker

Krish Dholakia 2024-02-14 21:46:58 -08:00 committed by GitHub
commit 57654f4533
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
79 changed files with 3440 additions and 253 deletions


@ -34,6 +34,8 @@ LiteLLM manages:
[**Jump to OpenAI Proxy Docs**](https://github.com/BerriAI/litellm?tab=readme-ov-file#openai-proxy---docs) <br>
[**Jump to Supported LLM Providers**](https://github.com/BerriAI/litellm?tab=readme-ov-file#supported-provider-docs)
Support for more providers is ongoing. Missing a provider or LLM platform? Raise a [feature request](https://github.com/BerriAI/litellm/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml&title=%5BFeature%5D%3A+).
# Usage ([**Docs**](https://docs.litellm.ai/docs/))
> [!IMPORTANT]
> LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide [here](https://docs.litellm.ai/docs/migration)
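The quick-start call for this section falls outside the hunk; as a minimal sketch (the model name and key below are placeholders, not values from this commit):
```python
from litellm import completion
import os

# set the API key for whichever provider you call
os.environ["OPENAI_API_KEY"] = "your-openai-key"

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello from litellm"}],
)
print(response)
```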


@ -1,5 +1,10 @@
# Custom Callbacks
:::info
**For PROXY** [Go Here](../proxy/logging.md#custom-callback-class-async)
:::
## Callback Class
You can create a custom callback class to precisely log events as they occur in litellm.
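The class itself appears further down this page in `litellm/integrations/custom_logger.py`; a minimal sketch of wiring one up (handler and model names are illustrative):
```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class MyCustomHandler(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On success:", response_obj)

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print("On failure")

# register the handler; subsequent completion calls are logged through it
litellm.callbacks = [MyCustomHandler()]

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi"}],
)
```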


@ -79,6 +79,23 @@ model_list:
mode: embedding # 👈 ADD THIS
```
### Image Generation Models
The health check needs to know when a model is an image generation model. Set `mode: image_generation` for it in your config, and the health check will send an image generation request to that model:
```yaml
model_list:
- model_name: dall-e-3
litellm_params:
model: azure/dall-e-3
api_base: os.environ/AZURE_API_BASE
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
model_info:
mode: image_generation # 👈 ADD THIS
```
### Text Completion Models
The health check needs to know when a model is a text completion model. Set the mode for it in your config, and the health check will send a text completion request to that model instead of an embedding request.
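A sketch of that config, assuming the same layout as the embedding and image generation examples above and `mode: completion` for text completion models (the deployment names are placeholders):
```yaml
model_list:
  - model_name: azure-text-completion
    litellm_params:
      model: azure/text-davinci-003
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
    model_info:
      mode: completion # 👈 ADD THIS
```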


@ -4,21 +4,24 @@ import Image from '@theme/IdealImage';
LiteLLM supports [Microsoft Presidio](https://github.com/microsoft/presidio/) for PII masking.
## Step 1. Add env
## Quick Start
### Step 1. Add env
```bash
export PRESIDIO_ANALYZER_API_BASE="http://localhost:5002"
export PRESIDIO_ANONYMIZER_API_BASE="http://localhost:5001"
```
## Step 2. Set it as a callback in config.yaml
### Step 2. Set it as a callback in config.yaml
```yaml
litellm_settings:
litellm.callbacks = ["presidio"]
callbacks = ["presidio", ...] # e.g. ["presidio", custom_callbacks.proxy_handler_instance]
```
## Start proxy
### Step 3. Start proxy
```
litellm --config /path/to/config.yaml
@ -27,4 +30,28 @@ litellm --config /path/to/config.yaml
This will mask the input going to the llm provider
<Image img={require('../../img/presidio_screenshot.png')} />
<Image img={require('../../img/presidio_screenshot.png')} />
## Output parsing
LLM responses can sometimes contain the masked tokens.
For presidio 'replace' operations, LiteLLM can check the LLM response and replace the masked token with the user-submitted values.
Just set `litellm.output_parse_pii = True` to enable this.
```yaml
litellm_settings:
output_parse_pii: true
```
**Expected Flow:**
1. User Input: "hello world, my name is Jane Doe. My number is: 034453334"
2. LLM Input: "hello world, my name is [PERSON]. My number is: [PHONE_NUMBER]"
3. LLM Response: "Hey [PERSON], nice to meet you!"
4. User Response: "Hey Jane Doe, nice to meet you!"
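Conceptually, the output-parsing step is a lookup table of masked tokens to the user-submitted values, applied to the response text; a simplified sketch (not the proxy's actual hook, values are illustrative):
```python
# mapping built while masking the input (masked token -> original value)
pii_tokens = {"[PERSON]": "Jane Doe", "[PHONE_NUMBER]": "034453334"}

llm_response = "Hey [PERSON], nice to meet you!"
for masked_token, original_value in pii_tokens.items():
    llm_response = llm_response.replace(masked_token, original_value)

print(llm_response)  # Hey Jane Doe, nice to meet you!
```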


@ -370,12 +370,12 @@ See the latest available ghcr docker image here:
https://github.com/berriai/litellm/pkgs/container/litellm
```shell
docker pull ghcr.io/berriai/litellm:main-v1.16.13
docker pull ghcr.io/berriai/litellm:main-latest
```
### Run the Docker Image
```shell
docker run ghcr.io/berriai/litellm:main-v1.16.13
docker run ghcr.io/berriai/litellm:main-latest
```
#### Run the Docker Image with LiteLLM CLI args
@ -384,12 +384,12 @@ See all supported CLI args [here](https://docs.litellm.ai/docs/proxy/cli):
Here's how you can run the docker image and pass your config to `litellm`
```shell
docker run ghcr.io/berriai/litellm:main-v1.16.13 --config your_config.yaml
docker run ghcr.io/berriai/litellm:main-latest --config your_config.yaml
```
Here's how you can run the docker image and start litellm on port 8002 with `num_workers=8`
```shell
docker run ghcr.io/berriai/litellm:main-v1.16.13 --port 8002 --num_workers 8
docker run ghcr.io/berriai/litellm:main-latest --port 8002 --num_workers 8
```
#### Run the Docker Image using docker compose
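The compose file itself is outside this hunk; a minimal sketch, assuming the `main-latest` image and a config mounted into the container (paths and port are placeholders):
```yaml
version: "3.9"
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "4000:4000"                              # expose the proxy
    volumes:
      - ./litellm_config.yaml:/app/config.yaml   # mount your config
    command: ["--config", "/app/config.yaml", "--port", "4000"]
```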


@ -37,12 +37,12 @@ http://0.0.0.0:8000/ui # <proxy_base_url>/ui
```
## Get Admin UI Link on Swagger
### 3. Get Admin UI Link on Swagger
Your Proxy Swagger UI is available at the root of the Proxy, e.g. `http://localhost:4000/`
<Image img={require('../../img/ui_link.png')} />
## Change default username + password
### 4. Change default username + password
Set the following in your .env on the Proxy
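The variables themselves are outside this hunk; assuming the proxy's `UI_USERNAME` / `UI_PASSWORD` settings, the entries look like:
```shell
UI_USERNAME=admin            # username to sign in on the UI
UI_PASSWORD=your-ui-password # password to sign in on the UI
```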
@ -111,6 +111,29 @@ MICROSOFT_TENANT="5a39737
</TabItem>
<TabItem value="Generic" label="Generic SSO Provider">
A generic OAuth client that can be used to quickly create support for any OAuth provider with close to no code
**Required .env variables on your Proxy**
```shell
GENERIC_CLIENT_ID = "******"
GENERIC_CLIENT_SECRET = "G*******"
GENERIC_AUTHORIZATION_ENDPOINT = "http://localhost:9090/auth"
GENERIC_TOKEN_ENDPOINT = "http://localhost:9090/token"
GENERIC_USERINFO_ENDPOINT = "http://localhost:9090/me"
```
- Set a redirect URI, if your provider requires it
- Set the redirect URL to `<your proxy base url>/sso/callback`
```shell
http://localhost:4000/sso/callback
```
</TabItem>
</Tabs>
### Step 3. Test flow


@ -197,7 +197,7 @@ from openai import OpenAI
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")
response = openai.embeddings.create(
response = client.embeddings.create(
input=["hello from litellm"],
model="text-embedding-ada-002"
)
@ -281,6 +281,84 @@ print(query_result[:5])
```
## `/moderations`
### Request Format
Input, Output and Exceptions are mapped to the OpenAI format for all supported models
<Tabs>
<TabItem value="openai" label="OpenAI Python v1.0.0+">
```python
import openai
from openai import OpenAI
# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")
response = client.moderations.create(
input="hello from litellm",
model="text-moderation-stable"
)
print(response)
```
</TabItem>
<TabItem value="Curl" label="Curl Request">
```shell
curl --location 'http://0.0.0.0:8000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
</TabItem>
</Tabs>
### Response Format
```json
{
"id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
"model": "text-moderation-007",
"results": [
{
"categories": {
"harassment": false,
"harassment/threatening": false,
"hate": false,
"hate/threatening": false,
"self-harm": false,
"self-harm/instructions": false,
"self-harm/intent": false,
"sexual": false,
"sexual/minors": false,
"violence": false,
"violence/graphic": false
},
"category_scores": {
"harassment": 0.000019947197870351374,
"harassment/threatening": 5.5971017900446896e-6,
"hate": 0.000028560316422954202,
"hate/threatening": 2.2631787999216613e-8,
"self-harm": 2.9121162015144364e-7,
"self-harm/instructions": 9.314219084899378e-8,
"self-harm/intent": 8.093739012338119e-8,
"sexual": 0.00004414955765241757,
"sexual/minors": 0.0000156943697220413,
"violence": 0.00022354527027346194,
"violence/graphic": 8.804164281173144e-6
},
"flagged": false
}
]
}
```
## Advanced


@ -696,7 +696,9 @@ general_settings:
"region_name": "us-west-2"
"user_table_name": "your-user-table",
"key_table_name": "your-token-table",
"config_table_name": "your-config-table"
"config_table_name": "your-config-table",
"aws_role_name": "your-aws_role_name",
"aws_session_name": "your-aws_session_name",
}
```


@ -67,6 +67,7 @@ max_budget: float = 0.0 # set the max budget across all providers
budget_duration: Optional[
str
] = None # proxy only - resets budget after fixed duration. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
_openai_finish_reasons = ["stop", "length", "function_call", "content_filter", "null"]
_openai_completion_params = [
"functions",
"function_call",
@ -164,6 +165,8 @@ secret_manager_client: Optional[
] = None # list of instantiated key management clients - e.g. azure kv, infisical, etc.
_google_kms_resource_name: Optional[str] = None
_key_management_system: Optional[KeyManagementSystem] = None
#### PII MASKING ####
output_parse_pii: bool = False
#############################################


@ -675,6 +675,9 @@ class S3Cache(BaseCache):
def flush_cache(self):
pass
async def disconnect(self):
pass
class DualCache(BaseCache):
"""


@ -2,9 +2,11 @@
# On success, logs events to Promptlayer
import dotenv, os
import requests
from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
from typing import Literal
from typing import Literal, Union
dotenv.load_dotenv() # Loading env variables using dotenv
import traceback
@ -54,7 +56,7 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
user_api_key_dict: UserAPIKeyAuth,
cache: DualCache,
data: dict,
call_type: Literal["completion", "embeddings"],
call_type: Literal["completion", "embeddings", "image_generation"],
):
pass
@ -63,6 +65,13 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
):
pass
async def async_post_call_success_hook(
self,
user_api_key_dict: UserAPIKeyAuth,
response,
):
pass
#### SINGLE-USE #### - https://docs.litellm.ai/docs/observability/custom_callback#using-your-custom-callback-function
def log_input_event(self, model, messages, kwargs, print_verbose, callback_func):


@ -477,8 +477,8 @@ def init_bedrock_client(
def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
# handle anthropic prompts using anthropic constants
if provider == "anthropic":
# handle anthropic prompts and amazon titan prompts
if provider == "anthropic" or provider == "amazon":
if model in custom_prompt_dict:
# check if the model has a registered custom prompt
model_prompt_details = custom_prompt_dict[model]
@ -490,7 +490,7 @@ def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
)
else:
prompt = prompt_factory(
model=model, messages=messages, custom_llm_provider="anthropic"
model=model, messages=messages, custom_llm_provider="bedrock"
)
else:
prompt = ""
@ -623,6 +623,7 @@ def completion(
"textGenerationConfig": inference_params,
}
)
else:
data = json.dumps({})


@ -90,9 +90,11 @@ def ollama_pt(
return {"prompt": prompt, "images": images}
else:
prompt = "".join(
m["content"]
if isinstance(m["content"], str) is str
else "".join(m["content"])
(
m["content"]
if isinstance(m["content"], str) is str
else "".join(m["content"])
)
for m in messages
)
return prompt
@ -422,6 +424,34 @@ def anthropic_pt(
return prompt
def amazon_titan_pt(
messages: list,
): # format - https://github.com/BerriAI/litellm/issues/1896
"""
Amazon Titan uses 'User:' and 'Bot:' in its prompt template
"""
class AmazonTitanConstants(Enum):
HUMAN_PROMPT = "\n\nUser: " # Assuming this is similar to Anthropic prompt formatting, since amazon titan's prompt formatting is currently undocumented
AI_PROMPT = "\n\nBot: "
prompt = ""
for idx, message in enumerate(messages):
if message["role"] == "user":
prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}{message['content']}"
elif message["role"] == "system":
prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}<admin>{message['content']}</admin>"
else:
prompt += f"{AmazonTitanConstants.AI_PROMPT.value}{message['content']}"
if (
idx == 0 and message["role"] == "assistant"
): # ensure the prompt always starts with `\n\nUser: `
prompt = f"{AmazonTitanConstants.HUMAN_PROMPT.value}" + prompt
if messages[-1]["role"] != "assistant":
prompt += f"{AmazonTitanConstants.AI_PROMPT.value}"
return prompt
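# Illustrative example (not part of this diff): for
#   messages = [{"role": "system", "content": "Be brief."}, {"role": "user", "content": "Hi"}]
# amazon_titan_pt returns
#   "\n\nUser: <admin>Be brief.</admin>\n\nUser: Hi\n\nBot: "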
def _load_image_from_url(image_url):
try:
from PIL import Image
@ -636,6 +666,14 @@ def prompt_factory(
return gemini_text_image_pt(messages=messages)
elif custom_llm_provider == "mistral":
return mistral_api_pt(messages=messages)
elif custom_llm_provider == "bedrock":
if "amazon.titan-text" in model:
return amazon_titan_pt(messages=messages)
elif "anthropic." in model:
if any(_ in model for _ in ["claude-2.1", "claude-v2:1"]):
return claude_2_1_pt(messages=messages)
else:
return anthropic_pt(messages=messages)
try:
if "meta-llama/llama-2" in model and "chat" in model:
return llama_2_chat_pt(messages=messages)


@ -484,7 +484,7 @@ def embedding(
aws_access_key_id = optional_params.pop("aws_access_key_id", None)
aws_region_name = optional_params.pop("aws_region_name", None)
if aws_access_key_id != None:
if aws_access_key_id is not None:
# uses auth params passed to completion
# aws_access_key_id is not None, assume user is trying to auth using litellm.completion
client = boto3.client(


@ -4,7 +4,7 @@ from enum import Enum
import requests
import time
from typing import Callable, Optional, Union
from litellm.utils import ModelResponse, Usage, CustomStreamWrapper
from litellm.utils import ModelResponse, Usage, CustomStreamWrapper, map_finish_reason
import litellm, uuid
import httpx
@ -575,9 +575,9 @@ def completion(
model_response["model"] = model
## CALCULATING USAGE
if model in litellm.vertex_language_models and response_obj is not None:
model_response["choices"][0].finish_reason = response_obj.candidates[
0
].finish_reason.name
model_response["choices"][0].finish_reason = map_finish_reason(
response_obj.candidates[0].finish_reason.name
)
usage = Usage(
prompt_tokens=response_obj.usage_metadata.prompt_token_count,
completion_tokens=response_obj.usage_metadata.candidates_token_count,
@ -771,9 +771,9 @@ async def async_completion(
model_response["model"] = model
## CALCULATING USAGE
if model in litellm.vertex_language_models and response_obj is not None:
model_response["choices"][0].finish_reason = response_obj.candidates[
0
].finish_reason.name
model_response["choices"][0].finish_reason = map_finish_reason(
response_obj.candidates[0].finish_reason.name
)
usage = Usage(
prompt_tokens=response_obj.usage_metadata.prompt_token_count,
completion_tokens=response_obj.usage_metadata.candidates_token_count,


@ -10,7 +10,6 @@
import os, openai, sys, json, inspect, uuid, datetime, threading
from typing import Any, Literal, Union
from functools import partial
import dotenv, traceback, random, asyncio, time, contextvars
from copy import deepcopy
import httpx
@ -2964,16 +2963,39 @@ def text_completion(
##### Moderation #######################
def moderation(input: str, api_key: Optional[str] = None):
def moderation(
input: str, model: Optional[str] = None, api_key: Optional[str] = None, **kwargs
):
# only supports open ai for now
api_key = (
api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
)
openai.api_key = api_key
openai.api_type = "open_ai" # type: ignore
openai.api_version = None
openai.base_url = "https://api.openai.com/v1/"
response = openai.moderations.create(input=input)
openai_client = kwargs.get("client", None)
if openai_client is None:
openai_client = openai.OpenAI(
api_key=api_key,
)
response = openai_client.moderations.create(input=input, model=model)
return response
##### Moderation #######################
@client
async def amoderation(input: str, model: str, api_key: Optional[str] = None, **kwargs):
# only supports open ai for now
api_key = (
api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
)
openai_client = kwargs.get("client", None)
if openai_client is None:
openai_client = openai.AsyncOpenAI(
api_key=api_key,
)
response = await openai_client.moderations.create(input=input, model=model)
return response
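# Illustrative usage (not part of this diff):
#   import asyncio, litellm
#   resp = asyncio.run(litellm.amoderation(input="hello world", model="text-moderation-stable"))
#   print(resp.results[0].flagged)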


@ -198,6 +198,33 @@
"litellm_provider": "openai",
"mode": "embedding"
},
"text-moderation-stable": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-007": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-latest": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"256-x-256/dall-e-2": {
"mode": "image_generation",
"input_cost_per_pixel": 0.00000024414,

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -1 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{11837:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=11837)}),_N_E=n.O()}]);
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{87421:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=87421)}),_N_E=n.O()}]);

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -1 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{70377:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(70377)}),_N_E=e.O()}]);
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{32028:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(32028)}),_N_E=e.O()}]);


@ -1 +1 @@
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/654259bbf9e4c196.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/c18941d97fb7245b.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@ -1 +1 @@
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-85c9b4219c1bb384.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-9b4acf26920649bc.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-096338c8e1915716.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/654259bbf9e4c196.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[75985,[\"838\",\"static/chunks/838-7fa0bab5a1c3631d.js\",\"931\",\"static/chunks/app/page-5a7453e3903c5d60.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/654259bbf9e4c196.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"4mrMigZY9ob7yaIDjXpX6\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-a85b2c176012d8e5.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-e1b183dda365ec86.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-9b4fb13a7db53edf.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/c18941d97fb7245b.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[48016,[\"145\",\"static/chunks/145-9c160ad5539e000f.js\",\"931\",\"static/chunks/app/page-fcb69349f15d154b.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/c18941d97fb7245b.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"lLFQRQnIrRo-GJf5spHEd\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>


@ -1,7 +1,7 @@
2:I[77831,[],""]
3:I[75985,["838","static/chunks/838-7fa0bab5a1c3631d.js","931","static/chunks/app/page-5a7453e3903c5d60.js"],""]
3:I[48016,["145","static/chunks/145-9c160ad5539e000f.js","931","static/chunks/app/page-fcb69349f15d154b.js"],""]
4:I[5613,[],""]
5:I[31778,[],""]
0:["4mrMigZY9ob7yaIDjXpX6",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/654259bbf9e4c196.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
0:["lLFQRQnIrRo-GJf5spHEd",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/c18941d97fb7245b.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null


@ -234,6 +234,15 @@ class DynamoDBArgs(LiteLLMBase):
key_table_name: str = "LiteLLM_VerificationToken"
config_table_name: str = "LiteLLM_Config"
spend_table_name: str = "LiteLLM_SpendLogs"
aws_role_name: Optional[str] = None
aws_session_name: Optional[str] = None
aws_web_identity_token: Optional[str] = None
aws_provider_id: Optional[str] = None
aws_policy_arns: Optional[List[str]] = None
aws_policy: Optional[str] = None
aws_duration_seconds: Optional[int] = None
assume_role_aws_role_name: Optional[str] = None
assume_role_aws_session_name: Optional[str] = None
class ConfigGeneralSettings(LiteLLMBase):


@ -53,6 +53,41 @@ class DynamoDBWrapper(CustomDB):
self.database_arguments = database_arguments
self.region_name = database_arguments.region_name
def set_env_vars_based_on_arn(self):
if self.database_arguments.aws_role_name is None:
return
verbose_proxy_logger.debug(
f"DynamoDB: setting env vars based on arn={self.database_arguments.aws_role_name}"
)
import boto3, os
sts_client = boto3.client("sts")
# call 1
non_used_assumed_role = sts_client.assume_role_with_web_identity(
RoleArn=self.database_arguments.aws_role_name,
RoleSessionName=self.database_arguments.aws_session_name,
WebIdentityToken=self.database_arguments.aws_web_identity_token,
)
# call 2
assumed_role = sts_client.assume_role(
RoleArn=self.database_arguments.assume_role_aws_role_name,
RoleSessionName=self.database_arguments.assume_role_aws_session_name,
)
aws_access_key_id = assumed_role["Credentials"]["AccessKeyId"]
aws_secret_access_key = assumed_role["Credentials"]["SecretAccessKey"]
aws_session_token = assumed_role["Credentials"]["SessionToken"]
verbose_proxy_logger.debug(
f"Got STS assumed Role, aws_access_key_id={aws_access_key_id}"
)
# set these in the env so aiodynamo can use them
os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
os.environ["AWS_SESSION_TOKEN"] = aws_session_token
async def connect(self):
"""
Connect to DB, and creating / updating any tables
@ -75,6 +110,7 @@ class DynamoDBWrapper(CustomDB):
import aiohttp
verbose_proxy_logger.debug("DynamoDB Wrapper - Attempting to connect")
self.set_env_vars_based_on_arn()
# before making ClientSession check if ssl_verify=False
if self.database_arguments.ssl_verify == False:
client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
@ -171,6 +207,8 @@ class DynamoDBWrapper(CustomDB):
from aiohttp import ClientSession
import aiohttp
self.set_env_vars_based_on_arn()
if self.database_arguments.ssl_verify == False:
client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
else:
@ -214,6 +252,8 @@ class DynamoDBWrapper(CustomDB):
from aiohttp import ClientSession
import aiohttp
self.set_env_vars_based_on_arn()
if self.database_arguments.ssl_verify == False:
client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
else:
@ -261,6 +301,7 @@ class DynamoDBWrapper(CustomDB):
async def update_data(
self, key: str, value: dict, table_name: Literal["user", "key", "config"]
):
self.set_env_vars_based_on_arn()
from aiodynamo.client import Client
from aiodynamo.credentials import Credentials, StaticCredentials
from aiodynamo.http.httpx import HTTPX
@ -334,4 +375,5 @@ class DynamoDBWrapper(CustomDB):
"""
Not Implemented yet.
"""
self.set_env_vars_based_on_arn()
return super().delete_data(keys, table_name)


@ -8,14 +8,19 @@
# Tell us how we can improve! - Krrish & Ishaan
from typing import Optional
import litellm, traceback, sys
from typing import Optional, Literal, Union
import litellm, traceback, sys, uuid
from litellm.caching import DualCache
from litellm.proxy._types import UserAPIKeyAuth
from litellm.integrations.custom_logger import CustomLogger
from fastapi import HTTPException
from litellm._logging import verbose_proxy_logger
from litellm import ModelResponse
from litellm.utils import (
ModelResponse,
EmbeddingResponse,
ImageResponse,
StreamingChoices,
)
from datetime import datetime
import aiohttp, asyncio
@ -24,7 +29,13 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
user_api_key_cache = None
# Class variables or attributes
def __init__(self):
def __init__(self, mock_testing: bool = False):
self.pii_tokens: dict = (
{}
) # mapping of PII token to original text - only used with Presidio `replace` operation
if mock_testing == True: # for testing purposes only
return
self.presidio_analyzer_api_base = litellm.get_secret(
"PRESIDIO_ANALYZER_API_BASE", None
)
@ -51,12 +62,15 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
pass
async def check_pii(self, text: str) -> str:
"""
[TODO] make this more performant for high-throughput scenario
"""
try:
async with aiohttp.ClientSession() as session:
# Make the first request to /analyze
analyze_url = f"{self.presidio_analyzer_api_base}/analyze"
analyze_payload = {"text": text, "language": "en"}
redacted_text = None
async with session.post(analyze_url, json=analyze_payload) as response:
analyze_results = await response.json()
@ -72,6 +86,26 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
) as response:
redacted_text = await response.json()
new_text = text
if redacted_text is not None:
for item in redacted_text["items"]:
start = item["start"]
end = item["end"]
replacement = item["text"] # replacement token
if (
item["operator"] == "replace"
and litellm.output_parse_pii == True
):
# check if token in dict
# if exists, add a uuid to the replacement token for swapping back to the original text in llm response output parsing
if replacement in self.pii_tokens:
replacement = replacement + uuid.uuid4()
self.pii_tokens[replacement] = new_text[
start:end
] # get text it'll replace
new_text = new_text[:start] + replacement + new_text[end:]
return redacted_text["text"]
except Exception as e:
traceback.print_exc()
@ -94,6 +128,7 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
if call_type == "completion": # /chat/completions requests
messages = data["messages"]
tasks = []
for m in messages:
if isinstance(m["content"], str):
tasks.append(self.check_pii(text=m["content"]))
@ -104,3 +139,30 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
"content"
] = r # replace content with redacted string
return data
async def async_post_call_success_hook(
self,
user_api_key_dict: UserAPIKeyAuth,
response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
):
"""
Output parse the response object to replace the masked tokens with user sent values
"""
verbose_proxy_logger.debug(
f"PII Masking Args: litellm.output_parse_pii={litellm.output_parse_pii}; type of response={type(response)}"
)
if litellm.output_parse_pii == False:
return response
if isinstance(response, ModelResponse) and not isinstance(
response.choices[0], StreamingChoices
): # /chat/completions requests
if isinstance(response.choices[0].message.content, str):
verbose_proxy_logger.debug(
f"self.pii_tokens: {self.pii_tokens}; initial response: {response.choices[0].message.content}"
)
for key, value in self.pii_tokens.items():
response.choices[0].message.content = response.choices[
0
].message.content.replace(key, value)
return response


@ -570,6 +570,7 @@ def run_server(
"worker_class": "uvicorn.workers.UvicornWorker",
"preload": True, # Add the preload flag,
"accesslog": "-", # Log to stdout
"timeout": 600, # default to very high number, bedrock/anthropic.claude-v2:1 can take 30+ seconds for the 1st chunk to come in
"access_log_format": '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s',
}


@ -9,73 +9,41 @@ model_list:
mode: chat
max_tokens: 4096
base_model: azure/gpt-4-1106-preview
access_groups: ["public"]
- model_name: openai-gpt-3.5
litellm_params:
model: gpt-3.5-turbo
api_key: os.environ/OPENAI_API_KEY
model_info:
access_groups: ["public"]
- model_name: anthropic-claude-v2.1
litellm_params:
model: bedrock/anthropic.claude-v2:1
timeout: 300 # sets a 5 minute timeout
model_info:
access_groups: ["private"]
- model_name: anthropic-claude-v2
litellm_params:
model: bedrock/anthropic.claude-v2
- model_name: bedrock-cohere
litellm_params:
model: bedrock/cohere.command-text-v14
timeout: 0.0001
- model_name: gpt-4
litellm_params:
model: azure/chatgpt-v-2
api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
api_version: "2023-05-15"
api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault
- model_name: gpt-vision
model_info:
base_model: azure/gpt-4
- model_name: text-moderation-stable
litellm_params:
model: azure/gpt-4-vision
base_url: https://gpt-4-vision-resource.openai.azure.com/openai/deployments/gpt-4-vision/extensions
api_key: os.environ/AZURE_VISION_API_KEY
api_version: "2023-09-01-preview"
dataSources:
- type: AzureComputerVision
parameters:
endpoint: os.environ/AZURE_VISION_ENHANCE_ENDPOINT
key: os.environ/AZURE_VISION_ENHANCE_KEY
- model_name: BEDROCK_GROUP
litellm_params:
model: bedrock/cohere.command-text-v14
timeout: 0.0001
- model_name: tg-ai
litellm_params:
model: together_ai/mistralai/Mistral-7B-Instruct-v0.1
- model_name: sagemaker
litellm_params:
model: sagemaker/berri-benchmarking-Llama-2-70b-chat-hf-4
- model_name: openai-gpt-3.5
litellm_params:
model: gpt-3.5-turbo
model: text-moderation-stable
api_key: os.environ/OPENAI_API_KEY
model_info:
mode: chat
- model_name: azure-cloudflare
litellm_params:
model: azure/chatgpt-v-2
api_base: https://gateway.ai.cloudflare.com/v1/0399b10e77ac6668c80404a5ff49eb37/litellm-test/azure-openai/openai-gpt-4-test-v-1
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
- model_name: azure-embedding-model
litellm_params:
model: azure/azure-embedding-model
api_base: os.environ/AZURE_API_BASE
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
model_info:
mode: embedding
base_model: text-embedding-ada-002
- model_name: text-embedding-ada-002
litellm_params:
model: text-embedding-ada-002
api_key: os.environ/OPENAI_API_KEY
model_info:
mode: embedding
litellm_settings:
fallbacks: [{"openai-gpt-3.5": ["azure-gpt-3.5"]}]
success_callback: ['langfuse']
max_budget: 10 # global budget for proxy
max_user_budget: 0.0001
budget_duration: 30d # global budget duration, will reset after 30d
default_key_generate_params:
max_budget: 1.5000
models: ["azure-gpt-3.5"]
duration: None
upperbound_key_generate_params:
max_budget: 100
duration: "30d"
# setting callback class
# callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]
@ -93,6 +61,7 @@ general_settings:
environment_variables:
# otel: True # OpenTelemetry Logger
# master_key: sk-1234 # [OPTIONAL] Only use this if you to require all calls to contain this key (Authorization: Bearer sk-1234)


@ -403,34 +403,43 @@ async def user_api_key_auth(
verbose_proxy_logger.debug(
f"LLM Model List pre access group check: {llm_model_list}"
)
access_groups = []
from collections import defaultdict
access_groups = defaultdict(list)
if llm_model_list is not None:
for m in llm_model_list:
for group in m.get("model_info", {}).get("access_groups", []):
access_groups.append((m["model_name"], group))
model_name = m["model_name"]
access_groups[group].append(model_name)
allowed_models = valid_token.models
access_group_idx = set()
models_in_current_access_groups = []
if (
len(access_groups) > 0
): # check if token contains any model access groups
for idx, m in enumerate(valid_token.models):
for model_name, group in access_groups:
if m == group:
access_group_idx.add(idx)
allowed_models.append(model_name)
for idx, m in enumerate(
valid_token.models
): # loop token models, if any of them are an access group add the access group
if m in access_groups:
# if it is an access group we need to remove it from valid_token.models
models_in_group = access_groups[m]
models_in_current_access_groups.extend(models_in_group)
# Filter out models that are access_groups
filtered_models = [
m for m in valid_token.models if m not in access_groups
]
filtered_models += models_in_current_access_groups
verbose_proxy_logger.debug(
f"model: {model}; allowed_models: {allowed_models}"
f"model: {model}; allowed_models: {filtered_models}"
)
if model is not None and model not in allowed_models:
if model is not None and model not in filtered_models:
raise ValueError(
f"API Key not allowed to access model. This token can only access models={valid_token.models}. Tried to access {model}"
)
for val in access_group_idx:
allowed_models.pop(val)
valid_token.models = allowed_models
valid_token.models = filtered_models
verbose_proxy_logger.debug(
f"filtered allowed_models: {allowed_models}; valid_token.models: {valid_token.models}"
f"filtered allowed_models: {filtered_models}; valid_token.models: {valid_token.models}"
)
# Check 2. If user_id for this token is in budget
@ -682,34 +691,31 @@ async def user_api_key_auth(
# sso/login, ui/login, /key functions and /user functions
# this will never be allowed to call /chat/completions
token_team = getattr(valid_token, "team_id", None)
if token_team is not None:
if token_team == "litellm-dashboard":
# this token is only used for managing the ui
allowed_routes = [
"/sso",
"/login",
"/key",
"/spend",
"/user",
]
# check if the current route startswith any of the allowed routes
if (
route is not None
and isinstance(route, str)
and any(
route.startswith(allowed_route)
for allowed_route in allowed_routes
)
):
# Do something if the current route starts with any of the allowed routes
pass
else:
raise Exception(
f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed"
)
return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
else:
raise Exception(f"Invalid Key Passed to LiteLLM Proxy")
if token_team is not None and token_team == "litellm-dashboard":
# this token is only used for managing the ui
allowed_routes = [
"/sso",
"/login",
"/key",
"/spend",
"/user",
"/model/info",
]
# check if the current route startswith any of the allowed routes
if (
route is not None
and isinstance(route, str)
and any(
route.startswith(allowed_route) for allowed_route in allowed_routes
)
):
# Do something if the current route starts with any of the allowed routes
pass
else:
raise Exception(
f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed"
)
return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
except Exception as e:
# verbose_proxy_logger.debug(f"An exception occurred - {traceback.format_exc()}")
traceback.print_exc()
@ -1443,6 +1449,24 @@ class ProxyConfig:
database_type == "dynamo_db" or database_type == "dynamodb"
):
database_args = general_settings.get("database_args", None)
### LOAD FROM os.environ/ ###
for k, v in database_args.items():
if isinstance(v, str) and v.startswith("os.environ/"):
database_args[k] = litellm.get_secret(v)
if isinstance(k, str) and k == "aws_web_identity_token":
value = database_args[k]
verbose_proxy_logger.debug(
f"Loading AWS Web Identity Token from file: {value}"
)
if os.path.exists(value):
with open(value, "r") as file:
token_content = file.read()
database_args[k] = token_content
else:
verbose_proxy_logger.info(
f"DynamoDB Loading - {value} is not a valid file path"
)
verbose_proxy_logger.debug(f"database_args: {database_args}")
custom_db_client = DBClient(
custom_db_args=database_args, custom_db_type=database_type
)
@ -1580,8 +1604,6 @@ async def generate_key_helper_fn(
tpm_limit = tpm_limit
rpm_limit = rpm_limit
allowed_cache_controls = allowed_cache_controls
if type(team_id) is not str:
team_id = str(team_id)
try:
# Create a new verification token (you may want to enhance this logic based on your needs)
user_data = {
@ -2057,14 +2079,6 @@ def model_list(
if user_model is not None:
all_models += [user_model]
verbose_proxy_logger.debug(f"all_models: {all_models}")
### CHECK OLLAMA MODELS ###
try:
response = requests.get("http://0.0.0.0:11434/api/tags")
models = response.json()["models"]
ollama_models = ["ollama/" + m["name"].replace(":latest", "") for m in models]
all_models.extend(ollama_models)
except Exception as e:
pass
return dict(
data=[
{
@ -2355,8 +2369,13 @@ async def chat_completion(
llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router
response = await llm_router.acompletion(**data, specific_deployment=True)
else: # router is not set
elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.acompletion(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
# Post Call Processing
data["litellm_status"] = "success" # used for alerting
@ -2387,6 +2406,11 @@ async def chat_completion(
)
fastapi_response.headers["x-litellm-model-id"] = model_id
### CALL HOOKS ### - modify outgoing data
response = await proxy_logging_obj.post_call_success_hook(
user_api_key_dict=user_api_key_dict, response=response
)
return response
except Exception as e:
traceback.print_exc()
@ -2417,7 +2441,12 @@ async def chat_completion(
traceback.print_exc()
if isinstance(e, HTTPException):
raise e
raise ProxyException(
message=getattr(e, "detail", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else:
error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}"
@ -2567,8 +2596,13 @@ async def embeddings(
llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router
response = await llm_router.aembedding(**data, specific_deployment=True)
else:
elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.aembedding(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ###
data["litellm_status"] = "success" # used for alerting
@ -2586,7 +2620,12 @@ async def embeddings(
)
traceback.print_exc()
if isinstance(e, HTTPException):
raise e
raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else:
error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}"
@ -2702,8 +2741,13 @@ async def image_generation(
response = await llm_router.aimage_generation(
**data
) # ensure this goes the llm_router, router will do the correct alias mapping
else:
elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.aimage_generation(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ###
data["litellm_status"] = "success" # used for alerting
@ -2721,7 +2765,165 @@ async def image_generation(
)
traceback.print_exc()
if isinstance(e, HTTPException):
raise e
raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else:
error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}"
raise ProxyException(
message=getattr(e, "message", error_msg),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", 500),
)
@router.post(
"/v1/moderations",
dependencies=[Depends(user_api_key_auth)],
response_class=ORJSONResponse,
tags=["moderations"],
)
@router.post(
"/moderations",
dependencies=[Depends(user_api_key_auth)],
response_class=ORJSONResponse,
tags=["moderations"],
)
async def moderations(
request: Request,
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
):
"""
The moderations endpoint is a tool you can use to check whether content complies with an LLM provider's policies.
Quick Start
```
curl --location 'http://0.0.0.0:4000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
"""
global proxy_logging_obj
try:
# Use orjson to parse JSON data, orjson speeds up requests significantly
body = await request.body()
data = orjson.loads(body)
# Include original request and headers in the data
data["proxy_server_request"] = {
"url": str(request.url),
"method": request.method,
"headers": dict(request.headers),
"body": copy.copy(data), # use copy instead of deepcopy
}
if data.get("user", None) is None and user_api_key_dict.user_id is not None:
data["user"] = user_api_key_dict.user_id
data["model"] = (
general_settings.get("moderation_model", None) # server default
or user_model # model name passed via cli args
or data["model"] # default passed in http request
)
if user_model:
data["model"] = user_model
if "metadata" not in data:
data["metadata"] = {}
data["metadata"]["user_api_key"] = user_api_key_dict.api_key
data["metadata"]["user_api_key_metadata"] = user_api_key_dict.metadata
_headers = dict(request.headers)
_headers.pop(
"authorization", None
) # do not store the original `sk-..` api key in the db
data["metadata"]["headers"] = _headers
data["metadata"]["user_api_key_user_id"] = user_api_key_dict.user_id
data["metadata"]["endpoint"] = str(request.url)
### TEAM-SPECIFIC PARAMS ###
if user_api_key_dict.team_id is not None:
team_config = await proxy_config.load_team_config(
team_id=user_api_key_dict.team_id
)
if len(team_config) == 0:
pass
else:
team_id = team_config.pop("team_id", None)
data["metadata"]["team_id"] = team_id
data = {
**team_config,
**data,
} # add the team-specific configs to the completion call
router_model_names = (
[m["model_name"] for m in llm_model_list]
if llm_model_list is not None
else []
)
### CALL HOOKS ### - modify incoming data / reject request before calling the model
data = await proxy_logging_obj.pre_call_hook(
user_api_key_dict=user_api_key_dict, data=data, call_type="moderation"
)
start_time = time.time()
## ROUTE TO CORRECT ENDPOINT ##
# skip router if user passed their key
if "api_key" in data:
response = await litellm.amoderation(**data)
elif (
llm_router is not None and data["model"] in router_model_names
): # model in router model list
response = await llm_router.amoderation(**data)
elif (
llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router
response = await llm_router.amoderation(**data, specific_deployment=True)
elif (
llm_router is not None
and llm_router.model_group_alias is not None
and data["model"] in llm_router.model_group_alias
): # model set in model_group_alias
response = await llm_router.amoderation(
**data
) # ensure this goes the llm_router, router will do the correct alias mapping
elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.amoderation(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ###
data["litellm_status"] = "success" # used for alerting
end_time = time.time()
asyncio.create_task(
proxy_logging_obj.response_taking_too_long(
start_time=start_time, end_time=end_time, type="slow_response"
)
)
return response
except Exception as e:
await proxy_logging_obj.post_call_failure_hook(
user_api_key_dict=user_api_key_dict, original_exception=e
)
traceback.print_exc()
if isinstance(e, HTTPException):
raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else:
error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}"
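For reference, a minimal client-side sketch of hitting the new `/moderations` route through the proxy (assumes a local proxy on port 4000 and the example `sk-1234` key used elsewhere in these docs):

```python
import openai

# point the OpenAI SDK at the LiteLLM proxy (key and base URL are placeholders)
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.moderations.create(
    model="text-moderation-stable",
    input="Sample text goes here",
)
print(response.results[0])
```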
@ -3516,6 +3718,7 @@ async def google_login(request: Request):
"""
microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)
# get url from request
redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@ -3574,6 +3777,69 @@ async def google_login(request: Request):
)
with microsoft_sso:
return await microsoft_sso.get_login_redirect()
elif generic_client_id is not None:
from fastapi_sso.sso.generic import create_provider, DiscoveryDocument
generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
generic_authorization_endpoint = os.getenv(
"GENERIC_AUTHORIZATION_ENDPOINT", None
)
generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
if generic_client_secret is None:
raise ProxyException(
message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
type="auth_error",
param="GENERIC_CLIENT_SECRET",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_authorization_endpoint is None:
raise ProxyException(
message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_AUTHORIZATION_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_token_endpoint is None:
raise ProxyException(
message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_TOKEN_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_userinfo_endpoint is None:
raise ProxyException(
message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_USERINFO_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
verbose_proxy_logger.debug(
f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
)
verbose_proxy_logger.debug(
f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
)
discovery = DiscoveryDocument(
authorization_endpoint=generic_authorization_endpoint,
token_endpoint=generic_token_endpoint,
userinfo_endpoint=generic_userinfo_endpoint,
)
SSOProvider = create_provider(name="oidc", discovery_document=discovery)
generic_sso = SSOProvider(
client_id=generic_client_id,
client_secret=generic_client_secret,
redirect_uri=redirect_url,
allow_insecure_http=True,
)
with generic_sso:
return await generic_sso.get_login_redirect()
elif ui_username is not None:
# No Google, Microsoft SSO
# Use UI Credentials set in .env
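For reference, the generic SSO branch added above is driven entirely by environment variables; a hedged sketch of a working configuration (endpoint URLs and the secret are placeholders for an arbitrary OIDC provider):

```python
import os

# only the variable names are real; the values below are placeholders
os.environ["GENERIC_CLIENT_ID"] = "litellm-dashboard"
os.environ["GENERIC_CLIENT_SECRET"] = "<client-secret>"
os.environ["GENERIC_AUTHORIZATION_ENDPOINT"] = "https://idp.example.com/oauth2/authorize"
os.environ["GENERIC_TOKEN_ENDPOINT"] = "https://idp.example.com/oauth2/token"
os.environ["GENERIC_USERINFO_ENDPOINT"] = "https://idp.example.com/oauth2/userinfo"
```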
@ -3673,6 +3939,7 @@ async def auth_callback(request: Request):
global general_settings
microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)
# get url from request
redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@ -3728,6 +3995,77 @@ async def auth_callback(request: Request):
allow_insecure_http=True,
)
result = await microsoft_sso.verify_and_process(request)
elif generic_client_id is not None:
# make generic sso provider
from fastapi_sso.sso.generic import create_provider, DiscoveryDocument
generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
generic_authorization_endpoint = os.getenv(
"GENERIC_AUTHORIZATION_ENDPOINT", None
)
generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
if generic_client_secret is None:
raise ProxyException(
message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
type="auth_error",
param="GENERIC_CLIENT_SECRET",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_authorization_endpoint is None:
raise ProxyException(
message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_AUTHORIZATION_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_token_endpoint is None:
raise ProxyException(
message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_TOKEN_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_userinfo_endpoint is None:
raise ProxyException(
message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_USERINFO_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
verbose_proxy_logger.debug(
f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
)
verbose_proxy_logger.debug(
f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
)
discovery = DiscoveryDocument(
authorization_endpoint=generic_authorization_endpoint,
token_endpoint=generic_token_endpoint,
userinfo_endpoint=generic_userinfo_endpoint,
)
SSOProvider = create_provider(name="oidc", discovery_document=discovery)
generic_sso = SSOProvider(
client_id=generic_client_id,
client_secret=generic_client_secret,
redirect_uri=redirect_url,
allow_insecure_http=True,
)
verbose_proxy_logger.debug(f"calling generic_sso.verify_and_process")
request_body = await request.body()
request_query_params = request.query_params
# get "code" from query params
code = request_query_params.get("code")
result = await generic_sso.verify_and_process(request)
verbose_proxy_logger.debug(f"generic result: {result}")
# User is Authe'd in - generate key for the UI to access Proxy
user_email = getattr(result, "email", None)
@ -3936,7 +4274,6 @@ async def add_new_model(model_params: ModelParams):
)
#### [BETA] - This is a beta endpoint, format might change based on user feedback https://github.com/BerriAI/litellm/issues/933. If you need a stable endpoint use /model/info
@router.get(
"/model/info",
description="Provides more info about each model in /models, including config.yaml descriptions (except api key and api base)",
@ -3969,6 +4306,28 @@ async def model_info_v1(
# read litellm model_prices_and_context_window.json to get the following:
# input_cost_per_token, output_cost_per_token, max_tokens
litellm_model_info = get_litellm_model_info(model=model)
# 2nd pass on the model, try seeing if we can find model in litellm model_cost map
if litellm_model_info == {}:
# use litellm_param model_name to get model_info
litellm_params = model.get("litellm_params", {})
litellm_model = litellm_params.get("model", None)
try:
litellm_model_info = litellm.get_model_info(model=litellm_model)
except:
litellm_model_info = {}
# 3rd pass on the model, try seeing if we can find model but without the "/" in model cost map
if litellm_model_info == {}:
# use litellm_param model_name to get model_info
litellm_params = model.get("litellm_params", {})
litellm_model = litellm_params.get("model", None)
split_model = litellm_model.split("/")
if len(split_model) > 0:
litellm_model = split_model[-1]
try:
litellm_model_info = litellm.get_model_info(model=litellm_model)
except:
litellm_model_info = {}
for k, v in litellm_model_info.items():
if k not in model_info:
model_info[k] = v
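For reference, a short sketch of what the third pass does — stripping the provider prefix so a config entry like the hypothetical `azure/gpt-35-turbo` can still be found in litellm's model cost map:

```python
# illustrative only: the deployment name is a made-up example
litellm_model = "azure/gpt-35-turbo"
split_model = litellm_model.split("/")
if len(split_model) > 0:
    litellm_model = split_model[-1]  # -> "gpt-35-turbo"
# litellm.get_model_info("gpt-35-turbo") can now be looked up without the provider prefix
```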

View file

@ -4,25 +4,29 @@ const openai = require('openai');
process.env.DEBUG=false;
async function runOpenAI() {
const client = new openai.OpenAI({
apiKey: 'sk-JkKeNi6WpWDngBsghJ6B9g',
baseURL: 'http://0.0.0.0:8000'
apiKey: 'sk-1234',
baseURL: 'http://0.0.0.0:4000'
});
try {
const response = await client.chat.completions.create({
model: 'sagemaker',
model: 'anthropic-claude-v2.1',
stream: true,
max_tokens: 1000,
messages: [
{
role: 'user',
content: 'write a 20 pg essay about YC ',
content: 'write a 20 pg essay about YC '.repeat(6000),
},
],
});
console.log(response);
let original = '';
for await (const chunk of response) {
original += chunk.choices[0].delta.content;
console.log(original);
console.log(chunk);
console.log(chunk.choices[0].delta.content);
}

View file

@ -11,6 +11,7 @@ from litellm.caching import DualCache
from litellm.proxy.hooks.parallel_request_limiter import (
_PROXY_MaxParallelRequestsHandler,
)
from litellm import ModelResponse, EmbeddingResponse, ImageResponse
from litellm.proxy.hooks.max_budget_limiter import _PROXY_MaxBudgetLimiter
from litellm.proxy.hooks.cache_control_check import _PROXY_CacheControlCheck
from litellm.integrations.custom_logger import CustomLogger
@ -92,7 +93,9 @@ class ProxyLogging:
self,
user_api_key_dict: UserAPIKeyAuth,
data: dict,
call_type: Literal["completion", "embeddings", "image_generation"],
call_type: Literal[
"completion", "embeddings", "image_generation", "moderation"
],
):
"""
Allows users to modify/reject the incoming request to the proxy, without having to deal with parsing Request body.
@ -377,6 +380,28 @@ class ProxyLogging:
raise e
return
async def post_call_success_hook(
self,
response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
user_api_key_dict: UserAPIKeyAuth,
):
"""
Allow user to modify outgoing data
Covers:
1. /chat/completions
"""
new_response = copy.deepcopy(response)
for callback in litellm.callbacks:
try:
if isinstance(callback, CustomLogger):
await callback.async_post_call_success_hook(
user_api_key_dict=user_api_key_dict, response=new_response
)
except Exception as e:
raise e
return new_response
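For reference, a hedged sketch of a callback that uses this hook — the `CustomLogger` base class and the `async_post_call_success_hook` signature come from the code above; the rewriting logic is purely illustrative:

```python
from litellm.integrations.custom_logger import CustomLogger

class _ExampleOutputEditor(CustomLogger):  # hypothetical callback, registered via litellm.callbacks
    async def async_post_call_success_hook(self, user_api_key_dict, response):
        # e.g. tag the assistant message before it is returned to the client
        if getattr(response, "choices", None):
            message = response.choices[0].message
            message.content = (message.content or "") + " [reviewed]"
```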
### DB CONNECTOR ###
# Define the retry decorator with backoff strategy

View file

@ -599,6 +599,98 @@ class Router:
self.fail_calls[model_name] += 1
raise e
async def amoderation(self, model: str, input: str, **kwargs):
try:
kwargs["model"] = model
kwargs["input"] = input
kwargs["original_function"] = self._amoderation
kwargs["num_retries"] = kwargs.get("num_retries", self.num_retries)
timeout = kwargs.get("request_timeout", self.timeout)
kwargs.setdefault("metadata", {}).update({"model_group": model})
response = await self.async_function_with_fallbacks(**kwargs)
return response
except Exception as e:
raise e
async def _amoderation(self, model: str, input: str, **kwargs):
model_name = None
try:
verbose_router_logger.debug(
f"Inside _moderation()- model: {model}; kwargs: {kwargs}"
)
deployment = self.get_available_deployment(
model=model,
input=input,
specific_deployment=kwargs.pop("specific_deployment", None),
)
kwargs.setdefault("metadata", {}).update(
{
"deployment": deployment["litellm_params"]["model"],
"model_info": deployment.get("model_info", {}),
}
)
kwargs["model_info"] = deployment.get("model_info", {})
data = deployment["litellm_params"].copy()
model_name = data["model"]
for k, v in self.default_litellm_params.items():
if (
k not in kwargs and v is not None
): # prioritize model-specific params > default router params
kwargs[k] = v
elif k == "metadata":
kwargs[k].update(v)
potential_model_client = self._get_client(
deployment=deployment, kwargs=kwargs, client_type="async"
)
# check if provided keys == client keys #
dynamic_api_key = kwargs.get("api_key", None)
if (
dynamic_api_key is not None
and potential_model_client is not None
and dynamic_api_key != potential_model_client.api_key
):
model_client = None
else:
model_client = potential_model_client
self.total_calls[model_name] += 1
timeout = (
data.get(
"timeout", None
) # timeout set on litellm_params for this deployment
or self.timeout # timeout set on router
or kwargs.get(
"timeout", None
) # this uses default_litellm_params when nothing is set
)
response = await litellm.amoderation(
**{
**data,
"input": input,
"caching": self.cache_responses,
"client": model_client,
"timeout": timeout,
**kwargs,
}
)
self.success_calls[model_name] += 1
verbose_router_logger.info(
f"litellm.amoderation(model={model_name})\033[32m 200 OK\033[0m"
)
return response
except Exception as e:
verbose_router_logger.info(
f"litellm.amoderation(model={model_name})\033[31m Exception {str(e)}\033[0m"
)
if model_name is not None:
self.fail_calls[model_name] += 1
raise e
def text_completion(
self,
model: str,

View file

@ -86,7 +86,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
if isinstance(response_obj, ModelResponse):
completion_tokens = response_obj.usage.completion_tokens
total_tokens = response_obj.usage.total_tokens
final_value = float(completion_tokens / response_ms.total_seconds())
final_value = float(response_ms.total_seconds() / completion_tokens)
# ------------
# Update usage
@ -168,7 +168,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
if isinstance(response_obj, ModelResponse):
completion_tokens = response_obj.usage.completion_tokens
total_tokens = response_obj.usage.total_tokens
final_value = float(completion_tokens / response_ms.total_seconds())
final_value = float(response_ms.total_seconds() / completion_tokens)
# ------------
# Update usage
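A quick worked example of why the metric is inverted here — the lowest-latency strategy picks the deployment with the smallest value, so it needs time per token rather than tokens per second (numbers are illustrative):

```python
# deployment A: 100 completion tokens in 2.0s; deployment B: 100 tokens in 4.0s
old_a, old_b = 100 / 2.0, 100 / 4.0  # tokens/sec: 50.0 vs 25.0 -> "lowest" would wrongly pick B
new_a, new_b = 2.0 / 100, 4.0 / 100  # sec/token: 0.02 vs 0.04 -> lowest correctly picks A
assert new_a < new_b
```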

View file

@ -123,6 +123,10 @@ def test_vertex_ai():
print(response)
assert type(response.choices[0].message.content) == str
assert len(response.choices[0].message.content) > 1
print(
f"response.choices[0].finish_reason: {response.choices[0].finish_reason}"
)
assert response.choices[0].finish_reason in litellm._openai_finish_reasons
except Exception as e:
pytest.fail(f"Error occurred: {e}")

View file

@ -71,7 +71,7 @@ def test_completion_claude():
messages=messages,
request_timeout=10,
)
# Add any assertions here to check the response
# Add any assertions here to check response args
print(response)
print(response.usage)
print(response.usage.completion_tokens)
@ -1545,9 +1545,9 @@ def test_completion_bedrock_titan_null_response():
],
)
# Add any assertions here to check the response
pytest.fail(f"Expected to fail")
print(f"response: {response}")
except Exception as e:
pass
pytest.fail(f"An error occurred - {str(e)}")
def test_completion_bedrock_titan():
@ -2093,10 +2093,6 @@ def test_completion_cloudflare():
def test_moderation():
import openai
openai.api_type = "azure"
openai.api_version = "GM"
response = litellm.moderation(input="i'm ishaan cto of litellm")
print(response)
output = response.results[0]

View file

@ -0,0 +1,65 @@
# What is this?
## Unit test for presidio pii masking
import sys, os, asyncio, time, random
from datetime import datetime
import traceback
from dotenv import load_dotenv
load_dotenv()
import os
sys.path.insert(
0, os.path.abspath("../..")
) # Adds the parent directory to the system path
import pytest
import litellm
from litellm.proxy.hooks.presidio_pii_masking import _OPTIONAL_PresidioPIIMasking
from litellm import Router, mock_completion
from litellm.proxy.utils import ProxyLogging
from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
@pytest.mark.asyncio
async def test_output_parsing():
"""
- have presidio pii masking - mask an input message
- make llm completion call
- have presidio pii masking - output parse message
- assert that no masked tokens are in the returned response
"""
litellm.output_parse_pii = True
pii_masking = _OPTIONAL_PresidioPIIMasking(mock_testing=True)
initial_message = [
{
"role": "user",
"content": "hello world, my name is Jane Doe. My number is: 034453334",
}
]
filtered_message = [
{
"role": "user",
"content": "hello world, my name is <PERSON>. My number is: <PHONE_NUMBER>",
}
]
pii_masking.pii_tokens = {"<PERSON>": "Jane Doe", "<PHONE_NUMBER>": "034453334"}
response = mock_completion(
model="gpt-3.5-turbo",
messages=filtered_message,
mock_response="Hello <PERSON>! How can I assist you today?",
)
new_response = await pii_masking.async_post_call_success_hook(
user_api_key_dict=UserAPIKeyAuth(), response=response
)
assert (
new_response.choices[0].message.content
== "Hello Jane Doe! How can I assist you today?"
)
# asyncio.run(test_output_parsing())

View file

@ -139,7 +139,7 @@ def test_exception_openai_bad_model(client):
response=response
)
print("Type of exception=", type(openai_exception))
assert isinstance(openai_exception, openai.NotFoundError)
assert isinstance(openai_exception, openai.BadRequestError)
except Exception as e:
pytest.fail(f"LiteLLM Proxy test failed. Exception {str(e)}")
@ -160,7 +160,6 @@ def test_chat_completion_exception_any_model(client):
response = client.post("/chat/completions", json=test_data)
json_response = response.json()
print("keys in json response", json_response.keys())
assert json_response.keys() == {"error"}
# make an openai client to call _make_status_error_from_response

View file

@ -991,3 +991,23 @@ def test_router_timeout():
print(e)
print(vars(e))
pass
@pytest.mark.asyncio
async def test_router_amoderation():
model_list = [
{
"model_name": "openai-moderations",
"litellm_params": {
"model": "text-moderation-stable",
"api_key": os.getenv("OPENAI_API_KEY", None),
},
}
]
router = Router(model_list=model_list)
result = await router.amoderation(
model="openai-moderations", input="this is valid good text"
)
print("moderation result", result)

View file

@ -58,6 +58,18 @@ def my_post_call_rule(input: str):
return {"decision": True}
def my_post_call_rule_2(input: str):
input = input.lower()
print(f"input: {input}")
print(f"INSIDE MY POST CALL RULE, len(input) - {len(input)}")
if len(input) < 200 and len(input) > 0:
return {
"decision": False,
"message": "This violates LiteLLM Proxy Rules. Response too short",
}
return {"decision": True}
# test_pre_call_rule()
# Test 2: Post-call rule
# commenting out of ci/cd since llm's have variable output which was causing our pipeline to fail erratically.
@ -94,3 +106,24 @@ def test_post_call_rule():
# test_post_call_rule()
def test_post_call_rule_streaming():
try:
litellm.pre_call_rules = []
litellm.post_call_rules = [my_post_call_rule_2]
### completion
response = completion(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": "say sorry"}],
max_tokens=2,
stream=True,
)
for chunk in response:
print(f"chunk: {chunk}")
pytest.fail(f"Completion call should have been failed. ")
except Exception as e:
print("Got exception", e)
print(type(e))
print(vars(e))
assert e.message == "This violates LiteLLM Proxy Rules. Response too short"

View file

@ -738,6 +738,8 @@ class CallTypes(Enum):
text_completion = "text_completion"
image_generation = "image_generation"
aimage_generation = "aimage_generation"
moderation = "moderation"
amoderation = "amoderation"
# Logging function -> log the exact model details + what's being sent | Non-BlockingP
@ -2100,6 +2102,11 @@ def client(original_function):
or call_type == CallTypes.aimage_generation.value
):
messages = args[0] if len(args) > 0 else kwargs["prompt"]
elif (
call_type == CallTypes.moderation.value
or call_type == CallTypes.amoderation.value
):
messages = args[1] if len(args) > 1 else kwargs["input"]
elif (
call_type == CallTypes.atext_completion.value
or call_type == CallTypes.text_completion.value
@ -7692,6 +7699,7 @@ class CustomStreamWrapper:
self.special_tokens = ["<|assistant|>", "<|system|>", "<|user|>", "<s>", "</s>"]
self.holding_chunk = ""
self.complete_response = ""
self.response_uptil_now = ""
_model_info = (
self.logging_obj.model_call_details.get("litellm_params", {}).get(
"model_info", {}
@ -7703,6 +7711,7 @@ class CustomStreamWrapper:
} # returned as x-litellm-model-id response header in proxy
self.response_id = None
self.logging_loop = None
self.rules = Rules()
def __iter__(self):
return self
@ -8659,7 +8668,7 @@ class CustomStreamWrapper:
chunk = next(self.completion_stream)
if chunk is not None and chunk != b"":
print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
response = self.chunk_creator(chunk=chunk)
response: Optional[ModelResponse] = self.chunk_creator(chunk=chunk)
print_verbose(f"PROCESSED CHUNK POST CHUNK CREATOR: {response}")
if response is None:
continue
@ -8667,7 +8676,12 @@ class CustomStreamWrapper:
threading.Thread(
target=self.run_success_logging_in_thread, args=(response,)
).start() # log response
self.response_uptil_now += (
response.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
# RETURN RESULT
return response
except StopIteration:
@ -8705,7 +8719,9 @@ class CustomStreamWrapper:
# chunk_creator() does logging/stream chunk building. We need to let it know its being called in_async_func, so we don't double add chunks.
# __anext__ also calls async_success_handler, which does logging
print_verbose(f"PROCESSED ASYNC CHUNK PRE CHUNK CREATOR: {chunk}")
processed_chunk = self.chunk_creator(chunk=chunk)
processed_chunk: Optional[ModelResponse] = self.chunk_creator(
chunk=chunk
)
print_verbose(
f"PROCESSED ASYNC CHUNK POST CHUNK CREATOR: {processed_chunk}"
)
@ -8720,6 +8736,12 @@ class CustomStreamWrapper:
processed_chunk,
)
)
self.response_uptil_now += (
processed_chunk.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
return processed_chunk
raise StopAsyncIteration
else: # temporary patch for non-aiohttp async calls
@ -8733,7 +8755,9 @@ class CustomStreamWrapper:
chunk = next(self.completion_stream)
if chunk is not None and chunk != b"":
print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
processed_chunk = self.chunk_creator(chunk=chunk)
processed_chunk: Optional[ModelResponse] = self.chunk_creator(
chunk=chunk
)
print_verbose(
f"PROCESSED CHUNK POST CHUNK CREATOR: {processed_chunk}"
)
@ -8750,6 +8774,12 @@ class CustomStreamWrapper:
)
)
self.response_uptil_now += (
processed_chunk.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
# RETURN RESULT
return processed_chunk
except StopAsyncIteration:

View file

@ -198,6 +198,33 @@
"litellm_provider": "openai",
"mode": "embedding"
},
"text-moderation-stable": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-007": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-latest": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"256-x-256/dall-e-2": {
"mode": "image_generation",
"input_cost_per_pixel": 0.00000024414,

View file

@ -1,6 +1,6 @@
[tool.poetry]
name = "litellm"
version = "1.23.10"
version = "1.23.16"
description = "Library to easily interface with LLM API providers"
authors = ["BerriAI"]
license = "MIT"
@ -69,7 +69,7 @@ requires = ["poetry-core", "wheel"]
build-backend = "poetry.core.masonry.api"
[tool.commitizen]
version = "1.23.10"
version = "1.23.16"
version_files = [
"pyproject.toml:^version"
]

View file

@ -27,7 +27,7 @@ tiktoken>=0.4.0 # for calculating usage
importlib-metadata>=6.8.0 # for random utils
tokenizers==0.14.0 # for calculating usage
click==8.1.7 # for proxy cli
jinja2==3.1.2 # for prompt templates
jinja2==3.1.3 # for prompt templates
certifi>=2023.7.22 # [TODO] clean up
aiohttp==3.9.0 # for network calls
aioboto3==12.3.0 # for async sagemaker calls

View file

@ -88,6 +88,22 @@ async def test_chat_completion():
await chat_completion(session=session, key=key_2)
@pytest.mark.asyncio
async def test_chat_completion_old_key():
"""
Production test for backwards compatibility. Test db against a pre-generated (old key)
- Create key
Make chat completion call
"""
async with aiohttp.ClientSession() as session:
try:
key = "sk-yNXvlRO4SxIGG0XnRMYxTw"
await chat_completion(session=session, key=key)
except Exception as e:
key = "sk-2KV0sAElLQqMpLZXdNf3yw" # try diff db key (in case db url is for the other db)
await chat_completion(session=session, key=key)
async def completion(session, key):
url = "http://0.0.0.0:4000/completions"
headers = {

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[165],{83155:function(e,t,n){(window.__NEXT_P=window.__NEXT_P||[]).push(["/_not-found",function(){return n(84032)}])},84032:function(e,t,n){"use strict";Object.defineProperty(t,"__esModule",{value:!0}),Object.defineProperty(t,"default",{enumerable:!0,get:function(){return i}}),n(86921);let o=n(3827);n(64090);let r={error:{fontFamily:'system-ui,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji"',height:"100vh",textAlign:"center",display:"flex",flexDirection:"column",alignItems:"center",justifyContent:"center"},desc:{display:"inline-block"},h1:{display:"inline-block",margin:"0 20px 0 0",padding:"0 23px 0 0",fontSize:24,fontWeight:500,verticalAlign:"top",lineHeight:"49px"},h2:{fontSize:14,fontWeight:400,lineHeight:"49px",margin:0}};function i(){return(0,o.jsxs)(o.Fragment,{children:[(0,o.jsx)("title",{children:"404: This page could not be found."}),(0,o.jsx)("div",{style:r.error,children:(0,o.jsxs)("div",{children:[(0,o.jsx)("style",{dangerouslySetInnerHTML:{__html:"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}),(0,o.jsx)("h1",{className:"next-error-h1",style:r.h1,children:"404"}),(0,o.jsx)("div",{style:r.desc,children:(0,o.jsx)("h2",{style:r.h2,children:"This page could not be found."})})]})})]})}("function"==typeof t.default||"object"==typeof t.default&&null!==t.default)&&void 0===t.default.__esModule&&(Object.defineProperty(t.default,"__esModule",{value:!0}),Object.assign(t.default,t),e.exports=t.default)}},function(e){e.O(0,[971,69,744],function(){return e(e.s=83155)}),_N_E=e.O()}]);

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{87421:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=87421)}),_N_E=n.O()}]);

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{32028:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(32028)}),_N_E=e.O()}]);

View file

@ -0,0 +1 @@
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/c18941d97fb7245b.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
self.__BUILD_MANIFEST={__rewrites:{afterFiles:[],beforeFiles:[],fallback:[]},"/_error":["static/chunks/pages/_error-d6107f1aac0c574c.js"],sortedPages:["/_app","/_error"]},self.__BUILD_MANIFEST_CB&&self.__BUILD_MANIFEST_CB();

View file

@ -0,0 +1 @@
self.__SSG_MANIFEST=new Set([]);self.__SSG_MANIFEST_CB&&self.__SSG_MANIFEST_CB()

View file

@ -1 +1 @@
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-85c9b4219c1bb384.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-9b4acf26920649bc.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-096338c8e1915716.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/654259bbf9e4c196.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[75985,[\"838\",\"static/chunks/838-7fa0bab5a1c3631d.js\",\"931\",\"static/chunks/app/page-5a7453e3903c5d60.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/654259bbf9e4c196.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"4mrMigZY9ob7yaIDjXpX6\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-a85b2c176012d8e5.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-e1b183dda365ec86.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-9b4fb13a7db53edf.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/c18941d97fb7245b.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[48016,[\"145\",\"static/chunks/145-9c160ad5539e000f.js\",\"931\",\"static/chunks/app/page-fcb69349f15d154b.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/c18941d97fb7245b.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"lLFQRQnIrRo-GJf5spHEd\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>

View file

@ -1,7 +1,7 @@
2:I[77831,[],""]
3:I[75985,["838","static/chunks/838-7fa0bab5a1c3631d.js","931","static/chunks/app/page-5a7453e3903c5d60.js"],""]
3:I[48016,["145","static/chunks/145-9c160ad5539e000f.js","931","static/chunks/app/page-fcb69349f15d154b.js"],""]
4:I[5613,[],""]
5:I[31778,[],""]
0:["4mrMigZY9ob7yaIDjXpX6",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/654259bbf9e4c196.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
0:["lLFQRQnIrRo-GJf5spHEd",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/c18941d97fb7245b.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null

File diff suppressed because it is too large

View file

@ -19,14 +19,18 @@
"jsonwebtoken": "^9.0.2",
"jwt-decode": "^4.0.0",
"next": "14.1.0",
"openai": "^4.28.0",
"react": "^18",
"react-dom": "^18"
"react-dom": "^18",
"react-markdown": "^9.0.1",
"react-syntax-highlighter": "^15.5.0"
},
"devDependencies": {
"@tailwindcss/forms": "^0.5.7",
"@types/node": "^20",
"@types/react": "18.2.48",
"@types/react-dom": "^18",
"@types/react-syntax-highlighter": "^15.5.11",
"autoprefixer": "^10.4.17",
"eslint": "^8",
"eslint-config-next": "14.1.0",

View file

@ -3,6 +3,8 @@ import React, { Suspense, useEffect, useState } from "react";
import { useSearchParams } from "next/navigation";
import Navbar from "../components/navbar";
import UserDashboard from "../components/user_dashboard";
import ModelDashboard from "@/components/model_dashboard";
import ChatUI from "@/components/chat_ui";
import Sidebar from "../components/leftnav";
import Usage from "../components/usage";
import { jwtDecode } from "jwt-decode";
@ -80,7 +82,22 @@ const CreateKeyPage = () => {
userEmail={userEmail}
setUserEmail={setUserEmail}
/>
) : (
) : page == "models" ? (
<ModelDashboard
userID={userID}
userRole={userRole}
token={token}
accessToken={accessToken}
/>
) : page == "llm-playground" ? (
<ChatUI
userID={userID}
userRole={userRole}
token={token}
accessToken={accessToken}
/>
)
: (
<Usage
userID={userID}
userRole={userRole}

View file

@ -0,0 +1,301 @@
import React, { useState, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import { Card, Title, Table, TableHead, TableRow, TableCell, TableBody, Grid, Tab,
TabGroup,
TabList,
TabPanel,
Metric,
Select,
SelectItem,
TabPanels, } from "@tremor/react";
import { modelInfoCall } from "./networking";
import openai from "openai";
import { Prism as SyntaxHighlighter } from 'react-syntax-highlighter';
interface ChatUIProps {
accessToken: string | null;
token: string | null;
userRole: string | null;
userID: string | null;
}
async function generateModelResponse(inputMessage: string, updateUI: (chunk: string) => void, selectedModel: string, accessToken: string) {
const client = new openai.OpenAI({
apiKey: accessToken, // Replace with your OpenAI API key
baseURL: 'http://0.0.0.0:4000', // Replace with your OpenAI API base URL
dangerouslyAllowBrowser: true, // using a temporary litellm proxy key
});
const response = await client.chat.completions.create({
model: selectedModel,
stream: true,
messages: [
{
role: 'user',
content: inputMessage,
},
],
});
for await (const chunk of response) {
console.log(chunk);
if (chunk.choices[0].delta.content) {
updateUI(chunk.choices[0].delta.content);
}
}
}
const ChatUI: React.FC<ChatUIProps> = ({ accessToken, token, userRole, userID }) => {
const [inputMessage, setInputMessage] = useState("");
const [chatHistory, setChatHistory] = useState<any[]>([]);
const [selectedModel, setSelectedModel] = useState<string | undefined>(undefined);
const [modelInfo, setModelInfo] = useState<any | null>(null); // Declare modelInfo at the component level
useEffect(() => {
if (!accessToken || !token || !userRole || !userID) {
return;
}
// Fetch model info and set the default selected model
const fetchModelInfo = async () => {
const fetchedModelInfo = await modelInfoCall(accessToken, userID, userRole);
console.log("model_info:", fetchedModelInfo);
if (fetchedModelInfo?.data.length > 0) {
setModelInfo(fetchedModelInfo);
setSelectedModel(fetchedModelInfo.data[0].model_name);
}
};
fetchModelInfo();
}, [accessToken, userID, userRole]);
const updateUI = (role: string, chunk: string) => {
setChatHistory((prevHistory) => {
const lastMessage = prevHistory[prevHistory.length - 1];
if (lastMessage && lastMessage.role === role) {
return [
...prevHistory.slice(0, prevHistory.length - 1),
{ role, content: lastMessage.content + chunk },
];
} else {
return [...prevHistory, { role, content: chunk }];
}
});
};
const handleSendMessage = async () => {
if (inputMessage.trim() === "") return;
if (!accessToken || !token || !userRole || !userID) {
return;
}
setChatHistory((prevHistory) => [
...prevHistory,
{ role: "user", content: inputMessage },
]);
try {
if (selectedModel) {
await generateModelResponse(inputMessage, (chunk) => updateUI("assistant", chunk), selectedModel, accessToken);
}
} catch (error) {
console.error("Error fetching model response", error);
updateUI("assistant", "Error fetching model response");
}
setInputMessage("");
};
return (
<div style={{ width: "100%", position: "relative" }}>
<Grid className="gap-2 p-10 h-[75vh] w-full">
<Card>
<TabGroup>
<TabList className="mt-4">
<Tab>Chat</Tab>
<Tab>API Reference</Tab>
</TabList>
<TabPanels>
<TabPanel>
<div>
<label>Select Model:</label>
<select
value={selectedModel || ""}
onChange={(e) => setSelectedModel(e.target.value)}
>
{/* Populate dropdown options from available models */}
{modelInfo?.data.map((element: { model_name: string }) => (
<option key={element.model_name} value={element.model_name}>
{element.model_name}
</option>
))}
</select>
</div>
<Table className="mt-5" style={{ display: "block", maxHeight: "60vh", overflowY: "auto" }}>
<TableHead>
<TableRow>
<TableCell>
<Title>Chat</Title>
</TableCell>
</TableRow>
</TableHead>
<TableBody>
{chatHistory.map((message, index) => (
<TableRow key={index}>
<TableCell>{`${message.role}: ${message.content}`}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
<div className="mt-3" style={{ position: "absolute", bottom: 5, width: "95%" }}>
<div className="flex">
<input
type="text"
value={inputMessage}
onChange={(e) => setInputMessage(e.target.value)}
className="flex-1 p-2 border rounded-md mr-2"
placeholder="Type your message..."
/>
<button onClick={handleSendMessage} className="p-2 bg-blue-500 text-white rounded-md">
Send
</button>
</div>
</div>
</TabPanel>
<TabPanel>
<TabGroup>
<TabList>
<Tab>OpenAI Python SDK</Tab>
<Tab>LlamaIndex</Tab>
<Tab>Langchain Py</Tab>
</TabList>
<TabPanels>
<TabPanel>
<SyntaxHighlighter language="python">
{`
import openai
client = openai.OpenAI(
api_key="your_api_key",
base_url="http://0.0.0.0:4000" # proxy base url
)
response = client.chat.completions.create(
model="gpt-3.5-turbo", # model to use from Models Tab
messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
extra_body={
"metadata": {
"generation_name": "ishaan-generation-openai-client",
"generation_id": "openai-client-gen-id22",
"trace_id": "openai-client-trace-id22",
"trace_user_id": "openai-client-user-id2"
}
}
)
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
<TabPanel>
<SyntaxHighlighter language="python">
{`
import os, dotenv
from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
llm = AzureOpenAI(
engine="azure-gpt-3.5", # model_name on litellm proxy
temperature=0.0,
azure_endpoint="http://0.0.0.0:4000", # litellm proxy endpoint
api_key="sk-1234", # litellm proxy API Key
api_version="2023-07-01-preview",
)
embed_model = AzureOpenAIEmbedding(
deployment_name="azure-embedding-model",
azure_endpoint="http://0.0.0.0:4000",
api_key="sk-1234",
api_version="2023-07-01-preview",
)
documents = SimpleDirectoryReader("llama_index_data").load_data()
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
<TabPanel>
<SyntaxHighlighter language="python">
{`
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
chat = ChatOpenAI(
openai_api_base="http://0.0.0.0:4000",
model = "gpt-3.5-turbo",
temperature=0.1,
extra_body={
"metadata": {
"generation_name": "ishaan-generation-langchain-client",
"generation_id": "langchain-client-gen-id22",
"trace_id": "langchain-client-trace-id22",
"trace_user_id": "langchain-client-user-id2"
}
}
)
messages = [
SystemMessage(
content="You are a helpful assistant that im using to make a test request to."
),
HumanMessage(
content="test from litellm. tell me why it's amazing in 1 sentence"
),
]
response = chat(messages)
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
</TabPanels>
</TabGroup>
</TabPanel>
</TabPanels>
</TabGroup>
</Card>
</Grid>
</div>
);
};
export default ChatUI;

View file

@ -14,6 +14,7 @@ interface CreateKeyProps {
userRole: string | null;
accessToken: string;
data: any[] | null;
userModels: string[];
setData: React.Dispatch<React.SetStateAction<any[] | null>>;
}
@ -22,6 +23,7 @@ const CreateKey: React.FC<CreateKeyProps> = ({
userRole,
accessToken,
data,
userModels,
setData,
}) => {
const [form] = Form.useForm();
@ -42,20 +44,13 @@ const CreateKey: React.FC<CreateKeyProps> = ({
const handleCreate = async (formValues: Record<string, any>) => {
try {
message.info("Making API Call");
// Check if "models" exists and is not an empty string
if (formValues.models && formValues.models.trim() !== '') {
// Format the "models" field as an array
formValues.models = formValues.models.split(',').map((model: string) => model.trim());
} else {
// If "models" is undefined or an empty string, set it to an empty array
formValues.models = [];
}
setIsModalVisible(true);
const response = await keyCreateCall(accessToken, userID, formValues);
setData((prevData) => (prevData ? [...prevData, response] : [response])); // Check if prevData is null
setApiKey(response["key"]);
message.success("API Key Created");
form.resetFields();
localStorage.removeItem("userData" + userID)
} catch (error) {
console.error("Error creating the key:", error);
}
@ -90,13 +85,22 @@ const CreateKey: React.FC<CreateKeyProps> = ({
>
<Input placeholder="ai_team" />
</Form.Item>
<Form.Item
label="Models (Comma Separated). Eg: gpt-3.5-turbo,gpt-4"
name="models"
>
<Input placeholder="gpt-4,gpt-3.5-turbo" />
</Form.Item>
<Form.Item
label="Models"
name="models"
>
<Select
mode="multiple"
placeholder="Select models"
style={{ width: '100%' }}
>
{userModels.map((model) => (
<Option key={model} value={model}>
{model}
</Option>
))}
</Select>
</Form.Item>
<Form.Item

View file

@ -20,7 +20,13 @@ const Sidebar: React.FC<SidebarProps> = ({ setPage }) => {
<Menu.Item key="1" onClick={() => setPage("api-keys")}>
API Keys
</Menu.Item>
<Menu.Item key="2" onClick={() => setPage("usage")}>
<Menu.Item key="2" onClick={() => setPage("models")}>
Models
</Menu.Item>
<Menu.Item key="3" onClick={() => setPage("llm-playground")}>
Chat UI
</Menu.Item>
<Menu.Item key="4" onClick={() => setPage("usage")}>
Usage
</Menu.Item>
</Menu>

View file

@ -0,0 +1,124 @@
import React, { useState, useEffect } from "react";
import { Card, Title, Subtitle, Table, TableHead, TableRow, TableCell, TableBody, Metric, Grid } from "@tremor/react";
import { modelInfoCall } from "./networking";
interface ModelDashboardProps {
accessToken: string | null;
token: string | null;
userRole: string | null;
userID: string | null;
}
const ModelDashboard: React.FC<ModelDashboardProps> = ({
accessToken,
token,
userRole,
userID,
}) => {
const [modelData, setModelData] = useState<any>({ data: [] });
useEffect(() => {
if (!accessToken || !token || !userRole || !userID) {
return;
}
const fetchData = async () => {
try {
// Replace with your actual API call for model data
const modelDataResponse = await modelInfoCall(accessToken, userID, userRole);
console.log("Model data response:", modelDataResponse.data);
setModelData(modelDataResponse);
} catch (error) {
console.error("There was an error fetching the model data", error);
}
};
if (accessToken && token && userRole && userID) {
fetchData();
}
}, [accessToken, token, userRole, userID]);
if (!modelData) {
return <div>Loading...</div>;
}
// loop through model data and edit each row
for (let i = 0; i < modelData.data.length; i++) {
let curr_model = modelData.data[i];
let litellm_model_name = curr_model?.litellm_params?.model;
let model_info = curr_model?.model_info;
let defaultProvider = "openai";
let provider = "";
let input_cost = "Undefined"
let output_cost = "Undefined"
let max_tokens = "Undefined"
// Check if litellm_model_name is null or undefined
if (litellm_model_name) {
// Split litellm_model_name based on "/"
let splitModel = litellm_model_name.split("/");
// Get the first element in the split
let firstElement = splitModel[0];
// If there is only one element, default provider to openai
provider = splitModel.length === 1 ? defaultProvider : firstElement;
console.log("Provider:", provider);
} else {
// litellm_model_name is null or undefined, default provider to openai
provider = defaultProvider;
console.log("Provider:", provider);
}
if (model_info) {
input_cost = model_info?.input_cost_per_token;
output_cost = model_info?.output_cost_per_token;
max_tokens = model_info?.max_tokens;
}
modelData.data[i].provider = provider
modelData.data[i].input_cost = input_cost
modelData.data[i].output_cost = output_cost
modelData.data[i].max_tokens = max_tokens
}
return (
<div style={{ width: "100%" }}>
<Grid className="gap-2 p-10 h-[75vh] w-full">
<Card>
<Table className="mt-5">
<TableHead>
<TableRow>
<TableCell><Title>Model Name </Title></TableCell>
<TableCell><Title>Provider</Title></TableCell>
<TableCell><Title>Input Price per token ($)</Title></TableCell>
<TableCell><Title>Output Price per token ($)</Title></TableCell>
<TableCell><Title>Max Tokens</Title></TableCell>
</TableRow>
</TableHead>
<TableBody>
{modelData.data.map((model: any) => (
<TableRow key={model.model_name}>
<TableCell><Title>{model.model_name}</Title></TableCell>
<TableCell>{model.provider}</TableCell>
<TableCell>{model.input_cost}</TableCell>
<TableCell>{model.output_cost}</TableCell>
<TableCell>{model.max_tokens}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</Card>
</Grid>
</div>
);
};
export default ModelDashboard;

View file

@ -137,6 +137,41 @@ export const userInfoCall = async (
}
};
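// Fetches GET /model/info from the proxy to list the models available to this key/user.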
export const modelInfoCall = async (
accessToken: String,
userID: String,
userRole: String
) => {
try {
let url = proxyBaseUrl ? `${proxyBaseUrl}/model/info` : `/model/info`;
message.info("Requesting model data");
const response = await fetch(url, {
method: "GET",
headers: {
Authorization: `Bearer ${accessToken}`,
"Content-Type": "application/json",
},
});
if (!response.ok) {
const errorData = await response.text();
message.error(errorData);
throw new Error("Network response was not ok");
}
const data = await response.json();
message.info("Received model data");
return data;
// Handle success - the caller updates state/UI with the returned model info
} catch (error) {
console.error("Failed to fetch model info:", error);
throw error;
}
};
export const keySpendLogsCall = async (accessToken: String, token: String) => {
try {
const url = proxyBaseUrl ? `${proxyBaseUrl}/spend/logs` : `/spend/logs`;

View file

@ -1,6 +1,6 @@
"use client";
import React, { useState, useEffect } from "react";
import { userInfoCall } from "./networking";
import { userInfoCall, modelInfoCall } from "./networking";
import { Grid, Col, Card, Text } from "@tremor/react";
import CreateKey from "./create_key_button";
import ViewKeyTable from "./view_key_table";
@ -47,6 +47,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
const token = searchParams.get("token");
const [accessToken, setAccessToken] = useState<string | null>(null);
const [userModels, setUserModels] = useState<string[]>([]);
function formatUserRole(userRole: string) {
if (!userRole) {
@ -96,22 +97,39 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
}
}
if (userID && accessToken && userRole && !data) {
const cachedData = localStorage.getItem("userData");
const cachedSpendData = localStorage.getItem("userSpendData");
if (cachedData && cachedSpendData) {
const cachedData = localStorage.getItem("userData" + userID);
const cachedSpendData = localStorage.getItem("userSpendData" + userID);
const cachedUserModels = localStorage.getItem("userModels" + userID);
if (cachedData && cachedSpendData && cachedUserModels) {
setData(JSON.parse(cachedData));
setUserSpendData(JSON.parse(cachedSpendData));
setUserModels(JSON.parse(cachedUserModels));
} else {
const fetchData = async () => {
try {
const response = await userInfoCall(accessToken, userID, userRole);
setUserSpendData(response["user_info"]);
setData(response["keys"]); // Assuming this is the correct path to your data
localStorage.setItem("userData", JSON.stringify(response["keys"]));
localStorage.setItem("userData" + userID, JSON.stringify(response["keys"]));
localStorage.setItem(
"userSpendData",
"userSpendData" + userID,
JSON.stringify(response["user_info"])
);
const model_info = await modelInfoCall(accessToken, userID, userRole);
console.log("model_info:", model_info);
// loop through model_info["data"] and create an array of element.model_name
let available_model_names = model_info["data"].map((element: { model_name: string; }) => element.model_name);
console.log("available_model_names:", available_model_names);
setUserModels(available_model_names);
console.log("userModels:", userModels);
localStorage.setItem("userModels" + userID, JSON.stringify(available_model_names));
} catch (error) {
console.error("There was an error fetching the data", error);
// Optionally, update your UI to reflect the error state here as well
@ -158,6 +176,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
<CreateKey
userID={userID}
userRole={userRole}
userModels={userModels}
accessToken={accessToken}
data={data}
setData={setData}

View file

@ -43,6 +43,7 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
// Set the key to delete and open the confirmation modal
setKeyToDelete(token);
localStorage.removeItem("userData" + userID)
setIsDeleteModalOpen(true);
};