Merge branch 'main' into litellm_aioboto3_sagemaker

Krish Dholakia, 2024-02-14 21:46:58 -08:00, committed by GitHub (commit 57654f4533, GPG key ID: B5690EEEBB952194)
79 changed files with 3440 additions and 253 deletions

@@ -34,6 +34,8 @@ LiteLLM manages:
[**Jump to OpenAI Proxy Docs**](https://github.com/BerriAI/litellm?tab=readme-ov-file#openai-proxy---docs) <br>
[**Jump to Supported LLM Providers**](https://github.com/BerriAI/litellm?tab=readme-ov-file#supported-provider-docs)
Support for more providers. Missing a provider or LLM Platform, raise a [feature request](https://github.com/BerriAI/litellm/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml&title=%5BFeature%5D%3A+).
# Usage ([**Docs**](https://docs.litellm.ai/docs/))
> [!IMPORTANT]
> LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide [here](https://docs.litellm.ai/docs/migration)

@@ -1,5 +1,10 @@
# Custom Callbacks

:::info
**For PROXY** [Go Here](../proxy/logging.md#custom-callback-class-async)
:::

## Callback Class
You can create a custom callback class to precisely log events as they occur in litellm.
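For context, a minimal sketch of such a callback class (illustrative only, not part of this diff; hook names follow the `CustomLogger` base class from `litellm.integrations.custom_logger` shown further down in this commit):

```python
# Hypothetical sketch of a custom callback class; everything here is illustrative.
import litellm
from litellm.integrations.custom_logger import CustomLogger


class MyCustomHandler(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        # runs after every successful call routed through litellm
        print(f"success: model={kwargs.get('model')} took {end_time - start_time}")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        # runs after a failed call
        print(f"failure: model={kwargs.get('model')}")


# register the handler globally
litellm.callbacks = [MyCustomHandler()]
```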

@@ -79,6 +79,23 @@ model_list:
      mode: embedding # 👈 ADD THIS
```

### Image Generation Models
We need some way to know if a model is an image generation model when running checks. If you specify `mode: image_generation` for the model in your config, the health check for it becomes an image generation health check.
```yaml
model_list:
  - model_name: dall-e-3
    litellm_params:
      model: azure/dall-e-3
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
    model_info:
      mode: image_generation # 👈 ADD THIS
```
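Once the proxy is running with this entry, the health check can be exercised directly. A rough sketch (the local URL, port, and master key below are assumptions, not part of this diff):

```python
# Hypothetical sketch: hit the proxy /health endpoint after adding `mode: image_generation`.
import requests

resp = requests.get(
    "http://0.0.0.0:8000/health",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder master key
)
print(resp.json())  # per-deployment health results, including the dall-e-3 entry
```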
### Text Completion Models
We need some way to know if a model is a text completion model when running checks. If you specify the model's mode in your config, the health check for it becomes a text completion health check.

@@ -4,21 +4,24 @@ import Image from '@theme/IdealImage';
LiteLLM supports [Microsoft Presidio](https://github.com/microsoft/presidio/) for PII masking.

-## Step 1. Add env
+## Quick Start
+### Step 1. Add env

```bash
export PRESIDIO_ANALYZER_API_BASE="http://localhost:5002"
export PRESIDIO_ANONYMIZER_API_BASE="http://localhost:5001"
```

-## Step 2. Set it as a callback in config.yaml
+### Step 2. Set it as a callback in config.yaml

```yaml
litellm_settings:
-  litellm.callbacks = ["presidio"]
+  callbacks = ["presidio", ...] # e.g. ["presidio", custom_callbacks.proxy_handler_instance]
```

-## Start proxy
+### Step 3. Start proxy

```
litellm --config /path/to/config.yaml

@@ -28,3 +31,27 @@ litellm --config /path/to/config.yaml
This will mask the input going to the llm provider

<Image img={require('../../img/presidio_screenshot.png')} />
## Output parsing
LLM responses can sometimes contain the masked tokens.
For presidio 'replace' operations, LiteLLM can check the LLM response and replace the masked token with the user-submitted values.
Just set `litellm.output_parse_pii = True` to enable this.

```yaml
litellm_settings:
  output_parse_pii: true
```

**Expected Flow:**
1. User Input: "hello world, my name is Jane Doe. My number is: 034453334"
2. LLM Input: "hello world, my name is [PERSON]. My number is: [PHONE_NUMBER]"
3. LLM Response: "Hey [PERSON], nice to meet you!"
4. User Response: "Hey Jane Doe, nice to meet you!"
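For illustration, the swap-back in step 4 is just a string replacement over the stored token-to-original mapping (a simplified sketch of the idea, not the proxy's actual code path; the values come from the example flow above):

```python
# Simplified illustration of step 4: masked tokens in the LLM response are
# swapped back to the user-submitted values.
pii_tokens = {"[PERSON]": "Jane Doe", "[PHONE_NUMBER]": "034453334"}

llm_response = "Hey [PERSON], nice to meet you!"
for token, original in pii_tokens.items():
    llm_response = llm_response.replace(token, original)

print(llm_response)  # -> "Hey Jane Doe, nice to meet you!"
```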

@@ -370,12 +370,12 @@ See the latest available ghcr docker image here:
https://github.com/berriai/litellm/pkgs/container/litellm

```shell
-docker pull ghcr.io/berriai/litellm:main-v1.16.13
+docker pull ghcr.io/berriai/litellm:main-latest
```

### Run the Docker Image

```shell
-docker run ghcr.io/berriai/litellm:main-v1.16.13
+docker run ghcr.io/berriai/litellm:main-latest
```

#### Run the Docker Image with LiteLLM CLI args

@@ -384,12 +384,12 @@ See all supported CLI args [here](https://docs.litellm.ai/docs/proxy/cli):
Here's how you can run the docker image and pass your config to `litellm`

```shell
-docker run ghcr.io/berriai/litellm:main-v1.16.13 --config your_config.yaml
+docker run ghcr.io/berriai/litellm:main-latest --config your_config.yaml
```

Here's how you can run the docker image and start litellm on port 8002 with `num_workers=8`

```shell
-docker run ghcr.io/berriai/litellm:main-v1.16.13 --port 8002 --num_workers 8
+docker run ghcr.io/berriai/litellm:main-latest --port 8002 --num_workers 8
```

#### Run the Docker Image using docker compose
#### Run the Docker Image using docker compose #### Run the Docker Image using docker compose

@@ -37,12 +37,12 @@ http://0.0.0.0:8000/ui # <proxy_base_url>/ui
```

-## Get Admin UI Link on Swagger
+### 3. Get Admin UI Link on Swagger

Your Proxy Swagger is available on the root of the Proxy: e.g.: `http://localhost:4000/`

<Image img={require('../../img/ui_link.png')} />

-## Change default username + password
+### 4. Change default username + password

Set the following in your .env on the Proxy

@@ -111,6 +111,29 @@ MICROSOFT_TENANT="5a39737
</TabItem>
<TabItem value="Generic" label="Generic SSO Provider">
A generic OAuth client that can be used to quickly create support for any OAuth provider with close to no code
**Required .env variables on your Proxy**
```shell
GENERIC_CLIENT_ID = "******"
GENERIC_CLIENT_SECRET = "G*******"
GENERIC_AUTHORIZATION_ENDPOINT = "http://localhost:9090/auth"
GENERIC_TOKEN_ENDPOINT = "http://localhost:9090/token"
GENERIC_USERINFO_ENDPOINT = "http://localhost:9090/me"
```
- Set Redirect URI, if your provider requires it
- Set a redirect url = `<your proxy base url>/sso/callback`
```shell
http://localhost:4000/sso/callback
```
</TabItem>
</Tabs>

### Step 3. Test flow

@@ -197,7 +197,7 @@ from openai import OpenAI
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")

-response = openai.embeddings.create(
+response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)

@@ -281,6 +281,84 @@ print(query_result[:5])
```
## `/moderations`
### Request Format
Input, Output and Exceptions are mapped to the OpenAI format for all supported models
<Tabs>
<TabItem value="openai" label="OpenAI Python v1.0.0+">
```python
import openai
from openai import OpenAI
# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")
response = client.moderations.create(
    input="hello from litellm",
    model="text-moderation-stable"
)
print(response)
```
</TabItem>
<TabItem value="Curl" label="Curl Request">
```shell
curl --location 'http://0.0.0.0:8000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
</TabItem>
</Tabs>
### Response Format
```json
{
"id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
"model": "text-moderation-007",
"results": [
{
"categories": {
"harassment": false,
"harassment/threatening": false,
"hate": false,
"hate/threatening": false,
"self-harm": false,
"self-harm/instructions": false,
"self-harm/intent": false,
"sexual": false,
"sexual/minors": false,
"violence": false,
"violence/graphic": false
},
"category_scores": {
"harassment": 0.000019947197870351374,
"harassment/threatening": 5.5971017900446896e-6,
"hate": 0.000028560316422954202,
"hate/threatening": 2.2631787999216613e-8,
"self-harm": 2.9121162015144364e-7,
"self-harm/instructions": 9.314219084899378e-8,
"self-harm/intent": 8.093739012338119e-8,
"sexual": 0.00004414955765241757,
"sexual/minors": 0.0000156943697220413,
"violence": 0.00022354527027346194,
"violence/graphic": 8.804164281173144e-6
},
"flagged": false
}
]
}
```
## Advanced

@@ -696,7 +696,9 @@ general_settings:
    "region_name": "us-west-2"
    "user_table_name": "your-user-table",
    "key_table_name": "your-token-table",
-    "config_table_name": "your-config-table"
+    "config_table_name": "your-config-table",
+    "aws_role_name": "your-aws_role_name",
+    "aws_session_name": "your-aws_session_name",
  }
```

@@ -67,6 +67,7 @@ max_budget: float = 0.0 # set the max budget across all providers
budget_duration: Optional[
    str
] = None # proxy only - resets budget after fixed duration. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
_openai_finish_reasons = ["stop", "length", "function_call", "content_filter", "null"]
_openai_completion_params = [
    "functions",
    "function_call",

@@ -164,6 +165,8 @@ secret_manager_client: Optional[
] = None # list of instantiated key management clients - e.g. azure kv, infisical, etc.
_google_kms_resource_name: Optional[str] = None
_key_management_system: Optional[KeyManagementSystem] = None

#### PII MASKING ####
output_parse_pii: bool = False
#############################################

@@ -675,6 +675,9 @@ class S3Cache(BaseCache):
    def flush_cache(self):
        pass

    async def disconnect(self):
        pass


class DualCache(BaseCache):
    """

@@ -2,9 +2,11 @@
# On success, logs events to Promptlayer
import dotenv, os
import requests

from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
-from typing import Literal
+from typing import Literal, Union

dotenv.load_dotenv() # Loading env variables using dotenv
import traceback

@@ -54,7 +56,7 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
-        call_type: Literal["completion", "embeddings"],
+        call_type: Literal["completion", "embeddings", "image_generation"],
    ):
        pass

@@ -63,6 +65,13 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
    ):
        pass

    async def async_post_call_success_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        response,
    ):
        pass

    #### SINGLE-USE #### - https://docs.litellm.ai/docs/observability/custom_callback#using-your-custom-callback-function

    def log_input_event(self, model, messages, kwargs, print_verbose, callback_func):

@@ -477,8 +477,8 @@ def init_bedrock_client(
def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
-    # handle anthropic prompts using anthropic constants
-    if provider == "anthropic":
+    # handle anthropic prompts and amazon titan prompts
+    if provider == "anthropic" or provider == "amazon":
        if model in custom_prompt_dict:
            # check if the model has a registered custom prompt
            model_prompt_details = custom_prompt_dict[model]

@@ -490,7 +490,7 @@ def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
            )
        else:
            prompt = prompt_factory(
-                model=model, messages=messages, custom_llm_provider="anthropic"
+                model=model, messages=messages, custom_llm_provider="bedrock"
            )
    else:
        prompt = ""

@@ -623,6 +623,7 @@ def completion(
                "textGenerationConfig": inference_params,
            }
        )

    else:
        data = json.dumps({})

@@ -90,9 +90,11 @@ def ollama_pt(
        return {"prompt": prompt, "images": images}
    else:
        prompt = "".join(
-            m["content"]
-            if isinstance(m["content"], str) is str
-            else "".join(m["content"])
+            (
+                m["content"]
+                if isinstance(m["content"], str) is str
+                else "".join(m["content"])
+            )
            for m in messages
        )
        return prompt

@@ -422,6 +424,34 @@ def anthropic_pt(
    return prompt
def amazon_titan_pt(
    messages: list,
): # format - https://github.com/BerriAI/litellm/issues/1896
    """
    Amazon Titan uses 'User:' and 'Bot:' in its prompt template
    """

    class AmazonTitanConstants(Enum):
        HUMAN_PROMPT = "\n\nUser: " # Assuming this is similar to Anthropic prompt formatting, since amazon titan's prompt formatting is currently undocumented
        AI_PROMPT = "\n\nBot: "

    prompt = ""
    for idx, message in enumerate(messages):
        if message["role"] == "user":
            prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}{message['content']}"
        elif message["role"] == "system":
            prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}<admin>{message['content']}</admin>"
        else:
            prompt += f"{AmazonTitanConstants.AI_PROMPT.value}{message['content']}"
        if (
            idx == 0 and message["role"] == "assistant"
        ): # ensure the prompt always starts with `\n\nHuman: `
            prompt = f"{AmazonTitanConstants.HUMAN_PROMPT.value}" + prompt
    if messages[-1]["role"] != "assistant":
        prompt += f"{AmazonTitanConstants.AI_PROMPT.value}"
    return prompt
def _load_image_from_url(image_url):
    try:
        from PIL import Image
@@ -636,6 +666,14 @@ def prompt_factory(
        return gemini_text_image_pt(messages=messages)
    elif custom_llm_provider == "mistral":
        return mistral_api_pt(messages=messages)
    elif custom_llm_provider == "bedrock":
        if "amazon.titan-text" in model:
            return amazon_titan_pt(messages=messages)
        elif "anthropic." in model:
            if any(_ in model for _ in ["claude-2.1", "claude-v2:1"]):
                return claude_2_1_pt(messages=messages)
            else:
                return anthropic_pt(messages=messages)
    try:
        if "meta-llama/llama-2" in model and "chat" in model:
            return llama_2_chat_pt(messages=messages)
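A rough usage sketch of the `amazon_titan_pt` helper added above (the import path is an assumption based on where these prompt templates live in the repo):

```python
# Hypothetical usage of the Titan prompt template added in this diff.
from litellm.llms.prompt_templates.factory import amazon_titan_pt

messages = [
    {"role": "system", "content": "You are a helpful bot."},
    {"role": "user", "content": "Hi!"},
]
print(amazon_titan_pt(messages=messages))
# Per the template above, this yields roughly:
# "\n\nUser: <admin>You are a helpful bot.</admin>\n\nUser: Hi!\n\nBot: "
```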

@@ -484,7 +484,7 @@ def embedding(
    aws_access_key_id = optional_params.pop("aws_access_key_id", None)
    aws_region_name = optional_params.pop("aws_region_name", None)

-    if aws_access_key_id != None:
+    if aws_access_key_id is not None:
        # uses auth params passed to completion
        # aws_access_key_id is not None, assume user is trying to auth using litellm.completion
        client = boto3.client(

@@ -4,7 +4,7 @@ from enum import Enum
import requests
import time
from typing import Callable, Optional, Union
-from litellm.utils import ModelResponse, Usage, CustomStreamWrapper
+from litellm.utils import ModelResponse, Usage, CustomStreamWrapper, map_finish_reason
import litellm, uuid
import httpx

@@ -575,9 +575,9 @@ def completion(
        model_response["model"] = model
        ## CALCULATING USAGE
        if model in litellm.vertex_language_models and response_obj is not None:
-            model_response["choices"][0].finish_reason = response_obj.candidates[
-                0
-            ].finish_reason.name
+            model_response["choices"][0].finish_reason = map_finish_reason(
+                response_obj.candidates[0].finish_reason.name
+            )
            usage = Usage(
                prompt_tokens=response_obj.usage_metadata.prompt_token_count,
                completion_tokens=response_obj.usage_metadata.candidates_token_count,

@@ -771,9 +771,9 @@ async def async_completion(
        model_response["model"] = model
        ## CALCULATING USAGE
        if model in litellm.vertex_language_models and response_obj is not None:
-            model_response["choices"][0].finish_reason = response_obj.candidates[
-                0
-            ].finish_reason.name
+            model_response["choices"][0].finish_reason = map_finish_reason(
+                response_obj.candidates[0].finish_reason.name
+            )
            usage = Usage(
                prompt_tokens=response_obj.usage_metadata.prompt_token_count,
                completion_tokens=response_obj.usage_metadata.candidates_token_count,

@@ -10,7 +10,6 @@
import os, openai, sys, json, inspect, uuid, datetime, threading
from typing import Any, Literal, Union
from functools import partial
import dotenv, traceback, random, asyncio, time, contextvars
from copy import deepcopy
import httpx

@@ -2964,16 +2963,39 @@ def text_completion(
##### Moderation #######################
-def moderation(input: str, api_key: Optional[str] = None):
+def moderation(
+    input: str, model: Optional[str] = None, api_key: Optional[str] = None, **kwargs
+):
    # only supports open ai for now
    api_key = (
        api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
    )
-    openai.api_key = api_key
-    openai.api_type = "open_ai" # type: ignore
-    openai.api_version = None
-    openai.base_url = "https://api.openai.com/v1/"
-    response = openai.moderations.create(input=input)
+    openai_client = kwargs.get("client", None)
+    if openai_client is None:
+        openai_client = openai.OpenAI(
+            api_key=api_key,
+        )
+    response = openai_client.moderations.create(input=input, model=model)
+    return response
+
+
+##### Moderation #######################
+@client
+async def amoderation(input: str, model: str, api_key: Optional[str] = None, **kwargs):
+    # only supports open ai for now
+    api_key = (
+        api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
+    )
+    openai_client = kwargs.get("client", None)
+    if openai_client is None:
+        openai_client = openai.AsyncOpenAI(
+            api_key=api_key,
+        )
+    response = await openai_client.moderations.create(input=input, model=model)
    return response
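A hedged usage sketch of the two helpers above (the model name and environment handling are assumptions for illustration):

```python
# Hypothetical usage of litellm.moderation / litellm.amoderation added in this diff.
import asyncio
import litellm

# assumes OPENAI_API_KEY is set in the environment
resp = litellm.moderation(input="hello from litellm", model="text-moderation-stable")
print(resp.results[0].flagged)

async def check_async():
    aresp = await litellm.amoderation(
        input="hello from litellm", model="text-moderation-stable"
    )
    print(aresp.results[0].flagged)

asyncio.run(check_async())
```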

@@ -198,6 +198,33 @@
"litellm_provider": "openai",
"mode": "embedding"
},
"text-moderation-stable": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-007": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-latest": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"256-x-256/dall-e-2": { "256-x-256/dall-e-2": {
"mode": "image_generation", "mode": "image_generation",
"input_cost_per_pixel": 0.00000024414, "input_cost_per_pixel": 0.00000024414,

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
(minified Next.js webpack chunk for the Admin UI; the only change is a module id, 11837 → 87421)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
(minified Next.js webpack chunk for the Admin UI; the only change is a module id, 70377 → 32028)

@@ -1 +1 @@
(minified Next.js webpack runtime for the Admin UI; the only change is the CSS asset hash, static/css/654259bbf9e4c196.css → static/css/c18941d97fb7245b.css)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
(prerendered Admin UI index.html; only the hashed Next.js script/CSS filenames and the buildId changed, e.g. buildId 4mrMigZY9ob7yaIDjXpX6 → lLFQRQnIrRo-GJf5spHEd)

@@ -1,7 +1,7 @@
(Next.js RSC payload for the Admin UI page; only the hashed chunk/CSS filenames and the buildId changed: 838-7fa0bab5a1c3631d.js / page-5a7453e3903c5d60.js / 654259bbf9e4c196.css / 4mrMigZY9ob7yaIDjXpX6 → 145-9c160ad5539e000f.js / page-fcb69349f15d154b.js / c18941d97fb7245b.css / lLFQRQnIrRo-GJf5spHEd)

@@ -234,6 +234,15 @@ class DynamoDBArgs(LiteLLMBase):
    key_table_name: str = "LiteLLM_VerificationToken"
    config_table_name: str = "LiteLLM_Config"
    spend_table_name: str = "LiteLLM_SpendLogs"
    aws_role_name: Optional[str] = None
    aws_session_name: Optional[str] = None
    aws_web_identity_token: Optional[str] = None
    aws_provider_id: Optional[str] = None
    aws_policy_arns: Optional[List[str]] = None
    aws_policy: Optional[str] = None
    aws_duration_seconds: Optional[int] = None
    assume_role_aws_role_name: Optional[str] = None
    assume_role_aws_session_name: Optional[str] = None


class ConfigGeneralSettings(LiteLLMBase):
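For reference, a sketch of how the new role-assumption fields might be wired into the proxy's `database_args` (mirroring the `general_settings` snippet earlier in this diff; every value below is a placeholder):

```python
# Hypothetical values for DynamoDB auth via role assumption; not part of this diff.
database_args = {
    "region_name": "us-west-2",
    "user_table_name": "your-user-table",
    "key_table_name": "your-token-table",
    "config_table_name": "your-config-table",
    "aws_role_name": "arn:aws:iam::123456789012:role/your-web-identity-role",
    "aws_session_name": "your-session-name",
    "aws_web_identity_token": "/path/to/web-identity-token",
    "assume_role_aws_role_name": "arn:aws:iam::123456789012:role/your-db-role",
    "assume_role_aws_session_name": "your-db-session-name",
}
```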

@@ -53,6 +53,41 @@ class DynamoDBWrapper(CustomDB):
        self.database_arguments = database_arguments
        self.region_name = database_arguments.region_name

    def set_env_vars_based_on_arn(self):
        if self.database_arguments.aws_role_name is None:
            return
        verbose_proxy_logger.debug(
            f"DynamoDB: setting env vars based on arn={self.database_arguments.aws_role_name}"
        )
        import boto3, os

        sts_client = boto3.client("sts")

        # call 1
        non_used_assumed_role = sts_client.assume_role_with_web_identity(
            RoleArn=self.database_arguments.aws_role_name,
            RoleSessionName=self.database_arguments.aws_session_name,
            WebIdentityToken=self.database_arguments.aws_web_identity_token,
        )

        # call 2
        assumed_role = sts_client.assume_role(
            RoleArn=self.database_arguments.assume_role_aws_role_name,
            RoleSessionName=self.database_arguments.assume_role_aws_session_name,
        )

        aws_access_key_id = assumed_role["Credentials"]["AccessKeyId"]
        aws_secret_access_key = assumed_role["Credentials"]["SecretAccessKey"]
        aws_session_token = assumed_role["Credentials"]["SessionToken"]

        verbose_proxy_logger.debug(
            f"Got STS assumed Role, aws_access_key_id={aws_access_key_id}"
        )
        # set these in the env so aiodynamo can use them
        os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
        os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
        os.environ["AWS_SESSION_TOKEN"] = aws_session_token

    async def connect(self):
        """
        Connect to DB, and creating / updating any tables

@@ -75,6 +110,7 @@ class DynamoDBWrapper(CustomDB):
        import aiohttp

        verbose_proxy_logger.debug("DynamoDB Wrapper - Attempting to connect")
        self.set_env_vars_based_on_arn()
        # before making ClientSession check if ssl_verify=False
        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))

@@ -171,6 +207,8 @@ class DynamoDBWrapper(CustomDB):
        from aiohttp import ClientSession
        import aiohttp

        self.set_env_vars_based_on_arn()

        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
        else:

@@ -214,6 +252,8 @@ class DynamoDBWrapper(CustomDB):
        from aiohttp import ClientSession
        import aiohttp

        self.set_env_vars_based_on_arn()

        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
        else:

@@ -261,6 +301,7 @@ class DynamoDBWrapper(CustomDB):
    async def update_data(
        self, key: str, value: dict, table_name: Literal["user", "key", "config"]
    ):
        self.set_env_vars_based_on_arn()
        from aiodynamo.client import Client
        from aiodynamo.credentials import Credentials, StaticCredentials
        from aiodynamo.http.httpx import HTTPX

@@ -334,4 +375,5 @@ class DynamoDBWrapper(CustomDB):
        """
        Not Implemented yet.
        """
        self.set_env_vars_based_on_arn()
        return super().delete_data(keys, table_name)

@@ -8,14 +8,19 @@
# Tell us how we can improve! - Krrish & Ishaan

-from typing import Optional
+from typing import Optional, Literal, Union
-import litellm, traceback, sys
+import litellm, traceback, sys, uuid
from litellm.caching import DualCache
from litellm.proxy._types import UserAPIKeyAuth
from litellm.integrations.custom_logger import CustomLogger
from fastapi import HTTPException
from litellm._logging import verbose_proxy_logger
-from litellm import ModelResponse
+from litellm.utils import (
+    ModelResponse,
+    EmbeddingResponse,
+    ImageResponse,
+    StreamingChoices,
+)
from datetime import datetime
import aiohttp, asyncio

@@ -24,7 +29,13 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
    user_api_key_cache = None

    # Class variables or attributes
-    def __init__(self):
+    def __init__(self, mock_testing: bool = False):
+        self.pii_tokens: dict = (
+            {}
+        ) # mapping of PII token to original text - only used with Presidio `replace` operation
+
+        if mock_testing == True: # for testing purposes only
+            return
+
        self.presidio_analyzer_api_base = litellm.get_secret(
            "PRESIDIO_ANALYZER_API_BASE", None
        )

@@ -51,12 +62,15 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
        pass

    async def check_pii(self, text: str) -> str:
+        """
+        [TODO] make this more performant for high-throughput scenario
+        """
        try:
            async with aiohttp.ClientSession() as session:
                # Make the first request to /analyze
                analyze_url = f"{self.presidio_analyzer_api_base}/analyze"
                analyze_payload = {"text": text, "language": "en"}
+                redacted_text = None
                async with session.post(analyze_url, json=analyze_payload) as response:
                    analyze_results = await response.json()

@@ -72,6 +86,26 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
                ) as response:
                    redacted_text = await response.json()

+                new_text = text
+                if redacted_text is not None:
+                    for item in redacted_text["items"]:
+                        start = item["start"]
+                        end = item["end"]
+                        replacement = item["text"] # replacement token
+                        if (
+                            item["operator"] == "replace"
+                            and litellm.output_parse_pii == True
+                        ):
+                            # check if token in dict
+                            # if exists, add a uuid to the replacement token for swapping back to the original text in llm response output parsing
+                            if replacement in self.pii_tokens:
+                                replacement = replacement + uuid.uuid4()
+                            self.pii_tokens[replacement] = new_text[
+                                start:end
+                            ] # get text it'll replace
+                        new_text = new_text[:start] + replacement + new_text[end:]
+
                return redacted_text["text"]
        except Exception as e:
            traceback.print_exc()

@@ -94,6 +128,7 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
        if call_type == "completion": # /chat/completions requests
            messages = data["messages"]
            tasks = []

            for m in messages:
                if isinstance(m["content"], str):
                    tasks.append(self.check_pii(text=m["content"]))

@@ -104,3 +139,30 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
                        "content"
                    ] = r # replace content with redacted string
        return data

+    async def async_post_call_success_hook(
+        self,
+        user_api_key_dict: UserAPIKeyAuth,
+        response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
+    ):
+        """
+        Output parse the response object to replace the masked tokens with user sent values
+        """
+        verbose_proxy_logger.debug(
+            f"PII Masking Args: litellm.output_parse_pii={litellm.output_parse_pii}; type of response={type(response)}"
+        )
+        if litellm.output_parse_pii == False:
+            return response
+
+        if isinstance(response, ModelResponse) and not isinstance(
+            response.choices[0], StreamingChoices
+        ): # /chat/completions requests
+            if isinstance(response.choices[0].message.content, str):
+                verbose_proxy_logger.debug(
+                    f"self.pii_tokens: {self.pii_tokens}; initial response: {response.choices[0].message.content}"
+                )
+                for key, value in self.pii_tokens.items():
+                    response.choices[0].message.content = response.choices[
+                        0
+                    ].message.content.replace(key, value)
+        return response

@@ -570,6 +570,7 @@ def run_server(
        "worker_class": "uvicorn.workers.UvicornWorker",
        "preload": True, # Add the preload flag
        "accesslog": "-", # Log to stdout
        "timeout": 600, # default to very high number, bedrock/anthropic.claude-v2:1 can take 30+ seconds for the 1st chunk to come in
        "access_log_format": '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s',
    }

@@ -9,73 +9,41 @@ model_list:
mode: chat
max_tokens: 4096
base_model: azure/gpt-4-1106-preview
access_groups: ["public"]
- model_name: openai-gpt-3.5
litellm_params:
model: gpt-3.5-turbo
api_key: os.environ/OPENAI_API_KEY
model_info:
access_groups: ["public"]
- model_name: anthropic-claude-v2.1
litellm_params:
model: bedrock/anthropic.claude-v2:1
timeout: 300 # sets a 5 minute timeout
model_info:
access_groups: ["private"]
- model_name: anthropic-claude-v2
litellm_params:
model: bedrock/anthropic.claude-v2
- model_name: bedrock-cohere
litellm_params:
model: bedrock/cohere.command-text-v14
timeout: 0.0001
- model_name: gpt-4 - model_name: gpt-4
litellm_params: litellm_params:
model: azure/chatgpt-v-2 model: azure/chatgpt-v-2
api_base: https://openai-gpt-4-test-v-1.openai.azure.com/ api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
api_version: "2023-05-15" api_version: "2023-05-15"
api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault
- model_name: gpt-vision model_info:
base_model: azure/gpt-4
- model_name: text-moderation-stable
litellm_params: litellm_params:
model: azure/gpt-4-vision model: text-moderation-stable
base_url: https://gpt-4-vision-resource.openai.azure.com/openai/deployments/gpt-4-vision/extensions
api_key: os.environ/AZURE_VISION_API_KEY
api_version: "2023-09-01-preview"
dataSources:
- type: AzureComputerVision
parameters:
endpoint: os.environ/AZURE_VISION_ENHANCE_ENDPOINT
key: os.environ/AZURE_VISION_ENHANCE_KEY
- model_name: BEDROCK_GROUP
litellm_params:
model: bedrock/cohere.command-text-v14
timeout: 0.0001
- model_name: tg-ai
litellm_params:
model: together_ai/mistralai/Mistral-7B-Instruct-v0.1
- model_name: sagemaker
litellm_params:
model: sagemaker/berri-benchmarking-Llama-2-70b-chat-hf-4
- model_name: openai-gpt-3.5
litellm_params:
model: gpt-3.5-turbo
api_key: os.environ/OPENAI_API_KEY api_key: os.environ/OPENAI_API_KEY
model_info:
mode: chat
- model_name: azure-cloudflare
litellm_params:
model: azure/chatgpt-v-2
api_base: https://gateway.ai.cloudflare.com/v1/0399b10e77ac6668c80404a5ff49eb37/litellm-test/azure-openai/openai-gpt-4-test-v-1
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
- model_name: azure-embedding-model
litellm_params:
model: azure/azure-embedding-model
api_base: os.environ/AZURE_API_BASE
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
model_info:
mode: embedding
base_model: text-embedding-ada-002
- model_name: text-embedding-ada-002
litellm_params:
model: text-embedding-ada-002
api_key: os.environ/OPENAI_API_KEY
model_info:
mode: embedding
litellm_settings: litellm_settings:
fallbacks: [{"openai-gpt-3.5": ["azure-gpt-3.5"]}] fallbacks: [{"openai-gpt-3.5": ["azure-gpt-3.5"]}]
success_callback: ['langfuse'] success_callback: ['langfuse']
max_budget: 10 # global budget for proxy
max_user_budget: 0.0001
budget_duration: 30d # global budget duration, will reset after 30d
default_key_generate_params:
max_budget: 1.5000
models: ["azure-gpt-3.5"]
duration: None
upperbound_key_generate_params:
max_budget: 100
duration: "30d"
# setting callback class # setting callback class
# callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance] # callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]
@ -93,6 +61,7 @@ general_settings:
environment_variables: environment_variables:
# otel: True # OpenTelemetry Logger # otel: True # OpenTelemetry Logger
# master_key: sk-1234 # [OPTIONAL] Only use this if you to require all calls to contain this key (Authorization: Bearer sk-1234) # master_key: sk-1234 # [OPTIONAL] Only use this if you to require all calls to contain this key (Authorization: Bearer sk-1234)

View file

@ -403,34 +403,43 @@ async def user_api_key_auth(
verbose_proxy_logger.debug( verbose_proxy_logger.debug(
f"LLM Model List pre access group check: {llm_model_list}" f"LLM Model List pre access group check: {llm_model_list}"
) )
access_groups = [] from collections import defaultdict
access_groups = defaultdict(list)
if llm_model_list is not None: if llm_model_list is not None:
for m in llm_model_list: for m in llm_model_list:
for group in m.get("model_info", {}).get("access_groups", []): for group in m.get("model_info", {}).get("access_groups", []):
access_groups.append((m["model_name"], group)) model_name = m["model_name"]
access_groups[group].append(model_name)
allowed_models = valid_token.models models_in_current_access_groups = []
access_group_idx = set()
if ( if (
len(access_groups) > 0 len(access_groups) > 0
): # check if token contains any model access groups ): # check if token contains any model access groups
for idx, m in enumerate(valid_token.models): for idx, m in enumerate(
for model_name, group in access_groups: valid_token.models
if m == group: ): # loop token models, if any of them are an access group add the access group
access_group_idx.add(idx) if m in access_groups:
allowed_models.append(model_name) # if it is an access group we need to remove it from valid_token.models
models_in_group = access_groups[m]
models_in_current_access_groups.extend(models_in_group)
# Filter out models that are access_groups
filtered_models = [
m for m in valid_token.models if m not in access_groups
]
filtered_models += models_in_current_access_groups
verbose_proxy_logger.debug( verbose_proxy_logger.debug(
f"model: {model}; allowed_models: {allowed_models}" f"model: {model}; allowed_models: {filtered_models}"
) )
if model is not None and model not in allowed_models: if model is not None and model not in filtered_models:
raise ValueError( raise ValueError(
f"API Key not allowed to access model. This token can only access models={valid_token.models}. Tried to access {model}" f"API Key not allowed to access model. This token can only access models={valid_token.models}. Tried to access {model}"
) )
for val in access_group_idx: valid_token.models = filtered_models
allowed_models.pop(val)
valid_token.models = allowed_models
verbose_proxy_logger.debug( verbose_proxy_logger.debug(
f"filtered allowed_models: {allowed_models}; valid_token.models: {valid_token.models}" f"filtered allowed_models: {filtered_models}; valid_token.models: {valid_token.models}"
) )
# Check 2. If user_id for this token is in budget # Check 2. If user_id for this token is in budget
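To make the access-group expansion above easier to follow, here is a standalone sketch of the same idea; the group and model names are illustrative, mirroring the `model_info.access_groups` entries that appear in the sample config in this commit.

```python
from collections import defaultdict

# group name -> models that belong to it (normally built from model_info.access_groups)
access_groups: dict[str, list[str]] = defaultdict(list)
llm_model_list = [
    {"model_name": "openai-gpt-3.5", "model_info": {"access_groups": ["public"]}},
    {"model_name": "anthropic-claude-v2.1", "model_info": {"access_groups": ["private"]}},
]
for m in llm_model_list:
    for group in m.get("model_info", {}).get("access_groups", []):
        access_groups[group].append(m["model_name"])

# a key may list plain model names and/or access-group names
token_models = ["public", "gpt-4"]

models_in_current_access_groups = []
for m in token_models:
    if m in access_groups:  # access group -> expand into its concrete models
        models_in_current_access_groups.extend(access_groups[m])

# drop the group names themselves, keep plain models plus the expansion
filtered_models = [m for m in token_models if m not in access_groups]
filtered_models += models_in_current_access_groups
print(filtered_models)  # ['gpt-4', 'openai-gpt-3.5']
```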
@ -682,34 +691,31 @@ async def user_api_key_auth(
# sso/login, ui/login, /key functions and /user functions # sso/login, ui/login, /key functions and /user functions
# this will never be allowed to call /chat/completions # this will never be allowed to call /chat/completions
token_team = getattr(valid_token, "team_id", None) token_team = getattr(valid_token, "team_id", None)
if token_team is not None: if token_team is not None and token_team == "litellm-dashboard":
if token_team == "litellm-dashboard": # this token is only used for managing the ui
# this token is only used for managing the ui allowed_routes = [
allowed_routes = [ "/sso",
"/sso", "/login",
"/login", "/key",
"/key", "/spend",
"/spend", "/user",
"/user", "/model/info",
] ]
# check if the current route startswith any of the allowed routes # check if the current route startswith any of the allowed routes
if ( if (
route is not None route is not None
and isinstance(route, str) and isinstance(route, str)
and any( and any(
route.startswith(allowed_route) route.startswith(allowed_route) for allowed_route in allowed_routes
for allowed_route in allowed_routes )
) ):
): # Do something if the current route starts with any of the allowed routes
# Do something if the current route starts with any of the allowed routes pass
pass else:
else: raise Exception(
raise Exception( f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed"
f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed" )
) return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
else:
raise Exception(f"Invalid Key Passed to LiteLLM Proxy")
except Exception as e: except Exception as e:
# verbose_proxy_logger.debug(f"An exception occurred - {traceback.format_exc()}") # verbose_proxy_logger.debug(f"An exception occurred - {traceback.format_exc()}")
traceback.print_exc() traceback.print_exc()
@ -1443,6 +1449,24 @@ class ProxyConfig:
database_type == "dynamo_db" or database_type == "dynamodb" database_type == "dynamo_db" or database_type == "dynamodb"
): ):
database_args = general_settings.get("database_args", None) database_args = general_settings.get("database_args", None)
### LOAD FROM os.environ/ ###
for k, v in database_args.items():
if isinstance(v, str) and v.startswith("os.environ/"):
database_args[k] = litellm.get_secret(v)
if isinstance(k, str) and k == "aws_web_identity_token":
value = database_args[k]
verbose_proxy_logger.debug(
f"Loading AWS Web Identity Token from file: {value}"
)
if os.path.exists(value):
with open(value, "r") as file:
token_content = file.read()
database_args[k] = token_content
else:
verbose_proxy_logger.info(
f"DynamoDB Loading - {value} is not a valid file path"
)
verbose_proxy_logger.debug(f"database_args: {database_args}")
custom_db_client = DBClient( custom_db_client = DBClient(
custom_db_args=database_args, custom_db_type=database_type custom_db_args=database_args, custom_db_type=database_type
) )
@ -1580,8 +1604,6 @@ async def generate_key_helper_fn(
tpm_limit = tpm_limit tpm_limit = tpm_limit
rpm_limit = rpm_limit rpm_limit = rpm_limit
allowed_cache_controls = allowed_cache_controls allowed_cache_controls = allowed_cache_controls
if type(team_id) is not str:
team_id = str(team_id)
try: try:
# Create a new verification token (you may want to enhance this logic based on your needs) # Create a new verification token (you may want to enhance this logic based on your needs)
user_data = { user_data = {
@ -2057,14 +2079,6 @@ def model_list(
if user_model is not None: if user_model is not None:
all_models += [user_model] all_models += [user_model]
verbose_proxy_logger.debug(f"all_models: {all_models}") verbose_proxy_logger.debug(f"all_models: {all_models}")
### CHECK OLLAMA MODELS ###
try:
response = requests.get("http://0.0.0.0:11434/api/tags")
models = response.json()["models"]
ollama_models = ["ollama/" + m["name"].replace(":latest", "") for m in models]
all_models.extend(ollama_models)
except Exception as e:
pass
return dict( return dict(
data=[ data=[
{ {
@ -2355,8 +2369,13 @@ async def chat_completion(
llm_router is not None and data["model"] in llm_router.deployment_names llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router ): # model in router deployments, calling a specific deployment on the router
response = await llm_router.acompletion(**data, specific_deployment=True) response = await llm_router.acompletion(**data, specific_deployment=True)
else: # router is not set elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.acompletion(**data) response = await litellm.acompletion(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
# Post Call Processing # Post Call Processing
data["litellm_status"] = "success" # used for alerting data["litellm_status"] = "success" # used for alerting
@ -2387,6 +2406,11 @@ async def chat_completion(
) )
fastapi_response.headers["x-litellm-model-id"] = model_id fastapi_response.headers["x-litellm-model-id"] = model_id
### CALL HOOKS ### - modify outgoing data
response = await proxy_logging_obj.post_call_success_hook(
user_api_key_dict=user_api_key_dict, response=response
)
return response return response
except Exception as e: except Exception as e:
traceback.print_exc() traceback.print_exc()
@ -2417,7 +2441,12 @@ async def chat_completion(
traceback.print_exc() traceback.print_exc()
if isinstance(e, HTTPException): if isinstance(e, HTTPException):
raise e raise ProxyException(
message=getattr(e, "detail", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else: else:
error_traceback = traceback.format_exc() error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}" error_msg = f"{str(e)}\n\n{error_traceback}"
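One practical effect of converting `HTTPException` into `ProxyException` here: clients of the proxy get an OpenAI-style error body, so the openai SDK maps a bad model name to `BadRequestError` (as the updated exception test further down reflects). A hedged sketch of the client-side view, reusing the proxy URL and test key from elsewhere in this commit:

```python
import openai

# proxy URL and key as used elsewhere in this commit; the exact error payload
# shape depends on how ProxyException is serialized, so treat it as illustrative
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

try:
    client.chat.completions.create(
        model="model-that-is-not-configured",
        messages=[{"role": "user", "content": "hi"}],
    )
except openai.BadRequestError as e:
    print(e.status_code)  # 400
    print(e.body)         # parsed error payload (message/type/param/code fields expected)
```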
@ -2567,8 +2596,13 @@ async def embeddings(
llm_router is not None and data["model"] in llm_router.deployment_names llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router ): # model in router deployments, calling a specific deployment on the router
response = await llm_router.aembedding(**data, specific_deployment=True) response = await llm_router.aembedding(**data, specific_deployment=True)
else: elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.aembedding(**data) response = await litellm.aembedding(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ### ### ALERTING ###
data["litellm_status"] = "success" # used for alerting data["litellm_status"] = "success" # used for alerting
@ -2586,7 +2620,12 @@ async def embeddings(
) )
traceback.print_exc() traceback.print_exc()
if isinstance(e, HTTPException): if isinstance(e, HTTPException):
raise e raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else: else:
error_traceback = traceback.format_exc() error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}" error_msg = f"{str(e)}\n\n{error_traceback}"
@ -2702,8 +2741,13 @@ async def image_generation(
response = await llm_router.aimage_generation( response = await llm_router.aimage_generation(
**data **data
) # ensure this goes the llm_router, router will do the correct alias mapping ) # ensure this goes the llm_router, router will do the correct alias mapping
else: elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.aimage_generation(**data) response = await litellm.aimage_generation(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ### ### ALERTING ###
data["litellm_status"] = "success" # used for alerting data["litellm_status"] = "success" # used for alerting
@ -2721,7 +2765,165 @@ async def image_generation(
) )
traceback.print_exc() traceback.print_exc()
if isinstance(e, HTTPException): if isinstance(e, HTTPException):
raise e raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else:
error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}"
raise ProxyException(
message=getattr(e, "message", error_msg),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", 500),
)
@router.post(
"/v1/moderations",
dependencies=[Depends(user_api_key_auth)],
response_class=ORJSONResponse,
tags=["moderations"],
)
@router.post(
"/moderations",
dependencies=[Depends(user_api_key_auth)],
response_class=ORJSONResponse,
tags=["moderations"],
)
async def moderations(
request: Request,
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
):
"""
The moderations endpoint is a tool you can use to check whether content complies with an LLM provider's policies.
Quick Start
```
curl --location 'http://0.0.0.0:4000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
"""
global proxy_logging_obj
try:
# Use orjson to parse JSON data, orjson speeds up requests significantly
body = await request.body()
data = orjson.loads(body)
# Include original request and headers in the data
data["proxy_server_request"] = {
"url": str(request.url),
"method": request.method,
"headers": dict(request.headers),
"body": copy.copy(data), # use copy instead of deepcopy
}
if data.get("user", None) is None and user_api_key_dict.user_id is not None:
data["user"] = user_api_key_dict.user_id
data["model"] = (
general_settings.get("moderation_model", None) # server default
or user_model # model name passed via cli args
or data["model"] # default passed in http request
)
if user_model:
data["model"] = user_model
if "metadata" not in data:
data["metadata"] = {}
data["metadata"]["user_api_key"] = user_api_key_dict.api_key
data["metadata"]["user_api_key_metadata"] = user_api_key_dict.metadata
_headers = dict(request.headers)
_headers.pop(
"authorization", None
) # do not store the original `sk-..` api key in the db
data["metadata"]["headers"] = _headers
data["metadata"]["user_api_key_user_id"] = user_api_key_dict.user_id
data["metadata"]["endpoint"] = str(request.url)
### TEAM-SPECIFIC PARAMS ###
if user_api_key_dict.team_id is not None:
team_config = await proxy_config.load_team_config(
team_id=user_api_key_dict.team_id
)
if len(team_config) == 0:
pass
else:
team_id = team_config.pop("team_id", None)
data["metadata"]["team_id"] = team_id
data = {
**team_config,
**data,
} # add the team-specific configs to the completion call
router_model_names = (
[m["model_name"] for m in llm_model_list]
if llm_model_list is not None
else []
)
### CALL HOOKS ### - modify incoming data / reject request before calling the model
data = await proxy_logging_obj.pre_call_hook(
user_api_key_dict=user_api_key_dict, data=data, call_type="moderation"
)
start_time = time.time()
## ROUTE TO CORRECT ENDPOINT ##
# skip router if user passed their key
if "api_key" in data:
response = await litellm.amoderation(**data)
elif (
llm_router is not None and data["model"] in router_model_names
): # model in router model list
response = await llm_router.amoderation(**data)
elif (
llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router
response = await llm_router.amoderation(**data, specific_deployment=True)
elif (
llm_router is not None
and llm_router.model_group_alias is not None
and data["model"] in llm_router.model_group_alias
): # model set in model_group_alias
response = await llm_router.amoderation(
**data
) # ensure this goes the llm_router, router will do the correct alias mapping
elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.amoderation(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ###
data["litellm_status"] = "success" # used for alerting
end_time = time.time()
asyncio.create_task(
proxy_logging_obj.response_taking_too_long(
start_time=start_time, end_time=end_time, type="slow_response"
)
)
return response
except Exception as e:
await proxy_logging_obj.post_call_failure_hook(
user_api_key_dict=user_api_key_dict, original_exception=e
)
traceback.print_exc()
if isinstance(e, HTTPException):
raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else: else:
error_traceback = traceback.format_exc() error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}" error_msg = f"{str(e)}\n\n{error_traceback}"
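Besides the curl call in the docstring, the new endpoint can be exercised with the openai SDK pointed at the proxy; a small sketch, assuming the proxy runs on `http://0.0.0.0:4000` with the `sk-1234` master key and a `text-moderation-stable` deployment as in the example config:

```python
import openai

# point the openai client at the LiteLLM proxy (base URL and key assumed from the docs above)
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.moderations.create(
    model="text-moderation-stable",  # routed through the proxy's new /moderations handler
    input="Sample text goes here",
)
print(response.results[0].flagged)
```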
@ -3516,6 +3718,7 @@ async def google_login(request: Request):
""" """
microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None) microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
google_client_id = os.getenv("GOOGLE_CLIENT_ID", None) google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)
# get url from request # get url from request
redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url)) redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@ -3574,6 +3777,69 @@ async def google_login(request: Request):
) )
with microsoft_sso: with microsoft_sso:
return await microsoft_sso.get_login_redirect() return await microsoft_sso.get_login_redirect()
elif generic_client_id is not None:
from fastapi_sso.sso.generic import create_provider, DiscoveryDocument
generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
generic_authorization_endpoint = os.getenv(
"GENERIC_AUTHORIZATION_ENDPOINT", None
)
generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
if generic_client_secret is None:
raise ProxyException(
message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
type="auth_error",
param="GENERIC_CLIENT_SECRET",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_authorization_endpoint is None:
raise ProxyException(
message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_AUTHORIZATION_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_token_endpoint is None:
raise ProxyException(
message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_TOKEN_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_userinfo_endpoint is None:
raise ProxyException(
message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_USERINFO_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
verbose_proxy_logger.debug(
f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
)
verbose_proxy_logger.debug(
f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
)
discovery = DiscoveryDocument(
authorization_endpoint=generic_authorization_endpoint,
token_endpoint=generic_token_endpoint,
userinfo_endpoint=generic_userinfo_endpoint,
)
SSOProvider = create_provider(name="oidc", discovery_document=discovery)
generic_sso = SSOProvider(
client_id=generic_client_id,
client_secret=generic_client_secret,
redirect_uri=redirect_url,
allow_insecure_http=True,
)
with generic_sso:
return await generic_sso.get_login_redirect()
elif ui_username is not None: elif ui_username is not None:
# No Google, Microsoft SSO # No Google, Microsoft SSO
# Use UI Credentials set in .env # Use UI Credentials set in .env
@ -3673,6 +3939,7 @@ async def auth_callback(request: Request):
global general_settings global general_settings
microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None) microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
google_client_id = os.getenv("GOOGLE_CLIENT_ID", None) google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)
# get url from request # get url from request
redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url)) redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@ -3728,6 +3995,77 @@ async def auth_callback(request: Request):
allow_insecure_http=True, allow_insecure_http=True,
) )
result = await microsoft_sso.verify_and_process(request) result = await microsoft_sso.verify_and_process(request)
elif generic_client_id is not None:
# make generic sso provider
from fastapi_sso.sso.generic import create_provider, DiscoveryDocument
generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
generic_authorization_endpoint = os.getenv(
"GENERIC_AUTHORIZATION_ENDPOINT", None
)
generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
if generic_client_secret is None:
raise ProxyException(
message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
type="auth_error",
param="GENERIC_CLIENT_SECRET",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_authorization_endpoint is None:
raise ProxyException(
message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_AUTHORIZATION_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_token_endpoint is None:
raise ProxyException(
message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_TOKEN_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_userinfo_endpoint is None:
raise ProxyException(
message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_USERINFO_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
verbose_proxy_logger.debug(
f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
)
verbose_proxy_logger.debug(
f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
)
discovery = DiscoveryDocument(
authorization_endpoint=generic_authorization_endpoint,
token_endpoint=generic_token_endpoint,
userinfo_endpoint=generic_userinfo_endpoint,
)
SSOProvider = create_provider(name="oidc", discovery_document=discovery)
generic_sso = SSOProvider(
client_id=generic_client_id,
client_secret=generic_client_secret,
redirect_uri=redirect_url,
allow_insecure_http=True,
)
verbose_proxy_logger.debug(f"calling generic_sso.verify_and_process")
request_body = await request.body()
request_query_params = request.query_params
# get "code" from query params
code = request_query_params.get("code")
result = await generic_sso.verify_and_process(request)
verbose_proxy_logger.debug(f"generic result: {result}")
# User is Authe'd in - generate key for the UI to access Proxy # User is Authe'd in - generate key for the UI to access Proxy
user_email = getattr(result, "email", None) user_email = getattr(result, "email", None)
@ -3936,7 +4274,6 @@ async def add_new_model(model_params: ModelParams):
) )
#### [BETA] - This is a beta endpoint, format might change based on user feedback https://github.com/BerriAI/litellm/issues/933. If you need a stable endpoint use /model/info
@router.get( @router.get(
"/model/info", "/model/info",
description="Provides more info about each model in /models, including config.yaml descriptions (except api key and api base)", description="Provides more info about each model in /models, including config.yaml descriptions (except api key and api base)",
@ -3969,6 +4306,28 @@ async def model_info_v1(
# read litellm model_prices_and_context_window.json to get the following: # read litellm model_prices_and_context_window.json to get the following:
# input_cost_per_token, output_cost_per_token, max_tokens # input_cost_per_token, output_cost_per_token, max_tokens
litellm_model_info = get_litellm_model_info(model=model) litellm_model_info = get_litellm_model_info(model=model)
# 2nd pass on the model, try seeing if we can find model in litellm model_cost map
if litellm_model_info == {}:
# use litellm_param model_name to get model_info
litellm_params = model.get("litellm_params", {})
litellm_model = litellm_params.get("model", None)
try:
litellm_model_info = litellm.get_model_info(model=litellm_model)
except:
litellm_model_info = {}
# 3rd pass on the model, try seeing if we can find model but without the "/" in model cost map
if litellm_model_info == {}:
# use litellm_param model_name to get model_info
litellm_params = model.get("litellm_params", {})
litellm_model = litellm_params.get("model", None)
split_model = litellm_model.split("/")
if len(split_model) > 0:
litellm_model = split_model[-1]
try:
litellm_model_info = litellm.get_model_info(model=litellm_model)
except:
litellm_model_info = {}
for k, v in litellm_model_info.items(): for k, v in litellm_model_info.items():
if k not in model_info: if k not in model_info:
model_info[k] = v model_info[k] = v

View file

@ -4,25 +4,29 @@ const openai = require('openai');
process.env.DEBUG=false; process.env.DEBUG=false;
async function runOpenAI() { async function runOpenAI() {
const client = new openai.OpenAI({ const client = new openai.OpenAI({
apiKey: 'sk-JkKeNi6WpWDngBsghJ6B9g', apiKey: 'sk-1234',
baseURL: 'http://0.0.0.0:8000' baseURL: 'http://0.0.0.0:4000'
}); });
try { try {
const response = await client.chat.completions.create({ const response = await client.chat.completions.create({
model: 'sagemaker', model: 'anthropic-claude-v2.1',
stream: true, stream: true,
max_tokens: 1000,
messages: [ messages: [
{ {
role: 'user', role: 'user',
content: 'write a 20 pg essay about YC ', content: 'write a 20 pg essay about YC '.repeat(6000),
}, },
], ],
}); });
console.log(response); console.log(response);
let original = '';
for await (const chunk of response) { for await (const chunk of response) {
original += chunk.choices[0].delta.content;
console.log(original);
console.log(chunk); console.log(chunk);
console.log(chunk.choices[0].delta.content); console.log(chunk.choices[0].delta.content);
} }

View file

@ -11,6 +11,7 @@ from litellm.caching import DualCache
from litellm.proxy.hooks.parallel_request_limiter import ( from litellm.proxy.hooks.parallel_request_limiter import (
_PROXY_MaxParallelRequestsHandler, _PROXY_MaxParallelRequestsHandler,
) )
from litellm import ModelResponse, EmbeddingResponse, ImageResponse
from litellm.proxy.hooks.max_budget_limiter import _PROXY_MaxBudgetLimiter from litellm.proxy.hooks.max_budget_limiter import _PROXY_MaxBudgetLimiter
from litellm.proxy.hooks.cache_control_check import _PROXY_CacheControlCheck from litellm.proxy.hooks.cache_control_check import _PROXY_CacheControlCheck
from litellm.integrations.custom_logger import CustomLogger from litellm.integrations.custom_logger import CustomLogger
@ -92,7 +93,9 @@ class ProxyLogging:
self, self,
user_api_key_dict: UserAPIKeyAuth, user_api_key_dict: UserAPIKeyAuth,
data: dict, data: dict,
call_type: Literal["completion", "embeddings", "image_generation"], call_type: Literal[
"completion", "embeddings", "image_generation", "moderation"
],
): ):
""" """
Allows users to modify/reject the incoming request to the proxy, without having to deal with parsing Request body. Allows users to modify/reject the incoming request to the proxy, without having to deal with parsing Request body.
@ -377,6 +380,28 @@ class ProxyLogging:
raise e raise e
return return
async def post_call_success_hook(
self,
response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
user_api_key_dict: UserAPIKeyAuth,
):
"""
Allow user to modify outgoing data
Covers:
1. /chat/completions
"""
new_response = copy.deepcopy(response)
for callback in litellm.callbacks:
try:
if isinstance(callback, CustomLogger):
await callback.async_post_call_success_hook(
user_api_key_dict=user_api_key_dict, response=new_response
)
except Exception as e:
raise e
return new_response
### DB CONNECTOR ### ### DB CONNECTOR ###
# Define the retry decorator with backoff strategy # Define the retry decorator with backoff strategy
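From a callback author's point of view, the new `post_call_success_hook` means any `CustomLogger` in `litellm.callbacks` can edit the outgoing response. A minimal illustrative handler (the class name and the edit it makes are made up; the hook name and keyword arguments follow the code above):

```python
from litellm.integrations.custom_logger import CustomLogger


class AppendFooterHandler(CustomLogger):
    """Illustrative callback: appends a footer to every non-streaming chat response."""

    async def async_post_call_success_hook(self, user_api_key_dict, response):
        if hasattr(response, "choices") and response.choices:
            message = response.choices[0].message
            if isinstance(message.content, str):
                message.content += "\n\n-- served via litellm proxy"
        return response


# registered like any other custom callback, e.g. in custom_callbacks.py:
proxy_handler_instance = AppendFooterHandler()
```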

View file

@ -599,6 +599,98 @@ class Router:
self.fail_calls[model_name] += 1 self.fail_calls[model_name] += 1
raise e raise e
async def amoderation(self, model: str, input: str, **kwargs):
try:
kwargs["model"] = model
kwargs["input"] = input
kwargs["original_function"] = self._amoderation
kwargs["num_retries"] = kwargs.get("num_retries", self.num_retries)
timeout = kwargs.get("request_timeout", self.timeout)
kwargs.setdefault("metadata", {}).update({"model_group": model})
response = await self.async_function_with_fallbacks(**kwargs)
return response
except Exception as e:
raise e
async def _amoderation(self, model: str, input: str, **kwargs):
model_name = None
try:
verbose_router_logger.debug(
f"Inside _moderation()- model: {model}; kwargs: {kwargs}"
)
deployment = self.get_available_deployment(
model=model,
input=input,
specific_deployment=kwargs.pop("specific_deployment", None),
)
kwargs.setdefault("metadata", {}).update(
{
"deployment": deployment["litellm_params"]["model"],
"model_info": deployment.get("model_info", {}),
}
)
kwargs["model_info"] = deployment.get("model_info", {})
data = deployment["litellm_params"].copy()
model_name = data["model"]
for k, v in self.default_litellm_params.items():
if (
k not in kwargs and v is not None
): # prioritize model-specific params > default router params
kwargs[k] = v
elif k == "metadata":
kwargs[k].update(v)
potential_model_client = self._get_client(
deployment=deployment, kwargs=kwargs, client_type="async"
)
# check if provided keys == client keys #
dynamic_api_key = kwargs.get("api_key", None)
if (
dynamic_api_key is not None
and potential_model_client is not None
and dynamic_api_key != potential_model_client.api_key
):
model_client = None
else:
model_client = potential_model_client
self.total_calls[model_name] += 1
timeout = (
data.get(
"timeout", None
) # timeout set on litellm_params for this deployment
or self.timeout # timeout set on router
or kwargs.get(
"timeout", None
) # this uses default_litellm_params when nothing is set
)
response = await litellm.amoderation(
**{
**data,
"input": input,
"caching": self.cache_responses,
"client": model_client,
"timeout": timeout,
**kwargs,
}
)
self.success_calls[model_name] += 1
verbose_router_logger.info(
f"litellm.amoderation(model={model_name})\033[32m 200 OK\033[0m"
)
return response
except Exception as e:
verbose_router_logger.info(
f"litellm.amoderation(model={model_name})\033[31m Exception {str(e)}\033[0m"
)
if model_name is not None:
self.fail_calls[model_name] += 1
raise e
def text_completion( def text_completion(
self, self,
model: str, model: str,

View file

@ -86,7 +86,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
if isinstance(response_obj, ModelResponse): if isinstance(response_obj, ModelResponse):
completion_tokens = response_obj.usage.completion_tokens completion_tokens = response_obj.usage.completion_tokens
total_tokens = response_obj.usage.total_tokens total_tokens = response_obj.usage.total_tokens
final_value = float(completion_tokens / response_ms.total_seconds()) final_value = float(response_ms.total_seconds() / completion_tokens)
# ------------ # ------------
# Update usage # Update usage
@ -168,7 +168,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
if isinstance(response_obj, ModelResponse): if isinstance(response_obj, ModelResponse):
completion_tokens = response_obj.usage.completion_tokens completion_tokens = response_obj.usage.completion_tokens
total_tokens = response_obj.usage.total_tokens total_tokens = response_obj.usage.total_tokens
final_value = float(completion_tokens / response_ms.total_seconds()) final_value = float(response_ms.total_seconds() / completion_tokens)
# ------------ # ------------
# Update usage # Update usage
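The swap above changes the tracked value from tokens-per-second to seconds-per-token; assuming the lowest-latency strategy picks the deployment with the smallest stored value, the direction of the metric matters, as this small worked example shows:

```python
# deployment A: 100 completion tokens in 2.0 s; deployment B: 100 tokens in 4.0 s
a_tokens, a_seconds = 100, 2.0
b_tokens, b_seconds = 100, 4.0

old_a, old_b = a_tokens / a_seconds, b_tokens / b_seconds  # 50.0, 25.0 (tokens per second)
new_a, new_b = a_seconds / a_tokens, b_seconds / b_tokens  # 0.02, 0.04 (seconds per token)

# with the old metric, the smallest value belongs to B (25.0 < 50.0), the slower deployment;
# with seconds-per-token, the smallest value belongs to A (0.02 < 0.04), the faster one.
print(min(("A", new_a), ("B", new_b), key=lambda x: x[1])[0])  # A
```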

View file

@ -123,6 +123,10 @@ def test_vertex_ai():
print(response) print(response)
assert type(response.choices[0].message.content) == str assert type(response.choices[0].message.content) == str
assert len(response.choices[0].message.content) > 1 assert len(response.choices[0].message.content) > 1
print(
f"response.choices[0].finish_reason: {response.choices[0].finish_reason}"
)
assert response.choices[0].finish_reason in litellm._openai_finish_reasons
except Exception as e: except Exception as e:
pytest.fail(f"Error occurred: {e}") pytest.fail(f"Error occurred: {e}")

View file

@ -71,7 +71,7 @@ def test_completion_claude():
messages=messages, messages=messages,
request_timeout=10, request_timeout=10,
) )
# Add any assertions here to check the response # Add any assertions here to check response args
print(response) print(response)
print(response.usage) print(response.usage)
print(response.usage.completion_tokens) print(response.usage.completion_tokens)
@ -1545,9 +1545,9 @@ def test_completion_bedrock_titan_null_response():
], ],
) )
# Add any assertions here to check the response # Add any assertions here to check the response
pytest.fail(f"Expected to fail") print(f"response: {response}")
except Exception as e: except Exception as e:
pass pytest.fail(f"An error occurred - {str(e)}")
def test_completion_bedrock_titan(): def test_completion_bedrock_titan():
@ -2093,10 +2093,6 @@ def test_completion_cloudflare():
def test_moderation(): def test_moderation():
import openai
openai.api_type = "azure"
openai.api_version = "GM"
response = litellm.moderation(input="i'm ishaan cto of litellm") response = litellm.moderation(input="i'm ishaan cto of litellm")
print(response) print(response)
output = response.results[0] output = response.results[0]

View file

@ -0,0 +1,65 @@
# What is this?
## Unit test for presidio pii masking
import sys, os, asyncio, time, random
from datetime import datetime
import traceback
from dotenv import load_dotenv
load_dotenv()
import os
sys.path.insert(
0, os.path.abspath("../..")
) # Adds the parent directory to the system path
import pytest
import litellm
from litellm.proxy.hooks.presidio_pii_masking import _OPTIONAL_PresidioPIIMasking
from litellm import Router, mock_completion
from litellm.proxy.utils import ProxyLogging
from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
@pytest.mark.asyncio
async def test_output_parsing():
"""
- have presidio pii masking - mask an input message
- make llm completion call
- have presidio pii masking - output parse message
- assert that no masked tokens remain in the llm response
"""
litellm.output_parse_pii = True
pii_masking = _OPTIONAL_PresidioPIIMasking(mock_testing=True)
initial_message = [
{
"role": "user",
"content": "hello world, my name is Jane Doe. My number is: 034453334",
}
]
filtered_message = [
{
"role": "user",
"content": "hello world, my name is <PERSON>. My number is: <PHONE_NUMBER>",
}
]
pii_masking.pii_tokens = {"<PERSON>": "Jane Doe", "<PHONE_NUMBER>": "034453334"}
response = mock_completion(
model="gpt-3.5-turbo",
messages=filtered_message,
mock_response="Hello <PERSON>! How can I assist you today?",
)
new_response = await pii_masking.async_post_call_success_hook(
user_api_key_dict=UserAPIKeyAuth(), response=response
)
assert (
new_response.choices[0].message.content
== "Hello Jane Doe! How can I assist you today?"
)
# asyncio.run(test_output_parsing())

View file

@ -139,7 +139,7 @@ def test_exception_openai_bad_model(client):
response=response response=response
) )
print("Type of exception=", type(openai_exception)) print("Type of exception=", type(openai_exception))
assert isinstance(openai_exception, openai.NotFoundError) assert isinstance(openai_exception, openai.BadRequestError)
except Exception as e: except Exception as e:
pytest.fail(f"LiteLLM Proxy test failed. Exception {str(e)}") pytest.fail(f"LiteLLM Proxy test failed. Exception {str(e)}")
@ -160,7 +160,6 @@ def test_chat_completion_exception_any_model(client):
response = client.post("/chat/completions", json=test_data) response = client.post("/chat/completions", json=test_data)
json_response = response.json() json_response = response.json()
print("keys in json response", json_response.keys())
assert json_response.keys() == {"error"} assert json_response.keys() == {"error"}
# make an openai client to call _make_status_error_from_response # make an openai client to call _make_status_error_from_response

View file

@ -991,3 +991,23 @@ def test_router_timeout():
print(e) print(e)
print(vars(e)) print(vars(e))
pass pass
@pytest.mark.asyncio
async def test_router_amoderation():
model_list = [
{
"model_name": "openai-moderations",
"litellm_params": {
"model": "text-moderation-stable",
"api_key": os.getenv("OPENAI_API_KEY", None),
},
}
]
router = Router(model_list=model_list)
result = await router.amoderation(
model="openai-moderations", input="this is valid good text"
)
print("moderation result", result)

View file

@ -58,6 +58,18 @@ def my_post_call_rule(input: str):
return {"decision": True} return {"decision": True}
def my_post_call_rule_2(input: str):
input = input.lower()
print(f"input: {input}")
print(f"INSIDE MY POST CALL RULE, len(input) - {len(input)}")
if len(input) < 200 and len(input) > 0:
return {
"decision": False,
"message": "This violates LiteLLM Proxy Rules. Response too short",
}
return {"decision": True}
# test_pre_call_rule() # test_pre_call_rule()
# Test 2: Post-call rule # Test 2: Post-call rule
# commenting out of ci/cd since llm's have variable output which was causing our pipeline to fail erratically. # commenting out of ci/cd since llm's have variable output which was causing our pipeline to fail erratically.
@ -94,3 +106,24 @@ def test_post_call_rule():
# test_post_call_rule() # test_post_call_rule()
def test_post_call_rule_streaming():
try:
litellm.pre_call_rules = []
litellm.post_call_rules = [my_post_call_rule_2]
### completion
response = completion(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": "say sorry"}],
max_tokens=2,
stream=True,
)
for chunk in response:
print(f"chunk: {chunk}")
pytest.fail(f"Completion call should have been failed. ")
except Exception as e:
print("Got exception", e)
print(type(e))
print(vars(e))
assert e.message == "This violates LiteLLM Proxy Rules. Response too short"

View file

@ -738,6 +738,8 @@ class CallTypes(Enum):
text_completion = "text_completion" text_completion = "text_completion"
image_generation = "image_generation" image_generation = "image_generation"
aimage_generation = "aimage_generation" aimage_generation = "aimage_generation"
moderation = "moderation"
amoderation = "amoderation"
# Logging function -> log the exact model details + what's being sent | Non-BlockingP # Logging function -> log the exact model details + what's being sent | Non-BlockingP
@ -2100,6 +2102,11 @@ def client(original_function):
or call_type == CallTypes.aimage_generation.value or call_type == CallTypes.aimage_generation.value
): ):
messages = args[0] if len(args) > 0 else kwargs["prompt"] messages = args[0] if len(args) > 0 else kwargs["prompt"]
elif (
call_type == CallTypes.moderation.value
or call_type == CallTypes.amoderation.value
):
messages = args[1] if len(args) > 1 else kwargs["input"]
elif ( elif (
call_type == CallTypes.atext_completion.value call_type == CallTypes.atext_completion.value
or call_type == CallTypes.text_completion.value or call_type == CallTypes.text_completion.value
@ -7692,6 +7699,7 @@ class CustomStreamWrapper:
self.special_tokens = ["<|assistant|>", "<|system|>", "<|user|>", "<s>", "</s>"] self.special_tokens = ["<|assistant|>", "<|system|>", "<|user|>", "<s>", "</s>"]
self.holding_chunk = "" self.holding_chunk = ""
self.complete_response = "" self.complete_response = ""
self.response_uptil_now = ""
_model_info = ( _model_info = (
self.logging_obj.model_call_details.get("litellm_params", {}).get( self.logging_obj.model_call_details.get("litellm_params", {}).get(
"model_info", {} "model_info", {}
@ -7703,6 +7711,7 @@ class CustomStreamWrapper:
} # returned as x-litellm-model-id response header in proxy } # returned as x-litellm-model-id response header in proxy
self.response_id = None self.response_id = None
self.logging_loop = None self.logging_loop = None
self.rules = Rules()
def __iter__(self): def __iter__(self):
return self return self
@ -8659,7 +8668,7 @@ class CustomStreamWrapper:
chunk = next(self.completion_stream) chunk = next(self.completion_stream)
if chunk is not None and chunk != b"": if chunk is not None and chunk != b"":
print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}") print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
response = self.chunk_creator(chunk=chunk) response: Optional[ModelResponse] = self.chunk_creator(chunk=chunk)
print_verbose(f"PROCESSED CHUNK POST CHUNK CREATOR: {response}") print_verbose(f"PROCESSED CHUNK POST CHUNK CREATOR: {response}")
if response is None: if response is None:
continue continue
@ -8667,7 +8676,12 @@ class CustomStreamWrapper:
threading.Thread( threading.Thread(
target=self.run_success_logging_in_thread, args=(response,) target=self.run_success_logging_in_thread, args=(response,)
).start() # log response ).start() # log response
self.response_uptil_now += (
response.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
# RETURN RESULT # RETURN RESULT
return response return response
except StopIteration: except StopIteration:
@ -8705,7 +8719,9 @@ class CustomStreamWrapper:
# chunk_creator() does logging/stream chunk building. We need to let it know its being called in_async_func, so we don't double add chunks. # chunk_creator() does logging/stream chunk building. We need to let it know its being called in_async_func, so we don't double add chunks.
# __anext__ also calls async_success_handler, which does logging # __anext__ also calls async_success_handler, which does logging
print_verbose(f"PROCESSED ASYNC CHUNK PRE CHUNK CREATOR: {chunk}") print_verbose(f"PROCESSED ASYNC CHUNK PRE CHUNK CREATOR: {chunk}")
processed_chunk = self.chunk_creator(chunk=chunk) processed_chunk: Optional[ModelResponse] = self.chunk_creator(
chunk=chunk
)
print_verbose( print_verbose(
f"PROCESSED ASYNC CHUNK POST CHUNK CREATOR: {processed_chunk}" f"PROCESSED ASYNC CHUNK POST CHUNK CREATOR: {processed_chunk}"
) )
@ -8720,6 +8736,12 @@ class CustomStreamWrapper:
processed_chunk, processed_chunk,
) )
) )
self.response_uptil_now += (
processed_chunk.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
return processed_chunk return processed_chunk
raise StopAsyncIteration raise StopAsyncIteration
else: # temporary patch for non-aiohttp async calls else: # temporary patch for non-aiohttp async calls
@ -8733,7 +8755,9 @@ class CustomStreamWrapper:
chunk = next(self.completion_stream) chunk = next(self.completion_stream)
if chunk is not None and chunk != b"": if chunk is not None and chunk != b"":
print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}") print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
processed_chunk = self.chunk_creator(chunk=chunk) processed_chunk: Optional[ModelResponse] = self.chunk_creator(
chunk=chunk
)
print_verbose( print_verbose(
f"PROCESSED CHUNK POST CHUNK CREATOR: {processed_chunk}" f"PROCESSED CHUNK POST CHUNK CREATOR: {processed_chunk}"
) )
@ -8750,6 +8774,12 @@ class CustomStreamWrapper:
) )
) )
self.response_uptil_now += (
processed_chunk.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
# RETURN RESULT # RETURN RESULT
return processed_chunk return processed_chunk
except StopAsyncIteration: except StopAsyncIteration:
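The `response_uptil_now` accumulation added above runs `post_call_rules` against everything streamed so far on each chunk, so a failing rule can cut a stream short instead of only being applied at the end. A rough sketch of the pattern outside litellm (the rule and generator are illustrative):

```python
def no_banned_words(text: str) -> None:
    # illustrative rule: abort as soon as the accumulated text contains a banned word
    if "banned" in text.lower():
        raise ValueError("response violates post-call rules")


def stream_with_rules(chunks):
    response_uptil_now = ""
    for chunk in chunks:
        response_uptil_now += chunk          # accumulate the content streamed so far
        no_banned_words(response_uptil_now)  # run the rule on every chunk, like post_call_rules
        yield chunk


for piece in stream_with_rules(["this is fine, ", "but this is banned content"]):
    print(piece)  # prints the first chunk, then raises on the second
```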

View file

@ -198,6 +198,33 @@
"litellm_provider": "openai", "litellm_provider": "openai",
"mode": "embedding" "mode": "embedding"
}, },
"text-moderation-stable": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-007": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-latest": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"256-x-256/dall-e-2": { "256-x-256/dall-e-2": {
"mode": "image_generation", "mode": "image_generation",
"input_cost_per_pixel": 0.00000024414, "input_cost_per_pixel": 0.00000024414,

View file

@ -1,6 +1,6 @@
[tool.poetry] [tool.poetry]
name = "litellm" name = "litellm"
version = "1.23.10" version = "1.23.16"
description = "Library to easily interface with LLM API providers" description = "Library to easily interface with LLM API providers"
authors = ["BerriAI"] authors = ["BerriAI"]
license = "MIT" license = "MIT"
@ -69,7 +69,7 @@ requires = ["poetry-core", "wheel"]
build-backend = "poetry.core.masonry.api" build-backend = "poetry.core.masonry.api"
[tool.commitizen] [tool.commitizen]
version = "1.23.10" version = "1.23.16"
version_files = [ version_files = [
"pyproject.toml:^version" "pyproject.toml:^version"
] ]

View file

@ -27,7 +27,7 @@ tiktoken>=0.4.0 # for calculating usage
importlib-metadata>=6.8.0 # for random utils importlib-metadata>=6.8.0 # for random utils
tokenizers==0.14.0 # for calculating usage tokenizers==0.14.0 # for calculating usage
click==8.1.7 # for proxy cli click==8.1.7 # for proxy cli
jinja2==3.1.2 # for prompt templates jinja2==3.1.3 # for prompt templates
certifi>=2023.7.22 # [TODO] clean up certifi>=2023.7.22 # [TODO] clean up
aiohttp==3.9.0 # for network calls aiohttp==3.9.0 # for network calls
aioboto3==12.3.0 # for async sagemaker calls aioboto3==12.3.0 # for async sagemaker calls

View file

@ -88,6 +88,22 @@ async def test_chat_completion():
await chat_completion(session=session, key=key_2) await chat_completion(session=session, key=key_2)
@pytest.mark.asyncio
async def test_chat_completion_old_key():
"""
Production test for backwards compatibility. Tests the db against a pre-generated (old) key
- Use the pre-generated key
- Make a chat completion call
"""
async with aiohttp.ClientSession() as session:
try:
key = "sk-yNXvlRO4SxIGG0XnRMYxTw"
await chat_completion(session=session, key=key)
except Exception as e:
key = "sk-2KV0sAElLQqMpLZXdNf3yw" # try diff db key (in case db url is for the other db)
await chat_completion(session=session, key=key)
async def completion(session, key): async def completion(session, key):
url = "http://0.0.0.0:4000/completions" url = "http://0.0.0.0:4000/completions"
headers = { headers = {

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[165],{83155:function(e,t,n){(window.__NEXT_P=window.__NEXT_P||[]).push(["/_not-found",function(){return n(84032)}])},84032:function(e,t,n){"use strict";Object.defineProperty(t,"__esModule",{value:!0}),Object.defineProperty(t,"default",{enumerable:!0,get:function(){return i}}),n(86921);let o=n(3827);n(64090);let r={error:{fontFamily:'system-ui,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji"',height:"100vh",textAlign:"center",display:"flex",flexDirection:"column",alignItems:"center",justifyContent:"center"},desc:{display:"inline-block"},h1:{display:"inline-block",margin:"0 20px 0 0",padding:"0 23px 0 0",fontSize:24,fontWeight:500,verticalAlign:"top",lineHeight:"49px"},h2:{fontSize:14,fontWeight:400,lineHeight:"49px",margin:0}};function i(){return(0,o.jsxs)(o.Fragment,{children:[(0,o.jsx)("title",{children:"404: This page could not be found."}),(0,o.jsx)("div",{style:r.error,children:(0,o.jsxs)("div",{children:[(0,o.jsx)("style",{dangerouslySetInnerHTML:{__html:"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}),(0,o.jsx)("h1",{className:"next-error-h1",style:r.h1,children:"404"}),(0,o.jsx)("div",{style:r.desc,children:(0,o.jsx)("h2",{style:r.h2,children:"This page could not be found."})})]})})]})}("function"==typeof t.default||"object"==typeof t.default&&null!==t.default)&&void 0===t.default.__esModule&&(Object.defineProperty(t.default,"__esModule",{value:!0}),Object.assign(t.default,t),e.exports=t.default)}},function(e){e.O(0,[971,69,744],function(){return e(e.s=83155)}),_N_E=e.O()}]);

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{87421:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=87421)}),_N_E=n.O()}]);

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{32028:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(32028)}),_N_E=e.O()}]);

View file

@ -0,0 +1 @@
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/c18941d97fb7245b.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
self.__BUILD_MANIFEST={__rewrites:{afterFiles:[],beforeFiles:[],fallback:[]},"/_error":["static/chunks/pages/_error-d6107f1aac0c574c.js"],sortedPages:["/_app","/_error"]},self.__BUILD_MANIFEST_CB&&self.__BUILD_MANIFEST_CB();

View file

@ -0,0 +1 @@
self.__SSG_MANIFEST=new Set([]);self.__SSG_MANIFEST_CB&&self.__SSG_MANIFEST_CB()

View file

@ -1 +1 @@
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-85c9b4219c1bb384.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-9b4acf26920649bc.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-096338c8e1915716.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/654259bbf9e4c196.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[75985,[\"838\",\"static/chunks/838-7fa0bab5a1c3631d.js\",\"931\",\"static/chunks/app/page-5a7453e3903c5d60.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/654259bbf9e4c196.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"4mrMigZY9ob7yaIDjXpX6\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html> <!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-a85b2c176012d8e5.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-e1b183dda365ec86.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-9b4fb13a7db53edf.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin="" 
async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/c18941d97fb7245b.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[48016,[\"145\",\"static/chunks/145-9c160ad5539e000f.js\",\"931\",\"static/chunks/app/page-fcb69349f15d154b.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/c18941d97fb7245b.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"lLFQRQnIrRo-GJf5spHEd\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin 
UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[75985,["838","static/chunks/838-7fa0bab5a1c3631d.js","931","static/chunks/app/page-5a7453e3903c5d60.js"],""] 3:I[48016,["145","static/chunks/145-9c160ad5539e000f.js","931","static/chunks/app/page-fcb69349f15d154b.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["4mrMigZY9ob7yaIDjXpX6",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/654259bbf9e4c196.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 0:["lLFQRQnIrRo-GJf5spHEd",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 
0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/c18941d97fb7245b.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

File diff suppressed because it is too large

View file

@ -19,14 +19,18 @@
"jsonwebtoken": "^9.0.2", "jsonwebtoken": "^9.0.2",
"jwt-decode": "^4.0.0", "jwt-decode": "^4.0.0",
"next": "14.1.0", "next": "14.1.0",
"openai": "^4.28.0",
"react": "^18", "react": "^18",
"react-dom": "^18" "react-dom": "^18",
"react-markdown": "^9.0.1",
"react-syntax-highlighter": "^15.5.0"
}, },
"devDependencies": { "devDependencies": {
"@tailwindcss/forms": "^0.5.7", "@tailwindcss/forms": "^0.5.7",
"@types/node": "^20", "@types/node": "^20",
"@types/react": "18.2.48", "@types/react": "18.2.48",
"@types/react-dom": "^18", "@types/react-dom": "^18",
"@types/react-syntax-highlighter": "^15.5.11",
"autoprefixer": "^10.4.17", "autoprefixer": "^10.4.17",
"eslint": "^8", "eslint": "^8",
"eslint-config-next": "14.1.0", "eslint-config-next": "14.1.0",

View file

@ -3,6 +3,8 @@ import React, { Suspense, useEffect, useState } from "react";
import { useSearchParams } from "next/navigation"; import { useSearchParams } from "next/navigation";
import Navbar from "../components/navbar"; import Navbar from "../components/navbar";
import UserDashboard from "../components/user_dashboard"; import UserDashboard from "../components/user_dashboard";
import ModelDashboard from "@/components/model_dashboard";
import ChatUI from "@/components/chat_ui";
import Sidebar from "../components/leftnav"; import Sidebar from "../components/leftnav";
import Usage from "../components/usage"; import Usage from "../components/usage";
import { jwtDecode } from "jwt-decode"; import { jwtDecode } from "jwt-decode";
@ -80,7 +82,22 @@ const CreateKeyPage = () => {
userEmail={userEmail} userEmail={userEmail}
setUserEmail={setUserEmail} setUserEmail={setUserEmail}
/> />
) : ( ) : page == "models" ? (
<ModelDashboard
userID={userID}
userRole={userRole}
token={token}
accessToken={accessToken}
/>
) : page == "llm-playground" ? (
<ChatUI
userID={userID}
userRole={userRole}
token={token}
accessToken={accessToken}
/>
)
: (
<Usage <Usage
userID={userID} userID={userID}
userRole={userRole} userRole={userRole}

View file

@ -0,0 +1,301 @@
import React, { useState, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import { Card, Title, Table, TableHead, TableRow, TableCell, TableBody, Grid, Tab,
TabGroup,
TabList,
TabPanel,
Metric,
Select,
SelectItem,
TabPanels, } from "@tremor/react";
import { modelInfoCall } from "./networking";
import openai from "openai";
import { Prism as SyntaxHighlighter } from 'react-syntax-highlighter';
interface ChatUIProps {
accessToken: string | null;
token: string | null;
userRole: string | null;
userID: string | null;
}
async function generateModelResponse(inputMessage: string, updateUI: (chunk: string) => void, selectedModel: string, accessToken: string) {
const client = new openai.OpenAI({
apiKey: accessToken, // the temporary LiteLLM proxy key for the logged-in session
baseURL: 'http://0.0.0.0:4000', // LiteLLM proxy base URL
dangerouslyAllowBrowser: true, // using a temporary litellm proxy key
});
const response = await client.chat.completions.create({
model: selectedModel,
stream: true,
messages: [
{
role: 'user',
content: inputMessage,
},
],
});
for await (const chunk of response) {
console.log(chunk);
if (chunk.choices[0].delta.content) {
updateUI(chunk.choices[0].delta.content);
}
}
}
const ChatUI: React.FC<ChatUIProps> = ({ accessToken, token, userRole, userID }) => {
const [inputMessage, setInputMessage] = useState("");
const [chatHistory, setChatHistory] = useState<any[]>([]);
const [selectedModel, setSelectedModel] = useState<string | undefined>(undefined);
const [modelInfo, setModelInfo] = useState<any | null>(null); // Declare modelInfo at the component level
useEffect(() => {
if (!accessToken || !token || !userRole || !userID) {
return;
}
// Fetch model info and set the default selected model
const fetchModelInfo = async () => {
const fetchedModelInfo = await modelInfoCall(accessToken, userID, userRole);
console.log("model_info:", fetchedModelInfo);
if (fetchedModelInfo?.data.length > 0) {
setModelInfo(fetchedModelInfo);
setSelectedModel(fetchedModelInfo.data[0].model_name);
}
};
fetchModelInfo();
}, [accessToken, userID, userRole]);
const updateUI = (role: string, chunk: string) => {
setChatHistory((prevHistory) => {
const lastMessage = prevHistory[prevHistory.length - 1];
if (lastMessage && lastMessage.role === role) {
return [
...prevHistory.slice(0, prevHistory.length - 1),
{ role, content: lastMessage.content + chunk },
];
} else {
return [...prevHistory, { role, content: chunk }];
}
});
};
const handleSendMessage = async () => {
if (inputMessage.trim() === "") return;
if (!accessToken || !token || !userRole || !userID) {
return;
}
setChatHistory((prevHistory) => [
...prevHistory,
{ role: "user", content: inputMessage },
]);
try {
if (selectedModel) {
await generateModelResponse(inputMessage, (chunk) => updateUI("assistant", chunk), selectedModel, accessToken);
}
} catch (error) {
console.error("Error fetching model response", error);
updateUI("assistant", "Error fetching model response");
}
setInputMessage("");
};
return (
<div style={{ width: "100%", position: "relative" }}>
<Grid className="gap-2 p-10 h-[75vh] w-full">
<Card>
<TabGroup>
<TabList className="mt-4">
<Tab>Chat</Tab>
<Tab>API Reference</Tab>
</TabList>
<TabPanels>
<TabPanel>
<div>
<label>Select Model:</label>
<select
value={selectedModel || ""}
onChange={(e) => setSelectedModel(e.target.value)}
>
{/* Populate dropdown options from available models */}
{modelInfo?.data.map((element: { model_name: string }) => (
<option key={element.model_name} value={element.model_name}>
{element.model_name}
</option>
))}
</select>
</div>
<Table className="mt-5" style={{ display: "block", maxHeight: "60vh", overflowY: "auto" }}>
<TableHead>
<TableRow>
<TableCell>
<Title>Chat</Title>
</TableCell>
</TableRow>
</TableHead>
<TableBody>
{chatHistory.map((message, index) => (
<TableRow key={index}>
<TableCell>{`${message.role}: ${message.content}`}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
<div className="mt-3" style={{ position: "absolute", bottom: 5, width: "95%" }}>
<div className="flex">
<input
type="text"
value={inputMessage}
onChange={(e) => setInputMessage(e.target.value)}
className="flex-1 p-2 border rounded-md mr-2"
placeholder="Type your message..."
/>
<button onClick={handleSendMessage} className="p-2 bg-blue-500 text-white rounded-md">
Send
</button>
</div>
</div>
</TabPanel>
<TabPanel>
<TabGroup>
<TabList>
<Tab>OpenAI Python SDK</Tab>
<Tab>LlamaIndex</Tab>
<Tab>Langchain Py</Tab>
</TabList>
<TabPanels>
<TabPanel>
<SyntaxHighlighter language="python">
{`
import openai
client = openai.OpenAI(
api_key="your_api_key",
base_url="http://0.0.0.0:4000" # proxy base url
)
response = client.chat.completions.create(
model="gpt-3.5-turbo", # model to use from Models Tab
messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
extra_body={
"metadata": {
"generation_name": "ishaan-generation-openai-client",
"generation_id": "openai-client-gen-id22",
"trace_id": "openai-client-trace-id22",
"trace_user_id": "openai-client-user-id2"
}
}
)
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
<TabPanel>
<SyntaxHighlighter language="python">
{`
import os, dotenv
from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
llm = AzureOpenAI(
engine="azure-gpt-3.5", # model_name on litellm proxy
temperature=0.0,
azure_endpoint="http://0.0.0.0:4000", # litellm proxy endpoint
api_key="sk-1234", # litellm proxy API Key
api_version="2023-07-01-preview",
)
embed_model = AzureOpenAIEmbedding(
deployment_name="azure-embedding-model",
azure_endpoint="http://0.0.0.0:4000",
api_key="sk-1234",
api_version="2023-07-01-preview",
)
documents = SimpleDirectoryReader("llama_index_data").load_data()
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
<TabPanel>
<SyntaxHighlighter language="python">
{`
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
chat = ChatOpenAI(
openai_api_base="http://0.0.0.0:4000",
model = "gpt-3.5-turbo",
temperature=0.1,
extra_body={
"metadata": {
"generation_name": "ishaan-generation-langchain-client",
"generation_id": "langchain-client-gen-id22",
"trace_id": "langchain-client-trace-id22",
"trace_user_id": "langchain-client-user-id2"
}
}
)
messages = [
SystemMessage(
content="You are a helpful assistant that im using to make a test request to."
),
HumanMessage(
content="test from litellm. tell me why it's amazing in 1 sentence"
),
]
response = chat(messages)
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
</TabPanels>
</TabGroup>
</TabPanel>
</TabPanels>
</TabGroup>
</Card>
</Grid>
</div>
);
};
export default ChatUI;
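
The `generateModelResponse` helper above is the core of the playground: it points the OpenAI JS client at the proxy and streams deltas back into the chat history. Below is a minimal standalone sketch of that same call, assuming a proxy is reachable at `http://0.0.0.0:4000` and that `sk-1234` is a valid proxy key; the model name is also just an example.

```typescript
import OpenAI from "openai";

// Assumed values: the proxy URL, key, and model name below are placeholders.
const client = new OpenAI({
  apiKey: "sk-1234",              // hypothetical proxy key
  baseURL: "http://0.0.0.0:4000", // LiteLLM proxy base URL
});

async function streamOnce(model: string, prompt: string): Promise<void> {
  // stream: true returns an async iterable of chunks (openai >= 4.x)
  const stream = await client.chat.completions.create({
    model,
    stream: true,
    messages: [{ role: "user", content: prompt }],
  });
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) process.stdout.write(delta);
  }
  process.stdout.write("\n");
}

streamOnce("gpt-3.5-turbo", "say hi in one short sentence").catch(console.error);
```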

View file

@ -14,6 +14,7 @@ interface CreateKeyProps {
userRole: string | null; userRole: string | null;
accessToken: string; accessToken: string;
data: any[] | null; data: any[] | null;
userModels: string[];
setData: React.Dispatch<React.SetStateAction<any[] | null>>; setData: React.Dispatch<React.SetStateAction<any[] | null>>;
} }
@ -22,6 +23,7 @@ const CreateKey: React.FC<CreateKeyProps> = ({
userRole, userRole,
accessToken, accessToken,
data, data,
userModels,
setData, setData,
}) => { }) => {
const [form] = Form.useForm(); const [form] = Form.useForm();
@ -42,20 +44,13 @@ const CreateKey: React.FC<CreateKeyProps> = ({
const handleCreate = async (formValues: Record<string, any>) => { const handleCreate = async (formValues: Record<string, any>) => {
try { try {
message.info("Making API Call"); message.info("Making API Call");
// Check if "models" exists and is not an empty string
if (formValues.models && formValues.models.trim() !== '') {
// Format the "models" field as an array
formValues.models = formValues.models.split(',').map((model: string) => model.trim());
} else {
// If "models" is undefined or an empty string, set it to an empty array
formValues.models = [];
}
setIsModalVisible(true); setIsModalVisible(true);
const response = await keyCreateCall(accessToken, userID, formValues); const response = await keyCreateCall(accessToken, userID, formValues);
setData((prevData) => (prevData ? [...prevData, response] : [response])); // Check if prevData is null setData((prevData) => (prevData ? [...prevData, response] : [response])); // Check if prevData is null
setApiKey(response["key"]); setApiKey(response["key"]);
message.success("API Key Created"); message.success("API Key Created");
form.resetFields(); form.resetFields();
localStorage.removeItem("userData" + userID)
} catch (error) { } catch (error) {
console.error("Error creating the key:", error); console.error("Error creating the key:", error);
} }
@ -90,13 +85,22 @@ const CreateKey: React.FC<CreateKeyProps> = ({
> >
<Input placeholder="ai_team" /> <Input placeholder="ai_team" />
</Form.Item> </Form.Item>
<Form.Item <Form.Item
label="Models (Comma Separated). Eg: gpt-3.5-turbo,gpt-4" label="Models"
name="models" name="models"
>
<Select
mode="multiple"
placeholder="Select models"
style={{ width: '100%' }}
> >
<Input placeholder="gpt-4,gpt-3.5-turbo" /> {userModels.map((model) => (
</Form.Item> <Option key={model} value={model}>
{model}
</Option>
))}
</Select>
</Form.Item>
<Form.Item <Form.Item

View file

@ -20,7 +20,13 @@ const Sidebar: React.FC<SidebarProps> = ({ setPage }) => {
<Menu.Item key="1" onClick={() => setPage("api-keys")}> <Menu.Item key="1" onClick={() => setPage("api-keys")}>
API Keys API Keys
</Menu.Item> </Menu.Item>
<Menu.Item key="2" onClick={() => setPage("usage")}> <Menu.Item key="2" onClick={() => setPage("models")}>
Models
</Menu.Item>
<Menu.Item key="3" onClick={() => setPage("llm-playground")}>
Chat UI
</Menu.Item>
<Menu.Item key="4" onClick={() => setPage("usage")}>
Usage Usage
</Menu.Item> </Menu.Item>
</Menu> </Menu>

View file

@ -0,0 +1,124 @@
import React, { useState, useEffect } from "react";
import { Card, Title, Subtitle, Table, TableHead, TableRow, TableCell, TableBody, Metric, Grid } from "@tremor/react";
import { modelInfoCall } from "./networking";
interface ModelDashboardProps {
accessToken: string | null;
token: string | null;
userRole: string | null;
userID: string | null;
}
const ModelDashboard: React.FC<ModelDashboardProps> = ({
accessToken,
token,
userRole,
userID,
}) => {
const [modelData, setModelData] = useState<any>({ data: [] });
useEffect(() => {
if (!accessToken || !token || !userRole || !userID) {
return;
}
const fetchData = async () => {
try {
// Replace with your actual API call for model data
const modelDataResponse = await modelInfoCall(accessToken, userID, userRole);
console.log("Model data response:", modelDataResponse.data);
setModelData(modelDataResponse);
} catch (error) {
console.error("There was an error fetching the model data", error);
}
};
if (accessToken && token && userRole && userID) {
fetchData();
}
}, [accessToken, token, userRole, userID]);
if (!modelData) {
return <div>Loading...</div>;
}
// loop through model data and edit each row
for (let i = 0; i < modelData.data.length; i++) {
let curr_model = modelData.data[i];
let litellm_model_name = curr_model?.litellm_params?.model;
let model_info = curr_model?.model_info;
let defaultProvider = "openai";
let provider = "";
let input_cost = "Undefined"
let output_cost = "Undefined"
let max_tokens = "Undefined"
// Check if litellm_model_name is null or undefined
if (litellm_model_name) {
// Split litellm_model_name based on "/"
let splitModel = litellm_model_name.split("/");
// Get the first element in the split
let firstElement = splitModel[0];
// If there is only one element, default provider to openai
provider = splitModel.length === 1 ? defaultProvider : firstElement;
console.log("Provider:", provider);
} else {
// litellm_model_name is null or undefined, default provider to openai
provider = defaultProvider;
console.log("Provider:", provider);
}
if (model_info) {
input_cost = model_info?.input_cost_per_token;
output_cost = model_info?.output_cost_per_token;
max_tokens = model_info?.max_tokens;
}
modelData.data[i].provider = provider
modelData.data[i].input_cost = input_cost
modelData.data[i].output_cost = output_cost
modelData.data[i].max_tokens = max_tokens
}
return (
<div style={{ width: "100%" }}>
<Grid className="gap-2 p-10 h-[75vh] w-full">
<Card>
<Table className="mt-5">
<TableHead>
<TableRow>
<TableCell><Title>Model Name </Title></TableCell>
<TableCell><Title>Provider</Title></TableCell>
<TableCell><Title>Input Price per token ($)</Title></TableCell>
<TableCell><Title>Output Price per token ($)</Title></TableCell>
<TableCell><Title>Max Tokens</Title></TableCell>
</TableRow>
</TableHead>
<TableBody>
{modelData.data.map((model: any) => (
<TableRow key={model.model_name}>
<TableCell><Title>{model.model_name}</Title></TableCell>
<TableCell>{model.provider}</TableCell>
<TableCell>{model.input_cost}</TableCell>
<TableCell>{model.output_cost}</TableCell>
<TableCell>{model.max_tokens}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</Card>
</Grid>
</div>
);
};
export default ModelDashboard;
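
The "Provider" column in the table above is derived by splitting `litellm_params.model` on `/` and falling back to `openai` when there is no prefix. A small pure-function sketch of that rule follows; the sample model names are illustrative only.

```typescript
// Provider-derivation rule used for the "Provider" column, as a pure function.
function providerFromModel(litellmModelName?: string | null): string {
  const defaultProvider = "openai";
  if (!litellmModelName) return defaultProvider; // null/undefined model name falls back to openai
  const parts = litellmModelName.split("/");
  // "azure/chatgpt-v-2" -> "azure"; a bare model name has no prefix, so default to openai
  return parts.length === 1 ? defaultProvider : parts[0];
}

console.log(providerFromModel("azure/chatgpt-v-2")); // "azure"
console.log(providerFromModel("gpt-3.5-turbo"));     // "openai"
console.log(providerFromModel(undefined));           // "openai"
```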

View file

@ -137,6 +137,41 @@ export const userInfoCall = async (
} }
}; };
export const modelInfoCall = async (
accessToken: String,
userID: String,
userRole: String
) => {
try {
let url = proxyBaseUrl ? `${proxyBaseUrl}/model/info` : `/model/info`;
message.info("Requesting model data");
const response = await fetch(url, {
method: "GET",
headers: {
Authorization: `Bearer ${accessToken}`,
"Content-Type": "application/json",
},
});
if (!response.ok) {
const errorData = await response.text();
message.error(errorData);
throw new Error("Network response was not ok");
}
const data = await response.json();
message.info("Received model data");
return data;
// Handle success - the caller can update state or UI with the returned model info
} catch (error) {
console.error("Failed to fetch model info:", error);
throw error;
}
};
export const keySpendLogsCall = async (accessToken: String, token: String) => { export const keySpendLogsCall = async (accessToken: String, token: String) => {
try { try {
const url = proxyBaseUrl ? `${proxyBaseUrl}/spend/logs` : `/spend/logs`; const url = proxyBaseUrl ? `${proxyBaseUrl}/spend/logs` : `/spend/logs`;

View file

@ -1,6 +1,6 @@
"use client"; "use client";
import React, { useState, useEffect } from "react"; import React, { useState, useEffect } from "react";
import { userInfoCall } from "./networking"; import { userInfoCall, modelInfoCall } from "./networking";
import { Grid, Col, Card, Text } from "@tremor/react"; import { Grid, Col, Card, Text } from "@tremor/react";
import CreateKey from "./create_key_button"; import CreateKey from "./create_key_button";
import ViewKeyTable from "./view_key_table"; import ViewKeyTable from "./view_key_table";
@ -47,6 +47,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
const token = searchParams.get("token"); const token = searchParams.get("token");
const [accessToken, setAccessToken] = useState<string | null>(null); const [accessToken, setAccessToken] = useState<string | null>(null);
const [userModels, setUserModels] = useState<string[]>([]);
function formatUserRole(userRole: string) { function formatUserRole(userRole: string) {
if (!userRole) { if (!userRole) {
@ -96,22 +97,39 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
} }
} }
if (userID && accessToken && userRole && !data) { if (userID && accessToken && userRole && !data) {
const cachedData = localStorage.getItem("userData"); const cachedData = localStorage.getItem("userData" + userID);
const cachedSpendData = localStorage.getItem("userSpendData"); const cachedSpendData = localStorage.getItem("userSpendData" + userID);
if (cachedData && cachedSpendData) { const cachedUserModels = localStorage.getItem("userModels" + userID);
if (cachedData && cachedSpendData && cachedUserModels) {
setData(JSON.parse(cachedData)); setData(JSON.parse(cachedData));
setUserSpendData(JSON.parse(cachedSpendData)); setUserSpendData(JSON.parse(cachedSpendData));
setUserModels(JSON.parse(cachedUserModels));
} else { } else {
const fetchData = async () => { const fetchData = async () => {
try { try {
const response = await userInfoCall(accessToken, userID, userRole); const response = await userInfoCall(accessToken, userID, userRole);
setUserSpendData(response["user_info"]); setUserSpendData(response["user_info"]);
setData(response["keys"]); // Assuming this is the correct path to your data setData(response["keys"]); // Assuming this is the correct path to your data
localStorage.setItem("userData", JSON.stringify(response["keys"])); localStorage.setItem("userData" + userID, JSON.stringify(response["keys"]));
localStorage.setItem( localStorage.setItem(
"userSpendData", "userSpendData" + userID,
JSON.stringify(response["user_info"]) JSON.stringify(response["user_info"])
); );
const model_info = await modelInfoCall(accessToken, userID, userRole);
console.log("model_info:", model_info);
// loop through model_info["data"] and create an array of element.model_name
let available_model_names = model_info["data"].map((element: { model_name: string; }) => element.model_name);
console.log("available_model_names:", available_model_names);
setUserModels(available_model_names);
console.log("userModels:", userModels);
localStorage.setItem("userModels" + userID, JSON.stringify(available_model_names));
} catch (error) { } catch (error) {
console.error("There was an error fetching the data", error); console.error("There was an error fetching the data", error);
// Optionally, update your UI to reflect the error state here as well // Optionally, update your UI to reflect the error state here as well
@ -158,6 +176,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
<CreateKey <CreateKey
userID={userID} userID={userID}
userRole={userRole} userRole={userRole}
userModels={userModels}
accessToken={accessToken} accessToken={accessToken}
data={data} data={data}
setData={setData} setData={setData}
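
The dashboard changes above switch the localStorage cache to per-user keys (`"userData" + userID`, `"userSpendData" + userID`, `"userModels" + userID`) and clear the entry when keys are created or deleted. A minimal sketch of that pattern as reusable helpers; the helper names are illustrative, not part of the commit.

```typescript
// Per-user localStorage cache helpers; keys are suffixed with userID so users
// sharing a browser don't read each other's cached data.
function readUserCache<T>(name: string, userID: string): T | null {
  const raw = localStorage.getItem(name + userID);
  return raw ? (JSON.parse(raw) as T) : null;
}

function writeUserCache(name: string, userID: string, value: unknown): void {
  localStorage.setItem(name + userID, JSON.stringify(value));
}

// Called after key create/delete so the next dashboard load refetches fresh data.
function invalidateUserCache(name: string, userID: string): void {
  localStorage.removeItem(name + userID);
}

// usage, mirroring user_dashboard.tsx:
// const cachedKeys = readUserCache<any[]>("userData", userID);
// writeUserCache("userData", userID, keys);
// invalidateUserCache("userData", userID);
```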

View file

@ -43,6 +43,7 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
// Set the key to delete and open the confirmation modal // Set the key to delete and open the confirmation modal
setKeyToDelete(token); setKeyToDelete(token);
localStorage.removeItem("userData" + userID)
setIsDeleteModalOpen(true); setIsDeleteModalOpen(true);
}; };