Merge branch 'main' into litellm_aioboto3_sagemaker

Krish Dholakia, 2024-02-14 21:46:58 -08:00, committed by GitHub (commit 57654f4533, GPG key ID: B5690EEEBB952194)
79 changed files with 3440 additions and 253 deletions

@@ -34,6 +34,8 @@ LiteLLM manages:
[**Jump to OpenAI Proxy Docs**](https://github.com/BerriAI/litellm?tab=readme-ov-file#openai-proxy---docs) <br>
[**Jump to Supported LLM Providers**](https://github.com/BerriAI/litellm?tab=readme-ov-file#supported-provider-docs)
Support for more providers. Missing a provider or LLM Platform, raise a [feature request](https://github.com/BerriAI/litellm/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml&title=%5BFeature%5D%3A+).
# Usage ([**Docs**](https://docs.litellm.ai/docs/))
> [!IMPORTANT]
> LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide [here](https://docs.litellm.ai/docs/migration)

@@ -1,5 +1,10 @@
# Custom Callbacks

:::info
**For PROXY** [Go Here](../proxy/logging.md#custom-callback-class-async)
:::

## Callback Class
You can create a custom callback class to precisely log events as they occur in litellm.
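For context, a minimal sketch of such a callback class (illustrative only, not part of this diff; hook names follow the `CustomLogger` base class from `litellm.integrations.custom_logger` shown further down in this commit):

```python
# Hypothetical sketch of a custom callback class; everything here is illustrative.
import litellm
from litellm.integrations.custom_logger import CustomLogger


class MyCustomHandler(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        # runs after every successful call routed through litellm
        print(f"success: model={kwargs.get('model')} took {end_time - start_time}")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        # runs after a failed call
        print(f"failure: model={kwargs.get('model')}")


# register the handler globally
litellm.callbacks = [MyCustomHandler()]
```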

@@ -79,6 +79,23 @@ model_list:
      mode: embedding # 👈 ADD THIS
```

### Image Generation Models
We need some way to know if a model is an image generation model when running checks. If you specify `mode: image_generation` for the model in your config, the health check for it becomes an image generation health check.
```yaml
model_list:
  - model_name: dall-e-3
    litellm_params:
      model: azure/dall-e-3
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
    model_info:
      mode: image_generation # 👈 ADD THIS
```
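Once the proxy is running with this entry, the health check can be exercised directly. A rough sketch (the local URL, port, and master key below are assumptions, not part of this diff):

```python
# Hypothetical sketch: hit the proxy /health endpoint after adding `mode: image_generation`.
import requests

resp = requests.get(
    "http://0.0.0.0:8000/health",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder master key
)
print(resp.json())  # per-deployment health results, including the dall-e-3 entry
```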
### Text Completion Models
We need some way to know if a model is a text completion model when running checks. If you specify the model's mode in your config, the health check for it becomes a text completion health check.

@@ -4,21 +4,24 @@ import Image from '@theme/IdealImage';
LiteLLM supports [Microsoft Presidio](https://github.com/microsoft/presidio/) for PII masking.

-## Step 1. Add env
+## Quick Start
+### Step 1. Add env

```bash
export PRESIDIO_ANALYZER_API_BASE="http://localhost:5002"
export PRESIDIO_ANONYMIZER_API_BASE="http://localhost:5001"
```

-## Step 2. Set it as a callback in config.yaml
+### Step 2. Set it as a callback in config.yaml

```yaml
litellm_settings:
-  litellm.callbacks = ["presidio"]
+  callbacks = ["presidio", ...] # e.g. ["presidio", custom_callbacks.proxy_handler_instance]
```

-## Start proxy
+### Step 3. Start proxy

```
litellm --config /path/to/config.yaml

@@ -28,3 +31,27 @@ litellm --config /path/to/config.yaml
This will mask the input going to the llm provider

<Image img={require('../../img/presidio_screenshot.png')} />
## Output parsing
LLM responses can sometimes contain the masked tokens.
For presidio 'replace' operations, LiteLLM can check the LLM response and replace the masked token with the user-submitted values.
Just set `litellm.output_parse_pii = True` to enable this.

```yaml
litellm_settings:
  output_parse_pii: true
```

**Expected Flow:**
1. User Input: "hello world, my name is Jane Doe. My number is: 034453334"
2. LLM Input: "hello world, my name is [PERSON]. My number is: [PHONE_NUMBER]"
3. LLM Response: "Hey [PERSON], nice to meet you!"
4. User Response: "Hey Jane Doe, nice to meet you!"
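For illustration, the swap-back in step 4 is just a string replacement over the stored token-to-original mapping (a simplified sketch of the idea, not the proxy's actual code path; the values come from the example flow above):

```python
# Simplified illustration of step 4: masked tokens in the LLM response are
# swapped back to the user-submitted values.
pii_tokens = {"[PERSON]": "Jane Doe", "[PHONE_NUMBER]": "034453334"}

llm_response = "Hey [PERSON], nice to meet you!"
for token, original in pii_tokens.items():
    llm_response = llm_response.replace(token, original)

print(llm_response)  # -> "Hey Jane Doe, nice to meet you!"
```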

@@ -370,12 +370,12 @@ See the latest available ghcr docker image here:
https://github.com/berriai/litellm/pkgs/container/litellm

```shell
-docker pull ghcr.io/berriai/litellm:main-v1.16.13
+docker pull ghcr.io/berriai/litellm:main-latest
```

### Run the Docker Image

```shell
-docker run ghcr.io/berriai/litellm:main-v1.16.13
+docker run ghcr.io/berriai/litellm:main-latest
```

#### Run the Docker Image with LiteLLM CLI args

@@ -384,12 +384,12 @@ See all supported CLI args [here](https://docs.litellm.ai/docs/proxy/cli):
Here's how you can run the docker image and pass your config to `litellm`

```shell
-docker run ghcr.io/berriai/litellm:main-v1.16.13 --config your_config.yaml
+docker run ghcr.io/berriai/litellm:main-latest --config your_config.yaml
```

Here's how you can run the docker image and start litellm on port 8002 with `num_workers=8`

```shell
-docker run ghcr.io/berriai/litellm:main-v1.16.13 --port 8002 --num_workers 8
+docker run ghcr.io/berriai/litellm:main-latest --port 8002 --num_workers 8
```

#### Run the Docker Image using docker compose
#### Run the Docker Image using docker compose #### Run the Docker Image using docker compose

@@ -37,12 +37,12 @@ http://0.0.0.0:8000/ui # <proxy_base_url>/ui
```

-## Get Admin UI Link on Swagger
+### 3. Get Admin UI Link on Swagger

Your Proxy Swagger is available on the root of the Proxy: e.g.: `http://localhost:4000/`

<Image img={require('../../img/ui_link.png')} />

-## Change default username + password
+### 4. Change default username + password

Set the following in your .env on the Proxy

@@ -111,6 +111,29 @@ MICROSOFT_TENANT="5a39737
</TabItem>
<TabItem value="Generic" label="Generic SSO Provider">
A generic OAuth client that can be used to quickly create support for any OAuth provider with close to no code
**Required .env variables on your Proxy**
```shell
GENERIC_CLIENT_ID = "******"
GENERIC_CLIENT_SECRET = "G*******"
GENERIC_AUTHORIZATION_ENDPOINT = "http://localhost:9090/auth"
GENERIC_TOKEN_ENDPOINT = "http://localhost:9090/token"
GENERIC_USERINFO_ENDPOINT = "http://localhost:9090/me"
```
- Set Redirect URI, if your provider requires it
- Set a redirect url = `<your proxy base url>/sso/callback`
```shell
http://localhost:4000/sso/callback
```
</TabItem>
</Tabs>

### Step 3. Test flow

@@ -197,7 +197,7 @@ from openai import OpenAI
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")

-response = openai.embeddings.create(
+response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)

@@ -281,6 +281,84 @@ print(query_result[:5])
```
## `/moderations`
### Request Format
Input, Output and Exceptions are mapped to the OpenAI format for all supported models
<Tabs>
<TabItem value="openai" label="OpenAI Python v1.0.0+">
```python
import openai
from openai import OpenAI
# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")
response = client.moderations.create(
    input="hello from litellm",
    model="text-moderation-stable"
)
print(response)
```
</TabItem>
<TabItem value="Curl" label="Curl Request">
```shell
curl --location 'http://0.0.0.0:8000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
</TabItem>
</Tabs>
### Response Format
```json
{
"id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
"model": "text-moderation-007",
"results": [
{
"categories": {
"harassment": false,
"harassment/threatening": false,
"hate": false,
"hate/threatening": false,
"self-harm": false,
"self-harm/instructions": false,
"self-harm/intent": false,
"sexual": false,
"sexual/minors": false,
"violence": false,
"violence/graphic": false
},
"category_scores": {
"harassment": 0.000019947197870351374,
"harassment/threatening": 5.5971017900446896e-6,
"hate": 0.000028560316422954202,
"hate/threatening": 2.2631787999216613e-8,
"self-harm": 2.9121162015144364e-7,
"self-harm/instructions": 9.314219084899378e-8,
"self-harm/intent": 8.093739012338119e-8,
"sexual": 0.00004414955765241757,
"sexual/minors": 0.0000156943697220413,
"violence": 0.00022354527027346194,
"violence/graphic": 8.804164281173144e-6
},
"flagged": false
}
]
}
```
## Advanced

@@ -696,7 +696,9 @@ general_settings:
    "region_name": "us-west-2"
    "user_table_name": "your-user-table",
    "key_table_name": "your-token-table",
-    "config_table_name": "your-config-table"
+    "config_table_name": "your-config-table",
+    "aws_role_name": "your-aws_role_name",
+    "aws_session_name": "your-aws_session_name",
  }
```

@@ -67,6 +67,7 @@ max_budget: float = 0.0 # set the max budget across all providers
budget_duration: Optional[
    str
] = None # proxy only - resets budget after fixed duration. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
_openai_finish_reasons = ["stop", "length", "function_call", "content_filter", "null"]
_openai_completion_params = [
    "functions",
    "function_call",

@@ -164,6 +165,8 @@ secret_manager_client: Optional[
] = None # list of instantiated key management clients - e.g. azure kv, infisical, etc.
_google_kms_resource_name: Optional[str] = None
_key_management_system: Optional[KeyManagementSystem] = None

#### PII MASKING ####
output_parse_pii: bool = False
#############################################

@@ -675,6 +675,9 @@ class S3Cache(BaseCache):
    def flush_cache(self):
        pass

    async def disconnect(self):
        pass


class DualCache(BaseCache):
    """

@@ -2,9 +2,11 @@
# On success, logs events to Promptlayer
import dotenv, os
import requests

from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
-from typing import Literal
+from typing import Literal, Union

dotenv.load_dotenv() # Loading env variables using dotenv
import traceback

@@ -54,7 +56,7 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
-        call_type: Literal["completion", "embeddings"],
+        call_type: Literal["completion", "embeddings", "image_generation"],
    ):
        pass

@@ -63,6 +65,13 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
    ):
        pass

    async def async_post_call_success_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        response,
    ):
        pass

    #### SINGLE-USE #### - https://docs.litellm.ai/docs/observability/custom_callback#using-your-custom-callback-function

    def log_input_event(self, model, messages, kwargs, print_verbose, callback_func):

@@ -477,8 +477,8 @@ def init_bedrock_client(
def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
-    # handle anthropic prompts using anthropic constants
-    if provider == "anthropic":
+    # handle anthropic prompts and amazon titan prompts
+    if provider == "anthropic" or provider == "amazon":
        if model in custom_prompt_dict:
            # check if the model has a registered custom prompt
            model_prompt_details = custom_prompt_dict[model]

@@ -490,7 +490,7 @@ def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
            )
        else:
            prompt = prompt_factory(
-                model=model, messages=messages, custom_llm_provider="anthropic"
+                model=model, messages=messages, custom_llm_provider="bedrock"
            )
    else:
        prompt = ""

@@ -623,6 +623,7 @@ def completion(
                "textGenerationConfig": inference_params,
            }
        )

    else:
        data = json.dumps({})

@@ -90,9 +90,11 @@ def ollama_pt(
        return {"prompt": prompt, "images": images}
    else:
        prompt = "".join(
-            m["content"]
-            if isinstance(m["content"], str) is str
-            else "".join(m["content"])
+            (
+                m["content"]
+                if isinstance(m["content"], str) is str
+                else "".join(m["content"])
+            )
            for m in messages
        )
        return prompt

@@ -422,6 +424,34 @@ def anthropic_pt(
    return prompt
def amazon_titan_pt(
    messages: list,
): # format - https://github.com/BerriAI/litellm/issues/1896
    """
    Amazon Titan uses 'User:' and 'Bot:' in its prompt template
    """

    class AmazonTitanConstants(Enum):
        HUMAN_PROMPT = "\n\nUser: " # Assuming this is similar to Anthropic prompt formatting, since amazon titan's prompt formatting is currently undocumented
        AI_PROMPT = "\n\nBot: "

    prompt = ""
    for idx, message in enumerate(messages):
        if message["role"] == "user":
            prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}{message['content']}"
        elif message["role"] == "system":
            prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}<admin>{message['content']}</admin>"
        else:
            prompt += f"{AmazonTitanConstants.AI_PROMPT.value}{message['content']}"
        if (
            idx == 0 and message["role"] == "assistant"
        ): # ensure the prompt always starts with `\n\nHuman: `
            prompt = f"{AmazonTitanConstants.HUMAN_PROMPT.value}" + prompt
    if messages[-1]["role"] != "assistant":
        prompt += f"{AmazonTitanConstants.AI_PROMPT.value}"
    return prompt
def _load_image_from_url(image_url):
    try:
        from PIL import Image
@@ -636,6 +666,14 @@ def prompt_factory(
        return gemini_text_image_pt(messages=messages)
    elif custom_llm_provider == "mistral":
        return mistral_api_pt(messages=messages)
    elif custom_llm_provider == "bedrock":
        if "amazon.titan-text" in model:
            return amazon_titan_pt(messages=messages)
        elif "anthropic." in model:
            if any(_ in model for _ in ["claude-2.1", "claude-v2:1"]):
                return claude_2_1_pt(messages=messages)
            else:
                return anthropic_pt(messages=messages)
    try:
        if "meta-llama/llama-2" in model and "chat" in model:
            return llama_2_chat_pt(messages=messages)
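A rough usage sketch of the `amazon_titan_pt` helper added above (the import path is an assumption based on where these prompt templates live in the repo):

```python
# Hypothetical usage of the Titan prompt template added in this diff.
from litellm.llms.prompt_templates.factory import amazon_titan_pt

messages = [
    {"role": "system", "content": "You are a helpful bot."},
    {"role": "user", "content": "Hi!"},
]
print(amazon_titan_pt(messages=messages))
# Per the template above, this yields roughly:
# "\n\nUser: <admin>You are a helpful bot.</admin>\n\nUser: Hi!\n\nBot: "
```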

@@ -484,7 +484,7 @@ def embedding(
    aws_access_key_id = optional_params.pop("aws_access_key_id", None)
    aws_region_name = optional_params.pop("aws_region_name", None)

-    if aws_access_key_id != None:
+    if aws_access_key_id is not None:
        # uses auth params passed to completion
        # aws_access_key_id is not None, assume user is trying to auth using litellm.completion
        client = boto3.client(

@@ -4,7 +4,7 @@ from enum import Enum
import requests
import time
from typing import Callable, Optional, Union
-from litellm.utils import ModelResponse, Usage, CustomStreamWrapper
+from litellm.utils import ModelResponse, Usage, CustomStreamWrapper, map_finish_reason
import litellm, uuid
import httpx

@@ -575,9 +575,9 @@ def completion(
        model_response["model"] = model
        ## CALCULATING USAGE
        if model in litellm.vertex_language_models and response_obj is not None:
-            model_response["choices"][0].finish_reason = response_obj.candidates[
-                0
-            ].finish_reason.name
+            model_response["choices"][0].finish_reason = map_finish_reason(
+                response_obj.candidates[0].finish_reason.name
+            )
            usage = Usage(
                prompt_tokens=response_obj.usage_metadata.prompt_token_count,
                completion_tokens=response_obj.usage_metadata.candidates_token_count,

@@ -771,9 +771,9 @@ async def async_completion(
        model_response["model"] = model
        ## CALCULATING USAGE
        if model in litellm.vertex_language_models and response_obj is not None:
-            model_response["choices"][0].finish_reason = response_obj.candidates[
-                0
-            ].finish_reason.name
+            model_response["choices"][0].finish_reason = map_finish_reason(
+                response_obj.candidates[0].finish_reason.name
+            )
            usage = Usage(
                prompt_tokens=response_obj.usage_metadata.prompt_token_count,
                completion_tokens=response_obj.usage_metadata.candidates_token_count,

@@ -10,7 +10,6 @@
import os, openai, sys, json, inspect, uuid, datetime, threading
from typing import Any, Literal, Union
from functools import partial
import dotenv, traceback, random, asyncio, time, contextvars
from copy import deepcopy
import httpx

@@ -2964,16 +2963,39 @@ def text_completion(
##### Moderation #######################
-def moderation(input: str, api_key: Optional[str] = None):
+def moderation(
+    input: str, model: Optional[str] = None, api_key: Optional[str] = None, **kwargs
+):
    # only supports open ai for now
    api_key = (
        api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
    )
-    openai.api_key = api_key
-    openai.api_type = "open_ai" # type: ignore
-    openai.api_version = None
-    openai.base_url = "https://api.openai.com/v1/"
-    response = openai.moderations.create(input=input)
+    openai_client = kwargs.get("client", None)
+    if openai_client is None:
+        openai_client = openai.OpenAI(
+            api_key=api_key,
+        )
+    response = openai_client.moderations.create(input=input, model=model)
+    return response
+
+
+##### Moderation #######################
+@client
+async def amoderation(input: str, model: str, api_key: Optional[str] = None, **kwargs):
+    # only supports open ai for now
+    api_key = (
+        api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
+    )
+    openai_client = kwargs.get("client", None)
+    if openai_client is None:
+        openai_client = openai.AsyncOpenAI(
+            api_key=api_key,
+        )
+    response = await openai_client.moderations.create(input=input, model=model)
    return response
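A hedged usage sketch of the two helpers above (the model name and environment handling are assumptions for illustration):

```python
# Hypothetical usage of litellm.moderation / litellm.amoderation added in this diff.
import asyncio
import litellm

# assumes OPENAI_API_KEY is set in the environment
resp = litellm.moderation(input="hello from litellm", model="text-moderation-stable")
print(resp.results[0].flagged)

async def check_async():
    aresp = await litellm.amoderation(
        input="hello from litellm", model="text-moderation-stable"
    )
    print(aresp.results[0].flagged)

asyncio.run(check_async())
```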

@@ -198,6 +198,33 @@
"litellm_provider": "openai",
"mode": "embedding"
},
"text-moderation-stable": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-007": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-latest": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"256-x-256/dall-e-2": { "256-x-256/dall-e-2": {
"mode": "image_generation", "mode": "image_generation",
"input_cost_per_pixel": 0.00000024414, "input_cost_per_pixel": 0.00000024414,

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
(minified Next.js webpack chunk for the Admin UI; the only change is a module id, 11837 → 87421)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
(minified Next.js webpack chunk for the Admin UI; the only change is a module id, 70377 → 32028)

@@ -1 +1 @@
(minified Next.js webpack runtime for the Admin UI; the only change is the CSS asset hash, static/css/654259bbf9e4c196.css → static/css/c18941d97fb7245b.css)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
(prerendered Admin UI index.html; only the hashed Next.js script/CSS filenames and the buildId changed, e.g. buildId 4mrMigZY9ob7yaIDjXpX6 → lLFQRQnIrRo-GJf5spHEd)

@@ -1,7 +1,7 @@
(Next.js RSC payload for the Admin UI page; only the hashed chunk/CSS filenames and the buildId changed: 838-7fa0bab5a1c3631d.js / page-5a7453e3903c5d60.js / 654259bbf9e4c196.css / 4mrMigZY9ob7yaIDjXpX6 → 145-9c160ad5539e000f.js / page-fcb69349f15d154b.js / c18941d97fb7245b.css / lLFQRQnIrRo-GJf5spHEd)

@@ -234,6 +234,15 @@ class DynamoDBArgs(LiteLLMBase):
    key_table_name: str = "LiteLLM_VerificationToken"
    config_table_name: str = "LiteLLM_Config"
    spend_table_name: str = "LiteLLM_SpendLogs"
    aws_role_name: Optional[str] = None
    aws_session_name: Optional[str] = None
    aws_web_identity_token: Optional[str] = None
    aws_provider_id: Optional[str] = None
    aws_policy_arns: Optional[List[str]] = None
    aws_policy: Optional[str] = None
    aws_duration_seconds: Optional[int] = None
    assume_role_aws_role_name: Optional[str] = None
    assume_role_aws_session_name: Optional[str] = None


class ConfigGeneralSettings(LiteLLMBase):
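For reference, a sketch of how the new role-assumption fields might be wired into the proxy's `database_args` (mirroring the `general_settings` snippet earlier in this diff; every value below is a placeholder):

```python
# Hypothetical values for DynamoDB auth via role assumption; not part of this diff.
database_args = {
    "region_name": "us-west-2",
    "user_table_name": "your-user-table",
    "key_table_name": "your-token-table",
    "config_table_name": "your-config-table",
    "aws_role_name": "arn:aws:iam::123456789012:role/your-web-identity-role",
    "aws_session_name": "your-session-name",
    "aws_web_identity_token": "/path/to/web-identity-token",
    "assume_role_aws_role_name": "arn:aws:iam::123456789012:role/your-db-role",
    "assume_role_aws_session_name": "your-db-session-name",
}
```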

@@ -53,6 +53,41 @@ class DynamoDBWrapper(CustomDB):
        self.database_arguments = database_arguments
        self.region_name = database_arguments.region_name

    def set_env_vars_based_on_arn(self):
        if self.database_arguments.aws_role_name is None:
            return
        verbose_proxy_logger.debug(
            f"DynamoDB: setting env vars based on arn={self.database_arguments.aws_role_name}"
        )
        import boto3, os

        sts_client = boto3.client("sts")

        # call 1
        non_used_assumed_role = sts_client.assume_role_with_web_identity(
            RoleArn=self.database_arguments.aws_role_name,
            RoleSessionName=self.database_arguments.aws_session_name,
            WebIdentityToken=self.database_arguments.aws_web_identity_token,
        )

        # call 2
        assumed_role = sts_client.assume_role(
            RoleArn=self.database_arguments.assume_role_aws_role_name,
            RoleSessionName=self.database_arguments.assume_role_aws_session_name,
        )

        aws_access_key_id = assumed_role["Credentials"]["AccessKeyId"]
        aws_secret_access_key = assumed_role["Credentials"]["SecretAccessKey"]
        aws_session_token = assumed_role["Credentials"]["SessionToken"]

        verbose_proxy_logger.debug(
            f"Got STS assumed Role, aws_access_key_id={aws_access_key_id}"
        )
        # set these in the env so aiodynamo can use them
        os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
        os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
        os.environ["AWS_SESSION_TOKEN"] = aws_session_token

    async def connect(self):
        """
        Connect to DB, and creating / updating any tables

@@ -75,6 +110,7 @@ class DynamoDBWrapper(CustomDB):
        import aiohttp

        verbose_proxy_logger.debug("DynamoDB Wrapper - Attempting to connect")
        self.set_env_vars_based_on_arn()
        # before making ClientSession check if ssl_verify=False
        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))

@@ -171,6 +207,8 @@ class DynamoDBWrapper(CustomDB):
        from aiohttp import ClientSession
        import aiohttp

        self.set_env_vars_based_on_arn()

        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
        else:

@@ -214,6 +252,8 @@ class DynamoDBWrapper(CustomDB):
        from aiohttp import ClientSession
        import aiohttp

        self.set_env_vars_based_on_arn()

        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
        else:

@@ -261,6 +301,7 @@ class DynamoDBWrapper(CustomDB):
    async def update_data(
        self, key: str, value: dict, table_name: Literal["user", "key", "config"]
    ):
        self.set_env_vars_based_on_arn()
        from aiodynamo.client import Client
        from aiodynamo.credentials import Credentials, StaticCredentials
        from aiodynamo.http.httpx import HTTPX

@@ -334,4 +375,5 @@ class DynamoDBWrapper(CustomDB):
        """
        Not Implemented yet.
        """
        self.set_env_vars_based_on_arn()
        return super().delete_data(keys, table_name)

@@ -8,14 +8,19 @@
# Tell us how we can improve! - Krrish & Ishaan

-from typing import Optional
+from typing import Optional, Literal, Union
-import litellm, traceback, sys
+import litellm, traceback, sys, uuid
from litellm.caching import DualCache
from litellm.proxy._types import UserAPIKeyAuth
from litellm.integrations.custom_logger import CustomLogger
from fastapi import HTTPException
from litellm._logging import verbose_proxy_logger
-from litellm import ModelResponse
+from litellm.utils import (
+    ModelResponse,
+    EmbeddingResponse,
+    ImageResponse,
+    StreamingChoices,
+)
from datetime import datetime
import aiohttp, asyncio

@@ -24,7 +29,13 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
    user_api_key_cache = None

    # Class variables or attributes
-    def __init__(self):
+    def __init__(self, mock_testing: bool = False):
+        self.pii_tokens: dict = (
+            {}
+        ) # mapping of PII token to original text - only used with Presidio `replace` operation
+
+        if mock_testing == True: # for testing purposes only
+            return
+
        self.presidio_analyzer_api_base = litellm.get_secret(
            "PRESIDIO_ANALYZER_API_BASE", None
        )

@@ -51,12 +62,15 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
        pass

    async def check_pii(self, text: str) -> str:
+        """
+        [TODO] make this more performant for high-throughput scenario
+        """
        try:
            async with aiohttp.ClientSession() as session:
                # Make the first request to /analyze
                analyze_url = f"{self.presidio_analyzer_api_base}/analyze"
                analyze_payload = {"text": text, "language": "en"}
+                redacted_text = None
                async with session.post(analyze_url, json=analyze_payload) as response:
                    analyze_results = await response.json()

@@ -72,6 +86,26 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
                ) as response:
                    redacted_text = await response.json()

+                new_text = text
+                if redacted_text is not None:
+                    for item in redacted_text["items"]:
+                        start = item["start"]
+                        end = item["end"]
+                        replacement = item["text"] # replacement token
+                        if (
+                            item["operator"] == "replace"
+                            and litellm.output_parse_pii == True
+                        ):
+                            # check if token in dict
+                            # if exists, add a uuid to the replacement token for swapping back to the original text in llm response output parsing
+                            if replacement in self.pii_tokens:
+                                replacement = replacement + uuid.uuid4()
+                            self.pii_tokens[replacement] = new_text[
+                                start:end
+                            ] # get text it'll replace
+                        new_text = new_text[:start] + replacement + new_text[end:]
+
                return redacted_text["text"]
        except Exception as e:
            traceback.print_exc()

@@ -94,6 +128,7 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
        if call_type == "completion": # /chat/completions requests
            messages = data["messages"]
            tasks = []

            for m in messages:
                if isinstance(m["content"], str):
                    tasks.append(self.check_pii(text=m["content"]))

@@ -104,3 +139,30 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
                        "content"
                    ] = r # replace content with redacted string
        return data

+    async def async_post_call_success_hook(
+        self,
+        user_api_key_dict: UserAPIKeyAuth,
+        response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
+    ):
+        """
+        Output parse the response object to replace the masked tokens with user sent values
+        """
+        verbose_proxy_logger.debug(
+            f"PII Masking Args: litellm.output_parse_pii={litellm.output_parse_pii}; type of response={type(response)}"
+        )
+        if litellm.output_parse_pii == False:
+            return response
+
+        if isinstance(response, ModelResponse) and not isinstance(
+            response.choices[0], StreamingChoices
+        ): # /chat/completions requests
+            if isinstance(response.choices[0].message.content, str):
+                verbose_proxy_logger.debug(
+                    f"self.pii_tokens: {self.pii_tokens}; initial response: {response.choices[0].message.content}"
+                )
+                for key, value in self.pii_tokens.items():
+                    response.choices[0].message.content = response.choices[
+                        0
+                    ].message.content.replace(key, value)
+        return response

@@ -570,6 +570,7 @@ def run_server(
        "worker_class": "uvicorn.workers.UvicornWorker",
        "preload": True, # Add the preload flag
        "accesslog": "-", # Log to stdout
        "timeout": 600, # default to very high number, bedrock/anthropic.claude-v2:1 can take 30+ seconds for the 1st chunk to come in
        "access_log_format": '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s',
    }

@@ -9,73 +9,41 @@ model_list:
mode: chat
max_tokens: 4096
base_model: azure/gpt-4-1106-preview
access_groups: ["public"]
- model_name: openai-gpt-3.5
litellm_params:
model: gpt-3.5-turbo
api_key: os.environ/OPENAI_API_KEY
model_info:
access_groups: ["public"]
- model_name: anthropic-claude-v2.1
litellm_params:
model: bedrock/anthropic.claude-v2:1
timeout: 300 # sets a 5 minute timeout
model_info:
access_groups: ["private"]
- model_name: anthropic-claude-v2
litellm_params:
model: bedrock/anthropic.claude-v2
- model_name: bedrock-cohere
litellm_params:
model: bedrock/cohere.command-text-v14
timeout: 0.0001
- model_name: gpt-4 - model_name: gpt-4
litellm_params: litellm_params:
model: azure/chatgpt-v-2 model: azure/chatgpt-v-2
api_base: https://openai-gpt-4-test-v-1.openai.azure.com/ api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
api_version: "2023-05-15" api_version: "2023-05-15"
api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault
- model_name: gpt-vision model_info:
base_model: azure/gpt-4
- model_name: text-moderation-stable
litellm_params: litellm_params:
model: azure/gpt-4-vision model: text-moderation-stable
base_url: https://gpt-4-vision-resource.openai.azure.com/openai/deployments/gpt-4-vision/extensions
api_key: os.environ/AZURE_VISION_API_KEY
api_version: "2023-09-01-preview"
dataSources:
- type: AzureComputerVision
parameters:
endpoint: os.environ/AZURE_VISION_ENHANCE_ENDPOINT
key: os.environ/AZURE_VISION_ENHANCE_KEY
- model_name: BEDROCK_GROUP
litellm_params:
model: bedrock/cohere.command-text-v14
timeout: 0.0001
- model_name: tg-ai
litellm_params:
model: together_ai/mistralai/Mistral-7B-Instruct-v0.1
- model_name: sagemaker
litellm_params:
model: sagemaker/berri-benchmarking-Llama-2-70b-chat-hf-4
- model_name: openai-gpt-3.5
litellm_params:
model: gpt-3.5-turbo
api_key: os.environ/OPENAI_API_KEY api_key: os.environ/OPENAI_API_KEY
model_info:
mode: chat
- model_name: azure-cloudflare
litellm_params:
model: azure/chatgpt-v-2
api_base: https://gateway.ai.cloudflare.com/v1/0399b10e77ac6668c80404a5ff49eb37/litellm-test/azure-openai/openai-gpt-4-test-v-1
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
- model_name: azure-embedding-model
litellm_params:
model: azure/azure-embedding-model
api_base: os.environ/AZURE_API_BASE
api_key: os.environ/AZURE_API_KEY
api_version: "2023-07-01-preview"
model_info:
mode: embedding
base_model: text-embedding-ada-002
- model_name: text-embedding-ada-002
litellm_params:
model: text-embedding-ada-002
api_key: os.environ/OPENAI_API_KEY
model_info:
mode: embedding
litellm_settings: litellm_settings:
fallbacks: [{"openai-gpt-3.5": ["azure-gpt-3.5"]}] fallbacks: [{"openai-gpt-3.5": ["azure-gpt-3.5"]}]
success_callback: ['langfuse'] success_callback: ['langfuse']
max_budget: 10 # global budget for proxy
max_user_budget: 0.0001
budget_duration: 30d # global budget duration, will reset after 30d
default_key_generate_params:
max_budget: 1.5000
models: ["azure-gpt-3.5"]
duration: None
upperbound_key_generate_params:
max_budget: 100
duration: "30d"
# setting callback class # setting callback class
# callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance] # callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]
@ -93,6 +61,7 @@ general_settings:
environment_variables: environment_variables:
# otel: True # OpenTelemetry Logger # otel: True # OpenTelemetry Logger
# master_key: sk-1234 # [OPTIONAL] Only use this if you to require all calls to contain this key (Authorization: Bearer sk-1234) # master_key: sk-1234 # [OPTIONAL] Only use this if you to require all calls to contain this key (Authorization: Bearer sk-1234)

View file

@ -403,34 +403,43 @@ async def user_api_key_auth(
verbose_proxy_logger.debug( verbose_proxy_logger.debug(
f"LLM Model List pre access group check: {llm_model_list}" f"LLM Model List pre access group check: {llm_model_list}"
) )
access_groups = [] from collections import defaultdict
access_groups = defaultdict(list)
if llm_model_list is not None: if llm_model_list is not None:
for m in llm_model_list: for m in llm_model_list:
for group in m.get("model_info", {}).get("access_groups", []): for group in m.get("model_info", {}).get("access_groups", []):
access_groups.append((m["model_name"], group)) model_name = m["model_name"]
access_groups[group].append(model_name)
allowed_models = valid_token.models models_in_current_access_groups = []
access_group_idx = set()
if ( if (
len(access_groups) > 0 len(access_groups) > 0
): # check if token contains any model access groups ): # check if token contains any model access groups
for idx, m in enumerate(valid_token.models): for idx, m in enumerate(
for model_name, group in access_groups: valid_token.models
if m == group: ): # loop token models, if any of them are an access group add the access group
access_group_idx.add(idx) if m in access_groups:
allowed_models.append(model_name) # if it is an access group we need to remove it from valid_token.models
models_in_group = access_groups[m]
models_in_current_access_groups.extend(models_in_group)
# Filter out models that are access_groups
filtered_models = [
m for m in valid_token.models if m not in access_groups
]
filtered_models += models_in_current_access_groups
verbose_proxy_logger.debug( verbose_proxy_logger.debug(
f"model: {model}; allowed_models: {allowed_models}" f"model: {model}; allowed_models: {filtered_models}"
) )
if model is not None and model not in allowed_models: if model is not None and model not in filtered_models:
raise ValueError( raise ValueError(
f"API Key not allowed to access model. This token can only access models={valid_token.models}. Tried to access {model}" f"API Key not allowed to access model. This token can only access models={valid_token.models}. Tried to access {model}"
) )
for val in access_group_idx: valid_token.models = filtered_models
allowed_models.pop(val)
valid_token.models = allowed_models
verbose_proxy_logger.debug( verbose_proxy_logger.debug(
f"filtered allowed_models: {allowed_models}; valid_token.models: {valid_token.models}" f"filtered allowed_models: {filtered_models}; valid_token.models: {valid_token.models}"
) )
# Check 2. If user_id for this token is in budget # Check 2. If user_id for this token is in budget
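To make the access-group expansion above easier to follow, here is a standalone sketch of the same idea; the group and model names are illustrative, mirroring the `model_info.access_groups` entries that appear in the sample config in this commit.

```python
from collections import defaultdict

# group name -> models that belong to it (normally built from model_info.access_groups)
access_groups: dict[str, list[str]] = defaultdict(list)
llm_model_list = [
    {"model_name": "openai-gpt-3.5", "model_info": {"access_groups": ["public"]}},
    {"model_name": "anthropic-claude-v2.1", "model_info": {"access_groups": ["private"]}},
]
for m in llm_model_list:
    for group in m.get("model_info", {}).get("access_groups", []):
        access_groups[group].append(m["model_name"])

# a key may list plain model names and/or access-group names
token_models = ["public", "gpt-4"]

models_in_current_access_groups = []
for m in token_models:
    if m in access_groups:  # access group -> expand into its concrete models
        models_in_current_access_groups.extend(access_groups[m])

# drop the group names themselves, keep plain models plus the expansion
filtered_models = [m for m in token_models if m not in access_groups]
filtered_models += models_in_current_access_groups
print(filtered_models)  # ['gpt-4', 'openai-gpt-3.5']
```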
@ -682,34 +691,31 @@ async def user_api_key_auth(
# sso/login, ui/login, /key functions and /user functions # sso/login, ui/login, /key functions and /user functions
# this will never be allowed to call /chat/completions # this will never be allowed to call /chat/completions
token_team = getattr(valid_token, "team_id", None) token_team = getattr(valid_token, "team_id", None)
if token_team is not None: if token_team is not None and token_team == "litellm-dashboard":
if token_team == "litellm-dashboard": # this token is only used for managing the ui
# this token is only used for managing the ui allowed_routes = [
allowed_routes = [ "/sso",
"/sso", "/login",
"/login", "/key",
"/key", "/spend",
"/spend", "/user",
"/user", "/model/info",
] ]
# check if the current route startswith any of the allowed routes # check if the current route startswith any of the allowed routes
if ( if (
route is not None route is not None
and isinstance(route, str) and isinstance(route, str)
and any( and any(
route.startswith(allowed_route) route.startswith(allowed_route) for allowed_route in allowed_routes
for allowed_route in allowed_routes )
) ):
): # Do something if the current route starts with any of the allowed routes
# Do something if the current route starts with any of the allowed routes pass
pass else:
else: raise Exception(
raise Exception( f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed"
f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed" )
) return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
else:
raise Exception(f"Invalid Key Passed to LiteLLM Proxy")
except Exception as e: except Exception as e:
# verbose_proxy_logger.debug(f"An exception occurred - {traceback.format_exc()}") # verbose_proxy_logger.debug(f"An exception occurred - {traceback.format_exc()}")
traceback.print_exc() traceback.print_exc()
@ -1443,6 +1449,24 @@ class ProxyConfig:
database_type == "dynamo_db" or database_type == "dynamodb" database_type == "dynamo_db" or database_type == "dynamodb"
): ):
database_args = general_settings.get("database_args", None) database_args = general_settings.get("database_args", None)
### LOAD FROM os.environ/ ###
for k, v in database_args.items():
if isinstance(v, str) and v.startswith("os.environ/"):
database_args[k] = litellm.get_secret(v)
if isinstance(k, str) and k == "aws_web_identity_token":
value = database_args[k]
verbose_proxy_logger.debug(
f"Loading AWS Web Identity Token from file: {value}"
)
if os.path.exists(value):
with open(value, "r") as file:
token_content = file.read()
database_args[k] = token_content
else:
verbose_proxy_logger.info(
f"DynamoDB Loading - {value} is not a valid file path"
)
verbose_proxy_logger.debug(f"database_args: {database_args}")
custom_db_client = DBClient( custom_db_client = DBClient(
custom_db_args=database_args, custom_db_type=database_type custom_db_args=database_args, custom_db_type=database_type
) )
@ -1580,8 +1604,6 @@ async def generate_key_helper_fn(
tpm_limit = tpm_limit tpm_limit = tpm_limit
rpm_limit = rpm_limit rpm_limit = rpm_limit
allowed_cache_controls = allowed_cache_controls allowed_cache_controls = allowed_cache_controls
if type(team_id) is not str:
team_id = str(team_id)
try: try:
# Create a new verification token (you may want to enhance this logic based on your needs) # Create a new verification token (you may want to enhance this logic based on your needs)
user_data = { user_data = {
@ -2057,14 +2079,6 @@ def model_list(
if user_model is not None: if user_model is not None:
all_models += [user_model] all_models += [user_model]
verbose_proxy_logger.debug(f"all_models: {all_models}") verbose_proxy_logger.debug(f"all_models: {all_models}")
### CHECK OLLAMA MODELS ###
try:
response = requests.get("http://0.0.0.0:11434/api/tags")
models = response.json()["models"]
ollama_models = ["ollama/" + m["name"].replace(":latest", "") for m in models]
all_models.extend(ollama_models)
except Exception as e:
pass
return dict( return dict(
data=[ data=[
{ {
@ -2355,8 +2369,13 @@ async def chat_completion(
llm_router is not None and data["model"] in llm_router.deployment_names llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router ): # model in router deployments, calling a specific deployment on the router
response = await llm_router.acompletion(**data, specific_deployment=True) response = await llm_router.acompletion(**data, specific_deployment=True)
else: # router is not set elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.acompletion(**data) response = await litellm.acompletion(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
# Post Call Processing # Post Call Processing
data["litellm_status"] = "success" # used for alerting data["litellm_status"] = "success" # used for alerting
@ -2387,6 +2406,11 @@ async def chat_completion(
) )
fastapi_response.headers["x-litellm-model-id"] = model_id fastapi_response.headers["x-litellm-model-id"] = model_id
### CALL HOOKS ### - modify outgoing data
response = await proxy_logging_obj.post_call_success_hook(
user_api_key_dict=user_api_key_dict, response=response
)
return response return response
except Exception as e: except Exception as e:
traceback.print_exc() traceback.print_exc()
@ -2417,7 +2441,12 @@ async def chat_completion(
traceback.print_exc() traceback.print_exc()
if isinstance(e, HTTPException): if isinstance(e, HTTPException):
raise e raise ProxyException(
message=getattr(e, "detail", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else: else:
error_traceback = traceback.format_exc() error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}" error_msg = f"{str(e)}\n\n{error_traceback}"
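One practical effect of converting `HTTPException` into `ProxyException` here: clients of the proxy get an OpenAI-style error body, so the openai SDK maps a bad model name to `BadRequestError` (as the updated exception test further down reflects). A hedged sketch of the client-side view, reusing the proxy URL and test key from elsewhere in this commit:

```python
import openai

# proxy URL and key as used elsewhere in this commit; the exact error payload
# shape depends on how ProxyException is serialized, so treat it as illustrative
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

try:
    client.chat.completions.create(
        model="model-that-is-not-configured",
        messages=[{"role": "user", "content": "hi"}],
    )
except openai.BadRequestError as e:
    print(e.status_code)  # 400
    print(e.body)         # parsed error payload (message/type/param/code fields expected)
```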
@ -2567,8 +2596,13 @@ async def embeddings(
llm_router is not None and data["model"] in llm_router.deployment_names llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router ): # model in router deployments, calling a specific deployment on the router
response = await llm_router.aembedding(**data, specific_deployment=True) response = await llm_router.aembedding(**data, specific_deployment=True)
else: elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.aembedding(**data) response = await litellm.aembedding(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ### ### ALERTING ###
data["litellm_status"] = "success" # used for alerting data["litellm_status"] = "success" # used for alerting
@ -2586,7 +2620,12 @@ async def embeddings(
) )
traceback.print_exc() traceback.print_exc()
if isinstance(e, HTTPException): if isinstance(e, HTTPException):
raise e raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else: else:
error_traceback = traceback.format_exc() error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}" error_msg = f"{str(e)}\n\n{error_traceback}"
@ -2702,8 +2741,13 @@ async def image_generation(
response = await llm_router.aimage_generation( response = await llm_router.aimage_generation(
**data **data
) # ensure this goes the llm_router, router will do the correct alias mapping ) # ensure this goes the llm_router, router will do the correct alias mapping
else: elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.aimage_generation(**data) response = await litellm.aimage_generation(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ### ### ALERTING ###
data["litellm_status"] = "success" # used for alerting data["litellm_status"] = "success" # used for alerting
@ -2721,7 +2765,165 @@ async def image_generation(
) )
traceback.print_exc() traceback.print_exc()
if isinstance(e, HTTPException): if isinstance(e, HTTPException):
raise e raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else:
error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}"
raise ProxyException(
message=getattr(e, "message", error_msg),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", 500),
)
@router.post(
"/v1/moderations",
dependencies=[Depends(user_api_key_auth)],
response_class=ORJSONResponse,
tags=["moderations"],
)
@router.post(
"/moderations",
dependencies=[Depends(user_api_key_auth)],
response_class=ORJSONResponse,
tags=["moderations"],
)
async def moderations(
request: Request,
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
):
"""
The moderations endpoint is a tool you can use to check whether content complies with an LLM provider's policies.
Quick Start
```
curl --location 'http://0.0.0.0:4000/moderations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
"""
global proxy_logging_obj
try:
# Use orjson to parse JSON data, orjson speeds up requests significantly
body = await request.body()
data = orjson.loads(body)
# Include original request and headers in the data
data["proxy_server_request"] = {
"url": str(request.url),
"method": request.method,
"headers": dict(request.headers),
"body": copy.copy(data), # use copy instead of deepcopy
}
if data.get("user", None) is None and user_api_key_dict.user_id is not None:
data["user"] = user_api_key_dict.user_id
data["model"] = (
general_settings.get("moderation_model", None) # server default
or user_model # model name passed via cli args
or data["model"] # default passed in http request
)
if user_model:
data["model"] = user_model
if "metadata" not in data:
data["metadata"] = {}
data["metadata"]["user_api_key"] = user_api_key_dict.api_key
data["metadata"]["user_api_key_metadata"] = user_api_key_dict.metadata
_headers = dict(request.headers)
_headers.pop(
"authorization", None
) # do not store the original `sk-..` api key in the db
data["metadata"]["headers"] = _headers
data["metadata"]["user_api_key_user_id"] = user_api_key_dict.user_id
data["metadata"]["endpoint"] = str(request.url)
### TEAM-SPECIFIC PARAMS ###
if user_api_key_dict.team_id is not None:
team_config = await proxy_config.load_team_config(
team_id=user_api_key_dict.team_id
)
if len(team_config) == 0:
pass
else:
team_id = team_config.pop("team_id", None)
data["metadata"]["team_id"] = team_id
data = {
**team_config,
**data,
} # add the team-specific configs to the completion call
router_model_names = (
[m["model_name"] for m in llm_model_list]
if llm_model_list is not None
else []
)
### CALL HOOKS ### - modify incoming data / reject request before calling the model
data = await proxy_logging_obj.pre_call_hook(
user_api_key_dict=user_api_key_dict, data=data, call_type="moderation"
)
start_time = time.time()
## ROUTE TO CORRECT ENDPOINT ##
# skip router if user passed their key
if "api_key" in data:
response = await litellm.amoderation(**data)
elif (
llm_router is not None and data["model"] in router_model_names
): # model in router model list
response = await llm_router.amoderation(**data)
elif (
llm_router is not None and data["model"] in llm_router.deployment_names
): # model in router deployments, calling a specific deployment on the router
response = await llm_router.amoderation(**data, specific_deployment=True)
elif (
llm_router is not None
and llm_router.model_group_alias is not None
and data["model"] in llm_router.model_group_alias
): # model set in model_group_alias
response = await llm_router.amoderation(
**data
) # ensure this goes the llm_router, router will do the correct alias mapping
elif user_model is not None: # `litellm --model <your-model-name>`
response = await litellm.amoderation(**data)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"error": "Invalid model name passed in"},
)
### ALERTING ###
data["litellm_status"] = "success" # used for alerting
end_time = time.time()
asyncio.create_task(
proxy_logging_obj.response_taking_too_long(
start_time=start_time, end_time=end_time, type="slow_response"
)
)
return response
except Exception as e:
await proxy_logging_obj.post_call_failure_hook(
user_api_key_dict=user_api_key_dict, original_exception=e
)
traceback.print_exc()
if isinstance(e, HTTPException):
raise ProxyException(
message=getattr(e, "message", str(e)),
type=getattr(e, "type", "None"),
param=getattr(e, "param", "None"),
code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
)
else: else:
error_traceback = traceback.format_exc() error_traceback = traceback.format_exc()
error_msg = f"{str(e)}\n\n{error_traceback}" error_msg = f"{str(e)}\n\n{error_traceback}"
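Besides the curl call in the docstring, the new endpoint can be exercised with the openai SDK pointed at the proxy; a small sketch, assuming the proxy runs on `http://0.0.0.0:4000` with the `sk-1234` master key and a `text-moderation-stable` deployment as in the example config:

```python
import openai

# point the openai client at the LiteLLM proxy (base URL and key assumed from the docs above)
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.moderations.create(
    model="text-moderation-stable",  # routed through the proxy's new /moderations handler
    input="Sample text goes here",
)
print(response.results[0].flagged)
```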
@ -3516,6 +3718,7 @@ async def google_login(request: Request):
""" """
microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None) microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
google_client_id = os.getenv("GOOGLE_CLIENT_ID", None) google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)
# get url from request # get url from request
redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url)) redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@ -3574,6 +3777,69 @@ async def google_login(request: Request):
) )
with microsoft_sso: with microsoft_sso:
return await microsoft_sso.get_login_redirect() return await microsoft_sso.get_login_redirect()
elif generic_client_id is not None:
from fastapi_sso.sso.generic import create_provider, DiscoveryDocument
generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
generic_authorization_endpoint = os.getenv(
"GENERIC_AUTHORIZATION_ENDPOINT", None
)
generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
if generic_client_secret is None:
raise ProxyException(
message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
type="auth_error",
param="GENERIC_CLIENT_SECRET",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_authorization_endpoint is None:
raise ProxyException(
message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_AUTHORIZATION_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_token_endpoint is None:
raise ProxyException(
message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_TOKEN_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_userinfo_endpoint is None:
raise ProxyException(
message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_USERINFO_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
verbose_proxy_logger.debug(
f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
)
verbose_proxy_logger.debug(
f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
)
discovery = DiscoveryDocument(
authorization_endpoint=generic_authorization_endpoint,
token_endpoint=generic_token_endpoint,
userinfo_endpoint=generic_userinfo_endpoint,
)
SSOProvider = create_provider(name="oidc", discovery_document=discovery)
generic_sso = SSOProvider(
client_id=generic_client_id,
client_secret=generic_client_secret,
redirect_uri=redirect_url,
allow_insecure_http=True,
)
with generic_sso:
return await generic_sso.get_login_redirect()
elif ui_username is not None: elif ui_username is not None:
# No Google, Microsoft SSO # No Google, Microsoft SSO
# Use UI Credentials set in .env # Use UI Credentials set in .env
@ -3673,6 +3939,7 @@ async def auth_callback(request: Request):
global general_settings global general_settings
microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None) microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
google_client_id = os.getenv("GOOGLE_CLIENT_ID", None) google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)
# get url from request # get url from request
redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url)) redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@ -3728,6 +3995,77 @@ async def auth_callback(request: Request):
allow_insecure_http=True, allow_insecure_http=True,
) )
result = await microsoft_sso.verify_and_process(request) result = await microsoft_sso.verify_and_process(request)
elif generic_client_id is not None:
# make generic sso provider
from fastapi_sso.sso.generic import create_provider, DiscoveryDocument
generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
generic_authorization_endpoint = os.getenv(
"GENERIC_AUTHORIZATION_ENDPOINT", None
)
generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
if generic_client_secret is None:
raise ProxyException(
message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
type="auth_error",
param="GENERIC_CLIENT_SECRET",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_authorization_endpoint is None:
raise ProxyException(
message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_AUTHORIZATION_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_token_endpoint is None:
raise ProxyException(
message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_TOKEN_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
if generic_userinfo_endpoint is None:
raise ProxyException(
message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
type="auth_error",
param="GENERIC_USERINFO_ENDPOINT",
code=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
verbose_proxy_logger.debug(
f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
)
verbose_proxy_logger.debug(
f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
)
discovery = DiscoveryDocument(
authorization_endpoint=generic_authorization_endpoint,
token_endpoint=generic_token_endpoint,
userinfo_endpoint=generic_userinfo_endpoint,
)
SSOProvider = create_provider(name="oidc", discovery_document=discovery)
generic_sso = SSOProvider(
client_id=generic_client_id,
client_secret=generic_client_secret,
redirect_uri=redirect_url,
allow_insecure_http=True,
)
verbose_proxy_logger.debug(f"calling generic_sso.verify_and_process")
request_body = await request.body()
request_query_params = request.query_params
# get "code" from query params
code = request_query_params.get("code")
result = await generic_sso.verify_and_process(request)
verbose_proxy_logger.debug(f"generic result: {result}")
# User is Authe'd in - generate key for the UI to access Proxy # User is Authe'd in - generate key for the UI to access Proxy
user_email = getattr(result, "email", None) user_email = getattr(result, "email", None)
@ -3936,7 +4274,6 @@ async def add_new_model(model_params: ModelParams):
) )
#### [BETA] - This is a beta endpoint, format might change based on user feedback https://github.com/BerriAI/litellm/issues/933. If you need a stable endpoint use /model/info
@router.get( @router.get(
"/model/info", "/model/info",
description="Provides more info about each model in /models, including config.yaml descriptions (except api key and api base)", description="Provides more info about each model in /models, including config.yaml descriptions (except api key and api base)",
@ -3969,6 +4306,28 @@ async def model_info_v1(
# read litellm model_prices_and_context_window.json to get the following: # read litellm model_prices_and_context_window.json to get the following:
# input_cost_per_token, output_cost_per_token, max_tokens # input_cost_per_token, output_cost_per_token, max_tokens
litellm_model_info = get_litellm_model_info(model=model) litellm_model_info = get_litellm_model_info(model=model)
# 2nd pass on the model, try seeing if we can find model in litellm model_cost map
if litellm_model_info == {}:
# use litellm_param model_name to get model_info
litellm_params = model.get("litellm_params", {})
litellm_model = litellm_params.get("model", None)
try:
litellm_model_info = litellm.get_model_info(model=litellm_model)
except:
litellm_model_info = {}
# 3rd pass on the model, try seeing if we can find model but without the "/" in model cost map
if litellm_model_info == {}:
# use litellm_param model_name to get model_info
litellm_params = model.get("litellm_params", {})
litellm_model = litellm_params.get("model", None)
split_model = litellm_model.split("/")
if len(split_model) > 0:
litellm_model = split_model[-1]
try:
litellm_model_info = litellm.get_model_info(model=litellm_model)
except:
litellm_model_info = {}
for k, v in litellm_model_info.items(): for k, v in litellm_model_info.items():
if k not in model_info: if k not in model_info:
model_info[k] = v model_info[k] = v

View file

@ -4,25 +4,29 @@ const openai = require('openai');
process.env.DEBUG=false; process.env.DEBUG=false;
async function runOpenAI() { async function runOpenAI() {
const client = new openai.OpenAI({ const client = new openai.OpenAI({
apiKey: 'sk-JkKeNi6WpWDngBsghJ6B9g', apiKey: 'sk-1234',
baseURL: 'http://0.0.0.0:8000' baseURL: 'http://0.0.0.0:4000'
}); });
try { try {
const response = await client.chat.completions.create({ const response = await client.chat.completions.create({
model: 'sagemaker', model: 'anthropic-claude-v2.1',
stream: true, stream: true,
max_tokens: 1000,
messages: [ messages: [
{ {
role: 'user', role: 'user',
content: 'write a 20 pg essay about YC ', content: 'write a 20 pg essay about YC '.repeat(6000),
}, },
], ],
}); });
console.log(response); console.log(response);
let original = '';
for await (const chunk of response) { for await (const chunk of response) {
original += chunk.choices[0].delta.content;
console.log(original);
console.log(chunk); console.log(chunk);
console.log(chunk.choices[0].delta.content); console.log(chunk.choices[0].delta.content);
} }

View file

@ -11,6 +11,7 @@ from litellm.caching import DualCache
from litellm.proxy.hooks.parallel_request_limiter import ( from litellm.proxy.hooks.parallel_request_limiter import (
_PROXY_MaxParallelRequestsHandler, _PROXY_MaxParallelRequestsHandler,
) )
from litellm import ModelResponse, EmbeddingResponse, ImageResponse
from litellm.proxy.hooks.max_budget_limiter import _PROXY_MaxBudgetLimiter from litellm.proxy.hooks.max_budget_limiter import _PROXY_MaxBudgetLimiter
from litellm.proxy.hooks.cache_control_check import _PROXY_CacheControlCheck from litellm.proxy.hooks.cache_control_check import _PROXY_CacheControlCheck
from litellm.integrations.custom_logger import CustomLogger from litellm.integrations.custom_logger import CustomLogger
@ -92,7 +93,9 @@ class ProxyLogging:
self, self,
user_api_key_dict: UserAPIKeyAuth, user_api_key_dict: UserAPIKeyAuth,
data: dict, data: dict,
call_type: Literal["completion", "embeddings", "image_generation"], call_type: Literal[
"completion", "embeddings", "image_generation", "moderation"
],
): ):
""" """
Allows users to modify/reject the incoming request to the proxy, without having to deal with parsing Request body. Allows users to modify/reject the incoming request to the proxy, without having to deal with parsing Request body.
@ -377,6 +380,28 @@ class ProxyLogging:
raise e raise e
return return
async def post_call_success_hook(
self,
response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
user_api_key_dict: UserAPIKeyAuth,
):
"""
Allow user to modify outgoing data
Covers:
1. /chat/completions
"""
new_response = copy.deepcopy(response)
for callback in litellm.callbacks:
try:
if isinstance(callback, CustomLogger):
await callback.async_post_call_success_hook(
user_api_key_dict=user_api_key_dict, response=new_response
)
except Exception as e:
raise e
return new_response
### DB CONNECTOR ### ### DB CONNECTOR ###
# Define the retry decorator with backoff strategy # Define the retry decorator with backoff strategy
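From a callback author's point of view, the new `post_call_success_hook` means any `CustomLogger` in `litellm.callbacks` can edit the outgoing response. A minimal illustrative handler (the class name and the edit it makes are made up; the hook name and keyword arguments follow the code above):

```python
from litellm.integrations.custom_logger import CustomLogger


class AppendFooterHandler(CustomLogger):
    """Illustrative callback: appends a footer to every non-streaming chat response."""

    async def async_post_call_success_hook(self, user_api_key_dict, response):
        if hasattr(response, "choices") and response.choices:
            message = response.choices[0].message
            if isinstance(message.content, str):
                message.content += "\n\n-- served via litellm proxy"
        return response


# registered like any other custom callback, e.g. in custom_callbacks.py:
proxy_handler_instance = AppendFooterHandler()
```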

View file

@ -599,6 +599,98 @@ class Router:
self.fail_calls[model_name] += 1 self.fail_calls[model_name] += 1
raise e raise e
async def amoderation(self, model: str, input: str, **kwargs):
try:
kwargs["model"] = model
kwargs["input"] = input
kwargs["original_function"] = self._amoderation
kwargs["num_retries"] = kwargs.get("num_retries", self.num_retries)
timeout = kwargs.get("request_timeout", self.timeout)
kwargs.setdefault("metadata", {}).update({"model_group": model})
response = await self.async_function_with_fallbacks(**kwargs)
return response
except Exception as e:
raise e
async def _amoderation(self, model: str, input: str, **kwargs):
model_name = None
try:
verbose_router_logger.debug(
f"Inside _moderation()- model: {model}; kwargs: {kwargs}"
)
deployment = self.get_available_deployment(
model=model,
input=input,
specific_deployment=kwargs.pop("specific_deployment", None),
)
kwargs.setdefault("metadata", {}).update(
{
"deployment": deployment["litellm_params"]["model"],
"model_info": deployment.get("model_info", {}),
}
)
kwargs["model_info"] = deployment.get("model_info", {})
data = deployment["litellm_params"].copy()
model_name = data["model"]
for k, v in self.default_litellm_params.items():
if (
k not in kwargs and v is not None
): # prioritize model-specific params > default router params
kwargs[k] = v
elif k == "metadata":
kwargs[k].update(v)
potential_model_client = self._get_client(
deployment=deployment, kwargs=kwargs, client_type="async"
)
# check if provided keys == client keys #
dynamic_api_key = kwargs.get("api_key", None)
if (
dynamic_api_key is not None
and potential_model_client is not None
and dynamic_api_key != potential_model_client.api_key
):
model_client = None
else:
model_client = potential_model_client
self.total_calls[model_name] += 1
timeout = (
data.get(
"timeout", None
) # timeout set on litellm_params for this deployment
or self.timeout # timeout set on router
or kwargs.get(
"timeout", None
) # this uses default_litellm_params when nothing is set
)
response = await litellm.amoderation(
**{
**data,
"input": input,
"caching": self.cache_responses,
"client": model_client,
"timeout": timeout,
**kwargs,
}
)
self.success_calls[model_name] += 1
verbose_router_logger.info(
f"litellm.amoderation(model={model_name})\033[32m 200 OK\033[0m"
)
return response
except Exception as e:
verbose_router_logger.info(
f"litellm.amoderation(model={model_name})\033[31m Exception {str(e)}\033[0m"
)
if model_name is not None:
self.fail_calls[model_name] += 1
raise e
def text_completion( def text_completion(
self, self,
model: str, model: str,

View file

@ -86,7 +86,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
if isinstance(response_obj, ModelResponse): if isinstance(response_obj, ModelResponse):
completion_tokens = response_obj.usage.completion_tokens completion_tokens = response_obj.usage.completion_tokens
total_tokens = response_obj.usage.total_tokens total_tokens = response_obj.usage.total_tokens
final_value = float(completion_tokens / response_ms.total_seconds()) final_value = float(response_ms.total_seconds() / completion_tokens)
# ------------ # ------------
# Update usage # Update usage
@ -168,7 +168,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
if isinstance(response_obj, ModelResponse): if isinstance(response_obj, ModelResponse):
completion_tokens = response_obj.usage.completion_tokens completion_tokens = response_obj.usage.completion_tokens
total_tokens = response_obj.usage.total_tokens total_tokens = response_obj.usage.total_tokens
final_value = float(completion_tokens / response_ms.total_seconds()) final_value = float(response_ms.total_seconds() / completion_tokens)
# ------------ # ------------
# Update usage # Update usage
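The swap above changes the tracked value from tokens-per-second to seconds-per-token; assuming the lowest-latency strategy picks the deployment with the smallest stored value, the direction of the metric matters, as this small worked example shows:

```python
# deployment A: 100 completion tokens in 2.0 s; deployment B: 100 tokens in 4.0 s
a_tokens, a_seconds = 100, 2.0
b_tokens, b_seconds = 100, 4.0

old_a, old_b = a_tokens / a_seconds, b_tokens / b_seconds  # 50.0, 25.0 (tokens per second)
new_a, new_b = a_seconds / a_tokens, b_seconds / b_tokens  # 0.02, 0.04 (seconds per token)

# with the old metric, the smallest value belongs to B (25.0 < 50.0), the slower deployment;
# with seconds-per-token, the smallest value belongs to A (0.02 < 0.04), the faster one.
print(min(("A", new_a), ("B", new_b), key=lambda x: x[1])[0])  # A
```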

View file

@ -123,6 +123,10 @@ def test_vertex_ai():
print(response) print(response)
assert type(response.choices[0].message.content) == str assert type(response.choices[0].message.content) == str
assert len(response.choices[0].message.content) > 1 assert len(response.choices[0].message.content) > 1
print(
f"response.choices[0].finish_reason: {response.choices[0].finish_reason}"
)
assert response.choices[0].finish_reason in litellm._openai_finish_reasons
except Exception as e: except Exception as e:
pytest.fail(f"Error occurred: {e}") pytest.fail(f"Error occurred: {e}")

View file

@ -71,7 +71,7 @@ def test_completion_claude():
messages=messages, messages=messages,
request_timeout=10, request_timeout=10,
) )
# Add any assertions here to check the response # Add any assertions here to check response args
print(response) print(response)
print(response.usage) print(response.usage)
print(response.usage.completion_tokens) print(response.usage.completion_tokens)
@ -1545,9 +1545,9 @@ def test_completion_bedrock_titan_null_response():
], ],
) )
# Add any assertions here to check the response # Add any assertions here to check the response
pytest.fail(f"Expected to fail") print(f"response: {response}")
except Exception as e: except Exception as e:
pass pytest.fail(f"An error occurred - {str(e)}")
def test_completion_bedrock_titan(): def test_completion_bedrock_titan():
@ -2093,10 +2093,6 @@ def test_completion_cloudflare():
def test_moderation(): def test_moderation():
import openai
openai.api_type = "azure"
openai.api_version = "GM"
response = litellm.moderation(input="i'm ishaan cto of litellm") response = litellm.moderation(input="i'm ishaan cto of litellm")
print(response) print(response)
output = response.results[0] output = response.results[0]

View file

@ -0,0 +1,65 @@
# What is this?
## Unit test for presidio pii masking
import sys, os, asyncio, time, random
from datetime import datetime
import traceback
from dotenv import load_dotenv
load_dotenv()
import os
sys.path.insert(
0, os.path.abspath("../..")
) # Adds the parent directory to the system path
import pytest
import litellm
from litellm.proxy.hooks.presidio_pii_masking import _OPTIONAL_PresidioPIIMasking
from litellm import Router, mock_completion
from litellm.proxy.utils import ProxyLogging
from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
@pytest.mark.asyncio
async def test_output_parsing():
"""
- have presidio pii masking - mask an input message
- make llm completion call
- have presidio pii masking - output parse message
- assert that no masked tokens remain in the llm response
"""
litellm.output_parse_pii = True
pii_masking = _OPTIONAL_PresidioPIIMasking(mock_testing=True)
initial_message = [
{
"role": "user",
"content": "hello world, my name is Jane Doe. My number is: 034453334",
}
]
filtered_message = [
{
"role": "user",
"content": "hello world, my name is <PERSON>. My number is: <PHONE_NUMBER>",
}
]
pii_masking.pii_tokens = {"<PERSON>": "Jane Doe", "<PHONE_NUMBER>": "034453334"}
response = mock_completion(
model="gpt-3.5-turbo",
messages=filtered_message,
mock_response="Hello <PERSON>! How can I assist you today?",
)
new_response = await pii_masking.async_post_call_success_hook(
user_api_key_dict=UserAPIKeyAuth(), response=response
)
assert (
new_response.choices[0].message.content
== "Hello Jane Doe! How can I assist you today?"
)
# asyncio.run(test_output_parsing())

View file

@ -139,7 +139,7 @@ def test_exception_openai_bad_model(client):
response=response response=response
) )
print("Type of exception=", type(openai_exception)) print("Type of exception=", type(openai_exception))
assert isinstance(openai_exception, openai.NotFoundError) assert isinstance(openai_exception, openai.BadRequestError)
except Exception as e: except Exception as e:
pytest.fail(f"LiteLLM Proxy test failed. Exception {str(e)}") pytest.fail(f"LiteLLM Proxy test failed. Exception {str(e)}")
@ -160,7 +160,6 @@ def test_chat_completion_exception_any_model(client):
response = client.post("/chat/completions", json=test_data) response = client.post("/chat/completions", json=test_data)
json_response = response.json() json_response = response.json()
print("keys in json response", json_response.keys())
assert json_response.keys() == {"error"} assert json_response.keys() == {"error"}
# make an openai client to call _make_status_error_from_response # make an openai client to call _make_status_error_from_response

View file

@ -991,3 +991,23 @@ def test_router_timeout():
print(e) print(e)
print(vars(e)) print(vars(e))
pass pass
@pytest.mark.asyncio
async def test_router_amoderation():
model_list = [
{
"model_name": "openai-moderations",
"litellm_params": {
"model": "text-moderation-stable",
"api_key": os.getenv("OPENAI_API_KEY", None),
},
}
]
router = Router(model_list=model_list)
result = await router.amoderation(
model="openai-moderations", input="this is valid good text"
)
print("moderation result", result)

View file

@ -58,6 +58,18 @@ def my_post_call_rule(input: str):
return {"decision": True} return {"decision": True}
def my_post_call_rule_2(input: str):
input = input.lower()
print(f"input: {input}")
print(f"INSIDE MY POST CALL RULE, len(input) - {len(input)}")
if len(input) < 200 and len(input) > 0:
return {
"decision": False,
"message": "This violates LiteLLM Proxy Rules. Response too short",
}
return {"decision": True}
# test_pre_call_rule() # test_pre_call_rule()
# Test 2: Post-call rule # Test 2: Post-call rule
# commenting out of ci/cd since llm's have variable output which was causing our pipeline to fail erratically. # commenting out of ci/cd since llm's have variable output which was causing our pipeline to fail erratically.
@ -94,3 +106,24 @@ def test_post_call_rule():
# test_post_call_rule() # test_post_call_rule()
def test_post_call_rule_streaming():
try:
litellm.pre_call_rules = []
litellm.post_call_rules = [my_post_call_rule_2]
### completion
response = completion(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": "say sorry"}],
max_tokens=2,
stream=True,
)
for chunk in response:
print(f"chunk: {chunk}")
pytest.fail(f"Completion call should have been failed. ")
except Exception as e:
print("Got exception", e)
print(type(e))
print(vars(e))
assert e.message == "This violates LiteLLM Proxy Rules. Response too short"

View file

@ -738,6 +738,8 @@ class CallTypes(Enum):
text_completion = "text_completion" text_completion = "text_completion"
image_generation = "image_generation" image_generation = "image_generation"
aimage_generation = "aimage_generation" aimage_generation = "aimage_generation"
moderation = "moderation"
amoderation = "amoderation"
# Logging function -> log the exact model details + what's being sent | Non-BlockingP # Logging function -> log the exact model details + what's being sent | Non-BlockingP
@ -2100,6 +2102,11 @@ def client(original_function):
or call_type == CallTypes.aimage_generation.value or call_type == CallTypes.aimage_generation.value
): ):
messages = args[0] if len(args) > 0 else kwargs["prompt"] messages = args[0] if len(args) > 0 else kwargs["prompt"]
elif (
call_type == CallTypes.moderation.value
or call_type == CallTypes.amoderation.value
):
messages = args[1] if len(args) > 1 else kwargs["input"]
elif ( elif (
call_type == CallTypes.atext_completion.value call_type == CallTypes.atext_completion.value
or call_type == CallTypes.text_completion.value or call_type == CallTypes.text_completion.value
@ -7692,6 +7699,7 @@ class CustomStreamWrapper:
self.special_tokens = ["<|assistant|>", "<|system|>", "<|user|>", "<s>", "</s>"] self.special_tokens = ["<|assistant|>", "<|system|>", "<|user|>", "<s>", "</s>"]
self.holding_chunk = "" self.holding_chunk = ""
self.complete_response = "" self.complete_response = ""
self.response_uptil_now = ""
_model_info = ( _model_info = (
self.logging_obj.model_call_details.get("litellm_params", {}).get( self.logging_obj.model_call_details.get("litellm_params", {}).get(
"model_info", {} "model_info", {}
@ -7703,6 +7711,7 @@ class CustomStreamWrapper:
} # returned as x-litellm-model-id response header in proxy } # returned as x-litellm-model-id response header in proxy
self.response_id = None self.response_id = None
self.logging_loop = None self.logging_loop = None
self.rules = Rules()
def __iter__(self): def __iter__(self):
return self return self
@ -8659,7 +8668,7 @@ class CustomStreamWrapper:
chunk = next(self.completion_stream) chunk = next(self.completion_stream)
if chunk is not None and chunk != b"": if chunk is not None and chunk != b"":
print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}") print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
response = self.chunk_creator(chunk=chunk) response: Optional[ModelResponse] = self.chunk_creator(chunk=chunk)
print_verbose(f"PROCESSED CHUNK POST CHUNK CREATOR: {response}") print_verbose(f"PROCESSED CHUNK POST CHUNK CREATOR: {response}")
if response is None: if response is None:
continue continue
@ -8667,7 +8676,12 @@ class CustomStreamWrapper:
threading.Thread( threading.Thread(
target=self.run_success_logging_in_thread, args=(response,) target=self.run_success_logging_in_thread, args=(response,)
).start() # log response ).start() # log response
self.response_uptil_now += (
response.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
# RETURN RESULT # RETURN RESULT
return response return response
except StopIteration: except StopIteration:
@ -8705,7 +8719,9 @@ class CustomStreamWrapper:
# chunk_creator() does logging/stream chunk building. We need to let it know its being called in_async_func, so we don't double add chunks. # chunk_creator() does logging/stream chunk building. We need to let it know its being called in_async_func, so we don't double add chunks.
# __anext__ also calls async_success_handler, which does logging # __anext__ also calls async_success_handler, which does logging
print_verbose(f"PROCESSED ASYNC CHUNK PRE CHUNK CREATOR: {chunk}") print_verbose(f"PROCESSED ASYNC CHUNK PRE CHUNK CREATOR: {chunk}")
processed_chunk = self.chunk_creator(chunk=chunk) processed_chunk: Optional[ModelResponse] = self.chunk_creator(
chunk=chunk
)
print_verbose( print_verbose(
f"PROCESSED ASYNC CHUNK POST CHUNK CREATOR: {processed_chunk}" f"PROCESSED ASYNC CHUNK POST CHUNK CREATOR: {processed_chunk}"
) )
@ -8720,6 +8736,12 @@ class CustomStreamWrapper:
processed_chunk, processed_chunk,
) )
) )
self.response_uptil_now += (
processed_chunk.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
return processed_chunk return processed_chunk
raise StopAsyncIteration raise StopAsyncIteration
else: # temporary patch for non-aiohttp async calls else: # temporary patch for non-aiohttp async calls
@ -8733,7 +8755,9 @@ class CustomStreamWrapper:
chunk = next(self.completion_stream) chunk = next(self.completion_stream)
if chunk is not None and chunk != b"": if chunk is not None and chunk != b"":
print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}") print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
processed_chunk = self.chunk_creator(chunk=chunk) processed_chunk: Optional[ModelResponse] = self.chunk_creator(
chunk=chunk
)
print_verbose( print_verbose(
f"PROCESSED CHUNK POST CHUNK CREATOR: {processed_chunk}" f"PROCESSED CHUNK POST CHUNK CREATOR: {processed_chunk}"
) )
@ -8750,6 +8774,12 @@ class CustomStreamWrapper:
) )
) )
self.response_uptil_now += (
processed_chunk.choices[0].delta.get("content", "") or ""
)
self.rules.post_call_rules(
input=self.response_uptil_now, model=self.model
)
# RETURN RESULT # RETURN RESULT
return processed_chunk return processed_chunk
except StopAsyncIteration: except StopAsyncIteration:
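The `response_uptil_now` accumulation added above runs `post_call_rules` against everything streamed so far on each chunk, so a failing rule can cut a stream short instead of only being applied at the end. A rough sketch of the pattern outside litellm (the rule and generator are illustrative):

```python
def no_banned_words(text: str) -> None:
    # illustrative rule: abort as soon as the accumulated text contains a banned word
    if "banned" in text.lower():
        raise ValueError("response violates post-call rules")


def stream_with_rules(chunks):
    response_uptil_now = ""
    for chunk in chunks:
        response_uptil_now += chunk          # accumulate the content streamed so far
        no_banned_words(response_uptil_now)  # run the rule on every chunk, like post_call_rules
        yield chunk


for piece in stream_with_rules(["this is fine, ", "but this is banned content"]):
    print(piece)  # prints the first chunk, then raises on the second
```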

View file

@ -198,6 +198,33 @@
"litellm_provider": "openai", "litellm_provider": "openai",
"mode": "embedding" "mode": "embedding"
}, },
"text-moderation-stable": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-007": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"text-moderation-latest": {
"max_tokens": 32768,
"max_input_tokens": 32768,
"max_output_tokens": 0,
"input_cost_per_token": 0.000000,
"output_cost_per_token": 0.000000,
"litellm_provider": "openai",
"mode": "moderations"
},
"256-x-256/dall-e-2": { "256-x-256/dall-e-2": {
"mode": "image_generation", "mode": "image_generation",
"input_cost_per_pixel": 0.00000024414, "input_cost_per_pixel": 0.00000024414,

View file

@ -1,6 +1,6 @@
[tool.poetry] [tool.poetry]
name = "litellm" name = "litellm"
version = "1.23.10" version = "1.23.16"
description = "Library to easily interface with LLM API providers" description = "Library to easily interface with LLM API providers"
authors = ["BerriAI"] authors = ["BerriAI"]
license = "MIT" license = "MIT"
@ -69,7 +69,7 @@ requires = ["poetry-core", "wheel"]
build-backend = "poetry.core.masonry.api" build-backend = "poetry.core.masonry.api"
[tool.commitizen] [tool.commitizen]
version = "1.23.10" version = "1.23.16"
version_files = [ version_files = [
"pyproject.toml:^version" "pyproject.toml:^version"
] ]

View file

@ -27,7 +27,7 @@ tiktoken>=0.4.0 # for calculating usage
importlib-metadata>=6.8.0 # for random utils importlib-metadata>=6.8.0 # for random utils
tokenizers==0.14.0 # for calculating usage tokenizers==0.14.0 # for calculating usage
click==8.1.7 # for proxy cli click==8.1.7 # for proxy cli
jinja2==3.1.2 # for prompt templates jinja2==3.1.3 # for prompt templates
certifi>=2023.7.22 # [TODO] clean up certifi>=2023.7.22 # [TODO] clean up
aiohttp==3.9.0 # for network calls aiohttp==3.9.0 # for network calls
aioboto3==12.3.0 # for async sagemaker calls aioboto3==12.3.0 # for async sagemaker calls

View file

@ -88,6 +88,22 @@ async def test_chat_completion():
await chat_completion(session=session, key=key_2) await chat_completion(session=session, key=key_2)
@pytest.mark.asyncio
async def test_chat_completion_old_key():
"""
Production test for backwards compatibility. Tests the db against a pre-generated (old) key
- Use the pre-generated key
- Make a chat completion call
"""
async with aiohttp.ClientSession() as session:
try:
key = "sk-yNXvlRO4SxIGG0XnRMYxTw"
await chat_completion(session=session, key=key)
except Exception as e:
key = "sk-2KV0sAElLQqMpLZXdNf3yw" # try diff db key (in case db url is for the other db)
await chat_completion(session=session, key=key)
async def completion(session, key): async def completion(session, key):
url = "http://0.0.0.0:4000/completions" url = "http://0.0.0.0:4000/completions"
headers = { headers = {

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[165],{83155:function(e,t,n){(window.__NEXT_P=window.__NEXT_P||[]).push(["/_not-found",function(){return n(84032)}])},84032:function(e,t,n){"use strict";Object.defineProperty(t,"__esModule",{value:!0}),Object.defineProperty(t,"default",{enumerable:!0,get:function(){return i}}),n(86921);let o=n(3827);n(64090);let r={error:{fontFamily:'system-ui,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji"',height:"100vh",textAlign:"center",display:"flex",flexDirection:"column",alignItems:"center",justifyContent:"center"},desc:{display:"inline-block"},h1:{display:"inline-block",margin:"0 20px 0 0",padding:"0 23px 0 0",fontSize:24,fontWeight:500,verticalAlign:"top",lineHeight:"49px"},h2:{fontSize:14,fontWeight:400,lineHeight:"49px",margin:0}};function i(){return(0,o.jsxs)(o.Fragment,{children:[(0,o.jsx)("title",{children:"404: This page could not be found."}),(0,o.jsx)("div",{style:r.error,children:(0,o.jsxs)("div",{children:[(0,o.jsx)("style",{dangerouslySetInnerHTML:{__html:"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}),(0,o.jsx)("h1",{className:"next-error-h1",style:r.h1,children:"404"}),(0,o.jsx)("div",{style:r.desc,children:(0,o.jsx)("h2",{style:r.h2,children:"This page could not be found."})})]})})]})}("function"==typeof t.default||"object"==typeof t.default&&null!==t.default)&&void 0===t.default.__esModule&&(Object.defineProperty(t.default,"__esModule",{value:!0}),Object.assign(t.default,t),e.exports=t.default)}},function(e){e.O(0,[971,69,744],function(){return e(e.s=83155)}),_N_E=e.O()}]);

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{87421:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=87421)}),_N_E=n.O()}]);

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{32028:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(32028)}),_N_E=e.O()}]);

View file

@ -0,0 +1 @@
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/c18941d97fb7245b.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();

File diff suppressed because one or more lines are too long

View file

@ -0,0 +1 @@
self.__BUILD_MANIFEST={__rewrites:{afterFiles:[],beforeFiles:[],fallback:[]},"/_error":["static/chunks/pages/_error-d6107f1aac0c574c.js"],sortedPages:["/_app","/_error"]},self.__BUILD_MANIFEST_CB&&self.__BUILD_MANIFEST_CB();

View file

@ -0,0 +1 @@
self.__SSG_MANIFEST=new Set([]);self.__SSG_MANIFEST_CB&&self.__SSG_MANIFEST_CB()

View file

@ -1 +1 @@
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-85c9b4219c1bb384.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-9b4acf26920649bc.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-096338c8e1915716.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/654259bbf9e4c196.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[75985,[\"838\",\"static/chunks/838-7fa0bab5a1c3631d.js\",\"931\",\"static/chunks/app/page-5a7453e3903c5d60.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/654259bbf9e4c196.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"4mrMigZY9ob7yaIDjXpX6\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html> <!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-a85b2c176012d8e5.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-e1b183dda365ec86.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-9b4fb13a7db53edf.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin="" 
async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/c18941d97fb7245b.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[48016,[\"145\",\"static/chunks/145-9c160ad5539e000f.js\",\"931\",\"static/chunks/app/page-fcb69349f15d154b.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/c18941d97fb7245b.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"lLFQRQnIrRo-GJf5spHEd\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin 
UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[75985,["838","static/chunks/838-7fa0bab5a1c3631d.js","931","static/chunks/app/page-5a7453e3903c5d60.js"],""] 3:I[48016,["145","static/chunks/145-9c160ad5539e000f.js","931","static/chunks/app/page-fcb69349f15d154b.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["4mrMigZY9ob7yaIDjXpX6",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/654259bbf9e4c196.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 0:["lLFQRQnIrRo-GJf5spHEd",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 
0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/c18941d97fb7245b.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

File diff suppressed because it is too large

View file

@ -19,14 +19,18 @@
"jsonwebtoken": "^9.0.2", "jsonwebtoken": "^9.0.2",
"jwt-decode": "^4.0.0", "jwt-decode": "^4.0.0",
"next": "14.1.0", "next": "14.1.0",
"openai": "^4.28.0",
"react": "^18", "react": "^18",
"react-dom": "^18" "react-dom": "^18",
"react-markdown": "^9.0.1",
"react-syntax-highlighter": "^15.5.0"
}, },
"devDependencies": { "devDependencies": {
"@tailwindcss/forms": "^0.5.7", "@tailwindcss/forms": "^0.5.7",
"@types/node": "^20", "@types/node": "^20",
"@types/react": "18.2.48", "@types/react": "18.2.48",
"@types/react-dom": "^18", "@types/react-dom": "^18",
"@types/react-syntax-highlighter": "^15.5.11",
"autoprefixer": "^10.4.17", "autoprefixer": "^10.4.17",
"eslint": "^8", "eslint": "^8",
"eslint-config-next": "14.1.0", "eslint-config-next": "14.1.0",

View file

@ -3,6 +3,8 @@ import React, { Suspense, useEffect, useState } from "react";
import { useSearchParams } from "next/navigation"; import { useSearchParams } from "next/navigation";
import Navbar from "../components/navbar"; import Navbar from "../components/navbar";
import UserDashboard from "../components/user_dashboard"; import UserDashboard from "../components/user_dashboard";
import ModelDashboard from "@/components/model_dashboard";
import ChatUI from "@/components/chat_ui";
import Sidebar from "../components/leftnav"; import Sidebar from "../components/leftnav";
import Usage from "../components/usage"; import Usage from "../components/usage";
import { jwtDecode } from "jwt-decode"; import { jwtDecode } from "jwt-decode";
@ -80,7 +82,22 @@ const CreateKeyPage = () => {
userEmail={userEmail} userEmail={userEmail}
setUserEmail={setUserEmail} setUserEmail={setUserEmail}
/> />
) : ( ) : page == "models" ? (
<ModelDashboard
userID={userID}
userRole={userRole}
token={token}
accessToken={accessToken}
/>
) : page == "llm-playground" ? (
<ChatUI
userID={userID}
userRole={userRole}
token={token}
accessToken={accessToken}
/>
)
: (
<Usage <Usage
userID={userID} userID={userID}
userRole={userRole} userRole={userRole}

View file

@ -0,0 +1,301 @@
import React, { useState, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import { Card, Title, Table, TableHead, TableRow, TableCell, TableBody, Grid, Tab,
TabGroup,
TabList,
TabPanel,
Metric,
Select,
SelectItem,
TabPanels, } from "@tremor/react";
import { modelInfoCall } from "./networking";
import openai from "openai";
import { Prism as SyntaxHighlighter } from 'react-syntax-highlighter';
interface ChatUIProps {
accessToken: string | null;
token: string | null;
userRole: string | null;
userID: string | null;
}
async function generateModelResponse(inputMessage: string, updateUI: (chunk: string) => void, selectedModel: string, accessToken: string) {
const client = new openai.OpenAI({
apiKey: accessToken, // the temporary LiteLLM proxy key for the logged-in session
baseURL: 'http://0.0.0.0:4000', // LiteLLM proxy base URL
dangerouslyAllowBrowser: true, // using a temporary litellm proxy key
});
const response = await client.chat.completions.create({
model: selectedModel,
stream: true,
messages: [
{
role: 'user',
content: inputMessage,
},
],
});
for await (const chunk of response) {
console.log(chunk);
if (chunk.choices[0].delta.content) {
updateUI(chunk.choices[0].delta.content);
}
}
}
const ChatUI: React.FC<ChatUIProps> = ({ accessToken, token, userRole, userID }) => {
const [inputMessage, setInputMessage] = useState("");
const [chatHistory, setChatHistory] = useState<any[]>([]);
const [selectedModel, setSelectedModel] = useState<string | undefined>(undefined);
const [modelInfo, setModelInfo] = useState<any | null>(null); // Declare modelInfo at the component level
useEffect(() => {
if (!accessToken || !token || !userRole || !userID) {
return;
}
// Fetch model info and set the default selected model
const fetchModelInfo = async () => {
const fetchedModelInfo = await modelInfoCall(accessToken, userID, userRole);
console.log("model_info:", fetchedModelInfo);
if (fetchedModelInfo?.data.length > 0) {
setModelInfo(fetchedModelInfo);
setSelectedModel(fetchedModelInfo.data[0].model_name);
}
};
fetchModelInfo();
}, [accessToken, userID, userRole]);
const updateUI = (role: string, chunk: string) => {
setChatHistory((prevHistory) => {
const lastMessage = prevHistory[prevHistory.length - 1];
if (lastMessage && lastMessage.role === role) {
return [
...prevHistory.slice(0, prevHistory.length - 1),
{ role, content: lastMessage.content + chunk },
];
} else {
return [...prevHistory, { role, content: chunk }];
}
});
};
const handleSendMessage = async () => {
if (inputMessage.trim() === "") return;
if (!accessToken || !token || !userRole || !userID) {
return;
}
setChatHistory((prevHistory) => [
...prevHistory,
{ role: "user", content: inputMessage },
]);
try {
if (selectedModel) {
await generateModelResponse(inputMessage, (chunk) => updateUI("assistant", chunk), selectedModel, accessToken);
}
} catch (error) {
console.error("Error fetching model response", error);
updateUI("assistant", "Error fetching model response");
}
setInputMessage("");
};
return (
<div style={{ width: "100%", position: "relative" }}>
<Grid className="gap-2 p-10 h-[75vh] w-full">
<Card>
<TabGroup>
<TabList className="mt-4">
<Tab>Chat</Tab>
<Tab>API Reference</Tab>
</TabList>
<TabPanels>
<TabPanel>
<div>
<label>Select Model:</label>
<select
value={selectedModel || ""}
onChange={(e) => setSelectedModel(e.target.value)}
>
{/* Populate dropdown options from available models */}
{modelInfo?.data.map((element: { model_name: string }) => (
<option key={element.model_name} value={element.model_name}>
{element.model_name}
</option>
))}
</select>
</div>
<Table className="mt-5" style={{ display: "block", maxHeight: "60vh", overflowY: "auto" }}>
<TableHead>
<TableRow>
<TableCell>
<Title>Chat</Title>
</TableCell>
</TableRow>
</TableHead>
<TableBody>
{chatHistory.map((message, index) => (
<TableRow key={index}>
<TableCell>{`${message.role}: ${message.content}`}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
<div className="mt-3" style={{ position: "absolute", bottom: 5, width: "95%" }}>
<div className="flex">
<input
type="text"
value={inputMessage}
onChange={(e) => setInputMessage(e.target.value)}
className="flex-1 p-2 border rounded-md mr-2"
placeholder="Type your message..."
/>
<button onClick={handleSendMessage} className="p-2 bg-blue-500 text-white rounded-md">
Send
</button>
</div>
</div>
</TabPanel>
<TabPanel>
<TabGroup>
<TabList>
<Tab>OpenAI Python SDK</Tab>
<Tab>LlamaIndex</Tab>
<Tab>Langchain Py</Tab>
</TabList>
<TabPanels>
<TabPanel>
<SyntaxHighlighter language="python">
{`
import openai
client = openai.OpenAI(
api_key="your_api_key",
base_url="http://0.0.0.0:4000" # proxy base url
)
response = client.chat.completions.create(
model="gpt-3.5-turbo", # model to use from Models Tab
messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
extra_body={
"metadata": {
"generation_name": "ishaan-generation-openai-client",
"generation_id": "openai-client-gen-id22",
"trace_id": "openai-client-trace-id22",
"trace_user_id": "openai-client-user-id2"
}
}
)
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
<TabPanel>
<SyntaxHighlighter language="python">
{`
import os, dotenv
from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext
llm = AzureOpenAI(
engine="azure-gpt-3.5", # model_name on litellm proxy
temperature=0.0,
azure_endpoint="http://0.0.0.0:4000", # litellm proxy endpoint
api_key="sk-1234", # litellm proxy API Key
api_version="2023-07-01-preview",
)
embed_model = AzureOpenAIEmbedding(
deployment_name="azure-embedding-model",
azure_endpoint="http://0.0.0.0:4000",
api_key="sk-1234",
api_version="2023-07-01-preview",
)
documents = SimpleDirectoryReader("llama_index_data").load_data()
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
<TabPanel>
<SyntaxHighlighter language="python">
{`
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage
chat = ChatOpenAI(
openai_api_base="http://0.0.0.0:4000",
model = "gpt-3.5-turbo",
temperature=0.1,
extra_body={
"metadata": {
"generation_name": "ishaan-generation-langchain-client",
"generation_id": "langchain-client-gen-id22",
"trace_id": "langchain-client-trace-id22",
"trace_user_id": "langchain-client-user-id2"
}
}
)
messages = [
SystemMessage(
content="You are a helpful assistant that im using to make a test request to."
),
HumanMessage(
content="test from litellm. tell me why it's amazing in 1 sentence"
),
]
response = chat(messages)
print(response)
`}
</SyntaxHighlighter>
</TabPanel>
</TabPanels>
</TabGroup>
</TabPanel>
</TabPanels>
</TabGroup>
</Card>
</Grid>
</div>
);
};
export default ChatUI;
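
The `generateModelResponse` helper above is the core of the playground: it points the OpenAI JS client at the proxy and streams deltas back into the chat history. Below is a minimal standalone sketch of that same call, assuming a proxy is reachable at `http://0.0.0.0:4000` and that `sk-1234` is a valid proxy key; the model name is also just an example.

```typescript
import OpenAI from "openai";

// Assumed values: the proxy URL, key, and model name below are placeholders.
const client = new OpenAI({
  apiKey: "sk-1234",              // hypothetical proxy key
  baseURL: "http://0.0.0.0:4000", // LiteLLM proxy base URL
});

async function streamOnce(model: string, prompt: string): Promise<void> {
  // stream: true returns an async iterable of chunks (openai >= 4.x)
  const stream = await client.chat.completions.create({
    model,
    stream: true,
    messages: [{ role: "user", content: prompt }],
  });
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) process.stdout.write(delta);
  }
  process.stdout.write("\n");
}

streamOnce("gpt-3.5-turbo", "say hi in one short sentence").catch(console.error);
```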

View file

@ -14,6 +14,7 @@ interface CreateKeyProps {
userRole: string | null; userRole: string | null;
accessToken: string; accessToken: string;
data: any[] | null; data: any[] | null;
userModels: string[];
setData: React.Dispatch<React.SetStateAction<any[] | null>>; setData: React.Dispatch<React.SetStateAction<any[] | null>>;
} }
@ -22,6 +23,7 @@ const CreateKey: React.FC<CreateKeyProps> = ({
userRole, userRole,
accessToken, accessToken,
data, data,
userModels,
setData, setData,
}) => { }) => {
const [form] = Form.useForm(); const [form] = Form.useForm();
@ -42,20 +44,13 @@ const CreateKey: React.FC<CreateKeyProps> = ({
const handleCreate = async (formValues: Record<string, any>) => { const handleCreate = async (formValues: Record<string, any>) => {
try { try {
message.info("Making API Call"); message.info("Making API Call");
// Check if "models" exists and is not an empty string
if (formValues.models && formValues.models.trim() !== '') {
// Format the "models" field as an array
formValues.models = formValues.models.split(',').map((model: string) => model.trim());
} else {
// If "models" is undefined or an empty string, set it to an empty array
formValues.models = [];
}
setIsModalVisible(true); setIsModalVisible(true);
const response = await keyCreateCall(accessToken, userID, formValues); const response = await keyCreateCall(accessToken, userID, formValues);
setData((prevData) => (prevData ? [...prevData, response] : [response])); // Check if prevData is null setData((prevData) => (prevData ? [...prevData, response] : [response])); // Check if prevData is null
setApiKey(response["key"]); setApiKey(response["key"]);
message.success("API Key Created"); message.success("API Key Created");
form.resetFields(); form.resetFields();
localStorage.removeItem("userData" + userID)
} catch (error) { } catch (error) {
console.error("Error creating the key:", error); console.error("Error creating the key:", error);
} }
@ -90,13 +85,22 @@ const CreateKey: React.FC<CreateKeyProps> = ({
> >
<Input placeholder="ai_team" /> <Input placeholder="ai_team" />
</Form.Item> </Form.Item>
<Form.Item <Form.Item
label="Models (Comma Separated). Eg: gpt-3.5-turbo,gpt-4" label="Models"
name="models" name="models"
>
<Select
mode="multiple"
placeholder="Select models"
style={{ width: '100%' }}
> >
<Input placeholder="gpt-4,gpt-3.5-turbo" /> {userModels.map((model) => (
</Form.Item> <Option key={model} value={model}>
{model}
</Option>
))}
</Select>
</Form.Item>
<Form.Item <Form.Item

View file

@ -20,7 +20,13 @@ const Sidebar: React.FC<SidebarProps> = ({ setPage }) => {
<Menu.Item key="1" onClick={() => setPage("api-keys")}> <Menu.Item key="1" onClick={() => setPage("api-keys")}>
API Keys API Keys
</Menu.Item> </Menu.Item>
<Menu.Item key="2" onClick={() => setPage("usage")}> <Menu.Item key="2" onClick={() => setPage("models")}>
Models
</Menu.Item>
<Menu.Item key="3" onClick={() => setPage("llm-playground")}>
Chat UI
</Menu.Item>
<Menu.Item key="4" onClick={() => setPage("usage")}>
Usage Usage
</Menu.Item> </Menu.Item>
</Menu> </Menu>

View file

@ -0,0 +1,124 @@
import React, { useState, useEffect } from "react";
import { Card, Title, Subtitle, Table, TableHead, TableRow, TableCell, TableBody, Metric, Grid } from "@tremor/react";
import { modelInfoCall } from "./networking";
interface ModelDashboardProps {
accessToken: string | null;
token: string | null;
userRole: string | null;
userID: string | null;
}
const ModelDashboard: React.FC<ModelDashboardProps> = ({
accessToken,
token,
userRole,
userID,
}) => {
const [modelData, setModelData] = useState<any>({ data: [] });
useEffect(() => {
if (!accessToken || !token || !userRole || !userID) {
return;
}
const fetchData = async () => {
try {
// Replace with your actual API call for model data
const modelDataResponse = await modelInfoCall(accessToken, userID, userRole);
console.log("Model data response:", modelDataResponse.data);
setModelData(modelDataResponse);
} catch (error) {
console.error("There was an error fetching the model data", error);
}
};
if (accessToken && token && userRole && userID) {
fetchData();
}
}, [accessToken, token, userRole, userID]);
if (!modelData) {
return <div>Loading...</div>;
}
// loop through model data and edit each row
for (let i = 0; i < modelData.data.length; i++) {
let curr_model = modelData.data[i];
let litellm_model_name = curr_model?.litellm_params?.model;
let model_info = curr_model?.model_info;
let defaultProvider = "openai";
let provider = "";
let input_cost = "Undefined"
let output_cost = "Undefined"
let max_tokens = "Undefined"
// Check if litellm_model_name is null or undefined
if (litellm_model_name) {
// Split litellm_model_name based on "/"
let splitModel = litellm_model_name.split("/");
// Get the first element in the split
let firstElement = splitModel[0];
// If there is only one element, default provider to openai
provider = splitModel.length === 1 ? defaultProvider : firstElement;
console.log("Provider:", provider);
} else {
// litellm_model_name is null or undefined, default provider to openai
provider = defaultProvider;
console.log("Provider:", provider);
}
if (model_info) {
input_cost = model_info?.input_cost_per_token;
output_cost = model_info?.output_cost_per_token;
max_tokens = model_info?.max_tokens;
}
modelData.data[i].provider = provider
modelData.data[i].input_cost = input_cost
modelData.data[i].output_cost = output_cost
modelData.data[i].max_tokens = max_tokens
}
return (
<div style={{ width: "100%" }}>
<Grid className="gap-2 p-10 h-[75vh] w-full">
<Card>
<Table className="mt-5">
<TableHead>
<TableRow>
<TableCell><Title>Model Name </Title></TableCell>
<TableCell><Title>Provider</Title></TableCell>
<TableCell><Title>Input Price per token ($)</Title></TableCell>
<TableCell><Title>Output Price per token ($)</Title></TableCell>
<TableCell><Title>Max Tokens</Title></TableCell>
</TableRow>
</TableHead>
<TableBody>
{modelData.data.map((model: any) => (
<TableRow key={model.model_name}>
<TableCell><Title>{model.model_name}</Title></TableCell>
<TableCell>{model.provider}</TableCell>
<TableCell>{model.input_cost}</TableCell>
<TableCell>{model.output_cost}</TableCell>
<TableCell>{model.max_tokens}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</Card>
</Grid>
</div>
);
};
export default ModelDashboard;
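
The "Provider" column in the table above is derived by splitting `litellm_params.model` on `/` and falling back to `openai` when there is no prefix. A small pure-function sketch of that rule follows; the sample model names are illustrative only.

```typescript
// Provider-derivation rule used for the "Provider" column, as a pure function.
function providerFromModel(litellmModelName?: string | null): string {
  const defaultProvider = "openai";
  if (!litellmModelName) return defaultProvider; // null/undefined model name falls back to openai
  const parts = litellmModelName.split("/");
  // "azure/chatgpt-v-2" -> "azure"; a bare model name has no prefix, so default to openai
  return parts.length === 1 ? defaultProvider : parts[0];
}

console.log(providerFromModel("azure/chatgpt-v-2")); // "azure"
console.log(providerFromModel("gpt-3.5-turbo"));     // "openai"
console.log(providerFromModel(undefined));           // "openai"
```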

View file

@ -137,6 +137,41 @@ export const userInfoCall = async (
} }
}; };
export const modelInfoCall = async (
accessToken: String,
userID: String,
userRole: String
) => {
try {
let url = proxyBaseUrl ? `${proxyBaseUrl}/model/info` : `/model/info`;
message.info("Requesting model data");
const response = await fetch(url, {
method: "GET",
headers: {
Authorization: `Bearer ${accessToken}`,
"Content-Type": "application/json",
},
});
if (!response.ok) {
const errorData = await response.text();
message.error(errorData);
throw new Error("Network response was not ok");
}
const data = await response.json();
message.info("Received model data");
return data;
// Handle success - the caller can update state or UI with the returned model info
} catch (error) {
console.error("Failed to fetch model info:", error);
throw error;
}
};
export const keySpendLogsCall = async (accessToken: String, token: String) => { export const keySpendLogsCall = async (accessToken: String, token: String) => {
try { try {
const url = proxyBaseUrl ? `${proxyBaseUrl}/spend/logs` : `/spend/logs`; const url = proxyBaseUrl ? `${proxyBaseUrl}/spend/logs` : `/spend/logs`;

View file

@ -1,6 +1,6 @@
"use client"; "use client";
import React, { useState, useEffect } from "react"; import React, { useState, useEffect } from "react";
import { userInfoCall } from "./networking"; import { userInfoCall, modelInfoCall } from "./networking";
import { Grid, Col, Card, Text } from "@tremor/react"; import { Grid, Col, Card, Text } from "@tremor/react";
import CreateKey from "./create_key_button"; import CreateKey from "./create_key_button";
import ViewKeyTable from "./view_key_table"; import ViewKeyTable from "./view_key_table";
@ -47,6 +47,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
const token = searchParams.get("token"); const token = searchParams.get("token");
const [accessToken, setAccessToken] = useState<string | null>(null); const [accessToken, setAccessToken] = useState<string | null>(null);
const [userModels, setUserModels] = useState<string[]>([]);
function formatUserRole(userRole: string) { function formatUserRole(userRole: string) {
if (!userRole) { if (!userRole) {
@ -96,22 +97,39 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
} }
} }
if (userID && accessToken && userRole && !data) { if (userID && accessToken && userRole && !data) {
const cachedData = localStorage.getItem("userData"); const cachedData = localStorage.getItem("userData" + userID);
const cachedSpendData = localStorage.getItem("userSpendData"); const cachedSpendData = localStorage.getItem("userSpendData" + userID);
if (cachedData && cachedSpendData) { const cachedUserModels = localStorage.getItem("userModels" + userID);
if (cachedData && cachedSpendData && cachedUserModels) {
setData(JSON.parse(cachedData)); setData(JSON.parse(cachedData));
setUserSpendData(JSON.parse(cachedSpendData)); setUserSpendData(JSON.parse(cachedSpendData));
setUserModels(JSON.parse(cachedUserModels));
} else { } else {
const fetchData = async () => { const fetchData = async () => {
try { try {
const response = await userInfoCall(accessToken, userID, userRole); const response = await userInfoCall(accessToken, userID, userRole);
setUserSpendData(response["user_info"]); setUserSpendData(response["user_info"]);
setData(response["keys"]); // Assuming this is the correct path to your data setData(response["keys"]); // Assuming this is the correct path to your data
localStorage.setItem("userData", JSON.stringify(response["keys"])); localStorage.setItem("userData" + userID, JSON.stringify(response["keys"]));
localStorage.setItem( localStorage.setItem(
"userSpendData", "userSpendData" + userID,
JSON.stringify(response["user_info"]) JSON.stringify(response["user_info"])
); );
const model_info = await modelInfoCall(accessToken, userID, userRole);
console.log("model_info:", model_info);
// loop through model_info["data"] and create an array of element.model_name
let available_model_names = model_info["data"].map((element: { model_name: string; }) => element.model_name);
console.log("available_model_names:", available_model_names);
setUserModels(available_model_names);
console.log("userModels:", userModels);
localStorage.setItem("userModels" + userID, JSON.stringify(available_model_names));
} catch (error) { } catch (error) {
console.error("There was an error fetching the data", error); console.error("There was an error fetching the data", error);
// Optionally, update your UI to reflect the error state here as well // Optionally, update your UI to reflect the error state here as well
@ -158,6 +176,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
<CreateKey <CreateKey
userID={userID} userID={userID}
userRole={userRole} userRole={userRole}
userModels={userModels}
accessToken={accessToken} accessToken={accessToken}
data={data} data={data}
setData={setData} setData={setData}
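
The dashboard changes above switch the localStorage cache to per-user keys (`"userData" + userID`, `"userSpendData" + userID`, `"userModels" + userID`) and clear the entry when keys are created or deleted. A minimal sketch of that pattern as reusable helpers; the helper names are illustrative, not part of the commit.

```typescript
// Per-user localStorage cache helpers; keys are suffixed with userID so users
// sharing a browser don't read each other's cached data.
function readUserCache<T>(name: string, userID: string): T | null {
  const raw = localStorage.getItem(name + userID);
  return raw ? (JSON.parse(raw) as T) : null;
}

function writeUserCache(name: string, userID: string, value: unknown): void {
  localStorage.setItem(name + userID, JSON.stringify(value));
}

// Called after key create/delete so the next dashboard load refetches fresh data.
function invalidateUserCache(name: string, userID: string): void {
  localStorage.removeItem(name + userID);
}

// usage, mirroring user_dashboard.tsx:
// const cachedKeys = readUserCache<any[]>("userData", userID);
// writeUserCache("userData", userID, keys);
// invalidateUserCache("userData", userID);
```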

View file

@ -43,6 +43,7 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
// Set the key to delete and open the confirmation modal // Set the key to delete and open the confirmation modal
setKeyToDelete(token); setKeyToDelete(token);
localStorage.removeItem("userData" + userID)
setIsDeleteModalOpen(true); setIsDeleteModalOpen(true);
}; };