forked from phoenix/litellm-mirror

commit 57654f4533: Merge branch 'main' into litellm_aioboto3_sagemaker

79 changed files with 3440 additions and 253 deletions
@@ -34,6 +34,8 @@ LiteLLM manages:

[**Jump to OpenAI Proxy Docs**](https://github.com/BerriAI/litellm?tab=readme-ov-file#openai-proxy---docs) <br>
[**Jump to Supported LLM Providers**](https://github.com/BerriAI/litellm?tab=readme-ov-file#supported-provider-docs)

Support for more providers. Missing a provider or LLM platform? Raise a [feature request](https://github.com/BerriAI/litellm/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.yml&title=%5BFeature%5D%3A+).

# Usage ([**Docs**](https://docs.litellm.ai/docs/))

> [!IMPORTANT]
> LiteLLM v1.0.0 now requires `openai>=1.0.0`. Migration guide [here](https://docs.litellm.ai/docs/migration)
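One way to satisfy that constraint is to upgrade both packages together; a sketch (exact version pins are up to you):

```shell
pip install --upgrade "litellm" "openai>=1.0.0"
```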
@@ -1,5 +1,10 @@

# Custom Callbacks

:::info

**For PROXY** [Go Here](../proxy/logging.md#custom-callback-class-async)

:::

## Callback Class

You can create a custom callback class to precisely log events as they occur in litellm.
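For example, a minimal handler; this is a sketch that follows the `CustomLogger` interface (the same class extended later in this diff):

```python
from litellm.integrations.custom_logger import CustomLogger

class MyCustomHandler(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        # fires after every successful LLM call
        print(f"On Success: {response_obj}")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        # fires after every failed LLM call
        print("On Failure")
```

You would then register it with `litellm.callbacks = [MyCustomHandler()]`.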
@ -79,6 +79,23 @@ model_list:
|
|||
mode: embedding # 👈 ADD THIS
|
||||
```
|
||||
|
||||
### Image Generation Models
|
||||
|
||||
We need some way to know if the model is an image generation model when running checks, if you have this in your config, specifying mode it makes an image generation health check
|
||||
|
||||
```yaml
|
||||
model_list:
|
||||
- model_name: dall-e-3
|
||||
litellm_params:
|
||||
model: azure/dall-e-3
|
||||
api_base: os.environ/AZURE_API_BASE
|
||||
api_key: os.environ/AZURE_API_KEY
|
||||
api_version: "2023-07-01-preview"
|
||||
model_info:
|
||||
mode: image_generation # 👈 ADD THIS
|
||||
```
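With `mode` set, the proxy's `/health` endpoint runs the matching check. A minimal sketch of triggering it (port and key are assumptions):

```shell
curl --location 'http://0.0.0.0:8000/health' \
    --header 'Authorization: Bearer sk-1234'
```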

### Text Completion Models

The health check needs a way to know that a model is a text completion model. If you set `mode` for the model in your config, the health check runs a text completion request against it.
@@ -4,21 +4,24 @@ import Image from '@theme/IdealImage';

LiteLLM supports [Microsoft Presidio](https://github.com/microsoft/presidio/) for PII masking.

## Step 1. Add env

## Quick Start
### Step 1. Add env

```bash
export PRESIDIO_ANALYZER_API_BASE="http://localhost:5002"
export PRESIDIO_ANONYMIZER_API_BASE="http://localhost:5001"
```

## Step 2. Set it as a callback in config.yaml
### Step 2. Set it as a callback in config.yaml

```yaml
litellm_settings:
  litellm.callbacks = ["presidio"]
  callbacks = ["presidio", ...] # e.g. ["presidio", custom_callbacks.proxy_handler_instance]
```

## Start proxy
### Step 3. Start proxy

```
litellm --config /path/to/config.yaml
```
@@ -27,4 +30,28 @@ litellm --config /path/to/config.yaml

This will mask the input going to the LLM provider.

<Image img={require('../../img/presidio_screenshot.png')} />

## Output parsing

LLM responses can sometimes contain the masked tokens.

For presidio 'replace' operations, LiteLLM can check the LLM response and replace the masked token with the user-submitted values.

Just set `litellm.output_parse_pii = True` to enable this.

```yaml
litellm_settings:
  output_parse_pii: true
```

**Expected Flow:**

1. User Input: "hello world, my name is Jane Doe. My number is: 034453334"
2. LLM Input: "hello world, my name is [PERSON]. My number is: [PHONE_NUMBER]"
3. LLM Response: "Hey [PERSON], nice to meet you!"
4. User Response: "Hey Jane Doe, nice to meet you!"
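To verify the flow end-to-end, a hypothetical request against the proxy (model name and key are placeholders):

```shell
curl --location 'http://0.0.0.0:8000/chat/completions' \
    --header 'Authorization: Bearer sk-1234' \
    --header 'Content-Type: application/json' \
    --data '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "hello world, my name is Jane Doe. My number is: 034453334"}]
    }'
```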
@@ -370,12 +370,12 @@ See the latest available ghcr docker image here:
https://github.com/berriai/litellm/pkgs/container/litellm

```shell
docker pull ghcr.io/berriai/litellm:main-v1.16.13
docker pull ghcr.io/berriai/litellm:main-latest
```

### Run the Docker Image
```shell
docker run ghcr.io/berriai/litellm:main-v1.16.13
docker run ghcr.io/berriai/litellm:main-latest
```

#### Run the Docker Image with LiteLLM CLI args

@@ -384,12 +384,12 @@ See all supported CLI args [here](https://docs.litellm.ai/docs/proxy/cli):

Here's how you can run the docker image and pass your config to `litellm`
```shell
docker run ghcr.io/berriai/litellm:main-v1.16.13 --config your_config.yaml
docker run ghcr.io/berriai/litellm:main-latest --config your_config.yaml
```

Here's how you can run the docker image and start litellm on port 8002 with `num_workers=8`
```shell
docker run ghcr.io/berriai/litellm:main-v1.16.13 --port 8002 --num_workers 8
docker run ghcr.io/berriai/litellm:main-latest --port 8002 --num_workers 8
```

#### Run the Docker Image using docker compose
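A minimal sketch of such a compose service (image tag, port, and config path are assumptions for illustration):

```yaml
version: "3.9"
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    volumes:
      - ./your_config.yaml:/app/config.yaml # mount your config
    command: ["--config", "/app/config.yaml", "--port", "8000", "--num_workers", "8"]
    ports:
      - "8000:8000"
```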
@@ -37,12 +37,12 @@ http://0.0.0.0:8000/ui # <proxy_base_url>/ui
```


## Get Admin UI Link on Swagger
### 3. Get Admin UI Link on Swagger
Your Proxy Swagger is available on the root of the Proxy, e.g. `http://localhost:4000/`

<Image img={require('../../img/ui_link.png')} />

## Change default username + password
### 4. Change default username + password

Set the following in your .env on the Proxy
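For reference, a sketch of those variables (names follow litellm's admin UI settings; the values are placeholders):

```shell
UI_USERNAME=admin
UI_PASSWORD=your-secure-password
```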
@@ -111,6 +111,29 @@ MICROSOFT_TENANT="5a39737

</TabItem>


<TabItem value="Generic" label="Generic SSO Provider">

A generic OAuth client that can be used to quickly add support for any OAuth provider with little to no code.

**Required .env variables on your Proxy**
```shell
GENERIC_CLIENT_ID = "******"
GENERIC_CLIENT_SECRET = "G*******"
GENERIC_AUTHORIZATION_ENDPOINT = "http://localhost:9090/auth"
GENERIC_TOKEN_ENDPOINT = "http://localhost:9090/token"
GENERIC_USERINFO_ENDPOINT = "http://localhost:9090/me"
```

- Set a Redirect URI, if your provider requires it
- Set the redirect url to `<your proxy base url>/sso/callback`
```shell
http://localhost:4000/sso/callback
```

</TabItem>

</Tabs>

### Step 3. Test flow
@@ -197,7 +197,7 @@ from openai import OpenAI
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")

response = openai.embeddings.create(
response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)
@@ -281,6 +281,84 @@ print(query_result[:5])

```

## `/moderations`


### Request Format
Input, Output and Exceptions are mapped to the OpenAI format for all supported models

<Tabs>
<TabItem value="openai" label="OpenAI Python v1.0.0+">

```python
import openai
from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")

response = client.moderations.create(
    input="hello from litellm",
    model="text-moderation-stable"
)

print(response)

```
</TabItem>
<TabItem value="Curl" label="Curl Request">

```shell
curl --location 'http://0.0.0.0:8000/moderations' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer sk-1234' \
    --data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
```
</TabItem>
</Tabs>


### Response Format

```json
{
  "id": "modr-8sFEN22QCziALOfWTa77TodNLgHwA",
  "model": "text-moderation-007",
  "results": [
    {
      "categories": {
        "harassment": false,
        "harassment/threatening": false,
        "hate": false,
        "hate/threatening": false,
        "self-harm": false,
        "self-harm/instructions": false,
        "self-harm/intent": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": false,
        "violence/graphic": false
      },
      "category_scores": {
        "harassment": 0.000019947197870351374,
        "harassment/threatening": 5.5971017900446896e-6,
        "hate": 0.000028560316422954202,
        "hate/threatening": 2.2631787999216613e-8,
        "self-harm": 2.9121162015144364e-7,
        "self-harm/instructions": 9.314219084899378e-8,
        "self-harm/intent": 8.093739012338119e-8,
        "sexual": 0.00004414955765241757,
        "sexual/minors": 0.0000156943697220413,
        "violence": 0.00022354527027346194,
        "violence/graphic": 8.804164281173144e-6
      },
      "flagged": false
    }
  ]
}
```


## Advanced
@@ -696,7 +696,9 @@ general_settings:
      "region_name": "us-west-2"
      "user_table_name": "your-user-table",
      "key_table_name": "your-token-table",
      "config_table_name": "your-config-table"
      "config_table_name": "your-config-table",
      "aws_role_name": "your-aws_role_name",
      "aws_session_name": "your-aws_session_name",
    }
```
@@ -67,6 +67,7 @@ max_budget: float = 0.0 # set the max budget across all providers
budget_duration: Optional[
    str
] = None  # proxy only - resets budget after fixed duration. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
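For context, a hedged sketch of how these module-level budget settings are used (values are illustrative):

```python
import litellm

litellm.max_budget = 10.0        # cap spend, in USD, across all providers
litellm.budget_duration = "30d"  # proxy only: the budget resets every 30 days
```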
_openai_finish_reasons = ["stop", "length", "function_call", "content_filter", "null"]
_openai_completion_params = [
    "functions",
    "function_call",

@@ -164,6 +165,8 @@ secret_manager_client: Optional[
] = None  # list of instantiated key management clients - e.g. azure kv, infisical, etc.
_google_kms_resource_name: Optional[str] = None
_key_management_system: Optional[KeyManagementSystem] = None
#### PII MASKING ####
output_parse_pii: bool = False
#############################################
@@ -675,6 +675,9 @@ class S3Cache(BaseCache):
    def flush_cache(self):
        pass

    async def disconnect(self):
        pass


class DualCache(BaseCache):
    """
@@ -2,9 +2,11 @@
# On success, logs events to Promptlayer
import dotenv, os
import requests

from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache
from typing import Literal

from typing import Literal, Union

dotenv.load_dotenv()  # Loading env variables using dotenv
import traceback
@@ -54,7 +56,7 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
        user_api_key_dict: UserAPIKeyAuth,
        cache: DualCache,
        data: dict,
        call_type: Literal["completion", "embeddings"],
        call_type: Literal["completion", "embeddings", "image_generation"],
    ):
        pass
@@ -63,6 +65,13 @@ class CustomLogger: # https://docs.litellm.ai/docs/observability/custom_callbac
    ):
        pass

    async def async_post_call_success_hook(
        self,
        user_api_key_dict: UserAPIKeyAuth,
        response,
    ):
        pass
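A sketch of overriding this new hook in a handler (class name assumed; the presidio integration later in this diff does the same):

```python
class MyProxyHandler(CustomLogger):
    async def async_post_call_success_hook(self, user_api_key_dict, response):
        # inspect or rewrite the final response before the proxy returns it
        return response
```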

    #### SINGLE-USE #### - https://docs.litellm.ai/docs/observability/custom_callback#using-your-custom-callback-function

    def log_input_event(self, model, messages, kwargs, print_verbose, callback_func):
@@ -477,8 +477,8 @@ def init_bedrock_client(


def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
    # handle anthropic prompts using anthropic constants
    if provider == "anthropic":
    # handle anthropic prompts and amazon titan prompts
    if provider == "anthropic" or provider == "amazon":
        if model in custom_prompt_dict:
            # check if the model has a registered custom prompt
            model_prompt_details = custom_prompt_dict[model]

@@ -490,7 +490,7 @@ def convert_messages_to_prompt(model, messages, provider, custom_prompt_dict):
            )
        else:
            prompt = prompt_factory(
                model=model, messages=messages, custom_llm_provider="anthropic"
                model=model, messages=messages, custom_llm_provider="bedrock"
            )
    else:
        prompt = ""

@@ -623,6 +623,7 @@ def completion(
                    "textGenerationConfig": inference_params,
                }
            )

        else:
            data = json.dumps({})
@@ -90,9 +90,11 @@ def ollama_pt(
        return {"prompt": prompt, "images": images}
    else:
        prompt = "".join(
            m["content"]
            if isinstance(m["content"], str)
            else "".join(m["content"])
            (
                m["content"]
                if isinstance(m["content"], str)
                else "".join(m["content"])
            )
            for m in messages
        )
        return prompt
@@ -422,6 +424,34 @@ def anthropic_pt(
    return prompt


def amazon_titan_pt(
    messages: list,
):  # format - https://github.com/BerriAI/litellm/issues/1896
    """
    Amazon Titan uses 'User:' and 'Bot:' in its prompt template
    """

    class AmazonTitanConstants(Enum):
        HUMAN_PROMPT = "\n\nUser: "  # Assuming this is similar to Anthropic prompt formatting, since amazon titan's prompt formatting is currently undocumented
        AI_PROMPT = "\n\nBot: "

    prompt = ""
    for idx, message in enumerate(messages):
        if message["role"] == "user":
            prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}{message['content']}"
        elif message["role"] == "system":
            prompt += f"{AmazonTitanConstants.HUMAN_PROMPT.value}<admin>{message['content']}</admin>"
        else:
            prompt += f"{AmazonTitanConstants.AI_PROMPT.value}{message['content']}"
        if (
            idx == 0 and message["role"] == "assistant"
        ):  # ensure the prompt always starts with `\n\nUser: `
            prompt = f"{AmazonTitanConstants.HUMAN_PROMPT.value}" + prompt
    if messages[-1]["role"] != "assistant":
        prompt += f"{AmazonTitanConstants.AI_PROMPT.value}"
    return prompt
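For illustration, what this produces on a simple exchange (a worked example, not part of the diff):

```python
messages = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
]
# amazon_titan_pt(messages) returns:
# "\n\nUser: <admin>Be brief.</admin>\n\nUser: Hi\n\nBot: "
```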


def _load_image_from_url(image_url):
    try:
        from PIL import Image
@@ -636,6 +666,14 @@ def prompt_factory(
        return gemini_text_image_pt(messages=messages)
    elif custom_llm_provider == "mistral":
        return mistral_api_pt(messages=messages)
    elif custom_llm_provider == "bedrock":
        if "amazon.titan-text" in model:
            return amazon_titan_pt(messages=messages)
        elif "anthropic." in model:
            if any(_ in model for _ in ["claude-2.1", "claude-v2:1"]):
                return claude_2_1_pt(messages=messages)
            else:
                return anthropic_pt(messages=messages)
    try:
        if "meta-llama/llama-2" in model and "chat" in model:
            return llama_2_chat_pt(messages=messages)
@@ -484,7 +484,7 @@ def embedding(
    aws_access_key_id = optional_params.pop("aws_access_key_id", None)
    aws_region_name = optional_params.pop("aws_region_name", None)

    if aws_access_key_id != None:
    if aws_access_key_id is not None:
        # uses auth params passed to completion
        # aws_access_key_id is not None, assume user is trying to auth using litellm.completion
        client = boto3.client(
@@ -4,7 +4,7 @@ from enum import Enum
import requests
import time
from typing import Callable, Optional, Union
from litellm.utils import ModelResponse, Usage, CustomStreamWrapper
from litellm.utils import ModelResponse, Usage, CustomStreamWrapper, map_finish_reason
import litellm, uuid
import httpx
@@ -575,9 +575,9 @@ def completion(
        model_response["model"] = model
        ## CALCULATING USAGE
        if model in litellm.vertex_language_models and response_obj is not None:
            model_response["choices"][0].finish_reason = response_obj.candidates[
                0
            ].finish_reason.name
            model_response["choices"][0].finish_reason = map_finish_reason(
                response_obj.candidates[0].finish_reason.name
            )
            usage = Usage(
                prompt_tokens=response_obj.usage_metadata.prompt_token_count,
                completion_tokens=response_obj.usage_metadata.candidates_token_count,

@@ -771,9 +771,9 @@ async def async_completion(
        model_response["model"] = model
        ## CALCULATING USAGE
        if model in litellm.vertex_language_models and response_obj is not None:
            model_response["choices"][0].finish_reason = response_obj.candidates[
                0
            ].finish_reason.name
            model_response["choices"][0].finish_reason = map_finish_reason(
                response_obj.candidates[0].finish_reason.name
            )
            usage = Usage(
                prompt_tokens=response_obj.usage_metadata.prompt_token_count,
                completion_tokens=response_obj.usage_metadata.candidates_token_count,
@@ -10,7 +10,6 @@
import os, openai, sys, json, inspect, uuid, datetime, threading
from typing import Any, Literal, Union
from functools import partial

import dotenv, traceback, random, asyncio, time, contextvars
from copy import deepcopy
import httpx
@@ -2964,16 +2963,39 @@ def text_completion(


##### Moderation #######################
def moderation(input: str, api_key: Optional[str] = None):


def moderation(
    input: str, model: Optional[str] = None, api_key: Optional[str] = None, **kwargs
):
    # only supports OpenAI for now
    api_key = (
        api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
    )
    openai.api_key = api_key
    openai.api_type = "open_ai"  # type: ignore
    openai.api_version = None
    openai.base_url = "https://api.openai.com/v1/"
    response = openai.moderations.create(input=input)

    openai_client = kwargs.get("client", None)
    if openai_client is None:
        openai_client = openai.OpenAI(
            api_key=api_key,
        )

    response = openai_client.moderations.create(input=input, model=model)
    return response


##### Moderation #######################
@client
async def amoderation(input: str, model: str, api_key: Optional[str] = None, **kwargs):
    # only supports OpenAI for now
    api_key = (
        api_key or litellm.api_key or litellm.openai_key or get_secret("OPENAI_API_KEY")
    )
    openai_client = kwargs.get("client", None)
    if openai_client is None:
        openai_client = openai.AsyncOpenAI(
            api_key=api_key,
        )
    response = await openai_client.moderations.create(input=input, model=model)
    return response
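A hedged usage sketch of the updated helpers (assumes `OPENAI_API_KEY` is set in the environment):

```python
import litellm

# sync
resp = litellm.moderation(input="hello from litellm", model="text-moderation-stable")
print(resp.results[0].flagged)

# async (inside an event loop)
# resp = await litellm.amoderation(input="hello from litellm", model="text-moderation-stable")
```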
@@ -198,6 +198,33 @@
        "litellm_provider": "openai",
        "mode": "embedding"
    },
    "text-moderation-stable": {
        "max_tokens": 32768,
        "max_input_tokens": 32768,
        "max_output_tokens": 0,
        "input_cost_per_token": 0.000000,
        "output_cost_per_token": 0.000000,
        "litellm_provider": "openai",
        "mode": "moderations"
    },
    "text-moderation-007": {
        "max_tokens": 32768,
        "max_input_tokens": 32768,
        "max_output_tokens": 0,
        "input_cost_per_token": 0.000000,
        "output_cost_per_token": 0.000000,
        "litellm_provider": "openai",
        "mode": "moderations"
    },
    "text-moderation-latest": {
        "max_tokens": 32768,
        "max_input_tokens": 32768,
        "max_output_tokens": 0,
        "input_cost_per_token": 0.000000,
        "output_cost_per_token": 0.000000,
        "litellm_provider": "openai",
        "mode": "moderations"
    },
    "256-x-256/dall-e-2": {
        "mode": "image_generation",
        "input_cost_per_pixel": 0.00000024414,
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -1 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{11837:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=11837)}),_N_E=n.O()}]);
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[185],{87421:function(n,e,t){Promise.resolve().then(t.t.bind(t,99646,23)),Promise.resolve().then(t.t.bind(t,63385,23))},63385:function(){},99646:function(n){n.exports={style:{fontFamily:"'__Inter_c23dc8', '__Inter_Fallback_c23dc8'",fontStyle:"normal"},className:"__className_c23dc8"}}},function(n){n.O(0,[971,69,744],function(){return n(n.s=87421)}),_N_E=n.O()}]);
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -1 +1 @@
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{70377:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(70377)}),_N_E=e.O()}]);
(self.webpackChunk_N_E=self.webpackChunk_N_E||[]).push([[744],{32028:function(e,n,t){Promise.resolve().then(t.t.bind(t,47690,23)),Promise.resolve().then(t.t.bind(t,48955,23)),Promise.resolve().then(t.t.bind(t,5613,23)),Promise.resolve().then(t.t.bind(t,11902,23)),Promise.resolve().then(t.t.bind(t,31778,23)),Promise.resolve().then(t.t.bind(t,77831,23))}},function(e){var n=function(n){return e(e.s=n)};e.O(0,[971,69],function(){return n(35317),n(32028)}),_N_E=e.O()}]);
@@ -1 +1 @@
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/654259bbf9e4c196.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();
!function(){"use strict";var e,t,n,r,o,u,i,c,f,a={},l={};function d(e){var t=l[e];if(void 0!==t)return t.exports;var n=l[e]={id:e,loaded:!1,exports:{}},r=!0;try{a[e](n,n.exports,d),r=!1}finally{r&&delete l[e]}return n.loaded=!0,n.exports}d.m=a,e=[],d.O=function(t,n,r,o){if(n){o=o||0;for(var u=e.length;u>0&&e[u-1][2]>o;u--)e[u]=e[u-1];e[u]=[n,r,o];return}for(var i=1/0,u=0;u<e.length;u++){for(var n=e[u][0],r=e[u][1],o=e[u][2],c=!0,f=0;f<n.length;f++)i>=o&&Object.keys(d.O).every(function(e){return d.O[e](n[f])})?n.splice(f--,1):(c=!1,o<i&&(i=o));if(c){e.splice(u--,1);var a=r();void 0!==a&&(t=a)}}return t},d.n=function(e){var t=e&&e.__esModule?function(){return e.default}:function(){return e};return d.d(t,{a:t}),t},n=Object.getPrototypeOf?function(e){return Object.getPrototypeOf(e)}:function(e){return e.__proto__},d.t=function(e,r){if(1&r&&(e=this(e)),8&r||"object"==typeof e&&e&&(4&r&&e.__esModule||16&r&&"function"==typeof e.then))return e;var o=Object.create(null);d.r(o);var u={};t=t||[null,n({}),n([]),n(n)];for(var i=2&r&&e;"object"==typeof i&&!~t.indexOf(i);i=n(i))Object.getOwnPropertyNames(i).forEach(function(t){u[t]=function(){return e[t]}});return u.default=function(){return e},d.d(o,u),o},d.d=function(e,t){for(var n in t)d.o(t,n)&&!d.o(e,n)&&Object.defineProperty(e,n,{enumerable:!0,get:t[n]})},d.f={},d.e=function(e){return Promise.all(Object.keys(d.f).reduce(function(t,n){return d.f[n](e,t),t},[]))},d.u=function(e){},d.miniCssF=function(e){return"static/css/c18941d97fb7245b.css"},d.g=function(){if("object"==typeof globalThis)return globalThis;try{return this||Function("return this")()}catch(e){if("object"==typeof window)return window}}(),d.o=function(e,t){return Object.prototype.hasOwnProperty.call(e,t)},r={},o="_N_E:",d.l=function(e,t,n,u){if(r[e]){r[e].push(t);return}if(void 0!==n)for(var i,c,f=document.getElementsByTagName("script"),a=0;a<f.length;a++){var l=f[a];if(l.getAttribute("src")==e||l.getAttribute("data-webpack")==o+n){i=l;break}}i||(c=!0,(i=document.createElement("script")).charset="utf-8",i.timeout=120,d.nc&&i.setAttribute("nonce",d.nc),i.setAttribute("data-webpack",o+n),i.src=d.tu(e)),r[e]=[t];var s=function(t,n){i.onerror=i.onload=null,clearTimeout(p);var o=r[e];if(delete r[e],i.parentNode&&i.parentNode.removeChild(i),o&&o.forEach(function(e){return e(n)}),t)return t(n)},p=setTimeout(s.bind(null,void 0,{type:"timeout",target:i}),12e4);i.onerror=s.bind(null,i.onerror),i.onload=s.bind(null,i.onload),c&&document.head.appendChild(i)},d.r=function(e){"undefined"!=typeof Symbol&&Symbol.toStringTag&&Object.defineProperty(e,Symbol.toStringTag,{value:"Module"}),Object.defineProperty(e,"__esModule",{value:!0})},d.nmd=function(e){return e.paths=[],e.children||(e.children=[]),e},d.tt=function(){return void 0===u&&(u={createScriptURL:function(e){return e}},"undefined"!=typeof trustedTypes&&trustedTypes.createPolicy&&(u=trustedTypes.createPolicy("nextjs#bundler",u))),u},d.tu=function(e){return d.tt().createScriptURL(e)},d.p="/ui/_next/",i={272:0},d.f.j=function(e,t){var n=d.o(i,e)?i[e]:void 0;if(0!==n){if(n)t.push(n[2]);else if(272!=e){var r=new Promise(function(t,r){n=i[e]=[t,r]});t.push(n[2]=r);var o=d.p+d.u(e),u=Error();d.l(o,function(t){if(d.o(i,e)&&(0!==(n=i[e])&&(i[e]=void 0),n)){var r=t&&("load"===t.type?"missing":t.type),o=t&&t.target&&t.target.src;u.message="Loading chunk "+e+" failed.\n("+r+": "+o+")",u.name="ChunkLoadError",u.type=r,u.request=o,n[1](u)}},"chunk-"+e,e)}else i[e]=0}},d.O.j=function(e){return 0===i[e]},c=function(e,t){var 
n,r,o=t[0],u=t[1],c=t[2],f=0;if(o.some(function(e){return 0!==i[e]})){for(n in u)d.o(u,n)&&(d.m[n]=u[n]);if(c)var a=c(d)}for(e&&e(t);f<o.length;f++)r=o[f],d.o(i,r)&&i[r]&&i[r][0](),i[r]=0;return d.O(a)},(f=self.webpackChunk_N_E=self.webpackChunk_N_E||[]).forEach(c.bind(null,0)),f.push=c.bind(null,f.push.bind(f))}();
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -1 +1 @@
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-85c9b4219c1bb384.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-9b4acf26920649bc.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-096338c8e1915716.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-b7e811ae2c6ca05f.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/654259bbf9e4c196.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[75985,[\"838\",\"static/chunks/838-7fa0bab5a1c3631d.js\",\"931\",\"static/chunks/app/page-5a7453e3903c5d60.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/654259bbf9e4c196.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"4mrMigZY9ob7yaIDjXpX6\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>
<!DOCTYPE html><html id="__next_error__"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="preload" as="script" fetchPriority="low" href="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin=""/><script src="/ui/_next/static/chunks/fd9d1056-a85b2c176012d8e5.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/69-e1b183dda365ec86.js" async="" crossorigin=""></script><script src="/ui/_next/static/chunks/main-app-9b4fb13a7db53edf.js" async="" crossorigin=""></script><title>🚅 LiteLLM</title><meta name="description" content="LiteLLM Proxy Admin UI"/><link rel="icon" href="/ui/favicon.ico" type="image/x-icon" sizes="16x16"/><meta name="next-size-adjust"/><script src="/ui/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js" crossorigin="" noModule=""></script></head><body><script src="/ui/_next/static/chunks/webpack-db47c93f042d6d15.js" crossorigin="" async=""></script><script>(self.__next_f=self.__next_f||[]).push([0]);self.__next_f.push([2,null])</script><script>self.__next_f.push([1,"1:HL[\"/ui/_next/static/media/c9a5bc6a7c948fb0-s.p.woff2\",\"font\",{\"crossOrigin\":\"\",\"type\":\"font/woff2\"}]\n2:HL[\"/ui/_next/static/css/c18941d97fb7245b.css\",\"style\",{\"crossOrigin\":\"\"}]\n0:\"$L3\"\n"])</script><script>self.__next_f.push([1,"4:I[47690,[],\"\"]\n6:I[77831,[],\"\"]\n7:I[48016,[\"145\",\"static/chunks/145-9c160ad5539e000f.js\",\"931\",\"static/chunks/app/page-fcb69349f15d154b.js\"],\"\"]\n8:I[5613,[],\"\"]\n9:I[31778,[],\"\"]\nb:I[48955,[],\"\"]\nc:[]\n"])</script><script>self.__next_f.push([1,"3:[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/ui/_next/static/css/c18941d97fb7245b.css\",\"precedence\":\"next\",\"crossOrigin\":\"\"}]],[\"$\",\"$L4\",null,{\"buildId\":\"lLFQRQnIrRo-GJf5spHEd\",\"assetPrefix\":\"/ui\",\"initialCanonicalUrl\":\"/\",\"initialTree\":[\"\",{\"children\":[\"__PAGE__\",{}]},\"$undefined\",\"$undefined\",true],\"initialSeedData\":[\"\",{\"children\":[\"__PAGE__\",{},[\"$L5\",[\"$\",\"$L6\",null,{\"propsForComponent\":{\"params\":{}},\"Component\":\"$7\",\"isStaticGeneration\":true}],null]]},[null,[\"$\",\"html\",null,{\"lang\":\"en\",\"children\":[\"$\",\"body\",null,{\"className\":\"__className_c23dc8\",\"children\":[\"$\",\"$L8\",null,{\"parallelRouterKey\":\"children\",\"segmentPath\":[\"children\"],\"loading\":\"$undefined\",\"loadingStyles\":\"$undefined\",\"loadingScripts\":\"$undefined\",\"hasLoading\":false,\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L9\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":[[\"$\",\"title\",null,{\"children\":\"404: This page could not be found.\"}],[\"$\",\"div\",null,{\"style\":{\"fontFamily\":\"system-ui,\\\"Segoe UI\\\",Roboto,Helvetica,Arial,sans-serif,\\\"Apple Color Emoji\\\",\\\"Segoe UI Emoji\\\"\",\"height\":\"100vh\",\"textAlign\":\"center\",\"display\":\"flex\",\"flexDirection\":\"column\",\"alignItems\":\"center\",\"justifyContent\":\"center\"},\"children\":[\"$\",\"div\",null,{\"children\":[[\"$\",\"style\",null,{\"dangerouslySetInnerHTML\":{\"__html\":\"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}\"}}],[\"$\",\"h1\",null,{\"className\":\"next-error-h1\",\"style\":{\"display\":\"inline-block\",\"margin\":\"0 20px 0 
0\",\"padding\":\"0 23px 0 0\",\"fontSize\":24,\"fontWeight\":500,\"verticalAlign\":\"top\",\"lineHeight\":\"49px\"},\"children\":\"404\"}],[\"$\",\"div\",null,{\"style\":{\"display\":\"inline-block\"},\"children\":[\"$\",\"h2\",null,{\"style\":{\"fontSize\":14,\"fontWeight\":400,\"lineHeight\":\"49px\",\"margin\":0},\"children\":\"This page could not be found.\"}]}]]}]}]],\"notFoundStyles\":[],\"styles\":null}]}]}],null]],\"initialHead\":[false,\"$La\"],\"globalErrorComponent\":\"$b\",\"missingSlots\":\"$Wc\"}]]\n"])</script><script>self.__next_f.push([1,"a:[[\"$\",\"meta\",\"0\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1\"}],[\"$\",\"meta\",\"1\",{\"charSet\":\"utf-8\"}],[\"$\",\"title\",\"2\",{\"children\":\"🚅 LiteLLM\"}],[\"$\",\"meta\",\"3\",{\"name\":\"description\",\"content\":\"LiteLLM Proxy Admin UI\"}],[\"$\",\"link\",\"4\",{\"rel\":\"icon\",\"href\":\"/ui/favicon.ico\",\"type\":\"image/x-icon\",\"sizes\":\"16x16\"}],[\"$\",\"meta\",\"5\",{\"name\":\"next-size-adjust\"}]]\n5:null\n"])</script><script>self.__next_f.push([1,""])</script></body></html>
@@ -1,7 +1,7 @@
2:I[77831,[],""]
3:I[75985,["838","static/chunks/838-7fa0bab5a1c3631d.js","931","static/chunks/app/page-5a7453e3903c5d60.js"],""]
3:I[48016,["145","static/chunks/145-9c160ad5539e000f.js","931","static/chunks/app/page-fcb69349f15d154b.js"],""]
4:I[5613,[],""]
5:I[31778,[],""]
0:["4mrMigZY9ob7yaIDjXpX6",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/654259bbf9e4c196.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
0:["lLFQRQnIrRo-GJf5spHEd",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/c18941d97fb7245b.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null
@@ -234,6 +234,15 @@ class DynamoDBArgs(LiteLLMBase):
    key_table_name: str = "LiteLLM_VerificationToken"
    config_table_name: str = "LiteLLM_Config"
    spend_table_name: str = "LiteLLM_SpendLogs"
    aws_role_name: Optional[str] = None
    aws_session_name: Optional[str] = None
    aws_web_identity_token: Optional[str] = None
    aws_provider_id: Optional[str] = None
    aws_policy_arns: Optional[List[str]] = None
    aws_policy: Optional[str] = None
    aws_duration_seconds: Optional[int] = None
    assume_role_aws_role_name: Optional[str] = None
    assume_role_aws_session_name: Optional[str] = None


class ConfigGeneralSettings(LiteLLMBase):
@@ -53,6 +53,41 @@ class DynamoDBWrapper(CustomDB):
        self.database_arguments = database_arguments
        self.region_name = database_arguments.region_name

    def set_env_vars_based_on_arn(self):
        if self.database_arguments.aws_role_name is None:
            return
        verbose_proxy_logger.debug(
            f"DynamoDB: setting env vars based on arn={self.database_arguments.aws_role_name}"
        )
        import boto3, os

        sts_client = boto3.client("sts")

        # call 1: exchange the web identity token for temporary credentials
        non_used_assumed_role = sts_client.assume_role_with_web_identity(
            RoleArn=self.database_arguments.aws_role_name,
            RoleSessionName=self.database_arguments.aws_session_name,
            WebIdentityToken=self.database_arguments.aws_web_identity_token,
        )

        # call 2: assume the target role for the DynamoDB session
        assumed_role = sts_client.assume_role(
            RoleArn=self.database_arguments.assume_role_aws_role_name,
            RoleSessionName=self.database_arguments.assume_role_aws_session_name,
        )

        aws_access_key_id = assumed_role["Credentials"]["AccessKeyId"]
        aws_secret_access_key = assumed_role["Credentials"]["SecretAccessKey"]
        aws_session_token = assumed_role["Credentials"]["SessionToken"]

        verbose_proxy_logger.debug(
            f"Got STS assumed Role, aws_access_key_id={aws_access_key_id}"
        )
        # set these in the env so aiodynamo can use them
        os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
        os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
        os.environ["AWS_SESSION_TOKEN"] = aws_session_token

    async def connect(self):
        """
        Connect to the DB, creating / updating any tables
@@ -75,6 +110,7 @@ class DynamoDBWrapper(CustomDB):
        import aiohttp

        verbose_proxy_logger.debug("DynamoDB Wrapper - Attempting to connect")
        self.set_env_vars_based_on_arn()
        # before making ClientSession check if ssl_verify=False
        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))

@@ -171,6 +207,8 @@ class DynamoDBWrapper(CustomDB):
        from aiohttp import ClientSession
        import aiohttp

        self.set_env_vars_based_on_arn()

        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
        else:

@@ -214,6 +252,8 @@ class DynamoDBWrapper(CustomDB):
        from aiohttp import ClientSession
        import aiohttp

        self.set_env_vars_based_on_arn()

        if self.database_arguments.ssl_verify == False:
            client_session = ClientSession(connector=aiohttp.TCPConnector(ssl=False))
        else:

@@ -261,6 +301,7 @@ class DynamoDBWrapper(CustomDB):
    async def update_data(
        self, key: str, value: dict, table_name: Literal["user", "key", "config"]
    ):
        self.set_env_vars_based_on_arn()
        from aiodynamo.client import Client
        from aiodynamo.credentials import Credentials, StaticCredentials
        from aiodynamo.http.httpx import HTTPX

@@ -334,4 +375,5 @@ class DynamoDBWrapper(CustomDB):
        """
        Not Implemented yet.
        """
        self.set_env_vars_based_on_arn()
        return super().delete_data(keys, table_name)
@@ -8,14 +8,19 @@
# Tell us how we can improve! - Krrish & Ishaan


from typing import Optional
import litellm, traceback, sys
from typing import Optional, Literal, Union
import litellm, traceback, sys, uuid
from litellm.caching import DualCache
from litellm.proxy._types import UserAPIKeyAuth
from litellm.integrations.custom_logger import CustomLogger
from fastapi import HTTPException
from litellm._logging import verbose_proxy_logger
from litellm import ModelResponse
from litellm.utils import (
    ModelResponse,
    EmbeddingResponse,
    ImageResponse,
    StreamingChoices,
)
from datetime import datetime
import aiohttp, asyncio
@@ -24,7 +29,13 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
    user_api_key_cache = None

    # Class variables or attributes
    def __init__(self):
    def __init__(self, mock_testing: bool = False):
        self.pii_tokens: dict = (
            {}
        )  # mapping of PII token to original text - only used with Presidio `replace` operation
        if mock_testing == True:  # for testing purposes only
            return

        self.presidio_analyzer_api_base = litellm.get_secret(
            "PRESIDIO_ANALYZER_API_BASE", None
        )
|
|||
pass
|
||||
|
||||
async def check_pii(self, text: str) -> str:
|
||||
"""
|
||||
[TODO] make this more performant for high-throughput scenario
|
||||
"""
|
||||
try:
|
||||
async with aiohttp.ClientSession() as session:
|
||||
# Make the first request to /analyze
|
||||
analyze_url = f"{self.presidio_analyzer_api_base}/analyze"
|
||||
analyze_payload = {"text": text, "language": "en"}
|
||||
|
||||
redacted_text = None
|
||||
async with session.post(analyze_url, json=analyze_payload) as response:
|
||||
analyze_results = await response.json()
|
||||
|
||||
|
@ -72,6 +86,26 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
|
|||
) as response:
|
||||
redacted_text = await response.json()
|
||||
|
||||
new_text = text
|
||||
if redacted_text is not None:
|
||||
for item in redacted_text["items"]:
|
||||
start = item["start"]
|
||||
end = item["end"]
|
||||
replacement = item["text"] # replacement token
|
||||
if (
|
||||
item["operator"] == "replace"
|
||||
and litellm.output_parse_pii == True
|
||||
):
|
||||
# check if token in dict
|
||||
# if exists, add a uuid to the replacement token for swapping back to the original text in llm response output parsing
|
||||
if replacement in self.pii_tokens:
|
||||
replacement = replacement + uuid.uuid4()
|
||||
|
||||
self.pii_tokens[replacement] = new_text[
|
||||
start:end
|
||||
] # get text it'll replace
|
||||
|
||||
new_text = new_text[:start] + replacement + new_text[end:]
|
||||
return redacted_text["text"]
|
||||
except Exception as e:
|
||||
traceback.print_exc()
|
||||
|
@ -94,6 +128,7 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
|
|||
if call_type == "completion": # /chat/completions requests
|
||||
messages = data["messages"]
|
||||
tasks = []
|
||||
|
||||
for m in messages:
|
||||
if isinstance(m["content"], str):
|
||||
tasks.append(self.check_pii(text=m["content"]))
|
||||
|
@ -104,3 +139,30 @@ class _OPTIONAL_PresidioPIIMasking(CustomLogger):
|
|||
"content"
|
||||
] = r # replace content with redacted string
|
||||
return data
|
||||
|
||||
async def async_post_call_success_hook(
|
||||
self,
|
||||
user_api_key_dict: UserAPIKeyAuth,
|
||||
response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
|
||||
):
|
||||
"""
|
||||
Output parse the response object to replace the masked tokens with user sent values
|
||||
"""
|
||||
verbose_proxy_logger.debug(
|
||||
f"PII Masking Args: litellm.output_parse_pii={litellm.output_parse_pii}; type of response={type(response)}"
|
||||
)
|
||||
if litellm.output_parse_pii == False:
|
||||
return response
|
||||
|
||||
if isinstance(response, ModelResponse) and not isinstance(
|
||||
response.choices[0], StreamingChoices
|
||||
): # /chat/completions requests
|
||||
if isinstance(response.choices[0].message.content, str):
|
||||
verbose_proxy_logger.debug(
|
||||
f"self.pii_tokens: {self.pii_tokens}; initial response: {response.choices[0].message.content}"
|
||||
)
|
||||
for key, value in self.pii_tokens.items():
|
||||
response.choices[0].message.content = response.choices[
|
||||
0
|
||||
].message.content.replace(key, value)
|
||||
return response
|
||||
|
|
|
@ -570,6 +570,7 @@ def run_server(
|
|||
"worker_class": "uvicorn.workers.UvicornWorker",
|
||||
"preload": True, # Add the preload flag,
|
||||
"accesslog": "-", # Log to stdout
|
||||
"timeout": 600, # default to very high number, bedrock/anthropic.claude-v2:1 can take 30+ seconds for the 1st chunk to come in
|
||||
"access_log_format": '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s',
|
||||
}
|
||||
|
||||
|
|
|
@ -9,73 +9,41 @@ model_list:
|
|||
mode: chat
|
||||
max_tokens: 4096
|
||||
base_model: azure/gpt-4-1106-preview
|
||||
access_groups: ["public"]
|
||||
- model_name: openai-gpt-3.5
|
||||
litellm_params:
|
||||
model: gpt-3.5-turbo
|
||||
api_key: os.environ/OPENAI_API_KEY
|
||||
model_info:
|
||||
access_groups: ["public"]
|
||||
- model_name: anthropic-claude-v2.1
|
||||
litellm_params:
|
||||
model: bedrock/anthropic.claude-v2:1
|
||||
timeout: 300 # sets a 5 minute timeout
|
||||
model_info:
|
||||
access_groups: ["private"]
|
||||
- model_name: anthropic-claude-v2
|
||||
litellm_params:
|
||||
model: bedrock/anthropic.claude-v2
|
||||
- model_name: bedrock-cohere
|
||||
litellm_params:
|
||||
model: bedrock/cohere.command-text-v14
|
||||
timeout: 0.0001
|
||||
- model_name: gpt-4
|
||||
litellm_params:
|
||||
model: azure/chatgpt-v-2
|
||||
api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
|
||||
api_version: "2023-05-15"
|
||||
api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env. See https://docs.litellm.ai/docs/simple_proxy#load-api-keys-from-vault
|
||||
- model_name: gpt-vision
|
||||
model_info:
|
||||
base_model: azure/gpt-4
|
||||
- model_name: text-moderation-stable
|
||||
litellm_params:
|
||||
model: azure/gpt-4-vision
|
||||
base_url: https://gpt-4-vision-resource.openai.azure.com/openai/deployments/gpt-4-vision/extensions
|
||||
api_key: os.environ/AZURE_VISION_API_KEY
|
||||
api_version: "2023-09-01-preview"
|
||||
dataSources:
|
||||
- type: AzureComputerVision
|
||||
parameters:
            endpoint: os.environ/AZURE_VISION_ENHANCE_ENDPOINT
            key: os.environ/AZURE_VISION_ENHANCE_KEY
  - model_name: BEDROCK_GROUP
    litellm_params:
      model: bedrock/cohere.command-text-v14
      timeout: 0.0001
  - model_name: tg-ai
    litellm_params:
      model: together_ai/mistralai/Mistral-7B-Instruct-v0.1
  - model_name: sagemaker
    litellm_params:
      model: sagemaker/berri-benchmarking-Llama-2-70b-chat-hf-4
  - model_name: openai-gpt-3.5
    litellm_params:
      model: gpt-3.5-turbo
      model: text-moderation-stable
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      mode: chat
  - model_name: azure-cloudflare
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: https://gateway.ai.cloudflare.com/v1/0399b10e77ac6668c80404a5ff49eb37/litellm-test/azure-openai/openai-gpt-4-test-v-1
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
  - model_name: azure-embedding-model
    litellm_params:
      model: azure/azure-embedding-model
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
    model_info:
      mode: embedding
      base_model: text-embedding-ada-002
  - model_name: text-embedding-ada-002
    litellm_params:
      model: text-embedding-ada-002
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      mode: embedding
|
||||
litellm_settings:
|
||||
fallbacks: [{"openai-gpt-3.5": ["azure-gpt-3.5"]}]
|
||||
success_callback: ['langfuse']
|
||||
max_budget: 10 # global budget for proxy
|
||||
max_user_budget: 0.0001
|
||||
budget_duration: 30d # global budget duration, will reset after 30d
|
||||
default_key_generate_params:
|
||||
max_budget: 1.5000
|
||||
models: ["azure-gpt-3.5"]
|
||||
duration: None
|
||||
upperbound_key_generate_params:
|
||||
max_budget: 100
|
||||
duration: "30d"
|
||||
# setting callback class
|
||||
# callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]
|
||||
|
||||
|
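The `access_groups` tags above ("public"/"private") let keys be issued against a group instead of individual model names. A hedged sketch using the proxy's `/key/generate` endpoint (base URL and master key are placeholders for your deployment):

```python
import requests

resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # proxy master key (placeholder)
    json={"models": ["public"]},  # an access group name works like a model name here
)
key = resp.json()["key"]

# this key can now call gpt-4 and openai-gpt-3.5 (both tagged access_groups: ["public"]),
# but not anthropic-claude-v2.1 (tagged "private")
```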
@@ -93,6 +61,7 @@ general_settings:

environment_variables:
  # otel: True # OpenTelemetry Logger
  # master_key: sk-1234 # [OPTIONAL] Only use this if you want to require all calls to contain this key (Authorization: Bearer sk-1234)
@@ -403,34 +403,43 @@ async def user_api_key_auth(
            verbose_proxy_logger.debug(
                f"LLM Model List pre access group check: {llm_model_list}"
            )
            access_groups = []
            from collections import defaultdict

            access_groups = defaultdict(list)
            if llm_model_list is not None:
                for m in llm_model_list:
                    for group in m.get("model_info", {}).get("access_groups", []):
                        access_groups.append((m["model_name"], group))
                        model_name = m["model_name"]
                        access_groups[group].append(model_name)

            allowed_models = valid_token.models
            access_group_idx = set()
            models_in_current_access_groups = []
            if (
                len(access_groups) > 0
            ):  # check if token contains any model access groups
                for idx, m in enumerate(valid_token.models):
                    for model_name, group in access_groups:
                        if m == group:
                            access_group_idx.add(idx)
                            allowed_models.append(model_name)
                for idx, m in enumerate(
                    valid_token.models
                ):  # loop token models; if any of them is an access group, add that group's models
                    if m in access_groups:
                        # if it is an access group we need to remove it from valid_token.models
                        models_in_group = access_groups[m]
                        models_in_current_access_groups.extend(models_in_group)

            # Filter out models that are access_groups
            filtered_models = [
                m for m in valid_token.models if m not in access_groups
            ]

            filtered_models += models_in_current_access_groups
            verbose_proxy_logger.debug(
                f"model: {model}; allowed_models: {allowed_models}"
                f"model: {model}; allowed_models: {filtered_models}"
            )
            if model is not None and model not in allowed_models:
            if model is not None and model not in filtered_models:
                raise ValueError(
                    f"API Key not allowed to access model. This token can only access models={valid_token.models}. Tried to access {model}"
                )
            for val in access_group_idx:
                allowed_models.pop(val)
            valid_token.models = allowed_models
            valid_token.models = filtered_models
            verbose_proxy_logger.debug(
                f"filtered allowed_models: {allowed_models}; valid_token.models: {valid_token.models}"
                f"filtered allowed_models: {filtered_models}; valid_token.models: {valid_token.models}"
            )

            # Check 2. If user_id for this token is in budget
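Distilled, the new defaultdict-based check replaces any access-group name on a token with that group's member models before the allow-list check. A self-contained sketch of that logic:

```python
from collections import defaultdict

def expand_token_models(llm_model_list: list, token_models: list) -> list:
    access_groups = defaultdict(list)  # group name -> member model names
    for m in llm_model_list:
        for group in m.get("model_info", {}).get("access_groups", []):
            access_groups[group].append(m["model_name"])

    # replace any group name on the token with that group's models
    in_groups = [
        name for g in token_models if g in access_groups for name in access_groups[g]
    ]
    filtered = [m for m in token_models if m not in access_groups]
    return filtered + in_groups

models = expand_token_models(
    [{"model_name": "gpt-4", "model_info": {"access_groups": ["public"]}}],
    ["public", "text-embedding-ada-002"],
)
assert models == ["text-embedding-ada-002", "gpt-4"]
```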
@@ -682,34 +691,31 @@ async def user_api_key_auth(
        # sso/login, ui/login, /key functions and /user functions
        # this will never be allowed to call /chat/completions
        token_team = getattr(valid_token, "team_id", None)
        if token_team is not None:
            if token_team == "litellm-dashboard":
                # this token is only used for managing the ui
                allowed_routes = [
                    "/sso",
                    "/login",
                    "/key",
                    "/spend",
                    "/user",
                ]
                # check if the current route startswith any of the allowed routes
                if (
                    route is not None
                    and isinstance(route, str)
                    and any(
                        route.startswith(allowed_route)
                        for allowed_route in allowed_routes
                    )
                ):
                    # Do something if the current route starts with any of the allowed routes
                    pass
                else:
                    raise Exception(
                        f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed"
                    )
            return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
        else:
            raise Exception(f"Invalid Key Passed to LiteLLM Proxy")
        if token_team is not None and token_team == "litellm-dashboard":
            # this token is only used for managing the ui
            allowed_routes = [
                "/sso",
                "/login",
                "/key",
                "/spend",
                "/user",
                "/model/info",
            ]
            # check if the current route startswith any of the allowed routes
            if (
                route is not None
                and isinstance(route, str)
                and any(
                    route.startswith(allowed_route) for allowed_route in allowed_routes
                )
            ):
                # Do something if the current route starts with any of the allowed routes
                pass
            else:
                raise Exception(
                    f"This key is made for LiteLLM UI, Tried to access route: {route}. Not allowed"
                )
        return UserAPIKeyAuth(api_key=api_key, **valid_token_dict)
    except Exception as e:
        # verbose_proxy_logger.debug(f"An exception occurred - {traceback.format_exc()}")
        traceback.print_exc()
@@ -1443,6 +1449,24 @@ class ProxyConfig:
                database_type == "dynamo_db" or database_type == "dynamodb"
            ):
                database_args = general_settings.get("database_args", None)
                ### LOAD FROM os.environ/ ###
                for k, v in database_args.items():
                    if isinstance(v, str) and v.startswith("os.environ/"):
                        database_args[k] = litellm.get_secret(v)
                    if isinstance(k, str) and k == "aws_web_identity_token":
                        value = database_args[k]
                        verbose_proxy_logger.debug(
                            f"Loading AWS Web Identity Token from file: {value}"
                        )
                        if os.path.exists(value):
                            with open(value, "r") as file:
                                token_content = file.read()
                                database_args[k] = token_content
                        else:
                            verbose_proxy_logger.info(
                                f"DynamoDB Loading - {value} is not a valid file path"
                            )
                verbose_proxy_logger.debug(f"database_args: {database_args}")
                custom_db_client = DBClient(
                    custom_db_args=database_args, custom_db_type=database_type
                )
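A hedged sketch of the `general_settings.database_args` dict this loader walks. `aws_web_identity_token` is the key the new branch special-cases; every other key and value here is an illustrative placeholder:

```python
database_args = {
    "ssl_verify": True,  # placeholder DynamoDB client arg
    "region_name": "os.environ/AWS_REGION_NAME",  # resolved via litellm.get_secret()
    # if this value is a readable file path, its contents are substituted in:
    "aws_web_identity_token": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token",
}
```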
@@ -1580,8 +1604,6 @@ async def generate_key_helper_fn(
    tpm_limit = tpm_limit
    rpm_limit = rpm_limit
    allowed_cache_controls = allowed_cache_controls
    if type(team_id) is not str:
        team_id = str(team_id)
    try:
        # Create a new verification token (you may want to enhance this logic based on your needs)
        user_data = {
@@ -2057,14 +2079,6 @@ def model_list(
    if user_model is not None:
        all_models += [user_model]
    verbose_proxy_logger.debug(f"all_models: {all_models}")
    ### CHECK OLLAMA MODELS ###
    try:
        response = requests.get("http://0.0.0.0:11434/api/tags")
        models = response.json()["models"]
        ollama_models = ["ollama/" + m["name"].replace(":latest", "") for m in models]
        all_models.extend(ollama_models)
    except Exception as e:
        pass
    return dict(
        data=[
            {
@@ -2355,8 +2369,13 @@ async def chat_completion(
            llm_router is not None and data["model"] in llm_router.deployment_names
        ):  # model in router deployments, calling a specific deployment on the router
            response = await llm_router.acompletion(**data, specific_deployment=True)
        else:  # router is not set
        elif user_model is not None:  # `litellm --model <your-model-name>`
            response = await litellm.acompletion(**data)
        else:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail={"error": "Invalid model name passed in"},
            )

        # Post Call Processing
        data["litellm_status"] = "success"  # used for alerting
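The routing order this hunk completes, distilled into a standalone sketch (attribute names like `model_names` follow the Router API; simplified, the real code also handles `model_group_alias` and user-supplied api keys):

```python
import litellm

async def route_request(data: dict, llm_router, user_model):
    if llm_router is not None and data["model"] in llm_router.model_names:
        return await llm_router.acompletion(**data)  # router model group
    if llm_router is not None and data["model"] in llm_router.deployment_names:
        return await llm_router.acompletion(**data, specific_deployment=True)
    if user_model is not None:  # started via `litellm --model <your-model-name>`
        return await litellm.acompletion(**data)
    raise ValueError("Invalid model name passed in")  # nothing matched
```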
@@ -2387,6 +2406,11 @@ async def chat_completion(
            )

            fastapi_response.headers["x-litellm-model-id"] = model_id

        ### CALL HOOKS ### - modify outgoing data
        response = await proxy_logging_obj.post_call_success_hook(
            user_api_key_dict=user_api_key_dict, response=response
        )
        return response
    except Exception as e:
        traceback.print_exc()
@@ -2417,7 +2441,12 @@ async def chat_completion(
        traceback.print_exc()

        if isinstance(e, HTTPException):
            raise e
            raise ProxyException(
                message=getattr(e, "detail", str(e)),
                type=getattr(e, "type", "None"),
                param=getattr(e, "param", "None"),
                code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
            )
        else:
            error_traceback = traceback.format_exc()
            error_msg = f"{str(e)}\n\n{error_traceback}"
@@ -2567,8 +2596,13 @@ async def embeddings(
            llm_router is not None and data["model"] in llm_router.deployment_names
        ):  # model in router deployments, calling a specific deployment on the router
            response = await llm_router.aembedding(**data, specific_deployment=True)
        else:
        elif user_model is not None:  # `litellm --model <your-model-name>`
            response = await litellm.aembedding(**data)
        else:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail={"error": "Invalid model name passed in"},
            )

        ### ALERTING ###
        data["litellm_status"] = "success"  # used for alerting
@@ -2586,7 +2620,12 @@ async def embeddings(
        )
        traceback.print_exc()
        if isinstance(e, HTTPException):
            raise e
            raise ProxyException(
                message=getattr(e, "message", str(e)),
                type=getattr(e, "type", "None"),
                param=getattr(e, "param", "None"),
                code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
            )
        else:
            error_traceback = traceback.format_exc()
            error_msg = f"{str(e)}\n\n{error_traceback}"
@@ -2702,8 +2741,13 @@ async def image_generation(
            response = await llm_router.aimage_generation(
                **data
            )  # ensure this goes through the llm_router; router will do the correct alias mapping
        else:
        elif user_model is not None:  # `litellm --model <your-model-name>`
            response = await litellm.aimage_generation(**data)
        else:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail={"error": "Invalid model name passed in"},
            )

        ### ALERTING ###
        data["litellm_status"] = "success"  # used for alerting
@@ -2721,7 +2765,165 @@ async def image_generation(
        )
        traceback.print_exc()
        if isinstance(e, HTTPException):
            raise e
            raise ProxyException(
                message=getattr(e, "message", str(e)),
                type=getattr(e, "type", "None"),
                param=getattr(e, "param", "None"),
                code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
            )
        else:
            error_traceback = traceback.format_exc()
            error_msg = f"{str(e)}\n\n{error_traceback}"
            raise ProxyException(
                message=getattr(e, "message", error_msg),
                type=getattr(e, "type", "None"),
                param=getattr(e, "param", "None"),
                code=getattr(e, "status_code", 500),
            )


@router.post(
    "/v1/moderations",
    dependencies=[Depends(user_api_key_auth)],
    response_class=ORJSONResponse,
    tags=["moderations"],
)
@router.post(
    "/moderations",
    dependencies=[Depends(user_api_key_auth)],
    response_class=ORJSONResponse,
    tags=["moderations"],
)
async def moderations(
    request: Request,
    user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
):
    """
    The moderations endpoint is a tool you can use to check whether content complies with an LLM provider's policies.

    Quick Start
    ```
    curl --location 'http://0.0.0.0:4000/moderations' \
        --header 'Content-Type: application/json' \
        --header 'Authorization: Bearer sk-1234' \
        --data '{"input": "Sample text goes here", "model": "text-moderation-stable"}'
    ```
    """
    global proxy_logging_obj
    try:
        # Use orjson to parse JSON data, orjson speeds up requests significantly
        body = await request.body()
        data = orjson.loads(body)

        # Include original request and headers in the data
        data["proxy_server_request"] = {
            "url": str(request.url),
            "method": request.method,
            "headers": dict(request.headers),
            "body": copy.copy(data),  # use copy instead of deepcopy
        }

        if data.get("user", None) is None and user_api_key_dict.user_id is not None:
            data["user"] = user_api_key_dict.user_id

        data["model"] = (
            general_settings.get("moderation_model", None)  # server default
            or user_model  # model name passed via cli args
            or data["model"]  # default passed in http request
        )
        if user_model:
            data["model"] = user_model

        if "metadata" not in data:
            data["metadata"] = {}
        data["metadata"]["user_api_key"] = user_api_key_dict.api_key
        data["metadata"]["user_api_key_metadata"] = user_api_key_dict.metadata
        _headers = dict(request.headers)
        _headers.pop(
            "authorization", None
        )  # do not store the original `sk-..` api key in the db
        data["metadata"]["headers"] = _headers
        data["metadata"]["user_api_key_user_id"] = user_api_key_dict.user_id
        data["metadata"]["endpoint"] = str(request.url)

        ### TEAM-SPECIFIC PARAMS ###
        if user_api_key_dict.team_id is not None:
            team_config = await proxy_config.load_team_config(
                team_id=user_api_key_dict.team_id
            )
            if len(team_config) == 0:
                pass
            else:
                team_id = team_config.pop("team_id", None)
                data["metadata"]["team_id"] = team_id
                data = {
                    **team_config,
                    **data,
                }  # add the team-specific configs to the completion call

        router_model_names = (
            [m["model_name"] for m in llm_model_list]
            if llm_model_list is not None
            else []
        )

        ### CALL HOOKS ### - modify incoming data / reject request before calling the model
        data = await proxy_logging_obj.pre_call_hook(
            user_api_key_dict=user_api_key_dict, data=data, call_type="moderation"
        )

        start_time = time.time()

        ## ROUTE TO CORRECT ENDPOINT ##
        # skip router if user passed their key
        if "api_key" in data:
            response = await litellm.amoderation(**data)
        elif (
            llm_router is not None and data["model"] in router_model_names
        ):  # model in router model list
            response = await llm_router.amoderation(**data)
        elif (
            llm_router is not None and data["model"] in llm_router.deployment_names
        ):  # model in router deployments, calling a specific deployment on the router
            response = await llm_router.amoderation(**data, specific_deployment=True)
        elif (
            llm_router is not None
            and llm_router.model_group_alias is not None
            and data["model"] in llm_router.model_group_alias
        ):  # model set in model_group_alias
            response = await llm_router.amoderation(
                **data
            )  # ensure this goes through the llm_router; router will do the correct alias mapping
        elif user_model is not None:  # `litellm --model <your-model-name>`
            response = await litellm.amoderation(**data)
        else:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail={"error": "Invalid model name passed in"},
            )

        ### ALERTING ###
        data["litellm_status"] = "success"  # used for alerting
        end_time = time.time()
        asyncio.create_task(
            proxy_logging_obj.response_taking_too_long(
                start_time=start_time, end_time=end_time, type="slow_response"
            )
        )

        return response
    except Exception as e:
        await proxy_logging_obj.post_call_failure_hook(
            user_api_key_dict=user_api_key_dict, original_exception=e
        )
        traceback.print_exc()
        if isinstance(e, HTTPException):
            raise ProxyException(
                message=getattr(e, "message", str(e)),
                type=getattr(e, "type", "None"),
                param=getattr(e, "param", "None"),
                code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
            )
        else:
            error_traceback = traceback.format_exc()
            error_msg = f"{str(e)}\n\n{error_traceback}"
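The same quick-start call from the OpenAI Python SDK pointed at the proxy (a sketch; the key and base URL are placeholders for your deployment):

```python
from openai import OpenAI

client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
result = client.moderations.create(
    input="Sample text goes here", model="text-moderation-stable"
)
print(result.results[0].flagged)
```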
@@ -3516,6 +3718,7 @@ async def google_login(request: Request):
    """
    microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
    google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
    generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)

    # get url from request
    redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@@ -3574,6 +3777,69 @@ async def google_login(request: Request):
        )
        with microsoft_sso:
            return await microsoft_sso.get_login_redirect()
    elif generic_client_id is not None:
        from fastapi_sso.sso.generic import create_provider, DiscoveryDocument

        generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
        generic_authorization_endpoint = os.getenv(
            "GENERIC_AUTHORIZATION_ENDPOINT", None
        )
        generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
        generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
        if generic_client_secret is None:
            raise ProxyException(
                message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_CLIENT_SECRET",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )
        if generic_authorization_endpoint is None:
            raise ProxyException(
                message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_AUTHORIZATION_ENDPOINT",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )
        if generic_token_endpoint is None:
            raise ProxyException(
                message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_TOKEN_ENDPOINT",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )
        if generic_userinfo_endpoint is None:
            raise ProxyException(
                message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_USERINFO_ENDPOINT",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )

        verbose_proxy_logger.debug(
            f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
        )

        verbose_proxy_logger.debug(
            f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
        )

        discovery = DiscoveryDocument(
            authorization_endpoint=generic_authorization_endpoint,
            token_endpoint=generic_token_endpoint,
            userinfo_endpoint=generic_userinfo_endpoint,
        )

        SSOProvider = create_provider(name="oidc", discovery_document=discovery)
        generic_sso = SSOProvider(
            client_id=generic_client_id,
            client_secret=generic_client_secret,
            redirect_uri=redirect_url,
            allow_insecure_http=True,
        )

        with generic_sso:
            return await generic_sso.get_login_redirect()

    elif ui_username is not None:
        # No Google, Microsoft SSO
        # Use UI Credentials set in .env
@@ -3673,6 +3939,7 @@ async def auth_callback(request: Request):
    global general_settings
    microsoft_client_id = os.getenv("MICROSOFT_CLIENT_ID", None)
    google_client_id = os.getenv("GOOGLE_CLIENT_ID", None)
    generic_client_id = os.getenv("GENERIC_CLIENT_ID", None)

    # get url from request
    redirect_url = os.getenv("PROXY_BASE_URL", str(request.base_url))
@@ -3728,6 +3995,77 @@ async def auth_callback(request: Request):
            allow_insecure_http=True,
        )
        result = await microsoft_sso.verify_and_process(request)
    elif generic_client_id is not None:
        # make generic sso provider
        from fastapi_sso.sso.generic import create_provider, DiscoveryDocument

        generic_client_secret = os.getenv("GENERIC_CLIENT_SECRET", None)
        generic_authorization_endpoint = os.getenv(
            "GENERIC_AUTHORIZATION_ENDPOINT", None
        )
        generic_token_endpoint = os.getenv("GENERIC_TOKEN_ENDPOINT", None)
        generic_userinfo_endpoint = os.getenv("GENERIC_USERINFO_ENDPOINT", None)
        if generic_client_secret is None:
            raise ProxyException(
                message="GENERIC_CLIENT_SECRET not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_CLIENT_SECRET",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )
        if generic_authorization_endpoint is None:
            raise ProxyException(
                message="GENERIC_AUTHORIZATION_ENDPOINT not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_AUTHORIZATION_ENDPOINT",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )
        if generic_token_endpoint is None:
            raise ProxyException(
                message="GENERIC_TOKEN_ENDPOINT not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_TOKEN_ENDPOINT",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )
        if generic_userinfo_endpoint is None:
            raise ProxyException(
                message="GENERIC_USERINFO_ENDPOINT not set. Set it in .env file",
                type="auth_error",
                param="GENERIC_USERINFO_ENDPOINT",
                code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            )

        verbose_proxy_logger.debug(
            f"authorization_endpoint: {generic_authorization_endpoint}\ntoken_endpoint: {generic_token_endpoint}\nuserinfo_endpoint: {generic_userinfo_endpoint}"
        )

        verbose_proxy_logger.debug(
            f"GENERIC_REDIRECT_URI: {redirect_url}\nGENERIC_CLIENT_ID: {generic_client_id}\n"
        )

        discovery = DiscoveryDocument(
            authorization_endpoint=generic_authorization_endpoint,
            token_endpoint=generic_token_endpoint,
            userinfo_endpoint=generic_userinfo_endpoint,
        )

        SSOProvider = create_provider(name="oidc", discovery_document=discovery)
        generic_sso = SSOProvider(
            client_id=generic_client_id,
            client_secret=generic_client_secret,
            redirect_uri=redirect_url,
            allow_insecure_http=True,
        )
        verbose_proxy_logger.debug(f"calling generic_sso.verify_and_process")

        request_body = await request.body()

        request_query_params = request.query_params

        # get "code" from query params
        code = request_query_params.get("code")

        result = await generic_sso.verify_and_process(request)
        verbose_proxy_logger.debug(f"generic result: {result}")

    # User is authed in - generate key for the UI to access Proxy
    user_email = getattr(result, "email", None)
@@ -3936,7 +4274,6 @@ async def add_new_model(model_params: ModelParams):
    )


#### [BETA] - This is a beta endpoint, format might change based on user feedback https://github.com/BerriAI/litellm/issues/933. If you need a stable endpoint use /model/info
@router.get(
    "/model/info",
    description="Provides more info about each model in /models, including config.yaml descriptions (except api key and api base)",
@@ -3969,6 +4306,28 @@ async def model_info_v1(
        # read litellm model_prices_and_context_window.json to get the following:
        # input_cost_per_token, output_cost_per_token, max_tokens
        litellm_model_info = get_litellm_model_info(model=model)

        # 2nd pass on the model, try seeing if we can find model in litellm model_cost map
        if litellm_model_info == {}:
            # use litellm_param model_name to get model_info
            litellm_params = model.get("litellm_params", {})
            litellm_model = litellm_params.get("model", None)
            try:
                litellm_model_info = litellm.get_model_info(model=litellm_model)
            except:
                litellm_model_info = {}
        # 3rd pass on the model, try seeing if we can find model but without the "/" in model cost map
        if litellm_model_info == {}:
            # use litellm_param model_name to get model_info
            litellm_params = model.get("litellm_params", {})
            litellm_model = litellm_params.get("model", None)
            split_model = litellm_model.split("/")
            if len(split_model) > 0:
                litellm_model = split_model[-1]
            try:
                litellm_model_info = litellm.get_model_info(model=litellm_model)
            except:
                litellm_model_info = {}
        for k, v in litellm_model_info.items():
            if k not in model_info:
                model_info[k] = v
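The three-pass lookup above, distilled into one function (a sketch; `config_lookup` stands in for the proxy's `get_litellm_model_info` helper):

```python
import litellm

def resolve_model_info(model: dict, config_lookup=lambda **kw: {}) -> dict:
    # 1st pass: info attached in the proxy config
    info = config_lookup(model=model)
    litellm_model = model.get("litellm_params", {}).get("model", None)
    if info == {} and litellm_model is not None:
        # 2nd pass: exact name in litellm's model cost map
        try:
            info = litellm.get_model_info(model=litellm_model)
        except Exception:
            info = {}
    if info == {} and litellm_model is not None:
        # 3rd pass: retry without the "provider/" prefix
        try:
            info = litellm.get_model_info(model=litellm_model.split("/")[-1])
        except Exception:
            info = {}
    return info
```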
@@ -4,25 +4,29 @@ const openai = require('openai');
process.env.DEBUG=false;
async function runOpenAI() {
  const client = new openai.OpenAI({
    apiKey: 'sk-JkKeNi6WpWDngBsghJ6B9g',
    baseURL: 'http://0.0.0.0:8000'
    apiKey: 'sk-1234',
    baseURL: 'http://0.0.0.0:4000'
  });

  try {
    const response = await client.chat.completions.create({
      model: 'sagemaker',
      model: 'anthropic-claude-v2.1',
      stream: true,
      max_tokens: 1000,
      messages: [
        {
          role: 'user',
          content: 'write a 20 pg essay about YC ',
          content: 'write a 20 pg essay about YC '.repeat(6000),
        },
      ],
    });

    console.log(response);
    let original = '';
    for await (const chunk of response) {
      original += chunk.choices[0].delta.content;
      console.log(original);
      console.log(chunk);
      console.log(chunk.choices[0].delta.content);
    }
@@ -11,6 +11,7 @@ from litellm.caching import DualCache
from litellm.proxy.hooks.parallel_request_limiter import (
    _PROXY_MaxParallelRequestsHandler,
)
from litellm import ModelResponse, EmbeddingResponse, ImageResponse
from litellm.proxy.hooks.max_budget_limiter import _PROXY_MaxBudgetLimiter
from litellm.proxy.hooks.cache_control_check import _PROXY_CacheControlCheck
from litellm.integrations.custom_logger import CustomLogger
@@ -92,7 +93,9 @@ class ProxyLogging:
        self,
        user_api_key_dict: UserAPIKeyAuth,
        data: dict,
        call_type: Literal["completion", "embeddings", "image_generation"],
        call_type: Literal[
            "completion", "embeddings", "image_generation", "moderation"
        ],
    ):
        """
        Allows users to modify/reject the incoming request to the proxy, without having to deal with parsing Request body.
@@ -377,6 +380,28 @@ class ProxyLogging:
                raise e
        return

    async def post_call_success_hook(
        self,
        response: Union[ModelResponse, EmbeddingResponse, ImageResponse],
        user_api_key_dict: UserAPIKeyAuth,
    ):
        """
        Allow user to modify outgoing data

        Covers:
        1. /chat/completions
        """
        new_response = copy.deepcopy(response)
        for callback in litellm.callbacks:
            try:
                if isinstance(callback, CustomLogger):
                    await callback.async_post_call_success_hook(
                        user_api_key_dict=user_api_key_dict, response=new_response
                    )
            except Exception as e:
                raise e
        return new_response


### DB CONNECTOR ###
# Define the retry decorator with backoff strategy
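To use the new hook, a callback only needs to subclass `CustomLogger` and implement `async_post_call_success_hook`; a hedged sketch that redacts outgoing content (the class name and redaction rule are illustrative):

```python
from litellm.integrations.custom_logger import CustomLogger

class RedactionHandler(CustomLogger):
    async def async_post_call_success_hook(self, user_api_key_dict, response):
        # mutate the (deep-copied) response in place before the proxy returns it
        for choice in getattr(response, "choices", []):
            if choice.message.content is not None:
                choice.message.content = choice.message.content.replace(
                    "secret", "[REDACTED]"
                )
        return response

# registered like any other callback, e.g. litellm.callbacks = [RedactionHandler()]
```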
@@ -599,6 +599,98 @@ class Router:
            self.fail_calls[model_name] += 1
            raise e

    async def amoderation(self, model: str, input: str, **kwargs):
        try:
            kwargs["model"] = model
            kwargs["input"] = input
            kwargs["original_function"] = self._amoderation
            kwargs["num_retries"] = kwargs.get("num_retries", self.num_retries)
            timeout = kwargs.get("request_timeout", self.timeout)
            kwargs.setdefault("metadata", {}).update({"model_group": model})

            response = await self.async_function_with_fallbacks(**kwargs)

            return response
        except Exception as e:
            raise e

    async def _amoderation(self, model: str, input: str, **kwargs):
        model_name = None
        try:
            verbose_router_logger.debug(
                f"Inside _moderation()- model: {model}; kwargs: {kwargs}"
            )
            deployment = self.get_available_deployment(
                model=model,
                input=input,
                specific_deployment=kwargs.pop("specific_deployment", None),
            )
            kwargs.setdefault("metadata", {}).update(
                {
                    "deployment": deployment["litellm_params"]["model"],
                    "model_info": deployment.get("model_info", {}),
                }
            )
            kwargs["model_info"] = deployment.get("model_info", {})
            data = deployment["litellm_params"].copy()
            model_name = data["model"]
            for k, v in self.default_litellm_params.items():
                if (
                    k not in kwargs and v is not None
                ):  # prioritize model-specific params > default router params
                    kwargs[k] = v
                elif k == "metadata":
                    kwargs[k].update(v)

            potential_model_client = self._get_client(
                deployment=deployment, kwargs=kwargs, client_type="async"
            )
            # check if provided keys == client keys #
            dynamic_api_key = kwargs.get("api_key", None)
            if (
                dynamic_api_key is not None
                and potential_model_client is not None
                and dynamic_api_key != potential_model_client.api_key
            ):
                model_client = None
            else:
                model_client = potential_model_client
            self.total_calls[model_name] += 1

            timeout = (
                data.get(
                    "timeout", None
                )  # timeout set on litellm_params for this deployment
                or self.timeout  # timeout set on router
                or kwargs.get(
                    "timeout", None
                )  # this uses default_litellm_params when nothing is set
            )

            response = await litellm.amoderation(
                **{
                    **data,
                    "input": input,
                    "caching": self.cache_responses,
                    "client": model_client,
                    "timeout": timeout,
                    **kwargs,
                }
            )

            self.success_calls[model_name] += 1
            verbose_router_logger.info(
                f"litellm.amoderation(model={model_name})\033[32m 200 OK\033[0m"
            )
            return response
        except Exception as e:
            verbose_router_logger.info(
                f"litellm.amoderation(model={model_name})\033[31m Exception {str(e)}\033[0m"
            )
            if model_name is not None:
                self.fail_calls[model_name] += 1
            raise e

    def text_completion(
        self,
        model: str,
@@ -86,7 +86,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
                    if isinstance(response_obj, ModelResponse):
                        completion_tokens = response_obj.usage.completion_tokens
                        total_tokens = response_obj.usage.total_tokens
                        final_value = float(completion_tokens / response_ms.total_seconds())
                        final_value = float(response_ms.total_seconds() / completion_tokens)

                    # ------------
                    # Update usage
@@ -168,7 +168,7 @@ class LowestLatencyLoggingHandler(CustomLogger):
                    if isinstance(response_obj, ModelResponse):
                        completion_tokens = response_obj.usage.completion_tokens
                        total_tokens = response_obj.usage.total_tokens
                        final_value = float(completion_tokens / response_ms.total_seconds())
                        final_value = float(response_ms.total_seconds() / completion_tokens)

                    # ------------
                    # Update usage
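This fix inverts the latency metric from tokens-per-second to seconds-per-token, so that "lowest value wins" actually selects the fastest deployment. A worked example:

```python
# With tokens/second, deployment A scores 100/2.0 = 50 and B scores 100/4.0 = 25;
# a "pick the lowest value" selector would wrongly prefer B. With seconds/token
# the scores are 0.02 and 0.04, and the minimum is A, the faster deployment.
latency_s = {"A": 2.0, "B": 4.0}  # time each deployment took to respond
completion_tokens = 100

per_token = {k: v / completion_tokens for k, v in latency_s.items()}
assert min(per_token, key=per_token.get) == "A"  # fastest deployment wins
```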
@@ -123,6 +123,10 @@ def test_vertex_ai():
        print(response)
        assert type(response.choices[0].message.content) == str
        assert len(response.choices[0].message.content) > 1
        print(
            f"response.choices[0].finish_reason: {response.choices[0].finish_reason}"
        )
        assert response.choices[0].finish_reason in litellm._openai_finish_reasons
    except Exception as e:
        pytest.fail(f"Error occurred: {e}")
@@ -71,7 +71,7 @@ def test_completion_claude():
            messages=messages,
            request_timeout=10,
        )
        # Add any assertions here to check the response
        # Add any assertions here to check response args
        print(response)
        print(response.usage)
        print(response.usage.completion_tokens)
@@ -1545,9 +1545,9 @@ def test_completion_bedrock_titan_null_response():
            ],
        )
        # Add any assertions here to check the response
        pytest.fail(f"Expected to fail")
        print(f"response: {response}")
    except Exception as e:
        pass
        pytest.fail(f"An error occurred - {str(e)}")


def test_completion_bedrock_titan():
@@ -2093,10 +2093,6 @@ def test_completion_cloudflare():


def test_moderation():
    import openai

    openai.api_type = "azure"
    openai.api_version = "GM"
    response = litellm.moderation(input="i'm ishaan cto of litellm")
    print(response)
    output = response.results[0]
litellm/tests/test_presidio_masking.py (new file, 65 lines)
@@ -0,0 +1,65 @@
# What is this?
## Unit test for presidio pii masking
import sys, os, asyncio, time, random
from datetime import datetime
import traceback
from dotenv import load_dotenv

load_dotenv()
import os

sys.path.insert(
    0, os.path.abspath("../..")
)  # Adds the parent directory to the system path
import pytest
import litellm
from litellm.proxy.hooks.presidio_pii_masking import _OPTIONAL_PresidioPIIMasking
from litellm import Router, mock_completion
from litellm.proxy.utils import ProxyLogging
from litellm.proxy._types import UserAPIKeyAuth
from litellm.caching import DualCache


@pytest.mark.asyncio
async def test_output_parsing():
    """
    - have presidio pii masking - mask an input message
    - make llm completion call
    - have presidio pii masking - output parse message
    - assert that no masked tokens remain in the output message
    """
    litellm.output_parse_pii = True
    pii_masking = _OPTIONAL_PresidioPIIMasking(mock_testing=True)

    initial_message = [
        {
            "role": "user",
            "content": "hello world, my name is Jane Doe. My number is: 034453334",
        }
    ]

    filtered_message = [
        {
            "role": "user",
            "content": "hello world, my name is <PERSON>. My number is: <PHONE_NUMBER>",
        }
    ]

    pii_masking.pii_tokens = {"<PERSON>": "Jane Doe", "<PHONE_NUMBER>": "034453334"}

    response = mock_completion(
        model="gpt-3.5-turbo",
        messages=filtered_message,
        mock_response="Hello <PERSON>! How can I assist you today?",
    )
    new_response = await pii_masking.async_post_call_success_hook(
        user_api_key_dict=UserAPIKeyAuth(), response=response
    )

    assert (
        new_response.choices[0].message.content
        == "Hello Jane Doe! How can I assist you today?"
    )


# asyncio.run(test_output_parsing())
@@ -139,7 +139,7 @@ def test_exception_openai_bad_model(client):
            response=response
        )
        print("Type of exception=", type(openai_exception))
        assert isinstance(openai_exception, openai.NotFoundError)
        assert isinstance(openai_exception, openai.BadRequestError)

    except Exception as e:
        pytest.fail(f"LiteLLM Proxy test failed. Exception {str(e)}")
@@ -160,7 +160,6 @@ def test_chat_completion_exception_any_model(client):
        response = client.post("/chat/completions", json=test_data)

        json_response = response.json()
        print("keys in json response", json_response.keys())
        assert json_response.keys() == {"error"}

        # make an openai client to call _make_status_error_from_response
@@ -991,3 +991,23 @@ def test_router_timeout():
        print(e)
        print(vars(e))
        pass


@pytest.mark.asyncio
async def test_router_amoderation():
    model_list = [
        {
            "model_name": "openai-moderations",
            "litellm_params": {
                "model": "text-moderation-stable",
                "api_key": os.getenv("OPENAI_API_KEY", None),
            },
        }
    ]

    router = Router(model_list=model_list)
    result = await router.amoderation(
        model="openai-moderations", input="this is valid good text"
    )

    print("moderation result", result)
@@ -58,6 +58,18 @@ def my_post_call_rule(input: str):
    return {"decision": True}


def my_post_call_rule_2(input: str):
    input = input.lower()
    print(f"input: {input}")
    print(f"INSIDE MY POST CALL RULE, len(input) - {len(input)}")
    if len(input) < 200 and len(input) > 0:
        return {
            "decision": False,
            "message": "This violates LiteLLM Proxy Rules. Response too short",
        }
    return {"decision": True}


# test_pre_call_rule()
# Test 2: Post-call rule
# commenting out of ci/cd since llm's have variable output which was causing our pipeline to fail erratically.
@@ -94,3 +106,24 @@ def test_post_call_rule():

# test_post_call_rule()


def test_post_call_rule_streaming():
    try:
        litellm.pre_call_rules = []
        litellm.post_call_rules = [my_post_call_rule_2]
        ### completion
        response = completion(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "say sorry"}],
            max_tokens=2,
            stream=True,
        )
        for chunk in response:
            print(f"chunk: {chunk}")
        pytest.fail(f"Completion call should have failed.")
    except Exception as e:
        print("Got exception", e)
        print(type(e))
        print(vars(e))
        assert e.message == "This violates LiteLLM Proxy Rules. Response too short"
@@ -738,6 +738,8 @@ class CallTypes(Enum):
    text_completion = "text_completion"
    image_generation = "image_generation"
    aimage_generation = "aimage_generation"
    moderation = "moderation"
    amoderation = "amoderation"


# Logging function -> log the exact model details + what's being sent | Non-Blocking
@@ -2100,6 +2102,11 @@ def client(original_function):
                or call_type == CallTypes.aimage_generation.value
            ):
                messages = args[0] if len(args) > 0 else kwargs["prompt"]
            elif (
                call_type == CallTypes.moderation.value
                or call_type == CallTypes.amoderation.value
            ):
                messages = args[1] if len(args) > 1 else kwargs["input"]
            elif (
                call_type == CallTypes.atext_completion.value
                or call_type == CallTypes.text_completion.value
@@ -7692,6 +7699,7 @@ class CustomStreamWrapper:
        self.special_tokens = ["<|assistant|>", "<|system|>", "<|user|>", "<s>", "</s>"]
        self.holding_chunk = ""
        self.complete_response = ""
        self.response_uptil_now = ""
        _model_info = (
            self.logging_obj.model_call_details.get("litellm_params", {}).get(
                "model_info", {}
@@ -7703,6 +7711,7 @@ class CustomStreamWrapper:
        }  # returned as x-litellm-model-id response header in proxy
        self.response_id = None
        self.logging_loop = None
        self.rules = Rules()

    def __iter__(self):
        return self
@@ -8659,7 +8668,7 @@
                chunk = next(self.completion_stream)
                if chunk is not None and chunk != b"":
                    print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
                    response = self.chunk_creator(chunk=chunk)
                    response: Optional[ModelResponse] = self.chunk_creator(chunk=chunk)
                    print_verbose(f"PROCESSED CHUNK POST CHUNK CREATOR: {response}")
                    if response is None:
                        continue
@@ -8667,7 +8676,12 @@
                    threading.Thread(
                        target=self.run_success_logging_in_thread, args=(response,)
                    ).start()  # log response

                    self.response_uptil_now += (
                        response.choices[0].delta.get("content", "") or ""
                    )
                    self.rules.post_call_rules(
                        input=self.response_uptil_now, model=self.model
                    )
                    # RETURN RESULT
                    return response
            except StopIteration:
@@ -8705,7 +8719,9 @@
                    # chunk_creator() does logging/stream chunk building. We need to let it know it's being called in_async_func, so we don't double add chunks.
                    # __anext__ also calls async_success_handler, which does logging
                    print_verbose(f"PROCESSED ASYNC CHUNK PRE CHUNK CREATOR: {chunk}")
                    processed_chunk = self.chunk_creator(chunk=chunk)
                    processed_chunk: Optional[ModelResponse] = self.chunk_creator(
                        chunk=chunk
                    )
                    print_verbose(
                        f"PROCESSED ASYNC CHUNK POST CHUNK CREATOR: {processed_chunk}"
                    )
@@ -8720,6 +8736,12 @@
                            processed_chunk,
                        )
                    )
                    self.response_uptil_now += (
                        processed_chunk.choices[0].delta.get("content", "") or ""
                    )
                    self.rules.post_call_rules(
                        input=self.response_uptil_now, model=self.model
                    )
                    return processed_chunk
                raise StopAsyncIteration
            else:  # temporary patch for non-aiohttp async calls
@@ -8733,7 +8755,9 @@
                chunk = next(self.completion_stream)
                if chunk is not None and chunk != b"":
                    print_verbose(f"PROCESSED CHUNK PRE CHUNK CREATOR: {chunk}")
                    processed_chunk = self.chunk_creator(chunk=chunk)
                    processed_chunk: Optional[ModelResponse] = self.chunk_creator(
                        chunk=chunk
                    )
                    print_verbose(
                        f"PROCESSED CHUNK POST CHUNK CREATOR: {processed_chunk}"
                    )
@@ -8750,6 +8774,12 @@
                        )
                    )

                    self.response_uptil_now += (
                        processed_chunk.choices[0].delta.get("content", "") or ""
                    )
                    self.rules.post_call_rules(
                        input=self.response_uptil_now, model=self.model
                    )
                    # RETURN RESULT
                    return processed_chunk
            except StopAsyncIteration:
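Because `post_call_rules` now runs on the cumulative response text after every chunk, a rule can abort a stream mid-flight. A self-contained sketch of the pattern (names are illustrative):

```python
def check_stream(chunks, post_call_rules):
    response_uptil_now = ""
    for delta in chunks:  # delta = content of one stream chunk
        response_uptil_now += delta or ""
        for rule in post_call_rules:  # each rule: str -> {"decision": bool, ...}
            verdict = rule(response_uptil_now)
            if verdict["decision"] is False:
                raise RuntimeError(verdict.get("message", "rule violation"))
        yield delta

list(check_stream(["Hi", " there"], [lambda s: {"decision": True}]))  # passes
```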
@@ -198,6 +198,33 @@
    "litellm_provider": "openai",
    "mode": "embedding"
},
"text-moderation-stable": {
    "max_tokens": 32768,
    "max_input_tokens": 32768,
    "max_output_tokens": 0,
    "input_cost_per_token": 0.000000,
    "output_cost_per_token": 0.000000,
    "litellm_provider": "openai",
    "mode": "moderations"
},
"text-moderation-007": {
    "max_tokens": 32768,
    "max_input_tokens": 32768,
    "max_output_tokens": 0,
    "input_cost_per_token": 0.000000,
    "output_cost_per_token": 0.000000,
    "litellm_provider": "openai",
    "mode": "moderations"
},
"text-moderation-latest": {
    "max_tokens": 32768,
    "max_input_tokens": 32768,
    "max_output_tokens": 0,
    "input_cost_per_token": 0.000000,
    "output_cost_per_token": 0.000000,
    "litellm_provider": "openai",
    "mode": "moderations"
},
"256-x-256/dall-e-2": {
    "mode": "image_generation",
    "input_cost_per_pixel": 0.00000024414,
@@ -1,6 +1,6 @@
[tool.poetry]
name = "litellm"
version = "1.23.10"
version = "1.23.16"
description = "Library to easily interface with LLM API providers"
authors = ["BerriAI"]
license = "MIT"
@@ -69,7 +69,7 @@ requires = ["poetry-core", "wheel"]
build-backend = "poetry.core.masonry.api"

[tool.commitizen]
version = "1.23.10"
version = "1.23.16"
version_files = [
    "pyproject.toml:^version"
]
@@ -27,7 +27,7 @@ tiktoken>=0.4.0 # for calculating usage
importlib-metadata>=6.8.0 # for random utils
tokenizers==0.14.0 # for calculating usage
click==8.1.7 # for proxy cli
jinja2==3.1.2 # for prompt templates
jinja2==3.1.3 # for prompt templates
certifi>=2023.7.22 # [TODO] clean up
aiohttp==3.9.0 # for network calls
aioboto3==12.3.0 # for async sagemaker calls
@@ -88,6 +88,22 @@ async def test_chat_completion():
        await chat_completion(session=session, key=key_2)


@pytest.mark.asyncio
async def test_chat_completion_old_key():
    """
    Production test for backwards compatibility. Tests the db against a pre-generated key (old key).
    - Create key
    - Make chat completion call
    """
    async with aiohttp.ClientSession() as session:
        try:
            key = "sk-yNXvlRO4SxIGG0XnRMYxTw"
            await chat_completion(session=session, key=key)
        except Exception as e:
            key = "sk-2KV0sAElLQqMpLZXdNf3yw"  # try diff db key (in case db url is for the other db)
            await chat_completion(session=session, key=key)


async def completion(session, key):
    url = "http://0.0.0.0:4000/completions"
    headers = {
(Remaining file diffs: regenerated Next.js admin-UI build artifacts for the LiteLLM Proxy Admin UI - minified webpack chunks, _buildManifest.js, _ssgManifest.js, and the static index.html. Several of these diffs were suppressed as too long; the minified contents are omitted here.)
|
|
@ -1,7 +1,7 @@
|
|||
2:I[77831,[],""]
|
||||
3:I[75985,["838","static/chunks/838-7fa0bab5a1c3631d.js","931","static/chunks/app/page-5a7453e3903c5d60.js"],""]
|
||||
3:I[48016,["145","static/chunks/145-9c160ad5539e000f.js","931","static/chunks/app/page-fcb69349f15d154b.js"],""]
|
||||
4:I[5613,[],""]
|
||||
5:I[31778,[],""]
|
||||
0:["4mrMigZY9ob7yaIDjXpX6",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/654259bbf9e4c196.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
|
||||
0:["lLFQRQnIrRo-GJf5spHEd",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_c23dc8","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/c18941d97fb7245b.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
|
||||
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"🚅 LiteLLM"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
|
||||
1:null
|
||||
|
|
1634  ui/litellm-dashboard/package-lock.json  generated
File diff suppressed because it is too large
@@ -19,14 +19,18 @@
     "jsonwebtoken": "^9.0.2",
     "jwt-decode": "^4.0.0",
     "next": "14.1.0",
     "openai": "^4.28.0",
     "react": "^18",
-    "react-dom": "^18"
+    "react-dom": "^18",
+    "react-markdown": "^9.0.1",
+    "react-syntax-highlighter": "^15.5.0"
   },
   "devDependencies": {
     "@tailwindcss/forms": "^0.5.7",
     "@types/node": "^20",
     "@types/react": "18.2.48",
     "@types/react-dom": "^18",
+    "@types/react-syntax-highlighter": "^15.5.11",
     "autoprefixer": "^10.4.17",
     "eslint": "^8",
     "eslint-config-next": "14.1.0",

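The two new runtime dependencies support the chat tab introduced in this commit: react-markdown renders model output, and react-syntax-highlighter colors fenced code. A minimal sketch of how the two are commonly wired together (the component name and wiring are illustrative, not taken from this commit):

import React from "react";
import ReactMarkdown from "react-markdown";
import { Prism as SyntaxHighlighter } from "react-syntax-highlighter";

// Render markdown, delegating fenced code blocks to the highlighter.
const MarkdownMessage: React.FC<{ content: string }> = ({ content }) => (
  <ReactMarkdown
    components={{
      code({ className, children }: any) {
        const match = /language-(\w+)/.exec(className || "");
        return match ? (
          <SyntaxHighlighter language={match[1]}>
            {String(children)}
          </SyntaxHighlighter>
        ) : (
          <code className={className}>{children}</code>
        );
      },
    }}
  >
    {content}
  </ReactMarkdown>
);
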
@@ -3,6 +3,8 @@ import React, { Suspense, useEffect, useState } from "react";
 import { useSearchParams } from "next/navigation";
 import Navbar from "../components/navbar";
 import UserDashboard from "../components/user_dashboard";
+import ModelDashboard from "@/components/model_dashboard";
+import ChatUI from "@/components/chat_ui";
 import Sidebar from "../components/leftnav";
 import Usage from "../components/usage";
 import { jwtDecode } from "jwt-decode";

@@ -80,7 +82,22 @@ const CreateKeyPage = () => {
           userEmail={userEmail}
           setUserEmail={setUserEmail}
         />
-      ) : (
+      ) : page == "models" ? (
+        <ModelDashboard
+          userID={userID}
+          userRole={userRole}
+          token={token}
+          accessToken={accessToken}
+        />
+      ) : page == "llm-playground" ? (
+        <ChatUI
+          userID={userID}
+          userRole={userRole}
+          token={token}
+          accessToken={accessToken}
+        />
+      ) : (
         <Usage
           userID={userID}
           userRole={userRole}

301  ui/litellm-dashboard/src/components/chat_ui.tsx  Normal file

@@ -0,0 +1,301 @@
import React, { useState, useEffect } from "react";
import ReactMarkdown from "react-markdown";
import {
  Card, Title, Table, TableHead, TableRow, TableCell, TableBody,
  Grid, Tab, TabGroup, TabList, TabPanel, TabPanels, Metric,
  Select, SelectItem,
} from "@tremor/react";
import { modelInfoCall } from "./networking";
import openai from "openai";
import { Prism as SyntaxHighlighter } from 'react-syntax-highlighter';

interface ChatUIProps {
  accessToken: string | null;
  token: string | null;
  userRole: string | null;
  userID: string | null;
}

async function generateModelResponse(inputMessage: string, updateUI: (chunk: string) => void, selectedModel: string, accessToken: string) {
  const client = new openai.OpenAI({
    apiKey: accessToken, // the user's temporary litellm proxy key
    baseURL: 'http://0.0.0.0:4000', // litellm proxy base URL
    dangerouslyAllowBrowser: true, // acceptable here: a temporary proxy key, not a provider secret
  });

  const response = await client.chat.completions.create({
    model: selectedModel,
    stream: true,
    messages: [
      {
        role: 'user',
        content: inputMessage,
      },
    ],
  });

  for await (const chunk of response) {
    console.log(chunk);
    if (chunk.choices[0].delta.content) {
      updateUI(chunk.choices[0].delta.content);
    }
  }
}

const ChatUI: React.FC<ChatUIProps> = ({ accessToken, token, userRole, userID }) => {
  const [inputMessage, setInputMessage] = useState("");
  const [chatHistory, setChatHistory] = useState<any[]>([]);
  const [selectedModel, setSelectedModel] = useState<string | undefined>(undefined);
  const [modelInfo, setModelInfo] = useState<any | null>(null); // Declare modelInfo at the component level

  useEffect(() => {
    if (!accessToken || !token || !userRole || !userID) {
      return;
    }
    // Fetch model info and set the default selected model
    const fetchModelInfo = async () => {
      const fetchedModelInfo = await modelInfoCall(accessToken, userID, userRole);
      console.log("model_info:", fetchedModelInfo);

      if (fetchedModelInfo?.data.length > 0) {
        setModelInfo(fetchedModelInfo);
        setSelectedModel(fetchedModelInfo.data[0].model_name);
      }
    };

    fetchModelInfo();
  }, [accessToken, userID, userRole]);

  const updateUI = (role: string, chunk: string) => {
    setChatHistory((prevHistory) => {
      const lastMessage = prevHistory[prevHistory.length - 1];

      if (lastMessage && lastMessage.role === role) {
        return [
          ...prevHistory.slice(0, prevHistory.length - 1),
          { role, content: lastMessage.content + chunk },
        ];
      } else {
        return [...prevHistory, { role, content: chunk }];
      }
    });
  };

  const handleSendMessage = async () => {
    if (inputMessage.trim() === "") return;

    if (!accessToken || !token || !userRole || !userID) {
      return;
    }

    setChatHistory((prevHistory) => [
      ...prevHistory,
      { role: "user", content: inputMessage },
    ]);

    try {
      if (selectedModel) {
        await generateModelResponse(inputMessage, (chunk) => updateUI("assistant", chunk), selectedModel, accessToken);
      }
    } catch (error) {
      console.error("Error fetching model response", error);
      updateUI("assistant", "Error fetching model response");
    }

    setInputMessage("");
  };

  return (
    <div style={{ width: "100%", position: "relative" }}>
      <Grid className="gap-2 p-10 h-[75vh] w-full">
        <Card>
          <TabGroup>
            <TabList className="mt-4">
              <Tab>Chat</Tab>
              <Tab>API Reference</Tab>
            </TabList>

            <TabPanels>
              <TabPanel>
                <div>
                  <label>Select Model:</label>
                  <select
                    value={selectedModel || ""}
                    onChange={(e) => setSelectedModel(e.target.value)}
                  >
                    {/* Populate dropdown options from available models */}
                    {modelInfo?.data.map((element: { model_name: string }) => (
                      <option key={element.model_name} value={element.model_name}>
                        {element.model_name}
                      </option>
                    ))}
                  </select>
                </div>
                <Table className="mt-5" style={{ display: "block", maxHeight: "60vh", overflowY: "auto" }}>
                  <TableHead>
                    <TableRow>
                      <TableCell>
                        <Title>Chat</Title>
                      </TableCell>
                    </TableRow>
                  </TableHead>
                  <TableBody>
                    {chatHistory.map((message, index) => (
                      <TableRow key={index}>
                        <TableCell>{`${message.role}: ${message.content}`}</TableCell>
                      </TableRow>
                    ))}
                  </TableBody>
                </Table>
                <div className="mt-3" style={{ position: "absolute", bottom: 5, width: "95%" }}>
                  <div className="flex">
                    <input
                      type="text"
                      value={inputMessage}
                      onChange={(e) => setInputMessage(e.target.value)}
                      className="flex-1 p-2 border rounded-md mr-2"
                      placeholder="Type your message..."
                    />
                    <button onClick={handleSendMessage} className="p-2 bg-blue-500 text-white rounded-md">
                      Send
                    </button>
                  </div>
                </div>
              </TabPanel>
              <TabPanel>
                <TabGroup>
                  <TabList>
                    <Tab>OpenAI Python SDK</Tab>
                    <Tab>LlamaIndex</Tab>
                    <Tab>Langchain Py</Tab>
                  </TabList>
                  <TabPanels>
                    <TabPanel>
                      <SyntaxHighlighter language="python">
                        {`
import openai
client = openai.OpenAI(
    api_key="your_api_key",
    base_url="http://0.0.0.0:4000" # proxy base url
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo", # model to use from Models Tab
    messages = [
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "metadata": {
            "generation_name": "ishaan-generation-openai-client",
            "generation_id": "openai-client-gen-id22",
            "trace_id": "openai-client-trace-id22",
            "trace_user_id": "openai-client-user-id2"
        }
    }
)

print(response)
`}
                      </SyntaxHighlighter>
                    </TabPanel>
                    <TabPanel>
                      <SyntaxHighlighter language="python">
                        {`
import os, dotenv

from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext

llm = AzureOpenAI(
    engine="azure-gpt-3.5",               # model_name on litellm proxy
    temperature=0.0,
    azure_endpoint="http://0.0.0.0:4000", # litellm proxy endpoint
    api_key="sk-1234",                    # litellm proxy API Key
    api_version="2023-07-01-preview",
)

embed_model = AzureOpenAIEmbedding(
    deployment_name="azure-embedding-model",
    azure_endpoint="http://0.0.0.0:4000",
    api_key="sk-1234",
    api_version="2023-07-01-preview",
)

documents = SimpleDirectoryReader("llama_index_data").load_data()
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
`}
                      </SyntaxHighlighter>
                    </TabPanel>
                    <TabPanel>
                      <SyntaxHighlighter language="python">
                        {`
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:8000",
    model = "gpt-3.5-turbo",
    temperature=0.1,
    extra_body={
        "metadata": {
            "generation_name": "ishaan-generation-langchain-client",
            "generation_id": "langchain-client-gen-id22",
            "trace_id": "langchain-client-trace-id22",
            "trace_user_id": "langchain-client-user-id2"
        }
    }
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that im using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)

print(response)
`}
                      </SyntaxHighlighter>
                    </TabPanel>
                  </TabPanels>
                </TabGroup>
              </TabPanel>
            </TabPanels>
          </TabGroup>
        </Card>
      </Grid>
    </div>
  );
};

export default ChatUI;

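The streaming helper above (generateModelResponse) is the core of the playground: it opens an OpenAI-compatible stream against the LiteLLM proxy and forwards each content delta to a UI callback. A stripped-down sketch of the same pattern outside React, with placeholder URL, key, and model:

import OpenAI from "openai";

// Accumulate streamed deltas into a single string (all values are placeholders).
async function streamCompletion(prompt: string): Promise<string> {
  const client = new OpenAI({
    apiKey: "sk-1234",              // temporary LiteLLM proxy key (placeholder)
    baseURL: "http://0.0.0.0:4000", // LiteLLM proxy base URL (placeholder)
  });

  const stream = await client.chat.completions.create({
    model: "gpt-3.5-turbo",
    stream: true,
    messages: [{ role: "user", content: prompt }],
  });

  let text = "";
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? ""; // each chunk carries one small delta
  }
  return text;
}
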
@@ -14,6 +14,7 @@ interface CreateKeyProps {
   userRole: string | null;
   accessToken: string;
   data: any[] | null;
+  userModels: string[];
   setData: React.Dispatch<React.SetStateAction<any[] | null>>;
 }

@@ -22,6 +23,7 @@ const CreateKey: React.FC<CreateKeyProps> = ({
   userRole,
   accessToken,
   data,
+  userModels,
   setData,
 }) => {
   const [form] = Form.useForm();

@@ -42,20 +44,13 @@ const CreateKey: React.FC<CreateKeyProps> = ({
   const handleCreate = async (formValues: Record<string, any>) => {
     try {
       message.info("Making API Call");
-      // Check if "models" exists and is not an empty string
-      if (formValues.models && formValues.models.trim() !== '') {
-        // Format the "models" field as an array
-        formValues.models = formValues.models.split(',').map((model: string) => model.trim());
-      } else {
-        // If "models" is undefined or an empty string, set it to an empty array
-        formValues.models = [];
-      }
       setIsModalVisible(true);
       const response = await keyCreateCall(accessToken, userID, formValues);
       setData((prevData) => (prevData ? [...prevData, response] : [response])); // Check if prevData is null
       setApiKey(response["key"]);
       message.success("API Key Created");
       form.resetFields();
+      localStorage.removeItem("userData" + userID)
     } catch (error) {
       console.error("Error creating the key:", error);
     }

@@ -90,13 +85,22 @@ const CreateKey: React.FC<CreateKeyProps> = ({
         >
           <Input placeholder="ai_team" />
         </Form.Item>

         <Form.Item
-          label="Models (Comma Separated). Eg: gpt-3.5-turbo,gpt-4"
-          name="models"
-        >
-          <Input placeholder="gpt-4,gpt-3.5-turbo" />
-        </Form.Item>
+          label="Models"
+          name="models"
+        >
+          <Select
+            mode="multiple"
+            placeholder="Select models"
+            style={{ width: '100%' }}
+          >
+            {userModels.map((model) => (
+              <Option key={model} value={model}>
+                {model}
+              </Option>
+            ))}
+          </Select>
+        </Form.Item>

         <Form.Item

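Design note on the hunks above: because the antd <Select mode="multiple"> submits models as an array of strings (e.g. ["gpt-4", "gpt-3.5-turbo"]), the comma-splitting that handleCreate previously applied to the free-text input ("gpt-4,gpt-3.5-turbo") is no longer needed, so it is removed and the form value passes straight to keyCreateCall.
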
@@ -20,7 +20,13 @@ const Sidebar: React.FC<SidebarProps> = ({ setPage }) => {
       <Menu.Item key="1" onClick={() => setPage("api-keys")}>
         API Keys
       </Menu.Item>
-      <Menu.Item key="2" onClick={() => setPage("usage")}>
+      <Menu.Item key="2" onClick={() => setPage("models")}>
+        Models
+      </Menu.Item>
+      <Menu.Item key="3" onClick={() => setPage("llm-playground")}>
+        Chat UI
+      </Menu.Item>
+      <Menu.Item key="4" onClick={() => setPage("usage")}>
         Usage
       </Menu.Item>
     </Menu>

124  ui/litellm-dashboard/src/components/model_dashboard.tsx  Normal file

@@ -0,0 +1,124 @@
import React, { useState, useEffect } from "react";
import { Card, Title, Subtitle, Table, TableHead, TableRow, TableCell, TableBody, Metric, Grid } from "@tremor/react";
import { modelInfoCall } from "./networking";

interface ModelDashboardProps {
  accessToken: string | null;
  token: string | null;
  userRole: string | null;
  userID: string | null;
}

const ModelDashboard: React.FC<ModelDashboardProps> = ({
  accessToken,
  token,
  userRole,
  userID,
}) => {
  const [modelData, setModelData] = useState<any>({ data: [] });

  useEffect(() => {
    if (!accessToken || !token || !userRole || !userID) {
      return;
    }
    const fetchData = async () => {
      try {
        const modelDataResponse = await modelInfoCall(accessToken, userID, userRole);
        console.log("Model data response:", modelDataResponse.data);
        setModelData(modelDataResponse);
      } catch (error) {
        console.error("There was an error fetching the model data", error);
      }
    };

    if (accessToken && token && userRole && userID) {
      fetchData();
    }
  }, [accessToken, token, userRole, userID]);

  if (!modelData) {
    return <div>Loading...</div>;
  }

  // loop through model data and edit each row
  for (let i = 0; i < modelData.data.length; i++) {
    let curr_model = modelData.data[i];
    let litellm_model_name = curr_model?.litellm_params?.model;
    let model_info = curr_model?.model_info;

    let defaultProvider = "openai";
    let provider = "";
    let input_cost = "Undefined";
    let output_cost = "Undefined";
    let max_tokens = "Undefined";

    // Check if litellm_model_name is null or undefined
    if (litellm_model_name) {
      // Split litellm_model_name based on "/"
      let splitModel = litellm_model_name.split("/");

      // Get the first element in the split
      let firstElement = splitModel[0];

      // If there is only one element, default provider to openai
      provider = splitModel.length === 1 ? defaultProvider : firstElement;

      console.log("Provider:", provider);
    } else {
      // litellm_model_name is null or undefined, default provider to openai
      provider = defaultProvider;
      console.log("Provider:", provider);
    }

    if (model_info) {
      input_cost = model_info?.input_cost_per_token;
      output_cost = model_info?.output_cost_per_token;
      max_tokens = model_info?.max_tokens;
    }
    modelData.data[i].provider = provider;
    modelData.data[i].input_cost = input_cost;
    modelData.data[i].output_cost = output_cost;
    modelData.data[i].max_tokens = max_tokens;
  }

  return (
    <div style={{ width: "100%" }}>
      <Grid className="gap-2 p-10 h-[75vh] w-full">
        <Card>
          <Table className="mt-5">
            <TableHead>
              <TableRow>
                <TableCell><Title>Model Name</Title></TableCell>
                <TableCell><Title>Provider</Title></TableCell>
                <TableCell><Title>Input Price per token ($)</Title></TableCell>
                <TableCell><Title>Output Price per token ($)</Title></TableCell>
                <TableCell><Title>Max Tokens</Title></TableCell>
              </TableRow>
            </TableHead>
            <TableBody>
              {modelData.data.map((model: any) => (
                <TableRow key={model.model_name}>
                  <TableCell><Title>{model.model_name}</Title></TableCell>
                  <TableCell>{model.provider}</TableCell>
                  <TableCell>{model.input_cost}</TableCell>
                  <TableCell>{model.output_cost}</TableCell>
                  <TableCell>{model.max_tokens}</TableCell>
                </TableRow>
              ))}
            </TableBody>
          </Table>
        </Card>
      </Grid>
    </div>
  );
};

export default ModelDashboard;

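The provider column above is derived by splitting the LiteLLM model string on "/". Restated as a pure function (an illustrative restatement, not part of the commit):

// "azure/dall-e-3" -> "azure"; a bare name like "gpt-4" defaults to "openai".
function providerOf(litellmModelName?: string): string {
  if (!litellmModelName) return "openai";
  const parts = litellmModelName.split("/");
  return parts.length === 1 ? "openai" : parts[0];
}
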
@@ -137,6 +137,41 @@ export const userInfoCall = async (
   }
 };

+export const modelInfoCall = async (
+  accessToken: String,
+  userID: String,
+  userRole: String
+) => {
+  try {
+    let url = proxyBaseUrl ? `${proxyBaseUrl}/model/info` : `/model/info`;
+
+    message.info("Requesting model data");
+    const response = await fetch(url, {
+      method: "GET",
+      headers: {
+        Authorization: `Bearer ${accessToken}`,
+        "Content-Type": "application/json",
+      },
+    });
+
+    if (!response.ok) {
+      const errorData = await response.text();
+      message.error(errorData);
+      throw new Error("Network response was not ok");
+    }
+
+    const data = await response.json();
+    message.info("Received model data");
+    return data;
+  } catch (error) {
+    console.error("Failed to fetch model info:", error);
+    throw error;
+  }
+};
+
 export const keySpendLogsCall = async (accessToken: String, token: String) => {
   try {
     const url = proxyBaseUrl ? `${proxyBaseUrl}/spend/logs` : `/spend/logs`;

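modelInfoCall returns the proxy's /model/info payload, shaped as { data: [{ model_name, litellm_params, model_info, ... }] }. A hypothetical caller that extracts just the model names, mirroring what user_dashboard.tsx does with it below:

import { modelInfoCall } from "./networking";

// Collect the model names a user can select from.
async function listModelNames(
  accessToken: string,
  userID: string,
  userRole: string
): Promise<string[]> {
  const info = await modelInfoCall(accessToken, userID, userRole);
  return info.data.map((m: { model_name: string }) => m.model_name);
}
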
@@ -1,6 +1,6 @@
 "use client";
 import React, { useState, useEffect } from "react";
-import { userInfoCall } from "./networking";
+import { userInfoCall, modelInfoCall } from "./networking";
 import { Grid, Col, Card, Text } from "@tremor/react";
 import CreateKey from "./create_key_button";
 import ViewKeyTable from "./view_key_table";

@@ -47,6 +47,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
   const token = searchParams.get("token");
   const [accessToken, setAccessToken] = useState<string | null>(null);
+  const [userModels, setUserModels] = useState<string[]>([]);

   function formatUserRole(userRole: string) {
     if (!userRole) {

@@ -96,22 +97,39 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
     }
   }
   if (userID && accessToken && userRole && !data) {
-    const cachedData = localStorage.getItem("userData");
-    const cachedSpendData = localStorage.getItem("userSpendData");
-    if (cachedData && cachedSpendData) {
+    const cachedData = localStorage.getItem("userData" + userID);
+    const cachedSpendData = localStorage.getItem("userSpendData" + userID);
+    const cachedUserModels = localStorage.getItem("userModels" + userID);
+    if (cachedData && cachedSpendData && cachedUserModels) {
       setData(JSON.parse(cachedData));
       setUserSpendData(JSON.parse(cachedSpendData));
+      setUserModels(JSON.parse(cachedUserModels));
     } else {
       const fetchData = async () => {
         try {
           const response = await userInfoCall(accessToken, userID, userRole);
           setUserSpendData(response["user_info"]);
           setData(response["keys"]); // Assuming this is the correct path to your data
-          localStorage.setItem("userData", JSON.stringify(response["keys"]));
+          localStorage.setItem("userData" + userID, JSON.stringify(response["keys"]));
           localStorage.setItem(
-            "userSpendData",
+            "userSpendData" + userID,
             JSON.stringify(response["user_info"])
           );
+
+          const model_info = await modelInfoCall(accessToken, userID, userRole);
+          console.log("model_info:", model_info);
+          // loop through model_info["data"] and create an array of element.model_name
+          let available_model_names = model_info["data"].map((element: { model_name: string; }) => element.model_name);
+          console.log("available_model_names:", available_model_names);
+          setUserModels(available_model_names);
+
+          console.log("userModels:", userModels);
+
+          localStorage.setItem("userModels" + userID, JSON.stringify(available_model_names));
         } catch (error) {
           console.error("There was an error fetching the data", error);
           // Optionally, update your UI to reflect the error state here as well

@@ -158,6 +176,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
           <CreateKey
             userID={userID}
             userRole={userRole}
+            userModels={userModels}
             accessToken={accessToken}
             data={data}
             setData={setData}

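The caching hunk above switches every localStorage key from a global name ("userData") to a per-user name ("userData" + userID), so cached dashboard state does not bleed between accounts in the same browser. A small helper sketch of that convention (helper names are illustrative, not from this commit):

// Per-user cache key, e.g. cacheKey("userData", "u-123") -> "userDatau-123".
function cacheKey(base: string, userID: string): string {
  return base + userID;
}

function readCache<T>(base: string, userID: string): T | null {
  const raw = localStorage.getItem(cacheKey(base, userID));
  return raw ? (JSON.parse(raw) as T) : null;
}
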
|
@ -43,6 +43,7 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
|
|||
|
||||
// Set the key to delete and open the confirmation modal
|
||||
setKeyToDelete(token);
|
||||
localStorage.removeItem("userData" + userID)
|
||||
setIsDeleteModalOpen(true);
|
||||
};
|
||||
|
||||
|
|