Merge branch 'main' into litellm_gemini_context_caching

This commit is contained in:
Krish Dholakia 2024-08-26 22:22:17 -07:00 committed by GitHub
commit 08bd4788dc
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
78 changed files with 1284 additions and 354 deletions

View file

@@ -193,7 +193,7 @@ response2 = completion(
    ],
    max_tokens=20,
)
-print(f"response2: {response1}")
+print(f"response2: {response2}")
assert response1.id == response2.id
# response1 == response2, response 1 is cached
```

View file

@@ -14,7 +14,7 @@ https://github.com/BerriAI/litellm
## How to use LiteLLM
You can use litellm through either:
-1. [LiteLLM Proxy Server](#openai-proxy) - Server (LLM Gateway) to call 100+ LLMs, load balance, cost tracking across projects
+1. [LiteLLM Proxy Server](#litellm-proxy-server-llm-gateway) - Server (LLM Gateway) to call 100+ LLMs, load balance, cost tracking across projects
2. [LiteLLM python SDK](#basic-usage) - Python Client to call 100+ LLMs, load balance, cost tracking
### **When to use LiteLLM Proxy Server (LLM Gateway)**

View file

@@ -0,0 +1,89 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# LiteLLM Proxy (LLM Gateway)
:::tip
[LiteLLM provides a **self-hosted** proxy server (AI Gateway)](../simple_proxy) to call all the LLMs in the OpenAI format
:::
**[LiteLLM Proxy](../simple_proxy) is OpenAI-compatible**; you just need the `openai/` prefix before the model name
## Required Variables
```python
os.environ["OPENAI_API_KEY"] = "" # "sk-1234" your litellm proxy api key
os.environ["OPENAI_API_BASE"] = "" # "http://localhost:4000" your litellm proxy api base
```
## Usage (Non Streaming)
```python
import os
import litellm
from litellm import completion
os.environ["OPENAI_API_KEY"] = ""
# set custom api base to your proxy
# either set .env or litellm.api_base
# os.environ["OPENAI_API_BASE"] = ""
litellm.api_base = "your-openai-proxy-url"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(model="openai/your-model-name", messages=messages)
```
## Usage - passing `api_base`, `api_key` per request
If you need to set `api_base` dynamically, just pass it in the completion call instead: `completion(..., api_base="your-proxy-api-base")`
```python
import os
import litellm
from litellm import completion
os.environ["OPENAI_API_KEY"] = ""
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(
model="openai/your-model-name",
messages=messages,
api_base = "your-litellm-proxy-url",
api_key = "your-litellm-proxy-api-key"
)
```
## Usage - Streaming
```python
import os
import litellm
from litellm import completion
os.environ["OPENAI_API_KEY"] = ""
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(
model="openai/your-model-name",
messages=messages,
api_base = "your-litellm-proxy-url",
stream=True
)
for chunk in response:
print(chunk)
```
## **Usage with Langchain, LlamaIndex, OpenAI JS, Anthropic SDK, Instructor**
#### [Follow this doc to see how to use litellm proxy with langchain, llamaindex, anthropic etc](../proxy/user_keys)
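The clients in the linked doc all boil down to pointing an OpenAI-compatible SDK at the proxy. A minimal sketch with the official OpenAI Python SDK, assuming the proxy runs at `http://localhost:4000` with the key `sk-1234` and a `your-model-name` deployment configured (these values are placeholders, not part of this diff):
```python
from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy
client = OpenAI(api_key="sk-1234", base_url="http://localhost:4000")

response = client.chat.completions.create(
    model="your-model-name",  # model name as configured on the proxy
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```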

View file

@@ -1194,6 +1194,14 @@ response = completion(
|------------------|--------------------------------------|
| gemini-pro | `completion('gemini-pro', messages)`, `completion('vertex_ai/gemini-pro', messages)` |
+
+## Fine-tuned Models
+
+Fine-tuned models on Vertex AI have a numerical model/endpoint ID.
+
+| Model Name | Function Call |
+|------------------|--------------------------------------|
+| your fine-tuned model | `completion(model='vertex_ai/4965075652664360960', messages)` |
## Gemini Pro Vision
| Model Name | Function Call |
|------------------|--------------------------------------|
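Spelled out, the fine-tuned model call from the table above might look like the following minimal sketch. The numeric endpoint ID is the placeholder from the table; the `vertex_project` / `vertex_location` values are assumptions about your GCP setup, not part of this diff:
```python
import litellm
from litellm import completion

# Assumed GCP settings; replace with your own project and region
litellm.vertex_project = "my-gcp-project"
litellm.vertex_location = "us-central1"

response = completion(
    model="vertex_ai/4965075652664360960",  # numeric endpoint ID of the fine-tuned model
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```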

View file

@@ -71,7 +71,7 @@ litellm --config config.yaml --detailed_debug
## 4. Test request
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">

View file

@@ -40,7 +40,7 @@ litellm --config config.yaml --detailed_debug
### 3. Test request
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">

View file

@@ -202,7 +202,7 @@ litellm --config config.yaml --detailed_debug
#### Test `"custom-pre-guard"`
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Modify input" value = "not-allowed">
@@ -282,7 +282,7 @@ curl -i http://localhost:4000/v1/chat/completions \
#### Test `"custom-during-guard"`
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">
@@ -346,7 +346,7 @@ curl -i http://localhost:4000/v1/chat/completions \
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">

View file

@@ -46,7 +46,7 @@ litellm --config config.yaml --detailed_debug
### 3. Test request
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">

View file

@@ -48,7 +48,7 @@ litellm --config config.yaml --detailed_debug
## 3. Test request
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">

View file

@@ -810,6 +810,9 @@ print(result)
</TabItem>
</Tabs>
+
+## Using with Vertex, Boto3, Anthropic SDK (Native format)
+👉 **[Here's how to use litellm proxy with Vertex, boto3, Anthropic SDK - in the native format](../pass_through/vertex_ai.md)**
## Advanced

View file

@@ -72,7 +72,7 @@ litellm --config config.yaml --detailed_debug
## 4. Test request
-**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys##request-format)**
+**[Langchain, OpenAI SDK Usage Examples](../proxy/user_keys#request-format)**
<Tabs>
<TabItem label="Unsuccessful call" value = "not-allowed">

View file

@@ -128,6 +128,7 @@ const sidebars = {
"providers/anthropic",
"providers/aws_sagemaker",
"providers/bedrock",
+"providers/litellm_proxy",
"providers/mistral",
"providers/codestral",
"providers/cohere",

View file

@@ -838,7 +838,7 @@ from .llms.databricks import DatabricksConfig, DatabricksEmbeddingConfig
from .llms.predibase import PredibaseConfig
from .llms.anthropic_text import AnthropicTextConfig
from .llms.replicate import ReplicateConfig
-from .llms.cohere import CohereConfig
+from .llms.cohere.completion import CohereConfig
from .llms.clarifai import ClarifaiConfig
from .llms.ai21 import AI21Config
from .llms.together_ai import TogetherAIConfig

View file

@@ -10,7 +10,5 @@ def generic_chunk_has_all_required_fields(chunk: dict) -> bool:
    """
    _all_fields = GChunk.__annotations__
-    # this is an optional field in GenericStreamingChunk, it's not required to be present
-    _all_fields.pop("provider_specific_fields", None)
-    return all(key in chunk for key in _all_fields)
+    decision = all(key in _all_fields for key in chunk)
+    return decision

View file

@@ -13,7 +13,7 @@ import litellm
from litellm.types.llms.cohere import ToolResultObject
from litellm.utils import Choices, Message, ModelResponse, Usage
-from .prompt_templates.factory import cohere_message_pt, cohere_messages_pt_v2
+from ..prompt_templates.factory import cohere_message_pt, cohere_messages_pt_v2
class CohereError(Exception):

View file

@@ -1,6 +1,5 @@
-#################### OLD ########################
-##### See `cohere_chat.py` for `/chat` calls ####
-#################################################
+##### Calls /generate endpoint #######
import json
import os
import time
@@ -252,163 +251,3 @@ def completion(
    )
    setattr(model_response, "usage", usage)
    return model_response
def _process_embedding_response(
embeddings: list,
model_response: litellm.EmbeddingResponse,
model: str,
encoding: Any,
input: list,
) -> litellm.EmbeddingResponse:
output_data = []
for idx, embedding in enumerate(embeddings):
output_data.append(
{"object": "embedding", "index": idx, "embedding": embedding}
)
model_response.object = "list"
model_response.data = output_data
model_response.model = model
input_tokens = 0
for text in input:
input_tokens += len(encoding.encode(text))
setattr(
model_response,
"usage",
Usage(
prompt_tokens=input_tokens, completion_tokens=0, total_tokens=input_tokens
),
)
return model_response
async def async_embedding(
model: str,
data: dict,
input: list,
model_response: litellm.utils.EmbeddingResponse,
timeout: Union[float, httpx.Timeout],
logging_obj: LiteLLMLoggingObj,
optional_params: dict,
api_base: str,
api_key: Optional[str],
headers: dict,
encoding: Callable,
client: Optional[AsyncHTTPHandler] = None,
):
## LOGGING
logging_obj.pre_call(
input=input,
api_key=api_key,
additional_args={
"complete_input_dict": data,
"headers": headers,
"api_base": api_base,
},
)
## COMPLETION CALL
if client is None:
client = AsyncHTTPHandler(concurrent_limit=1)
response = await client.post(api_base, headers=headers, data=json.dumps(data))
## LOGGING
logging_obj.post_call(
input=input,
api_key=api_key,
additional_args={"complete_input_dict": data},
original_response=response,
)
embeddings = response.json()["embeddings"]
## PROCESS RESPONSE ##
return _process_embedding_response(
embeddings=embeddings,
model_response=model_response,
model=model,
encoding=encoding,
input=input,
)
def embedding(
model: str,
input: list,
model_response: litellm.EmbeddingResponse,
logging_obj: LiteLLMLoggingObj,
optional_params: dict,
headers: dict,
encoding: Any,
api_key: Optional[str] = None,
aembedding: Optional[bool] = None,
timeout: Union[float, httpx.Timeout] = httpx.Timeout(None),
client: Optional[Union[HTTPHandler, AsyncHTTPHandler]] = None,
):
headers = validate_environment(api_key, headers=headers)
embed_url = "https://api.cohere.ai/v1/embed"
model = model
data = {"model": model, "texts": input, **optional_params}
if "3" in model and "input_type" not in data:
# cohere v3 embedding models require input_type, if no input_type is provided, default to "search_document"
data["input_type"] = "search_document"
## LOGGING
logging_obj.pre_call(
input=input,
api_key=api_key,
additional_args={"complete_input_dict": data},
)
## ROUTING
if aembedding is True:
return async_embedding(
model=model,
data=data,
input=input,
model_response=model_response,
timeout=timeout,
logging_obj=logging_obj,
optional_params=optional_params,
api_base=embed_url,
api_key=api_key,
headers=headers,
encoding=encoding,
)
## COMPLETION CALL
if client is None or not isinstance(client, HTTPHandler):
client = HTTPHandler(concurrent_limit=1)
response = client.post(embed_url, headers=headers, data=json.dumps(data))
## LOGGING
logging_obj.post_call(
input=input,
api_key=api_key,
additional_args={"complete_input_dict": data},
original_response=response,
)
"""
response
{
'object': "list",
'data': [
]
'model',
'usage'
}
"""
if response.status_code != 200:
raise CohereError(message=response.text, status_code=response.status_code)
embeddings = response.json()["embeddings"]
return _process_embedding_response(
embeddings=embeddings,
model_response=model_response,
model=model,
encoding=encoding,
input=input,
)

View file

@@ -0,0 +1,201 @@
import json
import os
import time
import traceback
import types
from enum import Enum
from typing import Any, Callable, Optional, Union
import httpx # type: ignore
import requests # type: ignore
import litellm
from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj
from litellm.llms.custom_httpx.http_handler import AsyncHTTPHandler, HTTPHandler
from litellm.utils import Choices, Message, ModelResponse, Usage
def validate_environment(api_key, headers: dict):
headers.update(
{
"Request-Source": "unspecified:litellm",
"accept": "application/json",
"content-type": "application/json",
}
)
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
return headers
class CohereError(Exception):
def __init__(self, status_code, message):
self.status_code = status_code
self.message = message
self.request = httpx.Request(
method="POST", url="https://api.cohere.ai/v1/generate"
)
self.response = httpx.Response(status_code=status_code, request=self.request)
super().__init__(
self.message
) # Call the base class constructor with the parameters it needs
def _process_embedding_response(
embeddings: list,
model_response: litellm.EmbeddingResponse,
model: str,
encoding: Any,
input: list,
) -> litellm.EmbeddingResponse:
output_data = []
for idx, embedding in enumerate(embeddings):
output_data.append(
{"object": "embedding", "index": idx, "embedding": embedding}
)
model_response.object = "list"
model_response.data = output_data
model_response.model = model
input_tokens = 0
for text in input:
input_tokens += len(encoding.encode(text))
setattr(
model_response,
"usage",
Usage(
prompt_tokens=input_tokens, completion_tokens=0, total_tokens=input_tokens
),
)
return model_response
async def async_embedding(
model: str,
data: dict,
input: list,
model_response: litellm.utils.EmbeddingResponse,
timeout: Union[float, httpx.Timeout],
logging_obj: LiteLLMLoggingObj,
optional_params: dict,
api_base: str,
api_key: Optional[str],
headers: dict,
encoding: Callable,
client: Optional[AsyncHTTPHandler] = None,
):
## LOGGING
logging_obj.pre_call(
input=input,
api_key=api_key,
additional_args={
"complete_input_dict": data,
"headers": headers,
"api_base": api_base,
},
)
## COMPLETION CALL
if client is None:
client = AsyncHTTPHandler(concurrent_limit=1)
response = await client.post(api_base, headers=headers, data=json.dumps(data))
## LOGGING
logging_obj.post_call(
input=input,
api_key=api_key,
additional_args={"complete_input_dict": data},
original_response=response,
)
embeddings = response.json()["embeddings"]
## PROCESS RESPONSE ##
return _process_embedding_response(
embeddings=embeddings,
model_response=model_response,
model=model,
encoding=encoding,
input=input,
)
def embedding(
model: str,
input: list,
model_response: litellm.EmbeddingResponse,
logging_obj: LiteLLMLoggingObj,
optional_params: dict,
headers: dict,
encoding: Any,
api_key: Optional[str] = None,
aembedding: Optional[bool] = None,
timeout: Union[float, httpx.Timeout] = httpx.Timeout(None),
client: Optional[Union[HTTPHandler, AsyncHTTPHandler]] = None,
):
headers = validate_environment(api_key, headers=headers)
embed_url = "https://api.cohere.ai/v1/embed"
model = model
data = {"model": model, "texts": input, **optional_params}
if "3" in model and "input_type" not in data:
# cohere v3 embedding models require input_type, if no input_type is provided, default to "search_document"
data["input_type"] = "search_document"
## LOGGING
logging_obj.pre_call(
input=input,
api_key=api_key,
additional_args={"complete_input_dict": data},
)
## ROUTING
if aembedding is True:
return async_embedding(
model=model,
data=data,
input=input,
model_response=model_response,
timeout=timeout,
logging_obj=logging_obj,
optional_params=optional_params,
api_base=embed_url,
api_key=api_key,
headers=headers,
encoding=encoding,
)
## COMPLETION CALL
if client is None or not isinstance(client, HTTPHandler):
client = HTTPHandler(concurrent_limit=1)
response = client.post(embed_url, headers=headers, data=json.dumps(data))
## LOGGING
logging_obj.post_call(
input=input,
api_key=api_key,
additional_args={"complete_input_dict": data},
original_response=response,
)
"""
response
{
'object': "list",
'data': [
]
'model',
'usage'
}
"""
if response.status_code != 200:
raise CohereError(message=response.text, status_code=response.status_code)
embeddings = response.json()["embeddings"]
return _process_embedding_response(
embeddings=embeddings,
model_response=model_response,
model=model,
encoding=encoding,
input=input,
)

View file

@@ -7,7 +7,7 @@ import time
import types
from enum import Enum
from functools import partial
-from typing import Callable, List, Literal, Optional, Tuple, Union
+from typing import Any, Callable, List, Literal, Optional, Tuple, Union
import httpx  # type: ignore
import requests  # type: ignore
@@ -22,7 +22,11 @@ from litellm.types.llms.openai import (
ChatCompletionToolCallFunctionChunk,
ChatCompletionUsageBlock,
)
-from litellm.types.utils import GenericStreamingChunk, ProviderField
+from litellm.types.utils import (
+    CustomStreamingDecoder,
+    GenericStreamingChunk,
+    ProviderField,
+)
from litellm.utils import CustomStreamWrapper, EmbeddingResponse, ModelResponse, Usage
from .base import BaseLLM
@@ -171,15 +175,21 @@ async def make_call(
model: str,
messages: list,
logging_obj,
+streaming_decoder: Optional[CustomStreamingDecoder] = None,
):
response = await client.post(api_base, headers=headers, data=data, stream=True)
if response.status_code != 200:
raise DatabricksError(status_code=response.status_code, message=response.text)
-completion_stream = ModelResponseIterator(
-    streaming_response=response.aiter_lines(), sync_stream=False
-)
+if streaming_decoder is not None:
+    completion_stream: Any = streaming_decoder.aiter_bytes(
+        response.aiter_bytes(chunk_size=1024)
+    )
+else:
+    completion_stream = ModelResponseIterator(
+        streaming_response=response.aiter_lines(), sync_stream=False
+    )
# LOGGING
logging_obj.post_call(
input=messages,
@@ -199,6 +209,7 @@ def make_sync_call(
model: str,
messages: list,
logging_obj,
+streaming_decoder: Optional[CustomStreamingDecoder] = None,
):
if client is None:
client = HTTPHandler()  # Create a new client if none provided
@@ -208,9 +219,14 @@
if response.status_code != 200:
raise DatabricksError(status_code=response.status_code, message=response.read())
-completion_stream = ModelResponseIterator(
-    streaming_response=response.iter_lines(), sync_stream=True
-)
+if streaming_decoder is not None:
+    completion_stream = streaming_decoder.iter_bytes(
+        response.iter_bytes(chunk_size=1024)
+    )
+else:
+    completion_stream = ModelResponseIterator(
+        streaming_response=response.iter_lines(), sync_stream=True
+    )
# LOGGING
logging_obj.post_call(
@@ -283,6 +299,7 @@ class DatabricksChatCompletion(BaseLLM):
logger_fn=None,
headers={},
client: Optional[AsyncHTTPHandler] = None,
+streaming_decoder: Optional[CustomStreamingDecoder] = None,
) -> CustomStreamWrapper:
data["stream"] = True
@@ -296,6 +313,7 @@ class DatabricksChatCompletion(BaseLLM):
model=model,
messages=messages,
logging_obj=logging_obj,
+streaming_decoder=streaming_decoder,
),
model=model,
custom_llm_provider=custom_llm_provider,
@@ -371,6 +389,9 @@ class DatabricksChatCompletion(BaseLLM):
timeout: Optional[Union[float, httpx.Timeout]] = None,
client: Optional[Union[HTTPHandler, AsyncHTTPHandler]] = None,
custom_endpoint: Optional[bool] = None,
+streaming_decoder: Optional[
+    CustomStreamingDecoder
+] = None,  # if openai-compatible api needs custom stream decoder - e.g. sagemaker
):
custom_endpoint = custom_endpoint or optional_params.pop(
"custom_endpoint", None
@@ -436,6 +457,7 @@ class DatabricksChatCompletion(BaseLLM):
headers=headers,
client=client,
custom_llm_provider=custom_llm_provider,
+streaming_decoder=streaming_decoder,
)
else:
return self.acompletion_function(
@@ -473,6 +495,7 @@ class DatabricksChatCompletion(BaseLLM):
model=model,
messages=messages,
logging_obj=logging_obj,
+streaming_decoder=streaming_decoder,
),
model=model,
custom_llm_provider=custom_llm_provider,
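For context, the `streaming_decoder` hook added above only needs the two methods this diff calls, `iter_bytes` and `aiter_bytes`. The real type is `litellm.types.utils.CustomStreamingDecoder`, and the SageMaker `AWSEventStreamDecoder` in the next file is the concrete implementation this change targets; the class below is a purely hypothetical, minimal decoder with the same shape:
```python
from typing import AsyncIterator, Iterator


class PassthroughDecoder:
    """Hypothetical decoder: turns raw byte chunks into parsed event dicts."""

    def iter_bytes(self, iterator: Iterator[bytes]) -> Iterator[dict]:
        for raw in iterator:
            # a real decoder would buffer and parse the provider's event framing here
            yield {"raw": raw.decode("utf-8", errors="ignore")}

    async def aiter_bytes(self, iterator: AsyncIterator[bytes]) -> AsyncIterator[dict]:
        async for raw in iterator:
            yield {"raw": raw.decode("utf-8", errors="ignore")}
```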

View file

@@ -24,8 +24,11 @@ from litellm.llms.custom_httpx.http_handler import (
from litellm.types.llms.openai import (
ChatCompletionToolCallChunk,
ChatCompletionUsageBlock,
+OpenAIChatCompletionChunk,
)
+from litellm.types.utils import CustomStreamingDecoder
from litellm.types.utils import GenericStreamingChunk as GChunk
+from litellm.types.utils import StreamingChatCompletionChunk
from litellm.utils import (
CustomStreamWrapper,
EmbeddingResponse,
@@ -34,8 +37,8 @@ from litellm.utils import (
get_secret,
)
-from .base_aws_llm import BaseAWSLLM
-from .prompt_templates.factory import custom_prompt, prompt_factory
+from ..base_aws_llm import BaseAWSLLM
+from ..prompt_templates.factory import custom_prompt, prompt_factory
_response_stream_shape_cache = None
@@ -241,6 +244,10 @@ class SagemakerLLM(BaseAWSLLM):
aws_region_name=aws_region_name,
)
+custom_stream_decoder = AWSEventStreamDecoder(
+    model="", is_messages_api=True
+)
return openai_like_chat_completions.completion(
model=model,
messages=messages,
@@ -259,6 +266,7 @@
headers=prepared_request.headers,
custom_endpoint=True,
custom_llm_provider="sagemaker_chat",
+streaming_decoder=custom_stream_decoder,  # type: ignore
)
## Load Config
@@ -332,7 +340,7 @@
)
return response
else:
-if stream is not None and stream == True:
+if stream is not None and stream is True:
sync_handler = _get_httpx_client()
sync_response = sync_handler.post(
url=prepared_request.url,
@@ -847,12 +855,21 @@ def get_response_stream_shape():
class AWSEventStreamDecoder:
-def __init__(self, model: str) -> None:
+def __init__(self, model: str, is_messages_api: Optional[bool] = None) -> None:
from botocore.parsers import EventStreamJSONParser
self.model = model
self.parser = EventStreamJSONParser()
self.content_blocks: List = []
+self.is_messages_api = is_messages_api
+def _chunk_parser_messages_api(
+    self, chunk_data: dict
+) -> StreamingChatCompletionChunk:
+    openai_chunk = StreamingChatCompletionChunk(**chunk_data)
+    return openai_chunk
def _chunk_parser(self, chunk_data: dict) -> GChunk:
verbose_logger.debug("in sagemaker chunk parser, chunk_data %s", chunk_data)
@@ -868,6 +885,7 @@ class AWSEventStreamDecoder:
index=_index,
is_finished=True,
finish_reason="stop",
+usage=None,
)
return GChunk(
@@ -875,9 +893,12 @@
index=_index,
is_finished=is_finished,
finish_reason=finish_reason,
+usage=None,
)
-def iter_bytes(self, iterator: Iterator[bytes]) -> Iterator[GChunk]:
+def iter_bytes(
+    self, iterator: Iterator[bytes]
+) -> Iterator[Optional[Union[GChunk, StreamingChatCompletionChunk]]]:
"""Given an iterator that yields lines, iterate over it & yield every event encountered"""
from botocore.eventstream import EventStreamBuffer
@@ -898,7 +919,10 @@
# Try to parse the accumulated JSON
try:
_data = json.loads(accumulated_json)
-yield self._chunk_parser(chunk_data=_data)
+if self.is_messages_api:
+    yield self._chunk_parser_messages_api(chunk_data=_data)
+else:
+    yield self._chunk_parser(chunk_data=_data)
# Reset accumulated_json after successful parsing
accumulated_json = ""
except json.JSONDecodeError:
@@ -909,16 +933,20 @@
if accumulated_json:
try:
_data = json.loads(accumulated_json)
-yield self._chunk_parser(chunk_data=_data)
-except json.JSONDecodeError:
+if self.is_messages_api:
+    yield self._chunk_parser_messages_api(chunk_data=_data)
+else:
+    yield self._chunk_parser(chunk_data=_data)
+except json.JSONDecodeError as e:
# Handle or log any unparseable data at the end
verbose_logger.error(
f"Warning: Unparseable JSON data remained: {accumulated_json}"
)
+yield None
async def aiter_bytes(
self, iterator: AsyncIterator[bytes]
-) -> AsyncIterator[GChunk]:
+) -> AsyncIterator[Optional[Union[GChunk, StreamingChatCompletionChunk]]]:
"""Given an async iterator that yields lines, iterate over it & yield every event encountered"""
from botocore.eventstream import EventStreamBuffer
@@ -940,7 +968,10 @@
# Try to parse the accumulated JSON
try:
_data = json.loads(accumulated_json)
-yield self._chunk_parser(chunk_data=_data)
+if self.is_messages_api:
+    yield self._chunk_parser_messages_api(chunk_data=_data)
+else:
+    yield self._chunk_parser(chunk_data=_data)
# Reset accumulated_json after successful parsing
accumulated_json = ""
except json.JSONDecodeError:
@@ -951,12 +982,16 @@
if accumulated_json:
try:
_data = json.loads(accumulated_json)
-yield self._chunk_parser(chunk_data=_data)
+if self.is_messages_api:
+    yield self._chunk_parser_messages_api(chunk_data=_data)
+else:
+    yield self._chunk_parser(chunk_data=_data)
except json.JSONDecodeError:
# Handle or log any unparseable data at the end
verbose_logger.error(
f"Warning: Unparseable JSON data remained: {accumulated_json}"
)
+yield None
def _parse_message_from_event(self, event) -> Optional[str]:
response_dict = event.to_response_dict()

View file

@@ -32,6 +32,7 @@ from litellm.types.llms.openai import (
ChatCompletionResponseMessage,
ChatCompletionToolCallChunk,
ChatCompletionToolCallFunctionChunk,
+ChatCompletionToolParamFunctionChunk,
ChatCompletionUsageBlock,
)
from litellm.types.llms.vertex_ai import (
@@ -303,10 +304,48 @@ class GoogleAIStudioGeminiConfig: # key diff from VertexAI - 'frequency_penalty
"stream",
"tools",
"tool_choice",
+"functions",
"response_format",
"n",
"stop",
]
+def _map_function(self, value: List[dict]) -> List[Tools]:
+    gtool_func_declarations = []
+    googleSearchRetrieval: Optional[dict] = None
+    for tool in value:
+        openai_function_object: Optional[ChatCompletionToolParamFunctionChunk] = (
+            None
+        )
+        if "function" in tool:  # tools list
+            openai_function_object = ChatCompletionToolParamFunctionChunk(  # type: ignore
+                **tool["function"]
+            )
+        elif "name" in tool:  # functions list
+            openai_function_object = ChatCompletionToolParamFunctionChunk(**tool)  # type: ignore
+        # check if grounding
+        if tool.get("googleSearchRetrieval", None) is not None:
+            googleSearchRetrieval = tool["googleSearchRetrieval"]
+        elif openai_function_object is not None:
+            gtool_func_declaration = FunctionDeclaration(
+                name=openai_function_object["name"],
+                description=openai_function_object.get("description", ""),
+                parameters=openai_function_object.get("parameters", {}),
+            )
+            gtool_func_declarations.append(gtool_func_declaration)
+        else:
+            # assume it's a provider-specific param
+            verbose_logger.warning(
+                "Invalid tool={}. Use `litellm.set_verbose` or `litellm --detailed_debug` to see raw request."
+            )
+    _tools = Tools(
+        function_declarations=gtool_func_declarations,
+    )
+    if googleSearchRetrieval is not None:
+        _tools["googleSearchRetrieval"] = googleSearchRetrieval
+    return [_tools]
def map_tool_choice_values(
self, model: str, tool_choice: Union[str, dict]
@@ -370,26 +409,11 @@ class GoogleAIStudioGeminiConfig: # key diff from VertexAI - 'frequency_penalty
if "json_schema" in value and "schema" in value["json_schema"]:  # type: ignore
optional_params["response_mime_type"] = "application/json"
optional_params["response_schema"] = value["json_schema"]["schema"]  # type: ignore
-if param == "tools" and isinstance(value, list):
-    gtool_func_declarations = []
-    for tool in value:
-        _parameters = tool.get("function", {}).get("parameters", {})
-        _properties = _parameters.get("properties", {})
-        if isinstance(_properties, dict):
-            for _, _property in _properties.items():
-                if "enum" in _property and "format" not in _property:
-                    _property["format"] = "enum"
-        gtool_func_declaration = FunctionDeclaration(
-            name=tool["function"]["name"],
-            description=tool["function"].get("description", ""),
-        )
-        if len(_parameters.keys()) > 0:
-            gtool_func_declaration["parameters"] = _parameters
-        gtool_func_declarations.append(gtool_func_declaration)
-    optional_params["tools"] = [
-        Tools(function_declarations=gtool_func_declarations)
-    ]
+if (param == "tools" or param == "functions") and isinstance(value, list):
+    optional_params["tools"] = self._map_function(value=value)
+    optional_params["litellm_param_is_function_call"] = (
+        True if param == "functions" else False
+    )
if param == "tool_choice" and (
isinstance(value, str) or isinstance(value, dict)
):
@@ -513,6 +537,7 @@ class VertexGeminiConfig:
"max_tokens",
"stream",
"tools",
+"functions",
"tool_choice",
"response_format",
"n",
@@ -548,6 +573,44 @@
status_code=400,
)
+def _map_function(self, value: List[dict]) -> List[Tools]:
+    gtool_func_declarations = []
+    googleSearchRetrieval: Optional[dict] = None
+    for tool in value:
+        openai_function_object: Optional[ChatCompletionToolParamFunctionChunk] = (
+            None
+        )
+        if "function" in tool:  # tools list
+            openai_function_object = ChatCompletionToolParamFunctionChunk(  # type: ignore
+                **tool["function"]
+            )
+        elif "name" in tool:  # functions list
+            openai_function_object = ChatCompletionToolParamFunctionChunk(**tool)  # type: ignore
+        # check if grounding
+        if tool.get("googleSearchRetrieval", None) is not None:
+            googleSearchRetrieval = tool["googleSearchRetrieval"]
+        elif openai_function_object is not None:
+            gtool_func_declaration = FunctionDeclaration(
+                name=openai_function_object["name"],
+                description=openai_function_object.get("description", ""),
+                parameters=openai_function_object.get("parameters", {}),
+            )
+            gtool_func_declarations.append(gtool_func_declaration)
+        else:
+            # assume it's a provider-specific param
+            verbose_logger.warning(
+                "Invalid tool={}. Use `litellm.set_verbose` or `litellm --detailed_debug` to see raw request."
+            )
+    _tools = Tools(
+        function_declarations=gtool_func_declarations,
+    )
+    if googleSearchRetrieval is not None:
+        _tools["googleSearchRetrieval"] = googleSearchRetrieval
+    return [_tools]
def map_openai_params(
self,
model: str,
@@ -589,33 +652,11 @@ class VertexGeminiConfig:
optional_params["frequency_penalty"] = value
if param == "presence_penalty":
optional_params["presence_penalty"] = value
-if param == "tools" and isinstance(value, list):
-    gtool_func_declarations = []
-    googleSearchRetrieval: Optional[dict] = None
-    provider_specific_tools: List[dict] = []
-    for tool in value:
-        # check if grounding
-        try:
-            gtool_func_declaration = FunctionDeclaration(
-                name=tool["function"]["name"],
-                description=tool["function"].get("description", ""),
-                parameters=tool["function"].get("parameters", {}),
-            )
-            gtool_func_declarations.append(gtool_func_declaration)
-        except KeyError:
-            if tool.get("googleSearchRetrieval", None) is not None:
-                googleSearchRetrieval = tool["googleSearchRetrieval"]
-            else:
-                # assume it's a provider-specific param
-                verbose_logger.warning(
-                    "Got KeyError parsing tool={}. Assuming it's a provider-specific param. Use `litellm.set_verbose` or `litellm --detailed_debug` to see raw request."
-                )
-    _tools = Tools(
-        function_declarations=gtool_func_declarations,
-    )
-    if googleSearchRetrieval is not None:
-        _tools["googleSearchRetrieval"] = googleSearchRetrieval
-    optional_params["tools"] = [_tools] + provider_specific_tools
+if (param == "tools" or param == "functions") and isinstance(value, list):
+    optional_params["tools"] = self._map_function(value=value)
+    optional_params["litellm_param_is_function_call"] = (
+        True if param == "functions" else False
+    )
if param == "tool_choice" and (
isinstance(value, str) or isinstance(value, dict)
):
@@ -774,6 +815,7 @@ class VertexLLM(BaseLLM):
model_response: ModelResponse,
logging_obj: litellm.litellm_core_utils.litellm_logging.Logging,
optional_params: dict,
+litellm_params: dict,
api_key: str,
data: Union[dict, str],
messages: List,
@@ -790,7 +832,6 @@ class VertexLLM(BaseLLM):
)
print_verbose(f"raw model_response: {response.text}")
## RESPONSE OBJECT
try:
completion_response = GenerateContentResponseBody(**response.json())  # type: ignore
@@ -898,6 +939,7 @@ class VertexLLM(BaseLLM):
chat_completion_message = {"role": "assistant"}
content_str = ""
tools: List[ChatCompletionToolCallChunk] = []
+functions: Optional[ChatCompletionToolCallFunctionChunk] = None
for idx, candidate in enumerate(completion_response["candidates"]):
if "content" not in candidate:
continue
@@ -920,19 +962,24 @@ class VertexLLM(BaseLLM):
candidate["content"]["parts"][0]["functionCall"]["args"]
),
)
-_tool_response_chunk = ChatCompletionToolCallChunk(
-    id=f"call_{str(uuid.uuid4())}",
-    type="function",
-    function=_function_chunk,
-    index=candidate.get("index", idx),
-)
-tools.append(_tool_response_chunk)
+if litellm_params.get("litellm_param_is_function_call") is True:
+    functions = _function_chunk
+else:
+    _tool_response_chunk = ChatCompletionToolCallChunk(
+        id=f"call_{str(uuid.uuid4())}",
+        type="function",
+        function=_function_chunk,
+        index=candidate.get("index", idx),
+    )
+    tools.append(_tool_response_chunk)
chat_completion_message["content"] = (
content_str if len(content_str) > 0 else None
)
-chat_completion_message["tool_calls"] = tools
+if len(tools) > 0:
+    chat_completion_message["tool_calls"] = tools
+if functions is not None:
+    chat_completion_message["function_call"] = functions
choice = litellm.Choices(
finish_reason=candidate.get("finishReason", "stop"),
index=candidate.get("index", idx),
@@ -1155,6 +1202,15 @@ class VertexLLM(BaseLLM):
else:
url = f"https://{vertex_location}-aiplatform.googleapis.com/{version}/projects/{vertex_project}/locations/{vertex_location}/publishers/google/models/{model}:{endpoint}"
+# if model is only numeric chars then it's a fine tuned gemini model
+# model = 4965075652664360960
+# send to this url: url = f"https://{vertex_location}-aiplatform.googleapis.com/{version}/projects/{vertex_project}/locations/{vertex_location}/endpoints/{model}:{endpoint}"
+if model.isdigit():
+    # It's a fine-tuned Gemini model
+    url = f"https://{vertex_location}-aiplatform.googleapis.com/{version}/projects/{vertex_project}/locations/{vertex_location}/endpoints/{model}:{endpoint}"
+    if stream is True:
+        url += "?alt=sse"
if (
api_base is not None
):  # for cloudflare ai gateway - https://github.com/BerriAI/litellm/issues/4317
@@ -1220,7 +1276,7 @@ class VertexLLM(BaseLLM):
logging_obj,
stream,
optional_params: dict,
-litellm_params=None,
+litellm_params: dict,
logger_fn=None,
headers={},
client: Optional[AsyncHTTPHandler] = None,
@@ -1254,6 +1310,7 @@ class VertexLLM(BaseLLM):
messages=messages,
print_verbose=print_verbose,
optional_params=optional_params,
+litellm_params=litellm_params,
encoding=encoding,
)
@@ -1275,7 +1332,7 @@ class VertexLLM(BaseLLM):
vertex_location: Optional[str],
vertex_credentials: Optional[str],
gemini_api_key: Optional[str],
-litellm_params=None,
+litellm_params: dict,
logger_fn=None,
extra_headers: Optional[dict] = None,
client: Optional[Union[AsyncHTTPHandler, HTTPHandler]] = None,
@@ -1287,7 +1344,6 @@ class VertexLLM(BaseLLM):
optional_params=optional_params
)
-print_verbose("Incoming Vertex Args - {}".format(locals()))
auth_header, url = self._get_token_and_url(
model=model,
gemini_api_key=gemini_api_key,
@@ -1299,7 +1355,6 @@ class VertexLLM(BaseLLM):
api_base=api_base,
should_use_v1beta1_features=should_use_v1beta1_features,
)
-print_verbose("Updated URL - {}".format(url))
## TRANSFORMATION ##
### CHECK CONTEXT CACHING ###
@@ -1339,6 +1394,18 @@ class VertexLLM(BaseLLM):
)
optional_params.pop("response_schema")
+# Check for any 'litellm_param_*' set during optional param mapping
+remove_keys = []
+for k, v in optional_params.items():
+    if k.startswith("litellm_param_"):
+        litellm_params.update({k: v})
+        remove_keys.append(k)
+optional_params = {
+    k: v for k, v in optional_params.items() if k not in remove_keys
+}
try:
content = _gemini_convert_messages_with_history(messages=messages)
tools: Optional[Tools] = optional_params.pop("tools", None)
@@ -1470,6 +1537,7 @@ class VertexLLM(BaseLLM):
model_response=model_response,
logging_obj=logging_obj,
optional_params=optional_params,
+litellm_params=litellm_params,
api_key="",
data=data,  # type: ignore
messages=messages,

View file

@@ -82,8 +82,6 @@ from .llms import (
bedrock,
clarifai,
cloudflare,
-cohere,
-cohere_chat,
gemini,
huggingface_restapi,
maritalk,
@@ -105,6 +103,9 @@ from .llms.anthropic_text import AnthropicTextCompletion
from .llms.azure import AzureChatCompletion, _check_dynamic_azure_params
from .llms.azure_text import AzureTextCompletion
from .llms.bedrock_httpx import BedrockConverseLLM, BedrockLLM
+from .llms.cohere import chat as cohere_chat
+from .llms.cohere import completion as cohere_completion  # type: ignore
+from .llms.cohere import embed as cohere_embed
from .llms.custom_llm import CustomLLM, custom_chat_llm_router
from .llms.databricks import DatabricksChatCompletion
from .llms.huggingface_restapi import Huggingface
@@ -117,7 +118,7 @@ from .llms.prompt_templates.factory import (
prompt_factory,
stringify_json_tool_call_content,
)
-from .llms.sagemaker import SagemakerLLM
+from .llms.sagemaker.sagemaker import SagemakerLLM
from .llms.text_completion_codestral import CodestralTextCompletion
from .llms.text_to_speech.vertex_ai import VertexTextToSpeechAPI
from .llms.triton import TritonChatCompletion
@@ -1651,7 +1652,7 @@ def completion(
if extra_headers is not None:
headers.update(extra_headers)
-model_response = cohere.completion(
+model_response = cohere_completion.completion(
model=model,
messages=messages,
api_base=api_base,
@@ -2014,7 +2015,7 @@ def completion(
model_response=model_response,
print_verbose=print_verbose,
optional_params=new_params,
-litellm_params=litellm_params,
+litellm_params=litellm_params,  # type: ignore
logger_fn=logger_fn,
encoding=encoding,
vertex_location=vertex_ai_location,
@@ -2101,7 +2102,7 @@ def completion(
model_response=model_response,
print_verbose=print_verbose,
optional_params=new_params,
-litellm_params=litellm_params,
+litellm_params=litellm_params,  # type: ignore
logger_fn=logger_fn,
encoding=encoding,
vertex_location=vertex_ai_location,
@@ -3463,7 +3464,7 @@ def embedding(
headers = extra_headers
else:
headers = {}
-response = cohere.embedding(
+response = cohere_embed.embedding(
model=model,
input=input,
optional_params=optional_params,

View file

@@ -2189,6 +2189,18 @@
"mode": "image_generation",
"source": "https://cloud.google.com/vertex-ai/generative-ai/pricing"
},
+"vertex_ai/imagen-3.0-generate-001": {
+    "cost_per_image": 0.04,
+    "litellm_provider": "vertex_ai-image-models",
+    "mode": "image_generation",
+    "source": "https://cloud.google.com/vertex-ai/generative-ai/pricing"
+},
+"vertex_ai/imagen-3.0-fast-generate-001": {
+    "cost_per_image": 0.02,
+    "litellm_provider": "vertex_ai-image-models",
+    "mode": "image_generation",
+    "source": "https://cloud.google.com/vertex-ai/generative-ai/pricing"
+},
"text-embedding-004": {
"max_tokens": 3072,
"max_input_tokens": 3072,

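If these new Imagen entries behave like the existing `vertex_ai` image models, usage would be along these lines. This is a hedged sketch: the project/location values and the assumption that `litellm.image_generation` accepts these model names are mine, not part of the diff; cost tracking would then use the `cost_per_image` values above (0.04 / 0.02 USD per image).
```python
import litellm

# Assumed GCP settings; replace with your own project and region
litellm.vertex_project = "my-gcp-project"
litellm.vertex_location = "us-central1"

# Assumes the new pricing entries map to callable vertex_ai image models
image_response = litellm.image_generation(
    model="vertex_ai/imagen-3.0-generate-001",
    prompt="A watercolor painting of a lighthouse at dawn",
)
print(image_response.data[0])
```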
File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@@ -1 +1 @@
(minified Next.js `index.html` for the LiteLLM Dashboard UI: the build ID changes from `cjLC-FNUY9ME2ZrO3jtsn` to `LO0Sm6uVF0pa4RdHSL0dN` and the hashed JS chunk filenames are updated, e.g. `app/page-b77076dbc8208d12.js` → `app/page-01641b817a14ea88.js`; the rest of the markup is unchanged)

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[26520,["665","static/chunks/3014691f-b24e8254c7593934.js","936","static/chunks/2f6dbc85-cac2949a76539886.js","505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-cb6bfe24e23e121b.js","684","static/chunks/684-16b194c83a169f6d.js","605","static/chunks/605-8e4b96f972af8eaf.js","777","static/chunks/777-50d836152fad178b.js","931","static/chunks/app/page-b77076dbc8208d12.js"],""] 3:I[18018,["665","static/chunks/3014691f-b24e8254c7593934.js","936","static/chunks/2f6dbc85-cac2949a76539886.js","505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-73d0a4f8e09896fe.js","684","static/chunks/684-16b194c83a169f6d.js","605","static/chunks/605-35a95945041f7699.js","777","static/chunks/777-5360b5460eba0779.js","931","static/chunks/app/page-01641b817a14ea88.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["cjLC-FNUY9ME2ZrO3jtsn",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 0:["LO0Sm6uVF0pa4RdHSL0dN",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 
0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

File diff suppressed because one or more lines are too long

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[87494,["505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-cb6bfe24e23e121b.js","777","static/chunks/777-50d836152fad178b.js","418","static/chunks/app/model_hub/page-79eee78ed9fccf89.js"],""] 3:I[87494,["505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-73d0a4f8e09896fe.js","777","static/chunks/777-5360b5460eba0779.js","418","static/chunks/app/model_hub/page-baad96761e038837.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["cjLC-FNUY9ME2ZrO3jtsn",[[["",{"children":["model_hub",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["model_hub",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","model_hub","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 
0:["LO0Sm6uVF0pa4RdHSL0dN",[[["",{"children":["model_hub",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["model_hub",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","model_hub","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

File diff suppressed because one or more lines are too long

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[667,["665","static/chunks/3014691f-b24e8254c7593934.js","505","static/chunks/505-5ff3c318fddfa35c.js","684","static/chunks/684-16b194c83a169f6d.js","777","static/chunks/777-50d836152fad178b.js","461","static/chunks/app/onboarding/page-8be9c2a4a5c886c5.js"],""] 3:I[667,["665","static/chunks/3014691f-b24e8254c7593934.js","505","static/chunks/505-5ff3c318fddfa35c.js","684","static/chunks/684-16b194c83a169f6d.js","777","static/chunks/777-5360b5460eba0779.js","461","static/chunks/app/onboarding/page-0034957a9fa387e0.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["cjLC-FNUY9ME2ZrO3jtsn",[[["",{"children":["onboarding",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["onboarding",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","onboarding","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 
0:["LO0Sm6uVF0pa4RdHSL0dN",[[["",{"children":["onboarding",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["onboarding",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","onboarding","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

View file

@ -1299,7 +1299,6 @@ class LiteLLM_VerificationToken(LiteLLMBase):
model_max_budget: Dict = {} model_max_budget: Dict = {}
soft_budget_cooldown: bool = False soft_budget_cooldown: bool = False
litellm_budget_table: Optional[dict] = None litellm_budget_table: Optional[dict] = None
org_id: Optional[str] = None # org id for a given key org_id: Optional[str] = None # org id for a given key
model_config = ConfigDict(protected_namespaces=()) model_config = ConfigDict(protected_namespaces=())

View file

@ -966,3 +966,96 @@ async def delete_verification_token(tokens: List, user_id: Optional[str] = None)
verbose_proxy_logger.debug(traceback.format_exc()) verbose_proxy_logger.debug(traceback.format_exc())
raise e raise e
return deleted_tokens return deleted_tokens
@router.post(
"/key/{key:path}/regenerate",
tags=["key management"],
dependencies=[Depends(user_api_key_auth)],
)
@management_endpoint_wrapper
async def regenerate_key_fn(
key: str,
user_api_key_dict: UserAPIKeyAuth = Depends(user_api_key_auth),
litellm_changed_by: Optional[str] = Header(
None,
description="The litellm-changed-by header enables tracking of actions performed by authorized users on behalf of other users, providing an audit trail for accountability",
),
) -> GenerateKeyResponse:
from litellm.proxy.proxy_server import (
hash_token,
premium_user,
prisma_client,
user_api_key_cache,
)
"""
Endpoint for regenerating a key
"""
if premium_user is not True:
raise ValueError(
f"Regenerating Virtual Keys is an Enterprise feature, {CommonProxyErrors.not_premium_user.value}"
)
# Check if key exists, raise exception if key is not in the DB
### 1. Create New copy that is duplicate of existing key
######################################################################
# create duplicate of existing key
# set token = new token generated
# insert new token in DB
# create hash of token
if prisma_client is None:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail={"error": "DB not connected. prisma_client is None"},
)
if "sk" not in key:
hashed_api_key = key
else:
hashed_api_key = hash_token(key)
_key_in_db = await prisma_client.db.litellm_verificationtoken.find_unique(
where={"token": hashed_api_key},
)
if _key_in_db is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail={"error": f"Key {key} not found."},
)
verbose_proxy_logger.debug("key_in_db: %s", _key_in_db)
new_token = f"sk-{secrets.token_urlsafe(16)}"
new_token_hash = hash_token(new_token)
new_token_key_name = f"sk-...{new_token[-4:]}"
# update new token in DB
updated_token = await prisma_client.db.litellm_verificationtoken.update(
where={"token": hashed_api_key},
data={
"token": new_token_hash,
"key_name": new_token_key_name,
},
)
updated_token_dict = {}
if updated_token is not None:
updated_token_dict = dict(updated_token)
updated_token_dict["token"] = new_token
### 3. remove existing key entry from cache
######################################################################
if key:
user_api_key_cache.delete_cache(key)
if hashed_api_key:
user_api_key_cache.delete_cache(hashed_api_key)
return GenerateKeyResponse(
**updated_token_dict,
)
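
The new `/key/{key}/regenerate` endpoint above rotates a virtual key in place (note it is gated to premium/enterprise users per the check in the handler). A minimal sketch of calling it against a running proxy; the base URL, master key, and existing key below are placeholders, not values from this commit:

```python
import requests

# Assumes a LiteLLM proxy at http://localhost:4000 with master key "sk-1234"
# (both placeholders) and an existing virtual key to rotate.
old_key = "sk-existing-virtual-key"  # placeholder

resp = requests.post(
    f"http://localhost:4000/key/{old_key}/regenerate",
    headers={"Authorization": "Bearer sk-1234"},
)
resp.raise_for_status()
new_key_info = resp.json()
print(new_key_info["key"])  # the regenerated token; the old key stops working
```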

View file

@ -51,6 +51,10 @@ while retry_count < max_retries and exit_code != 0:
retry_count += 1 retry_count += 1
print(f"Attempt {retry_count}...") # noqa print(f"Attempt {retry_count}...") # noqa
# run prisma generate
result = subprocess.run(["prisma", "generate"], capture_output=True)
exit_code = result.returncode
# Run the Prisma db push command # Run the Prisma db push command
result = subprocess.run( result = subprocess.run(
["prisma", "db", "push", "--accept-data-loss"], capture_output=True ["prisma", "db", "push", "--accept-data-loss"], capture_output=True

View file

@ -2121,6 +2121,90 @@ def test_get_token_url():
pass pass
@pytest.mark.asyncio
async def test_completion_fine_tuned_model():
# load_vertex_ai_credentials()
mock_response = AsyncMock()
def return_val():
return {
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": "A canvas vast, a boundless blue,\nWhere clouds paint tales and winds imbue.\nThe sun descends in fiery hue,\nStars shimmer bright, a gentle few.\n\nThe moon ascends, a pearl of light,\nGuiding travelers through the night.\nThe sky embraces, holds all tight,\nA tapestry of wonder, bright."
}
],
},
"finishReason": "STOP",
"safetyRatings": [
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.028930664,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.041992188,
},
# ... other safety ratings ...
],
"avgLogprobs": -0.95772853367765187,
}
],
"usageMetadata": {
"promptTokenCount": 7,
"candidatesTokenCount": 71,
"totalTokenCount": 78,
},
}
mock_response.json = return_val
mock_response.status_code = 200
expected_payload = {
"contents": [
{"role": "user", "parts": [{"text": "Write a short poem about the sky"}]}
],
"generationConfig": {},
}
with patch(
"litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler.post",
return_value=mock_response,
) as mock_post:
# Act: Call the litellm.completion function
response = await litellm.acompletion(
model="vertex_ai_beta/4965075652664360960",
messages=[{"role": "user", "content": "Write a short poem about the sky"}],
)
# Assert
mock_post.assert_called_once()
url, kwargs = mock_post.call_args
print("url = ", url)
# this is the fine-tuned model endpoint
assert (
url[0]
== "https://us-central1-aiplatform.googleapis.com/v1/projects/adroit-crow-413218/locations/us-central1/endpoints/4965075652664360960:generateContent"
)
print("call args = ", kwargs)
args_to_vertexai = kwargs["json"]
print("args to vertex ai call:", args_to_vertexai)
assert args_to_vertexai == expected_payload
assert response.choices[0].message.content.startswith("A canvas vast")
assert response.choices[0].finish_reason == "stop"
assert response.usage.total_tokens == 78
# Optional: Print for debugging
print("Arguments passed to Vertex AI:", args_to_vertexai)
print("Response:", response)
def mock_gemini_request(*args, **kwargs): def mock_gemini_request(*args, **kwargs):
print(f"kwargs: {kwargs}") print(f"kwargs: {kwargs}")
mock_response = MagicMock() mock_response = MagicMock()
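
The test above mocks the HTTP layer to check that a numeric endpoint ID routes to the fine-tuned-model `:generateContent` URL. For reference, a minimal un-mocked sketch of the same call; the endpoint ID is the placeholder used in the test, and Vertex AI credentials/project are assumed to be configured in the environment:

```python
import litellm

response = litellm.completion(
    model="vertex_ai_beta/4965075652664360960",  # numeric Vertex endpoint ID of the fine-tuned model
    messages=[{"role": "user", "content": "Write a short poem about the sky"}],
)
print(response.choices[0].message.content)
```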

View file

@ -2691,8 +2691,61 @@ def test_completion_hf_model_no_provider():
# test_completion_hf_model_no_provider() # test_completion_hf_model_no_provider()
@pytest.mark.skip(reason="anyscale stopped serving public api endpoints") def gemini_mock_post(*args, **kwargs):
def test_completion_anyscale_with_functions(): mock_response = MagicMock()
mock_response.status_code = 200
mock_response.headers = {"Content-Type": "application/json"}
mock_response.json = MagicMock(
return_value={
"candidates": [
{
"content": {
"parts": [
{
"functionCall": {
"name": "get_current_weather",
"args": {"location": "Boston, MA"},
}
}
],
"role": "model",
},
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
},
],
}
],
"usageMetadata": {
"promptTokenCount": 86,
"candidatesTokenCount": 19,
"totalTokenCount": 105,
},
}
)
return mock_response
@pytest.mark.asyncio
async def test_completion_functions_param():
litellm.set_verbose = True
function1 = [ function1 = [
{ {
"name": "get_current_weather", "name": "get_current_weather",
@ -2711,18 +2764,33 @@ def test_completion_anyscale_with_functions():
} }
] ]
try: try:
messages = [{"role": "user", "content": "What is the weather like in Boston?"}] from litellm.llms.custom_httpx.http_handler import AsyncHTTPHandler
response = completion(
model="anyscale/mistralai/Mistral-7B-Instruct-v0.1",
messages=messages,
functions=function1,
)
# Add any assertions here to check the response
print(response)
cost = litellm.completion_cost(completion_response=response) messages = [{"role": "user", "content": "What is the weather like in Boston?"}]
print("cost to make anyscale completion=", cost)
assert cost > 0.0 client = AsyncHTTPHandler(concurrent_limit=1)
with patch.object(client, "post", side_effect=gemini_mock_post) as mock_client:
response: litellm.ModelResponse = await litellm.acompletion(
model="gemini/gemini-1.5-pro",
messages=messages,
functions=function1,
client=client,
)
print(response)
# Add any assertions here to check the response
mock_client.assert_called()
print(f"mock_client.call_args.kwargs: {mock_client.call_args.kwargs}")
assert "tools" in mock_client.call_args.kwargs["json"]
assert (
"litellm_param_is_function_call"
not in mock_client.call_args.kwargs["json"]
)
assert (
"litellm_param_is_function_call"
not in mock_client.call_args.kwargs["json"]["generationConfig"]
)
assert response.choices[0].message.function_call is not None
except Exception as e: except Exception as e:
pytest.fail(f"Error occurred: {e}") pytest.fail(f"Error occurred: {e}")

View file

@ -142,6 +142,8 @@ def test_parallel_function_call(model):
drop_params=True, drop_params=True,
) # get a new response from the model where it can see the function response ) # get a new response from the model where it can see the function response
print("second response\n", second_response) print("second response\n", second_response)
except litellm.InternalServerError:
pass
except litellm.RateLimitError: except litellm.RateLimitError:
pass pass
except Exception as e: except Exception as e:

View file

@ -56,6 +56,7 @@ from litellm.proxy.management_endpoints.key_management_endpoints import (
generate_key_fn, generate_key_fn,
generate_key_helper_fn, generate_key_helper_fn,
info_key_fn, info_key_fn,
regenerate_key_fn,
update_key_fn, update_key_fn,
) )
from litellm.proxy.management_endpoints.team_endpoints import ( from litellm.proxy.management_endpoints.team_endpoints import (
@ -2935,3 +2936,105 @@ async def test_team_access_groups(prisma_client):
"not allowed to call model" in e.message "not allowed to call model" in e.message
and "Allowed team models" in e.message and "Allowed team models" in e.message
) )
################ Unit Tests for testing regeneration of keys ###########
@pytest.mark.asyncio()
async def test_regenerate_api_key(prisma_client):
litellm.set_verbose = True
setattr(litellm.proxy.proxy_server, "prisma_client", prisma_client)
setattr(litellm.proxy.proxy_server, "master_key", "sk-1234")
await litellm.proxy.proxy_server.prisma_client.connect()
import uuid
# generate new key
key_alias = f"test_alias_regenerate_key-{uuid.uuid4()}"
spend = 100
max_budget = 400
models = ["fake-openai-endpoint"]
new_key = await generate_key_fn(
data=GenerateKeyRequest(
key_alias=key_alias, spend=spend, max_budget=max_budget, models=models
),
user_api_key_dict=UserAPIKeyAuth(
user_role=LitellmUserRoles.PROXY_ADMIN,
api_key="sk-1234",
user_id="1234",
),
)
generated_key = new_key.key
print(generated_key)
# assert the new key works as expected
request = Request(scope={"type": "http"})
request._url = URL(url="/chat/completions")
async def return_body():
return_string = f'{{"model": "fake-openai-endpoint"}}'
# return string as bytes
return return_string.encode()
request.body = return_body
result = await user_api_key_auth(request=request, api_key=f"Bearer {generated_key}")
print(result)
# regenerate the key
new_key = await regenerate_key_fn(
key=generated_key,
user_api_key_dict=UserAPIKeyAuth(
user_role=LitellmUserRoles.PROXY_ADMIN,
api_key="sk-1234",
user_id="1234",
),
)
print("response from regenerate_key_fn", new_key)
# assert the new key works as expected
request = Request(scope={"type": "http"})
request._url = URL(url="/chat/completions")
async def return_body_2():
return_string = f'{{"model": "fake-openai-endpoint"}}'
# return string as bytes
return return_string.encode()
request.body = return_body_2
result = await user_api_key_auth(request=request, api_key=f"Bearer {new_key.key}")
print(result)
# assert the old key stops working
request = Request(scope={"type": "http"})
request._url = URL(url="/chat/completions")
async def return_body_3():
return_string = f'{{"model": "fake-openai-endpoint"}}'
# return string as bytes
return return_string.encode()
request.body = return_body_3
try:
result = await user_api_key_auth(
request=request, api_key=f"Bearer {generated_key}"
)
print(result)
pytest.fail(f"This should have failed!. the key has been regenerated")
except Exception as e:
print("got expected exception", e)
assert "Invalid proxy server token passed" in e.message
# Check that the regenerated key has the same spend, max_budget, models and key_alias
assert new_key.spend == spend, f"Expected spend {spend} but got {new_key.spend}"
assert (
new_key.max_budget == max_budget
), f"Expected max_budget {max_budget} but got {new_key.max_budget}"
assert (
new_key.key_alias == key_alias
), f"Expected key_alias {key_alias} but got {new_key.key_alias}"
assert (
new_key.models == models
), f"Expected models {models} but got {new_key.models}"
assert new_key.key_name == f"sk-...{new_key.key[-4:]}"
pass

View file

@ -120,15 +120,24 @@ async def test_completion_sagemaker_messages_api(sync_mode):
@pytest.mark.asyncio() @pytest.mark.asyncio()
@pytest.mark.parametrize("sync_mode", [False, True]) @pytest.mark.parametrize("sync_mode", [False, True])
async def test_completion_sagemaker_stream(sync_mode): @pytest.mark.parametrize(
"model",
[
"sagemaker_chat/huggingface-pytorch-tgi-inference-2024-08-23-15-48-59-245",
"sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614",
],
)
async def test_completion_sagemaker_stream(sync_mode, model):
try: try:
from litellm.tests.test_streaming import streaming_format_tests
litellm.set_verbose = False litellm.set_verbose = False
print("testing sagemaker") print("testing sagemaker")
verbose_logger.setLevel(logging.DEBUG) verbose_logger.setLevel(logging.DEBUG)
full_text = "" full_text = ""
if sync_mode is True: if sync_mode is True:
response = litellm.completion( response = litellm.completion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", model=model,
messages=[ messages=[
{"role": "user", "content": "hi - what is ur name"}, {"role": "user", "content": "hi - what is ur name"},
], ],
@ -138,14 +147,15 @@ async def test_completion_sagemaker_stream(sync_mode):
input_cost_per_second=0.000420, input_cost_per_second=0.000420,
) )
for chunk in response: for idx, chunk in enumerate(response):
print(chunk) print(chunk)
streaming_format_tests(idx=idx, chunk=chunk)
full_text += chunk.choices[0].delta.content or "" full_text += chunk.choices[0].delta.content or ""
print("SYNC RESPONSE full text", full_text) print("SYNC RESPONSE full text", full_text)
else: else:
response = await litellm.acompletion( response = await litellm.acompletion(
model="sagemaker/jumpstart-dft-hf-textgeneration1-mp-20240815-185614", model=model,
messages=[ messages=[
{"role": "user", "content": "hi - what is ur name"}, {"role": "user", "content": "hi - what is ur name"},
], ],
@ -156,10 +166,12 @@ async def test_completion_sagemaker_stream(sync_mode):
) )
print("streaming response") print("streaming response")
idx = 0
async for chunk in response: async for chunk in response:
print(chunk) print(chunk)
streaming_format_tests(idx=idx, chunk=chunk)
full_text += chunk.choices[0].delta.content or "" full_text += chunk.choices[0].delta.content or ""
idx += 1
print("ASYNC RESPONSE full text", full_text) print("ASYNC RESPONSE full text", full_text)

View file

@ -755,27 +755,40 @@ async def test_completion_gemini_stream(sync_mode):
try: try:
litellm.set_verbose = True litellm.set_verbose = True
print("Streaming gemini response") print("Streaming gemini response")
messages = [ function1 = [
{"role": "system", "content": "You are a helpful assistant."},
{ {
"role": "user", "name": "get_current_weather",
"content": "Who was Alexander?", "description": "Get the current weather in a given location",
}, "parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
},
"required": ["location"],
},
}
] ]
messages = [{"role": "user", "content": "What is the weather like in Boston?"}]
print("testing gemini streaming") print("testing gemini streaming")
complete_response = "" complete_response = ""
# Add any assertions here to check the response # Add any assertions here to check the response
non_empty_chunks = 0 non_empty_chunks = 0
chunks = []
if sync_mode: if sync_mode:
response = completion( response = completion(
model="gemini/gemini-1.5-flash", model="gemini/gemini-1.5-flash",
messages=messages, messages=messages,
stream=True, stream=True,
functions=function1,
) )
for idx, chunk in enumerate(response): for idx, chunk in enumerate(response):
print(chunk) print(chunk)
chunks.append(chunk)
# print(chunk.choices[0].delta) # print(chunk.choices[0].delta)
chunk, finished = streaming_format_tests(idx, chunk) chunk, finished = streaming_format_tests(idx, chunk)
if finished: if finished:
@ -787,11 +800,13 @@ async def test_completion_gemini_stream(sync_mode):
model="gemini/gemini-1.5-flash", model="gemini/gemini-1.5-flash",
messages=messages, messages=messages,
stream=True, stream=True,
functions=function1,
) )
idx = 0 idx = 0
async for chunk in response: async for chunk in response:
print(chunk) print(chunk)
chunks.append(chunk)
# print(chunk.choices[0].delta) # print(chunk.choices[0].delta)
chunk, finished = streaming_format_tests(idx, chunk) chunk, finished = streaming_format_tests(idx, chunk)
if finished: if finished:
@ -800,10 +815,17 @@ async def test_completion_gemini_stream(sync_mode):
complete_response += chunk complete_response += chunk
idx += 1 idx += 1
if complete_response.strip() == "": # if complete_response.strip() == "":
raise Exception("Empty response received") # raise Exception("Empty response received")
print(f"completion_response: {complete_response}") print(f"completion_response: {complete_response}")
assert non_empty_chunks > 1
complete_response = litellm.stream_chunk_builder(
chunks=chunks, messages=messages
)
assert complete_response.choices[0].message.function_call is not None
# assert non_empty_chunks > 1
except litellm.InternalServerError as e: except litellm.InternalServerError as e:
pass pass
except litellm.RateLimitError as e: except litellm.RateLimitError as e:

View file

@ -29,6 +29,7 @@ from openai.types.beta.thread_create_params import (
from openai.types.beta.threads.message import Message as OpenAIMessage from openai.types.beta.threads.message import Message as OpenAIMessage
from openai.types.beta.threads.message_content import MessageContent from openai.types.beta.threads.message_content import MessageContent
from openai.types.beta.threads.run import Run from openai.types.beta.threads.run import Run
from openai.types.chat import ChatCompletionChunk
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
from typing_extensions import Dict, Required, TypedDict, override from typing_extensions import Dict, Required, TypedDict, override
@ -458,6 +459,7 @@ class ChatCompletionResponseMessage(TypedDict, total=False):
content: Optional[str] content: Optional[str]
tool_calls: List[ChatCompletionToolCallChunk] tool_calls: List[ChatCompletionToolCallChunk]
role: Literal["assistant"] role: Literal["assistant"]
function_call: ChatCompletionToolCallFunctionChunk
class ChatCompletionUsageBlock(TypedDict): class ChatCompletionUsageBlock(TypedDict):
@ -466,6 +468,13 @@ class ChatCompletionUsageBlock(TypedDict):
total_tokens: int total_tokens: int
class OpenAIChatCompletionChunk(ChatCompletionChunk):
def __init__(self, **kwargs):
# Set the 'object' kwarg to 'chat.completion.chunk'
kwargs["object"] = "chat.completion.chunk"
super().__init__(**kwargs)
class Hyperparameters(BaseModel): class Hyperparameters(BaseModel):
batch_size: Optional[Union[str, int]] = None # "Number of examples in each batch." batch_size: Optional[Union[str, int]] = None # "Number of examples in each batch."
learning_rate_multiplier: Optional[Union[str, float]] = ( learning_rate_multiplier: Optional[Union[str, float]] = (

View file

@ -90,7 +90,7 @@ class Schema(TypedDict, total=False):
class FunctionDeclaration(TypedDict, total=False): class FunctionDeclaration(TypedDict, total=False):
name: Required[str] name: Required[str]
description: str description: str
parameters: Schema parameters: Union[Schema, dict]
response: Schema response: Schema

View file

@ -5,11 +5,16 @@ from enum import Enum
from typing import Any, Dict, List, Literal, Optional, Tuple, Union from typing import Any, Dict, List, Literal, Optional, Tuple, Union
from openai._models import BaseModel as OpenAIObject from openai._models import BaseModel as OpenAIObject
from openai.types.completion_usage import CompletionUsage
from pydantic import ConfigDict, Field, PrivateAttr from pydantic import ConfigDict, Field, PrivateAttr
from typing_extensions import Callable, Dict, Required, TypedDict, override from typing_extensions import Callable, Dict, Required, TypedDict, override
from ..litellm_core_utils.core_helpers import map_finish_reason from ..litellm_core_utils.core_helpers import map_finish_reason
from .llms.openai import ChatCompletionToolCallChunk, ChatCompletionUsageBlock from .llms.openai import (
ChatCompletionToolCallChunk,
ChatCompletionUsageBlock,
OpenAIChatCompletionChunk,
)
def _generate_id(): # private helper function def _generate_id(): # private helper function
@ -85,7 +90,7 @@ class GenericStreamingChunk(TypedDict, total=False):
tool_use: Optional[ChatCompletionToolCallChunk] tool_use: Optional[ChatCompletionToolCallChunk]
is_finished: Required[bool] is_finished: Required[bool]
finish_reason: Required[str] finish_reason: Required[str]
usage: Optional[ChatCompletionUsageBlock] usage: Required[Optional[ChatCompletionUsageBlock]]
index: int index: int
# use this dict if you want to return any provider specific fields in the response # use this dict if you want to return any provider specific fields in the response
@ -448,9 +453,6 @@ class Choices(OpenAIObject):
setattr(self, key, value) setattr(self, key, value)
from openai.types.completion_usage import CompletionUsage
class Usage(CompletionUsage): class Usage(CompletionUsage):
def __init__( def __init__(
self, self,
@ -499,7 +501,7 @@ class StreamingChoices(OpenAIObject):
): ):
super(StreamingChoices, self).__init__(**params) super(StreamingChoices, self).__init__(**params)
if finish_reason: if finish_reason:
self.finish_reason = finish_reason self.finish_reason = map_finish_reason(finish_reason)
else: else:
self.finish_reason = None self.finish_reason = None
self.index = index self.index = index
@ -535,6 +537,17 @@ class StreamingChoices(OpenAIObject):
setattr(self, key, value) setattr(self, key, value)
class StreamingChatCompletionChunk(OpenAIChatCompletionChunk):
def __init__(self, **kwargs):
new_choices = []
for choice in kwargs["choices"]:
new_choice = StreamingChoices(**choice).model_dump()
new_choices.append(new_choice)
kwargs["choices"] = new_choices
super().__init__(**kwargs)
class ModelResponse(OpenAIObject): class ModelResponse(OpenAIObject):
id: str id: str
"""A unique identifier for the completion.""" """A unique identifier for the completion."""
@ -1231,3 +1244,20 @@ class StandardLoggingPayload(TypedDict):
response: Optional[Union[str, list, dict]] response: Optional[Union[str, list, dict]]
model_parameters: dict model_parameters: dict
hidden_params: StandardLoggingHiddenParams hidden_params: StandardLoggingHiddenParams
from typing import AsyncIterator, Iterator
class CustomStreamingDecoder:
async def aiter_bytes(
self, iterator: AsyncIterator[bytes]
) -> AsyncIterator[
Optional[Union[GenericStreamingChunk, StreamingChatCompletionChunk]]
]:
raise NotImplementedError
def iter_bytes(
self, iterator: Iterator[bytes]
) -> Iterator[Optional[Union[GenericStreamingChunk, StreamingChatCompletionChunk]]]:
raise NotImplementedError
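
The new `CustomStreamingDecoder` interface lets a provider integration yield already-decoded chunks instead of raw bytes. A minimal sketch of a subclass, assuming these types are importable from `litellm.types.utils` as defined above; the one-text-fragment-per-line wire format parsed here is purely illustrative, not a real provider format:

```python
from typing import AsyncIterator, Iterator, Optional, Union

from litellm.types.utils import (  # assumed import path, matching the module shown above
    CustomStreamingDecoder,
    GenericStreamingChunk,
    StreamingChatCompletionChunk,
)


class LineDecoder(CustomStreamingDecoder):
    """Decode a hypothetical protocol that sends one plain-text fragment per line."""

    def _to_chunk(self, raw: bytes) -> Optional[GenericStreamingChunk]:
        text = raw.decode("utf-8").strip()
        if not text:
            return None  # nothing decodable in this piece of the stream
        return GenericStreamingChunk(
            text=text,
            tool_use=None,
            is_finished=False,
            finish_reason="",
            usage=None,
            index=0,
        )

    def iter_bytes(
        self, iterator: Iterator[bytes]
    ) -> Iterator[Optional[Union[GenericStreamingChunk, StreamingChatCompletionChunk]]]:
        for raw in iterator:
            yield self._to_chunk(raw)

    async def aiter_bytes(
        self, iterator: AsyncIterator[bytes]
    ) -> AsyncIterator[
        Optional[Union[GenericStreamingChunk, StreamingChatCompletionChunk]]
    ]:
        async for raw in iterator:
            yield self._to_chunk(raw)
```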

View file

@ -4613,7 +4613,11 @@ def get_llm_provider(
if custom_llm_provider == "perplexity": if custom_llm_provider == "perplexity":
# perplexity is openai compatible, we just need to set this to custom_openai and have the api_base be https://api.perplexity.ai # perplexity is openai compatible, we just need to set this to custom_openai and have the api_base be https://api.perplexity.ai
api_base = api_base or get_secret("PERPLEXITY_API_BASE") or "https://api.perplexity.ai" # type: ignore api_base = api_base or get_secret("PERPLEXITY_API_BASE") or "https://api.perplexity.ai" # type: ignore
dynamic_api_key = api_key or get_secret("PERPLEXITYAI_API_KEY") dynamic_api_key = (
api_key
or get_secret("PERPLEXITYAI_API_KEY")
or get_secret("PERPLEXITY_API_KEY")
)
elif custom_llm_provider == "anyscale": elif custom_llm_provider == "anyscale":
# anyscale is openai compatible, we just need to set this to custom_openai and have the api_base be https://api.endpoints.anyscale.com/v1 # anyscale is openai compatible, we just need to set this to custom_openai and have the api_base be https://api.endpoints.anyscale.com/v1
api_base = api_base or get_secret("ANYSCALE_API_BASE") or "https://api.endpoints.anyscale.com/v1" # type: ignore api_base = api_base or get_secret("ANYSCALE_API_BASE") or "https://api.endpoints.anyscale.com/v1" # type: ignore
@ -6679,10 +6683,14 @@ def exception_type(
else: else:
message = str(original_exception) message = str(original_exception)
if message is not None and isinstance(message, str): if message is not None and isinstance(
message, str
): # done to prevent user-confusion. Relevant issue - https://github.com/BerriAI/litellm/issues/1414
message = message.replace("OPENAI", custom_llm_provider.upper()) message = message.replace("OPENAI", custom_llm_provider.upper())
message = message.replace("openai", custom_llm_provider) message = message.replace(
message = message.replace("OpenAI", custom_llm_provider) "openai.OpenAIError",
"{}.{}Error".format(custom_llm_provider, custom_llm_provider),
)
if custom_llm_provider == "openai": if custom_llm_provider == "openai":
exception_provider = "OpenAI" + "Exception" exception_provider = "OpenAI" + "Exception"
else: else:
@ -8805,6 +8813,7 @@ class CustomStreamWrapper:
self.chunks: List = ( self.chunks: List = (
[] []
) # keep track of the returned chunks - used for calculating the input/output tokens for stream options ) # keep track of the returned chunks - used for calculating the input/output tokens for stream options
self.is_function_call = self.check_is_function_call(logging_obj=logging_obj)
def __iter__(self): def __iter__(self):
return self return self
@ -8812,6 +8821,19 @@ class CustomStreamWrapper:
def __aiter__(self): def __aiter__(self):
return self return self
def check_is_function_call(self, logging_obj) -> bool:
if hasattr(logging_obj, "optional_params") and isinstance(
logging_obj.optional_params, dict
):
if (
"litellm_param_is_function_call" in logging_obj.optional_params
and logging_obj.optional_params["litellm_param_is_function_call"]
is True
):
return True
return False
def process_chunk(self, chunk: str): def process_chunk(self, chunk: str):
""" """
NLP Cloud streaming returns the entire response, for each chunk. Process this, to only return the delta. NLP Cloud streaming returns the entire response, for each chunk. Process this, to only return the delta.
@ -10309,6 +10331,12 @@ class CustomStreamWrapper:
## CHECK FOR TOOL USE ## CHECK FOR TOOL USE
if "tool_calls" in completion_obj and len(completion_obj["tool_calls"]) > 0: if "tool_calls" in completion_obj and len(completion_obj["tool_calls"]) > 0:
if self.is_function_call is True: # user passed in 'functions' param
completion_obj["function_call"] = completion_obj["tool_calls"][0][
"function"
]
completion_obj["tool_calls"] = None
self.tool_call = True self.tool_call = True
## RETURN ARG ## RETURN ARG
@ -10320,8 +10348,13 @@ class CustomStreamWrapper:
) )
or ( or (
"tool_calls" in completion_obj "tool_calls" in completion_obj
and completion_obj["tool_calls"] is not None
and len(completion_obj["tool_calls"]) > 0 and len(completion_obj["tool_calls"]) > 0
) )
or (
"function_call" in completion_obj
and completion_obj["function_call"] is not None
)
): # cannot set content of an OpenAI Object to be an empty string ): # cannot set content of an OpenAI Object to be an empty string
self.safety_checker() self.safety_checker()
hold, model_response_str = self.check_special_tokens( hold, model_response_str = self.check_special_tokens(
@ -10381,6 +10414,7 @@ class CustomStreamWrapper:
if self.sent_first_chunk is False: if self.sent_first_chunk is False:
completion_obj["role"] = "assistant" completion_obj["role"] = "assistant"
self.sent_first_chunk = True self.sent_first_chunk = True
model_response.choices[0].delta = Delta(**completion_obj) model_response.choices[0].delta = Delta(**completion_obj)
if completion_obj.get("index") is not None: if completion_obj.get("index") is not None:
model_response.choices[0].index = completion_obj.get( model_response.choices[0].index = completion_obj.get(

View file

@ -2189,6 +2189,18 @@
"mode": "image_generation", "mode": "image_generation",
"source": "https://cloud.google.com/vertex-ai/generative-ai/pricing" "source": "https://cloud.google.com/vertex-ai/generative-ai/pricing"
}, },
"vertex_ai/imagen-3.0-generate-001": {
"cost_per_image": 0.04,
"litellm_provider": "vertex_ai-image-models",
"mode": "image_generation",
"source": "https://cloud.google.com/vertex-ai/generative-ai/pricing"
},
"vertex_ai/imagen-3.0-fast-generate-001": {
"cost_per_image": 0.02,
"litellm_provider": "vertex_ai-image-models",
"mode": "image_generation",
"source": "https://cloud.google.com/vertex-ai/generative-ai/pricing"
},
"text-embedding-004": { "text-embedding-004": {
"max_tokens": 3072, "max_tokens": 3072,
"max_input_tokens": 3072, "max_input_tokens": 3072,

View file

@ -1,6 +1,6 @@
[tool.poetry] [tool.poetry]
name = "litellm" name = "litellm"
version = "1.44.6" version = "1.44.7"
description = "Library to easily interface with LLM API providers" description = "Library to easily interface with LLM API providers"
authors = ["BerriAI"] authors = ["BerriAI"]
license = "MIT" license = "MIT"
@ -91,7 +91,7 @@ requires = ["poetry-core", "wheel"]
build-backend = "poetry.core.masonry.api" build-backend = "poetry.core.masonry.api"
[tool.commitizen] [tool.commitizen]
version = "1.44.6" version = "1.44.7"
version_files = [ version_files = [
"pyproject.toml:^version" "pyproject.toml:^version"
] ]

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

File diff suppressed because one or more lines are too long

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[26520,["665","static/chunks/3014691f-b24e8254c7593934.js","936","static/chunks/2f6dbc85-cac2949a76539886.js","505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-cb6bfe24e23e121b.js","684","static/chunks/684-16b194c83a169f6d.js","605","static/chunks/605-8e4b96f972af8eaf.js","777","static/chunks/777-50d836152fad178b.js","931","static/chunks/app/page-b77076dbc8208d12.js"],""] 3:I[18018,["665","static/chunks/3014691f-b24e8254c7593934.js","936","static/chunks/2f6dbc85-cac2949a76539886.js","505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-73d0a4f8e09896fe.js","684","static/chunks/684-16b194c83a169f6d.js","605","static/chunks/605-35a95945041f7699.js","777","static/chunks/777-5360b5460eba0779.js","931","static/chunks/app/page-01641b817a14ea88.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["cjLC-FNUY9ME2ZrO3jtsn",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 0:["LO0Sm6uVF0pa4RdHSL0dN",[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 
0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

File diff suppressed because one or more lines are too long

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[87494,["505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-cb6bfe24e23e121b.js","777","static/chunks/777-50d836152fad178b.js","418","static/chunks/app/model_hub/page-79eee78ed9fccf89.js"],""] 3:I[87494,["505","static/chunks/505-5ff3c318fddfa35c.js","131","static/chunks/131-73d0a4f8e09896fe.js","777","static/chunks/777-5360b5460eba0779.js","418","static/chunks/app/model_hub/page-baad96761e038837.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["cjLC-FNUY9ME2ZrO3jtsn",[[["",{"children":["model_hub",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["model_hub",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","model_hub","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 
0:["LO0Sm6uVF0pa4RdHSL0dN",[[["",{"children":["model_hub",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["model_hub",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","model_hub","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

File diff suppressed because one or more lines are too long

View file

@ -1,7 +1,7 @@
2:I[77831,[],""] 2:I[77831,[],""]
3:I[667,["665","static/chunks/3014691f-b24e8254c7593934.js","505","static/chunks/505-5ff3c318fddfa35c.js","684","static/chunks/684-16b194c83a169f6d.js","777","static/chunks/777-50d836152fad178b.js","461","static/chunks/app/onboarding/page-8be9c2a4a5c886c5.js"],""] 3:I[667,["665","static/chunks/3014691f-b24e8254c7593934.js","505","static/chunks/505-5ff3c318fddfa35c.js","684","static/chunks/684-16b194c83a169f6d.js","777","static/chunks/777-5360b5460eba0779.js","461","static/chunks/app/onboarding/page-0034957a9fa387e0.js"],""]
4:I[5613,[],""] 4:I[5613,[],""]
5:I[31778,[],""] 5:I[31778,[],""]
0:["cjLC-FNUY9ME2ZrO3jtsn",[[["",{"children":["onboarding",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["onboarding",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","onboarding","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]] 
0:["LO0Sm6uVF0pa4RdHSL0dN",[[["",{"children":["onboarding",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",{"children":["onboarding",{"children":["__PAGE__",{},["$L1",["$","$L2",null,{"propsForComponent":{"params":{}},"Component":"$3","isStaticGeneration":true}],null]]},["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children","onboarding","children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined","styles":null}]]},[null,["$","html",null,{"lang":"en","children":["$","body",null,{"className":"__className_86ef86","children":["$","$L4",null,{"parallelRouterKey":"children","segmentPath":["children"],"loading":"$undefined","loadingStyles":"$undefined","loadingScripts":"$undefined","hasLoading":false,"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L5",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[],"styles":null}]}]}],null]],[[["$","link","0",{"rel":"stylesheet","href":"/ui/_next/static/css/cd10067a0a3408b4.css","precedence":"next","crossOrigin":""}]],"$L6"]]]]
6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]] 6:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}],["$","meta","1",{"charSet":"utf-8"}],["$","title","2",{"children":"LiteLLM Dashboard"}],["$","meta","3",{"name":"description","content":"LiteLLM Proxy Admin UI"}],["$","link","4",{"rel":"icon","href":"/ui/favicon.ico","type":"image/x-icon","sizes":"16x16"}],["$","meta","5",{"name":"next-size-adjust"}]]
1:null 1:null

View file

@ -4800,11 +4800,11 @@
] ]
}, },
"node_modules/micromatch": { "node_modules/micromatch": {
"version": "4.0.5", "version": "4.0.8",
"resolved": "https://registry.npmjs.org/micromatch/-/micromatch-4.0.5.tgz", "resolved": "https://registry.npmjs.org/micromatch/-/micromatch-4.0.8.tgz",
"integrity": "sha512-DMy+ERcEW2q8Z2Po+WNXuw3c5YaUSFjAO5GsJqfEl7UjvtIuFKO6ZrKvcItdy98dwFI2N1tg3zNIdKaQT+aNdA==", "integrity": "sha512-PXwfBhYu0hBCPw8Dn0E+WDYb7af3dSLVWKi3HGv84IdF4TyFoC0ysxFd0Goxw7nSv4T/PzEJQxsYsEiFCKo2BA==",
"dependencies": { "dependencies": {
"braces": "^3.0.2", "braces": "^3.0.3",
"picomatch": "^2.3.1" "picomatch": "^2.3.1"
}, },
"engines": { "engines": {

View file

@ -141,6 +141,7 @@ const CreateKeyPage = () => {
<UserDashboard <UserDashboard
userID={userID} userID={userID}
userRole={userRole} userRole={userRole}
premiumUser={premiumUser}
teams={teams} teams={teams}
keys={keys} keys={keys}
setUserRole={setUserRole} setUserRole={setUserRole}
@ -175,6 +176,7 @@ const CreateKeyPage = () => {
<UserDashboard <UserDashboard
userID={userID} userID={userID}
userRole={userRole} userRole={userRole}
premiumUser={premiumUser}
teams={teams} teams={teams}
keys={keys} keys={keys}
setUserRole={setUserRole} setUserRole={setUserRole}

View file

@ -770,6 +770,37 @@ export const claimOnboardingToken = async (
throw error; throw error;
} }
}; };
export const regenerateKeyCall = async (accessToken: string, keyToRegenerate: string) => {
try {
const url = proxyBaseUrl
? `${proxyBaseUrl}/key/${keyToRegenerate}/regenerate`
: `/key/${keyToRegenerate}/regenerate`;
const response = await fetch(url, {
method: "POST",
headers: {
[globalLitellmHeaderName]: `Bearer ${accessToken}`,
"Content-Type": "application/json",
},
body: JSON.stringify({}),
});
if (!response.ok) {
const errorData = await response.text();
handleError(errorData);
throw new Error("Network response was not ok");
}
const data = await response.json();
console.log("Regenerate key Response:", data);
return data;
} catch (error) {
console.error("Failed to regenerate key:", error);
throw error;
}
};
let ModelListerrorShown = false; let ModelListerrorShown = false;
let errorTimer: NodeJS.Timeout | null = null; let errorTimer: NodeJS.Timeout | null = null;
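
The helper added above targets the new proxy route `POST /key/{token}/regenerate`, sending the bearer token with an empty JSON body and reading the regenerated key from the JSON response. For reference, a minimal standalone sketch of that same request; the base URL and tokens are placeholders, and treating the `globalLitellmHeaderName` header as `Authorization` is an assumption:

```typescript
// Hypothetical direct call to the regenerate route introduced in this commit.
// Assumes a proxy at http://localhost:4000 and that the auth header name is "Authorization".
async function regenerateKey(proxyBaseUrl: string, accessToken: string, keyToken: string) {
  const response = await fetch(`${proxyBaseUrl}/key/${keyToken}/regenerate`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${accessToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({}), // the dashboard sends an empty body
  });
  if (!response.ok) {
    throw new Error(`Regenerate failed: ${await response.text()}`);
  }
  return response.json(); // contains the newly issued key material
}

// Example usage with placeholder values:
// regenerateKey("http://localhost:4000", "sk-1234", "<key-token>").then(console.log);
```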

View file

@ -48,6 +48,7 @@ interface UserDashboardProps {
setKeys: React.Dispatch<React.SetStateAction<Object[] | null>>; setKeys: React.Dispatch<React.SetStateAction<Object[] | null>>;
setProxySettings: React.Dispatch<React.SetStateAction<any>>; setProxySettings: React.Dispatch<React.SetStateAction<any>>;
proxySettings: any; proxySettings: any;
premiumUser: boolean;
} }
type TeamInterface = { type TeamInterface = {
@ -68,6 +69,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
setKeys, setKeys,
setProxySettings, setProxySettings,
proxySettings, proxySettings,
premiumUser,
}) => { }) => {
const [userSpendData, setUserSpendData] = useState<UserSpendData | null>( const [userSpendData, setUserSpendData] = useState<UserSpendData | null>(
null null
@ -328,6 +330,7 @@ const UserDashboard: React.FC<UserDashboardProps> = ({
selectedTeam={selectedTeam ? selectedTeam : null} selectedTeam={selectedTeam ? selectedTeam : null}
data={keys} data={keys}
setData={setKeys} setData={setKeys}
premiumUser={premiumUser}
teams={teams} teams={teams}
/> />
<CreateKey <CreateKey

View file

@ -1,12 +1,14 @@
"use client"; "use client";
import React, { useEffect, useState } from "react"; import React, { useEffect, useState } from "react";
import { keyDeleteCall, modelAvailableCall } from "./networking"; import { keyDeleteCall, modelAvailableCall } from "./networking";
import { InformationCircleIcon, StatusOnlineIcon, TrashIcon, PencilAltIcon } from "@heroicons/react/outline"; import { InformationCircleIcon, StatusOnlineIcon, TrashIcon, PencilAltIcon, RefreshIcon } from "@heroicons/react/outline";
import { keySpendLogsCall, PredictedSpendLogsCall, keyUpdateCall, modelInfoCall } from "./networking"; import { keySpendLogsCall, PredictedSpendLogsCall, keyUpdateCall, modelInfoCall, regenerateKeyCall } from "./networking";
import { import {
Badge, Badge,
Card, Card,
Table, Table,
Grid,
Col,
Button, Button,
TableBody, TableBody,
TableCell, TableCell,
@ -33,6 +35,8 @@ import {
Select, Select,
} from "antd"; } from "antd";
import { CopyToClipboard } from "react-copy-to-clipboard";
const { Option } = Select; const { Option } = Select;
const isLocal = process.env.NODE_ENV === "development"; const isLocal = process.env.NODE_ENV === "development";
const proxyBaseUrl = isLocal ? "http://localhost:4000" : null; const proxyBaseUrl = isLocal ? "http://localhost:4000" : null;
@ -65,6 +69,7 @@ interface ViewKeyTableProps {
data: any[] | null; data: any[] | null;
setData: React.Dispatch<React.SetStateAction<any[] | null>>; setData: React.Dispatch<React.SetStateAction<any[] | null>>;
teams: any[] | null; teams: any[] | null;
premiumUser: boolean;
} }
interface ItemData { interface ItemData {
@ -92,7 +97,8 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
selectedTeam, selectedTeam,
data, data,
setData, setData,
teams teams,
premiumUser
}) => { }) => {
const [isButtonClicked, setIsButtonClicked] = useState(false); const [isButtonClicked, setIsButtonClicked] = useState(false);
const [isDeleteModalOpen, setIsDeleteModalOpen] = useState(false); const [isDeleteModalOpen, setIsDeleteModalOpen] = useState(false);
@ -109,6 +115,8 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
const [userModels, setUserModels] = useState([]); const [userModels, setUserModels] = useState([]);
const initialKnownTeamIDs: Set<string> = new Set(); const initialKnownTeamIDs: Set<string> = new Set();
const [modelLimitModalVisible, setModelLimitModalVisible] = useState(false); const [modelLimitModalVisible, setModelLimitModalVisible] = useState(false);
const [regenerateDialogVisible, setRegenerateDialogVisible] = useState(false);
const [regeneratedKey, setRegeneratedKey] = useState<string | null>(null);
const [knownTeamIDs, setKnownTeamIDs] = useState(initialKnownTeamIDs); const [knownTeamIDs, setKnownTeamIDs] = useState(initialKnownTeamIDs);
@ -612,6 +620,38 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
setKeyToDelete(null); setKeyToDelete(null);
}; };
const handleRegenerateKey = async () => {
if (!premiumUser) {
message.error("Regenerate API Key is an Enterprise feature. Please upgrade to use this feature.");
return;
}
try {
if (selectedToken == null) {
message.error("Please select a key to regenerate");
return;
}
const response = await regenerateKeyCall(accessToken, selectedToken.token);
setRegeneratedKey(response.key);
// Update the data state with the new key_name
if (data) {
const updatedData = data.map(item =>
item.token === selectedToken.token
? { ...item, key_name: response.key_name }
: item
);
setData(updatedData);
}
setRegenerateDialogVisible(false);
message.success("API Key regenerated successfully");
} catch (error) {
console.error("Error regenerating key:", error);
message.error("Failed to regenerate API Key");
}
};
if (data == null) { if (data == null) {
return; return;
} }
@ -768,6 +808,7 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
size="sm" size="sm"
/> />
<Modal <Modal
open={infoDialogVisible} open={infoDialogVisible}
@ -867,6 +908,14 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
size="sm" size="sm"
onClick={() => handleEditClick(item)} onClick={() => handleEditClick(item)}
/> />
<Icon
onClick={() => {
setSelectedToken(item);
setRegenerateDialogVisible(true);
}}
icon={RefreshIcon}
size="sm"
/>
<Icon <Icon
onClick={() => handleDelete(item)} onClick={() => handleDelete(item)}
icon={TrashIcon} icon={TrashIcon}
@ -942,6 +991,98 @@ const ViewKeyTable: React.FC<ViewKeyTableProps> = ({
accessToken={accessToken} accessToken={accessToken}
/> />
)} )}
{/* Regenerate Key Confirmation Dialog */}
<Modal
title="Regenerate API Key"
visible={regenerateDialogVisible}
onCancel={() => setRegenerateDialogVisible(false)}
footer={[
<Button key="cancel" onClick={() => setRegenerateDialogVisible(false)} className="mr-2">
Cancel
</Button>,
<Button
key="regenerate"
onClick={handleRegenerateKey}
disabled={!premiumUser}
>
{premiumUser ? "Regenerate" : "Upgrade to Regenerate"}
</Button>
]}
>
{premiumUser ? (
<>
<p>Are you sure you want to regenerate this key?</p>
<p>Key Alias:</p>
<pre>{selectedToken?.key_alias || 'No alias set'}</pre>
</>
) : (
<div>
<p className="mb-2 text-gray-500 italic text-[12px]">Upgrade to use this feature</p>
<Button variant="primary" className="mb-2">
<a href="https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat" target="_blank">
Get Free Trial
</a>
</Button>
</div>
)}
</Modal>
{/* Regenerated Key Display Modal */}
{regeneratedKey && (
<Modal
visible={!!regeneratedKey}
onCancel={() => setRegeneratedKey(null)}
footer={[
<Button key="close" onClick={() => setRegeneratedKey(null)}>
Close
</Button>
]}
>
<Grid numItems={1} className="gap-2 w-full">
<Title>Regenerated Key</Title>
<Col numColSpan={1}>
<p>
Please replace your old key with the new key generated. For
security reasons, <b>you will not be able to view it again</b> through
your LiteLLM account. If you lose this secret key, you will need to
generate a new one.
</p>
</Col>
<Col numColSpan={1}>
<Text className="mt-3">Key Alias:</Text>
<div
style={{
background: "#f8f8f8",
padding: "10px",
borderRadius: "5px",
marginBottom: "10px",
}}
>
<pre style={{ wordWrap: "break-word", whiteSpace: "normal" }}>
{selectedToken?.key_alias || 'No alias set'}
</pre>
</div>
<Text className="mt-3">New API Key:</Text>
<div
style={{
background: "#f8f8f8",
padding: "10px",
borderRadius: "5px",
marginBottom: "10px",
}}
>
<pre style={{ wordWrap: "break-word", whiteSpace: "normal" }}>
{regeneratedKey}
</pre>
</div>
<CopyToClipboard text={regeneratedKey} onCopy={() => message.success("API Key copied to clipboard")}>
<Button className="mt-3">Copy API Key</Button>
</CopyToClipboard>
</Col>
</Grid>
</Modal>
)}
</div> </div>
); );
}; };
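
For reference, the response shape the dashboard relies on in `handleRegenerateKey` and the display modal above can be written out explicitly; this is a sketch inferred from the two fields actually read, and any additional fields the proxy may return are not assumed here:

```typescript
// Minimal response shape assumed for POST /key/{token}/regenerate as consumed above:
// `key` is the newly issued secret shown once in the "Regenerated Key" modal, and
// `key_name` is the masked display name written back onto the matching table row.
interface RegenerateKeyResponse {
  key: string;
  key_name: string;
}
```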