diff --git a/docs/my-website/docs/enterprise.md b/docs/my-website/docs/enterprise.md
index 626dc62f35..4673a827bb 100644
--- a/docs/my-website/docs/enterprise.md
+++ b/docs/my-website/docs/enterprise.md
@@ -31,9 +31,10 @@ This covers:
     - ✅ [Team Based Logging](./proxy/team_logging.md) - Allow each team to use their own Langfuse Project / custom callbacks
     - ✅ [Disable Logging for a Team](./proxy/team_logging.md#disable-logging-for-a-team) - Switch off all logging for a team/project (GDPR Compliance)
 - **Controlling Guardrails by Virtual Keys**
- - **Spend Tracking & Data Exports**
+ - **Spend Tracking, Budgets & Data Exports**
     - ✅ [Tracking Spend for Custom Tags](./proxy/enterprise#tracking-spend-for-custom-tags)
     - ✅ [Set USD Budgets Spend for Custom Tags](./proxy/provider_budget_routing#-tag-budgets)
+    - ✅ [Set Model budgets for Virtual Keys](./proxy/users#-virtual-key-model-specific)
     - ✅ [Exporting LLM Logs to GCS Bucket, Azure Blob Storage](./proxy/bucket#🪣-logging-gcs-s3-buckets)
     - ✅ [API Endpoints to get Spend Reports per Team, API Key, Customer](./proxy/cost_tracking.md#✨-enterprise-api-endpoints-to-get-spend)
 - **Prometheus Metrics**
diff --git a/docs/my-website/docs/proxy/enterprise.md b/docs/my-website/docs/proxy/enterprise.md
index 66f365745a..065baae278 100644
--- a/docs/my-website/docs/proxy/enterprise.md
+++ b/docs/my-website/docs/proxy/enterprise.md
@@ -29,6 +29,7 @@ Features:
 - **Spend Tracking & Data Exports**
     - ✅ [Tracking Spend for Custom Tags](#tracking-spend-for-custom-tags)
     - ✅ [Set USD Budgets Spend for Custom Tags](./provider_budget_routing#-tag-budgets)
+    - ✅ [Set Model budgets for Virtual Keys](./users#-virtual-key-model-specific)
     - ✅ [Exporting LLM Logs to GCS Bucket, Azure Blob Storage](./proxy/bucket#🪣-logging-gcs-s3-buckets)
     - ✅ [`/spend/report` API endpoint](cost_tracking.md#✨-enterprise-api-endpoints-to-get-spend)
 - **Prometheus Metrics**
diff --git a/docs/my-website/docs/proxy/users.md b/docs/my-website/docs/proxy/users.md
index
1db749e83e..92ea73b9d2 100644
--- a/docs/my-website/docs/proxy/users.md
+++ b/docs/my-website/docs/proxy/users.md
@@ -10,16 +10,7 @@ Requirements:
 ## Set Budgets
-You can set budgets at 5 levels:
-- For the proxy
-- For an internal user
-- For a customer (end-user)
-- For a key
-- For a key (model specific budgets)
-
-
-
-
+### Global Proxy
 Apply a budget across all calls on the proxy
@@ -57,8 +48,9 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
     ],
 }'
 ```
-
-
+
+### Team
+
 You can:
 - Add budgets to Teams
@@ -126,8 +118,7 @@ curl 'http://0.0.0.0:4000/team/new' \
 }'
 ```
-
-
+### Team Members
 Use this when you want to budget a users spend within a Team
@@ -196,62 +187,75 @@ curl --location 'http://localhost:4000/chat/completions' \
 }'
 ```
-
-
-Use this to budget `user` passed to `/chat/completions`, **without needing to create a key for every user**
+### Internal User
-**Step 1. Modify config.yaml**
-Define `litellm.max_end_user_budget`
-```yaml
-general_settings:
-  master_key: sk-1234
+Apply a budget across all calls an internal user (key owner) can make on the proxy.
-litellm_settings:
-  max_end_user_budget: 0.0001 # budget for 'user' passed to /chat/completions
+:::info
+
+For most use-cases, we recommend setting team-member budgets
+
+:::
+
+LiteLLM exposes a `/user/new` endpoint to create budgets for this.
+
+You can:
+- Add budgets to users [**Jump**](#add-budgets-to-users)
+- Add budget durations, to reset spend [**Jump**](#add-budget-duration-to-users)
+
+By default the `max_budget` is set to `null` and is not checked for keys
+
+#### **Add budgets to users**
+```shell
+curl --location 'http://localhost:4000/user/new' \
+--header 'Authorization: Bearer ' \
+--header 'Content-Type: application/json' \
+--data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
 ```
-2.
Make a /chat/completions call, pass 'user' - First call Works
+[**See Swagger**](https://litellm-api.up.railway.app/#/user%20management/new_user_user_new_post)
+
+**Sample Response**
+
 ```shell
-curl --location 'http://0.0.0.0:4000/chat/completions' \
-  --header 'Content-Type: application/json' \
-  --header 'Authorization: Bearer sk-zi5onDRdHGD24v0Zdn7VBA' \
-  --data ' {
-  "model": "azure-gpt-3.5",
-  "user": "ishaan3",
-  "messages": [
-    {
-      "role": "user",
-      "content": "what time is it"
-    }
-  ]
-}'
+{
+    "key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
+    "expires": "2023-12-22T09:53:13.861000Z",
+    "user_id": "krrish3@berri.ai",
+    "max_budget": 0.0
+}
 ```
-3. Make a /chat/completions call, pass 'user' - Call Fails, since 'ishaan3' over budget
-```shell
-curl --location 'http://0.0.0.0:4000/chat/completions' \
-  --header 'Content-Type: application/json' \
-  --header 'Authorization: Bearer sk-zi5onDRdHGD24v0Zdn7VBA' \
-  --data ' {
-  "model": "azure-gpt-3.5",
-  "user": "ishaan3",
-  "messages": [
-    {
-      "role": "user",
-      "content": "what time is it"
-    }
-  ]
-}'
+#### **Add budget duration to users**
+
+`budget_duration`: Budget is reset at the end of specified duration. If not set, budget is never reset. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
+
+```
+curl 'http://0.0.0.0:4000/user/new' \
+--header 'Authorization: Bearer ' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+  "team_id": "core-infra", # [OPTIONAL]
+  "max_budget": 10,
+  "budget_duration": 10s,
+}'
 ```
-Error
-```shell
-{"error":{"message":"Budget has been exceeded: User ishaan3 has exceeded their budget. Current spend: 0.0008869999999999999; Max Budget: 0.0001","type":"auth_error","param":"None","code":401}}%
+#### Create new keys for existing user
+
+Now you can just call `/key/generate` with that user_id (i.e. krrish3@berri.ai) and:
+- **Budget Check**: krrish3@berri.ai's budget (i.e.
$10) will be checked for this key
+- **Spend Tracking**: spend for this key will update krrish3@berri.ai's spend as well
+
+```bash
+curl --location 'http://0.0.0.0:4000/key/generate' \
+--header 'Authorization: Bearer ' \
+--header 'Content-Type: application/json' \
+--data '{"models": ["azure-models"], "user_id": "krrish3@berri.ai"}'
 ```
-
-
+### Virtual Key
 Apply a budget on a key.
@@ -319,84 +323,19 @@ curl 'http://0.0.0.0:4000/key/generate' \
 }'
 ```
-
-
-
-Apply a budget across all calls an internal user (key owner) can make on the proxy.
-
-:::info
-
-For most use-cases, we recommend setting team-member budgets
-
-:::
-
-LiteLLM exposes a `/user/new` endpoint to create budgets for this.
-
-You can:
-- Add budgets to users [**Jump**](#add-budgets-to-users)
-- Add budget durations, to reset spend [**Jump**](#add-budget-duration-to-users)
-
-By default the `max_budget` is set to `null` and is not checked for keys
-
-#### **Add budgets to users**
-```shell
-curl --location 'http://localhost:4000/user/new' \
---header 'Authorization: Bearer ' \
---header 'Content-Type: application/json' \
---data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
-```
-
-[**See Swagger**](https://litellm-api.up.railway.app/#/user%20management/new_user_user_new_post)
-
-**Sample Response**
-
-```shell
-{
-    "key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
-    "expires": "2023-12-22T09:53:13.861000Z",
-    "user_id": "krrish3@berri.ai",
-    "max_budget": 0.0
-}
-```
-
-#### **Add budget duration to users**
-
-`budget_duration`: Budget is reset at the end of specified duration. If not set, budget is never reset. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").
-
-```
-curl 'http://0.0.0.0:4000/user/new' \
---header 'Authorization: Bearer ' \
---header 'Content-Type: application/json' \
---data-raw '{
-  "team_id": "core-infra", # [OPTIONAL]
-  "max_budget": 10,
-  "budget_duration": 10s,
-}'
-```
-
-#### Create new keys for existing user
-
-Now you can just call `/key/generate` with that user_id (i.e. krrish3@berri.ai) and:
-- **Budget Check**: krrish3@berri.ai's budget (i.e. $10) will be checked for this key
-- **Spend Tracking**: spend for this key will update krrish3@berri.ai's spend as well
-
-```bash
-curl --location 'http://0.0.0.0:4000/key/generate' \
---header 'Authorization: Bearer ' \
---header 'Content-Type: application/json' \
---data '{"models": ["azure-models"], "user_id": "krrish3@berri.ai"}'
-```
-
-
-
-
+### ✨ Virtual Key (Model Specific)
 Apply model specific budgets on a key.
 Example:
 - Budget for `gpt-4o` is $0.0000001, for time period `1d` for `key = "sk-12345"`
 - Budget for `gpt-4o-mini` is $10, for time period `30d` for `key = "sk-12345"`
-#### **Add model specific budgets to keys**
+:::info
+
+✨ This is an Enterprise only feature [Get Started with Enterprise here](https://www.litellm.ai/#pricing)
+
+:::
+
 The spec for `model_max_budget` is **[`Dict[str, GenericBudgetInfo]`](#genericbudgetinfo)**
@@ -470,14 +409,63 @@ Expected response on failure
 ```
-
-
-
+### Customers
-### Reset Budgets
+Use this to budget `user` passed to `/chat/completions`, **without needing to create a key for every user**
+
+**Step 1. Modify config.yaml**
+Define `litellm.max_end_user_budget`
+```yaml
+general_settings:
+  master_key: sk-1234
+
+litellm_settings:
+  max_end_user_budget: 0.0001 # budget for 'user' passed to /chat/completions
+```
+
+2.
Make a /chat/completions call, pass 'user' - First call Works
+```shell
+curl --location 'http://0.0.0.0:4000/chat/completions' \
+  --header 'Content-Type: application/json' \
+  --header 'Authorization: Bearer sk-zi5onDRdHGD24v0Zdn7VBA' \
+  --data ' {
+  "model": "azure-gpt-3.5",
+  "user": "ishaan3",
+  "messages": [
+    {
+      "role": "user",
+      "content": "what time is it"
+    }
+  ]
+}'
+```
+
+3. Make a /chat/completions call, pass 'user' - Call Fails, since 'ishaan3' over budget
+```shell
+curl --location 'http://0.0.0.0:4000/chat/completions' \
+  --header 'Content-Type: application/json' \
+  --header 'Authorization: Bearer sk-zi5onDRdHGD24v0Zdn7VBA' \
+  --data ' {
+  "model": "azure-gpt-3.5",
+  "user": "ishaan3",
+  "messages": [
+    {
+      "role": "user",
+      "content": "what time is it"
+    }
+  ]
+}'
+```
+
+Error
+```shell
+{"error":{"message":"Budget has been exceeded: User ishaan3 has exceeded their budget. Current spend: 0.0008869999999999999; Max Budget: 0.0001","type":"auth_error","param":"None","code":401}}%
+```
+
+## Reset Budgets
 Reset budgets across keys/internal users/teams/customers
diff --git a/litellm/proxy/management_endpoints/key_management_endpoints.py b/litellm/proxy/management_endpoints/key_management_endpoints.py
index 57db5758be..402e8fbb89 100644
--- a/litellm/proxy/management_endpoints/key_management_endpoints.py
+++ b/litellm/proxy/management_endpoints/key_management_endpoints.py
@@ -1933,7 +1933,7 @@ async def _enforce_unique_key_alias(
 def validate_model_max_budget(model_max_budget: Optional[Dict]) -> None:
     """
-    Validate the model_max_budget is GenericBudgetConfigType
+    Validate the model_max_budget is GenericBudgetConfigType + enforce user has an enterprise license
     Raises:
         Exception: If model_max_budget is not a valid GenericBudgetConfigType
@@ -1944,6 +1944,12 @@ def validate_model_max_budget(model_max_budget: Optional[Dict]) -> None:
     if len(model_max_budget) == 0:
         return
     if model_max_budget is not None:
+        from litellm.proxy.proxy_server import
CommonProxyErrors, premium_user
+
+        if premium_user is not True:
+            raise ValueError(
+                f"You must have an enterprise license to set model_max_budget. {CommonProxyErrors.not_premium_user.value}"
+            )
         for _model, _budget_info in model_max_budget.items():
             assert isinstance(_model, str)
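
Reviewer note: the last hunk appears to place the enterprise license check after the empty-input early return and before the per-model checks. The following is a minimal, self-contained sketch of that ordering, not the LiteLLM implementation. It does not import LiteLLM: `PREMIUM_USER`, `NOT_PREMIUM_MESSAGE`, and `validate_model_max_budget_sketch` are hypothetical stand-ins for `premium_user`, `CommonProxyErrors.not_premium_user.value`, and the real function, and the `budget_limit` / `time_period` keys in the sample payload are assumed purely for illustration.

```python
from typing import Dict, Optional

# Hypothetical stand-ins: in the diff these values come from litellm.proxy.proxy_server.
PREMIUM_USER = False
NOT_PREMIUM_MESSAGE = "<enterprise license notice>"


def validate_model_max_budget_sketch(
    model_max_budget: Optional[Dict], premium_user: bool = PREMIUM_USER
) -> None:
    """Follow the order of checks visible in the diff: empty input passes, then the license gate."""
    if model_max_budget is None or len(model_max_budget) == 0:
        return  # nothing to validate, so the license gate is never reached
    if premium_user is not True:
        raise ValueError(
            "You must have an enterprise license to set model_max_budget. "
            f"{NOT_PREMIUM_MESSAGE}"
        )
    for model in model_max_budget:
        # Same first check as the diff; the per-budget validation is omitted in this sketch.
        assert isinstance(model, str)


if __name__ == "__main__":
    # Assumed payload shape, for illustration only.
    budgets = {"gpt-4o": {"budget_limit": 0.0000001, "time_period": "1d"}}
    validate_model_max_budget_sketch({}, premium_user=False)      # passes: empty dict
    validate_model_max_budget_sketch(budgets, premium_user=True)  # passes: licensed
    try:
        validate_model_max_budget_sketch(budgets, premium_user=False)
    except ValueError as err:
        print(err)
```

One consequence of this ordering, if the sketch matches the real control flow, is that keys created without `model_max_budget` (or with an empty dict) are unaffected on non-enterprise deployments; only a non-empty `model_max_budget` triggers the error.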