From c568bb6cac1083e0563e7c48e6f5e8be814d83eb Mon Sep 17 00:00:00 2001
From: Krrish Dholakia
Date: Sat, 23 Dec 2023 09:48:53 +0530
Subject: [PATCH] docs(virtual_keys.md): add budget manager docs to proxy

---
 docs/my-website/docs/budget_manager.md     |  7 +++++
 docs/my-website/docs/proxy/virtual_keys.md | 33 ++++++++++++++++++++--
 docs/my-website/docs/routing.md            |  2 +-
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/docs/my-website/docs/budget_manager.md b/docs/my-website/docs/budget_manager.md
index 2a3b25b8e..c2cc28c2f 100644
--- a/docs/my-website/docs/budget_manager.md
+++ b/docs/my-website/docs/budget_manager.md
@@ -8,6 +8,13 @@ Don't want to get crazy bills because either while you're calling LLM APIs **or*
 LiteLLM exposes:
 * `litellm.max_budget`: a global variable you can use to set the max budget (in USD) across all your litellm calls. If this budget is exceeded, it will raise a BudgetExceededError
 * `BudgetManager`: A class to help set budgets per user. BudgetManager creates a dictionary to manage the user budgets, where the key is user and the object is their current cost + model-specific costs.
+* `OpenAI Proxy Server`: A server to call 100+ LLMs with an OpenAI-compatible endpoint. Manages user budgets, spend tracking, load balancing, etc.
+
+:::info
+
+If you want a server to manage user keys, budgets, etc., use our [OpenAI Proxy Server](./proxy/virtual_keys.md)
+
+:::
 
 ## quick start
 
diff --git a/docs/my-website/docs/proxy/virtual_keys.md b/docs/my-website/docs/proxy/virtual_keys.md
index 91702eddc..c8844a3c3 100644
--- a/docs/my-website/docs/proxy/virtual_keys.md
+++ b/docs/my-website/docs/proxy/virtual_keys.md
@@ -1,5 +1,5 @@
 # Key Management
-Track Spend and create virtual keys for the proxy
+Track Spend, Set Budgets, and create virtual keys for the proxy
 
 Grant other's temporary access to your proxy, with keys that expire after a set duration.
 
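The `BudgetManager` bullet above describes a dictionary keyed by user, tracking current cost against a max budget. As a hedged illustration (this is a simplified sketch of the idea, not litellm's actual `BudgetManager` implementation, and the class/method names below are hypothetical):

```python
class SimpleBudgetManager:
    """Toy per-user budget tracker mirroring the idea behind litellm's BudgetManager:
    a dict keyed by user, storing accumulated cost, checked against a max budget (USD)."""

    def __init__(self):
        self.user_budgets = {}  # user -> max budget in USD
        self.user_costs = {}    # user -> accumulated cost in USD

    def create_budget(self, user: str, total_budget: float) -> None:
        self.user_budgets[user] = total_budget
        self.user_costs.setdefault(user, 0.0)

    def update_cost(self, user: str, cost: float) -> None:
        # Called after each LLM response, with the cost of that call.
        self.user_costs[user] = self.user_costs.get(user, 0.0) + cost

    def get_current_cost(self, user: str) -> float:
        return self.user_costs.get(user, 0.0)

    def is_over_budget(self, user: str) -> bool:
        # Users with no budget set are treated as unlimited.
        return self.get_current_cost(user) >= self.user_budgets.get(user, float("inf"))


manager = SimpleBudgetManager()
manager.create_budget(user="user@example.com", total_budget=0.10)
manager.update_cost(user="user@example.com", cost=0.04)
print(manager.is_over_budget("user@example.com"))  # False: $0.04 < $0.10
```

The real `BudgetManager` additionally tracks model-specific costs per user, as the bullet notes; the sketch only keeps the aggregate.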
@@ -59,7 +59,7 @@ Expected response:
 }
 ```
 
-## Managing Auth - Upgrade/Downgrade Models
+## Upgrade/Downgrade Models
 
 If a user is expected to use a given model (i.e. gpt3-5), and you want to:
 
@@ -108,7 +108,7 @@ curl -X POST "https://0.0.0.0:8000/key/generate" \
 - **How to upgrade / downgrade request?** Change the alias mapping
 - **How are routing between diff keys/api bases done?** litellm handles this by shuffling between different models in the model list with the same model_name. [**See Code**](https://github.com/BerriAI/litellm/blob/main/litellm/router.py)
 
-## Managing Auth - Tracking Spend
+## Tracking Spend
 
 You can get spend for a key by using the `/key/info` endpoint.
 
@@ -142,6 +142,33 @@ This is automatically updated (in USD) when calls are made to /completions, /cha
 }
 ```
 
+
+
+## Set Budgets
+
+LiteLLM exposes a `/user/new` endpoint to create budgets for users that persist across multiple keys.
+
+This is documented in the Swagger docs (live at your server's root endpoint, e.g. `http://0.0.0.0:8000/`). Here's an example request.
+
+```shell
+curl --location 'http://localhost:8000/user/new' \
+--header 'Authorization: Bearer ' \
+--header 'Content-Type: application/json' \
+--data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
+```
+The request is a normal `/key/generate` request body + a `max_budget` field.
+
+**Sample Response**
+
+```json
+{
+    "key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
+    "expires": "2023-12-22T09:53:13.861000Z",
+    "user_id": "krrish3@berri.ai",
+    "max_budget": 0.0
+}
+```
+
 ## Custom Auth
 
 You can now override the default api key auth.
 
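The Set Budgets section above says the `/user/new` body is a normal `/key/generate` body plus a `max_budget` field, and that spend (from `/key/info`) is tracked in USD against it. A hedged client-side sketch of both ideas, under the assumption that the response shapes match the examples in the docs (the helper names below are illustrative, not part of litellm's API):

```python
import json

def build_user_new_payload(models, max_budget, user_id):
    # A normal /key/generate request body plus the max_budget field,
    # mirroring the curl example in the docs.
    return {"models": models, "max_budget": max_budget, "user_id": user_id}

def is_within_budget(key_info):
    # Compare tracked spend (USD) against the user's max_budget.
    # A missing max_budget is treated as unlimited.
    return key_info.get("spend", 0.0) < key_info.get("max_budget", float("inf"))

payload = build_user_new_payload(
    models=["azure-models"], max_budget=0.0, user_id="krrish3@berri.ai"
)
print(json.dumps(payload))

sample_info = {"spend": 0.0, "max_budget": 0.0}
print(is_within_budget(sample_info))  # False: a max_budget of 0 blocks all spend
```

As the example output suggests, `"max_budget": 0` in the curl request creates a user who cannot spend anything, which is useful for testing enforcement.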
diff --git a/docs/my-website/docs/routing.md b/docs/my-website/docs/routing.md
index 5239f7ab7..3334dbd5c 100644
--- a/docs/my-website/docs/routing.md
+++ b/docs/my-website/docs/routing.md
@@ -14,7 +14,7 @@ In production, litellm supports using Redis as a way to track cooldown server an
 
 :::info
 
-If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./simple_proxy#load-balancing---multiple-instances-of-1-model)
+If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./proxy/load_balancing.md)
 
 :::