docs(virtual_keys.md): add budget manager docs to proxy

This commit is contained in:
Krrish Dholakia 2023-12-23 09:48:53 +05:30
parent 89ee9fe400
commit c568bb6cac
3 changed files with 38 additions and 4 deletions


@@ -8,6 +8,13 @@ Don't want to get crazy bills because either while you're calling LLM APIs **or**
LiteLLM exposes:
* `litellm.max_budget`: a global variable you can use to set the max budget (in USD) across all your litellm calls. If this budget is exceeded, it will raise a BudgetExceededError
* `BudgetManager`: A class to help set budgets per user. BudgetManager creates a dictionary to manage the user budgets, where the key is user and the object is their current cost + model-specific costs.
* `OpenAI Proxy Server`: A server to call 100+ LLMs with an openai-compatible endpoint. Manages user budgets, spend tracking, load balancing etc.
:::info
If you want a server to manage user keys, budgets, etc. use our [OpenAI Proxy Server](./proxy/virtual_keys.md)
:::
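The dictionary the `BudgetManager` keeps can be sketched as a minimal tracker. This is an illustrative sketch only, not the actual litellm API; the class and method names (`SimpleBudgetTracker`, `create_budget`, `update_cost`) are hypothetical stand-ins for the behavior the bullet above describes:

```python
# Illustrative sketch (NOT the litellm API): a per-user budget tracker
# keyed by user, holding total budget, current cost, and per-model costs.

class BudgetExceededError(Exception):
    pass

class SimpleBudgetTracker:
    def __init__(self):
        # user -> {"total_budget", "current_cost", "model_cost"}
        self.user_dict = {}

    def create_budget(self, user: str, total_budget: float):
        self.user_dict[user] = {
            "total_budget": total_budget,
            "current_cost": 0.0,
            "model_cost": {},
        }

    def update_cost(self, user: str, model: str, cost: float):
        entry = self.user_dict[user]
        entry["current_cost"] += cost
        entry["model_cost"][model] = entry["model_cost"].get(model, 0.0) + cost
        if entry["current_cost"] > entry["total_budget"]:
            raise BudgetExceededError(
                f"{user} exceeded budget of ${entry['total_budget']}"
            )

    def get_current_cost(self, user: str) -> float:
        return self.user_dict[user]["current_cost"]

tracker = SimpleBudgetTracker()
tracker.create_budget(user="user@example.com", total_budget=0.05)
tracker.update_cost(user="user@example.com", model="gpt-3.5-turbo", cost=0.01)
print(tracker.get_current_cost(user="user@example.com"))  # 0.01
```

The real `BudgetManager` additionally computes costs from model responses; see the litellm docs for its exact methods.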
## quick start


@@ -1,5 +1,5 @@
# Key Management
-Track Spend and create virtual keys for the proxy
+Track Spend, Set budgets and create virtual keys for the proxy
Grant others temporary access to your proxy, with keys that expire after a set duration.
@@ -59,7 +59,7 @@ Expected response:
}
```
-## Managing Auth - Upgrade/Downgrade Models
+## Upgrade/Downgrade Models
If a user is expected to use a given model (e.g. gpt-3.5), and you want to:
@@ -108,7 +108,7 @@ curl -X POST "https://0.0.0.0:8000/key/generate" \
- **How to upgrade / downgrade request?** Change the alias mapping
- **How is routing between different keys/api bases done?** litellm handles this by shuffling between different models in the model list with the same model_name. [**See Code**](https://github.com/BerriAI/litellm/blob/main/litellm/router.py)
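The shuffling described above can be sketched as follows. This is an assumed simplification of the router's behavior, not its actual implementation; the `model_list` entries and `pick_deployment` helper are hypothetical:

```python
# Sketch: pick a random deployment among entries sharing the requested
# model_name, approximating litellm's shuffle-based routing.
import random

model_list = [
    {"model_name": "gpt-3.5-turbo", "litellm_params": {"api_base": "https://endpoint-1"}},
    {"model_name": "gpt-3.5-turbo", "litellm_params": {"api_base": "https://endpoint-2"}},
    {"model_name": "gpt-4", "litellm_params": {"api_base": "https://endpoint-3"}},
]

def pick_deployment(model_name: str) -> dict:
    # All deployments with the same model_name are interchangeable targets.
    candidates = [m for m in model_list if m["model_name"] == model_name]
    return random.choice(candidates)
```

The real router layers cooldowns and retries on top of this; see the linked `router.py` for details.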
-## Managing Auth - Tracking Spend
+## Tracking Spend
You can get spend for a key by using the `/key/info` endpoint.
@@ -142,6 +142,33 @@ This is automatically updated (in USD) when calls are made to /completions, /cha
}
```
## Set Budgets
LiteLLM exposes a `/user/new` endpoint to create budgets for users that persist across multiple keys.
This is documented in the swagger (live on your server root endpoint - e.g. `http://0.0.0.0:8000/`). Here's an example request.
```shell
curl --location 'http://localhost:8000/user/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
```
The request is a normal `/key/generate` request body + a `max_budget` field.
**Sample Response**
```json
{
"key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
"expires": "2023-12-22T09:53:13.861000Z",
"user_id": "krrish3@berri.ai",
"max_budget": 0.0
}
```
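The same call can be made from Python. A sketch only: it assumes a proxy running on `localhost:8000`, and `<your-master-key>` stays a placeholder. The payload mirrors the curl body above (a `/key/generate` body plus `max_budget`); the actual HTTP call is commented out so the snippet runs without a live server:

```python
# Sketch: build the /user/new request body from Python.
import json

payload = {
    "models": ["azure-models"],
    "max_budget": 0,  # budget in USD; per the response above this is stored as 0.0
    "user_id": "krrish3@berri.ai",
}

# Uncomment against a running proxy (requires the `requests` package):
# import requests
# resp = requests.post(
#     "http://localhost:8000/user/new",
#     headers={"Authorization": "Bearer <your-master-key>"},
#     json=payload,
# )

print(json.dumps(payload))
```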
## Custom Auth
You can now override the default api key auth.


@@ -14,7 +14,7 @@ In production, litellm supports using Redis as a way to track cooldown server an
:::info
-If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./simple_proxy#load-balancing---multiple-instances-of-1-model)
+If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./proxy/load_balancing.md)
:::