From c568bb6cac1083e0563e7c48e6f5e8be814d83eb Mon Sep 17 00:00:00 2001
From: Krrish Dholakia
Date: Sat, 23 Dec 2023 09:48:53 +0530
Subject: [PATCH] docs(virtual_keys.md): add budget manager docs to proxy

---
 docs/my-website/docs/budget_manager.md     |  7 +++++
 docs/my-website/docs/proxy/virtual_keys.md | 33 ++++++++++++++++++++--
 docs/my-website/docs/routing.md            |  2 +-
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/docs/my-website/docs/budget_manager.md b/docs/my-website/docs/budget_manager.md
index 2a3b25b8e..c2cc28c2f 100644
--- a/docs/my-website/docs/budget_manager.md
+++ b/docs/my-website/docs/budget_manager.md
@@ -8,6 +8,13 @@ Don't want to get crazy bills because either while you're calling LLM APIs **or*
 LiteLLM exposes:
 * `litellm.max_budget`: a global variable you can use to set the max budget (in USD) across all your litellm calls. If this budget is exceeded, it will raise a BudgetExceededError
 * `BudgetManager`: A class to help set budgets per user. BudgetManager creates a dictionary to manage the user budgets, where the key is user and the object is their current cost + model-specific costs.
+* `OpenAI Proxy Server`: A server to call 100+ LLMs with an OpenAI-compatible endpoint. Manages user budgets, spend tracking, load balancing, etc.
+
+:::info
+
+If you want a server to manage user keys, budgets, etc., use our [OpenAI Proxy Server](./proxy/virtual_keys.md)
+
+:::
 
 ## quick start
 
diff --git a/docs/my-website/docs/proxy/virtual_keys.md b/docs/my-website/docs/proxy/virtual_keys.md
index 91702eddc..c8844a3c3 100644
--- a/docs/my-website/docs/proxy/virtual_keys.md
+++ b/docs/my-website/docs/proxy/virtual_keys.md
@@ -1,5 +1,5 @@
 # Key Management
-Track Spend and create virtual keys for the proxy
+Track Spend, Set Budgets, and create virtual keys for the proxy
 
 Grant other's temporary access to your proxy, with keys that expire after a set duration.
 
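The `BudgetManager` bullet above describes a dictionary keyed by user, tracking current cost against a max budget. As a hedged illustration (this is a simplified sketch of the idea, not litellm's actual `BudgetManager` implementation, and the class/method names below are hypothetical):

```python
class SimpleBudgetManager:
    """Toy per-user budget tracker mirroring the idea behind litellm's BudgetManager:
    a dict keyed by user, storing accumulated cost, checked against a max budget (USD)."""

    def __init__(self):
        self.user_budgets = {}  # user -> max budget in USD
        self.user_costs = {}    # user -> accumulated cost in USD

    def create_budget(self, user: str, total_budget: float) -> None:
        self.user_budgets[user] = total_budget
        self.user_costs.setdefault(user, 0.0)

    def update_cost(self, user: str, cost: float) -> None:
        # Called after each LLM response, with the cost of that call.
        self.user_costs[user] = self.user_costs.get(user, 0.0) + cost

    def get_current_cost(self, user: str) -> float:
        return self.user_costs.get(user, 0.0)

    def is_over_budget(self, user: str) -> bool:
        # Users with no budget set are treated as unlimited.
        return self.get_current_cost(user) >= self.user_budgets.get(user, float("inf"))


manager = SimpleBudgetManager()
manager.create_budget(user="user@example.com", total_budget=0.10)
manager.update_cost(user="user@example.com", cost=0.04)
print(manager.is_over_budget("user@example.com"))  # False: $0.04 < $0.10
```

The real `BudgetManager` additionally tracks model-specific costs per user, as the bullet notes; the sketch only keeps the aggregate.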
@@ -59,7 +59,7 @@ Expected response:
 }
 ```
 
-## Managing Auth - Upgrade/Downgrade Models
+## Upgrade/Downgrade Models
 
 If a user is expected to use a given model (i.e. gpt3-5), and you want to:
 
@@ -108,7 +108,7 @@ curl -X POST "https://0.0.0.0:8000/key/generate" \
 - **How to upgrade / downgrade request?** Change the alias mapping
 - **How are routing between diff keys/api bases done?** litellm handles this by shuffling between different models in the model list with the same model_name. [**See Code**](https://github.com/BerriAI/litellm/blob/main/litellm/router.py)
 
-## Managing Auth - Tracking Spend
+## Tracking Spend
 
 You can get spend for a key by using the `/key/info` endpoint.
 
@@ -142,6 +142,33 @@ This is automatically updated (in USD) when calls are made to /completions, /cha
 }
 ```
 
+
+
+## Set Budgets
+
+LiteLLM exposes a `/user/new` endpoint to create budgets for users that persist across multiple keys.
+
+This is documented in the Swagger docs (live at your server's root endpoint, e.g. `http://0.0.0.0:8000/`). Here's an example request.
+
+```shell
+curl --location 'http://localhost:8000/user/new' \
+--header 'Authorization: Bearer ' \
+--header 'Content-Type: application/json' \
+--data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
+```
+The request is a normal `/key/generate` request body + a `max_budget` field.
+
+**Sample Response**
+
+```json
+{
+    "key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
+    "expires": "2023-12-22T09:53:13.861000Z",
+    "user_id": "krrish3@berri.ai",
+    "max_budget": 0.0
+}
+```
+
 ## Custom Auth
 
 You can now override the default api key auth.
 
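The Set Budgets section above says the `/user/new` body is a normal `/key/generate` body plus a `max_budget` field, and that spend (from `/key/info`) is tracked in USD against it. A hedged client-side sketch of both ideas, under the assumption that the response shapes match the examples in the docs (the helper names below are illustrative, not part of litellm's API):

```python
import json

def build_user_new_payload(models, max_budget, user_id):
    # A normal /key/generate request body plus the max_budget field,
    # mirroring the curl example in the docs.
    return {"models": models, "max_budget": max_budget, "user_id": user_id}

def is_within_budget(key_info):
    # Compare tracked spend (USD) against the user's max_budget.
    # A missing max_budget is treated as unlimited.
    return key_info.get("spend", 0.0) < key_info.get("max_budget", float("inf"))

payload = build_user_new_payload(
    models=["azure-models"], max_budget=0.0, user_id="krrish3@berri.ai"
)
print(json.dumps(payload))

sample_info = {"spend": 0.0, "max_budget": 0.0}
print(is_within_budget(sample_info))  # False: a max_budget of 0 blocks all spend
```

As the example output suggests, `"max_budget": 0` in the curl request creates a user who cannot spend anything, which is useful for testing enforcement.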
diff --git a/docs/my-website/docs/routing.md b/docs/my-website/docs/routing.md
index 5239f7ab7..3334dbd5c 100644
--- a/docs/my-website/docs/routing.md
+++ b/docs/my-website/docs/routing.md
@@ -14,7 +14,7 @@ In production, litellm supports using Redis as a way to track cooldown server an
 
 :::info
 
-If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./simple_proxy#load-balancing---multiple-instances-of-1-model)
+If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./proxy/load_balancing.md)
 
 :::