docs(virtual_keys.md): add budget manager docs to proxy

This commit is contained in:
Krrish Dholakia 2023-12-23 09:48:53 +05:30
parent 89ee9fe400
commit c568bb6cac
3 changed files with 38 additions and 4 deletions


@@ -8,6 +8,13 @@ Don't want to get crazy bills because either while you're calling LLM APIs **or**
LiteLLM exposes:
* `litellm.max_budget`: a global variable you can use to set the max budget (in USD) across all your litellm calls. If this budget is exceeded, it will raise a BudgetExceededError
* `BudgetManager`: A class to help set budgets per user. BudgetManager creates a dictionary to manage the user budgets, where the key is user and the object is their current cost + model-specific costs.
* `OpenAI Proxy Server`: A server to call 100+ LLMs with an openai-compatible endpoint. Manages user budgets, spend tracking, load balancing etc.
:::info
If you want a server to manage user keys, budgets, etc. use our [OpenAI Proxy Server](./proxy/virtual_keys.md)
:::
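The dictionary the `BudgetManager` keeps can be sketched as a minimal tracker. This is an illustrative sketch only, not the actual litellm API; the class and method names (`SimpleBudgetTracker`, `create_budget`, `update_cost`) are hypothetical stand-ins for the behavior the bullet above describes:

```python
# Illustrative sketch (NOT the litellm API): a per-user budget tracker
# keyed by user, holding total budget, current cost, and per-model costs.

class BudgetExceededError(Exception):
    pass

class SimpleBudgetTracker:
    def __init__(self):
        # user -> {"total_budget", "current_cost", "model_cost"}
        self.user_dict = {}

    def create_budget(self, user: str, total_budget: float):
        self.user_dict[user] = {
            "total_budget": total_budget,
            "current_cost": 0.0,
            "model_cost": {},
        }

    def update_cost(self, user: str, model: str, cost: float):
        entry = self.user_dict[user]
        entry["current_cost"] += cost
        entry["model_cost"][model] = entry["model_cost"].get(model, 0.0) + cost
        if entry["current_cost"] > entry["total_budget"]:
            raise BudgetExceededError(
                f"{user} exceeded budget of ${entry['total_budget']}"
            )

    def get_current_cost(self, user: str) -> float:
        return self.user_dict[user]["current_cost"]

tracker = SimpleBudgetTracker()
tracker.create_budget(user="user@example.com", total_budget=0.05)
tracker.update_cost(user="user@example.com", model="gpt-3.5-turbo", cost=0.01)
print(tracker.get_current_cost(user="user@example.com"))  # 0.01
```

The real `BudgetManager` additionally computes costs from model responses; see the litellm docs for its exact methods.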
## quick start


@@ -1,5 +1,5 @@
# Key Management
-Track Spend and create virtual keys for the proxy
+Track Spend, Set budgets and create virtual keys for the proxy
Grant others temporary access to your proxy, with keys that expire after a set duration.
@@ -59,7 +59,7 @@ Expected response:
}
```
-## Managing Auth - Upgrade/Downgrade Models
+## Upgrade/Downgrade Models
If a user is expected to use a given model (e.g. gpt-3.5), and you want to:
@@ -108,7 +108,7 @@ curl -X POST "https://0.0.0.0:8000/key/generate" \
- **How to upgrade / downgrade request?** Change the alias mapping
- **How is routing between different keys/api bases done?** litellm handles this by shuffling between different models in the model list with the same model_name. [**See Code**](https://github.com/BerriAI/litellm/blob/main/litellm/router.py)
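The shuffling described above can be sketched as follows. This is an assumed simplification of the router's behavior, not its actual implementation; the `model_list` entries and `pick_deployment` helper are hypothetical:

```python
# Sketch: pick a random deployment among entries sharing the requested
# model_name, approximating litellm's shuffle-based routing.
import random

model_list = [
    {"model_name": "gpt-3.5-turbo", "litellm_params": {"api_base": "https://endpoint-1"}},
    {"model_name": "gpt-3.5-turbo", "litellm_params": {"api_base": "https://endpoint-2"}},
    {"model_name": "gpt-4", "litellm_params": {"api_base": "https://endpoint-3"}},
]

def pick_deployment(model_name: str) -> dict:
    # All deployments with the same model_name are interchangeable targets.
    candidates = [m for m in model_list if m["model_name"] == model_name]
    return random.choice(candidates)
```

The real router layers cooldowns and retries on top of this; see the linked `router.py` for details.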
-## Managing Auth - Tracking Spend
+## Tracking Spend
You can get spend for a key by using the `/key/info` endpoint.
@@ -142,6 +142,33 @@ This is automatically updated (in USD) when calls are made to /completions, /cha
}
```
## Set Budgets
LiteLLM exposes a `/user/new` endpoint to create budgets for users that persist across multiple keys.
This is documented in the swagger (live on your server root endpoint - e.g. `http://0.0.0.0:8000/`). Here's an example request.
```shell
curl --location 'http://localhost:8000/user/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
```
The request is a normal `/key/generate` request body + a `max_budget` field.
**Sample Response**
```json
{
"key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
"expires": "2023-12-22T09:53:13.861000Z",
"user_id": "krrish3@berri.ai",
"max_budget": 0.0
}
```
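The same call can be made from Python. A sketch only: it assumes a proxy running on `localhost:8000`, and `<your-master-key>` stays a placeholder. The payload mirrors the curl body above (a `/key/generate` body plus `max_budget`); the actual HTTP call is commented out so the snippet runs without a live server:

```python
# Sketch: build the /user/new request body from Python.
import json

payload = {
    "models": ["azure-models"],
    "max_budget": 0,  # budget in USD; per the response above this is stored as 0.0
    "user_id": "krrish3@berri.ai",
}

# Uncomment against a running proxy (requires the `requests` package):
# import requests
# resp = requests.post(
#     "http://localhost:8000/user/new",
#     headers={"Authorization": "Bearer <your-master-key>"},
#     json=payload,
# )

print(json.dumps(payload))
```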
## Custom Auth
You can now override the default api key auth.


@@ -14,7 +14,7 @@ In production, litellm supports using Redis as a way to track cooldown server an
:::info
-If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./simple_proxy#load-balancing---multiple-instances-of-1-model)
+If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./proxy/load_balancing.md)
:::