forked from phoenix/litellm-mirror
docs(virtual_keys.md): add budget manager docs to proxy
parent: 89ee9fe400
commit: c568bb6cac
3 changed files with 38 additions and 4 deletions
@@ -8,6 +8,13 @@ Don't want to get crazy bills because either while you're calling LLM APIs **or*
LiteLLM exposes:
* `litellm.max_budget`: a global variable you can use to set the max budget (in USD) across all your litellm calls. If this budget is exceeded, a `BudgetExceededError` is raised.
* `BudgetManager`: a class to help set budgets per user. BudgetManager maintains a dictionary of user budgets, where the key is the user and the value is their current cost plus model-specific costs.
* `OpenAI Proxy Server`: a server to call 100+ LLMs with an OpenAI-compatible endpoint. It manages user budgets, spend tracking, load balancing, etc.
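The per-user accounting that `BudgetManager` performs can be sketched with a plain dictionary. This is an illustrative toy (class and method names here are assumptions), not litellm's implementation:

```python
# Illustrative sketch of per-user budget tracking, in the spirit of
# litellm's BudgetManager -- names and structure are assumptions,
# not litellm's actual code.

class SimpleBudgetManager:
    def __init__(self):
        # key: user id, value: budget + running costs (total and per model)
        self.user_dict = {}

    def create_budget(self, total_budget: float, user: str):
        self.user_dict[user] = {
            "total_budget": total_budget,
            "current_cost": 0.0,
            "model_cost": {},
        }

    def update_cost(self, user: str, model: str, cost: float):
        entry = self.user_dict[user]
        entry["current_cost"] += cost
        entry["model_cost"][model] = entry["model_cost"].get(model, 0.0) + cost

    def is_over_budget(self, user: str) -> bool:
        entry = self.user_dict[user]
        return entry["current_cost"] >= entry["total_budget"]


manager = SimpleBudgetManager()
manager.create_budget(total_budget=0.05, user="user@example.com")
manager.update_cost(user="user@example.com", model="gpt-3.5-turbo", cost=0.02)
```

A real budget manager would also persist this state and compute `cost` from token usage; the sketch only shows the key-per-user bookkeeping described above.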
:::info
If you want a server to manage user keys, budgets, etc. use our [OpenAI Proxy Server](./proxy/virtual_keys.md)
:::
## quick start
@@ -1,5 +1,5 @@
# Key Management
- Track Spend and create virtual keys for the proxy
+ Track Spend, Set budgets and create virtual keys for the proxy
Grant others temporary access to your proxy, with keys that expire after a set duration.
@@ -59,7 +59,7 @@ Expected response:
}
```
- ## Managing Auth - Upgrade/Downgrade Models
+ ## Upgrade/Downgrade Models
If a user is expected to use a given model (e.g. gpt-3.5), and you want to:
@@ -108,7 +108,7 @@ curl -X POST "https://0.0.0.0:8000/key/generate" \
- **How to upgrade / downgrade a request?** Change the alias mapping.
- **How is routing between different keys/API bases done?** litellm handles this by shuffling between different models in the model list with the same model_name. [**See Code**](https://github.com/BerriAI/litellm/blob/main/litellm/router.py)
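The shuffling between same-named deployments can be sketched roughly as follows. The `model_list` entries and the `pick_deployment` helper are illustrative assumptions; litellm's real Router (linked above) adds cooldowns, retries, and smarter selection:

```python
import random

# Hypothetical model list: two deployments share the public name
# "gpt-3.5-turbo", so either may serve a call for it.
model_list = [
    {"model_name": "gpt-3.5-turbo", "litellm_params": {"api_base": "https://endpoint-1.example.com"}},
    {"model_name": "gpt-3.5-turbo", "litellm_params": {"api_base": "https://endpoint-2.example.com"}},
    {"model_name": "gpt-4", "litellm_params": {"api_base": "https://endpoint-3.example.com"}},
]

def pick_deployment(model_name: str) -> dict:
    # collect every deployment registered under the requested model name
    candidates = [m for m in model_list if m["model_name"] == model_name]
    if not candidates:
        raise ValueError(f"no deployment for {model_name}")
    # shuffle-style selection: any matching deployment may serve the call
    return random.choice(candidates)
```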
- ## Managing Auth - Tracking Spend
+ ## Tracking Spend
You can get spend for a key by using the `/key/info` endpoint.
@@ -142,6 +142,33 @@ This is automatically updated (in USD) when calls are made to /completions, /cha
}
```
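The same `/key/info` call can be assembled from Python with the standard library. The base URL and keys below are placeholders, and you would need a running proxy to actually send the request:

```python
import urllib.parse
import urllib.request

def build_key_info_request(base_url: str, master_key: str, key_to_check: str) -> urllib.request.Request:
    # /key/info takes the key to inspect as a query parameter
    query = urllib.parse.urlencode({"key": key_to_check})
    return urllib.request.Request(
        url=f"{base_url}/key/info?{query}",
        headers={"Authorization": f"Bearer {master_key}"},
    )

req = build_key_info_request(
    base_url="http://0.0.0.0:8000",      # placeholder proxy address
    master_key="sk-master-placeholder",  # placeholder master key
    key_to_check="sk-1234",              # placeholder generated key
)
# urllib.request.urlopen(req) would perform the call against a live proxy
```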
## Set Budgets
LiteLLM exposes a `/user/new` endpoint to create budgets for users that persist across multiple keys.
This is documented in the Swagger docs (live at your server's root endpoint - e.g. `http://0.0.0.0:8000/`). Here's an example request.
```shell
curl --location 'http://localhost:8000/user/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["azure-models"], "max_budget": 0, "user_id": "krrish3@berri.ai"}'
```
The request is a normal `/key/generate` request body + a `max_budget` field.
**Sample Response**
```json
{
"key": "sk-YF2OxDbrgd1y2KgwxmEA2w",
"expires": "2023-12-22T09:53:13.861000Z",
"user_id": "krrish3@berri.ai",
"max_budget": 0.0
}
```
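Since the `/user/new` body is just a `/key/generate` body plus `max_budget`, it can be built programmatically. A minimal sketch; the field values are placeholders:

```python
import json

# A /key/generate-style request body (placeholder values)
key_generate_body = {
    "models": ["azure-models"],
    "user_id": "user@example.com",
}

# /user/new = the same body plus a max_budget (in USD)
user_new_body = {**key_generate_body, "max_budget": 10.0}

payload = json.dumps(user_new_body)
```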
## Custom Auth
You can now override the default API key auth.
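A custom auth override can be sketched as an async hook that validates the incoming key. The function name, signature, and return shape below are assumptions for illustration, not litellm's exact custom-auth interface:

```python
import asyncio

# Keys this hypothetical hook accepts (placeholder)
VALID_KEYS = {"sk-my-team-key"}

async def user_api_key_auth(api_key: str) -> dict:
    """Return an auth result for a valid key, raise otherwise."""
    if api_key in VALID_KEYS:
        return {"api_key": api_key}
    raise PermissionError("invalid api key")

result = asyncio.run(user_api_key_auth("sk-my-team-key"))
```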
@@ -14,7 +14,7 @@ In production, litellm supports using Redis as a way to track cooldown server an
:::info
- If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./simple_proxy#load-balancing---multiple-instances-of-1-model)
+ If you want a server to load balance across different LLM APIs, use our [OpenAI Proxy Server](./proxy/load_balancing.md)
:::