docs(routing.md): updating routing docs to include cooldown info
commit 276041e3bb (parent 2c50ea94c8)
1 changed file with 3 additions and 0 deletions
@@ -8,6 +8,9 @@ import TabItem from '@theme/TabItem';
LiteLLM manages:
- Load-balancing across multiple deployments (e.g. Azure/OpenAI)
- Prioritizing important requests to ensure they don't fail (i.e. Queueing)
- Basic reliability logic - cooldowns, fallbacks, timeouts and retries (fixed + exponential backoff) across multiple deployments/providers.

In production, litellm supports using Redis to track which servers are in cooldown and their usage (for managing tpm/rpm limits).
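
A minimal sketch of how this fits together in code (the deployment names, keys, and Redis values below are placeholders, and the reliability knobs shown are only a subset of what the Router accepts):

```python
from litellm import Router

# Two deployments registered under the same public alias "gpt-3.5-turbo";
# the Router load-balances requests for that alias across them.
model_list = [
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "azure/<your-azure-deployment>",  # placeholder deployment name
            "api_key": "<your-azure-api-key>",
            "api_base": "<your-azure-endpoint>",
            "api_version": "<your-azure-api-version>",
        },
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "gpt-3.5-turbo",
            "api_key": "<your-openai-api-key>",
        },
    },
]

router = Router(
    model_list=model_list,
    # Redis lets multiple litellm instances share cooldown + tpm/rpm usage state
    redis_host="<your-redis-host>",  # placeholder Redis settings
    redis_port=6379,
    redis_password="<your-redis-password>",
    # basic reliability settings: retries and per-request timeouts
    num_retries=3,
    timeout=30,
)

response = router.completion(
    model="gpt-3.5-turbo",  # the Router picks one of the deployments above
    messages=[{"role": "user", "content": "Hey, how's it going?"}],
)
print(response)
```

Without the Redis settings, the Router keeps cooldown and usage state in-memory per process, which works locally but doesn't coordinate across multiple litellm instances.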
## Load Balancing
(s/o [@paulpierre](https://www.linkedin.com/in/paulpierre/) for his contribution to this implementation)