forked from phoenix/litellm-mirror
(docs) simple proxy
This commit is contained in:
parent
c2f642dbec
commit
69eca78000
1 changed file with 3 additions and 5 deletions
@@ -460,12 +460,10 @@ curl --location 'http://0.0.0.0:8000/chat/completions' \
```
### Load Balancing - Multiple Instances of 1 model
**LiteLLM Proxy can handle 1k+ requests/second.** Use this config to load balance between multiple instances of the same model. The proxy will handle routing requests using LiteLLM's Router. **Set `rpm` in the config if you want to maximize throughput.**

In the config below, requests with `model=gpt-3.5-turbo` will be routed across multiple instances of `azure/gpt-3.5-turbo`.
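Conceptually, routing multiple deployments under one public model name can be pictured as a round robin over the deployment group. The sketch below is illustrative only, not LiteLLM's actual Router implementation, and the deployment names in it are placeholders:

```python
from itertools import cycle

class SimpleRouter:
    """Minimal round-robin sketch: maps one public model_name
    to several underlying deployments and cycles through them."""

    def __init__(self, model_list):
        groups = {}
        for deployment in model_list:
            # Group deployments that share the same public model_name.
            groups.setdefault(deployment["model_name"], []).append(
                deployment["litellm_params"]
            )
        # One infinite round-robin iterator per model_name.
        self._cycles = {name: cycle(deps) for name, deps in groups.items()}

    def get_deployment(self, model_name):
        # Each call returns the next deployment in the group.
        return next(self._cycles[model_name])

# Two hypothetical deployments behind the same "gpt-3.5-turbo" name.
router = SimpleRouter([
    {"model_name": "gpt-3.5-turbo",
     "litellm_params": {"model": "azure/gpt-turbo-eu"}},
    {"model_name": "gpt-3.5-turbo",
     "litellm_params": {"model": "azure/gpt-turbo-us"}},
])

print(router.get_deployment("gpt-3.5-turbo")["model"])  # → azure/gpt-turbo-eu
```

Successive calls alternate between the two deployments, which is the behavior the real Router generalizes with weighting, cooldowns, and `rpm` limits.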
#### Example config
```yaml
model_list:
  - model_name: gpt-3.5-turbo
```
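The snippet above is cut off in this diff view. A fuller config of the same shape might look like the following sketch; the deployment names, endpoints, and keys below are placeholders I am assuming, not values from the original:

```yaml
model_list:
  # Two deployments sharing one public model_name get load balanced.
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6  # per-deployment requests-per-minute limit
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-second-deployment-name>
      api_base: <your-second-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6
```

Clients keep sending `model=gpt-3.5-turbo`; the proxy picks a deployment per request, honoring each deployment's `rpm` limit.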