(docs) simple proxy

ishaan-jaff 2023-11-29 16:44:39 -08:00
parent c2f642dbec
commit 69eca78000


@@ -460,12 +460,10 @@ curl --location 'http://0.0.0.0:8000/chat/completions' \
```
### Load Balancing - Multiple Instances of 1 model
**LiteLLM Proxy can handle 1k+ requests/second**. Use this config to load balance between multiple instances of the same model. The proxy will handle routing requests using LiteLLM's Router. **Set `rpm` in the config if you want to maximize throughput.**

#### Example config

Requests with `model=gpt-3.5-turbo` will be routed across multiple instances of `azure/gpt-3.5-turbo`.
```yaml
model_list:
  - model_name: gpt-3.5-turbo
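    litellm_params:
      # Placeholder values below are illustrative, not from the original doc —
      # substitute your own Azure deployment, endpoint, and key, and add one
      # entry per instance you want to balance across.
      model: azure/<your-deployment-name>
      api_base: https://<your-azure-endpoint>.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6          # requests per minute this instance can serve
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-second-deployment-name>
      api_base: https://<your-second-azure-endpoint>.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 600        # set rpm per instance so the Router can maximize throughput
```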