From 69eca78000585b99a3be9555fdaaed1c45971ecd Mon Sep 17 00:00:00 2001
From: ishaan-jaff
Date: Wed, 29 Nov 2023 16:44:39 -0800
Subject: [PATCH] (docs) simple proxy

---
 docs/my-website/docs/simple_proxy.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/docs/my-website/docs/simple_proxy.md b/docs/my-website/docs/simple_proxy.md
index 01f43daec4..48fa58f306 100644
--- a/docs/my-website/docs/simple_proxy.md
+++ b/docs/my-website/docs/simple_proxy.md
@@ -460,12 +460,10 @@ curl --location 'http://0.0.0.0:8000/chat/completions' \
 ```
 
 ### Load Balancing - Multiple Instances of 1 model
-**LiteLLM Proxy can handle 1k+ requests/second**. Use this config to load balance between multiple instances of the same model.
-
-The proxy will handle routing requests (using LiteLLM's Router).
-
-In the config below requests with `model=gpt-3.5-turbo` will be routed across multiple instances of `azure/gpt-3.5-turbo`
+Use this config to load balance between multiple instances of the same model. The proxy will handle routing requests (using LiteLLM's Router). **Set `rpm` in the config if you want to maximize throughput.**
 
+#### Example config
+Requests with `model=gpt-3.5-turbo` will be routed across multiple instances of `azure/gpt-3.5-turbo`
 ```yaml
 model_list:
   - model_name: gpt-3.5-turbo