Mirror of https://github.com/BerriAI/litellm.git, synced 2025-04-25 10:44:24 +00:00
(docs) Add docs on load testing benchmarks (#7499)

* docs benchmarks
* docs benchmarks

commit e1fcd3ee43 (parent 38bfefa6ef)

5 changed files with 47 additions and 11 deletions
@@ -1,21 +1,51 @@
 import Image from '@theme/IdealImage';
 
 # Benchmarks
 
-Benchmarks for LiteLLM Gateway (Proxy Server)
+Benchmarks for LiteLLM Gateway (Proxy Server) tested against a fake OpenAI endpoint.
 
-Locust Settings:
-- 2500 Users
-- 100 user Ramp Up
-
-## Basic Benchmarks
-
-Overhead when using a Deployed Proxy vs Direct to LLM
-- Latency overhead added by LiteLLM Proxy: 107ms
-
-| Metric | Direct to Fake Endpoint | Basic Litellm Proxy |
-|--------|------------------------|---------------------|
-| RPS | 1196 | 1133.2 |
-| Median Latency (ms) | 33 | 140 |
+## 1 Instance LiteLLM Proxy
+
+| Metric | Litellm Proxy (1 Instance) |
+|--------|------------------------|
+| Median Latency (ms) | 110 |
+| RPS | 68.2 |
+
+<Image img={require('../img/1_instance_proxy.png')} />
+
+## **Horizontal Scaling**
+
+<Image img={require('../img/instances_vs_rps.png')} />
+
+#### Key Findings
+- Single instance: 68.2 RPS @ 100ms latency
+- 10 instances: 4.3% efficiency loss (653 RPS vs expected 682 RPS), latency stable at `100ms`
+- For 10,000 RPS: Need ~154 instances @ 95.7% efficiency, `100ms latency`
+
+### 2 Instances
+
+**Adding 1 instance will double the RPS and maintain the `100ms-110ms` median latency.**
+
+| Metric | Litellm Proxy (2 Instances) |
+|--------|------------------------|
+| Median Latency (ms) | 100 |
+| RPS | 142 |
+
+<Image img={require('../img/2_instance_proxy.png')} />
+
+### 10 Instances
+
+| Metric | Litellm Proxy (10 Instances) |
+|--------|------------------------|
+| Median Latency (ms) | 110 |
+| RPS | 653 |
+
+<Image img={require('../img/10_instance_proxy.png')} />
 
 ## Logging Callbacks

@@ -39,3 +69,9 @@ Using LangSmith has **no impact on latency, RPS compared to Basic Litellm Proxy**
 | RPS | 1133.2 | 1135 |
 | Median Latency (ms) | 140 | 132 |
+
+## Locust Settings
+
+- 2500 Users
+- 100 user Ramp Up
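The scaling arithmetic in the Key Findings bullets (4.3% efficiency loss at 10 instances, ~154 instances for 10,000 RPS) can be checked with a short sketch. Only the 68.2 RPS and 653 RPS figures come from the benchmarks in the diff; everything else is derived:

```python
import math

# Reproduce the "Key Findings" arithmetic from the benchmark tables.
SINGLE_INSTANCE_RPS = 68.2   # measured RPS for 1 proxy instance
OBSERVED_10_INSTANCE_RPS = 653  # measured RPS for 10 instances

# Expected throughput if scaling were perfectly linear: 682 RPS.
expected_10 = 10 * SINGLE_INSTANCE_RPS
efficiency = OBSERVED_10_INSTANCE_RPS / expected_10  # ~0.957 -> ~4.3% loss
print(f"efficiency at 10 instances: {efficiency:.1%}")

# Instances needed for a 10,000 RPS target, assuming efficiency holds.
target_rps = 10_000
instances_needed = math.ceil(target_rps / (SINGLE_INSTANCE_RPS * efficiency))
print(f"instances for {target_rps} RPS: ~{instances_needed}")
```

Running this reproduces the bullets: ~95.7% efficiency and ~154 instances for the 10,000 RPS target.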
BIN docs/my-website/img/10_instance_proxy.png (new file, binary file not shown, 158 KiB)
BIN docs/my-website/img/1_instance_proxy.png (new file, binary file not shown, 156 KiB)
BIN docs/my-website/img/2_instance_proxy.png (new file, binary file not shown, 158 KiB)
BIN docs/my-website/img/instances_vs_rps.png (new file, binary file not shown, 150 KiB)
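The Locust settings the commit documents (2500 users, 100 user ramp-up) can be exercised with a minimal locustfile. This is a sketch, not the harness used for the published numbers: the `/chat/completions` path matches the proxy's OpenAI-compatible API, but the host, model name, and API key below are placeholder assumptions you would replace for your own deployment.

```python
# locustfile.py: minimal load test against a LiteLLM proxy (sketch).
# Assumed values: proxy host, model name "fake-openai", and the Bearer key.
from locust import HttpUser, between, task


class ProxyUser(HttpUser):
    # Small think time between requests from each simulated user.
    wait_time = between(0.5, 1)

    @task
    def chat_completion(self):
        # OpenAI-compatible chat completion request through the proxy.
        self.client.post(
            "/chat/completions",
            json={
                "model": "fake-openai",  # assumed model alias in the proxy config
                "messages": [{"role": "user", "content": "hi"}],
            },
            headers={"Authorization": "Bearer sk-1234"},  # placeholder key
        )
```

A run matching the documented settings would look something like `locust -f locustfile.py --users 2500 --spawn-rate 100 --host http://localhost:4000`, where `--spawn-rate 100` corresponds to the "100 user Ramp Up".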