(docs) add benchmarks on 1K RPS (#6704)
* docs litellm proxy benchmarks
* docs GCS bucket
* doc fix - reduce clutter on logging doc title
parent 4fd0c6c8f2
commit e5051a93a8
4 changed files with 62 additions and 12 deletions
docs/my-website/docs/benchmarks.md (new file, 41 lines)

@@ -0,0 +1,41 @@
# Benchmarks
Benchmarks for LiteLLM Gateway (Proxy Server)
Locust Settings:
- 2500 Users
- 100 user Ramp Up
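
To make the setup reproducible, here is a minimal Locust sketch matching these settings. Assumptions: the proxy exposes an OpenAI-compatible `/chat/completions` route on `http://0.0.0.0:4000`, and the model name and API key below are placeholders, not the exact values behind these numbers.

```python
# locustfile.py - reproduction sketch, not the exact script used for these numbers
from locust import HttpUser, task

class ProxyUser(HttpUser):
    @task
    def chat_completion(self):
        # POST an OpenAI-style chat completion through the LiteLLM proxy
        self.client.post(
            "/chat/completions",
            json={
                "model": "fake-openai-endpoint",  # placeholder model name
                "messages": [{"role": "user", "content": "hi"}],
            },
            headers={"Authorization": "Bearer sk-1234"},  # placeholder key
        )
```

Run with `locust -f locustfile.py --users 2500 --spawn-rate 100 --host http://0.0.0.0:4000`, assuming the ramp-up above means 100 users spawned per second.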
## Basic Benchmarks
Overhead of a deployed LiteLLM Proxy vs. calling a fake LLM endpoint directly:
- Median latency overhead added by the LiteLLM Proxy: 107ms (140ms vs 33ms)
| Metric              | Direct to Fake Endpoint | Basic LiteLLM Proxy |
|---------------------|-------------------------|---------------------|
| RPS                 | 1196                    | 1133.2              |
| Median Latency (ms) | 33                      | 140                 |
## Logging Callbacks
### [GCS Bucket Logging](https://docs.litellm.ai/docs/proxy/bucket)
Using a GCS Bucket has **no impact on latency or RPS compared to the Basic LiteLLM Proxy**
| Metric              | Basic LiteLLM Proxy | LiteLLM Proxy with GCS Bucket Logging |
|---------------------|---------------------|---------------------------------------|
| RPS                 | 1133.2              | 1137.3                                |
| Median Latency (ms) | 140                 | 138                                   |
### [LangSmith logging](https://docs.litellm.ai/docs/proxy/logging)
Using LangSmith has **no impact on latency or RPS compared to the Basic LiteLLM Proxy**
| Metric              | Basic LiteLLM Proxy | LiteLLM Proxy with LangSmith |
|---------------------|---------------------|------------------------------|
| RPS                 | 1133.2              | 1135                         |
| Median Latency (ms) | 140                 | 132                          |
docs/my-website/docs/proxy/bucket.md

@@ -9,7 +9,7 @@ LiteLLM Supports Logging to the following Cloud Buckets

- (Enterprise) ✨ [Google Cloud Storage Buckets](#logging-proxy-inputoutput-to-google-cloud-storage-buckets)
- (Free OSS) [Amazon s3 Buckets](#logging-proxy-inputoutput---s3-buckets)

-## Logging Proxy Input/Output to Google Cloud Storage Buckets
+## Google Cloud Storage Buckets

Log LLM Logs to [Google Cloud Storage Buckets](https://cloud.google.com/storage?hl=en)
@@ -20,6 +20,14 @@ Log LLM Logs to [Google Cloud Storage Buckets](https://cloud.google.com/storage?

:::

+| Property | Details |
+|----------|---------|
+| Description | Log LLM Input/Output to cloud storage buckets |
+| Load Test Benchmarks | [Benchmarks](https://docs.litellm.ai/docs/benchmarks) |
+| Google Docs on Cloud Storage | [Google Cloud Storage](https://cloud.google.com/storage?hl=en) |

### Usage

1. Add `gcs_bucket` to LiteLLM Config.yaml
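
As a reference for step 1, a minimal `config.yaml` sketch. The `gcs_bucket` callback and `GCS_PATH_SERVICE_ACCOUNT` come from this doc; the `GCS_BUCKET_NAME` variable, the model entry, and all values are assumptions/placeholders.

```yaml
model_list:
  - model_name: gpt-4o                        # placeholder model
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  callbacks: ["gcs_bucket"]                   # step 1: enable GCS bucket logging

environment_variables:
  GCS_BUCKET_NAME: "my-litellm-logs"          # assumption: bucket-name env var
  GCS_PATH_SERVICE_ACCOUNT: "/path/to/service_account.json"  # from step 6 below
```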
@@ -85,7 +93,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \

6. Save the JSON file and add the path to `GCS_PATH_SERVICE_ACCOUNT`

-## Logging Proxy Input/Output - s3 Buckets
+## s3 Buckets

We will use the `--config` to set
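
For context on where this section is headed, a hedged sketch of an s3 logging config, assuming the `s3` success callback and an `s3_callback_params` block (names and values below are placeholders):

```yaml
litellm_settings:
  success_callback: ["s3"]
  s3_callback_params:
    s3_bucket_name: my-litellm-logs            # placeholder bucket
    s3_region_name: us-west-2                  # placeholder region
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
```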
docs/my-website/docs/proxy/logging.md

@@ -107,7 +107,7 @@ class StandardLoggingModelInformation(TypedDict):

    model_map_value: Optional[ModelInfo]
```

-## Logging Proxy Input/Output - Langfuse
+## Langfuse

We will use the `--config` to set `litellm.success_callback = ["langfuse"]`; this will log all successful LLM calls to Langfuse. Make sure to set `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` in your environment.
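
Putting that sentence into config form, a minimal sketch (the callback name and both environment variables come straight from the text; the key values are placeholders):

```yaml
litellm_settings:
  success_callback: ["langfuse"]

environment_variables:
  LANGFUSE_PUBLIC_KEY: "pk-lf-..."   # placeholder
  LANGFUSE_SECRET_KEY: "sk-lf-..."   # placeholder
```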
@@ -463,7 +463,7 @@ You will see `raw_request` in your Langfuse Metadata. This is the RAW CURL command

<Image img={require('../../img/debug_langfuse.png')} />

-## Logging Proxy Input/Output in OpenTelemetry format
+## OpenTelemetry format

:::info
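
A rough sketch of the shape an OpenTelemetry logging config can take. Both the `otel` callback name and the exporter variables are assumptions not confirmed by this diff:

```yaml
litellm_settings:
  callbacks: ["otel"]                     # assumption: OTel callback name

environment_variables:
  OTEL_EXPORTER: "otlp_http"              # assumption: OTLP-over-HTTP exporter
  OTEL_ENDPOINT: "http://localhost:4318"  # assumption: local collector endpoint
```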
@@ -1216,7 +1216,7 @@ litellm_settings:

Start the LiteLLM Proxy and make a test request to verify the logs reached your callback API

-## Logging LLM IO to Langsmith
+## Langsmith

1. Set `success_callback: ["langsmith"]` on litellm config.yaml
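
A minimal sketch of step 1 (the callback name is from the text; `LANGSMITH_API_KEY` and its value are assumptions/placeholders):

```yaml
litellm_settings:
  success_callback: ["langsmith"]

environment_variables:
  LANGSMITH_API_KEY: "ls-..."   # assumption: API-key env var; placeholder value
```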
@@ -1261,7 +1261,7 @@ Expect to see your log on Langfuse

<Image img={require('../../img/langsmith_new.png')} />

-## Logging LLM IO to Arize AI
+## Arize AI

1. Set `success_callback: ["arize"]` on litellm config.yaml
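
A minimal sketch of step 1 (the callback name is from the text; the Arize env var names and values are assumptions/placeholders):

```yaml
litellm_settings:
  success_callback: ["arize"]

environment_variables:
  ARIZE_SPACE_KEY: "..."   # assumption: Arize space key env var
  ARIZE_API_KEY: "..."     # assumption: Arize API key env var
```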
@@ -1309,7 +1309,7 @@ Expect to see your log on Langfuse

<Image img={require('../../img/langsmith_new.png')} />

-## Logging LLM IO to Langtrace
+## Langtrace

1. Set `success_callback: ["langtrace"]` on litellm config.yaml
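
A minimal sketch of step 1 (the callback name is from the text; `LANGTRACE_API_KEY` is an assumption/placeholder):

```yaml
litellm_settings:
  success_callback: ["langtrace"]

environment_variables:
  LANGTRACE_API_KEY: "..."   # assumption: Langtrace API key env var
```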
@@ -1351,7 +1351,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \

'
```

-## Logging LLM IO to Galileo
+## Galileo

[BETA]
@@ -1466,7 +1466,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \

<Image img={require('../../img/openmeter_img_2.png')} />

-## Logging Proxy Input/Output - DataDog
+## DataDog

LiteLLM supports logging to the following Datadog integrations:
- `datadog` [Datadog Logs](https://docs.datadoghq.com/logs/)
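
A minimal config sketch for the `datadog` integration (the callback name is from the list above; `DD_API_KEY`/`DD_SITE` are assumptions/placeholders):

```yaml
litellm_settings:
  callbacks: ["datadog"]

environment_variables:
  DD_API_KEY: "..."          # assumption: Datadog API key env var
  DD_SITE: "datadoghq.com"   # assumption: your Datadog site
```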
@@ -1543,7 +1543,7 @@ Expected output on Datadog

<Image img={require('../../img/dd_small1.png')} />

-## Logging Proxy Input/Output - DynamoDB
+## DynamoDB

We will use the `--config` to set
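
A hedged sketch of where this config is headed, assuming a `dynamodb` success callback and a table-name setting (both the setting name and value are assumptions/placeholders):

```yaml
litellm_settings:
  success_callback: ["dynamodb"]
  dynamodb_table_name: litellm-logs   # assumption: setting name; placeholder table
```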
@@ -1669,7 +1669,7 @@ Your logs should be available on DynamoDB

}
```

-## Logging Proxy Input/Output - Sentry
+## Sentry

If API calls fail (LLM/database) you can log those to Sentry:
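
A minimal sketch, assuming Sentry is wired up as a failure callback with a DSN in the environment (both are assumptions; values are placeholders):

```yaml
litellm_settings:
  failure_callback: ["sentry"]   # assumption: failures (not successes) go to Sentry

environment_variables:
  SENTRY_DSN: "https://<key>@sentry.io/<project>"   # placeholder DSN
```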
@@ -1711,7 +1711,7 @@ Test Request

litellm --test
```

-## Logging Proxy Input/Output Athina
+## Athina

[Athina](https://athina.ai/) allows you to log LLM Input/Output for monitoring, analytics, and observability.
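
A minimal sketch (the callback name follows the `success_callback` pattern of the other integrations here; `ATHINA_API_KEY` is an assumption/placeholder):

```yaml
litellm_settings:
  success_callback: ["athina"]   # assumption: follows the success_callback pattern

environment_variables:
  ATHINA_API_KEY: "..."          # assumption: Athina API key env var
```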
docs/my-website/sidebars.js

@@ -266,6 +266,7 @@ const sidebars = {

      type: "category",
      label: "Load Testing",
      items: [
+       "benchmarks",
        "load_test",
        "load_test_advanced",
        "load_test_sdk",