diff --git a/docs/my-website/docs/benchmarks.md b/docs/my-website/docs/benchmarks.md
new file mode 100644
index 000000000..86699008b
--- /dev/null
+++ b/docs/my-website/docs/benchmarks.md
@@ -0,0 +1,41 @@
+# Benchmarks
+
+Benchmarks for LiteLLM Gateway (Proxy Server)
+
+Locust Settings:
+- 2500 Users
+- 100 user ramp-up
+
+
+## Basic Benchmarks
+
+Overhead when using a Deployed Proxy vs Direct to LLM
+- Latency overhead added by LiteLLM Proxy: 107ms
+
+| Metric | Direct to Fake Endpoint | Basic LiteLLM Proxy |
+|--------|------------------------|---------------------|
+| RPS | 1196 | 1133.2 |
+| Median Latency (ms) | 33 | 140 |
+
+
+## Logging Callbacks
+
+### [GCS Bucket Logging](https://docs.litellm.ai/docs/proxy/bucket)
+
+Using a GCS Bucket has **no impact on latency or RPS compared to the Basic LiteLLM Proxy**
+
+| Metric | Basic LiteLLM Proxy | LiteLLM Proxy with GCS Bucket Logging |
+|--------|------------------------|---------------------|
+| RPS | 1133.2 | 1137.3 |
+| Median Latency (ms) | 140 | 138 |
+
+
+### [LangSmith Logging](https://docs.litellm.ai/docs/proxy/logging)
+
+Using LangSmith has **no impact on latency or RPS compared to the Basic LiteLLM Proxy**
+
+| Metric | Basic LiteLLM Proxy | LiteLLM Proxy with LangSmith |
+|--------|------------------------|---------------------|
+| RPS | 1133.2 | 1135 |
+| Median Latency (ms) | 140 | 132 |
+
diff --git a/docs/my-website/docs/proxy/bucket.md b/docs/my-website/docs/proxy/bucket.md
index 3422d0371..d1b9e6076 100644
--- a/docs/my-website/docs/proxy/bucket.md
+++ b/docs/my-website/docs/proxy/bucket.md
@@ -9,7 +9,7 @@ LiteLLM Supports Logging to the following Cloud Buckets
 - (Enterprise) ✨ [Google Cloud Storage Buckets](#logging-proxy-inputoutput-to-google-cloud-storage-buckets)
 - (Free OSS) [Amazon s3 Buckets](#logging-proxy-inputoutput---s3-buckets)
 
-## Logging Proxy Input/Output to Google Cloud Storage Buckets
+## Google Cloud Storage Buckets
 
 Log LLM Logs to [Google Cloud Storage Buckets](https://cloud.google.com/storage?hl=en)
 
@@ -20,6 +20,14 @@ Log LLM Logs to [Google Cloud Storage Buckets](https://cloud.google.com/storage?
 
 :::
 
+| Property | Details |
+|----------|---------|
+| Description | Log LLM Input/Output to cloud storage buckets |
+| Load Test Benchmarks | [Benchmarks](https://docs.litellm.ai/docs/benchmarks) |
+| Google Cloud Storage Docs | [Google Cloud Storage](https://cloud.google.com/storage?hl=en) |
+
+
+
 ### Usage
 
 1. Add `gcs_bucket` to LiteLLM Config.yaml
@@ -85,7 +93,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
 
 6. Save the JSON file and add the path to `GCS_PATH_SERVICE_ACCOUNT`
 
-## Logging Proxy Input/Output - s3 Buckets
+## s3 Buckets
 
 We will use the `--config` to set
 
diff --git a/docs/my-website/docs/proxy/logging.md b/docs/my-website/docs/proxy/logging.md
index 94faa7734..5867a8f23 100644
--- a/docs/my-website/docs/proxy/logging.md
+++ b/docs/my-website/docs/proxy/logging.md
@@ -107,7 +107,7 @@ class StandardLoggingModelInformation(TypedDict):
     model_map_value: Optional[ModelInfo]
 ```
 
-## Logging Proxy Input/Output - Langfuse
+## Langfuse
 
 We will use the `--config` to set `litellm.success_callback = ["langfuse"]` this will log all successfull LLM calls to langfuse. Make sure to set `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` in your environment
 
@@ -463,7 +463,7 @@ You will see `raw_request` in your Langfuse Metadata. This is the RAW CURL comma
 
-## Logging Proxy Input/Output in OpenTelemetry format
+## OpenTelemetry format
 
 :::info
 
@@ -1216,7 +1216,7 @@ litellm_settings:
 
 Start the LiteLLM Proxy and make a test request to verify the logs reached your callback API
 
-## Logging LLM IO to Langsmith
+## Langsmith
 
 1. Set `success_callback: ["langsmith"]` on litellm config.yaml
 
@@ -1261,7 +1261,7 @@ Expect to see your log on Langfuse
 
 
-## Logging LLM IO to Arize AI
+## Arize AI
 
 1. Set `success_callback: ["arize"]` on litellm config.yaml
 
@@ -1309,7 +1309,7 @@ Expect to see your log on Langfuse
 
 
-## Logging LLM IO to Langtrace
+## Langtrace
 
 1. Set `success_callback: ["langtrace"]` on litellm config.yaml
 
@@ -1351,7 +1351,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
 '
 ```
 
-## Logging LLM IO to Galileo
+## Galileo
 
 [BETA]
 
@@ -1466,7 +1466,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
 
 
-## Logging Proxy Input/Output - DataDog
+## DataDog
 
 LiteLLM Supports logging to the following Datdog Integrations:
 - `datadog` [Datadog Logs](https://docs.datadoghq.com/logs/)
@@ -1543,7 +1543,7 @@ Expected output on Datadog
 
 
-## Logging Proxy Input/Output - DynamoDB
+## DynamoDB
 
 We will use the `--config` to set
 
@@ -1669,7 +1669,7 @@ Your logs should be available on DynamoDB
 }
 ```
 
-## Logging Proxy Input/Output - Sentry
+## Sentry
 
 If api calls fail (llm/database) you can log those to Sentry:
 
@@ -1711,7 +1711,7 @@ Test Request
 litellm --test
 ```
 
-## Logging Proxy Input/Output Athina
+## Athina
 
 [Athina](https://athina.ai/) allows you to log LLM Input/Output for monitoring, analytics, and observability.
 
diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js
index 18ad940f8..1dc33f554 100644
--- a/docs/my-website/sidebars.js
+++ b/docs/my-website/sidebars.js
@@ -266,6 +266,7 @@ const sidebars = {
       type: "category",
      label: "Load Testing",
       items: [
+        "benchmarks",
        "load_test",
        "load_test_advanced",
        "load_test_sdk",
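
For readers who want to reproduce the numbers in `benchmarks.md`, here is a minimal Locust sketch matching the settings listed there (2500 users, 100 user ramp-up). The host, proxy key (`sk-1234`), and model name (`fake-openai-endpoint`) are placeholders for illustration, not values taken from this change:

```python
# locustfile.py -- minimal load-test sketch (host, key, and model name are assumed placeholders)
from locust import HttpUser, task, between


class ProxyChatUser(HttpUser):
    # Short think time between requests for each simulated user
    wait_time = between(0.5, 1)

    @task
    def chat_completion(self):
        # OpenAI-compatible /chat/completions call through the LiteLLM Proxy
        self.client.post(
            "/chat/completions",
            json={
                "model": "fake-openai-endpoint",  # placeholder model name
                "messages": [{"role": "user", "content": "hi"}],
            },
            headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
        )
```

Run it with `locust -f locustfile.py --headless --host http://0.0.0.0:4000 --users 2500 --spawn-rate 100`; `--users` and `--spawn-rate` correspond to the "2500 Users" and "100 user ramp-up" settings above.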
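
The logging sections above verify each callback with a curl test request; a rough Python equivalent (again with placeholder host, key, and model name, since the exact values are not part of this diff) looks like this:

```python
# Hypothetical test request through the LiteLLM Proxy using the OpenAI SDK.
# Assumes the proxy is running locally and a model named "fake-openai-endpoint" is configured.
from openai import OpenAI

client = OpenAI(
    api_key="sk-1234",               # placeholder proxy key
    base_url="http://0.0.0.0:4000",  # placeholder proxy URL
)

response = client.chat.completions.create(
    model="fake-openai-endpoint",    # placeholder model name
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(response.choices[0].message.content)
```

If a callback such as `gcs_bucket` or `langsmith` is configured on the proxy, this request should appear in the corresponding logging destination.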