diff --git a/docs/my-website/docs/proxy/db_deadlocks.md b/docs/my-website/docs/proxy/db_deadlocks.md
index e649bdccc0..332374995d 100644
--- a/docs/my-website/docs/proxy/db_deadlocks.md
+++ b/docs/my-website/docs/proxy/db_deadlocks.md
@@ -71,4 +71,16 @@ litellm_settings:
     supported_call_types: [] # Optional: Set cache for proxy, but not on the actual llm api call
 ```
 
+## Monitoring
+
+LiteLLM emits the following Prometheus metrics for monitoring the health of the in-memory buffer and the Redis buffer.
+
+
+| Metric Name | Description | Storage Type |
+|-----------------------------------------------------|-----------------------------------------------------------------------------|--------------|
+| `litellm_pod_lock_manager_size` | Indicates which pod has the lock to write updates to the database. | Redis |
+| `litellm_in_memory_daily_spend_update_queue_size` | Number of items in the in-memory daily spend update queue. These are the aggregate spend logs for each user. | In-Memory |
+| `litellm_redis_daily_spend_update_queue_size` | Number of items in the Redis daily spend update queue. These are the aggregate spend logs for each user. | Redis |
+| `litellm_in_memory_spend_update_queue_size` | In-memory aggregate spend values for keys, users, teams, team members, etc. | In-Memory |
+| `litellm_redis_spend_update_queue_size` | Redis aggregate spend values for keys, users, teams, etc. | Redis |
diff --git a/docs/my-website/docs/proxy/prometheus.md b/docs/my-website/docs/proxy/prometheus.md
index 8dff527ae5..220a3c2c12 100644
--- a/docs/my-website/docs/proxy/prometheus.md
+++ b/docs/my-website/docs/proxy/prometheus.md
@@ -242,6 +242,19 @@ litellm_settings:
 | `litellm_redis_fails` | Number of failed redis calls |
 | `litellm_self_latency` | Histogram latency for successful litellm api call |
 
+#### DB Transaction Queue Health Metrics
+
+Use these metrics to monitor the health of the DB Transaction Queue, e.g. the size of the in-memory and Redis buffers.
+
+| Metric Name | Description | Storage Type |
+|-----------------------------------------------------|-----------------------------------------------------------------------------|--------------|
+| `litellm_pod_lock_manager_size` | Indicates which pod has the lock to write updates to the database. | Redis |
+| `litellm_in_memory_daily_spend_update_queue_size` | Number of items in the in-memory daily spend update queue. These are the aggregate spend logs for each user. | In-Memory |
+| `litellm_redis_daily_spend_update_queue_size` | Number of items in the Redis daily spend update queue. These are the aggregate spend logs for each user. | Redis |
+| `litellm_in_memory_spend_update_queue_size` | In-memory aggregate spend values for keys, users, teams, team members, etc. | In-Memory |
+| `litellm_redis_spend_update_queue_size` | Redis aggregate spend values for keys, users, teams, etc. | Redis |
+
+
 ## **🔥 LiteLLM Maintained Grafana Dashboards **
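
Note on using the new metrics: the queue-size gauges documented in this diff could be checked with a small script like the one below. This is only a minimal sketch, not part of the change. It assumes you already have the proxy's metrics output as Prometheus text-format exposition, however you obtain it. The helper names (`parse_samples`, `oversized_queues`), the threshold, and the sample values are all hypothetical, and the parser is deliberately simplified (labels are dropped and the largest value per metric name is kept).

```python
# Hypothetical helper: scan Prometheus text-format output for the queue-size
# gauges named in the diff and report any that exceed a chosen threshold.

QUEUE_METRICS = (
    "litellm_in_memory_daily_spend_update_queue_size",
    "litellm_redis_daily_spend_update_queue_size",
    "litellm_in_memory_spend_update_queue_size",
    "litellm_redis_spend_update_queue_size",
)


def parse_samples(exposition_text):
    """Return {metric_name: value}, dropping labels and keeping the max per name."""
    samples = {}
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name = line.split("{", 1)[0]
            rest = line.rsplit("}", 1)[1]
        else:
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            value = float(fields[0])
        except ValueError:
            continue  # ignore lines this simplified parser cannot read
        samples[name] = max(value, samples.get(name, value))
    return samples


def oversized_queues(exposition_text, threshold):
    """Return the queue metrics whose value is strictly above `threshold`."""
    samples = parse_samples(exposition_text)
    return {
        name: samples[name]
        for name in QUEUE_METRICS
        if samples.get(name, 0.0) > threshold
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = """\
# TYPE litellm_redis_spend_update_queue_size gauge
litellm_redis_spend_update_queue_size 12.0
litellm_in_memory_spend_update_queue_size 3.0
"""
    print(oversized_queues(sample, threshold=10))
```

A persistently growing queue size would suggest that updates are being buffered faster than they are flushed to the database; what threshold is appropriate depends on the deployment and is not specified by the docs in this diff.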