diff --git a/docs/my-website/docs/enterprise.md b/docs/my-website/docs/enterprise.md
index e3758266a..5bd09ec15 100644
--- a/docs/my-website/docs/enterprise.md
+++ b/docs/my-website/docs/enterprise.md
@@ -20,6 +20,8 @@ This covers:
     - **Spend Tracking**
         - ✅ [Tracking Spend for Custom Tags](./proxy/enterprise#tracking-spend-for-custom-tags)
         - ✅ [API Endpoints to get Spend Reports per Team, API Key, Customer](./proxy/cost_tracking.md#✨-enterprise-api-endpoints-to-get-spend)
+    - **Advanced Metrics**
+        - ✅ [`x-ratelimit-remaining-requests`, `x-ratelimit-remaining-tokens` for LLM APIs on Prometheus](./proxy/prometheus#✨-enterprise-llm-remaining-requests-and-remaining-tokens)
     - **Guardrails, PII Masking, Content Moderation**
         - ✅ [Content Moderation with LLM Guard, LlamaGuard, Secret Detection, Google Text Moderations](./proxy/enterprise#content-moderation)
         - ✅ [Prompt Injection Detection (with LakeraAI API)](./proxy/enterprise#prompt-injection-detection---lakeraai)
diff --git a/docs/my-website/docs/proxy/enterprise.md b/docs/my-website/docs/proxy/enterprise.md
index e061a917e..5dabba5ed 100644
--- a/docs/my-website/docs/proxy/enterprise.md
+++ b/docs/my-website/docs/proxy/enterprise.md
@@ -23,6 +23,8 @@ Features:
 - **Spend Tracking**
     - ✅ [Tracking Spend for Custom Tags](#tracking-spend-for-custom-tags)
     - ✅ [API Endpoints to get Spend Reports per Team, API Key, Customer](cost_tracking.md#✨-enterprise-api-endpoints-to-get-spend)
+- **Advanced Metrics**
+    - ✅ [`x-ratelimit-remaining-requests`, `x-ratelimit-remaining-tokens` for LLM APIs on Prometheus](prometheus#✨-enterprise-llm-remaining-requests-and-remaining-tokens)
 - **Guardrails, PII Masking, Content Moderation**
     - ✅ [Content Moderation with LLM Guard, LlamaGuard, Secret Detection, Google Text Moderations](#content-moderation)
     - ✅ [Prompt Injection Detection (with LakeraAI API)](#prompt-injection-detection---lakeraai)
diff --git a/docs/my-website/docs/proxy/prometheus.md b/docs/my-website/docs/proxy/prometheus.md
index 2c7481f4c..974a081a9 100644
--- a/docs/my-website/docs/proxy/prometheus.md
+++ b/docs/my-website/docs/proxy/prometheus.md
@@ -61,6 +61,32 @@ http://localhost:4000/metrics
 | `litellm_remaining_api_key_budget_metric` | Remaining Budget for API Key (A key Created on LiteLLM)|
 
 
+### ✨ (Enterprise) LLM Remaining Requests and Remaining Tokens
+Set this in your config.yaml to track how close you are to hitting your TPM / RPM limits on each model group.
+
+```yaml
+litellm_settings:
+  success_callback: ["prometheus"]
+  failure_callback: ["prometheus"]
+  return_response_headers: true # ensures the LLM API calls track the response headers
+```
+
+| Metric Name | Description |
+|----------------------|--------------------------------------|
+| `litellm_remaining_requests_metric` | Track `x-ratelimit-remaining-requests` returned from LLM API Deployment |
+| `litellm_remaining_tokens` | Track `x-ratelimit-remaining-tokens` returned from LLM API Deployment |
+
+Example Metric
+```shell
+litellm_remaining_tokens
+{
+  api_base="https://api.openai.com/v1",
+  api_provider="openai",
+  litellm_model_name="gpt-3.5-turbo",
+  model_group="gpt-3.5-turbo"
+}
+999981.0
+```
 ## Monitor System Health
 
 To monitor the health of litellm adjacent services (redis / postgres), do: