[Feat-Prometheus] Track exception status on litellm_deployment_failure_responses (#5706)

* add litellm_deployment_cooled_down

* track num cooldowns on prometheus

* track exception status

* fix linting

* docs prom metrics

* cleanup premium user checks

* prom track deployment failure state

* docs prometheus
Ishaan Jaff 2024-09-14 18:44:31 -07:00 committed by GitHub
parent b878a67a7c
commit c8eff2dc65
6 changed files with 171 additions and 130 deletions

@@ -53,7 +53,7 @@ from litellm.router_utils.client_initalization_utils import (
     should_initialize_sync_client,
 )
 from litellm.router_utils.cooldown_cache import CooldownCache
-from litellm.router_utils.cooldown_callbacks import router_cooldown_handler
+from litellm.router_utils.cooldown_callbacks import router_cooldown_event_callback
 from litellm.router_utils.cooldown_handlers import (
     DEFAULT_COOLDOWN_TIME_SECONDS,
     _async_get_cooldown_deployments,