51 commits

Author SHA1 Message Date
Ishaan Jaff
97aeacc1fa (feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus (#5968)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus
2024-09-28 19:00:21 -07:00
Ishaan Jaff
c68cfce912 track api key and alias in remaining tokens metric (#5924) 2024-09-26 18:01:03 -07:00
Ishaan Jaff
e79d4b0fad fix prometheus track input and output tokens (#5780) 2024-09-23 08:19:22 -07:00
Ishaan Jaff
41bc69e608 [Feat] Add testing for prometheus failure metrics (#5823)
* prom - show status code and class type on prom

* log exception_class name on prometheus metrics

* prometheus track error code and status

* add bad model

* add prometheus failure metric test

* remove outdated file

* fix litellm_proxy_total_requests_metric

* add prometheus metrics testing
2024-09-21 11:36:29 -07:00
Ishaan Jaff
fd6cc10922 [Feat] Add proxy level prometheus metrics (#5789)
* add Proxy Level Tracking Metrics doc

* update service logger

* prometheus - track litellm_proxy_failed_requests_metric

* use REQUESTED_MODEL

* fix prom request_data
2024-09-19 17:13:07 -07:00
Ishaan Jaff
9fff5f3da8 [Chore LiteLLM Proxy] enforce prometheus metrics as enterprise feature (#5769)
* enforce prometheus as enterprise feature

* show correct error on prometheus metric when not enterprise user

* docs prometheus metrics enforced

* fix enforcing
2024-09-18 16:28:12 -07:00
Ishaan Jaff
121a38319a [Prometheus] track requested model (#5774)
* enforce prometheus as enterprise feature

* show correct error on prometheus metric when not enterprise user

* docs prometheus metrics enforced

* track requested model on prometheus

* docs prom metrics

* fix prom tracking failures
2024-09-18 12:46:58 -07:00
Ishaan Jaff
6368c0925e [Feat-Prometheus] Track exception status on litellm_deployment_failure_responses (#5706)
* add litellm_deployment_cooled_down

* track num cooldowns on prometheus

* track exception status

* fix linting

* docs prom metrics

* cleanup premium user checks

* prom track deployment failure state

* docs prometheus
2024-09-14 18:44:31 -07:00
Ishaan Jaff
3fae5eb94e feat prometheus add metric for failure / model 2024-08-31 10:05:23 -07:00
Ishaan Jaff
fddf10eeb8 prometheus - safe update start / end time 2024-08-28 16:13:56 -07:00
Ishaan Jaff
be853d93da fix prom latency metrics 2024-08-23 06:59:19 -07:00
Ishaan Jaff
9476582fb7 update prometheus metric names 2024-08-22 14:03:00 -07:00
Ishaan Jaff
c719c375f7 track litellm_request_latency_metric 2024-08-22 13:58:10 -07:00
Ishaan Jaff
0ccb1c17f7 fix init correct prometheus metrics 2024-08-22 13:29:35 -07:00
Ishaan Jaff
db8f789318 Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}: None` in response headers
2024-08-17 12:41:16 -07:00
Ishaan Jaff
a62277a6aa feat - use common helper for getting model group 2024-08-17 10:46:04 -07:00
Ishaan Jaff
2dd098f384 show correct metric 2024-08-17 10:12:23 -07:00
Ishaan Jaff
03196742d2 add litellm-key-remaining-tokens on prometheus 2024-08-17 10:02:20 -07:00
Krrish Dholakia
2874b94fb1 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
Ishaan Jaff
38868a0a45 use litellm_ prefix for new deployment metrics 2024-08-14 09:08:14 -07:00
Ishaan Jaff
ecec37e220 doc new prometheus metrics 2024-08-10 17:13:36 -07:00
Ishaan Jaff
3ecf4db741 prometheus log_success_fallback_event 2024-08-10 14:05:18 -07:00
Ishaan Jaff
5765baa5b2 feat - track latency per llm deployment 2024-08-10 12:53:56 -07:00
Ishaan Jaff
e086479fd7 track llm_deployment_success_responses 2024-08-10 10:05:33 -07:00
Ishaan Jaff
c3c570ac7e feat - refactor prometheus metrics 2024-08-10 09:14:38 -07:00
Ishaan Jaff
408d17dfee refactor prom metrics 2024-08-09 09:02:23 -07:00
Ishaan Jaff
fc60bd07b2 show warning about prometheus moving to enterprise 2024-08-07 12:46:26 -07:00
Ishaan Jaff
27e8a89077 fix logging cool down deployment 2024-08-07 11:27:05 -07:00
Ishaan Jaff
92a38b213b allow setting outage metrics 2024-08-07 10:36:18 -07:00
Ishaan Jaff
426dcc9275 emit deployment_partial_outage on prometheus 2024-08-07 09:56:01 -07:00
Ishaan Jaff
7222791210 rename to set_llm_deployment_success_metrics 2024-08-07 09:46:08 -07:00
Ishaan Jaff
786a3f9e95 add set_remaining_tokens_requests_metric 2024-08-07 09:43:35 -07:00
Ishaan Jaff
09e22d8e33 feat - show key alias on prometheus metrics 2024-07-04 09:57:00 -07:00
Ishaan Jaff
86e3cae596 feat - prometheus log remaining headers 2024-07-01 20:00:47 -07:00
Ishaan Jaff
ee9e2ef980 feat - add remaining budget for key on prometheus 2024-06-13 14:37:02 -07:00
Ishaan Jaff
8d3c9aeea3 feat - add remaining team budget gauge 2024-06-13 14:28:25 -07:00
Bram van Meurs
3bd7ad6edf feat(prometheus): add api_team_alias to exported labels 2024-06-13 12:50:40 +02:00
Krrish Dholakia
e391e30285 refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
926b86af87 feat(bedrock_httpx.py): moves to using httpx client for bedrock cohere calls 2024-05-11 13:43:08 -07:00
Krrish Dholakia
5f93cae3ff feat(proxy_server.py): return litellm version in response headers 2024-05-08 16:00:08 -07:00
Krrish Dholakia
17bf753309 fix(prometheus.py): fix user-id get for prometheus 2024-04-24 08:08:42 -07:00
Krrish Dholakia
49852b68c1 fix(prometheus.py): add user tracking to prometheus 2024-04-22 15:14:38 -07:00
Krrish Dholakia
1ca2439eb7 fix(lowest_tpm_rpm_v2.py): use a combined tpm+rpm query in async get cache, to reduce redis client calls in high traffic 2024-04-20 16:13:11 -07:00
Krrish Dholakia
7065e4ee12 fix(caching.py): remove url parsing logic - causing redis ssl connections to fail
this reverts a change that was causing redis url w/ ssl to fail. this also adds unit testing for this scenario, to prevent future regressions
2024-04-19 14:01:13 -07:00
Krrish Dholakia
f6ac469573 fix(prometheus.py): fix metric name to be more accurate
change metric name from litellm_failed_requests_metric to litellm_llm_api_failed_requests_metric
2024-04-18 12:30:44 -07:00
Krrish Dholakia
deccde6be1 fix(utils.py): support prometheus failed call metrics 2024-04-18 12:29:15 -07:00
Ishaan Jaff
3dca830fcb fix don't log user_api_key to prometheus 2024-04-16 19:01:38 -07:00
Krrish Dholakia
866259f95f feat(prometheus_services.py): monitor health of proxy adjacent services (redis / postgres / etc.) 2024-04-13 18:15:02 -07:00
Krrish Dholakia
5fe8aa27d1 feat(prometheus.py): track team based metrics on prometheus 2024-04-03 13:43:21 -07:00
Ishaan Jaff
a23746b776 (fix) add /metrics to utils.py 2024-03-19 17:28:33 -07:00