Commit graph

929 commits

Author SHA1 Message Date
Krish Dholakia
8d3a942fbd
Litellm staging (#8270)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 15s
* fix(opik.py): cleanup

* docs(opik_integration.md): cleanup opik integration docs

* fix(redact_messages.py): fix redact messages check header logic

ensures stringified bool value in header is still asserted to true

 allows dynamic message redaction

* feat(redact_messages.py): support `x-litellm-enable-message-redaction` request header

allows dynamic message redaction
2025-02-04 22:35:48 -08:00
Krish Dholakia
8d4ad47ec3
fix(prometheus.py): fix setting key budget metrics (#8234)
* fix(prometheus.py): fix setting key budget metrics

ensures custom metadata works with key budget metric

this is a patch. root cause pr is written in a separate branch

* test: fix test
2025-02-04 19:15:50 -08:00
Krish Dholakia
c8494abdea
test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Ishaan Jaff
dcc3bbc264
(Fix) langfuse - setting LANGFUSE_FLUSH_INTERVAL (#8007)
* fix langfuse flush interval

* test_get_langfuse_flush_interval

* test_get_langfuse_flush_interval
2025-01-25 17:17:32 -08:00
Krish Dholakia
08b124aeb6
Litellm dev 01 25 2025 p2 (#8003)
* fix(base_utils.py): supported nested json schema passed in for anthropic calls

* refactor(base_utils.py): refactor ref parsing to prevent infinite loop

* test(test_openai_endpoints.py): refactor anthropic test to use bedrock

* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls

Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
2025-01-25 16:50:57 -08:00
Ishaan Jaff
669b4fc955
(Prometheus) - emit key budget metrics on startup (#8002)
* add UI_SESSION_TOKEN_TEAM_ID

* add type KeyListResponseObject

* add _list_key_helper

* _initialize_api_key_budget_metrics

* key / budget metrics

* init key budget metrics on startup

* test_initialize_api_key_budget_metrics

* fix linting

* test_list_key_helper

* test_initialize_remaining_budget_metrics_exception_handling
2025-01-25 10:37:52 -08:00
Ishaan Jaff
4db1c7a9a9 linting fix 2025-01-24 21:30:24 -08:00
Ishaan Jaff
74caef0843
(Feat) - Add GCS Pub/Sub Logging integration for sending DB SpendLogs to BigQuery (#7976)
* add pub_sub

* fix custom batch logger for GCS PUB/SUB

* GCS_PUBSUB_PROJECT_ID

* e2e gcs pub sub

* add gcs pub sub

* fix logging

* add GcsPubSubLogger

* fix pub sub

* add pub sub

* docs gcs pub / sub

* docs on pub sub controls

* test_gcs_pub_sub

* fix publish_message

* test_async_gcs_pub_sub

* test_async_gcs_pub_sub
2025-01-24 20:57:20 -08:00
Krrish Dholakia
d7f862783d fix(langsmith.py): add /api/v1 to langsmith base url
ensures it works with self hosted langsmith
2025-01-24 17:58:42 -08:00
Ishaan Jaff
ed283bc5b4
(Feat) - allow setting default_on guardrails (#7973)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 12s
* test_default_on_guardrail

* update debug on custom guardrail

* refactor guardrails init

* guardrail registry

* allow switching guardrails default_on

* fix circle import issue

* fix bedrock applying guardrails where content is a list

* fix unused import

* docs default on guardrail

* docs fix per api key
2025-01-24 10:14:05 -08:00
Krish Dholakia
c6e9240405
Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Ishaan Jaff
f251e775f6
gcs bucket dont run truncation (#7964) 2025-01-23 20:58:31 -08:00
Ishaan Jaff
1719dc23c7
(Feat) - emit litellm_team_budget_reset_at_metric and litellm_api_key_budget_remaining_hours_metric on prometheus (#7946)
* set litellm_team_budget_reset_at_metric

* add _get_team_info_from_db_lru_cached

* _set_team_budget_metrics

* e2e test_team_budget_metrics

* update doc string

* add _get_remaining_hours_for_budget_reset

* fix team endpoints

* _get_remaining_hours_for_budget_reset

* _set_key_budget_metrics  on startup

* test_key_budget_metrics

* prom fixes for emitting key / team metrics

* fix _set_api_key_budget_metrics_after_api_request

* test_increment_remaining_budget_metrics

* unit test test_increment_remaining_budget_metrics

* test_initialize_remaining_budget_metrics
2025-01-23 18:12:47 -08:00
Ishaan Jaff
554489ea18 Revert "set litellm_team_budget_reset_at_metric"
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
This reverts commit 39d959d72a.
2025-01-23 09:05:14 -08:00
Ishaan Jaff
39d959d72a set litellm_team_budget_reset_at_metric 2025-01-23 08:56:05 -08:00
Ishaan Jaff
53a3ea3d06
(Refactor) Langfuse - remove prepare_metadata, langfuse python SDK now handles non-json serializable objects (#7925)
* test_langfuse_logging_completion_with_langfuse_metadata

* fix litellm - remove prepare metadata

* test_langfuse_logging_with_non_serializable_metadata

* detailed e2e langfuse metadata tests

* clean up langfuse logging

* fix langfuse

* remove unused imports

* fix code qa checks

* fix _prepare_metadata
2025-01-22 22:11:40 -08:00
Ishaan Jaff
a57b8f6802 fix litellm_overhead_latency_metric 2025-01-21 21:33:31 -08:00
Yuki Watanabe
3f053fc99c
Update MLflow calllback and documentation (#7809)
* Update MLlfow tracer

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* doc update

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* doc update

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* image rename

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2025-01-21 20:56:48 -08:00
Ishaan Jaff
aa96c177be fix set_llm_deployment_success_metrics 2025-01-21 20:49:42 -08:00
Ishaan Jaff
4caf4c0277
(Feat - prometheus) - emit litellm_overhead_latency_metric (#7913)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* emit litellm_overhead_latency_metric on prometheus

* add prometheus callback

* litellm_overhead_latency_metric_bucket

* fix apply hidden params

* fix StandardLoggingHiddenParams
2025-01-21 20:36:30 -08:00
Ishaan Jaff
63d7d04232
(fix langfuse tags) - read tags from StandardLoggingPayload (#7903)
* fix _get_langfuse_tags

* fix _get_langfuse_tags

* fix _get_langfuse_tags

* _get_langfuse_tags

* test_get_langfuse_tags

* fix langfuse
2025-01-21 20:26:09 -08:00
Ishaan Jaff
806df5d31c
(Feat) datadog_llm_observability callback - emit request_tags on logs (#7883)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* dd - emit tags on llm obs payload

* dd  - show requester tags on traces

* test_get_datadog_tags

* _get_datadog_tags

* fix dd POD_NAME

* test_get_datadog_tags
2025-01-20 20:36:27 -08:00
Ishaan Jaff
447cf5511d fix python 3 install / usage 2025-01-18 16:04:27 -08:00
yuu341
7b3863b304
Fix: Problem with langfuse_tags when using litellm proxy with langfus… (#7825)
* Fix: Problem with langfuse_tags when using litellm proxy with langfuse integration (#7801)

* Refactor: Create method get_langfuse_tags

* Fix: Exception Handling
2025-01-18 12:39:57 -08:00
Ishaan Jaff
a489c5d95a
[fix dd llm obs] - use env vars for setting dd tags, service name (#7835)
* fix custom logger

* fix debugging dd llm obs
2025-01-17 18:57:16 -08:00
yujonglee
7584369fbe
add key and team level budget (#7831)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
2025-01-17 09:04:12 -08:00
Ishaan Jaff
2d5f8ea2c6 Revert "fix custom logger"
This reverts commit 9d2707ecfe.
2025-01-17 08:26:59 -08:00
Ishaan Jaff
9d2707ecfe fix custom logger 2025-01-17 07:39:49 -08:00
Ishaan Jaff
939e1c9b19
(datadog llm observability) - fixes + improvements for using datadog llm observability logging integration (#7824)
* dd llm obs fixes

* _ensure_string_content

* fix _get_dd_llm_obs_payload_metadata
2025-01-16 22:02:24 -08:00
Ishaan Jaff
30bb4c4cdd
(fix) BaseAWSLLM - cache IAM role credentials when used (#7775)
* fix base aws llm

* fix auth with aws role

* test aws base llm

* fix base aws llm init

* run ci/cd again

* fix get_credentials

* ci/cd run again

* _auth_with_aws_role
2025-01-14 20:16:22 -08:00
Ishaan Jaff
5fbbf47581
(Feat) prometheus - emit remaining team budget metric on proxy startup (#7777)
* fix get_paginated_teams

* use _initialize_remaining_budget_metrics

* fix prom metric

* run ci/cd again

* fix run async func

* fix _initialize_prometheus_startup_metrics

* fix _initialize_prometheus_startup_metrics

* prom unit tests

* test_get_paginated_teams
2025-01-14 20:08:23 -08:00
Ishaan Jaff
9daa6fb0b4
(prometheus - minor bug fix) - litellm_llm_api_time_to_first_token_metric not populating for bedrock models (#7740)
* fix prometheus ttft

* fix test_set_latency_metrics

* fix _set_latency_metrics

* fix _set_latency_metrics

* fix test_set_latency_metrics

* test_async_log_success_event

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 20:16:34 -08:00
Krish Dholakia
27892acdfc
Litellm dev 01 10 2025 p3 (#7682)
* feat(langfuse.py): log the used prompt when prompt management used

* test: fix test

* docs(self_serve.md): add doc on restricting personal key creation on ui

* feat(s3.py): support s3 logging with team alias prefixes (if available)

New preview feature

* fix(main.py): remove old if block - simplify to just await if coroutine returned

fixes lm_studio async embedding error

* fix(langfuse.py): handle get prompt check
2025-01-10 21:56:42 -08:00
Hugues Chocart
8576ca8ccb
feat: allow to pass custom parent run id (#7651) 2025-01-10 17:04:46 -08:00
vivek-athina
8e2653c609
Use environment variable for Athina logging URL (#7628)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* Use environment variable for Athina logging URL

* Added to docs as well

* Changed the env var name
2025-01-10 07:47:12 -08:00
Krish Dholakia
c10ae8879e
fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… (#7660)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url

* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic

deduplicates code

* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages

* docs(prompt_management.md): update prompt management to be in beta

given feedback - this still needs to be revised (e.g. passing in user message, not ignoring)

* refactor(prompt_management_base.py): introduce base class for prompt management

allows consistent behaviour across prompt management integrations

* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base

* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set

allows tracking what prompt was used for what purpose

* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse

allows logging prompt id / prompt variables to langfuse

* test: fix test

* fix(router.py): cleanup unused imports

* fix: fix linting error

* fix: fix trace param typing

* fix: fix linting errors

* fix: fix code qa check
2025-01-10 07:31:59 -08:00
Krish Dholakia
a187cee538
Litellm dev 01 07 2025 p3 (#7635)
* fix(__init__.py): fix mistral large tool calling

map bedrock mistral large to converse endpoint

Fixes https://github.com/BerriAI/litellm/issues/7521

* braintrust logging: respect project_id, add more metrics + more (#7613)

* braintrust logging: respect project_id, add more metrics

* braintrust logger: improve json formatting

* braintrust logger: add test for passing specific project_id

* rm unneeded import

* braintrust logging: rm unneeded var in tets

* add project_name

* update docs

---------

Co-authored-by: H <no@email.com>

---------

Co-authored-by: hi019 <65871571+hi019@users.noreply.github.com>
Co-authored-by: H <no@email.com>
2025-01-08 11:46:24 -08:00
Ishaan Jaff
081826a5d6
(Feat) soft budget alerts on keys (#7623)
* class WebhookEvent(CallInfo):
Add

* handle soft budget alerts

* handle soft budget

* fix budget alerts

* fix CallInfo

* fix _get_user_info_str

* test_soft_budget_alerts

* test_soft_budget_alert
2025-01-07 21:36:34 -08:00
Krish Dholakia
4e69711411
Litellm dev 01 07 2025 p1 (#7618)
* fix(main.py): pass custom llm provider on litellm logging provider update

* fix(cost_calculator.py): don't append provider name to return model if existing llm provider

Fixes https://github.com/BerriAI/litellm/issues/7607

* fix(prometheus_services.py): fix prometheus system health error logging

Fixes https://github.com/BerriAI/litellm/issues/7611
2025-01-07 21:22:31 -08:00
Krrish Dholakia
d5a288e29e docs: cleanup keys 2025-01-06 21:57:18 -08:00
Krish Dholakia
fef7839e8a
Litellm dev 01 06 2025 p1 (#7594)
* fix(custom_logger.py): expose new 'async_get_chat_completion_prompt' event hook

* fix(custom_logger.py): langfuse_prompt_management.py

remove 'headers' from custom logger 'async_get_chat_completion_prompt' and 'get_chat_completion_prompt' event hooks

* feat(router.py): expose new function for prompt management based routing

* feat(router.py): partial working router prompt factory logic

allows load balanced model to be used for model name w/ langfuse prompt management call

* feat(router.py): fix prompt management with load balanced model group

* feat(langfuse_prompt_management.py): support reading in openai params from langfuse

enables user to define optional params on langfuse vs. client code

* test(test_Router.py): add unit test for router based langfuse prompt management

* fix: fix linting errors
2025-01-06 21:26:21 -08:00
Ishaan Jaff
d1b101b9d7
(Fix) - Slack Alerting , don't send duplicate spend report when used on multi instance settings (#7546)
* fix send_weekly_spend_report

* test_spend_report_cache
2025-01-04 10:54:35 -08:00
Ishaan Jaff
1bb4941036
[Feature]: - allow print alert log to console (#7534)
* update send_to_webhook

* test_print_alerting_payload_warning

* add alerting_args spec

* test_alerting.py
2025-01-03 17:48:13 -08:00
Ishaan Jaff
9fef0a6d16
(fix) GCS bucket logger - apply truncate_standard_logging_payload_content to standard_logging_payload and ensure GCS flushes queue on fails (#7519)
* fix async_send_batch for gcs

* fix truncate GCS logger

* test_truncate_standard_logging_payload
2025-01-03 08:09:03 -08:00
Ishaan Jaff
4d93fe787b
Revert "(fix) GCS bucket logger - apply `truncate_standard_logging_payload_co…" (#7515)
This reverts commit 26a37c50c9.
2025-01-02 22:01:02 -08:00
Ishaan Jaff
26a37c50c9
(fix) GCS bucket logger - apply truncate_standard_logging_payload_content to standard_logging_payload and ensure GCS flushes queue on fails (#7500)
* use truncate_standard_logging_payload_content

* update truncate_standard_logging_payload_content

* update dd logger

* update gcs async_send_batch

* fix code check

* test_datadog_payload_content_truncation

* fix code quality
2025-01-01 20:21:01 -08:00
Krish Dholakia
07fc394072
Litellm dev 01 01 2025 p1 (#7498)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 11s
* refactor(prometheus.py): refactor to remove `_tag` metrics and incorporate in regular metrics

* fix(prometheus.py): handle label values not set in enum values

* feat(prometheus.py): working e2e custom metadata labels

* docs(prometheus.md): update docs to clarify how custom metrics would work

* test(test_prometheus_unit_tests.py): fix test

* test: add unit testing
2025-01-01 18:59:28 -08:00
Krish Dholakia
d984a9281a
Prometheus - custom metrics support + other improvements (#7489)
* fix(prometheus.py): refactor litellm_input_tokens_metric to use label factory

makes adding new metrics easier

* feat(prometheus.py): add 'request_model' to 'litellm_input_tokens_metric'

* refactor(prometheus.py): refactor 'litellm_output_tokens_metric' to use label factory

makes adding new metrics easier

* feat(prometheus.py): emit requested model in 'litellm_output_tokens_metric'

* feat(prometheus.py): support tracking success events with custom metrics

* refactor(prometheus.py): refactor '_set_latency_metrics' to just use the initially created enum values dictionary

reduces scope for missing values

* feat(prometheus.py): refactor all tags to support custom metadata tags

enables metadata tags to be used across for e2e tracking

* fix(prometheus.py): fix requested model on success event enum_values

* test: fix test

* test: fix test

* test: handle filenotfound error

* docs(prometheus.md): add new values to prometheus

* docs(prometheus.md): document adding custom metrics on prometheus

* bump: version 1.56.5 → 1.56.6
2025-01-01 07:41:50 -08:00
Ishaan Jaff
03b1db5a7d
(Feat) - Add PagerDuty Alerting Integration (#7478)
* define basic types

* fix verbose_logger.exception statement

* fix basic alerting

* test pager duty alerting

* test_pagerduty_alerting_high_failure_rate

* PagerDutyAlerting

* async_log_failure_event

* use pre_call_hook

* add _request_is_completed helper util

* update AlertingConfig

* rename PagerDutyInternalEvent

* _send_alert_if_thresholds_crossed

* use pagerduty as _custom_logger_compatible_callbacks_literal

* fix slack alerting imports

* fix imports in slack alerting

* PagerDutyAlerting

* fix _load_alerting_settings

* test_pagerduty_hanging_request_alerting

* working pager duty alerting

* fix linting

* doc pager duty alerting

* update hanging_response_handler

* fix import location

* update failure_threshold

* update async_pre_call_hook

* docs pagerduty

* test - callback_class_str_to_classType

* fix linting errors

* fix linting + testing error

* PagerDutyAlerting

* test_pagerduty_hanging_request_alerting

* fix unused imports

* docs pager duty

* @pytest.mark.flaky(retries=6, delay=2)

* test_model_info_bedrock_converse_enforcement
2025-01-01 07:12:51 -08:00
Krish Dholakia
080de89cfb
Fix team-based logging to langfuse + allow custom tokenizer on /token_counter endpoint (#7493)
* fix(langfuse_prompt_management.py): migrate dynamic logging to langfuse custom logger compatible class

* fix(langfuse_prompt_management.py): support failure callback logging to langfuse as well

* feat(proxy_server.py): support setting custom tokenizer on config.yaml

Allows customizing value for `/utils/token_counter`

* fix(proxy_server.py): fix linting errors

* test: skip if file not found

* style: cleanup unused import

* docs(configs.md): add docs on setting custom tokenizer
2024-12-31 23:18:41 -08:00