Commit graph

1648 commits

Author SHA1 Message Date
Ishaan Jaff
669b4fc955
(Prometheus) - emit key budget metrics on startup (#8002)
* add UI_SESSION_TOKEN_TEAM_ID

* add type KeyListResponseObject

* add _list_key_helper

* _initialize_api_key_budget_metrics

* key / budget metrics

* init key budget metrics on startup

* test_initialize_api_key_budget_metrics

* fix linting

* test_list_key_helper

* test_initialize_remaining_budget_metrics_exception_handling
2025-01-25 10:37:52 -08:00
Ishaan Jaff
d9dcfccdf6
(QA / testing) - Add unit testing for key model access checks (#7999)
* fix _model_matches_any_wildcard_pattern_in_list

* fix docstring
2025-01-25 10:01:35 -08:00
Ishaan Jaff
f77882948d test_init_custom_logger_compatible_class_as_callback 2025-01-24 21:27:22 -08:00
Krish Dholakia
8ca3229b26
Ensure base_model cost tracking works across all endpoints (#7989)
* test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking

* test(test_completion_cost.py): add sdk test to ensure custom pricing works

* fix(main.py): add base model cost tracking support for embedding calls

Enables base model cost tracking for embedding calls when base model set as a litellm_param

* fix(litellm_logging.py): update logging object with litellm params - including base model, if given

ensures base model param is always tracked

* fix(main.py): fix linting errors
2025-01-24 21:05:26 -08:00
Krish Dholakia
9df6bd90ba
fix(spend_tracking_utils.py): revert api key pass through fix (#7977)
* fix(spend_tracking_utils.py): revert api key pass through fix

* fix: fix linting error

* fix(spend_tracking_utils.py): add noqa - refactor post fixing standard logging payload on pass-through endpoints

* test(test_groq.py): bump groq model

* fix: fix positioning of noqa
2025-01-24 21:04:36 -08:00
Ishaan Jaff
74caef0843
(Feat) - Add GCS Pub/Sub Logging integration for sending DB SpendLogs to BigQuery (#7976)
* add pub_sub

* fix custom batch logger for GCS PUB/SUB

* GCS_PUBSUB_PROJECT_ID

* e2e gcs pub sub

* add gcs pub sub

* fix logging

* add GcsPubSubLogger

* fix pub sub

* add pub sub

* docs gcs pub / sub

* docs on pub sub controls

* test_gcs_pub_sub

* fix publish_message

* test_async_gcs_pub_sub

* test_async_gcs_pub_sub
2025-01-24 20:57:20 -08:00
Ishaan Jaff
bf46ae7346
(Testing) e2e testing for team budget enforcement checks (#7988)
* test_team_and_key_budget_enforcement

* test_team_budget_update

* test_gemini_pro_json_schema_httpx_content_policy_error
2025-01-24 18:18:12 -08:00
Ishaan Jaff
2017596913 Revert "test_team_and_key_budget_enforcement"
This reverts commit 9d44f51847.
2025-01-24 15:32:41 -08:00
Ishaan Jaff
9d44f51847 test_team_and_key_budget_enforcement 2025-01-24 15:31:48 -08:00
Ishaan Jaff
ed283bc5b4
(Feat) - allow setting default_on guardrails (#7973)
* test_default_on_guardrail

* update debug on custom guardrail

* refactor guardrails init

* guardrail registry

* allow switching guardrails default_on

* fix circle import issue

* fix bedrock applying guardrails where content is a list

* fix unused import

* docs default on guardrail

* docs fix per api key
2025-01-24 10:14:05 -08:00
Krish Dholakia
1e011b66d3
Ollama ssl verify = False + Spend Logs reliability fixes (#7931)
* fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param

Fixes https://github.com/BerriAI/litellm/issues/6499

* feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args

Closes https://github.com/BerriAI/litellm/issues/6499

* fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure

prevents malformed logs from causing all spend tracking to break since they're constantly retried

* test(test_proxy_utils.py): add test to ensure bad log is dropped

* test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error

* test(test_user_api_key_auth.py): add unit test to ensure end user id as str works

* fix(auth_utils.py): ensure extracted end user id is always a str

prevents db cost tracking errors

* test(test_auth_utils.py): ensure get end user id from request body always returns a string

* test: update tests

* test: skip bedrock test- behaviour now supported

* test: fix testing

* refactor(spend_tracking_utils.py): reduce size of get_logging_payload

* test: fix test

* bump: version 1.59.4 → 1.59.5

* Revert "bump: version 1.59.4 → 1.59.5"

This reverts commit 1182b46b2e.

* fix(utils.py): fix spend logs retry logic

* fix(spend_tracking_utils.py): fix get tags

* fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints
2025-01-23 23:05:41 -08:00
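The spend-log reliability fix in #7931 describes dropping a bad batch and resetting the in-memory list whether the write succeeds or fails. A minimal sketch of that pattern follows; `flush_spend_logs`, `write_batch`, and the logger name are hypothetical names chosen for illustration and are not taken from the repository.

```python
import logging

logger = logging.getLogger("spend_logs_sketch")  # hypothetical logger name


def flush_spend_logs(pending: list, write_batch) -> bool:
    """Try to persist queued logs once; always empty the queue afterwards."""
    batch = list(pending)  # snapshot so the queue can be cleared safely
    try:
        write_batch(batch)
        return True
    except Exception as exc:
        # A malformed entry is dropped with its batch instead of being retried forever.
        logger.warning("dropping %d spend logs: %s", len(batch), exc)
        return False
    finally:
        # Reset regardless of success or failure, as the commit message describes.
        pending.clear()
```

The trade-off in this sketch is that valid logs in the same batch as a malformed one are lost too; the commit message does not say how the real code handles that case.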
Krish Dholakia
c6e9240405
Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Ishaan Jaff
085920aa1c
(Feat) allow setting guardrails on a team on the API (#7959)
* allow setting guardrails on a team

* test set guardrails on team

* set guardrails on a team

* fix LiteLLM_ManagementEndpoint_MetadataFields_Premium
2025-01-23 20:26:51 -08:00
Ishaan Jaff
1719dc23c7
(Feat) - emit litellm_team_budget_reset_at_metric and litellm_api_key_budget_remaining_hours_metric on prometheus (#7946)
* set litellm_team_budget_reset_at_metric

* add _get_team_info_from_db_lru_cached

* _set_team_budget_metrics

* e2e test_team_budget_metrics

* update doc string

* add _get_remaining_hours_for_budget_reset

* fix team endpoints

* _get_remaining_hours_for_budget_reset

* _set_key_budget_metrics on startup

* test_key_budget_metrics

* prom fixes for emitting key / team metrics

* fix _set_api_key_budget_metrics_after_api_request

* test_increment_remaining_budget_metrics

* unit test test_increment_remaining_budget_metrics

* test_initialize_remaining_budget_metrics
2025-01-23 18:12:47 -08:00
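Commit #7946 above mentions a `_get_remaining_hours_for_budget_reset` helper behind the `litellm_api_key_budget_remaining_hours_metric` gauge. The calculation could look roughly like the sketch below. The function name, its signature, and the clamp to zero for past reset times are assumptions for illustration, not the repository's code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def remaining_hours_until_reset(
    budget_reset_at: datetime, now: Optional[datetime] = None
) -> float:
    """Hours left until the budget resets (hypothetical helper).

    Assumes timezone-aware datetimes; a reset time in the past yields 0.0.
    """
    now = now or datetime.now(timezone.utc)
    hours = (budget_reset_at - now).total_seconds() / 3600
    return max(hours, 0.0)


# Usage: a reset scheduled 36 hours from a fixed reference time.
reference = datetime(2025, 1, 23, tzinfo=timezone.utc)
hours_left = remaining_hours_until_reset(reference + timedelta(hours=36), reference)
```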
Ishaan Jaff
d15ed86e3e test_chat_completion_ratelimit add retry on test 2025-01-23 18:10:31 -08:00
Ishaan Jaff
7599c9aebb
(Testing + Refactor) - Unit testing for team and virtual key budget checks (#7945)
* unit testing for test_virtual_key_max_budget_check

* refactor _team_max_budget_check

* is_model_allowed_by_pattern
2025-01-23 16:58:16 -08:00
yujonglee
f201888f69
Merge pull request #7919 from BerriAI/litellm_refactor_e2e_prometheus
Refactor prometheus e2e test
2025-01-23 18:39:42 +09:00
Krish Dholakia
513b1904ab
Add attempted-retries and timeout values to response headers + more testing (#7926)
* feat(router.py): add retry headers to response

makes it easy to add testing to ensure model-specific retries are respected

* fix(add_retry_headers.py): clarify attempted retries vs. max retries

* test(test_fallbacks.py): add test for checking if max retries set for model is respected

* test(test_fallbacks.py): assert values for attempted retries and max retries are as expected

* fix(utils.py): return timeout in litellm proxy response headers

* test(test_fallbacks.py): add test to assert model specific timeout used on timeout error

* test: add bad model with timeout to proxy

* fix: fix linting error

* fix(router.py): fix get model list from model alias

* test: loosen test restriction - account for other events on proxy
2025-01-22 22:19:44 -08:00
Ishaan Jaff
b60efd4646 fix test_async_create_batch 2025-01-22 22:15:49 -08:00
Ishaan Jaff
53a3ea3d06
(Refactor) Langfuse - remove prepare_metadata, langfuse python SDK now handles non-json serializable objects (#7925)
* test_langfuse_logging_completion_with_langfuse_metadata

* fix litellm - remove prepare metadata

* test_langfuse_logging_with_non_serializable_metadata

* detailed e2e langfuse metadata tests

* clean up langfuse logging

* fix langfuse

* remove unused imports

* fix code qa checks

* fix _prepare_metadata
2025-01-22 22:11:40 -08:00
Krish Dholakia
27560bd5ad
Litellm dev 01 22 2025 p4 (#7932)
* feat(main.py): add new 'provider_specific_header' param

allows passing extra header for specific provider

* fix(litellm_pre_call_utils.py): add unit test for pre call utils

* test(test_bedrock_completion.py): skip test now that bedrock supports this
2025-01-22 21:52:07 -08:00
Krish Dholakia
4911cd80a1
fix(utils.py): move adding custom logger callback to success event in… (#7905)
* fix(utils.py): move adding custom logger callback to success event into separate function + don't add success callback to failure event

if user is explicitly choosing 'success' callback, don't log failure as well

* test(test_utils.py): add unit test to ensure custom logger callback only adds callback to specific event

* fix(utils.py): remove string from list of callbacks once corresponding callback class is added

prevents floating values - simplifies testing

* fix(utils.py): fix linting error

* test: cleanup args before test

* test: fix test

* test: update test

* test: fix test
2025-01-22 21:49:09 -08:00
Krish Dholakia
e3bacf7196
Litellm dev 01 22 2025 p1 (#7933)
* docs(docker_quick_start.md): add more troubleshooting guides

* test(test_fallbacks.py): add e2e test for proxy with fallbacks + custom fallback message

* test(test_bedrock_completion.py): skip test now that bedrock supports this behaviour

* test(test_fireworks_ai_translation.py): mock fireworks ai test
2025-01-22 19:55:32 -08:00
Krrish Dholakia
760ba4dfd5 test: skip test - Bedrock now supports this behavior 2025-01-22 18:53:12 -08:00
Krrish Dholakia
049915c14b test: mock fireworks ai test - unstable api 2025-01-22 18:52:11 -08:00
Krrish Dholakia
55546f403b Revert "fix: fix test"
This reverts commit 4e672f6269.
2025-01-22 18:21:07 -08:00
Krrish Dholakia
4e672f6269 fix: fix test 2025-01-22 18:19:49 -08:00
Ishaan Jaff
1f4ea88228
(Testing) - Add e2e testing for langfuse logging with tags (#7922)
* move langfuse tests

* fix test

* fix completion.json

* working test

* test completion with tags

* langfuse testing fixes

* faster logging testing

* pytest-xdist in testing

* fix langfuse testing flow

* fix testing flow

* fix config for logging tests

* fix langfuse completion with tags stream

* fix _verify_langfuse_call
2025-01-22 09:09:25 -08:00
yujonglee
84a24d8779 done 2025-01-22 20:19:31 +09:00
Krish Dholakia
76795dba39
Deepseek r1 support + watsonx qa improvements (#7907)
* fix(types/utils.py): support returning 'reasoning_content' for deepseek models

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(convert_dict_to_response.py): return deepseek response in provider_specific_field

allows for separating openai vs. non-openai params in model response

* fix(utils.py): support 'provider_specific_field' in delta chunk as well

allows deepseek reasoning content chunk to be returned to user from stream as well

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(watsonx/chat/handler.py): fix passing space id to watsonx on chat route

* fix(watsonx/): fix watsonx_text/ route with space id

* fix(watsonx/): qa item - also adds better unit testing for watsonx embedding calls

* fix(utils.py): rename to '..fields'

* fix: fix linting errors

* fix(utils.py): fix typing - don't show provider-specific field if none or empty - prevents default response from being non-oai compatible

* fix: cleanup unused imports

* docs(deepseek.md): add docs for deepseek reasoning model
2025-01-21 23:13:15 -08:00
Ishaan Jaff
4978669273 litellm_overhead_latency_metric 2025-01-21 20:51:57 -08:00
Ishaan Jaff
4caf4c0277
(Feat - prometheus) - emit litellm_overhead_latency_metric (#7913)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* emit litellm_overhead_latency_metric on prometheus

* add prometheus callback

* litellm_overhead_latency_metric_bucket

* fix apply hidden params

* fix StandardLoggingHiddenParams
2025-01-21 20:36:30 -08:00
Krish Dholakia
866fffb50d
Litellm dev 01 21 2025 p1 (#7898)
* fix(utils.py): don't pass 'anthropic-beta' header to vertex - will cause request to fail

* fix(utils.py): add flag to allow user to disable filtering invalid headers

ensure user can control behaviour

* style(utils.py): cleanup message

* test(test_utils.py): add unit test to cover invalid header filtering

* fix(proxy_server.py): fix custom openapi schema generation

* fix(utils.py): pass extra headers if set

* fix(main.py): fix image variation to use 'client' param
2025-01-21 20:36:11 -08:00
Ishaan Jaff
dd385410df
(Code quality) - Ban recursive functions in codebase (#7910)
* code qa add RecursiveFunctionFinder

* test_recursive_detector

* RecursiveFunctionFinder

* fix check

* recursive_detector
2025-01-21 20:33:32 -08:00
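The recursion ban in #7910 above names a `RecursiveFunctionFinder` code-quality check. One way such a check could work is sketched below with Python's standard `ast` module. The class `DirectRecursionFinder` and the function `find_direct_recursion` are hypothetical and not the repository's implementation. The sketch only detects direct self-calls (`f()` inside `f`, or `self.f()` inside method `f`) and does not detect mutual recursion.

```python
import ast


class DirectRecursionFinder(ast.NodeVisitor):
    """Collect names of functions that call themselves directly (hypothetical)."""

    def __init__(self) -> None:
        self.recursive: list[str] = []
        self._stack: list[str] = []  # names of the enclosing function definitions

    def visit_FunctionDef(self, node) -> None:
        self._stack.append(node.name)
        self.generic_visit(node)
        self._stack.pop()

    # Async functions are handled the same way.
    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Call(self, node: ast.Call) -> None:
        func = node.func
        called = None
        if isinstance(func, ast.Name):
            called = func.id  # plain call: f(...)
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "self"
        ):
            called = func.attr  # method call: self.f(...)
        if self._stack and called == self._stack[-1] and called not in self.recursive:
            self.recursive.append(called)
        self.generic_visit(node)


def find_direct_recursion(source: str) -> list[str]:
    """Return the names of directly recursive functions found in `source`."""
    finder = DirectRecursionFinder()
    finder.visit(ast.parse(source))
    return finder.recursive
```

A check like this could run over every source file in CI and fail the build when the returned list is non-empty.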
Ishaan Jaff
b6f2e659b9
(Feat) Add x-litellm-overhead-duration-ms and "x-litellm-response-duration-ms" in response from LiteLLM (#7899)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* apply metadata to the response object
2025-01-21 20:27:55 -08:00
Ishaan Jaff
63d7d04232
(fix langfuse tags) - read tags from StandardLoggingPayload (#7903)
* fix _get_langfuse_tags

* fix _get_langfuse_tags

* fix _get_langfuse_tags

* _get_langfuse_tags

* test_get_langfuse_tags

* fix langfuse
2025-01-21 20:26:09 -08:00
Ishaan Jaff
2a71d9e8f1
(Bug fix) - Allow setting null for max_budget, rpm_limit, tpm_limit when updating values on a team (#7912)
* fix update_team

* fix test_key_limit_modifications
2025-01-21 19:19:36 -08:00
Krish Dholakia
b81072d90c
fix: add default credential for azure (#7095) (#7891)
* fix: add default credential for azure (#7095)

* fix: fix linting error

* fix: remove redundant test

* test: skip redundant test

---------

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
2025-01-21 09:01:49 -08:00
Krish Dholakia
c8aa876785
fix(proxy_server.py): fix get model info when litellm_model_id is set + move model analytics to free (#7886)
* fix(proxy_server.py): fix get model info when litellm_model_id is set

Fixes https://github.com/BerriAI/litellm/issues/7873

* test(test_models.py): add test to ensure get model info on specific deployment has same value as all model info

Fixes https://github.com/BerriAI/litellm/issues/7873

* fix(usage.tsx): make model analytics free

Fixes @iqballx's feedback

* fix(fix(invoke_handler.py):-fix-bedrock-error-chunk-parsing): return correct bedrock status code and error message if chunk in stream

Improves bedrock stream error handling

* fix(proxy_server.py): fix linting errors

* test(test_auth_checks.py): remove redundant test

* fix(proxy_server.py): fix linting errors

* test: fix flaky test

* test: fix test
2025-01-21 08:19:07 -08:00
Ishaan Jaff
0295f494b6
(e2e testing + minor refactor) - Virtual Key Max budget check (#7888)
* use helper _virtual_key_max_budget_check

* e2e testing for budget exceeded errors

* e2e budget testing

* test_chat_completion_budget_update

* test_chat_completion_high_budget
2025-01-21 06:47:26 -08:00
Krish Dholakia
64e1df1f14
Litellm dev 01 20 2025 p3 (#7890)
* fix(router.py): pass stream timeout correctly for non openai / azure models

Fixes https://github.com/BerriAI/litellm/issues/7870

* test(test_router_timeout.py): add test for streaming

* test(test_router_timeout.py): add unit testing for new router functions

* docs(ollama.md): link to section on calling ollama within docker container

* test: remove redundant test

* test: fix test to include timeout value

* docs(config_settings.md): document new router settings param
2025-01-20 21:46:36 -08:00
Krish Dholakia
4b23420a20
Litellm dev 01 20 2025 p1 (#7884)
* fix(initial-test-to-return-api-timeout-value-in-openai-timeout-exception): Makes it easier for user to debug why request timed out

* feat(openai.py): return timeout value + time taken on openai timeout errors

helps debug timeout errors

* fix(utils.py): fix num retries extraction logic when num_retries = 0

* fix(config_settings.md): litellm_logging.py

support printing payload to console if 'LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD' is true

Enables easier debugging

* test(test_auth_checks.py): remove common checks userapikeyauth enforcement check

* fix(litellm_logging.py): fix linting error
2025-01-20 21:45:48 -08:00
Ishaan Jaff
806df5d31c
(Feat) datadog_llm_observability callback - emit request_tags on logs (#7883)
* dd - emit tags on llm obs payload

* dd - show requester tags on traces

* test_get_datadog_tags

* _get_datadog_tags

* fix dd POD_NAME

* test_get_datadog_tags
2025-01-20 20:36:27 -08:00
Krish Dholakia
4b88635372
fix(fireworks_ai/): fix global disable flag with transform messages helper (#7847)
fixes issue where .get() = none was preventing global disable flag from being picked up
2025-01-20 20:16:11 -08:00
Krish Dholakia
dca6904937
JWT Auth - enforce_rbac support + UI team view, spend calc fix (#7863)
* fix(user_dashboard.tsx): fix spend calculation when team selected

sum all team keys, not user keys

* docs(admin_ui_sso.md): fix docs tabbing

* feat(user_api_key_auth.py): introduce new 'enforce_rbac' param on jwt auth

allows proxy admin to prevent any unmapped yet authenticated jwt tokens from calling proxy

Fixes https://github.com/BerriAI/litellm/issues/6793

* test: more unit testing + refactoring

* fix: fix returning id when obj not found in db

* fix(user_api_key_auth.py): add end user id tracking from jwt auth

* docs(token_auth.md): add doc on rbac with JWTs

* fix: fix unused params

* test: remove old test
2025-01-19 21:28:55 -08:00
Krish Dholakia
c306c2e0fc
Auth checks on invalid fallback models (#7871)
* fix(user_api_key_auth.py): handle clientside fallback model when item in list is dictionary

* fix(auth_checks.py): help user find invalid model names during dev

Ensure fallbacks work in prod

* fix(user_api_key_auth.py): fix linting check

* fix: cleanup unused variables

* fix: fix import

* fix(auth_checks.py): fix auth check
2025-01-19 21:28:10 -08:00
Krish Dholakia
3a7b13efa2
feat(health_check.py): set upperbound for api when making health check call (#7865)
* feat(health_check.py): set upperbound for api when making health check call

prevents a bad model from hanging the health check and causing pod restarts

* fix(health_check.py): cleanup task once completed

* fix(constants.py): bump default health check timeout to 1min

* docs(health.md): add 'health_check_timeout' to health docs on litellm

* build(proxy_server_config.yaml): add bad model to health check
2025-01-18 19:47:43 -08:00
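The upper bound on health check calls described in #7865 above can be sketched with `asyncio.wait_for`. The function `run_health_check` and the shape of its result are hypothetical. The 60-second default mirrors the "1min" mentioned in the commit message, and the real handler may be structured differently.

```python
import asyncio


async def run_health_check(check, timeout_s: float = 60.0) -> dict:
    """Await `check()` but give up after `timeout_s` seconds (hypothetical helper)."""
    try:
        # wait_for cancels the underlying task when the timeout is hit,
        # so a hung model call cannot block the health endpoint indefinitely.
        await asyncio.wait_for(check(), timeout=timeout_s)
        return {"status": "healthy"}
    except asyncio.TimeoutError:
        return {
            "status": "unhealthy",
            "error": f"health check exceeded {timeout_s}s",
        }


async def _hung_model_call() -> None:
    # Stand-in for a model that never answers in time.
    await asyncio.sleep(1)


# Usage: a short timeout turns the hung call into an "unhealthy" result.
result = asyncio.run(run_health_check(_hung_model_call, timeout_s=0.01))
```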
Krish Dholakia
e67f18b153
LiteLLM Minor Fixes & Improvements (01/18/2025) - p1 (#7857)
* OllamaChatConfig supports JSON schema response format in optional parameters (#7832)

* fix(types/router.py): handle none values for bool types

Fixes https://github.com/BerriAI/litellm/issues/7855#issuecomment-2599781974

* test: handle no hf token in env

---------

Co-authored-by: trislaz <35226192+trislaz@users.noreply.github.com>
2025-01-18 19:03:50 -08:00
Ishaan Jaff
2fdbcca9ae e2e ui testing fixes 2025-01-18 07:46:55 -08:00
Krish Dholakia
1bea338597
LiteLLM Minor Fixes & Improvements (2024/16/01) (#7826)
* fix(lm_studio/chat/transformation.py): Fix https://github.com/BerriAI/litellm/issues/7811

* fix(router.py): fix mock timeout check

* fix: drop model name from fallback args since it causes a conflict with the model=model that is provided later on. (#7806)

This error happens if you provide multiple fallback models to the completion function with model name defined in each one.

* fix(router.py): remove mock_timeout before sending to request

prevents reuse in fallbacks

* test: update test

* test: revert test change - wrong pr

---------

Co-authored-by: Dudu Lasry <david1542@users.noreply.github.com>
2025-01-17 20:59:21 -08:00