Commit graph

1610 commits

Author SHA1 Message Date
Krish Dholakia
c2e3986bbc
fix(utils.py): handle failed hf tokenizer request during calls (#8032)
* fix(utils.py): handle failed hf tokenizer request during calls

prevents proxy from failing due to bad hf tokenizer calls

* fix(utils.py): convert failure callback str to custom logger class

Fixes https://github.com/BerriAI/litellm/issues/8013

* test(test_utils.py): fix test - avoid adding mlflow dep on ci/cd

* fix: add missing env vars to test

* test: cleanup redundant test
2025-01-28 17:20:36 -08:00
Ishaan Jaff
74e332bfdd fix test test_stream_chunk_builder_openai_audio_output_usage - use direct dict comparison
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
2025-01-28 16:28:24 -08:00
Krish Dholakia
2eaa0079f2
feat(handle_jwt.py): initial commit adding custom RBAC support on jwt… (#8037)
* feat(handle_jwt.py): initial commit adding custom RBAC support on jwt auth

allows admin to define user role field and allowed roles which map to 'internal_user' on litellm

* fix(auth_checks.py): ensure user allowed to access model, when calling via personal keys

Fixes https://github.com/BerriAI/litellm/issues/8029

* feat(handle_jwt.py): support role based access with model permission control on proxy

Allows admin to just grant users roles on IDP (e.g. Azure AD/Keycloak) and user can immediately start calling models

* docs(rbac): add docs on rbac for model access control

make it clear how admin can use roles to control model access on proxy

* fix: fix linting errors

* test(test_user_api_key_auth.py): add unit testing to ensure rbac role is correctly enforced

* test(test_user_api_key_auth.py): add more testing

* test(test_users.py): add unit testing to ensure user model access is always checked for new keys

Resolves https://github.com/BerriAI/litellm/issues/8029

* test: fix unit test

* fix(dot_notation_indexing.py): fix typing to work with python 3.8
2025-01-28 16:27:06 -08:00
Ishaan Jaff
9644e197f7 deepseek api testing - deepseek is currently hanging
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
2025-01-27 22:04:36 -08:00
Ishaan Jaff
46469c6087 set timeout for deepseek testing 2025-01-27 21:25:28 -08:00
Steve Farthing
fe0f9213af Bing Search Pass Thru 2025-01-27 08:58:04 -05:00
Krish Dholakia
6bafdbc546
Litellm dev 01 25 2025 p4 (#8006)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 34s
* feat(main.py): use asyncio.sleep for mock_Timeout=true on async request

adds unit testing to ensure proxy does not fail if specific Openai requests hang (e.g. recent o1 outage)

* fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming

Fixes https://github.com/BerriAI/litellm/issues/7942

* Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming"

This reverts commit 7a052a64e3.

* fix(deepseek-r-1): return reasoning_content as a top-level param

ensures compatibility with existing tools that use it

* fix: fix linting error
2025-01-26 08:01:05 -08:00
Krish Dholakia
03eef5a2a0
Fix custom pricing - separate provider info from model info (#7990)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 34s
* fix(utils.py): initial commit fixing custom cost tracking

refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly

* fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None

some providers support features like vision across all models

* fix(utils.py): refactor to use _supports_factory

* test: update testing

* fix: fix linting errors

* test: fix testing
2025-01-25 21:49:28 -08:00
Ishaan Jaff
dcc3bbc264
(Fix) langfuse - setting LANGFUSE_FLUSH_INTERVAL (#8007)
* fix langfuse flush interval

* test_get_langfuse_flush_interval

* test_get_langfuse_flush_interval
2025-01-25 17:17:32 -08:00
Ishaan Jaff
d19614b8c0
(QA / testing) - Add e2e tests for key model access auth checks (#8000)
* fix _model_matches_any_wildcard_pattern_in_list

* test key model access checks

* add key_model_access_denied to ProxyErrorTypes

* update auth checks

* test_model_access_update

* test_team_model_access_patterns

* fix _team_model_access_check

* fix config used for otel testing

* test fix test_call_with_invalid_model

* fix model acces check tests

* test_team_access_groups

* test _model_matches_any_wildcard_pattern_in_list
2025-01-25 17:15:11 -08:00
Krish Dholakia
08b124aeb6
Litellm dev 01 25 2025 p2 (#8003)
* fix(base_utils.py): supported nested json schema passed in for anthropic calls

* refactor(base_utils.py): refactor ref parsing to prevent infinite loop

* test(test_openai_endpoints.py): refactor anthropic test to use bedrock

* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls

Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
2025-01-25 16:50:57 -08:00
Ishaan Jaff
a7b3c664d1
(Feat) set guardrails per team (#7993)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 35s
* _add_guardrails_from_key_or_team_metadata

* e2e test test_guardrails_with_team_controls

* add try/except on team new

* test_guardrails_with_team_controls

* test_guardrails_with_api_key_controls
2025-01-25 10:41:11 -08:00
Ishaan Jaff
669b4fc955
(Prometheus) - emit key budget metrics on startup (#8002)
* add UI_SESSION_TOKEN_TEAM_ID

* add type KeyListResponseObject

* add _list_key_helper

* _initialize_api_key_budget_metrics

* key / budget metrics

* init key budget metrics on startup

* test_initialize_api_key_budget_metrics

* fix linting

* test_list_key_helper

* test_initialize_remaining_budget_metrics_exception_handling
2025-01-25 10:37:52 -08:00
Ishaan Jaff
d9dcfccdf6
(QA / testing) - Add unit testing for key model access checks (#7999)
* fix _model_matches_any_wildcard_pattern_in_list

* fix docstring
2025-01-25 10:01:35 -08:00
Ishaan Jaff
f77882948d test_init_custom_logger_compatible_class_as_callback 2025-01-24 21:27:22 -08:00
Krish Dholakia
8ca3229b26
Ensure base_model cost tracking works across all endpoints (#7989)
* test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking

* test(test_completion_cost.py): add sdk test to ensure custom pricing works

* fix(main.py): add base model cost tracking support for embedding calls

Enables base model cost tracking for embedding calls when base model set as a litellm_param

* fix(litellm_logging.py): update logging object with litellm params - including base model, if given

ensures base model param is always tracked

* fix(main.py): fix linting errors
2025-01-24 21:05:26 -08:00
Krish Dholakia
9df6bd90ba
fix(spend_tracking_utils.py): revert api key pass through fix (#7977)
* fix(spend_tracking_utils.py): revert api key pass through fix

* fix: fix linting error

* fix(spend_tracking_utils.py): add noqa - refactor post fixing standard logging payload on pass-through endpoints

* test(test_groq.py): bump groq model

* fix: fix positioning of noqa
2025-01-24 21:04:36 -08:00
Ishaan Jaff
74caef0843
(Feat) - Add GCS Pub/Sub Logging integration for sending DB SpendLogs to BigQuery (#7976)
* add pub_sub

* fix custom batch logger for GCS PUB/SUB

* GCS_PUBSUB_PROJECT_ID

* e2e gcs pub sub

* add gcs pub sub

* fix logging

* add GcsPubSubLogger

* fix pub sub

* add pub sub

* docs gcs pub / sub

* docs on pub sub controls

* test_gcs_pub_sub

* fix publish_message

* test_async_gcs_pub_sub

* test_async_gcs_pub_sub
2025-01-24 20:57:20 -08:00
Ishaan Jaff
bf46ae7346
(Testing) e2e testing for team budget enforcement checks (#7988)
* test_team_and_key_budget_enforcement

* test_team_budget_update

* test_gemini_pro_json_schema_httpx_content_policy_error
2025-01-24 18:18:12 -08:00
Ishaan Jaff
2017596913 Revert "test_team_and_key_budget_enforcement"
This reverts commit 9d44f51847.
2025-01-24 15:32:41 -08:00
Ishaan Jaff
9d44f51847 test_team_and_key_budget_enforcement 2025-01-24 15:31:48 -08:00
Ishaan Jaff
ed283bc5b4
(Feat) - allow setting default_on guardrails (#7973)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 12s
* test_default_on_guardrail

* update debug on custom guardrail

* refactor guardrails init

* guardrail registry

* allow switching guardrails default_on

* fix circle import issue

* fix bedrock applying guardrails where content is a list

* fix unused import

* docs default on guardrail

* docs fix per api key
2025-01-24 10:14:05 -08:00
Krish Dholakia
1e011b66d3
Ollama ssl verify = False + Spend Logs reliability fixes (#7931)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param

Fixes https://github.com/BerriAI/litellm/issues/6499

* feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args

Closes https://github.com/BerriAI/litellm/issues/6499

* fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure

prevents malformed logs from causing all spend tracking to break since they're constantly retried

* test(test_proxy_utils.py): add test to ensure bad log is dropped

* test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error

* test(test_user_api_key_auth.py): add unit test to ensure end user id as str works

* fix(auth_utils.py): ensure extracted end user id is always a str

prevents db cost tracking errors

* test(test_auth_utils.py): ensure get end user id from request body always returns a string

* test: update tests

* test: skip bedrock test- behaviour now supported

* test: fix testing

* refactor(spend_tracking_utils.py): reduce size of get_logging_payload

* test: fix test

* bump: version 1.59.4 → 1.59.5

* Revert "bump: version 1.59.4 → 1.59.5"

This reverts commit 1182b46b2e.

* fix(utils.py): fix spend logs retry logic

* fix(spend_tracking_utils.py): fix get tags

* fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints
2025-01-23 23:05:41 -08:00
Krish Dholakia
c6e9240405
Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Ishaan Jaff
085920aa1c
(Feat) allow setting guardrails on a team on the API (#7959)
* allow setting guardrails on a team

* test set guardrails on team

* set guardrails on a team

* fix LiteLLM_ManagementEndpoint_MetadataFields_Premium
2025-01-23 20:26:51 -08:00
Ishaan Jaff
1719dc23c7
(Feat) - emit litellm_team_budget_reset_at_metric and litellm_api_key_budget_remaining_hours_metric on prometheus (#7946)
* set litellm_team_budget_reset_at_metric

* add _get_team_info_from_db_lru_cached

* _set_team_budget_metrics

* e2e test_team_budget_metrics

* update doc string

* add _get_remaining_hours_for_budget_reset

* fix team endpoints

* _get_remaining_hours_for_budget_reset

* _set_key_budget_metrics  on startup

* test_key_budget_metrics

* prom fixes for emitting key / team metrics

* fix _set_api_key_budget_metrics_after_api_request

* test_increment_remaining_budget_metrics

* unit test test_increment_remaining_budget_metrics

* test_initialize_remaining_budget_metrics
2025-01-23 18:12:47 -08:00
Ishaan Jaff
d15ed86e3e test_chat_completion_ratelimit add retry on test 2025-01-23 18:10:31 -08:00
Ishaan Jaff
7599c9aebb
(Testing + Refactor) - Unit testing for team and virtual key budget checks (#7945)
* unit testing for test_virtual_key_max_budget_check

* refactor _team_max_budget_check

* is_model_allowed_by_pattern
2025-01-23 16:58:16 -08:00
yujonglee
f201888f69
Merge pull request #7919 from BerriAI/litellm_refactor_e2e_prometheus
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
Refactor prometheus e2e test
2025-01-23 18:39:42 +09:00
Krish Dholakia
513b1904ab
Add attempted-retries and timeout values to response headers + more testing (#7926)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
* feat(router.py): add retry headers to response

makes it easy to add testing to ensure model-specific retries are respected

* fix(add_retry_headers.py): clarify attempted retries vs. max retries

* test(test_fallbacks.py): add test for checking if max retries set for model is respected

* test(test_fallbacks.py): assert values for attempted retries and max retries are as expected

* fix(utils.py): return timeout in litellm proxy response headers

* test(test_fallbacks.py): add test to assert model specific timeout used on timeout error

* test: add bad model with timeout to proxy

* fix: fix linting error

* fix(router.py): fix get model list from model alias

* test: loosen test restriction - account for other events on proxy
2025-01-22 22:19:44 -08:00
Ishaan Jaff
b60efd4646 fix test_async_create_batch 2025-01-22 22:15:49 -08:00
Ishaan Jaff
53a3ea3d06
(Refactor) Langfuse - remove prepare_metadata, langfuse python SDK now handles non-json serializable objects (#7925)
* test_langfuse_logging_completion_with_langfuse_metadata

* fix litellm - remove prepare metadata

* test_langfuse_logging_with_non_serializable_metadata

* detailed e2e langfuse metadata tests

* clean up langfuse logging

* fix langfuse

* remove unused imports

* fix code qa checks

* fix _prepare_metadata
2025-01-22 22:11:40 -08:00
Krish Dholakia
27560bd5ad
Litellm dev 01 22 2025 p4 (#7932)
* feat(main.py): add new 'provider_specific_header' param

allows passing extra header for specific provider

* fix(litellm_pre_call_utils.py): add unit test for pre call utils

* test(test_bedrock_completion.py): skip test now that bedrock supports this
2025-01-22 21:52:07 -08:00
Krish Dholakia
4911cd80a1
fix(utils.py): move adding custom logger callback to success event in… (#7905)
* fix(utils.py): move adding custom logger callback to success event into separate function + don't add success callback to failure event

if user is explicitly choosing 'success' callback, don't log failure as well

* test(test_utils.py): add unit test to ensure custom logger callback only adds callback to specific event

* fix(utils.py): remove string from list of callbacks once corresponding callback class is added

prevents floating values - simplifies testing

* fix(utils.py): fix linting error

* test: cleanup args before test

* test: fix test

* test: update test

* test: fix test
2025-01-22 21:49:09 -08:00
Krish Dholakia
e3bacf7196
Litellm dev 01 22 2025 p1 (#7933)
* docs(docker_quick_start.md): add more troubleshooting guides

* test(test_fallbacks.py): add e2e test for proxy with fallbacks + custom fallback message

* test(test_bedrock_completion.py): skip test now that bedrock supports this behaviour

* test(test_fireworks_ai_translation.py): mock fireworks ai test
2025-01-22 19:55:32 -08:00
Krrish Dholakia
760ba4dfd5 test: skip test - Bedrock now supports this behavior 2025-01-22 18:53:12 -08:00
Krrish Dholakia
049915c14b test: mock fireworks ai test - unstable api 2025-01-22 18:52:11 -08:00
Krrish Dholakia
55546f403b Revert "fix: fix test"
This reverts commit 4e672f6269.
2025-01-22 18:21:07 -08:00
Krrish Dholakia
4e672f6269 fix: fix test 2025-01-22 18:19:49 -08:00
Ishaan Jaff
1f4ea88228
(Testing) - Add e2e testing for langfuse logging with tags (#7922)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
* move langfuse tests

* fix test

* fix completion.json

* working test

* test completion with tags

* langfuse testing fixes

* faster logging testing

* pytest-xdist in testing

* fix langfuse testing flow

* fix testing flow

* fix config for logging tests

* fix langfuse completion with tags stream

* fix _verify_langfuse_call
2025-01-22 09:09:25 -08:00
yujonglee
84a24d8779 done 2025-01-22 20:19:31 +09:00
Krish Dholakia
76795dba39
Deepseek r1 support + watsonx qa improvements (#7907)
* fix(types/utils.py): support returning 'reasoning_content' for deepseek models

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(convert_dict_to_response.py): return deepseek response in provider_specific_field

allows for separating openai vs. non-openai params in model response

* fix(utils.py): support 'provider_specific_field' in delta chunk as well

allows deepseek reasoning content chunk to be returned to user from stream as well

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(watsonx/chat/handler.py): fix passing space id to watsonx on chat route

* fix(watsonx/): fix watsonx_text/ route with space id

* fix(watsonx/): qa item - also adds better unit testing for watsonx embedding calls

* fix(utils.py): rename to '..fields'

* fix: fix linting errors

* fix(utils.py): fix typing - don't show provider-specific field if none or empty - prevents default respons
e from being non-oai compatible

* fix: cleanup unused imports

* docs(deepseek.md): add docs for deepseek reasoning model
2025-01-21 23:13:15 -08:00
Ishaan Jaff
4978669273 litellm_overhead_latency_metric 2025-01-21 20:51:57 -08:00
Ishaan Jaff
4caf4c0277
(Feat - prometheus) - emit litellm_overhead_latency_metric (#7913)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* emit litellm_overhead_latency_metric on prometheus

* add prometheus callback

* litellm_overhead_latency_metric_bucket

* fix apply hidden params

* fix StandardLoggingHiddenParams
2025-01-21 20:36:30 -08:00
Krish Dholakia
866fffb50d
Litellm dev 01 21 2025 p1 (#7898)
* fix(utils.py): don't pass 'anthropic-beta' header to vertex - will cause request to fail

* fix(utils.py): add flag to allow user to disable filtering invalid headers

ensure user can control behaviour

* style(utils.py): cleanup message

* test(test_utils.py): add unit test to cover invalid header filtering

* fix(proxy_server.py): fix custom openapi schema generation

* fix(utils.py): pass extra headers if set

* fix(main.py): fix image variation to use 'client' param
2025-01-21 20:36:11 -08:00
Ishaan Jaff
dd385410df
(Code quality) - Ban recursive functions in codebase (#7910)
* code qa add RecursiveFunctionFinder

* test_recursive_detector

* RecursiveFunctionFinder

* fix check

* recursive_detector
2025-01-21 20:33:32 -08:00
Ishaan Jaff
b6f2e659b9
(Feat) Add x-litellm-overhead-duration-ms and "x-litellm-response-duration-ms" in response from LiteLLM (#7899)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* pply metadata to the response objec
2025-01-21 20:27:55 -08:00
Ishaan Jaff
63d7d04232
(fix langfuse tags) - read tags from StandardLoggingPayload (#7903)
* fix _get_langfuse_tags

* fix _get_langfuse_tags

* fix _get_langfuse_tags

* _get_langfuse_tags

* test_get_langfuse_tags

* fix langfuse
2025-01-21 20:26:09 -08:00
Ishaan Jaff
2a71d9e8f1
(Bug fix) - Allow setting null for max_budget, rpm_limit, tpm_limit when updating values on a team (#7912)
* fix update_team

* fix test_key_limit_modifications
2025-01-21 19:19:36 -08:00
Krish Dholakia
b81072d90c
fix: add default credential for azure (#7095) (#7891)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 11s
* fix: add default credential for azure (#7095)

* fix: fix linting error

* fix: remove redundant test

* test: skip redundant test

---------

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
2025-01-21 09:01:49 -08:00