Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking ( #9893 )
...
* test: initial commit fixing gemini logprobs
Fixes https://github.com/BerriAI/litellm/issues/9888
* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change
Fixes https://github.com/BerriAI/litellm/issues/8890
* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map
Fixes https://github.com/BerriAI/litellm/issues/9814
* test: add cost calculation unit testing
* test: fix test
* test: update test
2025-04-10 21:23:55 -07:00
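A hedged sketch of the kind of entry the cost-map addition above refers to, written as a Python dict. The field names mirror common entries in model_prices_and_context_window.json, but every rate below is a placeholder, not the actual azure/gpt-4o-realtime-preview price.

```python
# Illustrative only: placeholder rates, not the real cost map entry.
hypothetical_cost_map_entry = {
    "azure/gpt-4o-realtime-preview": {
        "litellm_provider": "azure",
        "mode": "chat",
        "max_input_tokens": 128000,
        "input_cost_per_token": 5e-06,         # placeholder
        "input_cost_per_audio_token": 1e-04,   # placeholder
        "output_cost_per_token": 2e-05,        # placeholder
        "output_cost_per_audio_token": 2e-04,  # placeholder
        "supported_modalities": ["text", "audio"],
    }
}
```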
Ishaan Jaff
98e34cbf5d
[Docs] Tutorial using MSFT auto team assignment with LiteLLM ( #9898 )
...
* add default_team_params as a config.yaml setting
* create_litellm_team_from_sso_group
* test_default_team_params
* test_create_team_without_default_params
* docs default team settings
* docs msft entra id tutorial
* commit litellm docs msft group assignment
* litellm MSFT sso
* member, team assignment on litellm
* docs msft auto assignment
* bug fix default team setting
* docs litellm default team settings
* test_default_team_params
2025-04-10 20:07:55 -07:00
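A minimal sketch of the auto team assignment idea described above, assuming the proxy maps each IdP group id to a LiteLLM team and applies default_team_params to any team it has to create. Function and field names here are illustrative, not the actual proxy internals.

```python
from typing import Optional

def create_teams_from_sso_groups(
    sso_group_ids: list[str],
    existing_team_ids: set[str],
    default_team_params: Optional[dict] = None,
) -> list[dict]:
    """Create a team record for every SSO group that doesn't already have one."""
    default_team_params = default_team_params or {}
    new_teams = []
    for group_id in sso_group_ids:
        if group_id in existing_team_ids:
            continue  # team already provisioned, nothing to create
        new_teams.append({"team_id": group_id, **default_team_params})
    return new_teams

# Example: one group already exists as a team, one gets created with defaults
print(create_teams_from_sso_groups(
    ["entra-group-a", "entra-group-b"],
    existing_team_ids={"entra-group-a"},
    default_team_params={"max_budget": 100, "budget_duration": "30d"},
))
```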
Ishaan Jaff
72a12e91c4
[Bug Fix MSFT SSO] Use correct field for user email when using MSFT SSO ( #9886 )
...
* fix openid_from_response
* test_microsoft_sso_handler_openid_from_response_user_principal_name
* test upsert_sso_user
2025-04-10 17:40:58 -07:00
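The gist of the fix above, as a hedged one-liner: Microsoft Graph user objects frequently return a null `mail` field, so the email should fall back to `userPrincipalName`. The response shape and helper name below are assumptions for illustration.

```python
def email_from_msft_openid_response(response: dict) -> str | None:
    # Prefer "mail", fall back to "userPrincipalName" when mail is null/absent
    return response.get("mail") or response.get("userPrincipalName")

print(email_from_msft_openid_response(
    {"mail": None, "userPrincipalName": "jane@contoso.com"}
))  # -> jane@contoso.com
```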
Ishaan Jaff
94a553dbb2
[Feat] Emit Key, Team Budget metrics on a cron job schedule ( #9528 )
...
* _initialize_remaining_budget_metrics
* initialize_budget_metrics_cron_job
* initialize_budget_metrics_cron_job
* initialize_budget_metrics_cron_job
* test_initialize_budget_metrics_cron_job
* LITELLM_PROXY_ADMIN_NAME
* fix code qa checks
* test_initialize_budget_metrics_cron_job
* test_initialize_budget_metrics_cron_job
* pod lock manager allow dynamic cron job ID
* fix pod lock manager
* require cronjobid for PodLockManager
* fix DB_SPEND_UPDATE_JOB_NAME acquire / release lock
* add comment on prometheus logger
* add debug statements for emitting key, team budget metrics
* test_pod_lock_manager.py
* test_initialize_budget_metrics_cron_job
* initialize_budget_metrics_cron_job
* initialize_remaining_budget_metrics
* remove outdated test
2025-04-10 16:59:14 -07:00
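A minimal sketch of the pattern the commit above describes, under assumed names: a cron job emits key/team budget metrics, and a pod lock keyed by a cron-job id ensures only one proxy replica emits them per cycle. The in-memory lock stands in for the Redis/DB-backed PodLockManager.

```python
import asyncio

class InMemoryPodLock:
    """Stand-in for a Redis/DB-backed pod lock manager (illustrative only)."""
    def __init__(self) -> None:
        self._held: set[str] = set()

    async def acquire(self, cron_job_id: str) -> bool:
        if cron_job_id in self._held:
            return False
        self._held.add(cron_job_id)
        return True

    async def release(self, cron_job_id: str) -> None:
        self._held.discard(cron_job_id)

async def emit_budget_metrics_if_leader(lock: InMemoryPodLock) -> None:
    job_id = "budget_metrics_cron_job"  # dynamic cron-job id, per the commit
    if not await lock.acquire(job_id):
        return  # another pod already holds the lock this cycle
    try:
        print("emitting remaining key/team budget metrics ...")
    finally:
        await lock.release(job_id)

asyncio.run(emit_budget_metrics_if_leader(InMemoryPodLock()))
```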
Ishaan Jaff
90d862b041
[Feat SSO] - Allow admins to set default_team_params to have default params for when litellm SSO creates default teams ( #9895 )
...
* add default_team_params as a config.yaml setting
* create_litellm_team_from_sso_group
* test_default_team_params
* test_create_team_without_default_params
* docs default team settings
2025-04-10 16:58:28 -07:00
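A hedged sketch of what a default_team_params setting could look like once the proxy's config.yaml has been parsed, shown here as the equivalent Python dict. The nesting under litellm_settings and the exact keys are assumptions for illustration; check the LiteLLM docs for the documented schema.

```python
# Assumed shape only; not the documented config schema.
proxy_config = {
    "litellm_settings": {
        "default_team_params": {
            "max_budget": 100,          # applied to teams LiteLLM creates via SSO
            "budget_duration": "30d",
            "models": ["gpt-4o-mini"],  # models newly created teams may call
        }
    }
}

print(proxy_config["litellm_settings"]["default_team_params"])
```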
Krrish Dholakia
7d383fc0c1
test: update testing
2025-04-10 14:15:58 -07:00
Krish Dholakia
0dbd663877
fix(cost_calculator.py): handle custom pricing at deployment level fo… ( #9855 )
...
* fix(cost_calculator.py): handle custom pricing at deployment level for router
* test: add unit tests
* fix(router.py): show custom pricing on UI
check correct model str
* fix: fix linting error
* docs(custom_pricing.md): clarify custom pricing for proxy
Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740
* test: update code qa test
* fix: cleanup traceback
* fix: handle litellm param custom pricing
* test: update test
* fix(cost_calculator.py): add router model id to list of potential model names
* fix(cost_calculator.py): fix router model id check
* fix: router.py - maintain older model registry approach
* fix: fix ruff check
* fix(router.py): router get deployment info
add custom values to mapped dict
* test: update test
* fix(utils.py): update only if value is non-null
* test: add unit test
2025-04-09 22:13:10 -07:00
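A hedged usage sketch of deployment-level custom pricing with the Router: per-token rates attached to a single deployment via litellm_params, which is what the cost-calculator change above has to pick up. The key and rates are placeholders.

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "my-gpt-4o",
            "litellm_params": {
                "model": "openai/gpt-4o",
                "api_key": "sk-placeholder",
                # custom pricing for this specific deployment (placeholder rates)
                "input_cost_per_token": 2.5e-06,
                "output_cost_per_token": 1e-05,
            },
        }
    ]
)
```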
Krish Dholakia
0c5b4aa96d
feat(realtime/): add token tracking + log usage object in spend logs … ( #9843 )
...
* feat(realtime/): add token tracking + log usage object in spend logs metadata
* test: fix test
* test: update tests
* test: update testing
* test: update test
* test: update test
* test: update test
* test: update test
* test: update test
* test: update test
2025-04-09 22:11:00 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db ( #9838 )
...
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db
Fixes https://github.com/BerriAI/litellm/issues/9732
* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens
Fixes https://github.com/BerriAI/litellm/issues/9812
* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming
reduce errors
* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens
* build: remove redisvl from requirements.txt (temporary)
* fix(spend_tracking_utils.py): handle circular references
* test: update code cov test
* test: update test
2025-04-09 21:26:43 -07:00
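A hedged arithmetic sketch of the double-counting problem the commit above fixes: Anthropic reports input_tokens, cache_creation_input_tokens, and cache_read_input_tokens as separate buckets, so each bucket should be counted exactly once. The accounting convention shown is an illustration, not LiteLLM's internal code.

```python
usage = {
    "input_tokens": 100,                  # non-cached prompt tokens
    "cache_creation_input_tokens": 2000,  # tokens written to the prompt cache
    "cache_read_input_tokens": 0,
    "output_tokens": 50,
}

# Count each bucket once; do not add cache_creation_input_tokens again on top
# of a total that already includes it.
prompt_tokens = (
    usage["input_tokens"]
    + usage["cache_creation_input_tokens"]
    + usage["cache_read_input_tokens"]
)
print(prompt_tokens)  # 2100, not 4100
```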
Krish Dholakia
ac4f32fb1e
Cost tracking for gemini-2.5-pro ( #9837 )
...
* build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing
Closes https://github.com/BerriAI/litellm/issues/9829
* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param
* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param
* build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro
* build(model_prices_and_context_window.json): add gemini 200k+ pricing
* feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens
Fixes https://github.com/BerriAI/litellm/issues/9807
* build: test dockerfile change
* build: revert apk change
* ci(config.yml): pip install wheel
* ci: test problematic package first
* ci(config.yml): pip install only binary
* ci: try more things
* ci: test different ml_dtypes version
* ci(config.yml): check ml_dtypes==0.4.0
* ci: test
* ci: cleanup config.yml
* ci: specify ml dtypes in requirements.txt
* ci: remove redisvl dependency (temporary)
* fix: fix linting errors
* test: update test
* test: fix test
2025-04-09 18:48:43 -07:00
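A hedged sketch of tiered pricing above a 200k-token threshold, the situation the gemini-2.5-pro cost change above addresses. Both rates and the "higher rate applies to the whole prompt" convention are placeholders, not Google's published pricing.

```python
def tiered_input_cost(prompt_tokens: int,
                      base_rate: float = 1.25e-06,
                      above_200k_rate: float = 2.5e-06,
                      threshold: int = 200_000) -> float:
    # Assumption: once the prompt exceeds the threshold, the higher rate
    # applies to every prompt token (one common tiering convention).
    rate = above_200k_rate if prompt_tokens > threshold else base_rate
    return prompt_tokens * rate

print(tiered_input_cost(150_000))  # billed entirely at the base rate
print(tiered_input_cost(250_000))  # billed entirely at the above-200k rate
```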
Ishaan Jaff
4c1bb74c3d
[Feat] - SSO - Use MSFT Graph API to assign users to teams ( #9865 )
...
* refactor SSO handler
* render sso JWT on ui
* docs debug sso
* fix sso login flow use await
* fix ui sso debug JWT
* test ui sso
* remove redis vl
* fix redisvl==0.5.1
* fix ml dtypes
* fix redisvl
* fix redis vl
* fix debug_sso_callback
* fix linting error
* fix redis semantic caching dep
* working graph api assignment
* test msft sso handler openid
* testing for msft group assignment
* fix debug graph api sso flow
* fix linting errors
* add_user_to_teams_from_sso_response
* fix linting error
2025-04-09 18:26:43 -07:00
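A hedged sketch of the Graph API step behind the feature above: fetch the signed-in user's group ids so they can be matched against LiteLLM teams. The endpoint is the standard Microsoft Graph memberOf route; token acquisition and error handling are omitted.

```python
import httpx

def get_user_group_ids(graph_access_token: str) -> list[str]:
    resp = httpx.get(
        "https://graph.microsoft.com/v1.0/me/memberOf",
        headers={"Authorization": f"Bearer {graph_access_token}"},
    )
    resp.raise_for_status()
    # Each directory object in "value" carries an "id" we can match to a team
    return [obj["id"] for obj in resp.json().get("value", [])]
```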
Ishaan Jaff
6f7e9b9728
[Feat SSO] Debug route - allow admins to debug SSO JWT fields ( #9835 )
...
* refactor SSO handler
* render sso JWT on ui
* docs debug sso
* fix sso login flow use await
* fix ui sso debug JWT
* test ui sso
* remove redis vl
* fix redisvl==0.5.1
* fix ml dtypes
* fix redisvl
* fix redis vl
* fix debug_sso_callback
* fix linting error
* fix redis semantic caching dep
2025-04-09 15:29:35 -07:00
Ishaan Jaff
08a3620414
[Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) ( #9853 )
...
* http passthrough file handling
* fix make_multipart_http_request
* test_pass_through_file_operations
* unit tests for file handling
2025-04-09 15:29:20 -07:00
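A hedged sketch of the multipart passthrough idea: read an incoming FastAPI UploadFile and re-send it to the upstream provider as multipart/form-data. The helper name, upstream URL, and the "purpose" field are illustrative assumptions.

```python
import httpx
from fastapi import UploadFile

async def forward_upload(file: UploadFile, upstream_url: str, api_key: str) -> dict:
    content = await file.read()
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            upstream_url,  # e.g. an OpenAI/Azure files endpoint
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": (file.filename, content, file.content_type)},
            data={"purpose": "assistants"},  # illustrative form field
        )
    resp.raise_for_status()
    return resp.json()
```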
Ishaan Jaff
441c7275ed
test fix post call rules ( #9826 )
2025-04-08 13:55:37 -07:00
Ishaan Jaff
e6403b717c
[Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling ( #9830 )
...
* fix team id exception in get team config
* test_team_info_masking
* test ref
2025-04-08 13:55:20 -07:00
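A minimal sketch of the masking idea behind the fix above: never echo callback secrets such as Langfuse keys back in team-level exception messages. The field names are illustrative.

```python
SENSITIVE_KEYS = {"langfuse_secret_key", "langfuse_public_key", "api_key"}

def mask_team_metadata(metadata: dict) -> dict:
    return {k: ("****" if k in SENSITIVE_KEYS else v) for k, v in metadata.items()}

print(mask_team_metadata({"langfuse_secret_key": "sk-lf-123", "team_alias": "prod"}))
# -> {'langfuse_secret_key': '****', 'team_alias': 'prod'}
```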
Ishaan Jaff
ff3a6830a4
[Feat] LiteLLM Tag/Policy Management ( #9813 )
...
* rendering tags on UI
* use /models for building tags
* CRUD endpoints for Tag management
* fix tag management
* working api for LIST tags
* working tag management
* refactor UI components
* fixes ui tag management
* clean up ui tag management
* fix tag management ui
* fix show allowed llms
* e2e tag controls
* stash change for rendering tags on UI
* ui working tag selector on Test Key page
* fixes for tag management
* clean up tag info
* fix code quality
* test for tag management
* ui clarify what tag routing is
2025-04-07 21:54:24 -07:00
Krrish Dholakia
fef2af0b17
test: fix flaky test
2025-04-07 19:42:58 -07:00
Krish Dholakia
4a128cfd64
Realtime API Cost tracking ( #9795 )
...
* fix(proxy_server.py): log realtime calls to spendlogs
Fixes https://github.com/BerriAI/litellm/issues/8410
* feat(realtime/): OpenAI Realtime API cost tracking
Closes https://github.com/BerriAI/litellm/issues/8410
* test: add unit testing for coverage
* test: add more unit testing
* fix: handle edge cases
2025-04-07 16:43:12 -07:00
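A hedged sketch of the cost math a Realtime session needs: text and audio tokens are priced separately in each direction, and the per-session usage is multiplied by the per-token rates. Field names and rates are placeholders, not OpenAI's published prices.

```python
def realtime_session_cost(usage: dict, rates: dict) -> float:
    return (
        usage.get("input_text_tokens", 0) * rates["input_cost_per_token"]
        + usage.get("input_audio_tokens", 0) * rates["input_cost_per_audio_token"]
        + usage.get("output_text_tokens", 0) * rates["output_cost_per_token"]
        + usage.get("output_audio_tokens", 0) * rates["output_cost_per_audio_token"]
    )

print(realtime_session_cost(
    {"input_text_tokens": 1200, "input_audio_tokens": 800,
     "output_text_tokens": 300, "output_audio_tokens": 500},
    {"input_cost_per_token": 5e-06, "input_cost_per_audio_token": 1e-04,
     "output_cost_per_token": 2e-05, "output_cost_per_audio_token": 2e-04},
))
```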
Krish Dholakia
0d503ad8ad
Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables ( #9772 )
...
* refactor(db_spend_update_writer.py): aggregate table is entirely different
* test(test_db_spend_update_writer.py): add unit test to ensure if disable_spend_logs is true daily user transactions is still logged
* test: fix test
2025-04-05 09:58:16 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support ( #9744 )
...
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks
Allows user to call claude-3-7-sonnet with thinking via databricks
* refactor: refactor choices transformation + add unit testing
* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming
* feat(databricks/chat/transformation.py): support response_format for claude models
* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}
* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic
* fix: fix ruff errors
* fix: fix linting error
* test: update test
* fix(databricks/chat/transformation.py): handle json mode output parsing
* fix(databricks/chat/transformation.py): handle json mode on streaming
* test: update test
* test: update dbrx testing
* test: update testing
* fix(base_model_iterator.py): handle non-json chunk
* test: update tests
* fix: fix ruff check
* fix: fix databricks config import
* fix: handle _tool = none
* test: skip invalid test
2025-04-04 22:13:32 -07:00
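A hedged usage sketch for the Databricks/Anthropic support above: calling a Claude model served on Databricks through litellm with reasoning_effort and a text response_format. The model name is a placeholder for your own serving endpoint, and the call assumes Databricks credentials are set in the environment.

```python
import litellm

response = litellm.completion(
    model="databricks/databricks-claude-3-7-sonnet",  # placeholder endpoint name
    messages=[{"role": "user", "content": "Briefly: why is the sky blue?"}],
    reasoning_effort="low",
    response_format={"type": "text"},
)
print(response.choices[0].message.content)
```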
Ishaan Jaff
b89ed69257
Merge branch 'main' into litellm_add_auth_metrics_endpoint
2025-04-04 21:28:06 -07:00
Ishaan Jaff
8c3670e192
Merge pull request #9719 from BerriAI/litellm_metrics_pod_lock_manager
...
[Reliability] Emit operational metrics for new DB Transaction architecture
2025-04-04 21:12:06 -07:00
Ishaan Jaff
df51d8bcfa
Merge branch 'main' into litellm_metrics_pod_lock_manager
2025-04-04 21:11:39 -07:00
Ishaan Jaff
fc4c453cb9
test_no_auth_metrics_when_disabled
2025-04-04 21:02:29 -07:00
Krrish Dholakia
6395bd8d65
test: mark flaky test
2025-04-04 20:25:05 -07:00
Ishaan Jaff
150e77cd7d
Merge branch 'main' into litellm_reliability_fix_db_txs
2025-04-04 16:46:46 -07:00
Ishaan Jaff
d3018a4c28
Merge branch 'main' into litellm_metrics_pod_lock_manager
2025-04-04 16:46:32 -07:00
Krish Dholakia
e1f7bcb47d
Fix VertexAI Credential Caching issue ( #9756 )
...
* refactor(vertex_llm_base.py): Prevent credential misrouting for projects
Fixes https://github.com/BerriAI/litellm/issues/7904
* fix: passing unit tests
* fix(vertex_llm_base.py): common auth logic across sync + async vertex ai calls
prevents credential caching issue across both flows
* test: fix test
* fix(vertex_llm_base.py): handle project id in default case
* fix(factory.py): don't pass cache control if not set
bedrock invoke does not support this
* test: fix test
* fix(vertex_llm_base.py): add .exception message in load_auth
* fix: fix ruff error
2025-04-04 16:38:08 -07:00
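A minimal sketch of the caching idea in the fix above: key cached Vertex credentials by both the credential source and the project id, so a call for project B never reuses credentials loaded for project A. Names are illustrative.

```python
_credential_cache: dict[tuple, object] = {}

def get_cached_credentials(credential_path: str | None, project_id: str | None, loader):
    cache_key = (credential_path, project_id)  # project id is part of the key
    if cache_key not in _credential_cache:
        _credential_cache[cache_key] = loader(credential_path, project_id)
    return _credential_cache[cache_key]

# Same service-account file, different projects -> two distinct cache entries
creds_a = get_cached_credentials("svc.json", "project-a", lambda p, pid: f"creds:{pid}")
creds_b = get_cached_credentials("svc.json", "project-b", lambda p, pid: f"creds:{pid}")
print(creds_a, creds_b)
```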
Ishaan Jaff
1cdee4b331
Merge branch 'main' into litellm_metrics_pod_lock_manager
2025-04-04 16:33:16 -07:00
Ishaan Jaff
decb6649ec
test_queue_flush_limit
2025-04-04 16:29:06 -07:00
Ishaan Jaff
e77a178a37
test_queue_size_reduction_with_large_volume
2025-04-04 16:21:29 -07:00
Ishaan Jaff
dc063fdfec
test_queue_size_reduction_with_large_volume
2025-04-04 15:59:35 -07:00
Ishaan Jaff
5bed0b7557
aggregated values
2025-04-04 15:55:14 -07:00
Ishaan Jaff
cdd351a03b
Merge pull request #9745 from BerriAI/litellm_sso_fixes_dev
...
[Feat] Allow assigning SSO users to teams on MSFT SSO
2025-04-04 15:40:19 -07:00
Adrian Lyjak
d640bc0a00
fix #8425 , passthrough kwargs during acompletion, and unwrap extra_body for openrouter ( #9747 )
2025-04-03 22:19:40 -07:00
Ishaan Jaff
0745f306c7
test_microsoft_sso_handler_with_empty_response
2025-04-03 22:17:06 -07:00
sajda
4a4328b5bb
fix: Gemini Flash 2.0 implementation is not returning the logprobs ( #9713 )
...
* fix: Gemini Flash 2.0 implementation is not returning the logprobs
* fix: linting error by adding a helper method called _process_candidates
2025-04-03 11:53:41 -07:00
Krish Dholakia
6dda1ba6dd
LiteLLM Minor Fixes & Improvements (04/02/2025) ( #9725 )
...
* Add date picker to usage tab + Add reasoning_content token tracking across all providers on streaming (#9722 )
* feat(new_usage.tsx): add date picker for new usage tab
allow user to look back on their usage data
* feat(anthropic/chat/transformation.py): report reasoning tokens in completion token details
allows usage tracking on how many reasoning tokens are actually being used
* feat(streaming_chunk_builder.py): return reasoning_tokens in anthropic/openai streaming response
allows tracking reasoning_token usage across providers
* Fix update team metadata + fix bulk adding models on Ui (#9721 )
* fix(handle_add_model_submit.tsx): fix bulk adding models
* fix(team_info.tsx): fix team metadata update
Fixes https://github.com/BerriAI/litellm/issues/9689
* (v0) Unified file id - allow calling multiple providers with same file id (#9718 )
* feat(files_endpoints.py): initial commit adding 'target_model_names' support
allow developer to specify all the models they want to call with the file
* feat(files_endpoints.py): return unified files endpoint
* test(test_files_endpoints.py): add validation test - if invalid purpose submitted
* feat: more updates
* feat: initial working commit of unified file id translation
* fix: additional fixes
* fix(router.py): remove model replace logic in jsonl on acreate_file
enables file upload to work for chat completion requests as well
* fix(files_endpoints.py): remove whitespace around model name
* fix(azure/handler.py): return acreate_file with correct response type
* fix: fix linting errors
* test: fix mock test to run on github actions
* fix: fix ruff errors
* fix: fix file too large error
* fix(utils.py): remove redundant var
* test: modify test to work on github actions
* test: update tests
* test: more debug logs to understand ci/cd issue
* test: fix test for respx
* test: skip mock respx test
fails on ci/cd - not clear why
* fix: fix ruff check
* fix: fix test
* fix(model_connection_test.tsx): fix linting error
* test: update unit tests
2025-04-03 11:48:52 -07:00
Ishaan Jaff
e68603e176
test create and update gauge
2025-04-02 21:31:19 -07:00
Ishaan Jaff
8405fcb748
test pod lock manager
2025-04-02 15:06:31 -07:00
Pranav Simha
2e35f07e94
Add support for max_completion_tokens to the Cohere chat transformation config ( #9701 )
2025-04-02 07:50:44 -07:00
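A hedged sketch of the parameter mapping added above: the OpenAI-style max_completion_tokens translated into the max_tokens field Cohere's chat API expects. The helper name is illustrative, not the transformation config's actual method.

```python
def map_openai_params_to_cohere(params: dict) -> dict:
    mapped = dict(params)
    if "max_completion_tokens" in mapped:
        mapped["max_tokens"] = mapped.pop("max_completion_tokens")
    return mapped

print(map_openai_params_to_cohere({"max_completion_tokens": 256, "temperature": 0.2}))
# -> {'temperature': 0.2, 'max_tokens': 256}
```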
Krish Dholakia
23051d89dd
fix(streaming_handler.py): fix completion start time tracking ( #9688 )
...
* fix(streaming_handler.py): fix completion start time tracking
Fixes https://github.com/BerriAI/litellm/issues/9210
* feat(anthropic/chat/transformation.py): map openai 'reasoning_effort' to anthropic 'thinking' param
Fixes https://github.com/BerriAI/litellm/issues/9022
* feat: map 'reasoning_effort' to 'thinking' param across bedrock + vertex
Closes https://github.com/BerriAI/litellm/issues/9022#issuecomment-2705260808
2025-04-01 22:00:56 -07:00
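A hedged sketch of one way to map OpenAI's reasoning_effort onto Anthropic's thinking parameter, as the commit above describes. The budget_tokens values are illustrative assumptions, not LiteLLM's exact mapping table.

```python
def reasoning_effort_to_thinking(effort: str) -> dict:
    budgets = {"low": 1024, "medium": 2048, "high": 4096}  # assumed budgets
    return {"type": "enabled", "budget_tokens": budgets[effort]}

print(reasoning_effort_to_thinking("medium"))
# -> {'type': 'enabled', 'budget_tokens': 2048}
```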
Ishaan Jaff
feba274a89
test DailySpendUpdateQueue
2025-04-01 18:39:23 -07:00
Ishaan Jaff
4a091a34b0
move test loc
2025-04-01 18:33:33 -07:00
Ishaan Jaff
8dc792139e
refactor file structure
2025-04-01 18:30:48 -07:00
Ishaan Jaff
4ddca7a79c
Merge branch 'main' into litellm_fix_service_account_behavior
2025-04-01 12:04:28 -07:00
Ishaan Jaff
61b609f320
Merge pull request #9673 from BerriAI/litellm_qa_deadlock_fixes
...
[Reliability] - Ensure new Redis + DB architecture tracks spend accurately
2025-04-01 12:04:03 -07:00
Ishaan Jaff
c2c5dbf24f
test_get_enforced_params
2025-04-01 08:41:53 -07:00
Ishaan Jaff
f805e15f7b
test_get_enforced_params_for_service_account_settings
2025-04-01 08:39:41 -07:00
Ishaan Jaff
e5f6529c42
test_get_enforced_params_for_service_account_settings
2025-04-01 07:46:38 -07:00