Commit graph

225 commits

Author SHA1 Message Date
Ishaan Jaff
47f65165d7 fix test 2025-04-14 17:18:29 -07:00
Ishaan Jaff
00b36ac60b test_anthropic_cache_control_hook_specific_index 2025-04-14 17:15:46 -07:00
Ishaan Jaff
53ce0d434e test_anthropic_cache_control_hook_system_message 2025-04-14 16:21:59 -07:00
Ishaan Jaff
89dfb42697
[UI QA checklist] (#9957)
* fix typo on UI

* fix for edit user tab

* fix for user spend

* add /team/permissions_list to management routes

* fix auth check for team member permissions

* fix team endpoints test
2025-04-12 20:41:50 -07:00
Krish Dholakia
00e49380df
Litellm UI qa 04 12 2025 p1 (#9955)
* fix(model_info_view.tsx): cleanup text

* fix(key_management_endpoints.py): fix filtering litellm-dashboard keys for internal users

* fix(proxy_track_cost_callback.py): prevent flooding spend logs with admin endpoint errors

* test: add unit testing for logic

* test(test_auth_exception_handler.py): add more unit testing

* fix(router.py): correctly handle retrieving model info on get_model_group_info

fixes issue where model hub was showing None prices

* fix: fix linting errors
2025-04-12 19:30:48 -07:00
Dan Shaw
433075a8d9
fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 (#9943)
* fix(factory.py): correct indentation for message index increment in ollama_pt function

* test: add unit tests for ollama_pt function handling various message types
2025-04-12 09:50:40 -07:00
Krish Dholakia
421e0a3004
Litellm add managed files db (#9930)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* feat(managed_files.py): support reading / writing files in DB

* feat(managed_files.py): support deleting file from DB on delete

* test: update testing

* fix(spend_tracking_utils.py): ensure each file create request is logged correctly

* fix(managed_files.py): fix storing / returning managed file object from cache

* fix(files/main.py): pass litellm params to azure route

* test: fix test

* build: add new prisma migration

* build: bump requirements

* test: add more testing

* refactor: cleanup post merge w/ main

* fix: fix code qa errors
2025-04-12 08:24:46 -07:00
Krish Dholakia
b9f01c9f5b
fix(databricks/common_utils.py): fix custom endpoint check (#9925)
* fix(databricks/common_utils.py): fix custom endpoint check

Fixes https://github.com/BerriAI/litellm/issues/9915

* fix(common_utils.py): add unit test to ensure custom_endpoint=False is handled correctly

Fixes https://github.com/BerriAI/litellm/issues/9915
2025-04-11 23:20:49 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files (#9924)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* fix(files/main.py): pass litellm params to azure route

* test: fix test
2025-04-11 21:48:27 -07:00
Ishaan Jaff
f9ce754817
[Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning (#9923)
* add supports_reasoning for xai models

* add "supports_reasoning": true for o1 series models

* add supports_reasoning util

* add litellm.supports_reasoning

* add supports reasoning for claude 3-7 models

* add deepseek as supports reasoning

* test_supports_reasoning

* add supports reasoning to model group info

* add supports_reasoning

* docs supports reasoning

* fix supports_reasoning test

* "supports_reasoning": false,

* fix test

* supports_reasoning
2025-04-11 17:56:04 -07:00
Ishaan Jaff
91c0a794b9
[Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions (#9919)
* add team_member_permissions

* add GetTeamMemberPermissionsRequest types

* crud endpoint for team member permissions

* test team member permissions CRUD

* fix GetTeamMemberPermissionsRequest
2025-04-11 17:15:16 -07:00
Ishaan Jaff
8b1d2d6956
[Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams (#9918)
* endpoint for updating default team settings on ui

* add GET default team settings endpoint

* ui expose default team settings on UI

* update to use DefaultTeamSSOParams

* DefaultTeamSSOParams

* fix DefaultTeamSSOParams

* docs team management

* test_update_default_team_settings
2025-04-11 14:07:10 -07:00
Krish Dholakia
0415f1205e
Litellm dev 04 10 2025 p3 (#9903)
* feat(managed_files.py): encode file type in unified file id

simplify calling gemini models

* fix(common_utils.py): fix extracting file type from unified file id

* fix(litellm_logging.py): create standard logging payload for create file call

* fix: fix linting error
2025-04-11 09:29:42 -07:00
Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking (#9893)
* test: initial commit fixing gemini logprobs

Fixes https://github.com/BerriAI/litellm/issues/9888

* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change

Fixes https://github.com/BerriAI/litellm/issues/8890

* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9814

* test: add cost calculation unit testing

* test: fix test

* test: update test
2025-04-10 21:23:55 -07:00
Ishaan Jaff
98e34cbf5d
[Docs] Tutorial using MSFT auto team assignment with LiteLLM (#9898)
* add default_team_params as a config.yaml setting

* create_litellm_team_from_sso_group

* test_default_team_params

* test_create_team_without_default_params

* docs default team settings

* docs msft entra id tutorial

* commit litellm docs msft group assignment

* litellm MSFT sso

* member, team assignment on litellm

* docs msft auto assignment

* bug fix default team setting

* docs litellm default team settings

* test_default_team_params
2025-04-10 20:07:55 -07:00
Ishaan Jaff
72a12e91c4
[Bug Fix MSFT SSO] Use correct field for user email when using MSFT SSO (#9886)
* fix openid_from_response

* test_microsoft_sso_handler_openid_from_response_user_principal_name

* test upsert_sso_user
2025-04-10 17:40:58 -07:00
Ishaan Jaff
94a553dbb2
[Feat] Emit Key, Team Budget metrics on a cron job schedule (#9528)
* _initialize_remaining_budget_metrics

* initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* test_initialize_budget_metrics_cron_job

* LITELLM_PROXY_ADMIN_NAME

* fix code qa checks

* test_initialize_budget_metrics_cron_job

* test_initialize_budget_metrics_cron_job

* pod lock manager allow dynamic cron job ID

* fix pod lock manager

* require cronjobid for PodLockManager

* fix DB_SPEND_UPDATE_JOB_NAME acquire / release lock

* add comment on prometheus logger

* add debug statements for emitting key, team budget metrics

* test_pod_lock_manager.py

* test_initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* initialize_remaining_budget_metrics

* remove outdated test
2025-04-10 16:59:14 -07:00
Ishaan Jaff
90d862b041
[Feat SSO] - Allow admins to set default_team_params to have default params for when litellm SSO creates default teams (#9895)
* add default_team_params as a config.yaml setting

* create_litellm_team_from_sso_group

* test_default_team_params

* test_create_team_without_default_params

* docs default team settings
2025-04-10 16:58:28 -07:00
Krrish Dholakia
7d383fc0c1 test: update testing 2025-04-10 14:15:58 -07:00
Krish Dholakia
0dbd663877
fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)
* fix(cost_calculator.py): handle custom pricing at deployment level for router

* test: add unit tests

* fix(router.py): show custom pricing on UI

check correct model str

* fix: fix linting error

* docs(custom_pricing.md): clarify custom pricing for proxy

Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740

* test: update code qa test

* fix: cleanup traceback

* fix: handle litellm param custom pricing

* test: update test

* fix(cost_calculator.py): add router model id to list of potential model names

* fix(cost_calculator.py): fix router model id check

* fix: router.py - maintain older model registry approach

* fix: fix ruff check

* fix(router.py): router get deployment info

add custom values to mapped dict

* test: update test

* fix(utils.py): update only if value is non-null

* test: add unit test
2025-04-09 22:13:10 -07:00
Krish Dholakia
0c5b4aa96d
feat(realtime/): add token tracking + log usage object in spend logs … (#9843)
* feat(realtime/): add token tracking + log usage object in spend logs metadata

* test: fix test

* test: update tests

* test: update testing

* test: update test

* test: update test

* test: update test

* test: update test

* test: update tesdt

* test: update test
2025-04-09 22:11:00 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db (#9838)
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db

Fixes https://github.com/BerriAI/litellm/issues/9732

* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens

Fixes https://github.com/BerriAI/litellm/issues/9812

* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming

reduce errors

* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens

* build: remove redisvl from requirements.txt (temporary)

* fix(spend_tracking_utils.py): handle circular references

* test: update code cov test

* test: update test
2025-04-09 21:26:43 -07:00
Krish Dholakia
ac4f32fb1e
Cost tracking for gemini-2.5-pro (#9837)
* build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing

Closes https://github.com/BerriAI/litellm/issues/9829

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro

* build(model_prices_and_context_window.json): add gemini 200k+ pricing

* feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens

Fixes https://github.com/BerriAI/litellm/issues/9807

* build: test dockerfile change

* build: revert apk change

* ci(config.yml): pip install wheel

* ci: test problematic package first

* ci(config.yml): pip install only binary

* ci: try more things

* ci: test different ml_dtypes version

* ci(config.yml): check ml_dtypes==0.4.0

* ci: test

* ci: cleanup config.yml

* ci: specify ml dtypes in requirements.txt

* ci: remove redisvl depedency (temporary)

* fix: fix linting errors

* test: update test

* test: fix test
2025-04-09 18:48:43 -07:00
Ishaan Jaff
4c1bb74c3d
[Feat] - SSO - Use MSFT Graph API to assign users to teams (#9865)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep

* working graph api assignment

* test msft sso handler openid

* testing for msft group assignment

* fix debug graph api sso flow

* fix linting errors

* add_user_to_teams_from_sso_response

* fix linting error
2025-04-09 18:26:43 -07:00
Ishaan Jaff
6f7e9b9728
[Feat SSO] Debug route - allow admins to debug SSO JWT fields (#9835)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep
2025-04-09 15:29:35 -07:00
Ishaan Jaff
08a3620414
[Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) (#9853)
* http passthrough file handling

* fix make_multipart_http_request

* test_pass_through_file_operations

* unit tests for file handling
2025-04-09 15:29:20 -07:00
Ishaan Jaff
441c7275ed
test fix post call rules (#9826) 2025-04-08 13:55:37 -07:00
Ishaan Jaff
e6403b717c
[Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling (#9830)
* fix team id exception in get team config

* test_team_info_masking

* test ref
2025-04-08 13:55:20 -07:00
Ishaan Jaff
ff3a6830a4
[Feat] LiteLLM Tag/Policy Management (#9813)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 15s
Helm unit test / unit-test (push) Successful in 21s
* rendering tags on UI

* use /models for building tags

* CRUD endpoints for Tag management

* fix tag management

* working api for LIST tags

* working tag management

* refactor UI components

* fixes ui tag management

* clean up ui tag management

* fix tag management ui

* fix show allowed llms

* e2e tag controls

* stash change for rendering tags on UI

* ui working tag selector on Test Key page

* fixes for tag management

* clean up tag info

* fix code quality

* test for tag management

* ui clarify what tag routing is
2025-04-07 21:54:24 -07:00
Krrish Dholakia
fef2af0b17 test: fix flaky test 2025-04-07 19:42:58 -07:00
Krish Dholakia
4a128cfd64
Realtime API Cost tracking (#9795)
* fix(proxy_server.py): log realtime calls to spendlogs

Fixes https://github.com/BerriAI/litellm/issues/8410

* feat(realtime/): OpenAI Realtime API cost tracking

Closes https://github.com/BerriAI/litellm/issues/8410

* test: add unit testing for coverage

* test: add more unit testing

* fix: handle edge cases
2025-04-07 16:43:12 -07:00
Krish Dholakia
0d503ad8ad
Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables (#9772)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 16s
Helm unit test / unit-test (push) Successful in 18s
* refactor(db_spend_update_writer.py): aggregate table is entirely different

* test(test_db_spend_update_writer.py): add unit test to ensure if disable_spend_logs is true daily user transactions is still logged

* test: fix test
2025-04-05 09:58:16 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Ishaan Jaff
b89ed69257
Merge branch 'main' into litellm_add_auth_metrics_endpoint 2025-04-04 21:28:06 -07:00
Ishaan Jaff
8c3670e192
Merge pull request #9719 from BerriAI/litellm_metrics_pod_lock_manager
[Reliability] Emit operational metrics for new DB Transaction architecture
2025-04-04 21:12:06 -07:00
Ishaan Jaff
df51d8bcfa Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 21:11:39 -07:00
Ishaan Jaff
fc4c453cb9 test_no_auth_metrics_when_disabled 2025-04-04 21:02:29 -07:00
Krrish Dholakia
6395bd8d65 test: mark flaky test 2025-04-04 20:25:05 -07:00
Ishaan Jaff
150e77cd7d Merge branch 'main' into litellm_reliability_fix_db_txs 2025-04-04 16:46:46 -07:00
Ishaan Jaff
d3018a4c28 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 16:46:32 -07:00
Krish Dholakia
e1f7bcb47d
Fix VertexAI Credential Caching issue (#9756)
* refactor(vertex_llm_base.py): Prevent credential misrouting for projects

Fixes https://github.com/BerriAI/litellm/issues/7904

* fix: passing unit tests

* fix(vertex_llm_base.py): common auth logic across sync + async vertex ai calls

prevents credential caching issue across both flows

* test: fix test

* fix(vertex_llm_base.py): handle project id in default cause

* fix(factory.py): don't pass cache control if not set

bedrock invoke does not support this

* test: fix test

* fix(vertex_llm_base.py): add .exception message in load_auth

* fix: fix ruff error
2025-04-04 16:38:08 -07:00
Ishaan Jaff
1cdee4b331 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 16:33:16 -07:00
Ishaan Jaff
decb6649ec test_queue_flush_limit 2025-04-04 16:29:06 -07:00
Ishaan Jaff
e77a178a37 test_queue_size_reduction_with_large_volume 2025-04-04 16:21:29 -07:00
Ishaan Jaff
dc063fdfec test_queue_size_reduction_with_large_volume 2025-04-04 15:59:35 -07:00
Ishaan Jaff
5bed0b7557 aggregated values 2025-04-04 15:55:14 -07:00
Ishaan Jaff
cdd351a03b
Merge pull request #9745 from BerriAI/litellm_sso_fixes_dev
[Feat] Allow assigning SSO users to teams on MSFT SSO
2025-04-04 15:40:19 -07:00
Adrian Lyjak
d640bc0a00
fix #8425, passthrough kwargs during acompletion, and unwrap extra_body for openrouter (#9747) 2025-04-03 22:19:40 -07:00
Ishaan Jaff
0745f306c7 test_microsoft_sso_handler_with_empty_response 2025-04-03 22:17:06 -07:00
sajda
4a4328b5bb
fix:Gemini Flash 2.0 implementation is not returning the logprobs (#9713)
* fix:Gemini Flash 2.0 implementation is not returning the logprobs

* fix: linting error by adding a helper method called _process_candidates
2025-04-03 11:53:41 -07:00