Commit graph

4701 commits

Author SHA1 Message Date
Ishaan Jaff
69a3aab4c8 ui new build 2025-04-12 09:13:00 -07:00
Ishaan Jaff
fb0c3d9e18
[DB / Infra] Add new column team_member_permissions (#9941)
* add team_member_permissions to team table

* add migration.sql file

* fix poetry lock

* fix prisma migrations

* fix poetry lock

* fix migration
2025-04-12 09:06:04 -07:00
Krish Dholakia
421e0a3004
Litellm add managed files db (#9930)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* feat(managed_files.py): support reading / writing files in DB

* feat(managed_files.py): support deleting file from DB on delete

* test: update testing

* fix(spend_tracking_utils.py): ensure each file create request is logged correctly

* fix(managed_files.py): fix storing / returning managed file object from cache

* fix(files/main.py): pass litellm params to azure route

* test: fix test

* build: add new prisma migration

* build: bump requirements

* test: add more testing

* refactor: cleanup post merge w/ main

* fix: fix code qa errors
2025-04-12 08:24:46 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files (#9924)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* fix(files/main.py): pass litellm params to azure route

* test: fix test
2025-04-11 21:48:27 -07:00
Ishaan Jaff
91c0a794b9
[Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions (#9919)
* add team_member_permissions

* add GetTeamMemberPermissionsRequest types

* crud endpoint for team member permissions

* test team member permissions CRUD

* fix GetTeamMemberPermissionsRequest
2025-04-11 17:15:16 -07:00
Ishaan Jaff
8b1d2d6956
[Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams (#9918)
* endpoint for updating default team settings on ui

* add GET default team settings endpoint

* ui expose default team settings on UI

* update to use DefaultTeamSSOParams

* DefaultTeamSSOParams

* fix DefaultTeamSSOParams

* docs team management

* test_update_default_team_settings
2025-04-11 14:07:10 -07:00
Krish Dholakia
0415f1205e
Litellm dev 04 10 2025 p3 (#9903)
* feat(managed_files.py): encode file type in unified file id

simplify calling gemini models

* fix(common_utils.py): fix extracting file type from unified file id

* fix(litellm_logging.py): create standard logging payload for create file call

* fix: fix linting error
2025-04-11 09:29:42 -07:00
Krish Dholakia
9f27e8363f
Realtime API: Support 'base_model' cost tracking + show response in spend logs (if enabled) (#9897)
* refactor(litellm_logging.py): refactor realtime cost tracking to use common code as rest

Ensures basic features like base model just work

* feat(realtime/): support 'base_model' cost tracking on realtime api

Fixes issue where base model was not working on realtime

* fix: fix ruff linting error

* test: fix test
2025-04-10 21:24:45 -07:00
Ishaan Jaff
f5c5c79ea4 update docs 2025-04-10 20:18:54 -07:00
Ishaan Jaff
98e34cbf5d
[Docs] Tutorial using MSFT auto team assignment with LiteLLM (#9898)
* add default_team_params as a config.yaml setting

* create_litellm_team_from_sso_group

* test_default_team_params

* test_create_team_without_default_params

* docs default team settings

* docs msft entra id tutorial

* commit litellm docs msft group assignment

* litellm MSFT sso

* member, team assignment on litellm

* docs msft auto assignment

* bug fix default team setting

* docs litellm default team settings

* test_default_team_params
2025-04-10 20:07:55 -07:00
Ishaan Jaff
72a12e91c4
[Bug Fix MSFT SSO] Use correct field for user email when using MSFT SSO (#9886)
* fix openid_from_response

* test_microsoft_sso_handler_openid_from_response_user_principal_name

* test upsert_sso_user
2025-04-10 17:40:58 -07:00
Ishaan Jaff
94a553dbb2
[Feat] Emit Key, Team Budget metrics on a cron job schedule (#9528)
* _initialize_remaining_budget_metrics

* initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* test_initialize_budget_metrics_cron_job

* LITELLM_PROXY_ADMIN_NAME

* fix code qa checks

* test_initialize_budget_metrics_cron_job

* test_initialize_budget_metrics_cron_job

* pod lock manager allow dynamic cron job ID

* fix pod lock manager

* require cronjobid for PodLockManager

* fix DB_SPEND_UPDATE_JOB_NAME acquire / release lock

* add comment on prometheus logger

* add debug statements for emitting key, team budget metrics

* test_pod_lock_manager.py

* test_initialize_budget_metrics_cron_job

* initialize_budget_metrics_cron_job

* initialize_remaining_budget_metrics

* remove outdated test
2025-04-10 16:59:14 -07:00
Ishaan Jaff
90d862b041
[Feat SSO] - Allow admins to set default_team_params to have default params for when litellm SSO creates default teams (#9895)
* add default_team_params as a config.yaml setting

* create_litellm_team_from_sso_group

* test_default_team_params

* test_create_team_without_default_params

* docs default team settings
2025-04-10 16:58:28 -07:00
Krish Dholakia
0dbd663877
fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)
* fix(cost_calculator.py): handle custom pricing at deployment level for router

* test: add unit tests

* fix(router.py): show custom pricing on UI

check correct model str

* fix: fix linting error

* docs(custom_pricing.md): clarify custom pricing for proxy

Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740

* test: update code qa test

* fix: cleanup traceback

* fix: handle litellm param custom pricing

* test: update test

* fix(cost_calculator.py): add router model id to list of potential model names

* fix(cost_calculator.py): fix router model id check

* fix: router.py - maintain older model registry approach

* fix: fix ruff check

* fix(router.py): router get deployment info

add custom values to mapped dict

* test: update test

* fix(utils.py): update only if value is non-null

* test: add unit test
2025-04-09 22:13:10 -07:00
Krish Dholakia
0c5b4aa96d
feat(realtime/): add token tracking + log usage object in spend logs … (#9843)
* feat(realtime/): add token tracking + log usage object in spend logs metadata

* test: fix test

* test: update tests

* test: update testing

* test: update test

* test: update test

* test: update test

* test: update test

* test: update tesdt

* test: update test
2025-04-09 22:11:00 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db (#9838)
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db

Fixes https://github.com/BerriAI/litellm/issues/9732

* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens

Fixes https://github.com/BerriAI/litellm/issues/9812

* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming

reduce errors

* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens

* build: remove redisvl from requirements.txt (temporary)

* fix(spend_tracking_utils.py): handle circular references

* test: update code cov test

* test: update test
2025-04-09 21:26:43 -07:00
Ishaan Jaff
1359e6d7a6
[SSO] Connect LiteLLM to Azure Entra ID Enterprise Application (#9872)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep

* working graph api assignment

* test msft sso handler openid

* testing for msft group assignment

* fix debug graph api sso flow

* fix linting errors

* add_user_to_teams_from_sso_response

* ui sso fix team assignments

* linting fix _get_group_ids_from_graph_api_response

* add MicrosoftServicePrincipalTeam

* create_litellm_teams_from_service_principal_team_ids

* create_litellm_teams_from_service_principal_team_ids

* docs MICROSOFT_SERVICE_PRINCIPAL_ID

* fix linting errors
2025-04-09 20:26:59 -07:00
Krish Dholakia
ac4f32fb1e
Cost tracking for gemini-2.5-pro (#9837)
* build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing

Closes https://github.com/BerriAI/litellm/issues/9829

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro

* build(model_prices_and_context_window.json): add gemini 200k+ pricing

* feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens

Fixes https://github.com/BerriAI/litellm/issues/9807

* build: test dockerfile change

* build: revert apk change

* ci(config.yml): pip install wheel

* ci: test problematic package first

* ci(config.yml): pip install only binary

* ci: try more things

* ci: test different ml_dtypes version

* ci(config.yml): check ml_dtypes==0.4.0

* ci: test

* ci: cleanup config.yml

* ci: specify ml dtypes in requirements.txt

* ci: remove redisvl depedency (temporary)

* fix: fix linting errors

* test: update test

* test: fix test
2025-04-09 18:48:43 -07:00
Ishaan Jaff
4c1bb74c3d
[Feat] - SSO - Use MSFT Graph API to assign users to teams (#9865)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep

* working graph api assignment

* test msft sso handler openid

* testing for msft group assignment

* fix debug graph api sso flow

* fix linting errors

* add_user_to_teams_from_sso_response

* fix linting error
2025-04-09 18:26:43 -07:00
Krrish Dholakia
9ec1972926 fix(internal_user_endpoints.py): increase default page size for /user/daily/activity 2025-04-09 17:50:13 -07:00
Ishaan Jaff
6f7e9b9728
[Feat SSO] Debug route - allow admins to debug SSO JWT fields (#9835)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep
2025-04-09 15:29:35 -07:00
Ishaan Jaff
08a3620414
[Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) (#9853)
* http passthrough file handling

* fix make_multipart_http_request

* test_pass_through_file_operations

* unit tests for file handling
2025-04-09 15:29:20 -07:00
Jacob Hagstedt P Suorra
dc9bfae053
Add user alias to API endpoint (#9859)
Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>
2025-04-09 13:16:35 -07:00
Ishaan Jaff
441c7275ed
test fix post call rules (#9826) 2025-04-08 13:55:37 -07:00
Ishaan Jaff
e6403b717c
[Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling (#9830)
* fix team id exception in get team config

* test_team_info_masking

* test ref
2025-04-08 13:55:20 -07:00
Ishaan Jaff
ff3a6830a4
[Feat] LiteLLM Tag/Policy Management (#9813)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 15s
Helm unit test / unit-test (push) Successful in 21s
* rendering tags on UI

* use /models for building tags

* CRUD endpoints for Tag management

* fix tag management

* working api for LIST tags

* working tag management

* refactor UI components

* fixes ui tag management

* clean up ui tag management

* fix tag management ui

* fix show allowed llms

* e2e tag controls

* stash change for rendering tags on UI

* ui working tag selector on Test Key page

* fixes for tag management

* clean up tag info

* fix code quality

* test for tag management

* ui clarify what tag routing is
2025-04-07 21:54:24 -07:00
Krish Dholakia
4a128cfd64
Realtime API Cost tracking (#9795)
* fix(proxy_server.py): log realtime calls to spendlogs

Fixes https://github.com/BerriAI/litellm/issues/8410

* feat(realtime/): OpenAI Realtime API cost tracking

Closes https://github.com/BerriAI/litellm/issues/8410

* test: add unit testing for coverage

* test: add more unit testing

* fix: handle edge cases
2025-04-07 16:43:12 -07:00
KX
0ac896a6f2
feat: add offline swagger docs (#7653) 2025-04-06 13:55:06 -07:00
Krish Dholakia
792ee079c2
Litellm 04 05 2025 release notes (#9785)
* docs: update docs

* docs: additional cleanup

* docs(index.md): add initial links

* docs: more doc updates

* docs(index.md): add more links

* docs(files.md): add gemini files API to docs

* docs(index.md): add more docs

* docs: more docs

* docs: update docs
2025-04-06 09:03:51 -07:00
Ishaan Jaff
52b35cd809
[UI Polish] - Polish login screen (#9778)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 21s
Helm unit test / unit-test (push) Successful in 24s
* fix admin ui utils login screen

* ui - add layer of polish on login screen

* ui fix design of login page

* ui fix color scheme on login page
2025-04-05 14:56:03 -07:00
Ishaan Jaff
7f6de81196 ui new build 2025-04-05 12:30:37 -07:00
Ishaan Jaff
3a7061a05c
bug fix de depluciate model list (#9775) 2025-04-05 12:29:11 -07:00
Krish Dholakia
0d503ad8ad
Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables (#9772)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 16s
Helm unit test / unit-test (push) Successful in 18s
* refactor(db_spend_update_writer.py): aggregate table is entirely different

* test(test_db_spend_update_writer.py): add unit test to ensure if disable_spend_logs is true daily user transactions is still logged

* test: fix test
2025-04-05 09:58:16 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia
e3b231bc11
fix(litellm-proxy-extras/utils.py): check migrations from correct directory + place prisma schema inside litellm-proxy-extras dir (#9767)
Allows prisma migrate deploy to work as expected on new db's
2025-04-04 22:11:07 -07:00
Ishaan Jaff
b89ed69257
Merge branch 'main' into litellm_add_auth_metrics_endpoint 2025-04-04 21:28:06 -07:00
Ishaan Jaff
8c3670e192
Merge pull request #9719 from BerriAI/litellm_metrics_pod_lock_manager
[Reliability] Emit operational metrics for new DB Transaction architecture
2025-04-04 21:12:06 -07:00
Ishaan Jaff
df51d8bcfa Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 21:11:39 -07:00
Chaos Yu
001043ba05
make sure metadata available and have a value (#9764) 2025-04-04 20:39:12 -07:00
Ishaan Jaff
eaad3b2402 PrometheusAuthMiddleware 2025-04-04 20:37:53 -07:00
Ishaan Jaff
86b473d267 allow adding auth on /metrics endpoint 2025-04-04 20:37:17 -07:00
Krish Dholakia
d66db2207b
Allow team members to see team models (#9742)
* fix(proxy_server.py): allow team member to see team models

* fix(model_dashboard.tsx): show edit + delete icons to be disabled if user is not admin and did not create models

* fix(proxy_server.py): fix ruff function size error

* fix(proxy_server.py): fix user model filter check
2025-04-04 20:36:48 -07:00
Ishaan Jaff
96ce5dbf7d _should_run_auth_on_metrics_endpoint 2025-04-04 20:32:04 -07:00
Ishaan Jaff
c7523818b4 PrometheusAuthMiddleware 2025-04-04 20:27:17 -07:00
Ishaan Jaff
253060cb09 allow requiring auth for /metrics endpoint 2025-04-04 17:35:02 -07:00
Ishaan Jaff
150e77cd7d Merge branch 'main' into litellm_reliability_fix_db_txs 2025-04-04 16:46:46 -07:00
Ishaan Jaff
901d6fe7b7 add operational metrics for pod lock manager v2 arch 2025-04-04 16:41:07 -07:00
Ishaan Jaff
1cdee4b331 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 16:33:16 -07:00
Ishaan Jaff
decb6649ec test_queue_flush_limit 2025-04-04 16:29:06 -07:00
Ishaan Jaff
e77a178a37 test_queue_size_reduction_with_large_volume 2025-04-04 16:21:29 -07:00