Ishaan Jaff
5856fdf99d
add BaseReasoningEffortTests
2025-04-11 18:14:01 -07:00
Ishaan Jaff
f9ce754817
[Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning ( #9923 )
...
* add supports_reasoning for xai models
* add "supports_reasoning": true for o1 series models
* add supports_reasoning util
* add litellm.supports_reasoning
* add supports reasoning for claude 3-7 models
* add deepseek as supports reasoning
* test_supports_reasoning
* add supports reasoning to model group info
* add supports_reasoning
* docs supports reasoning
* fix supports_reasoning test
* "supports_reasoning": false,
* fix test
* supports_reasoning
2025-04-11 17:56:04 -07:00
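The commit above mentions a `"supports_reasoning"` flag on model entries and a `litellm.supports_reasoning()` util. As a rough illustration of that kind of capability lookup, here is a minimal stand-alone sketch. It is not LiteLLM's implementation: `MODEL_COST_MAP`, the model names, and the local `supports_reasoning` function are all hypothetical stand-ins.

```python
from typing import Any, Dict

# Hypothetical excerpt of a model cost map; names and flags are illustrative only.
MODEL_COST_MAP: Dict[str, Dict[str, Any]] = {
    "example/reasoning-model": {"supports_reasoning": True},
    "example/plain-model": {"supports_reasoning": False},
    "example/unlabelled-model": {},
}


def supports_reasoning(model: str, cost_map: Dict[str, Dict[str, Any]] = MODEL_COST_MAP) -> bool:
    """Return True only when the model's entry explicitly sets the flag to true."""
    info = cost_map.get(model)
    if info is None:
        # Unknown models are treated as not supporting reasoning (an assumption).
        return False
    return info.get("supports_reasoning") is True


if __name__ == "__main__":
    for name in ("example/reasoning-model", "example/plain-model", "example/unknown"):
        print(name, supports_reasoning(name))
```

Treating a missing flag or an unknown model as `False` is a conservative default chosen for this sketch; the actual util may behave differently.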
Ishaan Jaff
91c0a794b9
[Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions ( #9919 )
...
* add team_member_permissions
* add GetTeamMemberPermissionsRequest types
* crud endpoint for team member permissions
* test team member permissions CRUD
* fix GetTeamMemberPermissionsRequest
2025-04-11 17:15:16 -07:00
Ishaan Jaff
2d6ad534bc
[Feat - PR1] Add xAI grok-3 models to LiteLLM ( #9920 )
...
* add xai/grok-3-mini-beta, xai/grok-3-beta
* add grok-3-fast-latest models
* supports_response_schema
* fix pricing
* docs xai
2025-04-11 15:12:12 -07:00
Ishaan Jaff
8b1d2d6956
[Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams ( #9918 )
...
* endpoint for updating default team settings on ui
* add GET default team settings endpoint
* ui expose default team settings on UI
* update to use DefaultTeamSSOParams
* DefaultTeamSSOParams
* fix DefaultTeamSSOParams
* docs team management
* test_update_default_team_settings
2025-04-11 14:07:10 -07:00
Krish Dholakia
0415f1205e
Litellm dev 04 10 2025 p3 ( #9903 )
...
* feat(managed_files.py): encode file type in unified file id
simplify calling gemini models
* fix(common_utils.py): fix extracting file type from unified file id
* fix(litellm_logging.py): create standard logging payload for create file call
* fix: fix linting error
2025-04-11 09:29:42 -07:00
Krish Dholakia
9f27e8363f
Realtime API: Support 'base_model' cost tracking + show response in spend logs (if enabled) ( #9897 )
...
* refactor(litellm_logging.py): refactor realtime cost tracking to use common code as rest
Ensures basic features like base model just work
* feat(realtime/): support 'base_model' cost tracking on realtime api
Fixes issue where base model was not working on realtime
* fix: fix ruff linting error
* test: fix test
2025-04-10 21:24:45 -07:00
Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking ( #9893 )
...
* test: initial commit fixing gemini logprobs
Fixes https://github.com/BerriAI/litellm/issues/9888
* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change
Fixes https://github.com/BerriAI/litellm/issues/8890
* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map
Fixes https://github.com/BerriAI/litellm/issues/9814
* test: add cost calculation unit testing
* test: fix test
* test: update test
2025-04-10 21:23:55 -07:00
Ishaan Jaff
f5c5c79ea4
update docs
2025-04-10 20:18:54 -07:00
Ishaan Jaff
98e34cbf5d
[Docs] Tutorial using MSFT auto team assignment with LiteLLM ( #9898 )
...
* add default_team_params as a config.yaml setting
* create_litellm_team_from_sso_group
* test_default_team_params
* test_create_team_without_default_params
* docs default team settings
* docs msft entra id tutorial
* commit litellm docs msft group assignment
* litellm MSFT sso
* member, team assignment on litellm
* docs msft auto assignment
* bug fix default team setting
* docs litellm default team settings
* test_default_team_params
2025-04-10 20:07:55 -07:00
Ishaan Jaff
72a12e91c4
[Bug Fix MSFT SSO] Use correct field for user email when using MSFT SSO ( #9886 )
...
* fix openid_from_response
* test_microsoft_sso_handler_openid_from_response_user_principal_name
* test upsert_sso_user
2025-04-10 17:40:58 -07:00
Ishaan Jaff
94a553dbb2
[Feat] Emit Key, Team Budget metrics on a cron job schedule ( #9528 )
...
* _initialize_remaining_budget_metrics
* initialize_budget_metrics_cron_job
* initialize_budget_metrics_cron_job
* initialize_budget_metrics_cron_job
* test_initialize_budget_metrics_cron_job
* LITELLM_PROXY_ADMIN_NAME
* fix code qa checks
* test_initialize_budget_metrics_cron_job
* test_initialize_budget_metrics_cron_job
* pod lock manager allow dynamic cron job ID
* fix pod lock manager
* require cronjobid for PodLockManager
* fix DB_SPEND_UPDATE_JOB_NAME acquire / release lock
* add comment on prometheus logger
* add debug statements for emitting key, team budget metrics
* test_pod_lock_manager.py
* test_initialize_budget_metrics_cron_job
* initialize_budget_metrics_cron_job
* initialize_remaining_budget_metrics
* remove outdated test
2025-04-10 16:59:14 -07:00
Ishaan Jaff
90d862b041
[Feat SSO] - Allow admins to set default_team_params to have default params for when litellm SSO creates default teams ( #9895 )
...
* add default_team_params as a config.yaml setting
* create_litellm_team_from_sso_group
* test_default_team_params
* test_create_team_without_default_params
* docs default team settings
2025-04-10 16:58:28 -07:00
Krish Dholakia
0dbd663877
fix(cost_calculator.py): handle custom pricing at deployment level fo… ( #9855 )
...
* fix(cost_calculator.py): handle custom pricing at deployment level for router
* test: add unit tests
* fix(router.py): show custom pricing on UI
check correct model str
* fix: fix linting error
* docs(custom_pricing.md): clarify custom pricing for proxy
Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740
* test: update code qa test
* fix: cleanup traceback
* fix: handle litellm param custom pricing
* test: update test
* fix(cost_calculator.py): add router model id to list of potential model names
* fix(cost_calculator.py): fix router model id check
* fix: router.py - maintain older model registry approach
* fix: fix ruff check
* fix(router.py): router get deployment info
add custom values to mapped dict
* test: update test
* fix(utils.py): update only if value is non-null
* test: add unit test
2025-04-09 22:13:10 -07:00
Krish Dholakia
0c5b4aa96d
feat(realtime/): add token tracking + log usage object in spend logs … ( #9843 )
...
* feat(realtime/): add token tracking + log usage object in spend logs metadata
* test: fix test
* test: update tests
* test: update testing
* test: update test
* test: update test
* test: update test
* test: update test
* test: update test
* test: update test
2025-04-09 22:11:00 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db ( #9838 )
...
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db
Fixes https://github.com/BerriAI/litellm/issues/9732
* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens
Fixes https://github.com/BerriAI/litellm/issues/9812
* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming
reduce errors
* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens
* build: remove redisvl from requirements.txt (temporary)
* fix(spend_tracking_utils.py): handle circular references
* test: update code cov test
* test: update test
2025-04-09 21:26:43 -07:00
Ishaan Jaff
1359e6d7a6
[SSO] Connect LiteLLM to Azure Entra ID Enterprise Application ( #9872 )
...
* refactor SSO handler
* render sso JWT on ui
* docs debug sso
* fix sso login flow use await
* fix ui sso debug JWT
* test ui sso
* remove redis vl
* fix redisvl==0.5.1
* fix ml dtypes
* fix redisvl
* fix redis vl
* fix debug_sso_callback
* fix linting error
* fix redis semantic caching dep
* working graph api assignment
* test msft sso handler openid
* testing for msft group assignment
* fix debug graph api sso flow
* fix linting errors
* add_user_to_teams_from_sso_response
* ui sso fix team assignments
* linting fix _get_group_ids_from_graph_api_response
* add MicrosoftServicePrincipalTeam
* create_litellm_teams_from_service_principal_team_ids
* create_litellm_teams_from_service_principal_team_ids
* docs MICROSOFT_SERVICE_PRINCIPAL_ID
* fix linting errors
2025-04-09 20:26:59 -07:00
Krish Dholakia
ac4f32fb1e
Cost tracking for gemini-2.5-pro ( #9837 )
...
* build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing
Closes https://github.com/BerriAI/litellm/issues/9829
* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param
* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param
* build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro
* build(model_prices_and_context_window.json): add gemini 200k+ pricing
* feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens
Fixes https://github.com/BerriAI/litellm/issues/9807
* build: test dockerfile change
* build: revert apk change
* ci(config.yml): pip install wheel
* ci: test problematic package first
* ci(config.yml): pip install only binary
* ci: try more things
* ci: test different ml_dtypes version
* ci(config.yml): check ml_dtypes==0.4.0
* ci: test
* ci: cleanup config.yml
* ci: specify ml dtypes in requirements.txt
* ci: remove redisvl dependency (temporary)
* fix: fix linting errors
* test: update test
* test: fix test
2025-04-09 18:48:43 -07:00
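The commit above adds separate pricing for requests above 200k tokens. A minimal sketch of such a tiered cost calculation follows. It assumes that once the prompt exceeds the threshold, the higher rates apply to the whole request. The `TieredPricing` type, its field names, and the rates in the usage lines are hypothetical, not values from the model cost map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredPricing:
    """Per-token rates below and above a prompt-size threshold (all values hypothetical)."""
    input_cost_per_token: float
    output_cost_per_token: float
    input_cost_per_token_above_threshold: float
    output_cost_per_token_above_threshold: float
    threshold_tokens: int = 200_000


def request_cost(pricing: TieredPricing, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one request; the higher tier applies when the prompt exceeds the threshold."""
    above = prompt_tokens > pricing.threshold_tokens
    input_rate = (
        pricing.input_cost_per_token_above_threshold if above else pricing.input_cost_per_token
    )
    output_rate = (
        pricing.output_cost_per_token_above_threshold if above else pricing.output_cost_per_token
    )
    return prompt_tokens * input_rate + completion_tokens * output_rate


if __name__ == "__main__":
    # Placeholder rates chosen only to make the two tiers easy to tell apart.
    pricing = TieredPricing(1e-6, 2e-6, 2e-6, 4e-6)
    print(request_cost(pricing, 100_000, 1_000))
    print(request_cost(pricing, 300_000, 1_000))
```

Keying the tier on prompt size alone is a design assumption of this sketch; a prompt of exactly 200,000 tokens stays on the lower tier here.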
Ishaan Jaff
4c1bb74c3d
[Feat] - SSO - Use MSFT Graph API to assign users to teams ( #9865 )
...
* refactor SSO handler
* render sso JWT on ui
* docs debug sso
* fix sso login flow use await
* fix ui sso debug JWT
* test ui sso
* remove redis vl
* fix redisvl==0.5.1
* fix ml dtypes
* fix redisvl
* fix redis vl
* fix debug_sso_callback
* fix linting error
* fix redis semantic caching dep
* working graph api assignment
* test msft sso handler openid
* testing for msft group assignment
* fix debug graph api sso flow
* fix linting errors
* add_user_to_teams_from_sso_response
* fix linting error
2025-04-09 18:26:43 -07:00
Krrish Dholakia
9ec1972926
fix(internal_user_endpoints.py): increase default page size for /user/daily/activity
2025-04-09 17:50:13 -07:00
Ishaan Jaff
6f7e9b9728
[Feat SSO] Debug route - allow admins to debug SSO JWT fields ( #9835 )
...
* refactor SSO handler
* render sso JWT on ui
* docs debug sso
* fix sso login flow use await
* fix ui sso debug JWT
* test ui sso
* remove redis vl
* fix redisvl==0.5.1
* fix ml dtypes
* fix redisvl
* fix redis vl
* fix debug_sso_callback
* fix linting error
* fix redis semantic caching dep
2025-04-09 15:29:35 -07:00
Ishaan Jaff
08a3620414
[Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) ( #9853 )
...
* http passthrough file handling
* fix make_multipart_http_request
* test_pass_through_file_operations
* unit tests for file handling
2025-04-09 15:29:20 -07:00
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support ( #9781 )
...
* test: add initial e2e test
* fix(vertex_ai/files): initial commit adding sync file create support
* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint
* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint
* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload
* test: working e2e jsonl call
* test: unit testing for jsonl file creation
* fix(vertex_ai/transformation.py): reset file pointer after read
allow multiple reads on same file object
* fix: fix linting errors
* fix: fix ruff linting errors
* fix: fix import
* fix: fix linting error
* fix: fix linting error
* fix(vertex_ai/files/transformation.py): fix linting error
* test: update test
* test: update tests
* fix: fix linting errors
* fix: fix test
* fix: fix linting error
2025-04-09 14:01:48 -07:00
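One bullet in the commit above resets the file pointer after a read so the same file object can be read more than once. A minimal sketch of that pattern, using only the standard library, might look like this. The helper name `read_all_and_rewind` is hypothetical and is not taken from the LiteLLM code.

```python
import io
from typing import BinaryIO


def read_all_and_rewind(file_obj: BinaryIO) -> bytes:
    """Read from the current position to the end, then restore the position."""
    start = file_obj.tell()
    try:
        return file_obj.read()
    finally:
        # Without this seek, a second read() on the same object returns b"".
        file_obj.seek(start)


if __name__ == "__main__":
    buffer = io.BytesIO(b"example payload")
    first = read_all_and_rewind(buffer)
    second = read_all_and_rewind(buffer)
    print(first == second)
```

Restoring the original position, rather than always seeking to 0, keeps the helper safe for callers that have already consumed a header from the stream.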
qvalentin
93532e00db
feat: add enterpriseWebSearch tool for vertex-ai ( #9856 )
2025-04-09 13:17:48 -07:00
Jacob Hagstedt P Suorra
dc9bfae053
Add user alias to API endpoint ( #9859 )
...
Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>
2025-04-09 13:16:35 -07:00
Li Yang
11389535d5
chore: fix haiku cache read pricing per token ( #9834 )
2025-04-08 16:43:09 -07:00
Ishaan Jaff
441c7275ed
test fix post call rules ( #9826 )
2025-04-08 13:55:37 -07:00
Ishaan Jaff
e6403b717c
[Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling ( #9830 )
...
* fix team id exception in get team config
* test_team_info_masking
* test ref
2025-04-08 13:55:20 -07:00
Krrish Dholakia
367f48004d
build(model_prices_and_context_window.json): consistent params
2025-04-08 12:45:33 -07:00
Ishaan Jaff
ff3a6830a4
[Feat] LiteLLM Tag/Policy Management ( #9813 )
...
* rendering tags on UI
* use /models for building tags
* CRUD endpoints for Tag management
* fix tag management
* working api for LIST tags
* working tag management
* refactor UI components
* fixes ui tag management
* clean up ui tag management
* fix tag management ui
* fix show allowed llms
* e2e tag controls
* stash change for rendering tags on UI
* ui working tag selector on Test Key page
* fixes for tag management
* clean up tag info
* fix code quality
* test for tag management
* ui clarify what tag routing is
2025-04-07 21:54:24 -07:00
Krish Dholakia
ac9f03beae
Allow passing thinking param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) ( #9386 )
...
* fix(litellm_proxy/chat/transformation.py): support 'thinking' param
Fixes https://github.com/BerriAI/litellm/issues/9380
* feat(azure/gpt_transformation.py): add azure audio model support
Closes https://github.com/BerriAI/litellm/issues/6305
* fix(utils.py): use provider_config in common functions
* fix(utils.py): add missing provider configs to get_chat_provider_config
* test: fix test
* fix: fix path
* feat(utils.py): make bedrock invoke nova config baseconfig compatible
* fix: fix linting errors
* fix(azure_ai/transformation.py): remove buggy optional param filtering for azure ai
Removes incorrect check for tool choice support when calling azure ai - it prevented calling models with response_format unless they were on the litellm model cost map
* fix(amazon_cohere_transformation.py): fix bedrock invoke cohere transformation to inherit from coherechatconfig
* test: fix azure ai tool choice mapping
* fix: fix model cost map to add 'supports_tool_choice' to cohere models
* fix(get_supported_openai_params.py): check if custom llm provider in llm providers
* fix(get_supported_openai_params.py): fix llm provider in list check
* fix: fix ruff check errors
* fix: support defs when calling bedrock nova
* fix(factory.py): fix test
2025-04-07 21:04:11 -07:00
Krish Dholakia
fcf17d114f
Litellm dev 04 05 2025 p2 ( #9774 )
...
* test: move test to just checking async
* fix(transformation.py): handle function call with no schema
* fix(utils.py): handle pydantic base model in message tool calls
Fix https://github.com/BerriAI/litellm/issues/9321
* fix(vertex_and_google_ai_studio.py): handle tools=[]
Fixes https://github.com/BerriAI/litellm/issues/9080
* test: remove max token restriction
* test: fix basic test
* fix(get_supported_openai_params.py): fix check
* fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0
* fix: fix test
* fix: parse out empty dictionary on dbrx streaming + tool calls
* fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param
* fix: fix ruff check
* fix: handle no strict in function
* fix: revert bedrock change - handle in separate PR
2025-04-07 21:02:52 -07:00
Krish Dholakia
8d338aee78
fix(databricks/chat/transformation.py): remove reasoning_effort from request ( #9811 )
...
Fixes https://github.com/BerriAI/litellm/issues/9700#issuecomment-2784431995
2025-04-07 19:43:19 -07:00
Krish Dholakia
8e3c7b2de0
fix(vertex_ai.py): move to only passing in accepted keys by vertex ai response schema ( #8992 )
...
* fix(vertex_ai.py): common_utils.py
move to only passing in accepted keys by vertex ai
prevent json schema compatible keys like $id, and $comment from causing vertex ai openapi calls to fail
* fix(test_vertex.py): add testing to ensure only accepted schema params passed in
* fix(common_utils.py): fix linting error
* test: update test
* test: accept function
2025-04-07 18:07:01 -07:00
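The commit above moves to passing only accepted keys in a Vertex AI response schema, so that JSON Schema keys such as `$id` and `$comment` no longer cause calls to fail. A minimal sketch of an allowlist filter of that kind is shown below. The contents of `ACCEPTED_KEYS` are an assumption made for illustration; this log does not state the real accepted set, and `filter_schema` is a hypothetical helper name.

```python
from typing import Any, Set

# Assumed allowlist for illustration only; the real accepted set is not given in this log.
ACCEPTED_KEYS: Set[str] = {
    "type", "format", "description", "nullable",
    "enum", "items", "properties", "required",
}


def filter_schema(schema: Any, accepted: Set[str] = ACCEPTED_KEYS) -> Any:
    """Recursively drop schema keywords that are not in the allowlist."""
    if isinstance(schema, list):
        return [filter_schema(item, accepted) for item in schema]
    if not isinstance(schema, dict):
        return schema
    cleaned = {}
    for key, value in schema.items():
        if key not in accepted:
            continue  # e.g. "$id", "$comment"
        if key == "properties" and isinstance(value, dict):
            # Property names are user-defined, so keep every name and filter each sub-schema.
            cleaned[key] = {name: filter_schema(sub, accepted) for name, sub in value.items()}
        elif key == "items":
            cleaned[key] = filter_schema(value, accepted)
        else:
            cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    raw = {
        "$id": "https://example.com/person.schema.json",
        "type": "object",
        "properties": {"name": {"type": "string", "$comment": "display name"}},
        "required": ["name"],
    }
    print(filter_schema(raw))
```

An allowlist is used here instead of a blocklist because it also drops keywords nobody anticipated, which matches the "only accepted keys" wording of the commit.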
Krish Dholakia
4a128cfd64
Realtime API Cost tracking ( #9795 )
...
* fix(proxy_server.py): log realtime calls to spendlogs
Fixes https://github.com/BerriAI/litellm/issues/8410
* feat(realtime/): OpenAI Realtime API cost tracking
Closes https://github.com/BerriAI/litellm/issues/8410
* test: add unit testing for coverage
* test: add more unit testing
* fix: handle edge cases
2025-04-07 16:43:12 -07:00
Krish Dholakia
9a60cd9deb
fix(gemini/transformation.py): handle file_data being passed in ( #9786 )
2025-04-07 16:32:08 -07:00
KX
0ac896a6f2
feat: add offline swagger docs ( #7653 )
2025-04-06 13:55:06 -07:00
Krish Dholakia
792ee079c2
Litellm 04 05 2025 release notes ( #9785 )
...
* docs: update docs
* docs: additional cleanup
* docs(index.md): add initial links
* docs: more doc updates
* docs(index.md): add more links
* docs(files.md): add gemini files API to docs
* docs(index.md): add more docs
* docs: more docs
* docs: update docs
2025-04-06 09:03:51 -07:00
Ishaan Jaff
52b35cd809
[UI Polish] - Polish login screen ( #9778 )
...
* fix admin ui utils login screen
* ui - add layer of polish on login screen
* ui fix design of login page
* ui fix color scheme on login page
2025-04-05 14:56:03 -07:00
Ishaan Jaff
7f6de81196
ui new build
2025-04-05 12:30:37 -07:00
Ishaan Jaff
3a7061a05c
bug fix: de-duplicate model list ( #9775 )
2025-04-05 12:29:11 -07:00
Krish Dholakia
34bdf36eab
Add inference providers support for Hugging Face ( #8258 ) ( #9738 ) ( #9773 )
...
* Add inference providers support for Hugging Face (#8258 )
* add first version of inference providers for huggingface
* temporarily skipping tests
* Add documentation
* Fix titles
* remove max_retries from params and clean up
* add suggestions
* use llm http handler
* update doc
* add suggestions
* run formatters
* add tests
* revert
* revert
* rename file
* set maxsize for lru cache
* fix embeddings
* fix inference url
* fix tests following breaking change in main
* use ChatCompletionRequest
* fix tests and lint
* [Hugging Face] Remove outdated chat completion tests and fix embedding tests (#9749 )
* remove or fix tests
* fix link in doc
* fix(config_settings.md): document hf api key
---------
Co-authored-by: célina <hanouticelina@gmail.com>
2025-04-05 10:50:15 -07:00
Krish Dholakia
0d503ad8ad
Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables ( #9772 )
...
* refactor(db_spend_update_writer.py): aggregate table is entirely different
* test(test_db_spend_update_writer.py): add unit test to ensure if disable_spend_logs is true daily user transactions is still logged
* test: fix test
2025-04-05 09:58:16 -07:00
Krrish Dholakia
af9db827fc
fix(databricks/chat/transformation.py): handle empty headers case
2025-04-05 08:33:56 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support ( #9744 )
...
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks
Allows user to call claude-3-7-sonnet with thinking via databricks
* refactor: refactor choices transformation + add unit testing
* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming
* feat(databricks/chat/transformation.py): support response_format for claude models
* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}
* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic
* fix: fix ruff errors
* fix: fix linting error
* test: update test
* fix(databricks/chat/transformation.py): handle json mode output parsing
* fix(databricks/chat/transformation.py): handle json mode on streaming
* test: update test
* test: update dbrx testing
* test: update testing
* fix(base_model_iterator.py): handle non-json chunk
* test: update tests
* fix: fix ruff check
* fix: fix databricks config import
* fix: handle _tool = none
* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia
e3b231bc11
fix(litellm-proxy-extras/utils.py): check migrations from correct directory + place prisma schema inside litellm-proxy-extras dir ( #9767 )
...
Allows prisma migrate deploy to work as expected on new db's
2025-04-04 22:11:07 -07:00
Ishaan Jaff
220fa23d2b
watsonx/ibm/granite-3-8b-instruct
2025-04-04 21:46:02 -07:00
Ishaan Jaff
e2bb203075
update watsonx/ibm/granite-3-8b-instruct
2025-04-04 21:45:04 -07:00
Ishaan Jaff
f0f2f819bd
Merge pull request #9760 from BerriAI/litellm_prometheus_error_monitoring
...
[Reliability] Prometheus emit llm provider on failure metric - make it easy to differentiate litellm error vs llm api error
2025-04-04 21:37:28 -07:00
Ishaan Jaff
b89ed69257
Merge branch 'main' into litellm_add_auth_metrics_endpoint
2025-04-04 21:28:06 -07:00