Commit graph

262 commits

Author SHA1 Message Date
Ishaan Jaff
a40ecc3fe4 fix import loc 2025-04-23 17:33:29 -07:00
Ishaan Jaff
42c1e93839 fix imports 2025-04-23 16:28:07 -07:00
Ishaan Jaff
8777e38880 fix imports 2025-04-23 16:22:01 -07:00
Ishaan Jaff
ce58c53ff1 refactor location of proxy 2025-04-23 14:38:44 -07:00
Krish Dholakia
217681eb5e
Litellm dev 04 22 2025 p1 (#10206)
* fix(openai.py): initial commit adding generic event type for openai responses api streaming

Ensures handling for undocumented event types - e.g. "response.reasoning_summary_part.added"

* fix(transformation.py): handle unknown openai response type

* fix(datadog_llm_observability.py): handle dict[str, any] -> dict[str, str] conversion

Fixes https://github.com/BerriAI/litellm/issues/9494

* test: add more unit testing

* test: add unit test

* fix(common_utils.py): fix message with content list

* test: update testing
2025-04-22 23:58:43 -07:00
Ishaan Jaff
96e31d205c
feat: Added Missing Attributes For Arize & Phoenix Integration (#10043) (#10215)
* feat: Added Missing Attributes For Arize & Phoenix Integration

* chore: Added noqa for PLR0915 to suppress warning

* chore: Moved Contributor Test to Correct Location

* chore: Removed Redundant Fallback

Co-authored-by: Ali Saleh <saleh.a@turing.com>
2025-04-22 21:34:51 -07:00
Krish Dholakia
5f98d4d7de
UI - Users page - Enable global sorting (allows finding users with highest spend) (#10211)
* fix(view_users.tsx): add time tracking logic to debounce search - prevent new queries from being overwritten by previous ones

* fix(internal_user_endpoints.py): add sort functionality to user list endpoint

* feat(internal_user_endpoints.py): support sort by on `/user/list`

* fix(view_users.tsx): enable global sorting

allows finding user with highest spend

* feat(view_users.tsx): support filtering by sso user id

* test(search_users.spec.ts): add tests to ensure filtering works

* test: add more unit testing
2025-04-22 19:59:53 -07:00
Ishaan Jaff
868cdd0226
[Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI (#10205)
* add transform_delete_response_api_request to base responses config

* add transform_delete_response_api_request

* add delete_response_api_handler

* fixes for deleting responses, response API

* add adelete_responses

* add async test_basic_openai_responses_delete_endpoint

* test_basic_openai_responses_delete_endpoint

* working delete for streaming on responses API

* fixes azure transformation

* TestAnthropicResponsesAPITest

* fix code check

* fix linting

* fixes for get_complete_url

* test_basic_openai_responses_streaming_delete_endpoint

* streaming fixes
2025-04-22 18:27:03 -07:00
Dwij
b2955a2bdd
Add AgentOps Integration to LiteLLM (#9685)
* feat(sidebars): add new item for agentops integration in Logging & Observability category

* Update agentops_integration.md to enhance title formatting and remove redundant section

* Enhance AgentOps integration in documentation and codebase by removing LiteLLMCallbackHandler references, adding environment variable configurations, and updating logging initialization for AgentOps support.

* Update AgentOps integration documentation to include instructions for obtaining API keys and clarify environment variable setup.

* Add unit tests for AgentOps integration and improve error handling in token fetching

* Add unit tests for AgentOps configuration and token fetching functionality

* Corrected agentops test directory

* Linting fix

* chore: add OpenTelemetry dependencies to pyproject.toml

* chore: update OpenTelemetry dependencies and add new packages in pyproject.toml and poetry.lock
2025-04-22 10:29:01 -07:00
Krish Dholakia
a7db0df043
Gemini-2.5-flash improvements (#10198)
* fix(vertex_and_google_ai_studio_gemini.py): allow thinking budget = 0

Fixes https://github.com/BerriAI/litellm/issues/10121

* fix(vertex_and_google_ai_studio_gemini.py): handle nuance in counting exclusive vs. inclusive tokens

Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2052272035
2025-04-21 22:48:00 -07:00
Ishaan Jaff
0c2f705417
[Feat] Add Responses API - Routing Affinity logic for sessions (#10193)
* test for test_responses_api_routing_with_previous_response_id

* test_responses_api_routing_with_previous_response_id

* add ResponsesApiDeploymentCheck

* ResponsesApiDeploymentCheck

* ResponsesApiDeploymentCheck

* fix ResponsesApiDeploymentCheck

* test_responses_api_routing_with_previous_response_id

* ResponsesApiDeploymentCheck

* test_responses_api_deployment_check.py

* docs routing affinity

* simplify ResponsesApiDeploymentCheck

* test response id

* fix code quality check
2025-04-21 20:00:27 -07:00
Krish Dholakia
0c3b7bb37d
fix(router.py): handle edge case where user sets 'model_group' inside… (#10191)
* fix(router.py): handle edge case where user sets 'model_group' inside 'model_info'

* fix(key_management_endpoints.py): security fix - return hashed token in 'token' field

Ensures when creating a key on UI - only hashed token shown

* test(test_key_management_endpoints.py): add unit test

* test: update test
2025-04-21 16:17:45 -07:00
Nilanjan De
03245c732a
Fix: Potential SQLi in spend_management_endpoints.py (#9878)
* fix: Potential SQLi in spend_management_endpoints.py

* fix tests

* test: add tests for global spend keys endpoint

* chore: update error message

* chore: lint

* chore: rename test
2025-04-21 14:29:38 -07:00
Li Yang
10257426a2
fix(bedrock): wrong system prompt transformation (#10120)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 16s
Helm unit test / unit-test (push) Successful in 25s
* fix(bedrock): wrong system transformation

* chore: add one more test case

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-04-21 08:48:14 -07:00
Krish Dholakia
e0a613f88a
fix(common_daily_activity.py): support empty entity id field (#10175)
* fix(common_daily_activity.py): support empty entity id field

allows returning empty response when user is not admin and does not belong to any team

* test(test_common_daily_activity.py): add unit testing
2025-04-19 22:20:28 -07:00
Ishaan Jaff
b0024bb229
[Bug Fix] Spend Tracking Bug Fix, don't modify in memory default litellm params (#10167)
* _update_kwargs_with_default_litellm_params

* test_update_kwargs_does_not_mutate_defaults_and_merges_metadata
2025-04-19 14:13:59 -07:00
Krish Dholakia
03b5399f86
test(utils.py): handle scenario where text tokens + reasoning tokens … (#10165)
* test(utils.py): handle scenario where text tokens + reasoning tokens set, but reasoning tokens not charged separately

Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2051555332

* fix(vertex_and_google_ai_studio.py): only set content if non-empty str
2025-04-19 12:32:38 -07:00
Krish Dholakia
5c929317cd
fix(triton/completion/transformation.py): remove bad_words / stop wor… (#10163)
* fix(triton/completion/transformation.py): remove bad_words / stop words from triton call

parameter 'bad_words' has invalid type. It should be either 'int', 'bool', or 'string'.

* fix(proxy_track_cost_callback.py): add debug logging for track cost callback error
2025-04-19 11:23:37 -07:00
Krish Dholakia
f08a4e3c06
Support 'file' message type for VLLM video url's + Anthropic redacted message thinking support (#10129)
* feat(hosted_vllm/chat/transformation.py): support calling vllm video url with openai 'file' message type

allows switching between gemini/vllm easily

* [WIP] redacted thinking tests (#9044)

* WIP: redacted thinking tests

* test: add test for redacted thinking in assistant message

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix(anthropic/chat/transformation.py): support redacted thinking block on anthropic completion

Fixes https://github.com/BerriAI/litellm/issues/9058

* fix(anthropic/chat/handler.py): transform anthropic redacted messages on streaming

Fixes https://github.com/BerriAI/litellm/issues/9058

* fix(bedrock/): support redacted text on streaming + non-streaming

Fixes https://github.com/BerriAI/litellm/issues/9058

* feat(litellm_proxy/chat/transformation.py): support 'reasoning_effort' param for proxy

allows using reasoning effort with thinking models on proxy

* test: update tests

* fix(utils.py): fix linting error

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(anthropic/chat/transformation.py): fix returning citations in chat completion

---------

Co-authored-by: Johann Miller <22018973+johannkm@users.noreply.github.com>
2025-04-19 11:16:37 -07:00
Krish Dholakia
2508ca71cb
Handle fireworks ai tool calling response (#10130)
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly

Fixes https://github.com/BerriAI/litellm/issues/7209

* fix(utils.py): handle none type in message

* fix: fix model name in test

* fix(utils.py): fix validate check for openai messages

* fix: fix model returned

* fix(main.py): fix text completion routing

* test: update testing

* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Krish Dholakia
36308a31be
Gemini-2.5-flash - support reasoning cost calc + return reasoning content (#10141)
* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing

* build(model_prices_and_context_window.json): add gemini reasoning token pricing

* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini

allows accurate cost calc

* fix(utils.py): add reasoning token cost calc to generic cost calc

ensures gemini-2.5-flash cost calculation is accurate

* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'

* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests

allow controlling thinking effort for gemini-2.5-flash models

* test: update unit testing

* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response

* test: update model name

* fix: fix ruff check

* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object

* fix(vertex_and_google_ai_studio_gemini.py): fix translation
2025-04-19 09:20:52 -07:00
Krrish Dholakia
d726e0f34c test: update testing imports 2025-04-19 09:13:16 -07:00
Ishaan Jaff
d3e04eac7f
[Feat] Unified Responses API - Add Azure Responses API support (#10116)
* initial commit for azure responses api support

* update get complete url

* fixes for responses API

* working azure responses API

* working responses API

* test suite for responses API

* azure responses API test suite

* fix test with complete url

* fix test refactor

* test fix metadata checks

* fix code quality check
2025-04-17 16:47:59 -07:00
Krish Dholakia
c73a6a8d1e
Add new /vertex_ai/discovery route - enables calling AgentBuilder API routes (#10084)
* feat(llm_passthrough_endpoints.py): expose new `/vertex_ai/discovery/` endpoint

Allows calling vertex ai discovery endpoints via passthrough

 For agentbuilder api calls

* refactor(llm_passthrough_endpoints.py): use common _base_vertex_proxy_route

Prevents duplicate code

* feat(llm_passthrough_endpoints.py): add vertex endpoint specific passthrough handlers
2025-04-16 21:45:51 -07:00
Ishaan Jaff
6220f3e7b8
[Feat SSO] Add LiteLLM SCIM Integration for Team and User management (#10072)
* fix NewUser response type

* add scim router

* add v0 scim v2 endpoints

* working scim transformation

* use 1 file for types

* fix scim firstname and givenName storage

* working SCIMErrorResponse

* working team / group provisioning on SCIM

* add SCIMPatchOp

* move scim folder

* fix import scim_router

* fix dont auto create scim keys

* add auth on all scim endpoints

* add is_virtual_key_allowed_to_call_route

* fix allowed routes

* fix for key management

* fix allowed routes check

* clean up error message

* fix code check

* fix for route checks

* ui SCIM support

* add UI tab for SCIM

* fixes SCIM

* fixes for SCIM settings on ui

* scim settings

* clean up scim view

* add migration for allowed_routes in keys table

* refactor scim transform

* fix SCIM linting error

* fix code quality check

* fix ui linting

* test_scim_transformations.py
2025-04-16 19:21:47 -07:00
Krish Dholakia
c603680d2a
fix(stream_chunk_builder_utils.py): don't set index on modelresponse (#10063)
* fix(stream_chunk_builder_utils.py): don't set index on modelresponse

* test: update tests
2025-04-16 10:11:47 -07:00
Michael Leshchinsky
e19d05980c
Add litellm call id passing to Aim guardrails on pre and post-hooks calls (#10021)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 16s
Helm unit test / unit-test (push) Successful in 19s
* Add litellm_call_id passing to aim guardrails on pre and post-hooks

* Add test that ensures that pre_call_hook receives litellm call id when common_request_processing called
2025-04-16 07:41:28 -07:00
Krish Dholakia
fdfa1108a6
Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls (#10040)
* fix #9783: Retain schema field ordering for google gemini and vertex (#9828)

* test: update test

* refactor(groq.py): initial commit migrating groq to base_llm_http_handler

* fix(streaming_chunk_builder_utils.py): fix how tool content is combined

Fixes https://github.com/BerriAI/litellm/issues/10034

* fix(vertex_ai/common_utils.py): prevent infinite loop in helper function

* fix(groq/chat/transformation.py): handle groq streaming errors correctly

* fix(groq/chat/transformation.py): handle max_retries

---------

Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-15 22:29:25 -07:00
Krish Dholakia
1b9b745cae
Fix gcs pub sub logging with env var GCS_PROJECT_ID (#10042)
* fix(pub_sub.py): fix passing project id in pub sub call

Fixes issue where GCS_PUBSUB_PROJECT_ID was not being used

* test(test_pub_sub.py): add unit test to prevent future regressions

* test: fix test
2025-04-15 21:50:48 -07:00
Ishaan Jaff
bd88263b29
[Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions (#10029)
* stash changes

* emit cache read/write tokens to daily spend update

* emit cache read/write tokens on daily activity

* update types.ts

* docs prompt caching

* undo ui change

* fix activity metrics

* fix prompt caching metrics

* fix typed dict fields

* fix get_aggregated_daily_spend_update_transactions

* fix aggregating cache tokens

* test_cache_token_fields_aggregation

* daily_transaction

* add cache_creation_input_tokens and cache_read_input_tokens to LiteLLM_DailyUserSpend

* test_daily_spend_update_queue.py
2025-04-15 21:40:57 -07:00
Ishaan Jaff
d32d6fe03e
[UI] Bug Fix - Show created_at and updated_at for Users Page (#10033)
* add created_at and updated_at as fields for internal user table

* test_get_users_includes_timestamps
2025-04-15 21:15:44 -07:00
Krish Dholakia
d3e7a137ad
Revert "fix #9783: Retain schema field ordering for google gemini and vertex …" (#10038)
This reverts commit e3729f9855.
2025-04-15 19:21:33 -07:00
Adrian Lyjak
e3729f9855
fix #9783: Retain schema field ordering for google gemini and vertex (#9828) 2025-04-15 19:12:02 -07:00
Marc Abramowitz
837a6948d8
Fix typo: Entrata -> Entra in code (#9922)
* Fix typo: Entrata -> Entra

* Fix a few more
2025-04-15 17:31:18 -07:00
Krish Dholakia
6b5f093087
Revert "Fix case where only system messages are passed to Gemini (#9992)" (#10027)
This reverts commit 2afd922f8c.
2025-04-15 13:34:03 -07:00
Nolan Tremelling
2afd922f8c
Fix case where only system messages are passed to Gemini (#9992) 2025-04-15 13:30:49 -07:00
Ishaan Jaff
4f9bcd9b94
fix mock tests (#10003) 2025-04-14 22:09:22 -07:00
Krish Dholakia
8faf56922c
Fix azure tenant id check from env var + response_format check on api_version 2025+ (#9993)
* fix(azure/common_utils.py): check for azure tenant id, client id, client secret in env var

Fixes https://github.com/BerriAI/litellm/issues/9598#issuecomment-2801966027

* fix(azure/gpt_transformation.py): fix passing response_format to azure when api year = 2025

Fixes https://github.com/BerriAI/litellm/issues/9703

* test: monkeypatch azure api version in test

* test: update testing

* test: fix test

* test: update test

* docs(config_settings.md): document env vars
2025-04-14 22:02:35 -07:00
Ishaan Jaff
c1a642ce20
[UI] Allow setting prompt cache_control_injection_points (#10000)
* test_anthropic_cache_control_hook_system_message

* test_anthropic_cache_control_hook.py

* should_run_prompt_management_hooks

* fix should_run_prompt_management_hooks

* test_anthropic_cache_control_hook_specific_index

* fix test

* fix linting errors

* ChatCompletionCachedContent

* initial commit for cache control

* fixes ui design

* fix inserting cache_control_injection_points

* fix entering cache control points

* fixes for using cache control on ui + backend

* update cache control settings on edit model page

* fix init custom logger compatible class

* fix linting errors

* fix linting errors

* fix get_chat_completion_prompt
2025-04-14 21:17:42 -07:00
Ishaan Jaff
6cfa50d278
[Feat] Add support for cache_control_injection_points for Anthropic API, Bedrock API (#9996)
* test_anthropic_cache_control_hook_system_message

* test_anthropic_cache_control_hook.py

* should_run_prompt_management_hooks

* fix should_run_prompt_management_hooks

* test_anthropic_cache_control_hook_specific_index

* fix test

* fix linting errors

* ChatCompletionCachedContent
2025-04-14 20:50:13 -07:00
Ishaan Jaff
89dfb42697
[UI QA checklist] (#9957)
* fix typo on UI

* fix for edit user tab

* fix for user spend

* add /team/permissions_list to management routes

* fix auth check for team member permissions

* fix team endpoints test
2025-04-12 20:41:50 -07:00
Krish Dholakia
00e49380df
Litellm UI qa 04 12 2025 p1 (#9955)
* fix(model_info_view.tsx): cleanup text

* fix(key_management_endpoints.py): fix filtering litellm-dashboard keys for internal users

* fix(proxy_track_cost_callback.py): prevent flooding spend logs with admin endpoint errors

* test: add unit testing for logic

* test(test_auth_exception_handler.py): add more unit testing

* fix(router.py): correctly handle retrieving model info on get_model_group_info

fixes issue where model hub was showing None prices

* fix: fix linting errors
2025-04-12 19:30:48 -07:00
Dan Shaw
433075a8d9
fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 (#9943)
* fix(factory.py): correct indentation for message index increment in ollama_pt function

* test: add unit tests for ollama_pt function handling various message types
2025-04-12 09:50:40 -07:00
Krish Dholakia
421e0a3004
Litellm add managed files db (#9930)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* feat(managed_files.py): support reading / writing files in DB

* feat(managed_files.py): support deleting file from DB on delete

* test: update testing

* fix(spend_tracking_utils.py): ensure each file create request is logged correctly

* fix(managed_files.py): fix storing / returning managed file object from cache

* fix(files/main.py): pass litellm params to azure route

* test: fix test

* build: add new prisma migration

* build: bump requirements

* test: add more testing

* refactor: cleanup post merge w/ main

* fix: fix code qa errors
2025-04-12 08:24:46 -07:00
Krish Dholakia
b9f01c9f5b
fix(databricks/common_utils.py): fix custom endpoint check (#9925)
* fix(databricks/common_utils.py): fix custom endpoint check

Fixes https://github.com/BerriAI/litellm/issues/9915

* fix(common_utils.py): add unit test to ensure custom_endpoint=False is handled correctly

Fixes https://github.com/BerriAI/litellm/issues/9915
2025-04-11 23:20:49 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files (#9924)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* fix(files/main.py): pass litellm params to azure route

* test: fix test
2025-04-11 21:48:27 -07:00
Ishaan Jaff
f9ce754817
[Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning (#9923)
* add supports_reasoning for xai models

* add "supports_reasoning": true for o1 series models

* add supports_reasoning util

* add litellm.supports_reasoning

* add supports reasoning for claude 3-7 models

* add deepseek as supports reasoning

* test_supports_reasoning

* add supports reasoning to model group info

* add supports_reasoning

* docs supports reasoning

* fix supports_reasoning test

* "supports_reasoning": false,

* fix test

* supports_reasoning
2025-04-11 17:56:04 -07:00
Ishaan Jaff
91c0a794b9
[Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions (#9919)
* add team_member_permissions

* add GetTeamMemberPermissionsRequest types

* crud endpoint for team member permissions

* test team member permissions CRUD

* fix GetTeamMemberPermissionsRequest
2025-04-11 17:15:16 -07:00
Ishaan Jaff
8b1d2d6956
[Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams (#9918)
* endpoint for updating default team settings on ui

* add GET default team settings endpoint

* ui expose default team settings on UI

* update to use DefaultTeamSSOParams

* DefaultTeamSSOParams

* fix DefaultTeamSSOParams

* docs team management

* test_update_default_team_settings
2025-04-11 14:07:10 -07:00
Krish Dholakia
0415f1205e
Litellm dev 04 10 2025 p3 (#9903)
* feat(managed_files.py): encode file type in unified file id

simplify calling gemini models

* fix(common_utils.py): fix extracting file type from unified file id

* fix(litellm_logging.py): create standard logging payload for create file call

* fix: fix linting error
2025-04-11 09:29:42 -07:00