Commit graph

619 commits

Author SHA1 Message Date
Ishaan Jaff
2e58e47b43
[Bug Fix] Add Cost Tracking for gpt-image-1 when quality is unspecified (#10247)
* TestOpenAIGPTImage1

* fixes for cost calc

* fix ImageGenerationRequestQuality.MEDIUM
2025-04-23 15:16:40 -07:00
Krish Dholakia
217681eb5e
Litellm dev 04 22 2025 p1 (#10206)
* fix(openai.py): initial commit adding generic event type for openai responses api streaming

Ensures handling for undocumented event types - e.g. "response.reasoning_summary_part.added"

* fix(transformation.py): handle unknown openai response type

* fix(datadog_llm_observability.py): handle dict[str, any] -> dict[str, str] conversion

Fixes https://github.com/BerriAI/litellm/issues/9494

* test: add more unit testing

* test: add unit test

* fix(common_utils.py): fix message with content list

* test: update testing
2025-04-22 23:58:43 -07:00
Ishaan Jaff
868cdd0226
[Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI (#10205)
* add transform_delete_response_api_request to base responses config

* add transform_delete_response_api_request

* add delete_response_api_handler

* fixes for deleting responses, response API

* add adelete_responses

* add async test_basic_openai_responses_delete_endpoint

* test_basic_openai_responses_delete_endpoint

* working delete for streaming on responses API

* fixes azure transformation

* TestAnthropicResponsesAPITest

* fix code check

* fix linting

* fixes for get_complete_url

* test_basic_openai_responses_streaming_delete_endpoint

* streaming fixes
2025-04-22 18:27:03 -07:00
Krish Dholakia
66680c421d
Add global filtering to Users tab (#10195)
* style(internal_user_endpoints.py): add response model to `/user/list` endpoint

make sure we maintain consistent response spec

* fix(key_management_endpoints.py): return 'created_at' and 'updated_at' on `/key/generate`

Show 'created_at' on UI when key created

* test(test_keys.py): add e2e test to ensure created at is always returned

* fix(view_users.tsx): support global search by user email

allows easier search

* test(search_users.spec.ts): add e2e test ensure user search works on admin ui

* fix(view_users.tsx): support filtering user by role and user id

More powerful filtering on internal users table

* fix(view_users.tsx): allow filtering users by team

* style(view_users.tsx): cleanup ui to show filters in consistent style

* refactor(view_users.tsx): cleanup to just use 1 variable for the data

* fix(view_users.tsx): cleanup use effect hooks

* fix(internal_user_endpoints.py): fix check to pass testing

* test: update tests

* test: update tests

* Revert "test: update tests"

This reverts commit 6553eeb232.

* fix(view_userts.tsx): add back in 'previous' and 'next' tabs for pagination
2025-04-22 13:59:43 -07:00
Ishaan Jaff
0c2f705417
[Feat] Add Responses API - Routing Affinity logic for sessions (#10193)
* test for test_responses_api_routing_with_previous_response_id

* test_responses_api_routing_with_previous_response_id

* add ResponsesApiDeploymentCheck

* ResponsesApiDeploymentCheck

* ResponsesApiDeploymentCheck

* fix ResponsesApiDeploymentCheck

* test_responses_api_routing_with_previous_response_id

* ResponsesApiDeploymentCheck

* test_responses_api_deployment_check.py

* docs routing affinity

* simplify ResponsesApiDeploymentCheck

* test response id

* fix code quality check
2025-04-21 20:00:27 -07:00
Ishaan Jaff
4eac0f64f3
[Feat] Pass through endpoints - ensure PassthroughStandardLoggingPayload is logged and contains method, url, request/response body (#10194)
* ensure passthrough_logging_payload is filled in kwargs

* test_assistants_passthrough_logging

* test_assistants_passthrough_logging

* test_assistants_passthrough_logging

* test_threads_passthrough_logging

* test _init_kwargs_for_pass_through_endpoint

* _init_kwargs_for_pass_through_endpoint
2025-04-21 19:46:22 -07:00
Krish Dholakia
55a17730fb
fix(transformation.py): pass back in gemini thinking content to api (#10173)
Ensures thinking content always returned
2025-04-19 18:03:05 -07:00
Ishaan Jaff
653570824a
Bug Fix - Responses API, Loosen restrictions on allowed environments for computer use tool (#10168)
* loosen allowed types on ComputerToolParam

* test_basic_computer_use_preview_tool_call
2025-04-19 14:40:32 -07:00
Krish Dholakia
f08a4e3c06
Support 'file' message type for VLLM video url's + Anthropic redacted message thinking support (#10129)
* feat(hosted_vllm/chat/transformation.py): support calling vllm video url with openai 'file' message type

allows switching between gemini/vllm easily

* [WIP] redacted thinking tests (#9044)

* WIP: redacted thinking tests

* test: add test for redacted thinking in assistant message

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix(anthropic/chat/transformation.py): support redacted thinking block on anthropic completion

Fixes https://github.com/BerriAI/litellm/issues/9058

* fix(anthropic/chat/handler.py): transform anthropic redacted messages on streaming

Fixes https://github.com/BerriAI/litellm/issues/9058

* fix(bedrock/): support redacted text on streaming + non-streaming

Fixes https://github.com/BerriAI/litellm/issues/9058

* feat(litellm_proxy/chat/transformation.py): support 'reasoning_effort' param for proxy

allows using reasoning effort with thinking models on proxy

* test: update tests

* fix(utils.py): fix linting error

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(anthropic/chat/transformation.py): fix returning citations in chat completion

---------

Co-authored-by: Johann Miller <22018973+johannkm@users.noreply.github.com>
2025-04-19 11:16:37 -07:00
Krish Dholakia
2508ca71cb
Handle fireworks ai tool calling response (#10130)
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly

Fixes https://github.com/BerriAI/litellm/issues/7209

* fix(utils.py): handle none type in message

* fix: fix model name in test

* fix(utils.py): fix validate check for openai messages

* fix: fix model returned

* fix(main.py): fix text completion routing

* test: update testing

* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Krish Dholakia
36308a31be
Gemini-2.5-flash - support reasoning cost calc + return reasoning content (#10141)
* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing

* build(model_prices_and_context_window.json): add gemini reasoning token pricing

* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini

allows accurate cost calc

* fix(utils.py): add reasoning token cost calc to generic cost calc

ensures gemini-2.5-flash cost calculation is accurate

* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'

* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests

allow controlling thinking effort for gemini-2.5-flash models

* test: update unit testing

* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response

* test: update model name

* fix: fix ruff check

* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object

* fix(vertex_and_google_ai_studio_gemini.py): fix translation
2025-04-19 09:20:52 -07:00
Krish Dholakia
ef6ac42658
Litellm dev 04 18 2025 p2 (#10157)
* fix(proxy/_types.py): allow internal user to call api playground

* fix(new_usage.tsx): cleanup tag based usage - only show for proxy admin

not clear what tags internal user should be allowed to see

* fix(team_endpoints.py): allow internal user view spend for teams they belong to

* fix(team_endpoints.py): return team alias on `/team/daily/activity` API

allows displaying team alias on ui

* fix: fix linting error

* fix(entity_usage.tsx): allow viewing top keys by team

* fix(entity_usage.tsx): show alias, if available in breakdown

allows entity alias to be easily displayed

* Show usage by key (on all up, team, and tag usage dashboards)  (#10152)

* fix(entity_usage.tsx): allow user to select team in team usage tab

* fix(new_usage.tsx): load all tags for filtering

* fix(tag_management_endpoints.py): return dynamic tags from db on `/tag/list`

* fix(litellm_pre_call_utils.py): support x-litellm-tags even if tag based routing not enabled

* fix(new_usage.tsx): show breakdown of usage by api key on dashboard

helpful when looking at spend by team

* fix(networking.tsx): exclude litellm-dashboard team id's from calls

adds noisy ui tokens to key activity

* fix(new_usage.tsx): allow user to see activity by key on main tab

* feat(internal_user_endpoints.py): refactor to use common_daily_activity function

reuses same logic across teams/keys/tags

Allows returning team_alias in api_keys consistently

* fix(leftnav.tsx): swap old usage with new usage tab

* fix(entity_usage.tsx): show breakdown of teams in daily spend chart

* style(new_usage.tsx): show global usage tab if user is admin / has admin view

* fix(new_usage.tsx): add disclaimer for new usage dashboard

* fix(new_usage.tsx): fix linting error

* Allow filtering usage dashboard by team + tag (#10150)

* fix(entity_usage.tsx): allow user to select team in team usage tab

* fix(new_usage.tsx): load all tags for filtering

* fix(tag_management_endpoints.py): return dynamic tags from db on `/tag/list`

* fix(litellm_pre_call_utils.py): support x-litellm-tags even if tag based routing not enabled

* fix: fix linting error
2025-04-19 07:32:23 -07:00
Ishaan Jaff
3d5022bd79
[Feat] Support for all litellm providers on Responses API (works with Codex) - Anthropic, Bedrock API, VertexAI, Ollama (#10132)
* transform request

* basic handler for LiteLLMCompletionTransformationHandler

* complete transform litellm to responses api

* fixes to test

* fix stream=True

* fix streaming iterator

* fixes for transformation

* fixes for anthropic codex support

* fix pass response_api_optional_params

* test anthropic responses api tools

* update responses types

* working codex with litellm

* add session handler

* fixes streaming iterator

* fix handler

* add litellm codex example

* fix code quality

* test fix

* docs litellm codex

* litellm codexdoc

* docs openai codex with litellm

* docs litellm openai codex

* litellm codex

* linting fixes for transforming responses API

* fix import error

* fix responses api test

* add sync iterator support for responses api
2025-04-18 19:53:59 -07:00
Ishaan Jaff
6220f3e7b8
[Feat SSO] Add LiteLLM SCIM Integration for Team and User management (#10072)
* fix NewUser response type

* add scim router

* add v0 scim v2 endpoints

* working scim transformation

* use 1 file for types

* fix scim firstname and givenName storage

* working SCIMErrorResponse

* working team / group provisioning on SCIM

* add SCIMPatchOp

* move scim folder

* fix import scim_router

* fix dont auto create scim keys

* add auth on all scim endpoints

* add is_virtual_key_allowed_to_call_route

* fix allowed routes

* fix for key management

* fix allowed routes check

* clean up error message

* fix code check

* fix for route checks

* ui SCIM support

* add UI tab for SCIM

* fixes SCIM

* fixes for SCIM settings on ui

* scim settings

* clean up scim view

* add migration for allowed_routes in keys table

* refactor scim transform

* fix SCIM linting error

* fix code quality check

* fix ui linting

* test_scim_transformations.py
2025-04-16 19:21:47 -07:00
Krish Dholakia
c0d7e9f16d
Add new /tag/daily/activity endpoint + Add tag dashboard to UI (#10073)
Some checks failed
Read Version from pyproject.toml / read-version (push) Successful in 15s
Helm unit test / unit-test (push) Successful in 24s
Publish Prisma Migrations / publish-migrations (push) Failing after 1m47s
* feat: initial commit adding daily tag spend table to db

* feat(db_spend_update_writer.py): correctly log tag spend transactions

* build(schema.prisma): add new tag table to root

* build: add new migration file

* feat(common_daily_activity.py): add `/tag/daily/activity` API endpoint

allows viewing daily spend by tag

* feat(tag_management_endpoints.py): support comma separated list of tags + tag breakdown metric

allows querying multiple tags + knowing what tags are driving spend

* feat(entity_usage.tsx): initial commit adding tag based usage to litellm dashboard

brings back tag based usage tracking to UI at 1m+ spend logs

* feat(entity_usage.tsx): add top api key view to ui

* feat(entity_usage.tsx): add tag table to ui

* feat(entity_usage.tsx): allow filtering by tag

* refactor(entity_usage.tsx): reorder components

* build(ui/): fix linting error

* fix: fix ruff checks

* fix(schema.prisma): drop uniqueness requirement on tag

allows dailytagspend to have multiple rows with the same tag

* build(schema.prisma): drop uniqueness requirement on tag in dailytagspend

allows tag agg. view to work on multiple rows with same tag

* build(schema.prisma): drop tag uniqueness requirement
2025-04-16 15:24:44 -07:00
Krish Dholakia
d8a1071bc4
Add aggregate spend by tag (#10071)
* feat: initial commit adding daily tag spend table to db

* feat(db_spend_update_writer.py): correctly log tag spend transactions

* build(schema.prisma): add new tag table to root

* build: add new migration file
2025-04-16 12:26:21 -07:00
Krish Dholakia
47e811d6ce
fix(llm_http_handler.py): fix fake streaming (#10061)
* fix(llm_http_handler.py): fix fake streaming

allows groq to work with llm_http_handler

* fix(groq.py): migrate groq to openai like config

ensures json mode handling works correctly
2025-04-16 10:15:11 -07:00
Krish Dholakia
9b77559ccf
Add aggregate team based usage logging (#10039)
* feat(schema.prisma): initial commit adding aggregate table for team spend

allows team spend to be visible at 1m+ logs

* feat(db_spend_update_writer.py): support logging aggregate team spend

allows usage dashboard to work at 1m+ logs

* feat(litellm-proxy-extras/): add new migration file

* fix(db_spend_update_writer.py): fix return type

* build: bump requirements

* fix: fix ruff error
2025-04-15 20:58:48 -07:00
Ishaan Jaff
c1a642ce20
[UI] Allow setting prompt cache_control_injection_points (#10000)
* test_anthropic_cache_control_hook_system_message

* test_anthropic_cache_control_hook.py

* should_run_prompt_management_hooks

* fix should_run_prompt_management_hooks

* test_anthropic_cache_control_hook_specific_index

* fix test

* fix linting errors

* ChatCompletionCachedContent

* initial commit for cache control

* fixes ui design

* fix inserting cache_control_injection_points

* fix entering cache control points

* fixes for using cache control on ui + backend

* update cache control settings on edit model page

* fix init custom logger compatible class

* fix linting errors

* fix linting errors

* fix get_chat_completion_prompt
2025-04-14 21:17:42 -07:00
Ishaan Jaff
6cfa50d278
[Feat] Add support for cache_control_injection_points for Anthropic API, Bedrock API (#9996)
* test_anthropic_cache_control_hook_system_message

* test_anthropic_cache_control_hook.py

* should_run_prompt_management_hooks

* fix should_run_prompt_management_hooks

* test_anthropic_cache_control_hook_specific_index

* fix test

* fix linting errors

* ChatCompletionCachedContent
2025-04-14 20:50:13 -07:00
Krish Dholakia
2ed593e052
Updated cohere v2 passthrough (#9997)
* Add cohere `/v2/chat` pass-through cost tracking support (#8235)

* feat(cohere_passthrough_handler.py): initial working commit with cohere passthrough cost tracking

* fix(v2_transformation.py): support cohere /v2/chat endpoint

* fix: fix linting errors

* fix: fix import

* fix(v2_transformation.py): fix linting error

* test: handle openai exception change
2025-04-14 19:51:01 -07:00
Krish Dholakia
421e0a3004
Litellm add managed files db (#9930)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* feat(managed_files.py): support reading / writing files in DB

* feat(managed_files.py): support deleting file from DB on delete

* test: update testing

* fix(spend_tracking_utils.py): ensure each file create request is logged correctly

* fix(managed_files.py): fix storing / returning managed file object from cache

* fix(files/main.py): pass litellm params to azure route

* test: fix test

* build: add new prisma migration

* build: bump requirements

* test: add more testing

* refactor: cleanup post merge w/ main

* fix: fix code qa errors
2025-04-12 08:24:46 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files (#9924)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* fix(files/main.py): pass litellm params to azure route

* test: fix test
2025-04-11 21:48:27 -07:00
Ishaan Jaff
f9ce754817
[Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning (#9923)
* add supports_reasoning for xai models

* add "supports_reasoning": true for o1 series models

* add supports_reasoning util

* add litellm.supports_reasoning

* add supports reasoning for claude 3-7 models

* add deepseek as supports reasoning

* test_supports_reasoning

* add supports reasoning to model group info

* add supports_reasoning

* docs supports reasoning

* fix supports_reasoning test

* "supports_reasoning": false,

* fix test

* supports_reasoning
2025-04-11 17:56:04 -07:00
Ishaan Jaff
91c0a794b9
[Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions (#9919)
* add team_member_permissions

* add GetTeamMemberPermissionsRequest types

* crud endpoint for team member permissions

* test team member permissions CRUD

* fix GetTeamMemberPermissionsRequest
2025-04-11 17:15:16 -07:00
Ishaan Jaff
8b1d2d6956
[Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams (#9918)
* endpoint for updating default team settings on ui

* add GET default team settings endpoint

* ui expose default team settings on UI

* update to use DefaultTeamSSOParams

* DefaultTeamSSOParams

* fix DefaultTeamSSOParams

* docs team management

* test_update_default_team_settings
2025-04-11 14:07:10 -07:00
Krish Dholakia
0415f1205e
Litellm dev 04 10 2025 p3 (#9903)
* feat(managed_files.py): encode file type in unified file id

simplify calling gemini models

* fix(common_utils.py): fix extracting file type from unified file id

* fix(litellm_logging.py): create standard logging payload for create file call

* fix: fix linting error
2025-04-11 09:29:42 -07:00
Krish Dholakia
9f27e8363f
Realtime API: Support 'base_model' cost tracking + show response in spend logs (if enabled) (#9897)
* refactor(litellm_logging.py): refactor realtime cost tracking to use common code as rest

Ensures basic features like base model just work

* feat(realtime/): support 'base_model' cost tracking on realtime api

Fixes issue where base model was not working on realtime

* fix: fix ruff linting error

* test: fix test
2025-04-10 21:24:45 -07:00
Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking (#9893)
* test: initial commit fixing gemini logprobs

Fixes https://github.com/BerriAI/litellm/issues/9888

* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change

Fixes https://github.com/BerriAI/litellm/issues/8890

* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9814

* test: add cost calculation unit testing

* test: fix test

* test: update test
2025-04-10 21:23:55 -07:00
Krish Dholakia
0dbd663877
fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)
* fix(cost_calculator.py): handle custom pricing at deployment level for router

* test: add unit tests

* fix(router.py): show custom pricing on UI

check correct model str

* fix: fix linting error

* docs(custom_pricing.md): clarify custom pricing for proxy

Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740

* test: update code qa test

* fix: cleanup traceback

* fix: handle litellm param custom pricing

* test: update test

* fix(cost_calculator.py): add router model id to list of potential model names

* fix(cost_calculator.py): fix router model id check

* fix: router.py - maintain older model registry approach

* fix: fix ruff check

* fix(router.py): router get deployment info

add custom values to mapped dict

* test: update test

* fix(utils.py): update only if value is non-null

* test: add unit test
2025-04-09 22:13:10 -07:00
Krish Dholakia
0c5b4aa96d
feat(realtime/): add token tracking + log usage object in spend logs … (#9843)
* feat(realtime/): add token tracking + log usage object in spend logs metadata

* test: fix test

* test: update tests

* test: update testing

* test: update test

* test: update test

* test: update test

* test: update test

* test: update tesdt

* test: update test
2025-04-09 22:11:00 -07:00
Ishaan Jaff
1359e6d7a6
[SSO] Connect LiteLLM to Azure Entra ID Enterprise Application (#9872)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep

* working graph api assignment

* test msft sso handler openid

* testing for msft group assignment

* fix debug graph api sso flow

* fix linting errors

* add_user_to_teams_from_sso_response

* ui sso fix team assignments

* linting fix _get_group_ids_from_graph_api_response

* add MicrosoftServicePrincipalTeam

* create_litellm_teams_from_service_principal_team_ids

* create_litellm_teams_from_service_principal_team_ids

* docs MICROSOFT_SERVICE_PRINCIPAL_ID

* fix linting errors
2025-04-09 20:26:59 -07:00
Krish Dholakia
ac4f32fb1e
Cost tracking for gemini-2.5-pro (#9837)
* build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing

Closes https://github.com/BerriAI/litellm/issues/9829

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro

* build(model_prices_and_context_window.json): add gemini 200k+ pricing

* feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens

Fixes https://github.com/BerriAI/litellm/issues/9807

* build: test dockerfile change

* build: revert apk change

* ci(config.yml): pip install wheel

* ci: test problematic package first

* ci(config.yml): pip install only binary

* ci: try more things

* ci: test different ml_dtypes version

* ci(config.yml): check ml_dtypes==0.4.0

* ci: test

* ci: cleanup config.yml

* ci: specify ml dtypes in requirements.txt

* ci: remove redisvl depedency (temporary)

* fix: fix linting errors

* test: update test

* test: fix test
2025-04-09 18:48:43 -07:00
Ishaan Jaff
4c1bb74c3d
[Feat] - SSO - Use MSFT Graph API to assign users to teams (#9865)
* refactor SSO handler

* render sso JWT on ui

* docs debug sso

* fix sso login flow use await

* fix ui sso debug JWT

* test ui sso

* remove redis vl

* fix redisvl==0.5.1

* fix ml dtypes

* fix redisvl

* fix redis vl

* fix debug_sso_callback

* fix linting error

* fix redis semantic caching dep

* working graph api assignment

* test msft sso handler openid

* testing for msft group assignment

* fix debug graph api sso flow

* fix linting errors

* add_user_to_teams_from_sso_response

* fix linting error
2025-04-09 18:26:43 -07:00
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support (#9781)
* test: add initial e2e test

* fix(vertex_ai/files): initial commit adding sync file create support

* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint

* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint

* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload

* test: working e2e jsonl call

* test: unit testing for jsonl file creation

* fix(vertex_ai/transformation.py): reset file pointer after read

allow multiple reads on same file object

* fix: fix linting errors

* fix: fix ruff linting errors

* fix: fix import

* fix: fix linting error

* fix: fix linting error

* fix(vertex_ai/files/transformation.py): fix linting error

* test: update test

* test: update tests

* fix: fix linting errors

* fix: fix test

* fix: fix linting error
2025-04-09 14:01:48 -07:00
qvalentin
93532e00db
feat: add enterpriseWebSearch tool for vertex-ai (#9856) 2025-04-09 13:17:48 -07:00
Ishaan Jaff
ff3a6830a4
[Feat] LiteLLM Tag/Policy Management (#9813)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 15s
Helm unit test / unit-test (push) Successful in 21s
* rendering tags on UI

* use /models for building tags

* CRUD endpoints for Tag management

* fix tag management

* working api for LIST tags

* working tag management

* refactor UI components

* fixes ui tag management

* clean up ui tag management

* fix tag management ui

* fix show allowed llms

* e2e tag controls

* stash change for rendering tags on UI

* ui working tag selector on Test Key page

* fixes for tag management

* clean up tag info

* fix code quality

* test for tag management

* ui clarify what tag routing is
2025-04-07 21:54:24 -07:00
Krish Dholakia
fcf17d114f
Litellm dev 04 05 2025 p2 (#9774)
* test: move test to just checking async

* fix(transformation.py): handle function call with no schema

* fix(utils.py): handle pydantic base model in message tool calls

Fix https://github.com/BerriAI/litellm/issues/9321

* fix(vertex_and_google_ai_studio.py): handle tools=[]

Fixes https://github.com/BerriAI/litellm/issues/9080

* test: remove max token restriction

* test: fix basic test

* fix(get_supported_openai_params.py): fix check

* fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0

* fix: fix test

* fix: parse out empty dictionary on dbrx streaming + tool calls

* fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param

* fix: fix ruff check

'

* fix: handle no strict in function

* fix: revert bedrock change - handle in separate PR
2025-04-07 21:02:52 -07:00
Krish Dholakia
8e3c7b2de0
fix(vertex_ai.py): move to only passing in accepted keys by vertex ai response schema (#8992)
* fix(vertex_ai.py): common_utils.py

move to only passing in accepted keys by vertex ai

prevent json schema compatible keys like $id, and $comment from causing vertex ai openapi calls to fail

* fix(test_vertex.py): add testing to ensure only accepted schema params passed in

* fix(common_utils.py): fix linting error

* test: update test

* test: accept function
2025-04-07 18:07:01 -07:00
Krish Dholakia
4a128cfd64
Realtime API Cost tracking (#9795)
* fix(proxy_server.py): log realtime calls to spendlogs

Fixes https://github.com/BerriAI/litellm/issues/8410

* feat(realtime/): OpenAI Realtime API cost tracking

Closes https://github.com/BerriAI/litellm/issues/8410

* test: add unit testing for coverage

* test: add more unit testing

* fix: handle edge cases
2025-04-07 16:43:12 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Ishaan Jaff
8c3670e192
Merge pull request #9719 from BerriAI/litellm_metrics_pod_lock_manager
[Reliability] Emit operational metrics for new DB Transaction architecture
2025-04-04 21:12:06 -07:00
Krish Dholakia
af42e5855f
Gemini image generation output support (#9646)
* fix(gemini/transformation.py): make GET request to get uri details, if cannot be inferred

* fix: fix linting errors

* Revert "fix: fix linting errors"

This reverts commit 926a5a527f.

* fix(gemini/transformation.py): modalities param support

Partially resolves https://github.com/BerriAI/litellm/issues/9237

* feat(google_ai_studio/): add image generation support

Closes https://github.com/BerriAI/litellm/issues/9237

* fix: fix types

* fix: fix ruff check
2025-04-04 20:37:48 -07:00
Krish Dholakia
c555c15ad7
fix(router.py): support reusable credentials via passthrough router (#9758)
* fix(router.py): support reusable credentials via passthrough router

enables reusable vertex credentials to be used in passthrough

* test: fix test

* test(test_router_adding_deployments.py): add unit testing
2025-04-04 18:40:14 -07:00
Ishaan Jaff
901d6fe7b7 add operational metrics for pod lock manager v2 arch 2025-04-04 16:41:07 -07:00
Ishaan Jaff
1cdee4b331 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-04 16:33:16 -07:00
Krrish Dholakia
ad90871ad6 fix(factory.py): don't pass cache control if not set
bedrock invoke does not support this
2025-04-04 12:37:34 -07:00
Albert Örwall
bd5a8d582b
Fix prompt caching for Anthropic tool calls (#9706)
* Add prompt cache support to Anhtropic tool calls

* Fix linting issue and add test
2025-04-03 20:19:21 -07:00
Krish Dholakia
6dda1ba6dd
LiteLLM Minor Fixes & Improvements (04/02/2025) (#9725)
* Add date picker to usage tab + Add reasoning_content token tracking across all providers on streaming (#9722)

* feat(new_usage.tsx): add date picker for new usage tab

allow user to look back on their usage data

* feat(anthropic/chat/transformation.py): report reasoning tokens in completion token details

allows usage tracking on how many reasoning tokens are actually being used

* feat(streaming_chunk_builder.py): return reasoning_tokens in anthropic/openai streaming response

allows tracking reasoning_token usage across providers

* Fix update team metadata + fix bulk adding models on Ui  (#9721)

* fix(handle_add_model_submit.tsx): fix bulk adding models

* fix(team_info.tsx): fix team metadata update

Fixes https://github.com/BerriAI/litellm/issues/9689

* (v0) Unified file id - allow calling multiple providers with same file id (#9718)

* feat(files_endpoints.py): initial commit adding 'target_model_names' support

allow developer to specify all the models they want to call with the file

* feat(files_endpoints.py): return unified files endpoint

* test(test_files_endpoints.py): add validation test - if invalid purpose submitted

* feat: more updates

* feat: initial working commit of unified file id translation

* fix: additional fixes

* fix(router.py): remove model replace logic in jsonl on acreate_file

enables file upload to work for chat completion requests as well

* fix(files_endpoints.py): remove whitespace around model name

* fix(azure/handler.py): return acreate_file with correct response type

* fix: fix linting errors

* test: fix mock test to run on github actions

* fix: fix ruff errors

* fix: fix file too large error

* fix(utils.py): remove redundant var

* test: modify test to work on github actions

* test: update tests

* test: more debug logs to understand ci/cd issue

* test: fix test for respx

* test: skip mock respx test

fails on ci/cd - not clear why

* fix: fix ruff check

* fix: fix test

* fix(model_connection_test.tsx): fix linting error

* test: update unit tests
2025-04-03 11:48:52 -07:00
Ishaan Jaff
dd2d1dc2f4 Merge branch 'main' into litellm_metrics_pod_lock_manager 2025-04-02 21:35:55 -07:00