Commit graph

13440 commits

Author SHA1 Message Date
Ishaan Jaff
0b77f93221 can_user_call_model 2025-04-19 15:50:47 -07:00
Ishaan Jaff
653570824a
Bug Fix - Responses API, Loosen restrictions on allowed environments for computer use tool (#10168)
* loosen allowed types on ComputerToolParam

* test_basic_computer_use_preview_tool_call
2025-04-19 14:40:32 -07:00
Ishaan Jaff
c80e984d7e ui new build 2025-04-19 14:19:33 -07:00
Ishaan Jaff
b0024bb229
[Bug Fix] Spend Tracking Bug Fix, don't modify in memory default litellm params (#10167)
* _update_kwargs_with_default_litellm_params

* test_update_kwargs_does_not_mutate_defaults_and_merges_metadata
2025-04-19 14:13:59 -07:00
Ishaan Jaff
0717369ae6
[Feat] Expose Responses API on LiteLLM UI Test Key Page (#10166)
* add /responses API on UI

* add makeOpenAIResponsesRequest

* add makeOpenAIResponsesRequest

* fix add responses API on UI

* fix endpoint selector

* responses API render chunks on litellm chat ui

* fixes to streaming iterator

* fix render responses completed events

* fixes for MockResponsesAPIStreamingIterator

* transform_responses_api_request_to_chat_completion_request

* fix for responses API

* test_basic_openai_responses_api_streaming

* fix base responses api tests
2025-04-19 13:18:54 -07:00
Krish Dholakia
03b5399f86
test(utils.py): handle scenario where text tokens + reasoning tokens … (#10165)
* test(utils.py): handle scenario where text tokens + reasoning tokens set, but reasoning tokens not charged separately

Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2051555332

* fix(vertex_and_google_ai_studio.py): only set content if non-empty str
2025-04-19 12:32:38 -07:00
Ong Khai Wei
99db1b7690
to get API key from environment variable of WATSONX_APIKEY (#10131) 2025-04-19 11:25:14 -07:00
Classic298
173ef01ef6
Update model_prices_and_context_window_backup.json (#10122)
* Update model_prices_and_context_window_backup.json

* Update model_prices_and_context_window_backup.json
2025-04-19 11:24:52 -07:00
Krish Dholakia
5c929317cd
fix(triton/completion/transformation.py): remove bad_words / stop wor… (#10163)
* fix(triton/completion/transformation.py): remove bad_words / stop words from triton call

parameter 'bad_words' has invalid type. It should be either 'int', 'bool', or 'string'.

* fix(proxy_track_cost_callback.py): add debug logging for track cost callback error
2025-04-19 11:23:37 -07:00
Krish Dholakia
f08a4e3c06
Support 'file' message type for VLLM video url's + Anthropic redacted message thinking support (#10129)
* feat(hosted_vllm/chat/transformation.py): support calling vllm video url with openai 'file' message type

allows switching between gemini/vllm easily

* [WIP] redacted thinking tests (#9044)

* WIP: redacted thinking tests

* test: add test for redacted thinking in assistant message

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix(anthropic/chat/transformation.py): support redacted thinking block on anthropic completion

Fixes https://github.com/BerriAI/litellm/issues/9058

* fix(anthropic/chat/handler.py): transform anthropic redacted messages on streaming

Fixes https://github.com/BerriAI/litellm/issues/9058

* fix(bedrock/): support redacted text on streaming + non-streaming

Fixes https://github.com/BerriAI/litellm/issues/9058

* feat(litellm_proxy/chat/transformation.py): support 'reasoning_effort' param for proxy

allows using reasoning effort with thinking models on proxy

* test: update tests

* fix(utils.py): fix linting error

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(anthropic/chat/transformation.py): fix returning citations in chat completion

---------

Co-authored-by: Johann Miller <22018973+johannkm@users.noreply.github.com>
2025-04-19 11:16:37 -07:00
Ishaan Jaff
3c463f6715 test fix - output_cost_per_reasoning_token was added to model cost map 2025-04-19 10:02:25 -07:00
Krish Dholakia
2508ca71cb
Handle fireworks ai tool calling response (#10130)
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly

Fixes https://github.com/BerriAI/litellm/issues/7209

* fix(utils.py): handle none type in message

* fix: fix model name in test

* fix(utils.py): fix validate check for openai messages

* fix: fix model returned

* fix(main.py): fix text completion routing

* test: update testing

* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Krish Dholakia
36308a31be
Gemini-2.5-flash - support reasoning cost calc + return reasoning content (#10141)
* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing

* build(model_prices_and_context_window.json): add gemini reasoning token pricing

* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini

allows accurate cost calc

* fix(utils.py): add reasoning token cost calc to generic cost calc

ensures gemini-2.5-flash cost calculation is accurate

* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'

* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests

allow controlling thinking effort for gemini-2.5-flash models

* test: update unit testing

* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response

* test: update model name

* fix: fix ruff check

* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object

* fix(vertex_and_google_ai_studio_gemini.py): fix translation
2025-04-19 09:20:52 -07:00
Krrish Dholakia
ba1b552e8b fix(common_daily_activity.py): fix python 3_8 error 2025-04-19 08:39:19 -07:00
Ishaan Jaff
49759d5678 fix get_azure_client 2025-04-19 08:33:26 -07:00
Krrish Dholakia
dee5182fc8 fix: fix linting error 2025-04-19 08:04:56 -07:00
Ishaan Jaff
a62805f98f fixes for assistants API tests 2025-04-19 07:59:53 -07:00
Krish Dholakia
ef6ac42658
Litellm dev 04 18 2025 p2 (#10157)
* fix(proxy/_types.py): allow internal user to call api playground

* fix(new_usage.tsx): cleanup tag based usage - only show for proxy admin

not clear what tags internal user should be allowed to see

* fix(team_endpoints.py): allow internal user view spend for teams they belong to

* fix(team_endpoints.py): return team alias on `/team/daily/activity` API

allows displaying team alias on ui

* fix: fix linting error

* fix(entity_usage.tsx): allow viewing top keys by team

* fix(entity_usage.tsx): show alias, if available in breakdown

allows entity alias to be easily displayed

* Show usage by key (on all up, team, and tag usage dashboards)  (#10152)

* fix(entity_usage.tsx): allow user to select team in team usage tab

* fix(new_usage.tsx): load all tags for filtering

* fix(tag_management_endpoints.py): return dynamic tags from db on `/tag/list`

* fix(litellm_pre_call_utils.py): support x-litellm-tags even if tag based routing not enabled

* fix(new_usage.tsx): show breakdown of usage by api key on dashboard

helpful when looking at spend by team

* fix(networking.tsx): exclude litellm-dashboard team id's from calls

adds noisy ui tokens to key activity

* fix(new_usage.tsx): allow user to see activity by key on main tab

* feat(internal_user_endpoints.py): refactor to use common_daily_activity function

reuses same logic across teams/keys/tags

Allows returning team_alias in api_keys consistently

* fix(leftnav.tsx): swap old usage with new usage tab

* fix(entity_usage.tsx): show breakdown of teams in daily spend chart

* style(new_usage.tsx): show global usage tab if user is admin / has admin view

* fix(new_usage.tsx): add disclaimer for new usage dashboard

* fix(new_usage.tsx): fix linting error

* Allow filtering usage dashboard by team + tag (#10150)

* fix(entity_usage.tsx): allow user to select team in team usage tab

* fix(new_usage.tsx): load all tags for filtering

* fix(tag_management_endpoints.py): return dynamic tags from db on `/tag/list`

* fix(litellm_pre_call_utils.py): support x-litellm-tags even if tag based routing not enabled

* fix: fix linting error
2025-04-19 07:32:23 -07:00
Ishaan Jaff
3d5022bd79
[Feat] Support for all litellm providers on Responses API (works with Codex) - Anthropic, Bedrock API, VertexAI, Ollama (#10132)
* transform request

* basic handler for LiteLLMCompletionTransformationHandler

* complete transform litellm to responses api

* fixes to test

* fix stream=True

* fix streaming iterator

* fixes for transformation

* fixes for anthropic codex support

* fix pass response_api_optional_params

* test anthropic responses api tools

* update responses types

* working codex with litellm

* add session handler

* fixes streaming iterator

* fix handler

* add litellm codex example

* fix code quality

* test fix

* docs litellm codex

* litellm codexdoc

* docs openai codex with litellm

* docs litellm openai codex

* litellm codex

* linting fixes for transforming responses API

* fix import error

* fix responses api test

* add sync iterator support for responses api
2025-04-18 19:53:59 -07:00
David Emmanuel
de3c2d14bf
Add Gemini Flash 2.5 Preview Model Price and Context Window (#10125)
* Update model_prices_and_context_window_backup.json

* Update model_prices_and_context_window.json
2025-04-18 09:44:46 -07:00
Krrish Dholakia
809eb859cf fix(azure/o_series_transformation.py): fix azure o4 model routing
Fixes https://github.com/BerriAI/litellm/pull/10065#issuecomment-2814015058
2025-04-17 22:58:01 -07:00
Krrish Dholakia
614d80cb1b build(model_prices_and_context_window.json): add azure gpt-4.1 pricing
ensures cost tracking for gpt-4.1 works
2025-04-17 20:09:17 -07:00
Ishaan Jaff
d3e04eac7f
[Feat] Unified Responses API - Add Azure Responses API support (#10116)
* initial commit for azure responses api support

* update get complete url

* fixes for responses API

* working azure responses API

* working responses API

* test suite for responses API

* azure responses API test suite

* fix test with complete url

* fix test refactor

* test fix metadata checks

* fix code quality check
2025-04-17 16:47:59 -07:00
Krrish Dholakia
ff81f48af3 bump: version 1.66.2 → 1.66.3 2025-04-16 22:20:10 -07:00
Krrish Dholakia
78c6d73dea build: new ui build 2025-04-16 22:11:53 -07:00
Krish Dholakia
8ddaf3dfbc
fix(o_series_transformation.py): correctly map o4 to openai o_series model (#10079)
Fixes https://github.com/BerriAI/litellm/issues/10066
2025-04-16 21:51:31 -07:00
Krish Dholakia
c73a6a8d1e
Add new /vertex_ai/discovery route - enables calling AgentBuilder API routes (#10084)
* feat(llm_passthrough_endpoints.py): expose new `/vertex_ai/discovery/` endpoint

Allows calling vertex ai discovery endpoints via passthrough

For agentbuilder api calls

* refactor(llm_passthrough_endpoints.py): use common _base_vertex_proxy_route

Prevents duplicate code

* feat(llm_passthrough_endpoints.py): add vertex endpoint specific passthrough handlers
2025-04-16 21:45:51 -07:00
Ishaan Jaff
12ccb954a6 ui new build 2025-04-16 19:23:04 -07:00
Ishaan Jaff
6220f3e7b8
[Feat SSO] Add LiteLLM SCIM Integration for Team and User management (#10072)
* fix NewUser response type

* add scim router

* add v0 scim v2 endpoints

* working scim transformation

* use 1 file for types

* fix scim firstname and givenName storage

* working SCIMErrorResponse

* working team / group provisioning on SCIM

* add SCIMPatchOp

* move scim folder

* fix import scim_router

* fix dont auto create scim keys

* add auth on all scim endpoints

* add is_virtual_key_allowed_to_call_route

* fix allowed routes

* fix for key management

* fix allowed routes check

* clean up error message

* fix code check

* fix for route checks

* ui SCIM support

* add UI tab for SCIM

* fixes SCIM

* fixes for SCIM settings on ui

* scim settings

* clean up scim view

* add migration for allowed_routes in keys table

* refactor scim transform

* fix SCIM linting error

* fix code quality check

* fix ui linting

* test_scim_transformations.py
2025-04-16 19:21:47 -07:00
Krish Dholakia
7ca553b235
Add team based usage dashboard at 1m+ spend logs (+ new /team/daily/activity API) (#10081)
* feat(ui/): add team based usage to dashboard

allows admin to see spend across teams + within teams at 1m+ spend logs

* fix(entity_usage.tsx): add activity page to entity usage

* style(entity_usage.tsx): place filter above tab switcher
2025-04-16 18:10:14 -07:00
Krish Dholakia
c0d7e9f16d
Add new /tag/daily/activity endpoint + Add tag dashboard to UI (#10073)
* feat: initial commit adding daily tag spend table to db

* feat(db_spend_update_writer.py): correctly log tag spend transactions

* build(schema.prisma): add new tag table to root

* build: add new migration file

* feat(common_daily_activity.py): add `/tag/daily/activity` API endpoint

allows viewing daily spend by tag

* feat(tag_management_endpoints.py): support comma separated list of tags + tag breakdown metric

allows querying multiple tags + knowing what tags are driving spend

* feat(entity_usage.tsx): initial commit adding tag based usage to litellm dashboard

brings back tag based usage tracking to UI at 1m+ spend logs

* feat(entity_usage.tsx): add top api key view to ui

* feat(entity_usage.tsx): add tag table to ui

* feat(entity_usage.tsx): allow filtering by tag

* refactor(entity_usage.tsx): reorder components

* build(ui/): fix linting error

* fix: fix ruff checks

* fix(schema.prisma): drop uniqueness requirement on tag

allows dailytagspend to have multiple rows with the same tag

* build(schema.prisma): drop uniqueness requirement on tag in dailytagspend

allows tag agg. view to work on multiple rows with same tag

* build(schema.prisma): drop tag uniqueness requirement
2025-04-16 15:24:44 -07:00
Peter Dave Hello
5c078af738
Add OpenAI o3 & o4-mini (#10065)
Reference:
- https://platform.openai.com/docs/models/o3
- https://platform.openai.com/docs/models/o4-mini
2025-04-16 12:40:13 -07:00
Krish Dholakia
d8a1071bc4
Add aggregate spend by tag (#10071)
* feat: initial commit adding daily tag spend table to db

* feat(db_spend_update_writer.py): correctly log tag spend transactions

* build(schema.prisma): add new tag table to root

* build: add new migration file
2025-04-16 12:26:21 -07:00
Krish Dholakia
47e811d6ce
fix(llm_http_handler.py): fix fake streaming (#10061)
* fix(llm_http_handler.py): fix fake streaming

allows groq to work with llm_http_handler

* fix(groq.py): migrate groq to openai like config

ensures json mode handling works correctly
2025-04-16 10:15:11 -07:00
Krish Dholakia
c603680d2a
fix(stream_chunk_builder_utils.py): don't set index on modelresponse (#10063)
* fix(stream_chunk_builder_utils.py): don't set index on modelresponse

* test: update tests
2025-04-16 10:11:47 -07:00
Ishaan Jaff
a9e8a36f89
[Bug Fix] Azure Blob Storage fixes (#10059)
* Simple fix for #9339 - upgrade the underlying library and cache the azure storage client (#9965)

* fix -  use constants for caching azure storage client

---------

Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-16 09:47:10 -07:00
Krrish Dholakia
a743b6fc1f fix(bedrock/common_utils.py): add us-west-1 to us regions 2025-04-16 08:00:39 -07:00
ChaoFu Yang
c07eea864e
/utils/token_counter: get model_info from deployment directly (#10047) 2025-04-16 07:53:18 -07:00
Michael Leshchinsky
e19d05980c
Add litellm call id passing to Aim guardrails on pre and post-hooks calls (#10021)
* Add litellm_call_id passing to aim guardrails on pre and post-hooks

* Add test that ensures that pre_call_hook receives litellm call id when common_request_processing called
2025-04-16 07:41:28 -07:00
Ishaan Jaff
1d4fea509d ui new build 2025-04-15 22:36:44 -07:00
Ishaan Jaff
dcc43e797a
[Docs] Auto prompt caching (#10044)
* docs prompt cache controls

* doc fix auto prompt caching
2025-04-15 22:29:47 -07:00
Krish Dholakia
fdfa1108a6
Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls (#10040)
* fix #9783: Retain schema field ordering for google gemini and vertex (#9828)

* test: update test

* refactor(groq.py): initial commit migrating groq to base_llm_http_handler

* fix(streaming_chunk_builder_utils.py): fix how tool content is combined

Fixes https://github.com/BerriAI/litellm/issues/10034

* fix(vertex_ai/common_utils.py): prevent infinite loop in helper function

* fix(groq/chat/transformation.py): handle groq streaming errors correctly

* fix(groq/chat/transformation.py): handle max_retries

---------

Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-15 22:29:25 -07:00
Krish Dholakia
1b9b745cae
Fix gcs pub sub logging with env var GCS_PROJECT_ID (#10042)
* fix(pub_sub.py): fix passing project id in pub sub call

Fixes issue where GCS_PUBSUB_PROJECT_ID was not being used

* test(test_pub_sub.py): add unit test to prevent future regressions

* test: fix test
2025-04-15 21:50:48 -07:00
Ishaan Jaff
bd88263b29
[Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions (#10029)
* stash changes

* emit cache read/write tokens to daily spend update

* emit cache read/write tokens on daily activity

* update types.ts

* docs prompt caching

* undo ui change

* fix activity metrics

* fix prompt caching metrics

* fix typed dict fields

* fix get_aggregated_daily_spend_update_transactions

* fix aggregating cache tokens

* test_cache_token_fields_aggregation

* daily_transaction

* add cache_creation_input_tokens and cache_read_input_tokens to LiteLLM_DailyUserSpend

* test_daily_spend_update_queue.py
2025-04-15 21:40:57 -07:00
Ishaan Jaff
d32d6fe03e
[UI] Bug Fix - Show created_at and updated_at for Users Page (#10033)
* add created_at and updated_at as fields for internal user table

* test_get_users_includes_timestamps
2025-04-15 21:15:44 -07:00
Krish Dholakia
9b77559ccf
Add aggregate team based usage logging (#10039)
* feat(schema.prisma): initial commit adding aggregate table for team spend

allows team spend to be visible at 1m+ logs

* feat(db_spend_update_writer.py): support logging aggregate team spend

allows usage dashboard to work at 1m+ logs

* feat(litellm-proxy-extras/): add new migration file

* fix(db_spend_update_writer.py): fix return type

* build: bump requirements

* fix: fix ruff error
2025-04-15 20:58:48 -07:00
Krish Dholakia
d3e7a137ad
Revert "fix #9783: Retain schema field ordering for google gemini and vertex …" (#10038)
This reverts commit e3729f9855.
2025-04-15 19:21:33 -07:00
Adrian Lyjak
e3729f9855
fix #9783: Retain schema field ordering for google gemini and vertex (#9828) 2025-04-15 19:12:02 -07:00
Marc Abramowitz
837a6948d8
Fix typo: Entrata -> Entra in code (#9922)
* Fix typo: Entrata -> Entra

* Fix a few more
2025-04-15 17:31:18 -07:00
Krish Dholakia
6b5f093087
Revert "Fix case where only system messages are passed to Gemini (#9992)" (#10027)
This reverts commit 2afd922f8c.
2025-04-15 13:34:03 -07:00