225 commits

Author SHA1 Message Date
Krish Dholakia
6dda1ba6dd
LiteLLM Minor Fixes & Improvements (04/02/2025) (#9725)
* Add date picker to usage tab + Add reasoning_content token tracking across all providers on streaming (#9722)

* feat(new_usage.tsx): add date picker for new usage tab

allow user to look back on their usage data

* feat(anthropic/chat/transformation.py): report reasoning tokens in completion token details

allows usage tracking on how many reasoning tokens are actually being used

* feat(streaming_chunk_builder.py): return reasoning_tokens in anthropic/openai streaming response

allows tracking reasoning_token usage across providers

* Fix update team metadata + fix bulk adding models on UI (#9721)

* fix(handle_add_model_submit.tsx): fix bulk adding models

* fix(team_info.tsx): fix team metadata update

Fixes https://github.com/BerriAI/litellm/issues/9689

* (v0) Unified file id - allow calling multiple providers with same file id (#9718)

* feat(files_endpoints.py): initial commit adding 'target_model_names' support

allow developer to specify all the models they want to call with the file

* feat(files_endpoints.py): return unified files endpoint

* test(test_files_endpoints.py): add validation test - if invalid purpose submitted

* feat: more updates

* feat: initial working commit of unified file id translation

* fix: additional fixes

* fix(router.py): remove model replace logic in jsonl on acreate_file

enables file upload to work for chat completion requests as well

* fix(files_endpoints.py): remove whitespace around model name

* fix(azure/handler.py): return acreate_file with correct response type

* fix: fix linting errors

* test: fix mock test to run on github actions

* fix: fix ruff errors

* fix: fix file too large error

* fix(utils.py): remove redundant var

* test: modify test to work on github actions

* test: update tests

* test: more debug logs to understand ci/cd issue

* test: fix test for respx

* test: skip mock respx test

fails on ci/cd - not clear why

* fix: fix ruff check

* fix: fix test

* fix(model_connection_test.tsx): fix linting error

* test: update unit tests
2025-04-03 11:48:52 -07:00
Ishaan Jaff
e68603e176 test create and update gauge 2025-04-02 21:31:19 -07:00
Ishaan Jaff
8405fcb748 test pod lock manager 2025-04-02 15:06:31 -07:00
Pranav Simha
2e35f07e94
Add support for max_completion_tokens to the Cohere chat transformation config (#9701) 2025-04-02 07:50:44 -07:00
Krish Dholakia
23051d89dd
fix(streaming_handler.py): fix completion start time tracking (#9688)
* fix(streaming_handler.py): fix completion start time tracking

Fixes https://github.com/BerriAI/litellm/issues/9210

* feat(anthropic/chat/transformation.py): map openai 'reasoning_effort' to anthropic 'thinking' param

Fixes https://github.com/BerriAI/litellm/issues/9022

* feat: map 'reasoning_effort' to 'thinking' param across bedrock + vertex

Closes https://github.com/BerriAI/litellm/issues/9022#issuecomment-2705260808
2025-04-01 22:00:56 -07:00
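The `reasoning_effort` to `thinking` mapping named in the commit above can be pictured with a minimal sketch. The helper name `map_reasoning_effort` and the token budgets are assumptions for illustration, not values taken from the commit; only the `{"type": "enabled", "budget_tokens": ...}` shape follows Anthropic's documented `thinking` parameter.

```python
from typing import Dict, Optional, Union

# Hypothetical budgets: the commit message does not state the real values.
ASSUMED_BUDGETS: Dict[str, int] = {"low": 1024, "medium": 2048, "high": 4096}


def map_reasoning_effort(
    reasoning_effort: Optional[str],
) -> Optional[Dict[str, Union[str, int]]]:
    """Translate an OpenAI-style reasoning_effort into an Anthropic-style thinking block."""
    if reasoning_effort is None:
        # No effort requested: leave the thinking param unset.
        return None
    if reasoning_effort not in ASSUMED_BUDGETS:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
    return {"type": "enabled", "budget_tokens": ASSUMED_BUDGETS[reasoning_effort]}
```

A lookup table keeps the translation in one place, which is convenient if the same mapping has to be reused for several providers, as the Bedrock and Vertex bullet suggests.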
Ishaan Jaff
feba274a89 test DailySpendUpdateQueue 2025-04-01 18:39:23 -07:00
Ishaan Jaff
4a091a34b0 move test loc 2025-04-01 18:33:33 -07:00
Ishaan Jaff
8dc792139e refactor file structure 2025-04-01 18:30:48 -07:00
Ishaan Jaff
4ddca7a79c
Merge branch 'main' into litellm_fix_service_account_behavior 2025-04-01 12:04:28 -07:00
Ishaan Jaff
61b609f320
Merge pull request #9673 from BerriAI/litellm_qa_deadlock_fixes
[Reliability] - Ensure new Redis + DB architecture tracks spend accurately
2025-04-01 12:04:03 -07:00
Ishaan Jaff
c2c5dbf24f test_get_enforced_params 2025-04-01 08:41:53 -07:00
Ishaan Jaff
f805e15f7b test_get_enforced_params_for_service_account_settings 2025-04-01 08:39:41 -07:00
Ishaan Jaff
e5f6529c42 test_get_enforced_params_for_service_account_settings 2025-04-01 07:46:38 -07:00
Krish Dholakia
62ad84fb64
UI (new_usage.tsx): Report 'total_tokens' + report success/failure calls (#9675)
* feat(internal_user_endpoints.py): return 'total_tokens' in `/user/daily/analytics`

* test(test_internal_user_endpoints.py): add unit test to assert spend metrics and dailyspend metadata always report the same fields

* build(schema.prisma): record success + failure calls to daily user table

allows understanding why model requests might exceed provider requests (e.g. user hit rate limit error)

* fix(internal_user_endpoints.py): report success / failure requests in API

* fix(proxy/utils.py): default to success

status can be missing or none at times for successful requests

* feat(new_usage.tsx): show success/failure calls on UI

* style(new_usage.tsx): ui cleanup

* fix: fix linting error

* fix: fix linting error

* feat(litellm-proxy-extras/): add new migration files
2025-03-31 22:48:43 -07:00
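The "default to success" fix in the commit above, where a missing or `None` status still counts as a successful request, might look roughly like the sketch below. The function names and the `successful_requests` / `failed_requests` keys are hypothetical and not taken from the commit.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Optional


def normalize_status(status: Optional[str]) -> str:
    """Treat a missing, None, or empty status as a successful request."""
    return status or "success"


def count_calls(payloads: Iterable[Mapping[str, Optional[str]]]) -> Dict[str, int]:
    """Tally success and failure calls from log payloads (hypothetical field names)."""
    counts = Counter(normalize_status(p.get("status")) for p in payloads)
    return {
        "successful_requests": counts["success"],
        "failed_requests": counts["failure"],
    }
```

Normalizing before counting means successful requests that were logged without a status do not go missing from the tallies.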
Krish Dholakia
f2a7edaddc
fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 (#9671)
* fix(proxy_server.py): remove non-functional parent backoff/retry on /chat/completion

Causes circular reference error

* fix(http_parsing_utils.py): safely return parsed body - don't allow mutation of cached request body by client functions

Root cause fix for circular reference error

* Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)

This reverts commit a8673246dc.

* add type hints for AnthropicMessagesResponse

* define types for response from AnthropicMessagesResponse

* fix response typing

* allow using litellm.messages.acreate and litellm.messages.create

* fix anthropic_messages implementation

* add clear type hints to litellm.messages.create functions

* fix anthropic_messages

* working anthropic API tests

* fixes - anthropic messages interface

* use new anthropic interface

* fix code quality check

* docs anthropic messages endpoint

* add namespace_packages = True to mypy

* fix mypy lint errors

* docs anthropic messages interface

* test: fix unit test

* test(test_http_parsing_utils.py): update tests

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-03-31 22:06:02 -07:00
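The root-cause fix in the commit above stops client functions from mutating the cached request body. One way such a mutation can lead to "Circular reference detected" is a caller attaching the body to itself and then serializing it. The sketch below shows the defensive-copy idea under that assumption; `get_parsed_body` and the `parsed_body` cache key are hypothetical names, not the project's actual implementation.

```python
import copy
import json
from typing import Any, Dict

_CACHE_KEY = "parsed_body"  # hypothetical key under which the parsed body is cached


def get_parsed_body(scope: Dict[str, Any]) -> Dict[str, Any]:
    """Return a deep copy of the cached body so callers cannot mutate the cache."""
    return copy.deepcopy(scope.get(_CACHE_KEY, {}))


scope = {_CACHE_KEY: {"model": "example-model", "metadata": {}}}

body = get_parsed_body(scope)
body["metadata"]["request"] = body  # a caller creates a self-reference on its copy

# The cached body is untouched, so it still serializes cleanly.
serialized = json.dumps(scope[_CACHE_KEY])
```

A deep copy costs more than handing out the cached dict directly, but it confines any mutation, including nested ones, to the caller's own copy.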
Ishaan Jaff
115946d402 unit testing for SpendUpdateQueue 2025-03-31 21:25:24 -07:00
Krish Dholakia
5ad2fbcba6
Openrouter streaming fixes + Anthropic 'file' message support (#9667)
* fix(openrouter/transformation.py): Handle error in openrouter stream

Fixes https://github.com/Aider-AI/aider/issues/3550

* test(test_openrouter_chat_transformation.py): add unit tests

* feat(anthropic/chat/transformation.py): add openai 'file' message content type support

Closes https://github.com/BerriAI/litellm/issues/9463

* fix(factory.py): add bedrock converse support for openai 'file' message content type

Closes https://github.com/BerriAI/litellm/issues/9463
2025-03-31 21:22:59 -07:00
Krish Dholakia
46b3dbde8f
Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)
This reverts commit a8673246dc.
2025-03-31 17:13:55 -07:00
Ishaan Jaff
ca4ed9ff2e ref issue 2025-03-31 16:05:10 -07:00
Ishaan Jaff
bc66827537 test_aiter_bytes_valid_chunk_followed_by_unicode_error 2025-03-31 16:04:38 -07:00
Sam
a8673246dc
fix: Anthropic prompt caching on GCP Vertex AI (#9605)
* fix: Anthropic prompt caching on GCP Vertex AI

* test(vertex): anthropic prompt caching
2025-03-29 23:40:34 -07:00
Ishaan Jaff
194327bb7c test fixes 2025-03-29 18:34:58 -07:00
Ishaan Jaff
a3df0269bb fix tests 2025-03-29 17:38:24 -07:00
Ishaan Jaff
3e378f2bec async def test_spend_logs_payload_e2e(self): 2025-03-29 17:07:36 -07:00
Ishaan Jaff
815263f7bc rename transform_openai_tool_call_request_to_mcp_tool_call_request 2025-03-29 16:28:23 -07:00
Krish Dholakia
5ac61a7572
Add bedrock latency optimized inference support (#9623)
* fix(converse_transformation.py): add performanceConfig param support on bedrock

Closes https://github.com/BerriAI/litellm/issues/7606

* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks

* test(test_main.py): add e2e mock test for bedrock performance config

* build(model_prices_and_context_window.json): add versioned multimodal embedding

* refactor(multimodal_embeddings/): migrate to config pattern

* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls

Enables cost calculation for multimodal embeddings

* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls

ensures accurate cost tracking for vertexai multimodal embedding calls

* fix(embedding_handler.py): remove unused imports

* fix: fix linting errors

* fix: handle response api usage calculation

* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests

* test: mark flaky test

* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input

* docs(vertex.md): document sending text + image to vertex multimodal embeddings

* test: remove incorrect file

* fix(multimodal_embeddings/transformation.py): fix linting error

* style: remove unused import
2025-03-29 00:23:09 -07:00
Ishaan Jaff
7e8a02099c Merge branch 'main' into litellm_use_redis_for_updates 2025-03-28 20:12:29 -07:00
Nicholas Grabar
09daeac188 Rebasing 2 2025-03-28 15:18:09 -07:00
Nicholas Grabar
06a45706b2 Rebase 3 2025-03-28 15:18:05 -07:00
Krish Dholakia
205db622bf fix(proxy_server.py): get master key from environment, if not set in … (#9617)
* fix(proxy_server.py): get master key from environment, if not set in general settings or general settings not set at all

* test: mark flaky test

* test(test_proxy_server.py): mock prisma client

* ci: add new github workflow for testing just the mock tests

* fix: fix linting error

* ci(conftest.py): add conftest.py to isolate proxy tests

* build(pyproject.toml): add respx to dev dependencies

* build(pyproject.toml): add prisma to dev dependencies

* test: fix mock prompt management tests to use a mock anthropic key

* ci(test-litellm.yml): parallelize mock testing

make it run faster

* build(pyproject.toml): add hypercorn as dev dep

* build(pyproject.toml): separate proxy vs. core dev dependencies

make it easier for non-proxy contributors to run tests locally - e.g. no need to install hypercorn

* ci(test-litellm.yml): pin python version

* test(test_rerank.py): move test - cannot be mocked, requires aws credentials for e2e testing

* ci: add thank you message to ci

* test: add mock env var to test

* test: add autouse to tests

* test: test mock env vars for e2e tests
2025-03-28 15:16:15 -07:00
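The fallback in the commit above reads the master key from the environment when it is absent from general settings, or when general settings are not set at all. A minimal sketch follows. The function name, the `master_key` settings key, and the `LITELLM_MASTER_KEY` variable name are assumptions for illustration rather than details confirmed by the commit message.

```python
import os
from typing import Mapping, Optional


def resolve_master_key(
    general_settings: Optional[Mapping[str, str]],
    env: Mapping[str, str] = os.environ,
    env_var: str = "LITELLM_MASTER_KEY",  # assumed variable name
) -> Optional[str]:
    """Prefer the key from general settings, else fall back to the environment."""
    if general_settings and general_settings.get("master_key"):
        return general_settings["master_key"]
    # Covers both "general settings missing" and "key not set in settings".
    return env.get(env_var)
```

Passing the environment in as a parameter keeps the function easy to exercise in mock tests without touching the real process environment.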
Ishaan Jaff
193052ed70 test pod lock manager 2025-03-28 15:05:17 -07:00
Nicholas Grabar
1f2bbda11d Add recursion depth to convert_anyof_null_to_nullable, constants.py. Fix recursive_detector.py raise error state 2025-03-28 13:11:19 -07:00
Ishaan Jaff
021eedaf69 test pod lock manager 2025-03-28 12:59:16 -07:00
Ishaan Jaff
c53d172b06 rename pod lock manager 2025-03-28 12:57:00 -07:00
NickGrab
b72fbdde74
Merge branch 'main' into litellm_8864-feature-vertex-anyOf-support 2025-03-28 10:25:04 -07:00
Nicholas Grabar
9437ee5e1f Revert "Unit test fixing and poetry update"
This reverts commit 8c79e1902e.
2025-03-28 10:22:32 -07:00
Nicholas Grabar
8c79e1902e Unit test fixing and poetry update 2025-03-28 09:57:53 -07:00
Krish Dholakia
b9d0f460e8
Revert "Support max_completion_tokens on Mistral (#9589)" (#9604)
This reverts commit fef5d23dd5.
2025-03-27 19:14:26 -07:00
Chris Mancuso
fef5d23dd5
Support max_completion_tokens on Mistral (#9589)
* Support max_completion_tokens on Mistral

* test fix
2025-03-27 17:27:19 -07:00
Krish Dholakia
fb83567a03
Litellm new UI build (#9601)
* build: new ui build

* build: new ui build

* fix(proxy_server.py): only show user models their key can access on `/models`

* fix(model_management_endpoints.py): ensure team admin can add models

* test: update unit testing to reflect changes

* fix(model_dashboard.tsx): fix sizing on models page

* build: fix ui
2025-03-27 17:15:25 -07:00
Ishaan Jaff
a0fd508de4 DBSpendUpdateWriter 2025-03-27 16:43:18 -07:00
Ishaan Jaff
21e3b764f5 use DBSpendUpdateWriter class for managing DB spend updates 2025-03-27 16:31:23 -07:00
Krish Dholakia
63c9f59373
Allow team admins to add/update/delete models on UI + show api base and model id on request logs (#9572)
* feat(view_logs.tsx): show model id + api base in request logs

easier debugging

* fix(index.tsx): fix length of api base

easier viewing

* refactor(leftnav.tsx): show models tab to team admin

* feat(model_dashboard.tsx): add explainer for what the 'models' page is for team admin

helps them understand how they can use it

* feat(model_management_endpoints.py): restrict model add by team to just team admin

allow team admin to add models via non-team keys (e.g. ui token)

* test(test_add_update_models.py): update unit testing for new behaviour

* fix(model_dashboard.tsx): show user the models

* feat(proxy_server.py): add new query param 'user_models_only' to `/v2/model/info`

Allows user to retrieve just the models they've added

Used in UI to show internal users just the models they've added

* feat(model_dashboard.tsx): allow team admins to view their own models

* fix: allow ui user to fetch model cost map

* feat(add_model_tab.tsx): require team admins to specify team when onboarding models

* fix(_types.py): add `/v1/model/info` to info route

`/model/info` was already there

* fix(model_info_view.tsx): allow user to edit a model they created

* fix(model_management_endpoints.py): allow team admin to update team model

* feat(model_management_endpoints.py): allow team admin to delete team models

* fix(model_management_endpoints.py): don't require team id to be set when adding a model

* fix(proxy_server.py): fix linting error

* fix: fix ui linting error

* fix(model_management_endpoints.py): ensure consistent auth checks on all model calls

* test: remove old test - function no longer exists in same form

* test: add updated mock testing
2025-03-27 12:06:31 -07:00
Krish Dholakia
c0845fec1f
Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows gpt-4o and whisper audio transformation to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-audio-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
Ishaan Jaff
b6506f7bda test_db_health_readiness_check_with_prisma_error 2025-03-26 20:56:39 -07:00
Ishaan Jaff
87f0201f84 test_handle_db_exception_with_connection_error 2025-03-26 20:44:25 -07:00
Ishaan Jaff
15c04da735 refactor tests 2025-03-26 20:00:10 -07:00
Ishaan Jaff
0155b0eba2
Merge pull request #9533 from BerriAI/litellm_stability_fixes
[Reliability Fixes] - Gracefully handle exceptions when DB is having an outage
2025-03-26 18:57:38 -07:00
Ishaan Jaff
5242c5fbab test - auth exception handler 2025-03-26 18:28:28 -07:00
Ishaan Jaff
15b1a8afb0 test_is_database_connection_error_prisma_errors 2025-03-26 18:27:39 -07:00