Commit graph

473 commits

Author SHA1 Message Date
Krrish Dholakia
a34cc2031d fix(response_metadata.py): log the litellm_model_name
make it easier to track the model sent to the provider
2025-03-18 17:46:33 -07:00
Krrish Dholakia
8ed3483adb test(test_tpm_rpm_routing_v2.py): initial test, for asserting async pre call check works as expected 2025-03-18 17:36:55 -07:00
Krrish Dholakia
39ac9e3eca fix(lowest_tpm_rpm_v2.py): fix updating limits 2025-03-18 17:10:17 -07:00
Krrish Dholakia
267084a1af test(test_get_llm_provider.py): cover scenario where xai not in model name 2025-03-18 11:04:59 -07:00
Krrish Dholakia
aeec703c4e test(test_get_llm_provider.py): Minimal repro for https://github.com/BerriAI/litellm/issues/9291 2025-03-18 10:35:50 -07:00
Krish Dholakia
cd5024f3b1
Merge pull request #9333 from BerriAI/litellm_dev_03_17_2025_p2
fix(ollama/completions/transformation.py): pass prompt, untemplated o…
2025-03-17 21:48:30 -07:00
Krrish Dholakia
22faf7d232 fix(ollama/completions/transformation.py): pass prompt, untemplated on /completions request
Fixes https://github.com/BerriAI/litellm/issues/6900
2025-03-17 18:35:44 -07:00
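
Illustrative note: a minimal usage sketch for the change above, assuming a local Ollama server at its default port; per the fix, the raw prompt is forwarded untemplated on text-completion style requests.

    import litellm

    # Text-completion style call routed to Ollama; per the fix above, the prompt
    # string should reach Ollama's completions path without a chat template applied.
    resp = litellm.text_completion(
        model="ollama/llama3",
        prompt="def fibonacci(n):",
        api_base="http://localhost:11434",  # assumed local Ollama endpoint
    )
    print(resp.choices[0].text)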
Krrish Dholakia
c4b2e0ae3d fix(streaming_handler.py): support logging complete streaming response on cache hit 2025-03-17 18:10:39 -07:00
Krrish Dholakia
dd9e79adbd fix(streaming_handler.py): emit deep copy of completed chunk 2025-03-17 17:26:21 -07:00
Krrish Dholakia
8618295911 test: loosen test 2025-03-17 09:44:22 -07:00
Krrish Dholakia
d01361747d test: make test less flaky 2025-03-17 09:00:15 -07:00
Krish Dholakia
d4caaae1be
Merge pull request #9274 from BerriAI/litellm_contributor_rebase_branch
Litellm contributor rebase branch
2025-03-14 21:57:49 -07:00
Krrish Dholakia
c2f01b0fdc fix(router.py): add new test 2025-03-14 14:23:45 -07:00
Ishaan Jaff
241a36a74f
Merge pull request #9222 from BerriAI/litellm_snowflake_pr_mar_13
[Feat] Add Snowflake Cortex to LiteLLM
2025-03-13 21:35:39 -07:00
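
Illustrative note: a hedged usage sketch for the Snowflake Cortex support referenced above; the model prefix follows litellm's provider naming, while the api_base path and JWT auth shown here are assumptions drawn from Snowflake's Cortex REST API, not values from the PR.

    import litellm

    # Hedged sketch: call a Cortex-hosted model through litellm. The account URL
    # and JWT below are placeholders / assumptions.
    response = litellm.completion(
        model="snowflake/mistral-large2",
        messages=[{"role": "user", "content": "Hello from Cortex"}],
        api_base="https://<account>.snowflakecomputing.com/api/v2/cortex/inference:complete",
        api_key="<snowflake-jwt>",
    )
    print(response.choices[0].message.content)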
Krish Dholakia
f89dbb8ab3
Merge branch 'main' into litellm_dev_03_13_2025_p3 2025-03-13 20:12:16 -07:00
Krish Dholakia
fd8a5960ec
Merge pull request #9216 from BerriAI/litellm_dev_03_12_2025_contributor_prs_p2
Litellm dev 03 12 2025 contributor prs p2
2025-03-13 20:03:57 -07:00
Krrish Dholakia
81abb9c6a4 test: fix tests 2025-03-13 19:26:10 -07:00
Ishaan Jaff
7dd55ce70c fix @pytest.mark.skip(reason="lakera deprecated their v1 endpoint.") 2025-03-13 17:49:37 -07:00
Krrish Dholakia
f17bc60593 test: patch test to avoid lakera changes to sensitivity 2025-03-13 15:18:08 -07:00
Krrish Dholakia
86ed6be85e fix: fix learnlm test 2025-03-13 10:54:09 -07:00
Tomer Bin
4a31b32a88 Support post-call guards for stream and non-stream responses 2025-03-13 08:53:54 +02:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Krrish Dholakia
d9c32342fe test: fix test - delete env var before running 2025-03-11 20:57:57 -07:00
Krrish Dholakia
e2ae504a81 test: skip flaky tests 2025-03-11 19:43:04 -07:00
Krrish Dholakia
3ba683be88 test: remove redundant tests 2025-03-11 17:52:05 -07:00
Krrish Dholakia
7696147968 test: skip redundant test 2025-03-11 06:31:56 -07:00
Steve Farthing
affbebdcef oops 2025-03-11 08:27:36 -04:00
Steve Farthing
dbfb7ebdaf
Merge branch 'main' into stevefarthing/bing-search-pass-thru 2025-03-11 08:06:56 -04:00
Krrish Dholakia
23f3642a15 test: fix tests 2025-03-10 22:36:55 -07:00
Krrish Dholakia
17a29bfbfd test: add direct test - fix code qa check 2025-03-10 21:59:15 -07:00
Krrish Dholakia
f87fe5006a fix: remove client init tests for router - dup behaviour - provider caching already exists 2025-03-10 21:17:36 -07:00
Krrish Dholakia
ae021671a8 test: update testing - having removed the router client init logic
this allows a user to just set the credential value in litellm params, and not have to worry about setting credentials
2025-03-10 20:02:33 -07:00
Krrish Dholakia
bfbe26b91d feat(azure.py): add azure bad request error support 2025-03-10 15:59:06 -07:00
Krish Dholakia
f899b828cf
Support openrouter reasoning_content on streaming (#9094)
* feat(convert_dict_to_response.py): support openrouter format of reasoning content

* fix(transformation.py): fix openrouter streaming with reasoning content

Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962

* fix: fix type error
2025-03-09 20:03:59 -07:00
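
Illustrative note: a minimal sketch of consuming reasoning_content from a streamed OpenRouter response, assuming a reasoning-capable model; the model name is an example and field availability depends on the model and account.

    import litellm

    # Stream an OpenRouter reasoning model and surface reasoning_content from each
    # delta alongside the normal content.
    stream = litellm.completion(
        model="openrouter/deepseek/deepseek-r1",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            print("[reasoning]", reasoning)
        if delta.content:
            print(delta.content, end="")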
Krish Dholakia
65ef65d360
feat: prioritize api_key over tenant_id for more Azure AD token provi… (#8701)
* feat: prioritize api_key over tenant_id for more Azure AD token provider (#8318)

* fix: prioritize api_key over tenant_id for Azure AD token provider

* test: Add test for Azure AD token provider in router

* fix: fix linting error

---------

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
2025-03-09 18:59:37 -07:00
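
Illustrative note: a sketch of the precedence the PR above describes, using a hypothetical helper (not litellm's internal function): an explicit api_key wins, and an Azure AD token provider is only built from tenant credentials when no key is supplied.

    from typing import Optional

    # Hypothetical helper illustrating the precedence: api_key first, then Azure AD.
    def resolve_azure_auth(
        api_key: Optional[str],
        tenant_id: Optional[str],
        client_id: Optional[str],
        client_secret: Optional[str],
    ) -> dict:
        if api_key:
            return {"api_key": api_key}
        if tenant_id and client_id and client_secret:
            from azure.identity import ClientSecretCredential, get_bearer_token_provider

            credential = ClientSecretCredential(tenant_id, client_id, client_secret)
            token_provider = get_bearer_token_provider(
                credential, "https://cognitiveservices.azure.com/.default"
            )
            return {"azure_ad_token_provider": token_provider}
        raise ValueError("no Azure credentials provided")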
Krish Dholakia
e00d4fb18c
Litellm dev 03 08 2025 p3 (#9089)
* feat(ollama_chat.py): pass down http client to ollama_chat

enables easier testing

* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint

Fixes https://github.com/BerriAI/litellm/issues/6683

* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
Ishaan Jaff
6e3b21775f test_cost_azure_openai_prompt_caching 2025-03-08 16:19:28 -08:00
Ogun Oz
85d1427710
Fix: Create RedisClusterCache when startup nodes provided in cache args of router (#9010)
Co-authored-by: Ogün Öz <ogun.oz@cobrainer.com>
2025-03-06 17:14:32 -08:00
Ishaan Jaff
f47987e673
(Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthropic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
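
Illustrative note: a hedged sketch of calling the proxy's /v1/messages endpoint with an Anthropic-style request body, as refactored in the PR above; the proxy URL, virtual key, and model name are placeholders.

    import httpx

    # Anthropic Messages API-shaped request sent to the litellm proxy (placeholders).
    resp = httpx.post(
        "http://localhost:4000/v1/messages",
        headers={
            "x-api-key": "sk-1234",              # proxy virtual key (placeholder)
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        json={
            "model": "claude-3-5-sonnet-20241022",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": "Hello"}],
        },
    )
    print(resp.json())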
Krrish Dholakia
320cb1d51a docs: cleanup 'signature_delta' from docs 2025-03-05 23:53:38 -08:00
Krish Dholakia
f6535ae6ad
Support format param for specifying image type (#9019)
* fix(transformation.py): support a 'format' parameter for images

allow user to specify mime type

* fix: pass mimetype via 'format' param

* feat(gemini/chat/transformation.py): support 'format' param for gemini

* fix(factory.py): support 'format' param on sync bedrock converse calls

* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls

* refactor(factory.py): move to supporting 'format' param in base helper

ensures consistency in param support

* feat(gpt_transformation.py): filter out 'format' param

don't send invalid param to openai

* fix(gpt_transformation.py): fix translation

* fix: fix translation error
2025-03-05 19:52:53 -08:00
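
Illustrative note: a minimal sketch of the 'format' param described above, passing an explicit mime type on an image content block; the model choice and base64 payload are placeholders.

    import litellm

    # Image block carrying an explicit mime type via the new 'format' field.
    response = litellm.completion(
        model="gemini/gemini-1.5-flash",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/png;base64,iVBORw0KGgo...",  # placeholder
                        "format": "image/png",
                    },
                },
            ],
        }],
    )
    print(response.choices[0].message.content)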
Krish Dholakia
ec4f665e29
Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] (#9021)
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* test: update test to enforce signature found

* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic

* fix: fix linting error

---------

Co-authored-by: Martin Krasser <krasserm@googlemail.com>
2025-03-05 19:33:54 -08:00
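
Illustrative note: a hedged sketch of reading thinking blocks off a Claude 3.7 response, where each block now carries 'signature' rather than 'signature_delta'; the thinking_blocks attribute and budget value are assumptions based on litellm's reasoning support.

    import litellm

    # Request extended thinking and inspect the returned blocks' signatures.
    resp = litellm.completion(
        model="anthropic/claude-3-7-sonnet-20250219",
        messages=[{"role": "user", "content": "What is 27 * 453?"}],
        thinking={"type": "enabled", "budget_tokens": 1024},
    )
    for block in resp.choices[0].message.thinking_blocks or []:
        print(block.get("signature"), "->", block.get("thinking", "")[:80])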
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia
662c59adcf
Support caching on reasoning content + other fixes (#8973)
* fix(factory.py): pass on anthropic thinking content from assistant call

* fix(factory.py): fix anthropic messages to handle thinking blocks

Fixes https://github.com/BerriAI/litellm/issues/8961

* fix(factory.py): fix bedrock handling for assistant content in messages

Fixes https://github.com/BerriAI/litellm/issues/8961

* feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block

ensures caching works for anthropic thinking block

* fix(convert_dict_to_response.py): pass all message params to delta block

ensures streaming delta also contains the reasoning content / thinking block

* test(test_prompt_factory.py): remove redundant test

anthropic now supports assistant as the first message

* fix(factory.py): fix linting errors

* fix: fix code qa

* test: remove falsy test

* fix(litellm_logging.py): fix str conversion
2025-03-04 21:12:16 -08:00
Krrish Dholakia
8ea3d4c046 build: merge litellm_dev_03_01_2025_p2 2025-03-03 23:05:41 -08:00
Krish Dholakia
2fc6262675
fix(route_llm_request.py): move to using common router, even for clie… (#8966)
* fix(route_llm_request.py): move to using common router, even for client-side credentials

ensures fallbacks / cooldown logic still works

* test(test_route_llm_request.py): add unit test for route request

* feat(router.py): generate unique model id when clientside credential passed in

Prevents cooldowns for api key 1 from impacting api key 2

* test(test_router.py): update testing to ensure original litellm params not mutated

* fix(router.py): upsert clientside call into llm router model list

enables cooldown logic to work accurately

* fix: fix linting error

* test(test_router_utils.py): add direct test for new util on router
2025-03-03 22:57:08 -08:00
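
Illustrative note: a hypothetical sketch of the idea in the PR above (not litellm's implementation): deriving a deterministic, per-credential model id so a cooldown triggered by one client-side api key does not impact another.

    import hashlib

    # Hypothetical helper: fold the client-side key into the model id so cooldown
    # state is tracked per credential rather than per model name.
    def clientside_model_id(model_name: str, api_key: str) -> str:
        digest = hashlib.sha256(api_key.encode()).hexdigest()[:8]
        return f"{model_name}-clientside-{digest}"

    print(clientside_model_id("gpt-4o", "sk-user-1"))  # distinct id per key
    print(clientside_model_id("gpt-4o", "sk-user-2"))  # so distinct cooldowns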
Krrish Dholakia
b9bddac776 test: fix test 2025-03-03 13:33:39 -08:00
Ishaan Jaff
bc9b3e4847
(Bug fix) - don't log messages in model_parameters in StandardLoggingPayload (#8932)
* define model param helper

* use ModelParamHelper

* get_standard_logging_model_parameters

* fix code quality

* get_standard_logging_model_parameters

* StandardLoggingPayload

* test_get_kwargs_for_cache_key

* test_langsmith_key_based_logging

* fix code qa

* fix linting
2025-03-01 13:39:45 -08:00
Krish Dholakia
88eedb22b9
vertex ai anthropic thinking param support (#8853)
* fix(vertex_llm_base.py): handle credentials passed in as dictionary

* fix(router.py): support vertex credentials as json dict

* test(test_vertex.py): allows easier testing

mock anthropic thinking response for vertex ai

* test(vertex_ai_partner_models/): don't remove "@" from model

breaks anthropic cost calculation

* test: move testing

* fix: fix linting error

* fix: fix linting error

* fix(vertex_ai_partner_models/main.py): split @ for codestral model

* test: fix test

* fix: fix stripping "@" on mistral models

* fix: fix test

* test: fix test
2025-02-26 21:37:18 -08:00
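
Illustrative note: a hedged sketch of passing Vertex service-account credentials as a JSON string on a router deployment, per the "vertex credentials as json dict" change above; the project, location, and model are placeholders.

    import json
    from litellm import Router

    # Load the service-account file and pass its contents directly, rather than a path.
    with open("service_account.json") as f:
        vertex_creds = json.dumps(json.load(f))

    router = Router(model_list=[{
        "model_name": "claude-vertex",
        "litellm_params": {
            "model": "vertex_ai/claude-3-7-sonnet@20250219",
            "vertex_project": "my-project",   # placeholder
            "vertex_location": "us-east5",    # placeholder
            "vertex_credentials": vertex_creds,
        },
    }])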
Krish Dholakia
3de4209569
fix caching on main branch (#8858)
* fix(streaming_handler.py): fix is delta empty check to handle empty str

* fix(streaming_handler.py): fix delta chunk on final response
2025-02-26 19:16:34 -08:00