Commit graph

2108 commits

Author SHA1 Message Date
Ishaan Jaff
f9ce754817
[Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning (#9923)
* add supports_reasoning for xai models

* add "supports_reasoning": true for o1 series models

* add supports_reasoning util

* add litellm.supports_reasoning

* add supports reasoning for claude 3-7 models

* add deepseek as supports reasoning

* test_supports_reasoning

* add supports reasoning to model group info

* add supports_reasoning

* docs supports reasoning

* fix supports_reasoning test

* "supports_reasoning": false,

* fix test

* supports_reasoning
2025-04-11 17:56:04 -07:00
Krish Dholakia
0dbd663877
fix(cost_calculator.py): handle custom pricing at deployment level fo… (#9855)
* fix(cost_calculator.py): handle custom pricing at deployment level for router

* test: add unit tests

* fix(router.py): show custom pricing on UI

check correct model str

* fix: fix linting error

* docs(custom_pricing.md): clarify custom pricing for proxy

Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740

* test: update code qa test

* fix: cleanup traceback

* fix: handle litellm param custom pricing

* test: update test

* fix(cost_calculator.py): add router model id to list of potential model names

* fix(cost_calculator.py): fix router model id check

* fix: router.py - maintain older model registry approach

* fix: fix ruff check

* fix(router.py): router get deployment info

add custom values to mapped dict

* test: update test

* fix(utils.py): update only if value is non-null

* test: add unit test
2025-04-09 22:13:10 -07:00
Krish Dholakia
ac4f32fb1e
Cost tracking for gemini-2.5-pro (#9837)
* build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing

Closes https://github.com/BerriAI/litellm/issues/9829

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param

* build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro

* build(model_prices_and_context_window.json): add gemini 200k+ pricing

* feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens

Fixes https://github.com/BerriAI/litellm/issues/9807

* build: test dockerfile change

* build: revert apk change

* ci(config.yml): pip install wheel

* ci: test problematic package first

* ci(config.yml): pip install only binary

* ci: try more things

* ci: test different ml_dtypes version

* ci(config.yml): check ml_dtypes==0.4.0

* ci: test

* ci: cleanup config.yml

* ci: specify ml dtypes in requirements.txt

* ci: remove redisvl dependency (temporary)

* fix: fix linting errors

* test: update test

* test: fix test
2025-04-09 18:48:43 -07:00
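The "gemini 200k+ pricing" entries above describe tiered cost calculation. The following is a minimal sketch of that idea, not the code from the commit: the class, field names, and function are hypothetical, and it assumes that once the prompt exceeds the threshold, the higher rates apply to the whole request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredPricing:
    """Hypothetical per-token rates; field names are illustrative only."""

    input_cost_per_token: float
    output_cost_per_token: float
    input_cost_per_token_above_threshold: float
    output_cost_per_token_above_threshold: float
    threshold_tokens: int = 200_000


def request_cost(pricing: TieredPricing, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost of one request under a two-tier price table.

    Assumption: the tier is selected by prompt size alone, and the selected
    rates are applied to every token in the request (not only the excess).
    """
    if prompt_tokens > pricing.threshold_tokens:
        in_rate = pricing.input_cost_per_token_above_threshold
        out_rate = pricing.output_cost_per_token_above_threshold
    else:
        in_rate = pricing.input_cost_per_token
        out_rate = pricing.output_cost_per_token
    return prompt_tokens * in_rate + completion_tokens * out_rate
```

Selecting the tier in one place keeps the threshold comparison out of the arithmetic, so a marginal-rate variant would only need to change the final line.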
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support (#9781)
* test: add initial e2e test

* fix(vertex_ai/files): initial commit adding sync file create support

* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint

* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint

* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload

* test: working e2e jsonl call

* test: unit testing for jsonl file creation

* fix(vertex_ai/transformation.py): reset file pointer after read

allow multiple reads on same file object

* fix: fix linting errors

* fix: fix ruff linting errors

* fix: fix import

* fix: fix linting error

* fix: fix linting error

* fix(vertex_ai/files/transformation.py): fix linting error

* test: update test

* test: update tests

* fix: fix linting errors

* fix: fix test

* fix: fix linting error
2025-04-09 14:01:48 -07:00
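The "reset file pointer after read" fix above can be illustrated with a small sketch. The helper name is hypothetical and this is not the code from `vertex_ai/transformation.py`; it only shows why rewinding lets the same file object be read more than once.

```python
import io
from typing import BinaryIO


def read_all_and_rewind(file_obj: BinaryIO) -> bytes:
    """Read the remaining bytes, then move the pointer back to the start.

    Without the seek, a second read() on the same object returns b"".
    """
    data = file_obj.read()
    file_obj.seek(0)  # rewind so later callers can read the file again
    return data


# Usage: two reads of the same in-memory file return the same bytes.
buffer = io.BytesIO(b'{"prompt": "hello"}')
first = read_all_and_rewind(buffer)
second = read_all_and_rewind(buffer)
```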
Krish Dholakia
ac9f03beae
Allow passing thinking param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) (#9386)
* fix(litellm_proxy/chat/transformation.py): support 'thinking' param

Fixes https://github.com/BerriAI/litellm/issues/9380

* feat(azure/gpt_transformation.py): add azure audio model support

Closes https://github.com/BerriAI/litellm/issues/6305

* fix(utils.py): use provider_config in common functions

* fix(utils.py): add missing provider configs to get_chat_provider_config

* test: fix test

* fix: fix path

* feat(utils.py): make bedrock invoke nova config baseconfig compatible

* fix: fix linting errors

* fix(azure_ai/transformation.py): remove buggy optional param filtering for azure ai

Removes incorrect check for tool choice support when calling azure ai - it prevented calling models with response_format unless they were on the litellm model cost map

* fix(amazon_cohere_transformation.py): fix bedrock invoke cohere transformation to inherit from coherechatconfig

* test: fix azure ai tool choice mapping

* fix: fix model cost map to add 'supports_tool_choice' to cohere models

* fix(get_supported_openai_params.py): check if custom llm provider in llm providers

* fix(get_supported_openai_params.py): fix llm provider in list check

* fix: fix ruff check errors

* fix: support defs when calling bedrock nova

* fix(factory.py): fix test
2025-04-07 21:04:11 -07:00
Krish Dholakia
fcf17d114f
Litellm dev 04 05 2025 p2 (#9774)
* test: move test to just checking async

* fix(transformation.py): handle function call with no schema

* fix(utils.py): handle pydantic base model in message tool calls

Fix https://github.com/BerriAI/litellm/issues/9321

* fix(vertex_and_google_ai_studio.py): handle tools=[]

Fixes https://github.com/BerriAI/litellm/issues/9080

* test: remove max token restriction

* test: fix basic test

* fix(get_supported_openai_params.py): fix check

* fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0

* fix: fix test

* fix: parse out empty dictionary on dbrx streaming + tool calls

* fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param

* fix: fix ruff check

* fix: handle no strict in function

* fix: revert bedrock change - handle in separate PR
2025-04-07 21:02:52 -07:00
Krish Dholakia
34bdf36eab
Add inference providers support for Hugging Face (#8258) (#9738) (#9773)
* Add inference providers support for Hugging Face (#8258)

* add first version of inference providers for huggingface

* temporarily skipping tests

* Add documentation

* Fix titles

* remove max_retries from params and clean up

* add suggestions

* use llm http handler

* update doc

* add suggestions

* run formatters

* add tests

* revert

* revert

* rename file

* set maxsize for lru cache

* fix embeddings

* fix inference url

* fix tests following breaking change in main

* use ChatCompletionRequest

* fix tests and lint

* [Hugging Face] Remove outdated chat completion tests and fix embedding tests (#9749)

* remove or fix tests

* fix link in doc

* fix(config_settings.md): document hf api key

---------

Co-authored-by: célina <hanouticelina@gmail.com>
2025-04-05 10:50:15 -07:00
Krish Dholakia
8ee32291e0
Squashed commit of the following: (#9709)
commit b12a9892b7
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Apr 2 08:09:56 2025 -0700

    fix(utils.py): don't modify openai_token_counter

commit 294de31803
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 21:22:40 2025 -0700

    fix: fix linting error

commit cb6e9fbe40
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 19:52:45 2025 -0700

    refactor: complete migration

commit bfc159172d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 19:09:59 2025 -0700

    refactor: refactor more constants

commit 43ffb6a558
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:45:24 2025 -0700

    fix: test

commit 04dbe4310c
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:28:58 2025 -0700

    refactor: refactor: move more constants into constants.py

commit 3c26284aff
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:14:46 2025 -0700

    refactor: migrate hardcoded constants out of __init__.py

commit c11e0de69d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:11:21 2025 -0700

    build: migrate all constants into constants.py

commit 7882bdc787
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:07:37 2025 -0700

    build: initial test banning hardcoded numbers in repo
2025-04-02 21:24:54 -07:00
Ishaan Jaff
acf920a41a
Merge branch 'main' into litellm_fix_azure_o_series 2025-04-02 20:58:52 -07:00
Krish Dholakia
053b0e741f
Add Google AI Studio /v1/files upload API support (#9645)
* test: fix import for test

* fix: fix bad error string

* docs: cleanup files docs

* fix(files/main.py): cleanup error string

* style: initial commit with a provider/config pattern for files api

google ai studio files api onboarding

* fix: test

* feat(gemini/files/transformation.py): support gemini files api response transformation

* fix(gemini/files/transformation.py): return file id as gemini uri

allows id to be passed in to chat completion request, just like openai

* feat(llm_http_handler.py): support async route for files api on llm_http_handler

* fix: fix linting errors

* fix: fix model info check

* fix: fix ruff errors

* fix: fix linting errors

* Revert "fix: fix linting errors"

This reverts commit 926a5a527f.

* fix: fix linting errors

* test: fix test

* test: fix tests
2025-04-02 08:56:58 -07:00
Krish Dholakia
453003c378
fix(gemini/): add gemini/ route optional param mapping support (#9677)
Fixes https://github.com/BerriAI/litellm/issues/9654
2025-04-02 08:56:32 -07:00
Krish Dholakia
23051d89dd
fix(streaming_handler.py): fix completion start time tracking (#9688)
* fix(streaming_handler.py): fix completion start time tracking

Fixes https://github.com/BerriAI/litellm/issues/9210

* feat(anthropic/chat/transformation.py): map openai 'reasoning_effort' to anthropic 'thinking' param

Fixes https://github.com/BerriAI/litellm/issues/9022

* feat: map 'reasoning_effort' to 'thinking' param across bedrock + vertex

Closes https://github.com/BerriAI/litellm/issues/9022#issuecomment-2705260808
2025-04-01 22:00:56 -07:00
Ishaan Jaff
5f286fe147 fix _check_valid_arg 2025-04-01 21:20:31 -07:00
Ishaan Jaff
f7129e5e59 fix _apply_openai_param_overrides 2025-04-01 21:17:59 -07:00
Ishaan Jaff
9acda77b75 add allowed_openai_params 2025-04-01 19:54:35 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Krish Dholakia
ccbac691e5
Support discovering gemini, anthropic, xai models by calling their /v1/model endpoint (#9530)
* fix: initial commit for adding provider model discovery to gemini

* feat(gemini/): add model discovery for gemini/ route

* docs(set_keys.md): update docs to show you can check available gemini models as well

* feat(anthropic/): add model discovery for anthropic api key

* feat(xai/): add model discovery for XAI

enables checking what models an xai key can call

* ci: bump ci config yml

* fix(topaz/common_utils.py): fix linting error

* fix: fix linting error for python38
2025-03-27 22:50:48 -07:00
Krish Dholakia
c0845fec1f
Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows gpt-4o and whisper audio transformation to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-audio-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
Ishaan Jaff
cf22d31b2b search_context_cost_per_query 2025-03-22 14:52:58 -07:00
Ishaan Jaff
7dd37a5b18 fix supports_web_search 2025-03-22 14:02:51 -07:00
Ishaan Jaff
1d7accce9e test_supports_web_search 2025-03-22 13:49:35 -07:00
Ishaan Jaff
c4cbfd5716 supports_web_search 2025-03-22 13:01:41 -07:00
Ishaan Jaff
4b4a0b2612 supports_native_streaming 2025-03-20 13:52:30 -07:00
Ishaan Jaff
0352559c66 supports_native_streaming 2025-03-20 13:34:57 -07:00
Ishaan Jaff
bc174adcd0 add should_fake_stream 2025-03-20 09:54:26 -07:00
Krish Dholakia
d4caaae1be
Merge pull request #9274 from BerriAI/litellm_contributor_rebase_branch
Litellm contributor rebase branch
2025-03-14 21:57:49 -07:00
Sunny Wan
f9a5109203
Merge branch 'BerriAI:main' into main 2025-03-13 19:37:22 -04:00
Krrish Dholakia
1cd57e95aa fix: fix linting error 2025-03-13 14:33:19 -07:00
Tomer Bin
4a31b32a88 Support post-call guards for stream and non-stream responses 2025-03-13 08:53:54 +02:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Ishaan Jaff
342741ede1 Merge branch 'main' into litellm_responses_api_support 2025-03-12 12:04:12 -07:00
Ishaan Jaff
b790f0a5c6 log input of response API 2025-03-11 22:34:18 -07:00
Ishaan Jaff
4d55212c62 add BaseResponsesAPIConfig 2025-03-11 15:57:53 -07:00
Krrish Dholakia
f56c5ca380 feat: working e2e credential management - support reusing existing credentials 2025-03-10 19:29:24 -07:00
Krrish Dholakia
fdd5ba3084 feat(credential_accessor.py): support loading in credentials from credential_list
Resolves https://github.com/BerriAI/litellm/issues/9114
2025-03-10 17:15:58 -07:00
omrishiv
0674491386
add support for Amazon Nova Canvas model (#7838)
* add initial support for Amazon Nova Canvas model

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* adjust name to AmazonNovaCanvas and map function variables to config

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* tighten model name check

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* fix quality mapping

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add premium quality in config

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* support all Amazon Nova Canvas tasks

* remove unused import

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add tests for image generation tasks and fix payload

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add missing util file

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* update model prices backup file

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* remove image tasks other than text->image

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

---------

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-03-10 08:02:00 -07:00
Krish Dholakia
4330ef8e81
Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077)
* feat(batches/): fix batch cost calculation - ensure it's accurate

use the correct cost value - prev. defaulting to non-batch cost

* feat(batch_utils.py): log batch models to spend logs + standard logging payload

makes it easy to understand how cost was calculated

* fix: fix stored payload for test

* test: fix test
2025-03-08 11:47:25 -08:00
Krish Dholakia
0e3caf92b9
UI - new API Playground for testing LiteLLM translation (#9073)
* feat: initial commit - enable dev to see translated request

* feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm

* feat(transform_request.tsx): allow user to see their transformed request

* refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body

easier to render each individually on UI vs. extracting from combined string

* feat: transform_request.tsx

working e2e raw request viewing

* fix(litellm_logging.py): fix transform viewing for bedrock models

* fix(litellm_logging.py): don't return sensitive headers in raw request headers

prevent accidental leak

* feat(transform_request.tsx): style improvements
2025-03-07 19:39:31 -08:00
Ishaan Jaff
f47987e673
(Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthropic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
Sunny Wan
a2fed4059e added Snowflake config to ProviderConfigManager 2025-03-05 20:32:18 -05:00
Krish Dholakia
662c59adcf
Support caching on reasoning content + other fixes (#8973)
* fix(factory.py): pass on anthropic thinking content from assistant call

* fix(factory.py): fix anthropic messages to handle thinking blocks

Fixes https://github.com/BerriAI/litellm/issues/8961

* fix(factory.py): fix bedrock handling for assistant content in messages

Fixes https://github.com/BerriAI/litellm/issues/8961

* feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block

ensures caching works for anthropic thinking block

* fix(convert_dict_to_response.py): pass all message params to delta block

ensures streaming delta also contains the reasoning content / thinking block

* test(test_prompt_factory.py): remove redundant test

anthropic now supports assistant as the first message

* fix(factory.py): fix linting errors

* fix: fix code qa

* test: remove falsy test

* fix(litellm_logging.py): fix str conversion
2025-03-04 21:12:16 -08:00
Krrish Dholakia
8ea3d4c046 build: merge litellm_dev_03_01_2025_p2 2025-03-03 23:05:41 -08:00
Krish Dholakia
a65bfab697
Fix calling claude via invoke route + response_format support for claude on invoke route (#8908)
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route

move to using anthropic config as base

* fix(utils.py): expose anthropic config via providerconfigmanager

* fix(llm_http_handler.py): support json mode on async completion calls

* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke

* fix(anthropic/): handle `response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config

Prevents error when passing in `response_format: {"type": "text"}`

* test: fix test

* fix(utils.py): fix base invoke provider check

* fix(anthropic_claude3_transformation.py): don't pass 'stream' param

* fix: fix linting errors

* fix(converse_transformation.py): handle response_format type=text for converse
2025-02-28 17:56:26 -08:00
Krish Dholakia
017c482d7b
fix(o_series_transformation.py): fix optional param check for o-serie… (#8787)
* fix(o_series_transformation.py): fix optional param check for o-series models

o3-mini and o1 do not support parallel tool calling

* fix(utils.py): support 'drop_params' for 'thinking' param across models

allows switching to older claude versions (or non-anthropic models) and param to be safely dropped

* fix: fix passing thinking param in optional params

allows dropping thinking_param where not applicable

* test: update old model

* fix(utils.py): fix linting errors

* fix(main.py): add param to acompletion
2025-02-26 12:26:55 -08:00
Krrish Dholakia
fcf4ea3608 build: merge squashed commit
Squashed commit of the following:

commit 6678e15381
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 09:29:15 2025 -0800

    test_prompt_caching

commit bd86e0ac47
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 08:57:16 2025 -0800

    test_prompt_caching

commit 2fc21ad51e
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 08:13:45 2025 -0800

    test_aprompt_caching

commit d94cff55ff
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 08:13:12 2025 -0800

    test_prompt_caching

commit 49c5e7811e
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 07:43:53 2025 -0800

    ui new build

commit cb8d5e5917
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 07:38:56 2025 -0800

    (UI) - Create Key flow for existing users (#8844)

    * working create user button

    * working create user for a key flow

    * allow searching users

    * working create user + key

    * use clear sections on create key

    * better search for users

    * fix create key

    * ui fix create key button - make it neater / cleaner

    * ui fix all keys table

commit 335ba30467
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Feb 26 08:53:17 2025 -0800

    fix: fix file name

commit b8c5b31a4e
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Tue Feb 25 22:54:46 2025 -0800

    fix: fix utils

commit ac6e503461
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Feb 24 10:43:31 2025 -0800

    fix(main.py): fix openai message for assistant msg if role is missing - openai allows this

    Fixes https://github.com/BerriAI/litellm/issues/8661

commit de3989dbc5
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Feb 24 21:19:25 2025 -0800

    fix(get_litellm_params.py): handle no-log being passed in via kwargs

    Fixes https://github.com/BerriAI/litellm/issues/8380
2025-02-26 09:39:27 -08:00
Krish Dholakia
09462ba80c
Add cohere v2/rerank support (#8421) (#8605)
* Add cohere v2/rerank support (#8421)

* Support v2 endpoint cohere rerank

* Add tests and docs

* Make v1 default if old params used

* Update docs

* Update docs pt 2

* Update tests

* Add e2e test

* Clean up code

* Use inheritance for new config

* Fix linting issues (#8608)

* Fix cohere v2 failing test + linting (#8672)

* Fix test and unused imports

* Fix tests

* fix: fix linting errors

* test: handle tgai instability

* fix: skip service unavailable err

* test: print logs for unstable test

* test: skip unreliable tests

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
2025-02-22 22:25:29 -08:00
Krish Dholakia
b682dc4ec8
Add cost tracking for rerank via bedrock (#8691)
* feat(bedrock/rerank): infer model region if model given as arn

* test: add unit testing to ensure bedrock region name inferred from arn on rerank

* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result

Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137

* test(test_bedrock_completion.py): add testing for bedrock cohere rerank

* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking

* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map

* fix(cost_calculator.py): bedrock/common_utils.py

get base model from model w/ arn -> handles rerank model

* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing

* feat(bedrock/rerank): migrate bedrock config to basererank config

* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"

This reverts commit 84fae1f167.

* test: add testing to ensure large doc / queries are correctly counted

* Revert "test: add testing to ensure large doc / queries are correctly counted"

This reverts commit 4337f1657e.

* fix(migrate-jina-ai-to-rerank-config): enables cost tracking

* refactor(jina_ai/): finish migrating jina ai to base rerank config

enables cost tracking

* fix(jina_ai/rerank): e2e jina ai rerank cost tracking

* fix: cleanup dead code

* fix: fix python3.8 compatibility error

* test: fix test

* test: add e2e testing for azure ai rerank

* fix: fix linting error

* test: mark cohere as flaky
2025-02-20 21:00:18 -08:00
Krish Dholakia
f9df01fbc6
fix(utils.py): handle token counter error when invalid message passed in (#8670)
* fix(utils.py): handle token counter error

* fix(utils.py): testing fixes

* fix(utils.py): fix incr for num tokens from list

* fix(utils.py): fix text str token counting
2025-02-19 22:21:34 -08:00
Krrish Dholakia
9470f57e86 build: extract <think>..</think> block for amazon deepseek r1 and put in reasoning_content 2025-02-19 21:10:38 -08:00
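The commit above moves a `<think>..</think>` block out of the model's text and into `reasoning_content`. A minimal sketch of that split is shown below. The function name and return shape are hypothetical, and it assumes at most one think block per response; it is not the code from the commit.

```python
import re
from typing import Optional, Tuple

# DOTALL lets the reasoning span multiple lines; the match is non-greedy.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[Optional[str], str]:
    """Return (reasoning_content, content) for a raw model response.

    If no <think> block is present, reasoning_content is None and the
    text is returned unchanged.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return None, text
    reasoning = match.group(1).strip()
    # Remove the whole tagged block from the visible answer.
    content = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, content
```

Keeping the reasoning in its own field means callers that only display `content` never show the tagged block.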
Krish Dholakia
2340f1b31f
Pass router tags in request headers - x-litellm-tags (#8609)
* feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header

allow tag based routing + spend tracking via request headers

* docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking

* docs(tag_routing.md): add to docs

* fix(utils.py): only pass str values for openai metadata param

* fix(utils.py): drop non-str values for metadata param to openai

preview-feature, otel span was being sent in
2025-02-18 08:26:22 -08:00
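The last two fixes in the commit above pass only string values in the OpenAI `metadata` param and drop everything else (the message notes that an otel span was being sent in). A minimal sketch of such a filter follows; the function name is hypothetical and this is not the code from `utils.py`.

```python
from typing import Any, Dict


def keep_str_metadata(metadata: Dict[str, Any]) -> Dict[str, str]:
    """Return a copy of metadata containing only string values.

    Non-string values (objects, numbers, None) are dropped rather than
    converted, so nothing unexpected is serialized into the request.
    """
    return {key: value for key, value in metadata.items() if isinstance(value, str)}
```

Dropping instead of calling `str()` on each value is a deliberate choice in this sketch: stringifying an arbitrary object could put an unhelpful or sensitive representation into the outgoing request.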