Commit graph

48 commits

Author SHA1 Message Date
Krish Dholakia
fcf17d114f
Litellm dev 04 05 2025 p2 (#9774)
* test: move test to just checking async

* fix(transformation.py): handle function call with no schema

* fix(utils.py): handle pydantic base model in message tool calls

Fix https://github.com/BerriAI/litellm/issues/9321

* fix(vertex_and_google_ai_studio.py): handle tools=[]

Fixes https://github.com/BerriAI/litellm/issues/9080

* test: remove max token restriction

* test: fix basic test

* fix(get_supported_openai_params.py): fix check

* fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0

* fix: fix test

* fix: parse out empty dictionary on dbrx streaming + tool calls

* fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param

* fix: fix ruff check

'

* fix: handle no strict in function

* fix: revert bedrock change - handle in separate PR
2025-04-07 21:02:52 -07:00
Krish Dholakia
8d338aee78
fix(databricks/chat/transformation.py): remove reasoning_effort from request (#9811)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 16s
Helm unit test / unit-test (push) Successful in 27s
Fixes https://github.com/BerriAI/litellm/issues/9700#issuecomment-2784431995
2025-04-07 19:43:19 -07:00
Krish Dholakia
9a60cd9deb
fix(gemini/transformation.py): handle file_data being passed in (#9786) 2025-04-07 16:32:08 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia
5ad2fbcba6
Openrouter streaming fixes + Anthropic 'file' message support (#9667)
* fix(openrouter/transformation.py): Handle error in openrouter stream

Fixes https://github.com/Aider-AI/aider/issues/3550

* test(test_openrouter_chat_transformation.py): add unit tests

* feat(anthropic/chat/transformation.py): add openai 'file' message content type support

Closes https://github.com/BerriAI/litellm/issues/9463

* fix(factory.py): add bedrock converse support for openai 'file' message content type

Closes https://github.com/BerriAI/litellm/issues/9463
2025-03-31 21:22:59 -07:00
Krish Dholakia
5c107c64dd
Add gemini audio input support + handle special tokens in sagemaker response (#9640)
* fix(internal_user_endpoints.py): cleanup unused variables on beta endpoint

no team/org split on daily user endpoint

* build(model_prices_and_context_window.json): gemini-2.0-flash supports audio input

* feat(gemini/transformation.py): support passing audio input to gemini

* test: fix test

* fix(gemini/transformation.py): support audio input as a url

enables passing google cloud bucket urls

* fix(gemini/transformation.py): support explicitly passing format of file

* fix(gemini/transformation.py): expand support for inferred file types from url

* fix(sagemaker/completion/transformation.py): fix special token error when counting sagemaker tokens

* test: fix import
2025-03-29 19:23:09 -07:00
Krish Dholakia
222898d727
Fix anthropic thinking + response_format (#9594)
* fix(anthropic/chat/transformation.py): Don't set tool choice on response_format conversion when thinking is enabled

Not allowed by Anthropic

Fixes https://github.com/BerriAI/litellm/issues/8901

* refactor: move test to base anthropic chat tests

ensures consistent behaviour across vertex/anthropic/bedrock

* fix(anthropic/chat/transformation.py): if thinking token is specified and max tokens is not - ensure max token to anthropic is higher than thinking tokens

* feat(converse_transformation.py): correctly handle thinking + response format on Bedrock Converse

Fixes https://github.com/BerriAI/litellm/issues/8901

* fix(converse_transformation.py): correctly handle adding max tokens

* test: handle service unavailable error
2025-03-28 15:57:40 -07:00
Krrish Dholakia
a1b716c1ef test: fix test - handle llm api inconsistency 2025-03-21 10:51:34 -07:00
Krrish Dholakia
e7ef14398f fix(anthropic/chat/transformation.py): correctly update response_format to tool call transformation
Fixes https://github.com/BerriAI/litellm/issues/9411
2025-03-21 10:20:21 -07:00
Krrish Dholakia
16224f8db6 fix(o_series_handler.py): handle async calls 2025-03-11 21:22:13 -07:00
Krrish Dholakia
4418e6dd14 build: merge branch 2025-03-02 08:31:57 -08:00
Krish Dholakia
c84b489d58
Fix bedrock passing response_format: {"type": "text"} (#8900)
* fix(converse_transformation.py): ignore type: text, value in response_format

no-op for bedrock

* fix(converse_transformation.py): handle adding response format value to tools

* fix(base_invoke_transformation.py): fix 'get_bedrock_invoke_provider' to handle cross-region-inferencing models

* test(test_bedrock_completion.py): add unit testing for bedrock invoke provider logic

* test: update test

* fix(exception_mapping_utils.py): add context window exceeded error handling for databricks provider route

* fix(fireworks_ai/): support passing tools + response_format together

* fix: cleanup

* fix(base_invoke_transformation.py): fix imports
2025-02-28 20:09:59 -08:00
Krish Dholakia
a65bfab697
Fix calling claude via invoke route + response_format support for claude on invoke route (#8908)
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route

move to using anthropic config as base

* fix(utils.py): expose anthropic config via providerconfigmanager

* fix(llm_http_handler.py): support json mode on async completion calls

* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke

* fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config

Prevents error when passing in 'response_format: {"type": "text"}

* test: fix test

* fix(utils.py): fix base invoke provider check

* fix(anthropic_claude3_transformation.py): don't pass 'stream' param

* fix: fix linting errors

* fix(converse_transformation.py): handle response_format type=text for converse
2025-02-28 17:56:26 -08:00
Ishaan Jaff
4858417283 test_prompt_caching 2025-02-26 08:57:16 -08:00
Ishaan Jaff
2cf66f8267 test_aprompt_caching 2025-02-26 08:13:45 -08:00
Ishaan Jaff
56b2576979 test_prompt_caching 2025-02-26 08:13:12 -08:00
Ishaan Jaff
0d398c2295 test_message_with_name 2025-02-18 21:53:47 -08:00
Krrish Dholakia
89c8dfc644 test: handle unstable tests 2025-02-18 18:44:12 -08:00
Ishaan Jaff
125f6fff67
(Feat) - Add /bedrock/meta.llama3-3-70b-instruct-v1:0 tool calling support + cost tracking + base llm unit test for tool calling (#8545)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
* Add support for bedrock meta.llama3-3-70b-instruct-v1:0 tool calling (#8512)

* fix(converse_transformation.py): fixing bedrock meta.llama3-3-70b tool calling

* test(test_bedrock_completion.py): adding llama3.3 tool compatibility check

* add TestBedrockTestSuite

* add bedrock llama 3.3 to base llm class

* us.meta.llama3-3-70b-instruct-v1:0

* test_basic_tool_calling

* TestAzureOpenAIO1

* test_basic_tool_calling

* test_basic_tool_calling

---------

Co-authored-by: miraclebakelaser <65143272+miraclebakelaser@users.noreply.github.com>
2025-02-14 14:15:25 -08:00
Krish Dholakia
4e34fc3bf8
[BETA] Support OIDC role based access to proxy (#8260)
* feat(proxy/_types.py): add new jwt field params

allows users + services to auth into proxy

* feat(handle_jwt.py): allow team role proxy access

allows proxy admin to set allowed team roles

* fix(proxy/_types.py): add 'routes' to role based permissions

allow proxy admin to restrict what routes a team can access easily

* feat(handle_jwt.py): support more flexible role based route access

v2 on role based 'allowed_routes'

* test(test_jwt.py): add unit test for rbac for proxy routes

* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`

* docs(token_auth.md): add documentation on controlling model access via OIDC Roles

* test: increase time delay before retrying

* test: handle model overloaded for test
2025-02-04 21:59:39 -08:00
Krish Dholakia
c8494abdea
test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Krish Dholakia
97b8de17ab
LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 (#7828)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
* fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls

* test: add base test for `http:` url's

* fix(factory.py/get_image_details): follow redirects

allows http calls to work

* fix(codestral/): fix stream chunk parsing on last chunk of stream

* Azure ad token provider (#6917)

* Update azure.py

Added optional parameter azure ad token provider

* Added parameter to main.py

* Found token provider arg location

* Fixed embeddings

* Fixed ad token provider

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix: fix linting errors

* fix(main.py): leave out o1 route for azure ad token provider, for now

get v0 out for sync azure gpt route to begin with

* test: skip http:// test for fireworks ai

model does not support it

* refactor: cleanup dead code

* fix: revert http:// url passthrough for gemini

google ai studio raises errors

* test: fix test

---------

Co-authored-by: bahtman <anton@baht.dk>
2025-02-02 23:17:50 -08:00
Krish Dholakia
1105e35538
Complete o3 model support (#8183)
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param

Closes https://github.com/BerriAI/litellm/issues/8182

* fix(main.py): ensure `reasoning_effort` is a mapped openai param

* refactor(azure/): rename o1_[x] files to o_series_[x]

* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort

* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests

* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers

Makes it easy to switch from openai to anthropic

* fix: fix linting errors

* fix(base_llm_unit_tests.py): fix test

* fix(main.py): add missing param
2025-02-02 22:36:37 -08:00
Krish Dholakia
08b124aeb6
Litellm dev 01 25 2025 p2 (#8003)
* fix(base_utils.py): supported nested json schema passed in for anthropic calls

* refactor(base_utils.py): refactor ref parsing to prevent infinite loop

* test(test_openai_endpoints.py): refactor anthropic test to use bedrock

* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls

Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
2025-01-25 16:50:57 -08:00
Krish Dholakia
c6e9240405
Add datadog health check support + fix bedrock converse cost tracking w/ region name specified (#7958)
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls

* fix(utils.py): fix split model handling

Fixes bedrock cost calculation when region name is given

* feat(_health_endpoints.py): support health checking datadog integration

Closes https://github.com/BerriAI/litellm/issues/7921
2025-01-23 22:17:09 -08:00
Krish Dholakia
71c41f8f33
QA: ensure all bedrock regional models have same supported_ as base + Anthropic nested pydantic object support (#7844)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* build: ensure all regional bedrock models have same supported values as base bedrock model

prevents drift

* test(base_llm_unit_tests.py): add testing for nested pydantic objects

* fix(test_utils.py): add test_get_potential_model_names

* fix(anthropic/chat/transformation.py): support nested pydantic objects

Fixes https://github.com/BerriAI/litellm/issues/7755
2025-01-17 19:49:12 -08:00
Krish Dholakia
347779b813
Litellm dev 12 30 2024 p1 (#7480)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic alr. implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* test: fix azure o1 test

* test: fix tests

* fix: fix test
2024-12-30 21:52:52 -08:00
Krish Dholakia
c3edfc2c92
LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 35s
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generifies anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
Krish Dholakia
62b00cf28d
o1 - add image param handling (#7312)
* fix(openai.py): fix returning o1 non-streaming requests

fixes issue where fake stream always true for o1

* build(model_prices_and_context_window.json): add 'supports_vision' for o1 models

* fix: add internal server error exception mapping

* fix(base_llm_unit_tests.py): drop temperature from test

* test: mark prompt caching as a flaky test
2024-12-19 11:22:25 -08:00
Krish Dholakia
6a45ee1ef7
fix(hosted_vllm/transformation.py): return fake api key, if none give… (#7301)
* fix(hosted_vllm/transformation.py): return fake api key, if none give. Prevents httpx error

Fixes https://github.com/BerriAI/litellm/issues/7291

* test: fix test

* fix(main.py): add hosted_vllm/ support for embeddings endpoint

Closes https://github.com/BerriAI/litellm/issues/7290

* docs(vllm.md): add docs on vllm embeddings usage

* fix(__init__.py): fix sambanova model test

* fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond
2024-12-18 18:41:53 -08:00
Ishaan Jaff
8060c5c698
Litellm add router to base llm testing (#7202)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 1m15s
* code qa add litellm router to base llm testing

* test_image_url

* fix img url

* fix add router to base llm class

* fix base llm testing

* add test scenario

* fix test_json_response_format

* fixes base llm testing

* fix base llm testing

* fix test image url
2024-12-13 19:16:28 -08:00
Krish Dholakia
0c0498dd60
Litellm dev 12 07 2024 (#7086)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 11s
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools have prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying  (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exp back off for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21`  (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
Krish Dholakia
816f0ef8d2
LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051)
* fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations

ensures cost tracking is reliable - handles edge cases of parsing model cost map

* build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models

Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329

* build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map

Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html

* fix(converse_transformation.py): support amazon nova tool use

* fix(opentelemetry): Add missing LLM request type attribute to spans (#7041)

* feat(opentelemetry): add LLM request type attribute to spans

* lint

* fix: curl usage (#7038)

curl -d, --data <data> is lowercase d
curl -D, --dump-header <filename> is uppercase D

references:
https://curl.se/docs/manpage.html#-d
https://curl.se/docs/manpage.html#-D

* fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(streaming_chunk_builder.py): handle initial id being empty string

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint

* docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints

* feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk

* docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk

* fix(litellm_logging.py): use standard logging payload if present in kwargs

prevent datadog logging error for pass through endpoints

* docs(bedrock.md): add rerank api usage example to docs

* bugfix/change dummy tool name format (#7053)

* fix viewing keys (#7042)

* ui new build

* build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044)

* bye (#6982)

* (fix) litellm router.aspeech  (#6962)

* doc Migrating Databases

* fix aspeech on router

* test_audio_speech_router

* test_audio_speech_router

* docs show supported providers on batches api doc

* change dummy tool name format

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* fix: fix linting errors

* test: update test

* fix(litellm_logging.py): fix pass through check

* fix(test_otel_logging.py): fix test

* fix(cost_calculator.py): update handling for cost per second

* fix(cost_calculator.py): fix cost check

* test: fix test

* (fix) adding public routes when using custom header  (#7045)

* get_api_key_from_custom_header

* add test_get_api_key_from_custom_header

* fix testing use 1 file for test user api key auth

* fix test user api key auth

* test_custom_api_key_header_name

* build: update ui build

---------

Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com>
Co-authored-by: lloydchang <lloydchang@gmail.com>
Co-authored-by: hgulersen <haymigulersen@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-12-06 14:29:53 -08:00
Krish Dholakia
61b35c12bb
LiteLLM Minor Fixes & Improvements (12/05/2024) (#7037)
* fix(together_ai/chat): only return response_format + tools for supported models

Fixes https://github.com/BerriAI/litellm/issues/6972

* feat(bedrock/rerank): initial working commit for bedrock rerank api support

Closes https://github.com/BerriAI/litellm/issues/7021

* feat(bedrock/rerank): async bedrock rerank api support

Addresses https://github.com/BerriAI/litellm/issues/7021

* build(model_prices_and_context_window.json): add 'supports_prompt_caching' for bedrock models + cleanup cross-region from model list (duplicate information - lead to inconsistencies )

* docs(json_mode.md): clarify model support for json schema

Closes https://github.com/BerriAI/litellm/issues/6998

* fix(_service_logger.py): handle dd callback in list

ensure failed spend tracking is logged to datadog

* feat(converse_transformation.py): translate from anthropic format to bedrock format

Closes https://github.com/BerriAI/litellm/issues/7030

* fix: fix linting errors

* test: fix test
2024-12-05 00:02:31 -08:00
Krish Dholakia
6bb934c0ac
fix(key_management_endpoints.py): override metadata field value on up… (#7008)
* fix(key_management_endpoints.py): override metadata field value on update

allow user to override tags

* feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric

allow disabling end user cost tracking on prometheus - fixes cardinality issue

* fix(litellm_pre_call_utils.py): add key/team level enforced params

Fixes https://github.com/BerriAI/litellm/issues/6652

* fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update

* docs(enterprise.md): add docs on enforcing required params for llm requests

* Add support of Galadriel API (#7005)

* fix(router.py): robust retry after handling

set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment

* test(test_router.py): fix test

* feat(bedrock/): add support for 'nova' models

also adds explicit 'converse/' route for simpler routing

* fix: fix 'supports_pdf_input'

return if model supports pdf input on get_model_info

* feat(converse_transformation.py): support bedrock pdf input

* docs(document_understanding.md): add document understanding to docs

* fix(litellm_pre_call_utils.py): fix linting error

* fix(init.py): fix passing of bedrock converse models

* feat(bedrock/converse): support 'response_format={"type": "json_object"}'

* fix(converse_handler.py): fix linting error

* fix(base_llm_unit_tests.py): fix test

* fix: fix test

* test: fix test

* test: fix test

* test: remove duplicate test

---------

Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>
2024-12-03 23:03:50 -08:00
Ishaan Jaff
2a8d64991f
(fix) 'utf-8' codec can't encode characters error on OpenAI (#7018)
* test_openai_multilingual

* pin httpx

* fix openai pyproject

* test_multilingual_requests

* TestOpenAIChatCompletion

* fix test anthropic completion
2024-12-03 20:33:14 -08:00
Krrish Dholakia
aaa8c7caf0 test: add retry on flaky test 2024-12-02 23:08:19 -08:00
Krrish Dholakia
0caf804f4c feat(databricks/chat): support structured outputs on databricks
Closes https://github.com/BerriAI/litellm/pull/6978

- handles content as list for dbrx, - handles streaming+response_format for dbrx
2024-12-02 23:08:19 -08:00
Krish Dholakia
2d2931a215
LiteLLM Minor Fixes & Improvements (11/26/2024) (#6913)
* docs(config_settings.md): document all router_settings

* ci(config.yml): add router_settings doc test to ci/cd

* test: debug test on ci/cd

* test: debug ci/cd test

* test: fix test

* fix(team_endpoints.py): skip invalid team object. don't fail `/team/list` call

Causes downstream errors if ui just fails to load team list

* test(base_llm_unit_tests.py): add 'response_format={"type": "text"}' test to base_llm_unit_tests

adds complete coverage for all 'response_format' values to ci/cd

* feat(router.py): support wildcard routes in `get_router_model_info()`

Addresses https://github.com/BerriAI/litellm/issues/6914

* build(model_prices_and_context_window.json): add tpm/rpm limits for all gemini models

Allows for ratelimit tracking for gemini models even with wildcard routing enabled

Addresses https://github.com/BerriAI/litellm/issues/6914

* feat(router.py): add tpm/rpm tracking on success/failure to global_router

Addresses https://github.com/BerriAI/litellm/issues/6914

* feat(router.py): support wildcard routes on router.get_model_group_usage()

* fix(router.py): fix linting error

* fix(router.py): implement get_remaining_tokens_and_requests

Addresses https://github.com/BerriAI/litellm/issues/6914

* fix(router.py): fix linting errors

* test: fix test

* test: fix tests

* docs(config_settings.md): add missing dd env vars to docs

* fix(router.py): check if hidden params is dict
2024-11-28 00:01:38 +05:30
Krish Dholakia
8673f2541e
fix(key_management_endpoints.py): fix user-membership check when creating team key (#6890)
* fix(key_management_endpoints.py): fix user-membership check when creating team key

* docs: add deprecation notice on original `/v1/messages` endpoint + add better swagger tags on pass-through endpoints

* fix(gemini/): fix image_url handling for gemini

Fixes https://github.com/BerriAI/litellm/issues/6897

* fix(teams.tsx): fix member add when role is 'user'

* fix(team_endpoints.py): /team/member_add

fix adding several new members to team

* test(test_vertex.py): remove redundant test

* test(test_proxy_server.py): fix team member add tests
2024-11-26 14:19:24 +05:30
Krish Dholakia
7e5085dc7b
Litellm dev 11 21 2024 (#6837)
* Fix Vertex AI function calling invoke: use JSON format instead of protobuf text format. (#6702)

* test: test tool_call conversion when arguments is empty dict

Fixes https://github.com/BerriAI/litellm/issues/6833

* fix(openai_like/handler.py): return more descriptive error message

Fixes https://github.com/BerriAI/litellm/issues/6812

* test: skip overloaded model

* docs(anthropic.md): update anthropic docs to show how to route to any new model

* feat(groq/): fake stream when 'response_format' param is passed

Groq doesn't support streaming when response_format is set

* feat(groq/): add response_format support for groq

Closes https://github.com/BerriAI/litellm/issues/6845

* fix(o1_handler.py): remove fake streaming for o1

Closes https://github.com/BerriAI/litellm/issues/6801

* build(model_prices_and_context_window.json): add groq llama3.2b model pricing

Closes https://github.com/BerriAI/litellm/issues/6807

* fix(utils.py): fix handling ollama response format param

Fixes https://github.com/BerriAI/litellm/issues/6848#issuecomment-2491215485

* docs(sidebars.js): refactor chat endpoint placement

* fix: fix linting errors

* test: fix test

* test: fix test

* fix(openai_like/handler): handle max retries

* fix(streaming_handler.py): fix streaming check for openai-compatible providers

* test: update test

* test: correctly handle model is overloaded error

* test: update test

* test: fix test

* test: mark flaky test

---------

Co-authored-by: Guowang Li <Guowang@users.noreply.github.com>
2024-11-22 01:53:52 +05:30
Krish Dholakia
b0be5bf3a1
LiteLLM Minor Fixes & Improvements (11/19/2024) (#6820)
* fix(anthropic/chat/transformation.py): add json schema as values: json_schema

fixes passing pydantic obj to anthropic

Fixes https://github.com/BerriAI/litellm/issues/6766

* (feat): Add timestamp_granularities parameter to transcription API (#6457)

* Add timestamp_granularities parameter to transcription API

* add param to the local test

* fix(databricks/chat.py): handle max_retries optional param handling for openai-like calls

Fixes issue with calling finetuned vertex ai models via databricks route

* build(ui/): add team admins via proxy ui

* fix: fix linting error

* test: fix test

* docs(vertex.md): refactor docs

* test: handle overloaded anthropic model error

* test: remove duplicate test

* test: fix test

* test: update test to handle model overloaded error

---------

Co-authored-by: Show <35062952+BrunooShow@users.noreply.github.com>
2024-11-21 00:57:58 +05:30
Ishaan Jaff
6ae0bc4a11
[Feature]: json_schema in response support for Anthropic (#6748)
* _convert_tool_response_to_message

* fix ModelResponseIterator

* fix test_json_response_format

* test_json_response_format_stream

* fix _convert_tool_response_to_message

* use helper _handle_json_mode_chunk

* fix _process_response

* unit testing for test_convert_tool_response_to_message_no_arguments

* update doc for JSON mode
2024-11-14 16:59:45 -08:00
Krish Dholakia
e9aa492af3
LiteLLM Minor Fixes & Improvement (11/14/2024) (#6730)
* fix(ollama.py): fix get model info request

Fixes https://github.com/BerriAI/litellm/issues/6703

* feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param

* docs(anthropic.md): document all supported openai params for anthropic

* test: fix tests

* fix: fix tests

* feat(jina_ai/): add rerank support

Closes https://github.com/BerriAI/litellm/issues/6691

* test: handle service unavailable error

* fix(handler.py): refactor together ai rerank call

* test: update test to handle overloaded error

* test: fix test

* Litellm router trace (#6742)

* feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks

* feat(router.py): log trace id across retry/fallback logic

allows grouping llm logs for the same request

* test: fix tests

* fix: fix test

* fix(transformation.py): only set non-none stop_sequences

* Litellm router disable fallbacks (#6743)

* bump: version 1.52.6 → 1.52.7

* feat(router.py): enable dynamically disabling fallbacks

Allows for enabling/disabling fallbacks per key

* feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key

* test: fix test

* fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error

* test: handle gemini error

* test: fix test

* fix: new run
2024-11-15 01:02:54 +05:30
Ishaan Jaff
6d4cf2d908
(fix) using Anthropic response_format={"type": "json_object"} (#6721)
* add support for response_format=json anthropic

* add test_json_response_format to baseLLM ChatTest

* fix test_litellm_anthropic_prompt_caching_tools

* fix test_anthropic_function_call_with_no_schema

* test test_create_json_tool_call_for_response_format
2024-11-12 19:06:00 -08:00
Ishaan Jaff
9d20c19e0c
(fix) OpenAI's optional messages[].name does not work with Mistral API (#6701)
* use helper for _transform_messages mistral

* add test_message_with_name to base LLMChat test

* fix linting
2024-11-11 18:03:41 -08:00
Krish Dholakia
f59cb46e71
Litellm dev 11 11 2024 (#6693)
* fix(__init__.py): add 'watsonx_text' as mapped llm api route

Fixes https://github.com/BerriAI/litellm/issues/6663

* fix(opentelemetry.py): fix passing parallel tool calls to otel

Fixes https://github.com/BerriAI/litellm/issues/6677

* refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling

reduces bugs in repo

* fix(__init__.py): update provider-model mapping to include all known provider-model mappings

Fixes https://github.com/BerriAI/litellm/issues/6669

* feat(anthropic): support passing document in llm api call

* docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function

* fix(factory.py): fix linting error
2024-11-12 00:16:35 +05:30
Krish Dholakia
73531f4815
Litellm dev 11 08 2024 (#6658)
* fix(deepseek/chat): convert content list to str

Fixes https://github.com/BerriAI/litellm/issues/6642

* test(test_deepseek_completion.py): implement base llm unit tests

increase robustness across providers

* fix(router.py): support content policy violation fallbacks with default fallbacks

* fix(opentelemetry.py): refactor to move otel imports behing flag

Fixes https://github.com/BerriAI/litellm/issues/6636

* fix(opentelemtry.py): close span on success completion

* fix(user_api_key_auth.py): allow user_role to default to none

* fix: mark flaky test

* fix(opentelemetry.py): move otelconfig.from_env to inside the init

prevent otel errors raised just by importing the litellm class

* fix(user_api_key_auth.py): fix auth error
2024-11-08 22:07:17 +05:30