Commit graph

86 commits

Author SHA1 Message Date
Krish Dholakia
9b0f871129
Add /vllm/* and /mistral/* passthrough endpoints (adds support for Mistral OCR via passthrough)
* feat(llm_passthrough_endpoints.py): support mistral passthrough

Closes https://github.com/BerriAI/litellm/issues/9051

* feat(llm_passthrough_endpoints.py): initial commit for adding vllm passthrough route

* feat(vllm/common_utils.py): add new vllm model info route

make it possible to use vllm passthrough route via factory function

* fix(llm_passthrough_endpoints.py): add all methods to vllm passthrough route

* fix: fix linting error

* fix: fix linting error

* fix: fix ruff check

* fix(proxy/_types.py): add new passthrough routes

* docs(config_settings.md): add mistral env vars to docs
2025-04-14 22:06:33 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db (#9838)
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db

Fixes https://github.com/BerriAI/litellm/issues/9732

* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens

Fixes https://github.com/BerriAI/litellm/issues/9812

* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming

reduce errors

* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens

* build: remove redisvl from requirements.txt (temporary)

* fix(spend_tracking_utils.py): handle circular references

* test: update code cov test

* test: update test
2025-04-09 21:26:43 -07:00
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support (#9781)
* test: add initial e2e test

* fix(vertex_ai/files): initial commit adding sync file create support

* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint

* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint

* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload

* test: working e2e jsonl call

* test: unit testing for jsonl file creation

* fix(vertex_ai/transformation.py): reset file pointer after read

allow multiple reads on same file object

* fix: fix linting errors

* fix: fix ruff linting errors

* fix: fix import

* fix: fix linting error

* fix: fix linting error

* fix(vertex_ai/files/transformation.py): fix linting error

* test: update test

* test: update tests

* fix: fix linting errors

* fix: fix test

* fix: fix linting error
2025-04-09 14:01:48 -07:00
Krish Dholakia
ac9f03beae
Allow passing thinking param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) (#9386)
* fix(litellm_proxy/chat/transformation.py): support 'thinking' param

Fixes https://github.com/BerriAI/litellm/issues/9380

* feat(azure/gpt_transformation.py): add azure audio model support

Closes https://github.com/BerriAI/litellm/issues/6305

* fix(utils.py): use provider_config in common functions

* fix(utils.py): add missing provider configs to get_chat_provider_config

* test: fix test

* fix: fix path

* feat(utils.py): make bedrock invoke nova config baseconfig compatible

* fix: fix linting errors

* fix(azure_ai/transformation.py): remove buggy optional param filtering for azure ai

Removes incorrect check for support tool choice when calling azure ai - prevented calling models with response_format unless on litell model cost map

* fix(amazon_cohere_transformation.py): fix bedrock invoke cohere transformation to inherit from coherechatconfig

* test: fix azure ai tool choice mapping

* fix: fix model cost map to add 'supports_tool_choice' to cohere models

* fix(get_supported_openai_params.py): check if custom llm provider in llm providers

* fix(get_supported_openai_params.py): fix llm provider in list check

* fix: fix ruff check errors

* fix: support defs when calling bedrock nova

* fix(factory.py): fix test
2025-04-07 21:04:11 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia
6dda1ba6dd
LiteLLM Minor Fixes & Improvements (04/02/2025) (#9725)
* Add date picker to usage tab + Add reasoning_content token tracking across all providers on streaming (#9722)

* feat(new_usage.tsx): add date picker for new usage tab

allow user to look back on their usage data

* feat(anthropic/chat/transformation.py): report reasoning tokens in completion token details

allows usage tracking on how many reasoning tokens are actually being used

* feat(streaming_chunk_builder.py): return reasoning_tokens in anthropic/openai streaming response

allows tracking reasoning_token usage across providers

* Fix update team metadata + fix bulk adding models on Ui  (#9721)

* fix(handle_add_model_submit.tsx): fix bulk adding models

* fix(team_info.tsx): fix team metadata update

Fixes https://github.com/BerriAI/litellm/issues/9689

* (v0) Unified file id - allow calling multiple providers with same file id (#9718)

* feat(files_endpoints.py): initial commit adding 'target_model_names' support

allow developer to specify all the models they want to call with the file

* feat(files_endpoints.py): return unified files endpoint

* test(test_files_endpoints.py): add validation test - if invalid purpose submitted

* feat: more updates

* feat: initial working commit of unified file id translation

* fix: additional fixes

* fix(router.py): remove model replace logic in jsonl on acreate_file

enables file upload to work for chat completion requests as well

* fix(files_endpoints.py): remove whitespace around model name

* fix(azure/handler.py): return acreate_file with correct response type

* fix: fix linting errors

* test: fix mock test to run on github actions

* fix: fix ruff errors

* fix: fix file too large error

* fix(utils.py): remove redundant var

* test: modify test to work on github actions

* test: update tests

* test: more debug logs to understand ci/cd issue

* test: fix test for respx

* test: skip mock respx test

fails on ci/cd - not clear why

* fix: fix ruff check

* fix: fix test

* fix(model_connection_test.tsx): fix linting error

* test: update unit tests
2025-04-03 11:48:52 -07:00
Krish Dholakia
8ee32291e0
Squashed commit of the following: (#9709)
commit b12a9892b7
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Apr 2 08:09:56 2025 -0700

    fix(utils.py): don't modify openai_token_counter

commit 294de31803
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 21:22:40 2025 -0700

    fix: fix linting error

commit cb6e9fbe40
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 19:52:45 2025 -0700

    refactor: complete migration

commit bfc159172d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 19:09:59 2025 -0700

    refactor: refactor more constants

commit 43ffb6a558
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:45:24 2025 -0700

    fix: test

commit 04dbe4310c
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:28:58 2025 -0700

    refactor: refactor: move more constants into constants.py

commit 3c26284aff
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:14:46 2025 -0700

    refactor: migrate hardcoded constants out of __init__.py

commit c11e0de69d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:11:21 2025 -0700

    build: migrate all constants into constants.py

commit 7882bdc787
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:07:37 2025 -0700

    build: initial test banning hardcoded numbers in repo
2025-04-02 21:24:54 -07:00
Krish Dholakia
23051d89dd
fix(streaming_handler.py): fix completion start time tracking (#9688)
* fix(streaming_handler.py): fix completion start time tracking

Fixes https://github.com/BerriAI/litellm/issues/9210

* feat(anthropic/chat/transformation.py): map openai 'reasoning_effort' to anthropic 'thinking' param

Fixes https://github.com/BerriAI/litellm/issues/9022

* feat: map 'reasoning_effort' to 'thinking' param across bedrock + vertex

Closes https://github.com/BerriAI/litellm/issues/9022#issuecomment-2705260808
2025-04-01 22:00:56 -07:00
Ishaan Jaff
bc5cc51b9d
Merge pull request #9567 from BerriAI/litellm_anthropic_messages_improvements
[Refactor] - Expose litellm.messages.acreate() and  litellm.messages.create() to make LLM API calls in Anthropic API spec
2025-03-31 20:50:30 -07:00
Krish Dholakia
46b3dbde8f
Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)
This reverts commit a8673246dc.
2025-03-31 17:13:55 -07:00
Ishaan Jaff
01d85d5fb7 Merge branch 'main' into litellm_anthropic_messages_improvements 2025-03-31 14:22:56 -07:00
Sam
a8673246dc
fix: Anthropic prompt caching on GCP Vertex AI (#9605)
* fix: Anthropic prompt caching on GCP Vertex AI

* test(vertex): anthropic prompt caching
2025-03-29 23:40:34 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Krish Dholakia
222898d727
Fix anthropic thinking + response_format (#9594)
* fix(anthropic/chat/transformation.py): Don't set tool choice on response_format conversion when thinking is enabled

Not allowed by Anthropic

Fixes https://github.com/BerriAI/litellm/issues/8901

* refactor: move test to base anthropic chat tests

ensures consistent behaviour across vertex/anthropic/bedrock

* fix(anthropic/chat/transformation.py): if thinking token is specified and max tokens is not - ensure max token to anthropic is higher than thinking tokens

* feat(converse_transformation.py): correctly handle thinking + response format on Bedrock Converse

Fixes https://github.com/BerriAI/litellm/issues/8901

* fix(converse_transformation.py): correctly handle adding max tokens

* test: handle service unavailable error
2025-03-28 15:57:40 -07:00
Krrish Dholakia
69e28b92c6 fix: fix python38 linting error 2025-03-28 09:38:32 -07:00
Krrish Dholakia
34a1a48af1 fix(common_utils.py): fix linting error
Some checks failed
Read Version from pyproject.toml / read-version (push) Successful in 19s
Helm unit test / unit-test (push) Successful in 20s
Publish Prisma Migrations / publish-migrations (push) Failing after 1m15s
2025-03-27 23:31:58 -07:00
Krish Dholakia
ccbac691e5
Support discovering gemini, anthropic, xai models by calling their /v1/model endpoint (#9530)
* fix: initial commit for adding provider model discovery to gemini

* feat(gemini/): add model discovery for gemini/ route

* docs(set_keys.md): update docs to show you can check available gemini models as well

* feat(anthropic/): add model discovery for anthropic api key

* feat(xai/): add model discovery for XAI

enables checking what models an xai key can call

* ci: bump ci config yml

* fix(topaz/common_utils.py): fix linting error

* fix: fix linting error for python38
2025-03-27 22:50:48 -07:00
Ishaan Jaff
8499a88e4a fixes - anthropic messages interface 2025-03-26 17:45:47 -07:00
Ishaan Jaff
9eb9a369bb working anthropic API tests 2025-03-26 17:34:41 -07:00
Ishaan Jaff
8dcdff9280 fix anthropic_messages 2025-03-26 17:21:14 -07:00
Ishaan Jaff
3640262dbf fix anthropic_messages implementation 2025-03-26 17:12:40 -07:00
Ishaan Jaff
957b7eb82c define types for response form AnthropicMessagesResponse 2025-03-26 16:54:45 -07:00
Krrish Dholakia
86be28b640 fix: fix linting error 2025-03-21 12:20:21 -07:00
Krrish Dholakia
e7ef14398f fix(anthropic/chat/transformation.py): correctly update response_format to tool call transformation
Fixes https://github.com/BerriAI/litellm/issues/9411
2025-03-21 10:20:21 -07:00
Ishaan Jaff
f47987e673
(Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthopic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
Krish Dholakia
744e10b0f0
Litellm dev 03 05 2025 p3 (#9023)
* fix(invoke_handler.py): fix converse streaming - return signature + ensure consistency with anthropic api response

* build(model_prices_and_context_window.json): fix anthropic api claude-3-7 max output tokens

with beta header this is 128k

Resolves https://github.com/BerriAI/litellm/issues/8964

* feat(handler.py): handle new anthropic 'thinking_delta' block on streaming

Fixes https://github.com/BerriAI/litellm/issues/8825
2025-03-05 22:31:39 -08:00
Krish Dholakia
ec4f665e29
Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] (#9021)
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* test: update test to enforce signature found

* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic

* fix: fix linting error

---------

Co-authored-by: Martin Krasser <krasserm@googlemail.com>
2025-03-05 19:33:54 -08:00
Krrish Dholakia
a63eb58f1b fix(anthropic/chat/transformation.py): fix headers to be a set
avoid duplicates
2025-03-02 08:36:43 -08:00
Krish Dholakia
c84b489d58
Fix bedrock passing response_format: {"type": "text"} (#8900)
* fix(converse_transformation.py): ignore type: text, value in response_format

no-op for bedrock

* fix(converse_transformation.py): handle adding response format value to tools

* fix(base_invoke_transformation.py): fix 'get_bedrock_invoke_provider' to handle cross-region-inferencing models

* test(test_bedrock_completion.py): add unit testing for bedrock invoke provider logic

* test: update test

* fix(exception_mapping_utils.py): add context window exceeded error handling for databricks provider route

* fix(fireworks_ai/): support passing tools + response_format together

* fix: cleanup

* fix(base_invoke_transformation.py): fix imports
2025-02-28 20:09:59 -08:00
Krish Dholakia
c8dc4f3eec
converse_transformation: pass 'description' if set in response_format (#8907)
* test(test_bedrock_completion.py): e2e test ensuring tool description is passed in

* fix(converse_transformation.py): pass description, if set

* fix(transformation.py): Fixes https://github.com/BerriAI/litellm/issues/8767#issuecomment-2689887663
2025-02-28 18:47:07 -08:00
Krish Dholakia
a65bfab697
Fix calling claude via invoke route + response_format support for claude on invoke route (#8908)
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route

move to using anthropic config as base

* fix(utils.py): expose anthropic config via providerconfigmanager

* fix(llm_http_handler.py): support json mode on async completion calls

* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke

* fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config

Prevents error when passing in 'response_format: {"type": "text"}

* test: fix test

* fix(utils.py): fix base invoke provider check

* fix(anthropic_claude3_transformation.py): don't pass 'stream' param

* fix: fix linting errors

* fix(converse_transformation.py): handle response_format type=text for converse
2025-02-28 17:56:26 -08:00
Ishaan Jaff
047d1b1208
(Bug Fix) - Accurate token counting for /anthropic/ API Routes on LiteLLM Proxy (#8880)
* fix _create_anthropic_response_logging_payload

* fix - pass through don't create standard logging payload

* fix logged key hash

* test_init_kwargs_for_pass_through_endpoint_basic

* test_unit_test_anthropic_pass_through

* fix anthropic pass through logging handler

* test_stream_token_counting_anthropic_with_include_usage

* convert_str_chunk_to_generic_chunk

* _build_complete_streaming_response

* test_anthropic_basic_completion_with_headers

* test_anthropic_streaming_with_headers

* improve test for pass through token counting
2025-02-27 15:43:03 -08:00
Krish Dholakia
ab7c4d1a0e
Litellm dev bedrock anthropic 3 7 v2 (#8843)
* feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_Content transformation

Closes https://github.com/BerriAI/litellm/issues/8777

* fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7

Resolves https://github.com/BerriAI/litellm/issues/8777

* feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock

* fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py

simpler logic

* feat(bedrock/): fix streaming to return blocks in consistent format

* fix: fix linting error

* test: fix test

* feat(factory.py): fix bedrock thinking block translation on tool calling

allows passing the thinking blocks back to bedrock for tool calling

* fix(types/utils.py): don't exclude provider_specific_fields on model dump

ensures consistent responses

* fix: fix linting errors

* fix(convert_dict_to_response.py): pass reasoning_content on root

* fix: test

* fix(streaming_handler.py): add helper util for setting model id

* fix(streaming_handler.py): fix setting model id on model response stream chunk

* fix(streaming_handler.py): fix linting error

* fix(streaming_handler.py): fix linting error

* fix(types/utils.py): add provider_specific_fields to model stream response

* fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response

* fix(streaming_handler.py): fix check

* fix: fix test

* fix(types/utils.py): ensure messages content is always openai compatible

* fix(types/utils.py): fix delta object to always be openai compatible

only introduce new params if variable exists

* test: fix bedrock nova tests

* test: skip flaky test

* test: skip flaky test in ci/cd
2025-02-26 16:05:33 -08:00
Krish Dholakia
017c482d7b
fix(o_series_transformation.py): fix optional param check for o-serie… (#8787)
* fix(o_series_transformation.py): fix optional param check for o-series models

o3-mini and o-1 do not support parallel tool calling

* fix(utils.py): support 'drop_params' for 'thinking' param across models

allows switching to older claude versions (or non-anthropic models) and param to be safely dropped

* fix: fix passing thinking param in optional params

allows dropping thinking_param where not applicable

* test: update old model

* fix(utils.py): fix linting errors

* fix(main.py): add param to acompletion
2025-02-26 12:26:55 -08:00
Krish Dholakia
142b195784
Add anthropic thinking + reasoning content support (#8778)
* feat(anthropic/chat/transformation.py): add anthropic thinking param support

* feat(anthropic/chat/transformation.py): support returning thinking content for anthropic on streaming responses

* feat(anthropic/chat/transformation.py): return list of thinking blocks (include block signature)

allows usage in tool call responses

* fix(types/utils.py): extract and map reasoning_content from anthropic as content str

* test: add testing to ensure thinking_blocks are returned at the root

* fix(anthropic/chat/handler.py): return thinking blocks on streaming - include signature

* feat(factory.py): handle anthropic thinking blocks translation if in assistant response

* test: handle openai internal instability

* test: handle openai audio instability

* ci: pin anthropic dep

* test: handle openai audio instability

* fix: fix linting error

* refactor(anthropic/chat/transformation.py): refactor function to remain <50 LOC

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error
2025-02-24 21:54:30 -08:00
Krish Dholakia
1dd3713f1a
Anthropic Citations API Support (#8382)
* test(test_anthropic_completion.py): add test ensuring anthropic structured output response is consistent

Resolves https://github.com/BerriAI/litellm/issues/8291

* feat(anthropic.py): support citations api with new user document message format

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(anthropic/chat/transformation.py): return citations as a provider-specific-field

Resolves https://github.com/BerriAI/litellm/issues/7970

* feat(anthropic/chat/handler.py): add streaming citations support

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(handler.py): fix code qa error

* fix(handler.py): only set provider specific fields if non-empty dict

* docs(anthropic.md): add citations api to anthropic docs
2025-02-07 22:27:01 -08:00
Krish Dholakia
dfbbf0bde8
fix: dictionary changed size during iteration error (#8327) (#8341)
Co-authored-by: Joey Feldberg <joeyfeldberg@users.noreply.github.com>
Co-authored-by: Joey Feldberg <12495578+joeyfeldberg@users.noreply.github.com>
2025-02-07 16:20:28 -08:00
Krish Dholakia
08b124aeb6
Litellm dev 01 25 2025 p2 (#8003)
* fix(base_utils.py): supported nested json schema passed in for anthropic calls

* refactor(base_utils.py): refactor ref parsing to prevent infinite loop

* test(test_openai_endpoints.py): refactor anthropic test to use bedrock

* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls

Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
2025-01-25 16:50:57 -08:00
Krish Dholakia
71c41f8f33
QA: ensure all bedrock regional models have same supported_ as base + Anthropic nested pydantic object support (#7844)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* build: ensure all regional bedrock models have same supported values as base bedrock model

prevents drift

* test(base_llm_unit_tests.py): add testing for nested pydantic objects

* fix(test_utils.py): add test_get_potential_model_names

* fix(anthropic/chat/transformation.py): support nested pydantic objects

Fixes https://github.com/BerriAI/litellm/issues/7755
2025-01-17 19:49:12 -08:00
Krish Dholakia
80d6bbec29
Litellm dev 01 14 2025 p2 (#7772)
* feat(pass_through_endpoints.py): fix anthropic end user cost tracking

* fix(anthropic/chat/transformation.py): use returned provider model for anthropic

handles anthropic `-latest` tag in request body throwing cost calculation errors

ensures we can be accurate in our model cost tracking

* feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing

* test: update test to use assumption that user_api_key_dict can get anthropic user id

* test: fix test

* fix: fix test

* fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block

can't guarantee user api key dict always has end user id - too many code paths

* fix(user_api_key_auth.py): this allows end user id from request body to always be read and set in auth object

* fix(auth_check.py): fix linting error

* test: fix auth check

* fix(auth_utils.py): fix get end user id to handle metadata = None
2025-01-15 21:34:50 -08:00
Krish Dholakia
8353caa485
build(pyproject.toml): bump uvicorn depedency requirement (#7773)
* build(pyproject.toml): bump uvicorn depedency requirement

Fixes https://github.com/BerriAI/litellm/issues/7768

* fix(anthropic/chat/transformation.py): fix is_vertex_request check to actually use optional param passed in

Fixes https://github.com/BerriAI/litellm/issues/6898#issuecomment-2590860695

* fix(o1_transformation.py): fix azure o1 'is_o1_model' check to just check for o1 in model string

https://github.com/BerriAI/litellm/issues/7743

* test: load vertex creds
2025-01-14 21:47:11 -08:00
Krish Dholakia
ec5a354eac
add azure o1 pricing (#7715)
* build(model_prices_and_context_window.json): add azure o1 pricing

Closes https://github.com/BerriAI/litellm/issues/7712

* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)

* Allows overriding keep_alive time in ollama (#7079)

* Allows overriding keep_alive time in ollama

* Also adds to ollama_chat

* Adds some info on the docs about this parameter

* fix: together ai warning (#7688)

Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>

* fix(proxy_server.py): handle config containing thread locked objects when using get_config_state

* fix(proxy_server.py): add exception to debug

* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
2025-01-12 18:15:35 -08:00
Krish Dholakia
0ffc5379ea
Litellm dev 01 07 2025 p2 (#7622)
* build(ui/): update ui

* fix: drop unsupported non-whitespace characters for real when calling… (#7484)

* fix: drop unsupported non-whitespace characters for real when calling anthropic with stop sequences

* test: add parameterized test for _map_stop_sequences method in AnthropicConfig

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
2025-01-08 16:56:39 -08:00
Krish Dholakia
0120176541
Litellm dev 12 30 2024 p2 (#7495)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic alr. implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* fix(types/utils.py): handle none logprobs

Fixes https://github.com/BerriAI/litellm/issues/328

* fix(exception_mapping_utils.py): fix error str unbound error

* refactor(azure_ai/): move to openai_like chat completion handler

allows for easy swapping of api base url's (e.g. ai.services.com)

Fixes https://github.com/BerriAI/litellm/issues/7275

* refactor(azure_ai/): move to base llm http handler

* fix(azure_ai/): handle differing api endpoints

* fix(azure_ai/): make sure all unit tests are passing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(azure_ai/transformation.py): handle extra body param

* fix(azure_ai/transformation.py): fix max retries param handling

* fix: fix test

* test(test_azure_o1.py): fix test

* fix(llm_http_handler.py): support handling azure ai unprocessable entity error

* fix(llm_http_handler.py): handle sync invalid param error for azure ai

* fix(azure_ai/): streaming support with base_llm_http_handler

* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai

* fix: fix linting errors

* fix(llm_http_handler.py): fix linting error

* fix(azure_ai/): handle cohere tool call invalid index param error
2025-01-01 18:57:29 -08:00
Krish Dholakia
c3edfc2c92
LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 35s
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generifies anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
5253f639cd
fix(health.md): add rerank model health check information (#7295)
* fix(health.md): add rerank model health check information

* build(model_prices_and_context_window.json): add gemini 2.0 for google ai studio - pricing + commercial rate limits

* build(model_prices_and_context_window.json): add gemini-2.0 supports audio output = true

* docs(team_model_add.md): clarify allowing teams to add models is an enterprise feature

* fix(o1_transformation.py): add support for 'n', 'response_format' and 'stop' params for o1 and 'stream_options' param for o1-mini

* build(model_prices_and_context_window.json): add 'supports_system_message' to supporting openai models

needed as o1-preview, and o1-mini models don't support 'system message

* fix(o1_transformation.py): translate system message based on if o1 model supports it

* fix(o1_transformation.py): return 'stream' param support if o1-mini/o1-preview

o1 currently doesn't support streaming, but the other model versions do

Fixes https://github.com/BerriAI/litellm/issues/7292

* fix(o1_transformation.py): return tool calling/response_format in supported params if model map says so

Fixes https://github.com/BerriAI/litellm/issues/7292

* fix: fix linting errors

* fix: update '_transform_messages'

* fix(o1_transformation.py): fix provider passed for supported param checks

* test(base_llm_unit_tests.py): skip test if api takes >5s to respond

* fix(utils.py): return false in 'supports_factory' if can't find value

* fix(o1_transformation.py): always return stream + stream_options as supported params + handle stream options being passed in for azure o1

* feat(openai.py): support stream faking natively in openai handler

Allows o1 calls to be faked for just the "o1" model, allows native streaming for o1-mini, o1-preview

 Fixes https://github.com/BerriAI/litellm/issues/7292

* fix(openai.py): use inference param instead of original optional param
2024-12-18 19:18:10 -08:00
Ishaan Jaff
7a5dd29fe0
(fix) unable to pass input_type parameter to Voyage AI embedding mode (#7276)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 46s
* VoyageEmbeddingConfig

* fix voyage logic to get params

* add voyage embedding transformation

* add get_provider_embedding_config

* use BaseEmbeddingConfig

* voyage clean up

* use llm http handler for embedding transformations

* test_voyage_ai_embedding_extra_params

* add voyage async

* test_voyage_ai_embedding_extra_params

* add async for llm http handler

* update BaseLLMEmbeddingTest

* test_voyage_ai_embedding_extra_params

* fix linting

* fix get_provider_embedding_config

* fix anthropic text test

* update location of base/chat/transformation

* fix import path

* fix IBMWatsonXAIConfig
2024-12-17 19:23:49 -08:00
Krish Dholakia
b82add11ba
LITELLM: Remove requests library usage (#7235)
* fix(generic_api_callback.py): remove requests lib usage

* fix(budget_manager.py): remove requests lib usgae

* fix(main.py): cleanup requests lib usage

* fix(utils.py): remove requests lib usage

* fix(argilla.py): fix argilla test

* fix(athina.py): replace 'requests' lib usage with litellm module

* fix(greenscale.py): replace 'requests' lib usage with httpx

* fix: remove unused 'requests' lib import + replace usage in some places

* fix(prompt_layer.py): remove 'requests' lib usage from prompt layer

* fix(ollama_chat.py): remove 'requests' lib usage

* fix(baseten.py): replace 'requests' lib usage

* fix(codestral/): replace 'requests' lib usage

* fix(predibase/): replace 'requests' lib usage

* refactor: cleanup unused 'requests' lib imports

* fix(oobabooga.py): cleanup 'requests' lib usage

* fix(invoke_handler.py): remove unused 'requests' lib usage

* refactor: cleanup unused 'requests' lib import

* fix: fix linting errors

* refactor(ollama/): move ollama to using base llm http handler

removes 'requests' lib dep for ollama integration

* fix(ollama_chat.py): fix linting errors

* fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama
2024-12-17 12:50:04 -08:00
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

'

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router'

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router impor t

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handler's

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix impor t

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00