Christopher
4fef15901b
Merge d78ee3182d into b82af5b826
2025-04-24 00:56:18 -07:00
Ishaan Jaff
2e58e47b43
[Bug Fix] Add Cost Tracking for gpt-image-1 when quality is unspecified ( #10247 )
...
* TestOpenAIGPTImage1
* fixes for cost calc
* fix ImageGenerationRequestQuality.MEDIUM
2025-04-23 15:16:40 -07:00
Ishaan Jaff
104e4cb1bc
[Feat] Add infinity embedding support (contributor pr) ( #10196 )
...
* Feature - infinity support for #8764 (#10009 )
* Added support for infinity embeddings
* Added test cases
* Fixed tests and api base
* Updated docs and tests
* Removed unused import
* Updated signature
* Added support for infinity embeddings
* Added test cases
* Fixed tests and api base
* Updated docs and tests
* Removed unused import
* Updated signature
* Updated validate params
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix InfinityEmbeddingConfig
---------
Co-authored-by: Prathamesh Saraf <pratamesh1867@gmail.com>
2025-04-21 20:01:29 -07:00
Krish Dholakia
2508ca71cb
Handle fireworks ai tool calling response ( #10130 )
...
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly
Fixes https://github.com/BerriAI/litellm/issues/7209
* fix(utils.py): handle none type in message
* fix: fix model name in test
* fix(utils.py): fix validate check for openai messages
* fix: fix model returned
* fix(main.py): fix text completion routing
* test: update testing
* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Krish Dholakia
fdfa1108a6
Add property ordering for vertex ai schema ( #9828 ) + Fix combining multiple tool calls ( #10040 )
...
* fix #9783 : Retain schema field ordering for google gemini and vertex (#9828 )
* test: update test
* refactor(groq.py): initial commit migrating groq to base_llm_http_handler
* fix(streaming_chunk_builder_utils.py): fix how tool content is combined
Fixes https://github.com/BerriAI/litellm/issues/10034
* fix(vertex_ai/common_utils.py): prevent infinite loop in helper function
* fix(groq/chat/transformation.py): handle groq streaming errors correctly
* fix(groq/chat/transformation.py): handle max_retries
---------
Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-15 22:29:25 -07:00
Ishaan Jaff
6cfa50d278
[Feat] Add support for `cache_control_injection_points` for Anthropic API, Bedrock API ( #9996 )
...
* test_anthropic_cache_control_hook_system_message
* test_anthropic_cache_control_hook.py
* should_run_prompt_management_hooks
* fix should_run_prompt_management_hooks
* test_anthropic_cache_control_hook_specific_index
* fix test
* fix linting errors
* ChatCompletionCachedContent
2025-04-14 20:50:13 -07:00
Krish Dholakia
6ba3c4a4f8
VertexAI non-jsonl file storage support ( #9781 )
...
* test: add initial e2e test
* fix(vertex_ai/files): initial commit adding sync file create support
* refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint
* fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint
* fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload
* test: working e2e jsonl call
* test: unit testing for jsonl file creation
* fix(vertex_ai/transformation.py): reset file pointer after read
allow multiple reads on same file object
* fix: fix linting errors
* fix: fix ruff linting errors
* fix: fix import
* fix: fix linting error
* fix: fix linting error
* fix(vertex_ai/files/transformation.py): fix linting error
* test: update test
* test: update tests
* fix: fix linting errors
* fix: fix test
* fix: fix linting error
2025-04-09 14:01:48 -07:00
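The "reset file pointer after read" fix in the commit above ( #9781 ) can be illustrated with a minimal sketch. This is not the code from `vertex_ai/files/transformation.py`; the helper name `read_all_and_rewind` is hypothetical and only shows the general technique of rewinding a file object so it can be read more than once.

```python
import io


def read_all_and_rewind(file_obj) -> bytes:
    """Read the full contents, then move the pointer back to the start.

    Hypothetical helper for illustration only; assumes a seekable file object.
    """
    data = file_obj.read()
    # Without this seek, a second read() would return b"" because the
    # pointer is left at the end of the stream.
    file_obj.seek(0)
    return data


buf = io.BytesIO(b'{"a": 1}\n')
first = read_all_and_rewind(buf)
second = read_all_and_rewind(buf)
```

Because the helper rewinds after each read, `first` and `second` hold the same bytes.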
Krish Dholakia
34bdf36eab
Add inference providers support for Hugging Face ( #8258 ) ( #9738 ) ( #9773 )
...
* Add inference providers support for Hugging Face (#8258 )
* add first version of inference providers for huggingface
* temporarily skipping tests
* Add documentation
* Fix titles
* remove max_retries from params and clean up
* add suggestions
* use llm http handler
* update doc
* add suggestions
* run formatters
* add tests
* revert
* revert
* rename file
* set maxsize for lru cache
* fix embeddings
* fix inference url
* fix tests following breaking change in main
* use ChatCompletionRequest
* fix tests and lint
* [Hugging Face] Remove outdated chat completion tests and fix embedding tests (#9749 )
* remove or fix tests
* fix link in doc
* fix(config_settings.md): document hf api key
---------
Co-authored-by: célina <hanouticelina@gmail.com>
2025-04-05 10:50:15 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support ( #9744 )
...
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks
Allows user to call claude-3-7-sonnet with thinking via databricks
* refactor: refactor choices transformation + add unit testing
* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming
* feat(databricks/chat/transformation.py): support response_format for claude models
* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}
* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic
* fix: fix ruff errors
* fix: fix linting error
* test: update test
* fix(databricks/chat/transformation.py): handle json mode output parsing
* fix(databricks/chat/transformation.py): handle json mode on streaming
* test: update test
* test: update dbrx testing
* test: update testing
* fix(base_model_iterator.py): handle non-json chunk
* test: update tests
* fix: fix ruff check
* fix: fix databricks config import
* fix: handle _tool = none
* test: skip invalid test
2025-04-04 22:13:32 -07:00
Chris Agostino
3638109576
adding in stable diffusion usage for litellm
2025-04-04 20:16:43 -04:00
Adrian Lyjak
d640bc0a00
fix #8425 , passthrough kwargs during acompletion, and unwrap extra_body for openrouter ( #9747 )
2025-04-03 22:19:40 -07:00
Krish Dholakia
6dda1ba6dd
LiteLLM Minor Fixes & Improvements (04/02/2025) ( #9725 )
...
* Add date picker to usage tab + Add reasoning_content token tracking across all providers on streaming (#9722 )
* feat(new_usage.tsx): add date picker for new usage tab
allow user to look back on their usage data
* feat(anthropic/chat/transformation.py): report reasoning tokens in completion token details
allows usage tracking on how many reasoning tokens are actually being used
* feat(streaming_chunk_builder.py): return reasoning_tokens in anthropic/openai streaming response
allows tracking reasoning_token usage across providers
* Fix update team metadata + fix bulk adding models on Ui (#9721 )
* fix(handle_add_model_submit.tsx): fix bulk adding models
* fix(team_info.tsx): fix team metadata update
Fixes https://github.com/BerriAI/litellm/issues/9689
* (v0) Unified file id - allow calling multiple providers with same file id (#9718 )
* feat(files_endpoints.py): initial commit adding 'target_model_names' support
allow developer to specify all the models they want to call with the file
* feat(files_endpoints.py): return unified files endpoint
* test(test_files_endpoints.py): add validation test - if invalid purpose submitted
* feat: more updates
* feat: initial working commit of unified file id translation
* fix: additional fixes
* fix(router.py): remove model replace logic in jsonl on acreate_file
enables file upload to work for chat completion requests as well
* fix(files_endpoints.py): remove whitespace around model name
* fix(azure/handler.py): return acreate_file with correct response type
* fix: fix linting errors
* test: fix mock test to run on github actions
* fix: fix ruff errors
* fix: fix file too large error
* fix(utils.py): remove redundant var
* test: modify test to work on github actions
* test: update tests
* test: more debug logs to understand ci/cd issue
* test: fix test for respx
* test: skip mock respx test
fails on ci/cd - not clear why
* fix: fix ruff check
* fix: fix test
* fix(model_connection_test.tsx): fix linting error
* test: update unit tests
2025-04-03 11:48:52 -07:00
Krish Dholakia
8ee32291e0
Squashed commit of the following: ( #9709 )
...
commit b12a9892b7
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Wed Apr 2 08:09:56 2025 -0700
fix(utils.py): don't modify openai_token_counter
commit 294de31803
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 21:22:40 2025 -0700
fix: fix linting error
commit cb6e9fbe40
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 19:52:45 2025 -0700
refactor: complete migration
commit bfc159172d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 19:09:59 2025 -0700
refactor: refactor more constants
commit 43ffb6a558
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:45:24 2025 -0700
fix: test
commit 04dbe4310c
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:28:58 2025 -0700
refactor: refactor: move more constants into constants.py
commit 3c26284aff
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:14:46 2025 -0700
refactor: migrate hardcoded constants out of __init__.py
commit c11e0de69d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:11:21 2025 -0700
build: migrate all constants into constants.py
commit 7882bdc787
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:07:37 2025 -0700
build: initial test banning hardcoded numbers in repo
2025-04-02 21:24:54 -07:00
Ishaan Jaff
9acda77b75
add allowed_openai_params
2025-04-01 19:54:35 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking ( #9631 )
...
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Krish Dholakia
5ac61a7572
Add bedrock latency optimized inference support ( #9623 )
...
* fix(converse_transformation.py): add performanceConfig param support on bedrock
Closes https://github.com/BerriAI/litellm/issues/7606
* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks
* test(test_main.py): add e2e mock test for bedrock performance config
* build(model_prices_and_context_window.json): add versioned multimodal embedding
* refactor(multimodal_embeddings/): migrate to config pattern
* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls
Enables cost calculation for multimodal embeddings
* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls
ensures accurate cost tracking for vertexai multimodal embedding calls
* fix(embedding_handler.py): remove unused imports
* fix: fix linting errors
* fix: handle response api usage calculation
* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests
* test: mark flaky test
* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input
* docs(vertex.md): document sending text + image to vertex multimodal embeddings
* test: remove incorrect file
* fix(multimodal_embeddings/transformation.py): fix linting error
* style: remove unused import
2025-03-29 00:23:09 -07:00
Krish Dholakia
c0845fec1f
Add OpenAI gpt-4o-transcribe support ( #9517 )
...
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
2025-03-26 23:10:25 -07:00
Krish Dholakia
6fd18651d1
Support `litellm.api_base` for vertex_ai + gemini/ across completion, embedding, image_generation ( #9516 )
...
* test(tests): add unit testing for litellm_proxy integration
* fix(cost_calculator.py): fix tracking cost in sdk when calling proxy
* fix(main.py): respect litellm.api_base on `vertex_ai/` and `gemini/` routes
* fix(main.py): consistently support custom api base across gemini + vertexai on embedding + completion
* feat(vertex_ai/): test
* fix: fix linting error
* test: set api base as None before starting loadtest
2025-03-25 23:46:20 -07:00
Krish Dholakia
92883560f0
fix vertex ai multimodal embedding translation ( #9471 )
...
* remove data:image/jpeg;base64, prefix from base64 image input
vertex_ai's multimodal embeddings endpoint expects a raw base64 string without `data:image/jpeg;base64,` prefix.
* Add Vertex Multimodal Embedding Test
* fix(test_vertex.py): add e2e tests on multimodal embeddings
* test: unit testing
* test: remove sklearn dep
* test: update test with fixed route
* test: fix test
---------
Co-authored-by: Jonarod <jonrodd@gmail.com>
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
2025-03-24 23:23:28 -07:00
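The prefix removal described in the commit above ( #9471 ) can be sketched as follows. This is an illustrative sketch, not the actual litellm implementation; the function name `strip_data_uri_prefix` is hypothetical, and it assumes the input is either a raw base64 string or a data URI of the form `data:<mime>;base64,<payload>`.

```python
import base64


def strip_data_uri_prefix(image: str) -> str:
    """Return the raw base64 payload, dropping a leading data URI header if present.

    Hypothetical helper for illustration only.
    """
    if image.startswith("data:") and ";base64," in image:
        # Everything after the first ";base64," marker is the payload.
        return image.split(";base64,", 1)[1]
    return image


raw = base64.b64encode(b"fake image bytes").decode("ascii")
with_prefix = "data:image/jpeg;base64," + raw
```

Here `strip_data_uri_prefix(with_prefix)` yields `raw`, and a string that is already raw base64 passes through unchanged.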
Krrish Dholakia
48e6a7036b
test: mock sagemaker tests
2025-03-21 16:21:18 -07:00
Krrish Dholakia
76c3957140
fix(main.py): fix OR import
2025-03-20 13:52:28 -07:00
Krish Dholakia
cb4155fb16
Merge pull request #9369 from graysonchen/feature/add_openrouter_api_base
...
feat: Add support for custom OPENROUTER_API_BASE via get_secret in co…
2025-03-20 13:52:03 -07:00
Grayson Chen
f3a0261bb4
feat: Add support for custom OPENROUTER_API_BASE via get_secret in completion function
2025-03-19 21:09:03 +08:00
Ishaan Jaff
7384d45ef0
fix type errors on transcription azure
2025-03-18 14:22:30 -07:00
Ishaan Jaff
b316911120
fix typing errors
2025-03-18 12:31:44 -07:00
Ishaan Jaff
6787d0dabe
test_model_connection
2025-03-14 18:33:49 -07:00
Ishaan Jaff
5a6da56058
fix endpoint_data
2025-03-14 17:21:01 -07:00
Sunny Wan
f9a5109203
Merge branch 'BerriAI:main' into main
2025-03-13 19:37:22 -04:00
Krish Dholakia
cff1c1f7d8
Merge branch 'main' into litellm_dev_03_12_2025_p1
2025-03-12 22:14:02 -07:00
Krrish Dholakia
738c0b873d
fix(azure_ai/transformation.py): support passing api version to azure ai services endpoint
...
Fixes https://github.com/BerriAI/litellm/issues/7275
2025-03-12 15:16:42 -07:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3
2025-03-12 14:56:01 -07:00
Krrish Dholakia
e6f21d3654
fix: fix linting error
2025-03-11 18:17:00 -07:00
Krrish Dholakia
9af73f339a
test: fix tests
2025-03-11 17:42:36 -07:00
Krrish Dholakia
af71e14d79
refactor(azure/audio_transcriptions.py): support client init with common logic
2025-03-11 14:24:12 -07:00
Krrish Dholakia
2c2404dac9
refactor(azure.py): working client init logic in azure image generation
2025-03-11 14:22:25 -07:00
Krrish Dholakia
152bc67d22
refactor(azure.py): working azure client init on audio speech endpoint
2025-03-11 14:19:45 -07:00
Sunny Wan
a775c9ca13
removed handler and refactored to deepseek/chat format
2025-03-11 02:00:52 -04:00
Ishaan Jaff
7319fef29d
fix linting error
2025-03-10 13:57:50 -07:00
Ishaan Jaff
666690c31c
fix atext_completion
2025-03-10 10:18:03 -07:00
Krish Dholakia
f899b828cf
Support openrouter `reasoning_content` on streaming ( #9094 )
...
* feat(convert_dict_to_response.py): support openrouter format of reasoning content
* fix(transformation.py): fix openrouter streaming with reasoning content
Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962
* fix: fix type error
2025-03-09 20:03:59 -07:00
Krish Dholakia
e00d4fb18c
Litellm dev 03 08 2025 p3 ( #9089 )
...
* feat(ollama_chat.py): pass down http client to ollama_chat
enables easier testing
* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint
Fixes https://github.com/BerriAI/litellm/issues/6683
* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
Ishaan Jaff
b02af305de
[Feat] - Display `thinking` tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) ( #9029 )
...
* if merge_reasoning_content_in_choices
* _optional_combine_thinking_block_in_choices
* stash changes
* working merge_reasoning_content_in_choices with bedrock
* fix litellm_params accessor
* fix streaming handler
* merge_reasoning_content_in_choices
* _optional_combine_thinking_block_in_choices
* test_bedrock_stream_thinking_content_openwebui
* merge_reasoning_content_in_choices
* fix for _optional_combine_thinking_block_in_choices
* linting error fix
2025-03-06 18:32:58 -08:00
Krish Dholakia
c69ec66dc5
fix(base_aws_llm.py): remove region name before sending in args ( #8998 )
...
* fix(base_aws_llm.py): remove region name before sending in args
* fix(base_aws_llm.py): fix optional param pop position
* fix: fix linting error
2025-03-04 23:05:28 -08:00
Sunny Wan
fd090c8043
[FEAT] Added snowflake completion provider
2025-03-03 01:20:00 -05:00
Krrish Dholakia
5b804e5d9b
fix(main.py): pass 'thinking' param on async completion call
2025-02-26 23:16:39 -08:00
Krish Dholakia
017c482d7b
fix(o_series_transformation.py): fix optional param check for o-serie… ( #8787 )
...
* fix(o_series_transformation.py): fix optional param check for o-series models
o3-mini and o-1 do not support parallel tool calling
* fix(utils.py): support 'drop_params' for 'thinking' param across models
allows switching to older claude versions (or non-anthropic models) and param to be safely dropped
* fix: fix passing thinking param in optional params
allows dropping thinking_param where not applicable
* test: update old model
* fix(utils.py): fix linting errors
* fix(main.py): add param to acompletion
2025-02-26 12:26:55 -08:00
Krrish Dholakia
fcf4ea3608
build: merge squashed commit
...
Squashed commit of the following:
commit 6678e15381
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date: Wed Feb 26 09:29:15 2025 -0800
test_prompt_caching
commit bd86e0ac47
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date: Wed Feb 26 08:57:16 2025 -0800
test_prompt_caching
commit 2fc21ad51e
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date: Wed Feb 26 08:13:45 2025 -0800
test_aprompt_caching
commit d94cff55ff
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date: Wed Feb 26 08:13:12 2025 -0800
test_prompt_caching
commit 49c5e7811e
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date: Wed Feb 26 07:43:53 2025 -0800
ui new build
commit cb8d5e5917
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date: Wed Feb 26 07:38:56 2025 -0800
(UI) - Create Key flow for existing users (#8844 )
* working create user button
* working create user for a key flow
* allow searching users
* working create user + key
* use clear sections on create key
* better search for users
* fix create key
* ui fix create key button - make it neater / cleaner
* ui fix all keys table
commit 335ba30467
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Wed Feb 26 08:53:17 2025 -0800
fix: fix file name
commit b8c5b31a4e
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Tue Feb 25 22:54:46 2025 -0800
fix: fix utils
commit ac6e503461
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Feb 24 10:43:31 2025 -0800
fix(main.py): fix openai message for assistant msg if role is missing - openai allows this
Fixes https://github.com/BerriAI/litellm/issues/8661
commit de3989dbc5
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Feb 24 21:19:25 2025 -0800
fix(get_litellm_params.py): handle no-log being passed in via kwargs
Fixes https://github.com/BerriAI/litellm/issues/8380
2025-02-26 09:39:27 -08:00
Ishaan Jaff
f9cee4c46b
(Bug Fix) Using LiteLLM Python SDK with model=`litellm_proxy/` for embedding, image_generation, transcription, speech, rerank ( #8815 )
...
* test_litellm_gateway_from_sdk
* fix embedding check for openai
* test litellm proxy provider
* fix image generation openai compatible models
* fix litellm.transcription
* test_litellm_gateway_from_sdk_rerank
* docs litellm python sdk
* docs litellm python sdk with proxy
* test_litellm_gateway_from_sdk_rerank
* ci/cd run again
* test_litellm_gateway_from_sdk_image_generation
* test_litellm_gateway_from_sdk_embedding
* test_litellm_gateway_from_sdk_embedding
2025-02-25 16:22:37 -08:00
Ishaan Jaff
300d7825f5
(Observability) - Add more detailed dd tracing on Proxy Auth, Bedrock Auth ( #8693 )
...
* add dd tracer
* fix dd tracing
* add @tracer.wrap() on def user_api_key_auth
* add async_function_with_retries
* remove dead code
* add tracer.wrap on base aws llm
* add tracer.wrap on base aws llm
* fix print verbose
* fix dd tracing
* trace base aws llm
* fix test base aws llm
* fix converse transform
* test base aws llm
* BASE_AWS_LLM_PATH
* BASE_AWS_LLM_PATH
* test dd tracing
2025-02-20 18:00:41 -08:00
Krish Dholakia
2b71973b17
Litellm dev 02 18 2025 p1 ( #8630 )
...
* fix(filter.tsx): align filter icon to button correctly
* style: style improvements for filter icon
* style(filter.tsx): cleanup filter box
* style(filter.tsx): style improvement for team id box on filter
* Fix timeout bug for SageMaker Messages API completion (#8635 )
* fix(model_cost_map): fix json parse error on model cost map + add unit test (#8629 )
Fixes https://github.com/BerriAI/litellm/pull/8619#issuecomment-2666693045
* Fix timeout bug for SageMaker Messages API completion
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
---------
Co-authored-by: Bobby Lindsey <bobbywlindsey@users.noreply.github.com>
2025-02-18 15:20:17 -08:00