Commit graph

73 commits

Ishaan Jaff
55115bf520 transform_responses_api_request 2025-03-20 12:28:55 -07:00
Ishaan Jaff
bc174adcd0 add should_fake_stream 2025-03-20 09:54:26 -07:00
Krrish Dholakia
fe24b9d90b feat(azure/gpt_transformation.py): add azure audio model support
Closes https://github.com/BerriAI/litellm/issues/6305
2025-03-19 22:57:49 -07:00
Ishaan Jaff
9203910ab6 fix import hashlib 2025-03-19 21:08:19 -07:00
Ishaan Jaff
65083ca8da get_openai_client_cache_key 2025-03-18 18:35:50 -07:00
Ishaan Jaff
3daef0d740 fix common utils 2025-03-18 17:59:46 -07:00
Ishaan Jaff
a45830dac3 use common caching logic for openai/azure clients 2025-03-18 17:57:03 -07:00
Ishaan Jaff
f73e9047dc use common logic for re-using openai clients 2025-03-18 17:56:32 -07:00
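The client re-use commits above hinge on one idea: derive a deterministic cache key from every setting that makes a client unique, and hand back the same OpenAI/Azure client whenever the key matches. A minimal sketch of that pattern — the helper names and key fields here are illustrative, not litellm's actual internals:

```python
import hashlib
from typing import Dict, Optional

from openai import AsyncOpenAI

# Hypothetical module-level cache; litellm's real cache lives in its common utils.
_client_cache: Dict[str, AsyncOpenAI] = {}

def get_openai_client_cache_key(
    api_key: Optional[str],
    api_base: Optional[str],
    timeout: Optional[float] = None,
    organization: Optional[str] = None,
) -> str:
    # Hash every setting that makes a client unique; changing any of them
    # must produce a different key (and therefore a fresh client).
    raw = f"{api_key}:{api_base}:{timeout}:{organization}"
    return hashlib.sha256(raw.encode()).hexdigest()

def get_cached_openai_client(api_key: str, api_base: str) -> AsyncOpenAI:
    key = get_openai_client_cache_key(api_key, api_base)
    if key not in _client_cache:
        _client_cache[key] = AsyncOpenAI(api_key=api_key, base_url=api_base)
    return _client_cache[key]
```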
Ishaan Jaff
842625a6f0 test_completion_azure_ad_token 2025-03-18 12:25:32 -07:00
Ishaan Jaff
a0c5fb81b8 fix logic for initializing openai clients 2025-03-18 10:23:30 -07:00
Ishaan Jaff
6e351136d7 handle _get_async_http_client for OpenAI 2025-03-18 08:56:08 -07:00
Krish Dholakia
cff1c1f7d8 Merge branch 'main' into litellm_dev_03_12_2025_p1 2025-03-12 22:14:02 -07:00
Krrish Dholakia
88e9edf7db refactor: update method signature 2025-03-12 15:23:38 -07:00
Krish Dholakia
2d957a0ed9 Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Ishaan Jaff
2460f3cbab test_validate_environment 2025-03-12 12:57:40 -07:00
Ishaan Jaff
39d391d8e7 Optional[Dict] 2025-03-12 12:29:13 -07:00
Ishaan Jaff
4ff6e41c15 ResponsesAPIStreamEvents 2025-03-11 23:42:35 -07:00
Ishaan Jaff
278b6fb5f6 add debug logging 2025-03-11 23:13:10 -07:00
Ishaan Jaff
8fa313ab07 add async streaming support 2025-03-11 20:00:42 -07:00
Ishaan Jaff
f32968409e working basic openai response api request 2025-03-11 17:37:19 -07:00
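The commits above assemble litellm's OpenAI Responses API support piece by piece (transform_request, transform_response_api_response, streaming, OpenAIResponsesAPIConfig). A minimal sketch of the SDK surface this work is building toward, assuming a `litellm.responses` entrypoint mirroring OpenAI's:

```python
import litellm

# Non-streaming Responses API call routed through the OpenAI transform.
response = litellm.responses(
    model="openai/gpt-4o",
    input="Write a one-line haiku about caching.",
)
print(response)

# Streaming variant; each yielded event corresponds to a typed
# ResponsesAPIStreamEvents member (see commit 4ff6e41c15 above).
for event in litellm.responses(
    model="openai/gpt-4o",
    input="Same haiku, but streamed.",
    stream=True,
):
    print(event)
```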
Krrish Dholakia
cbc2e84044 refactor(azure.py): refactor to have client init work across all endpoints 2025-03-11 17:27:24 -07:00
Ishaan Jaff
b3999b4c75 transform_response_api_response 2025-03-11 17:03:31 -07:00
Ishaan Jaff
dff8308e92 add transform_response_api_response 2025-03-11 16:53:18 -07:00
Ishaan Jaff
0f8de3d0a5 add transform_request for OpenAI responses API 2025-03-11 16:33:26 -07:00
Ishaan Jaff
980354b78b add validate_environment, get_complete_url 2025-03-11 16:12:36 -07:00
Ishaan Jaff
401a52e694 working transform 2025-03-11 15:24:42 -07:00
Ishaan Jaff
368f1de2e1 add OpenAIResponsesAPIConfig 2025-03-11 15:10:34 -07:00
Krrish Dholakia
5f87dc229a feat(openai.py): bubble all error information back to client 2025-03-10 15:27:43 -07:00
Krrish Dholakia
c1ec82fbd5 refactor: instrument body param to bubble up on exception 2025-03-10 15:21:04 -07:00
Krish Dholakia
f6535ae6ad Support format param for specifying image type (#9019)
* fix(transformation.py): support a 'format' parameter for images

allow user to specify mime type

* fix: pass mimetype via 'format' param

* feat(gemini/chat/transformation.py): support 'format' param for gemini

* fix(factory.py): support 'format' param on sync bedrock converse calls

* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls

* refactor(factory.py): move to supporting 'format' param in base helper

ensures consistency in param support

* feat(gpt_transformation.py): filter out 'format' param

don't send invalid param to openai

* fix(gpt_transformation.py): fix translation

* fix: fix translation error
2025-03-05 19:52:53 -08:00
Krish Dholakia
5e386c28b2 Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krrish Dholakia
4418e6dd14 build: merge branch 2025-03-02 08:31:57 -08:00
Ishaan Jaff
6231052b18 [Bug]: Deepseek error on proxy after upgrading to 1.61.13-stable (#8860)
* fix deepseek error

* test_deepseek_provider_async_completion

* fix get_complete_url
2025-02-26 21:11:06 -08:00
Krish Dholakia
017c482d7b fix(o_series_transformation.py): fix optional param check for o-serie… (#8787)
* fix(o_series_transformation.py): fix optional param check for o-series models

o3-mini and o1 do not support parallel tool calling

* fix(utils.py): support 'drop_params' for 'thinking' param across models

allows switching to older claude versions (or non-anthropic models) and param to be safely dropped

* fix: fix passing thinking param in optional params

allows dropping thinking_param where not applicable

* test: update old model

* fix(utils.py): fix linting errors

* fix(main.py): add param to acompletion
2025-02-26 12:26:55 -08:00
Ishaan Jaff
f9cee4c46b (Bug Fix) Using LiteLLM Python SDK with model=litellm_proxy/ for embedding, image_generation, transcription, speech, rerank (#8815)
* test_litellm_gateway_from_sdk

* fix embedding check for openai

* test litellm proxy provider

* fix image generation openai compatible models

* fix litellm.transcription

* test_litellm_gateway_from_sdk_rerank

* docs litellm python sdk

* docs litellm python sdk with proxy

* test_litellm_gateway_from_sdk_rerank

* ci/cd run again

* test_litellm_gateway_from_sdk_image_generation

* test_litellm_gateway_from_sdk_embedding

* test_litellm_gateway_from_sdk_embedding
2025-02-25 16:22:37 -08:00
Krish Dholakia
0c0f6c23e2 feat(openai/o_series_transformation.py): support native streaming for all openai o-series models (#8552)
o1 now supports streaming
2025-02-14 20:04:19 -08:00
Krish Dholakia
e26d7df91b Litellm dev 02 10 2025 p2 (#8443)
* Fixed issue #8246 (#8250)

* Fixed issue #8246

* Added unit tests for discard() and for remove_callback_from_list_by_object()

* fix(openai.py): support dynamic passing of organization param to openai

handles scenario where client-side org id is passed to openai

---------

Co-authored-by: Erez Hadad <erezh@il.ibm.com>
2025-02-10 17:53:46 -08:00
Krish Dholakia
c83498f0cd Litellm dev 02 07 2025 p3 (#8387)
* add back streaming for base o3 (#8361)

* test(base_llm_unit_tests.py): add base test for o-series models - ensure streaming always works

* fix(base_llm_unit_tests.py): fix test for o series models

* refactor: move test

---------

Co-authored-by: Matteo Boschini <12133566+mbosc@users.noreply.github.com>
2025-02-07 22:45:45 -08:00
Ishaan Jaff
b242c66a3b (Feat) - Add /bedrock/invoke support for all Anthropic models (#8383)
* use anthropic transformation for bedrock/invoke

* use anthropic transforms for bedrock invoke claude

* TestBedrockInvokeClaudeJson

* add AmazonAnthropicClaudeStreamDecoder

* pass bedrock_invoke_provider to make_call

* fix _get_base_bedrock_model

* fix get_bedrock_route

* fix bedrock routing

* fixes for bedrock invoke

* test_all_model_configs

* fix AWSEventStreamDecoder linting

* fix code qa

* test_bedrock_get_base_model

* test_get_model_info_bedrock_models

* test_bedrock_base_model_helper

* test_bedrock_route_detection
2025-02-07 22:41:11 -08:00
Krish Dholakia
3c813b3a87 Fix deepseek calling - refactor to use base_llm_http_handler (#8266)
* refactor(deepseek/): move deepseek to base llm http handler

Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457

* fix(gpt_transformation.py): support stream parsing for gpt-like calls

* test(test_deepseek_completion.py): add async streaming test

* fix(gpt_transformation.py): fix import

* fix(gpt_transformation.py): return full api base and content type
2025-02-04 22:30:00 -08:00
Krish Dholakia
c8494abdea test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Krish Dholakia
1105e35538 Complete o3 model support (#8183)
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param

Closes https://github.com/BerriAI/litellm/issues/8182

* fix(main.py): ensure `reasoning_effort` is a mapped openai param

* refactor(azure/): rename o1_[x] files to o_series_[x]

* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort

* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests

* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers

Makes it easy to switch from openai to anthropic

* fix: fix linting errors

* fix(base_llm_unit_tests.py): fix test

* fix(main.py): add missing param
2025-02-02 22:36:37 -08:00
Krish Dholakia
23f458d2da Improved O3 + Azure O3 support (#8181)
* fix: support azure o3 model family for fake streaming workaround (#8162)

* fix: support azure o3 model family for fake streaming workaround

* refactor: rename helper to is_o_series_model for clarity

* update function calling parameters for o3 models (#8178)

* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3

ensures max_tokens is correctly translated for o3

* feat(openai/): refactor o1 files to be 'o_series' files

expands naming to cover o3

* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets

* test(test_azure_o_series.py): assert stream faked for azure o3 mini

Resolves https://github.com/BerriAI/litellm/pull/8162

* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing

* docs(azure.md): update doc with `o_series/` model name

---------

Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
Ishaan Jaff
2cf0daa31c (Fixes) OpenAI Streaming Token Counting + Fixes usage tracking when litellm.turn_off_message_logging=True (#8156)
* working streaming usage tracking

* fix test_async_chat_openai_stream_options

* fix await asyncio.sleep(1)

* test_async_chat_azure

* fix s3 logging

* fix get_stream_options

* fix get_stream_options

* fix streaming handler

* test_stream_token_counting_with_redaction

* fix codeql concern
2025-01-31 15:06:37 -08:00
Krish Dholakia
03eef5a2a0 Fix custom pricing - separate provider info from model info (#7990)
* fix(utils.py): initial commit fixing custom cost tracking

refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly

* fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None

some providers support features like vision across all models

* fix(utils.py): refactor to use _supports_factory

* test: update testing

* fix: fix linting errors

* test: fix testing
2025-01-25 21:49:28 -08:00
Ishaan Jaff
b6f2e659b9 (Feat) Add x-litellm-overhead-duration-ms and x-litellm-response-duration-ms in response from LiteLLM (#7899)
* add track_llm_api_timing

* add track_llm_api_timing

* test_litellm_overhead

* use ResponseMetadata class for setting hidden params and response overhead

* instrument http handler

* fix track_llm_api_timing

* track_llm_api_timing

* emit response overhead on hidden params

* fix resp metadata

* fix make_sync_openai_embedding_request

* test_aaaaatext_completion_endpoint fixes

* _get_value_from_hidden_params

* set_hidden_params

* test_litellm_overhead

* test_litellm_overhead

* test_litellm_overhead

* fix import

* test_litellm_overhead_stream

* add LiteLLMLoggingObject

* use diff folder for testing

* use diff folder for overhead testing

* test litellm overhead

* use typing

* clear typing

* test_litellm_overhead

* fix async_streaming

* update_response_metadata

* move test file

* apply metadata to the response object 2025-01-21 20:27:55 -08:00
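A minimal sketch of reading the new duration headers from a proxy response; the header names come from the PR title, the endpoint and key are illustrative:

```python
import requests

resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},  # proxy virtual key
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "ping"}],
    },
)
# Time spent inside LiteLLM vs. total response time.
print(resp.headers.get("x-litellm-overhead-duration-ms"))
print(resp.headers.get("x-litellm-response-duration-ms"))
```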
Krish Dholakia
4b23420a20 Litellm dev 01 20 2025 p1 (#7884)
* fix(initial-test-to-return-api-timeout-value-in-openai-timeout-exception): Makes it easier for user to debug why request timed out

* feat(openai.py): return timeout value + time taken on openai timeout errors

helps debug timeout errors

* fix(utils.py): fix num retries extraction logic when num_retries = 0

* fix(config_settings.md): litellm_logging.py

support printing payload to console if 'LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD' is true

Enables easier debugging

* test(test_auth_checks.py): remove common checks UserAPIKeyAuth enforcement check

* fix(litellm_logging.py): fix linting error
2025-01-20 21:45:48 -08:00
Krish Dholakia
1bea338597 LiteLLM Minor Fixes & Improvements (2024/16/01) (#7826)
* fix(lm_studio/chat/transformation.py): Fix https://github.com/BerriAI/litellm/issues/7811

* fix(router.py): fix mock timeout check

* fix: drop model name from fallback args since it causes a conflict with the model=model that is provided later on. (#7806)

This error happens if you provide multiple fallback models to the completion function with model name defined in each one.

* fix(router.py): remove mock_timeout before sending to request

prevents reuse in fallbacks

* test: update test

* test: revert test change - wrong pr

---------

Co-authored-by: Dudu Lasry <david1542@users.noreply.github.com>
2025-01-17 20:59:21 -08:00
Krish Dholakia
ad2f66b3e3 [BETA] Add OpenAI /images/variations + Topaz API support (#7700)
* feat(main.py): initial commit for `/image/variations` endpoint support

* refactor(base_llm/): introduce new base llm base config for image variation endpoints

* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler

* fix: test

* feat(openai/): working openai `/image/variation` endpoint calls via sdk

* feat(topaz/): topaz sync image variation call support

Addresses https://github.com/BerriAI/litellm/issues/7593

* fix(topaz/transformation.py): fix linting errors

* fix(openai/image_variations/handler.py): fix passing json data

* fix(main.py): image_variation/

support async image variation route - `aimage_variation`

* fix(test_get_model_info.py): fix test

* fix: cleanup unused imports

* feat(openai/): add async `/image/variations` endpoint support

* feat(topaz/): support async `/image/variations` calls

* fix: test

* fix(utils.py): fix get_model_info_helper for no model info w/ provider config

handles situation where model info is not known but provider config exists

* test(test_router_fallbacks.py): mark flaky test

* fix: fix unused imports

* test: bump otel load test perf threshold - accounts for current load tests hitting same server
2025-01-11 23:27:46 -08:00
Krish Dholakia
d43d83f9ef feat(router.py): support request prioritization for text completion c… (#7540)
* feat(router.py): support request prioritization for text completion calls

* fix(internal_user_endpoints.py): fix sql query to return all keys, including null team id keys on `/user/info`

Fixes https://github.com/BerriAI/litellm/issues/7485

* fix: fix linting errors

* fix: fix linting error

* test(test_router_helper_utils.py): add direct test for '_schedule_factory'

Fixes code qa test
2025-01-03 19:35:44 -08:00