* fix(azure.py): ensure max_retries=0 is respected
Fixes https://github.com/BerriAI/litellm/issues/6129
* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0
* test(test_azure_openai.py): add unit testing for azure_text/ route
* fix(azure.py): fix passing max retries on streaming
* fix(azure.py): fix azure max retries on async completion + streaming
* fix(completion/handler.py): fix azure text async completion + streaming
* test(test_azure_openai.py): ensure azure openai max retries always respected
* test(test_azure_o_series.py): add testing to ensure max retries always respected
* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)
* Update model_prices_and_context_window.json
added gemini providers for 2.0-flash and 2.0-flash light
* Update model_prices_and_context_window.json
fixed URL
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Convert tool use arguments to string before counting tokens (#6989)
In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine.
* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing
* build(model_prices_and_context_window.json): add gemini commercial rate limits
* fix(utils.py): fix linting error
* refactor(utils.py): refactor to maintain function size
---------
Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
* fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in
Fixes https://github.com/BerriAI/litellm/issues/8241
* fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls
* refactor(gpt_transformation.py): refactor out json schema converstion to base config
keeps logic consistent across providers
* fix(o_series_transformation.py): support o3 mini native streaming
Fixes https://github.com/BerriAI/litellm/issues/8274
* fix(gpt_transformation.py): remove unused variables
* test: update test
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param
Closes https://github.com/BerriAI/litellm/issues/8182
* fix(main.py): ensure `reasoning_effort` is a mapped openai param
* refactor(azure/): rename o1_[x] files to o_series_[x]
* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort
* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests
* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers
Makes it easy to switch from openai to anthropic
* fix: fix linting errors
* fix(base_llm_unit_tests.py): fix test
* fix(main.py): add missing param
* fix: support azure o3 model family for fake streaming workaround (#8162)
* fix: support azure o3 model family for fake streaming workaround
* refactor: rename helper to is_o_series_model for clarity
* update function calling parameters for o3 models (#8178)
* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3
ensures max_tokens is correctly translated for o3
* feat(openai/): refactor o1 files to be 'o_series' files
expands naming to cover o3
* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets
* test(test_azure_o_series.py): assert stream faked for azure o3 mini
Resolves https://github.com/BerriAI/litellm/pull/8162
* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing
* docs(azure.md): update doc with `o_series/` model name
---------
Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
Renamed from tests/llm_translation/test_azure_o1.py (Browse further)