Commit graph

37 commits

Author SHA1 Message Date
Krish Dholakia
8ee32291e0
Squashed commit of the following: (#9709)
commit b12a9892b7
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Apr 2 08:09:56 2025 -0700

    fix(utils.py): don't modify openai_token_counter

commit 294de31803
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 21:22:40 2025 -0700

    fix: fix linting error

commit cb6e9fbe40
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 19:52:45 2025 -0700

    refactor: complete migration

commit bfc159172d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 19:09:59 2025 -0700

    refactor: refactor more constants

commit 43ffb6a558
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:45:24 2025 -0700

    fix: test

commit 04dbe4310c
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:28:58 2025 -0700

    refactor: move more constants into constants.py

commit 3c26284aff
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:14:46 2025 -0700

    refactor: migrate hardcoded constants out of __init__.py

commit c11e0de69d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:11:21 2025 -0700

    build: migrate all constants into constants.py

commit 7882bdc787
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Mar 24 18:07:37 2025 -0700

    build: initial test banning hardcoded numbers in repo
2025-04-02 21:24:54 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Ishaan Jaff
b60178f534 fix azure chat logic 2025-03-18 12:42:24 -07:00
Ishaan Jaff
80a5cfa01d test_azure_embedding_max_retries_0 2025-03-18 12:35:34 -07:00
Ishaan Jaff
d4b3082ca2 fix azure embedding test 2025-03-18 12:19:12 -07:00
Ishaan Jaff
38e2dd00cc fix embedding issue on ssl azure 2025-03-18 11:42:11 -07:00
Ishaan Jaff
a0c5fb81b8 fix logic for initializing openai clients 2025-03-18 10:23:30 -07:00
Ishaan Jaff
34142a1b62 _init_azure_client_for_cloudflare_ai_gateway 2025-03-18 10:11:54 -07:00
Ishaan Jaff
edfbf21c39 fix re-using azure openai client 2025-03-18 10:06:56 -07:00
Ishaan Jaff
f2026ef907 fix - correctly re-use azure openai client 2025-03-18 09:51:28 -07:00
Ishaan Jaff
b74f3cb76c _get_azure_openai_client 2025-03-18 09:38:27 -07:00
Ishaan Jaff
26be805ad3 rename to _get_azure_openai_client 2025-03-18 09:25:26 -07:00
Krrish Dholakia
5ffd3f56f8 fix(azure.py): track azure llm api latency metric 2025-03-13 14:47:35 -07:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Krrish Dholakia
2469072c50 fix: remove unused imports 2025-03-11 18:15:10 -07:00
Krrish Dholakia
58888f117c feat(azure.py): fix azure client init 2025-03-11 18:05:11 -07:00
Krrish Dholakia
9af73f339a test: fix tests 2025-03-11 17:42:36 -07:00
Krrish Dholakia
2c2404dac9 refactor(azure.py): working client init logic in azure image generation 2025-03-11 14:22:25 -07:00
Krrish Dholakia
152bc67d22 refactor(azure.py): working azure client init on audio speech endpoint 2025-03-11 14:19:45 -07:00
Krrish Dholakia
f7d9cce536 refactor(azure.py): refactor acompletion to use base azure sdk client 2025-03-11 13:59:13 -07:00
Krrish Dholakia
b58edb7fa1 test(test_azure_common_utils.py): add unit testing for common azure client params function 2025-03-11 12:24:08 -07:00
Krrish Dholakia
69839b3720 refactor(azure/common_utils.py): refactor azure client param logic
create common util for azure client param logic
2025-03-11 12:14:50 -07:00
Krrish Dholakia
bfbe26b91d feat(azure.py): add azure bad request error support 2025-03-10 15:59:06 -07:00
Krish Dholakia
f651d51f26
Litellm dev 02 07 2025 p2 (#8377)
* fix(caching_routes.py): mask redis password on `/cache/ping` route

* fix(caching_routes.py): fix linting error

* fix(caching_routes.py): fix linting error on caching routes

* fix: fix test - ignore mask_dict - has a breakpoint

* fix(azure.py): add timeout param + elapsed time in azure timeout error

* fix(http_handler.py): add elapsed time to http timeout request

makes it easier to debug how long request took before failing
2025-02-07 17:30:38 -08:00
Krish Dholakia
6b8b49451f
Fix azure max retries error (#8340)
* fix(azure.py): ensure max_retries=0 is respected

Fixes https://github.com/BerriAI/litellm/issues/6129

* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0

* test(test_azure_openai.py): add unit testing for azure_text/ route

* fix(azure.py): fix passing max retries on streaming

* fix(azure.py): fix azure max retries on async completion + streaming

* fix(completion/handler.py): fix azure text async completion + streaming

* test(test_azure_openai.py): ensure azure openai max retries always respected

* test(test_azure_o_series.py): add testing to ensure max retries always respected

* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)

* Update model_prices_and_context_window.json

added gemini providers for 2.0-flash and 2.0-flash lite

* Update model_prices_and_context_window.json

fixed URL

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Convert tool use arguments to string before counting tokens (#6989)

In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine.

* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing

* build(model_prices_and_context_window.json): add gemini commercial rate limits

* fix(utils.py): fix linting error

* refactor(utils.py): refactor to maintain function size

---------

Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
2025-02-06 23:20:48 -08:00
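The tool-call fix in the commit above (#6989) turns `arguments` into a string before counting tokens, and is a no-op when it is already a string. A minimal sketch of that idea follows; the helper name `arguments_as_text` and the use of `json.dumps` are assumptions for illustration, not the code from the commit.

```python
import json
from typing import Any


def arguments_as_text(arguments: Any) -> str:
    """Hypothetical helper: return tool-call arguments as tokenizable text."""
    if isinstance(arguments, str):
        # Already a string, so this is a no-op.
        return arguments
    # Assumption: serialize dict-like arguments as JSON before tokenizing.
    return json.dumps(arguments)


print(arguments_as_text({"city": "Paris"}))
print(arguments_as_text('{"city": "Paris"}'))
```

Either input form ends up as a plain string, so a tokenizer that only accepts text can handle both.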
Krish Dholakia
97b8de17ab
LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 (#7828)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls

* test: add base test for `http:` url's

* fix(factory.py/get_image_details): follow redirects

allows http calls to work

* fix(codestral/): fix stream chunk parsing on last chunk of stream

* Azure ad token provider (#6917)

* Update azure.py

Added optional parameter azure ad token provider

* Added parameter to main.py

* Found token provider arg location

* Fixed embeddings

* Fixed ad token provider

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix: fix linting errors

* fix(main.py): leave out o1 route for azure ad token provider, for now

get v0 out for sync azure gpt route to begin with

* test: skip http:// test for fireworks ai

model does not support it

* refactor: cleanup dead code

* fix: revert http:// url passthrough for gemini

google ai studio raises errors

* test: fix test

---------

Co-authored-by: bahtman <anton@baht.dk>
2025-02-02 23:17:50 -08:00
Ishaan Jaff
1e06ee3162
(Refactor) - Re use litellm.completion/litellm.embedding etc for health checks (#7455)
* add mode: realtime

* add _realtime_health_check

* test_realtime_health_check

* azure _realtime_health_check

* _realtime_health_check

* Realtime Models

* fix code quality

* delete OAI / Azure custom health check code

* simplest version of ahealth check

* update tests

* working health check post refactor

* working aspeech health check

* fix realtime health checks

* test_audio_transcription_health_check

* use get_audio_file_for_health_check

* test_text_completion_health_check

* ahealth_check

* simplify health check code

* update ahealth_check

* fix import

* fix unused imports

* fix ahealth_check

* fix local testing

* test_async_realtime_health_check
2024-12-28 18:38:54 -08:00
Ishaan Jaff
4e65722a00
(Bug Fix) Add health check support for realtime models (#7453)
* add mode: realtime

* add _realtime_health_check

* test_realtime_health_check

* azure _realtime_health_check

* _realtime_health_check

* Realtime Models

* fix code quality
2024-12-28 18:15:00 -08:00
Krish Dholakia
9237357bcc
Litellm dev 12 25 2024 p1 (#7411)
* test(test_watsonx.py): e2e unit test for watsonx custom header

covers https://github.com/BerriAI/litellm/issues/7408

* fix(common_utils.py): handle auth token already present in headers (watsonx + openai-like base handler)

Fixes https://github.com/BerriAI/litellm/issues/7408

* fix(watsonx/chat): fix chat route

Fixes https://github.com/BerriAI/litellm/issues/7408

* fix(huggingface/chat/handler.py): fix huggingface async completion calls

* Correct handling of max_retries=0 to disable AzureOpenAI retries (#7379)

* test: fix test

---------

Co-authored-by: Minh Duc <phamminhduc0711@gmail.com>
2024-12-25 17:36:30 -08:00
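Several commits in this history deal with `max_retries=0` not being respected. The log does not say what caused it. One common way this kind of bug arises in Python is a falsy-value fallback, sketched below. The function names and the default value are hypothetical and are not taken from the repository.

```python
from typing import Optional

DEFAULT_MAX_RETRIES = 2  # hypothetical default, for illustration only


def resolve_max_retries_buggy(max_retries: Optional[int]) -> int:
    # `or` treats 0 as falsy, so an explicit 0 silently becomes the default.
    return max_retries or DEFAULT_MAX_RETRIES


def resolve_max_retries(max_retries: Optional[int]) -> int:
    # Fall back only when the caller passed nothing at all.
    return DEFAULT_MAX_RETRIES if max_retries is None else max_retries


print(resolve_max_retries_buggy(0))
print(resolve_max_retries(0))
```

Checking `is None` explicitly keeps "no retries" distinct from "not specified".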
Ishaan Jaff
81be0b4090
(Feat) add `/v1/batches/{batch_id:path}/cancel` endpoint (#7406)
* use 1 file for azure batches handling

* add cancel_batch endpoint

* add a cancel batch on open ai

* add cancel_batch endpoint

* add cancel batches to test

* remove unused imports

* test_batches_operations

* update test_batches_operations
2024-12-24 20:23:50 -08:00
Krish Dholakia
404bf2974b
Litellm dev 2024 12 20 p1 (#7335)
* fix(utils.py): e2e azure tts cost tracking working

moves tts response obj to include hidden params (allows for litellm call id, etc. to be sent in response headers); fixes spend_tracking_utils logging payload to account for non-base model use-case

Fixes https://github.com/BerriAI/litellm/issues/7223

* fix: fix linting errors

* build(model_prices_and_context_window.json): add bedrock llama 3.3

Closes https://github.com/BerriAI/litellm/issues/7329

* fix(openai.py): fix return type for sync openai httpx response

* test: update test

* fix(spend_tracking_utils.py): fix if check

* fix(spend_tracking_utils.py): fix if check

* test: improve debugging for test

* fix: fix import
2024-12-20 21:22:31 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router`

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router import

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handlers

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix import

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
Krish Dholakia
e68bb4e051
Litellm dev 12 12 2024 (#7203)
* fix(azure/): support passing headers to azure openai endpoints

Fixes https://github.com/BerriAI/litellm/issues/6217

* fix(utils.py): move default tokenizer to just openai

hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls

* fix(router.py): fix pattern matching router - add generic "*" to it as well

Fixes issue where generic "*" model access group wouldn't show up

* fix(pattern_match_deployments.py): match to more specific pattern

match to more specific pattern

allows setting generic wildcard model access group and excluding specific models more easily

* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty

don't delete all router models  b/c of empty list

Fixes https://github.com/BerriAI/litellm/issues/7196

* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api

* fix(fireworks_ai/): support passing response_format + tool call in same message

Addresses https://github.com/BerriAI/litellm/issues/7135

* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"

This reverts commit 6a30dc6929.

* test: fix test

* fix(replicate/): fix replicate default retry/polling logic

* test: add unit testing for router pattern matching

* test: update test to use default oai tokenizer

* test: mark flaky test

* test: skip flaky test
2024-12-13 08:54:03 -08:00
Krish Dholakia
350cfc36f7
Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Krish Dholakia
0c0498dd60
Litellm dev 12 07 2024 (#7086)
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools have prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying  (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exp back off for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21`  (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
Ishaan Jaff
36e99ebce7
fix use consistent naming (#7092)
2024-12-07 22:01:00 -08:00
Renamed from litellm/llms/AzureOpenAI/azure.py