(fix) add linting check to ban creating AsyncHTTPHandler during LLM calling (#6855)

* fix triton
* fix TEXT_COMPLETION_CODESTRAL
* fix REPLICATE
* fix CLARIFAI
* fix HUGGINGFACE
* add test_no_async_http_handler_usage
* fix PREDIBASE
* fix anthropic use get_async_httpx_client
* fix vertex fine tuning
* fix dbricks get_async_httpx_client
* fix get_async_httpx_client vertex
* fix get_async_httpx_client
* fix get_async_httpx_client
* fix make_async_azure_httpx_request
* fix check_for_async_http_handler
* test: cleanup mistral model
* add check for AsyncClient
* fix check_for_async_http_handler
* fix get_async_httpx_client
* fix tests using in_memory_llm_clients_cache
* fix langfuse import
* fix import

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-04-25 18:54:30 +00:00 · 2024-11-21 19:03:02 -08:00 · 2024-11-21 19:03:02 -08:00 · 920f4c9f82
commit 920f4c9f82
parent 71ebf47cef
26 changed files with 288 additions and 62 deletions
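
Note on the lint check: the commit message names `check_for_async_http_handler` and `test_no_async_http_handler_usage`, but their source is not part of this excerpt. The following is only a minimal sketch of how such a check might work, assuming it parses each file with Python's `ast` module and flags direct constructor calls. The function name `find_banned_client_calls` and the `BANNED_CALLS` set are hypothetical and are not taken from the repository.

```python
import ast
from typing import List, Tuple

# Assumption: the commit message mentions AsyncHTTPHandler and AsyncClient;
# the actual list used by the real check is not shown in this excerpt.
BANNED_CALLS = {"AsyncHTTPHandler", "AsyncClient"}


def find_banned_client_calls(source: str, filename: str = "<string>") -> List[Tuple[int, str]]:
    """Return (line number, name) for each direct call to a banned constructor."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Covers both `AsyncHTTPHandler(...)` and `httpx.AsyncClient(...)`.
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name in BANNED_CALLS:
            hits.append((node.lineno, name))
    return sorted(hits)


# Usage: the removed line from the hunk below is flagged, the added call is not.
sample = '''
client = AsyncHTTPHandler(timeout=timeout, concurrent_limit=1)
client = get_async_httpx_client(llm_provider=provider, params={"timeout": timeout})
'''
print(find_banned_client_calls(sample))  # [(2, 'AsyncHTTPHandler')]
```

Because the sketch only matches `ast.Call` nodes, references such as `isinstance(client, AsyncHTTPHandler)` and type annotations pass. A real check would presumably also need an allowlist for the module that defines `get_async_httpx_client` itself.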
--- a/litellm/llms/azure_ai/embed/handler.py
+++ b/litellm/llms/azure_ai/embed/handler.py
@@ -74,7 +74,10 @@ class AzureAIEmbedding(OpenAIChatCompletion):
        client: Optional[Union[HTTPHandler, AsyncHTTPHandler]] = None,
    ) -> EmbeddingResponse:
        if client is None or not isinstance(client, AsyncHTTPHandler):
-            client = AsyncHTTPHandler(timeout=timeout, concurrent_limit=1)
+            client = get_async_httpx_client(
+                llm_provider=litellm.LlmProviders.AZURE_AI,
+                params={"timeout": timeout},
+            )

        url = "{}/images/embeddings".format(api_base)