Mirror of https://github.com/BerriAI/litellm.git, synced 2025-04-25 18:54:30 +00:00
fix(proxy_server.py): Fix "Circular reference detected" error when max_parallel_requests = 0 (#9671)
* fix(proxy_server.py): remove non-functional parent backoff/retry on /chat/completion
This parent retry caused the circular reference error
* fix(http_parsing_utils.py): safely return the parsed body; don't let client functions mutate the cached request body (see the sketch after this list)
Root-cause fix for the circular reference error
* Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)
This reverts commit a8673246dc.
* add type hints for AnthropicMessagesResponse
* define types for the response from AnthropicMessagesResponse
* fix response typing
* allow using litellm.messages.acreate and litellm.messages.create (usage sketch below)
* fix anthropic_messages implementation
* add clear type hints to litellm.messages.create functions
* fix anthropic_messages
* working anthropic API tests
* fixes - anthropic messages interface
* use new anthropic interface
* fix code quality check
* docs anthropic messages endpoint
* add namespace_packages = True to mypy
* fix mypy lint errors
* docs anthropic messages interface
* test: fix unit test
* test(test_http_parsing_utils.py): update tests
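The http_parsing_utils.py root-cause fix is worth spelling out. Below is a minimal sketch, assuming a FastAPI request whose parsed JSON body is cached on `request.scope`; the helper name, cache key, and body are illustrative, not the exact litellm code.

```python
import json
from typing import Any, Dict

from fastapi import Request


async def read_request_body(request: Request) -> Dict[str, Any]:
    """Parse the JSON body once, cache it on the request scope,
    and hand every caller a fresh copy of the cached dict."""
    cached = request.scope.get("parsed_body")
    if cached is None:
        raw = await request.body()
        cached = json.loads(raw) if raw else {}
        request.scope["parsed_body"] = cached
    # Returning a copy means a caller that inserts self-referential
    # objects (e.g. logging handles) cannot poison the shared cache.
    return dict(cached)
```

Because each caller gets a copy, a downstream function that mutates its view of the body can no longer plant a back-reference inside the cached dict, which is what made later serialization of the body fail.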
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
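As a companion to the interface bullets above, here is a hypothetical sketch of the typed response and the new `litellm.messages.create` / `litellm.messages.acreate` entry points in use. The entry-point names come from this commit's bullets; the field names follow Anthropic's public Messages API, and the model id and exact litellm type definitions are assumptions.

```python
import asyncio
from typing import Any, Dict, List, Optional, TypedDict

import litellm


class AnthropicUsage(TypedDict):
    input_tokens: int
    output_tokens: int


class AnthropicMessagesResponse(TypedDict):
    # Illustrative shape only; mirrors Anthropic's /v1/messages response.
    id: str
    type: str                      # "message"
    role: str                      # "assistant"
    model: str
    content: List[Dict[str, Any]]  # e.g. [{"type": "text", "text": "..."}]
    stop_reason: Optional[str]
    usage: AnthropicUsage


async def main() -> None:
    # acreate is the async variant; litellm.messages.create is the sync one.
    response = await litellm.messages.acreate(
        model="claude-3-5-sonnet-20240620",  # assumed model id
        max_tokens=256,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )
    print(response)


asyncio.run(main())
```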
parent 136f1d60b3
commit f2a7edaddc
4 changed files with 55 additions and 16 deletions
```diff
--- a/litellm/proxy/proxy_server.py
+++ b/litellm/proxy/proxy_server.py
@@ -3308,15 +3308,6 @@ async def model_list(
     tags=["chat/completions"],
     responses={200: {"description": "Successful response"}, **ERROR_RESPONSES},
 )  # azure compatible endpoint
-@backoff.on_exception(
-    backoff.expo,
-    Exception,  # base exception to catch for the backoff
-    max_tries=global_max_parallel_request_retries,  # maximum number of retries
-    max_time=global_max_parallel_request_retry_timeout,  # maximum total time to retry for
-    on_backoff=on_backoff,  # specifying the function to call on backoff
-    giveup=giveup,
-    logger=verbose_proxy_logger,
-)
 async def chat_completion(  # noqa: PLR0915
     request: Request,
     fastapi_response: Response,
```
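For context on the commit title: "Circular reference detected" is the ValueError Python's json module raises when asked to serialize a self-referencing structure, which is exactly the state a shared, mutated request body can end up in. A minimal reproduction (the dict contents are illustrative):

```python
import json

body = {"model": "gpt-4", "max_parallel_requests": 0}
# A caller adds a back-reference into the shared body dict.
body["metadata"] = {"parent": body}

try:
    json.dumps(body)
except ValueError as err:
    print(err)  # -> Circular reference detected
```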