LiteLLM Minor Fixes & Improvements (11/12/2024) (#6705)

* fix(caching): convert arg to equivalent kwargs in llm caching handler prevent unexpected errors * fix(caching_handler.py): don't pass args to caching * fix(caching): remove all *args from caching.py * fix(caching): consistent function signatures + abc method * test(caching_unit_tests.py): add unit tests for llm caching ensures coverage for common caching scenarios across different implementations * refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one * fix(router.py): drop redis password requirement * fix(proxy_server.py): fix faulty slack alerting check * fix(langfuse.py): avoid copying functions/thread lock objects in metadata fixes metadata copy error when parent otel span in metadata * test: update test
2025-04-26 11:14:04 +00:00 · 2024-11-12 22:50:51 +05:30 · 2024-11-12 22:50:51 +05:30 · 9160d80fa5
commit 9160d80fa5
parent d39fd60801
23 changed files with 525 additions and 204 deletions
--- a/litellm/utils.py
+++ b/litellm/utils.py
@ -796,7 +796,7 @@ def client(original_function):  # noqa: PLR0915
                and kwargs.get("_arealtime", False) is not True
            ):  # allow users to control returning cached responses from the completion function
                # checking cache
-                print_verbose("INSIDE CHECKING CACHE")
+                verbose_logger.debug("INSIDE CHECKING SYNC CACHE")
                caching_handler_response: CachingHandlerResponse = (
                    _llm_caching_handler._sync_get_cache(
                        model=model or "",
@ -808,6 +808,7 @@ def client(original_function):  # noqa: PLR0915
                        args=args,
                    )
                )
+
                if caching_handler_response.cached_result is not None:
                    return caching_handler_response.cached_result