litellm-mirror/litellm/proxy/hooks
Krish Dholakia d88e8922d4
Litellm dev 11 02 2024 (#6561)
* fix(dual_cache.py): update in-memory check for redis batch get cache

Fixes latency delay for async_batch_redis_cache

* fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set

* feat(user_api_key_auth.py): add parent otel component for auth

allows us to isolate how much latency is added by auth checks

* perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task)

reduces latency by 200ms

* feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter)

Reduces latency by 400-800ms

* fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls

reduces latency by 50-100ms

* fix: fix linting error

* fix(_service_logger.py): fix import

* fix(user_api_key_auth.py): fix service logging

* fix(dual_cache.py): don't pass 'self'

* fix: fix python3.8 error

* fix: fix init]
2024-11-04 07:48:20 +05:30
..
__init__.py fix(proxy_server.py): enable pre+post-call hooks and max parallel request limits 2023-12-08 17:11:30 -08:00
azure_content_safety.py (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) 2024-10-14 16:34:01 +05:30
batch_redis_get.py (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) 2024-10-14 16:34:01 +05:30
cache_control_check.py (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) 2024-10-14 16:34:01 +05:30
dynamic_rate_limiter.py (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) 2024-10-14 16:34:01 +05:30
example_presidio_ad_hoc_recognizer.json fix(presidio_pii_masking.py): enable user to pass ad hoc recognizer for pii masking 2024-02-20 16:01:15 -08:00
max_budget_limiter.py redis otel tracing + async support for latency routing (#6452) 2024-10-28 21:52:12 -07:00
parallel_request_limiter.py Litellm dev 11 02 2024 (#6561) 2024-11-04 07:48:20 +05:30
presidio_pii_masking.py (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) 2024-10-14 16:34:01 +05:30
prompt_injection_detection.py (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208) 2024-10-14 16:34:01 +05:30