docs(caching.md): add batch redis requests to docs

This commit is contained in:
Krrish Dholakia 2024-03-15 23:01:08 -07:00
parent f3cf1ec71f
commit 2d2731c3b5


@ -225,6 +225,32 @@ litellm_settings:
supported_call_types: ["acompletion", "completion", "embedding", "aembedding"] # defaults to all litellm call types
```
### Turn on `batch_redis_requests`

**What does it do?**

When a request is made:

- Check if a key starting with `litellm:<hashed_api_key>:<call_type>:` exists in-memory; if not, fetch the last 100 cached requests for this key from Redis and store them in-memory
- New requests are stored with this `litellm:..` prefix as the namespace

**Why?**

This reduces the number of Redis GET requests, which improved latency by 46% in prod load tests.

**Usage**

```yaml
litellm_settings:
cache: true
cache_params:
type: redis
... # remaining redis args (host, port, etc.)
callbacks: ["batch_redis_requests"] # 👈 KEY CHANGE!
```
[**SEE CODE**](https://github.com/BerriAI/litellm/blob/main/litellm/proxy/hooks/batch_redis_get.py)
### Turn on / off caching per request.

The proxy supports 3 cache-controls: