forked from phoenix/litellm-mirror
docs caching update
This commit is contained in:
parent 323e095688
commit 05285c5844
1 changed file with 60 additions and 1 deletion
@@ -5,7 +5,7 @@
liteLLM implements exact match caching and supports the following caching backends:

* In-Memory Caching [Default]
* Redis Caching Local
* Redis Caching Hosted
* GPTCache

## Quick Start Usage - Completion
@@ -45,6 +45,65 @@ response2 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "conten
# response1 == response2, response 1 is cached
```

### Custom Cache Keys

Define a function that returns the cache key:
```python
# this function takes in *args, **kwargs and returns the key you want to use for caching
def custom_get_cache_key(*args, **kwargs):
    # return the key to use for your cache
    key = kwargs.get("model", "") + str(kwargs.get("messages", "")) + str(kwargs.get("temperature", "")) + str(kwargs.get("logit_bias", ""))
    print("key for cache", key)
    return key
```

Set your function as `litellm.cache.get_cache_key`:
```python
import os
import litellm
from litellm.caching import Cache

cache = Cache(type="redis", host=os.environ['REDIS_HOST'], port=os.environ['REDIS_PORT'], password=os.environ['REDIS_PASSWORD'])

cache.get_cache_key = custom_get_cache_key # set get_cache_key function for your cache

litellm.cache = cache # set litellm.cache to your cache
```
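
With the custom key in place, calls that share the fields used in the key (`model`, `messages`, `temperature`, `logit_bias`) map to the same cache entry even if other parameters differ. Below is a minimal sketch of that behavior; the example messages and the `max_tokens` difference are illustrative assumptions, and it uses the `caching` flag described in the next section:

```python
from litellm import completion

# illustrative example messages
messages = [{"role": "user", "content": "what is litellm?"}]

# both calls produce the same custom cache key
# (model + messages + temperature + logit_bias), so the second call
# can be served from the cache even though max_tokens differs
response1 = completion(model="gpt-3.5-turbo", messages=messages, temperature=0.2, caching=True)
response2 = completion(model="gpt-3.5-turbo", messages=messages, temperature=0.2, max_tokens=20, caching=True)
```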

### Controlling Caching for each `litellm.completion` call

`completion()` lets you pass in `caching` (bool) [default False] to control whether to return cached responses or not.

Using the caching flag

**Ensure you have set `litellm.cache` to your cache object**
```python
from litellm import completion

# messages is the same list of chat messages used in the examples above
# caching=True -> liteLLM will return a cached response if one exists
response2 = completion(model="gpt-3.5-turbo", messages=messages, temperature=0.1, caching=True)

# caching=False -> always call the API, never return a cached response
response3 = completion(model="gpt-3.5-turbo", messages=messages, temperature=0.1, caching=False)
```

### Detecting Cached Responses

For responses that were returned as a cache hit, the response includes the param `cache` = True.

Example response with a cache hit:
```python
{
    'cache': True,
    'id': 'chatcmpl-7wggdzd6OXhgE2YhcLJHJNZsEWzZ2',
    'created': 1694221467,
    'model': 'gpt-3.5-turbo-0613',
    'choices': [
        {
            'index': 0,
            'message': {
                'role': 'assistant',
                'content': 'I\'m sorry, but I couldn\'t find any information about "litellm" or how many stars it has. It is possible that you may be referring to a specific product, service, or platform that I am not familiar with. Can you please provide more context or clarify your question?'
            },
            'finish_reason': 'stop'
        }
    ],
    'usage': {'prompt_tokens': 17, 'completion_tokens': 59, 'total_tokens': 76},
}
```
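
A minimal sketch of branching on the `cache` param, assuming the response object supports dict-style access as in the example above (the exact access pattern may vary between litellm versions):

```python
response = completion(model="gpt-3.5-turbo", messages=messages, caching=True)

# assumption: 'cache' is only set to True when the response came from the cache
if response.get("cache", False):
    print("cache hit - no API call was made")
else:
    print("cache miss - fresh response from the API")
```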

## Caching with Streaming

LiteLLM can cache your streamed responses for you.
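
A minimal sketch, assuming the same `caching=True` flag applies to streaming calls, that `Cache()` with no arguments uses the default in-memory backend, and that the response is cached once the stream completes:

```python
import litellm
from litellm import completion
from litellm.caching import Cache

litellm.cache = Cache()  # assumption: no-arg Cache() selects the in-memory backend

messages = [{"role": "user", "content": "write a one-line poem about caching"}]

# first call streams from the API; assumption: the assembled response is cached
# once the stream finishes
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True, caching=True)
for chunk in response:
    print(chunk)

# an identical second call can then be answered from the cache
cached_response = completion(model="gpt-3.5-turbo", messages=messages, stream=True, caching=True)
```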