update docs

2023-08-28 22:16:22 -07:00 · 2023-08-28 22:16:22 -07:00 · d9b17fb063
commit d9b17fb063
parent 8f7f9ca932
1 changed file with 10 additions and 27 deletions
--- a/docs/my-website/docs/caching/caching.md
+++ b/docs/my-website/docs/caching/caching.md
@@ -1,42 +1,25 @@
 # LiteLLM - Caching

-liteLLM implements exact match caching. It can be enabled by setting
-1. `litellm.caching`: When set to `True`, enables caching for all responses. Keys are the input `messages`, and the value stored in the cache is the corresponding `response`.
+## LiteLLM Caches `completion()` and `embedding()` calls when switched on

-2. `litellm.caching_with_models`: When set to `True`, enables caching on a per-model basis. Keys are the input `messages + model`, and the value stored in the cache is the corresponding `response`.
+LiteLLM implements exact-match caching and supports the following cache backends:
+* In-Memory Caching [Default]
+* Redis Caching Local
+* Redis Caching Hosted
+* GPTCache 

 ## Usage
 1. Caching - cache
 Keys in the cache are `model`; the following example will lead to a cache hit:
 ```python
-litellm.caching = True
+import litellm
+from litellm import completion
+from litellm.caching import Cache
+litellm.cache = Cache()

 # Make completion calls
 response1 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
 response2 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])

 # response1 == response2, response 1 is cached
-
-# with a diff model
-response3 = completion(model="command-nightly", messages=[{"role": "user", "content": "Tell me a joke."}])
-
-# response3 == response1 == response2, since keys are messages
 ```
-
-
-2. Caching with Models - caching_with_models
-Keys in the cache are `messages + model`, the following example will not lead to a cache hit
-```python
-litellm.caching_with_models = True
-
-# Make completion calls
-response1 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
-response2 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
-# response1 == response2, response 1 is cached
-
-# with a diff model, this will call the API since the key is not cached
-response3 = completion(model="command-nightly", messages=[{"role": "user", "content": "Tell me a joke."}])
-
-# response3 != response1, since keys are messages + model
-```
-
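
Note on the change above: the new docs describe "exact match caching", where a repeated call with identical inputs is served from the cache instead of hitting the API. The sketch below illustrates that idea in isolation. It is not LiteLLM's implementation: `ExactMatchCache`, `get_or_call`, and `fake_completion` are hypothetical names, and the assumption that the key is built from both `model` and `messages` is made here only for illustration.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


class ExactMatchCache:
    """Hypothetical in-memory exact-match cache (not LiteLLM's Cache class)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def make_key(model: str, messages: List[Message]) -> str:
        # Assumption: the key covers model + messages. Serializing with
        # sort_keys=True makes identical inputs produce identical keys.
        return json.dumps({"model": model, "messages": messages}, sort_keys=True)

    def get_or_call(
        self, fn: Callable[..., Any], model: str, messages: List[Message]
    ) -> Any:
        key = self.make_key(model, messages)
        if key in self._store:
            # Exact match: return the stored response without calling fn.
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = fn(model=model, messages=messages)
        self._store[key] = response
        return response


def fake_completion(model: str, messages: List[Message]) -> Dict[str, str]:
    # Stand-in for a real API call; counts how often it is actually invoked.
    fake_completion.calls += 1
    return {"model": model, "content": f"reply #{fake_completion.calls}"}


fake_completion.calls = 0

cache = ExactMatchCache()
msgs = [{"role": "user", "content": "Tell me a joke."}]

r1 = cache.get_or_call(fake_completion, "gpt-3.5-turbo", msgs)
r2 = cache.get_or_call(fake_completion, "gpt-3.5-turbo", msgs)    # served from cache
r3 = cache.get_or_call(fake_completion, "command-nightly", msgs)  # different key
```

Under these assumptions the second call never reaches `fake_completion`, while the call with a different model does, because its key differs. A cache keyed on `messages` alone would instead return the first response for all three calls.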