add gpt cache to docs

This commit is contained in:
ishaan-jaff 2023-08-26 16:30:32 -07:00
parent 4e8bfeb6f1
commit 628cfa29f3
3 changed files with 34 additions and 1 deletion

@@ -0,0 +1,42 @@
# Caching
LiteLLM implements exact-match caching. It can be enabled by setting:
1. `litellm.caching`: When set to `True`, enables caching for all responses. The cache key is the input `messages`, and the value stored in the cache is the corresponding `response`.
2. `litellm.caching_with_models`: When set to `True`, enables caching on a per-model basis. The cache key is the input `messages + model`, and the value stored in the cache is the corresponding `response`.

A conceptual sketch of this keying appears after the examples below.
## Usage
1. Caching (`litellm.caching`)
Cache keys are the input `messages`, so the following example leads to a cache hit even across models:
```python
import litellm
from litellm import completion

litellm.caching = True

# Make completion calls
response1 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
response2 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
# response1 == response2: the second call is served from the cache

# even with a different model
response3 = completion(model="command-nightly", messages=[{"role": "user", "content": "Tell me a joke."}])
# response3 == response1 == response2, since the cache key is only the messages
```
2. Caching with models (`litellm.caching_with_models`)
Cache keys are the input `messages + model`, so calling a different model below does not produce a cache hit:
```python
import litellm
from litellm import completion

litellm.caching_with_models = True

# Make completion calls
response1 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
response2 = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Tell me a joke."}])
# response1 == response2: the second call is served from the cache

# a different model calls the API, since its key is not in the cache
response3 = completion(model="command-nightly", messages=[{"role": "user", "content": "Tell me a joke."}])
# response3 != response1, since the cache key includes the model
```
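
Conceptually, exact-match caching behaves like an in-memory dictionary keyed by the serialized inputs. The sketch below is purely illustrative and is not LiteLLM's internal implementation (`cached_completion` and `_cache` are hypothetical names); it only shows how the two flags change what goes into the key:
```python
import json

from litellm import completion

_cache = {}

def cached_completion(model, messages, caching_with_models=False):
    # The key is the messages alone, or messages + model when
    # per-model caching is enabled
    key = {"messages": messages}
    if caching_with_models:
        key["model"] = model
    key = json.dumps(key, sort_keys=True)
    if key not in _cache:
        _cache[key] = completion(model=model, messages=messages)
    return _cache[key]
```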

@@ -0,0 +1,26 @@
# Using GPTCache with LiteLLM
GPTCache is a library for building semantic caches for LLM queries.

GPTCache docs: https://gptcache.readthedocs.io/en/latest/index.html#
GPTCache GitHub: https://github.com/zilliztech/GPTCache
## Usage
### Install GPTCache
```shell
pip install gptcache
```
### Using GPTCache with LiteLLM `completion()`
```python
import os

from gptcache import cache
from litellm.cache import completion

# Set your .env keys
os.environ['OPENAI_API_KEY'] = ""

# Initialize GPTCache and point it at the OpenAI key set above
cache.init()
cache.set_openai_key()

messages = [{"role": "user", "content": "what is litellm YC 22?"}]
response = completion(model="gpt-3.5-turbo", messages=messages)
```
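
To see the cache at work, repeat the call: the second response should come back from GPTCache instead of the API. A minimal sketch continuing the example above (the timing code is illustrative; it reuses `completion` and `messages` from the previous block):
```python
import time

start = time.time()
response1 = completion(model="gpt-3.5-turbo", messages=messages)
print(f"first call:  {time.time() - start:.2f}s")

start = time.time()
response2 = completion(model="gpt-3.5-turbo", messages=messages)  # served from the cache
print(f"second call: {time.time() - start:.2f}s")
```
Note that the plain `cache.init()` above performs exact matching; matching semantically similar prompts requires initializing GPTCache with an embedding-based configuration, as described in the GPTCache docs.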