mirror of
https://github.com/BerriAI/litellm.git
synced 2025-04-24 18:24:20 +00:00
(docs) proxy server: add caching
parent 44e867499f
commit b15b723567
1 changed file with 56 additions and 0 deletions
@@ -871,6 +871,62 @@ model_list:

```shell
$ litellm --config /path/to/config.yaml
```

## Caching Responses

Enable caching by adding the following Redis credentials to your server environment:

```shell
export REDIS_HOST=""      # e.g. REDIS_HOST='redis-18841.c274.us-east-1-3.ec2.cloud.redislabs.com'
export REDIS_PORT=""      # e.g. REDIS_PORT='18841'
export REDIS_PASSWORD=""  # e.g. REDIS_PASSWORD='liteLlmIsAmazing'
```
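
Before starting the proxy, it can help to confirm that the three variables above are actually set in the current shell. The sketch below is one way to do that; `check_redis_env` is a hypothetical helper written for this example, not part of litellm, and it only checks that the values are non-empty, not that Redis is reachable.

```shell
# Hypothetical helper (not part of litellm): report which of the Redis
# variables named above are unset or empty in the current shell.
check_redis_env() {
  missing=""
  for name in REDIS_HOST REDIS_PORT REDIS_PASSWORD; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

A possible use is `check_redis_env && litellm --config /path/to/config.yaml`, so the proxy only starts once all three values are present.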

#### Using Caching

Send the same request twice. If caching is enabled, the second response should be served from the cache:

```shell
curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "write a poem about litellm!"}],
     "temperature": 0.7
   }'

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "write a poem about litellm!"}],
     "temperature": 0.7
   }'
```
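
One rough way to see whether the repeated request was served from the cache is to compare how long each call takes. The sketch below assumes the proxy is running at the address used above; `time_request`, `PROXY_URL`, and `PAYLOAD` are hypothetical names introduced for this example. A faster second call is only suggestive of a cache hit, not proof of one.

```shell
# Assumed endpoint: the same address as in the curl examples above.
PROXY_URL="${PROXY_URL:-http://0.0.0.0:8000/v1/chat/completions}"

# Same request body as above, kept in one place so both calls are identical.
PAYLOAD='{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": "write a poem about litellm!"}],
  "temperature": 0.7
}'

# Hypothetical helper: discard the response body and print the total
# time (in seconds) that curl spent on one request.
time_request() {
  curl -s -o /dev/null -w '%{time_total}\n' "$PROXY_URL" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
}

# Usage, with the proxy running:
#   time_request   # first call is forwarded to the model provider
#   time_request   # second call should return faster if caching is active
```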

#### Control caching per completion request

Caching can be switched on or off per `/chat/completions` request by passing a `caching` field in the request body.

- Caching **on** for a completion: pass `"caching": true`:

```shell
curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "write a poem about litellm!"}],
     "temperature": 0.7,
     "caching": true
   }'
```

- Caching **off** for a completion: pass `"caching": false`:

```shell
curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "write a poem about litellm!"}],
     "temperature": 0.7,
     "caching": false
   }'
```
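
Since the two requests above differ only in the `caching` field, the body can be generated from a single template. The following is a minimal sketch under that assumption; `chat_payload` and `chat_request` are hypothetical helper names, and the endpoint is the one used in the examples above.

```shell
# Hypothetical helper: build the request body with an explicit "caching" flag.
# $1 must be the JSON literal true or false.
chat_payload() {
  case "$1" in
    true|false) ;;
    *) echo "usage: chat_payload true|false" >&2; return 1 ;;
  esac
  printf '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "write a poem about litellm!"}], "temperature": 0.7, "caching": %s}\n' "$1"
}

# Hypothetical helper: send the body to the proxy (assumed to be running
# at the address used in the examples above).
chat_request() {
  body=$(chat_payload "$1") || return 1
  curl -s "${PROXY_URL:-http://0.0.0.0:8000/v1/chat/completions}" \
    -H "Content-Type: application/json" \
    -d "$body"
}

# Usage, with the proxy running:
#   chat_request true    # caching on for this request
#   chat_request false   # caching off for this request
```

Rejecting anything other than `true` or `false` keeps a typo from being sent to the proxy as malformed JSON.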

## Debugging Proxy

Run the proxy with `--debug` to easily view debug logs

```shell