docs: add time.sleep() between streaming calls

LiteLLM's cache appears to be updated in the background: without this `time.sleep()` call, both responses take `0.8s` to return, but with it, the second response returns in `0.006s`.
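The timing above can be reproduced without LiteLLM. The sketch below uses a hypothetical `AsyncCache` class (an illustration, not LiteLLM's actual implementation) whose writes land on a background thread, which is how the hosted cache appears to behave: an immediate read misses, while a read after a short sleep hits.

```python
import threading
import time

# Hypothetical cache whose writes complete on a background thread,
# mimicking the observed behavior of LiteLLM's hosted cache.
class AsyncCache:
    def __init__(self):
        self._store = {}

    def set_async(self, key, value):
        # Simulate a background write that lands ~50ms later.
        def _write():
            time.sleep(0.05)
            self._store[key] = value
        threading.Thread(target=_write, daemon=True).start()

    def get(self, key):
        return self._store.get(key)

cache = AsyncCache()
cache.set_async("prompt", "response")
print(cache.get("prompt"))  # miss: the background write hasn't landed yet
time.sleep(0.1)             # give the write time to complete
print(cache.get("prompt"))  # hit: the value is now in the cache
```

This is why the docs change sleeps between the two `completion()` calls: the second call can only hit the cache once the background write from the first call has finished.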
This commit is contained in:
Ajeet D'Souza 2024-08-28 17:59:07 +05:30 committed by GitHub
parent f1147696a3
commit 0533f77138


@@ -51,8 +51,10 @@ LiteLLM can cache your streamed responses for you
### Usage
```python
import litellm
import time
from litellm import completion
from litellm.caching import Cache
litellm.cache = Cache(type="hosted")
# Make completion calls
@@ -64,6 +66,7 @@ response1 = completion(
for chunk in response1:
print(chunk)
time.sleep(1) # cache is updated asynchronously
response2 = completion(
model="gpt-3.5-turbo",
@@ -72,4 +75,4 @@ response2 = completion(
caching=True)
for chunk in response2:
print(chunk)
```
```