(docs) simple proxy

ishaan-jaff 2023-11-08 17:55:11 -08:00
parent 277a42ea4d
commit 901b0e690e


@@ -344,33 +344,6 @@ print(result)
## Advanced
### Caching
#### Control caching per completion request
Caching can be switched on or off per `/chat/completions` request (a sketch for enabling caching proxy-wide follows the examples below).
- Caching on for completion - pass `"caching": true` in the request body:
```shell
curl http://0.0.0.0:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "write a poem about litellm!"}],
"temperature": 0.7,
"caching": true
}'
```
- Caching off for completion - pass `"caching": false` in the request body:
```shell
curl http://0.0.0.0:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "write a poem about litellm!"}],
"temperature": 0.7,
"caching": false
}'
```
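
The per-request `caching` flag only has an effect if caching is enabled on the proxy itself. Below is a minimal sketch, assuming a `litellm_settings` block with a `cache` key in the proxy `config.yaml` - check the caching docs for the exact options your LiteLLM version supports:
```yaml
# config.yaml -- sketch: enable response caching proxy-wide,
# then opt out per request with "caching": false as shown above
litellm_settings:
  cache: True
```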
### Set Custom Prompt Templates
LiteLLM by default checks if a model has a [prompt template and applies it](./completion/prompt_formatting.md) (e.g. if a Hugging Face model has a saved chat template in its `tokenizer_config.json`). However, you can also set a custom prompt template on your proxy in the `config.yaml`:
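
For reference, here is a minimal sketch of what such a `config.yaml` entry could look like. The model name, API base, and chat-format delimiters are placeholders, and the `initial_prompt_value` / `roles` / `final_prompt_value` / `bos_token` / `eos_token` keys are assumed template fields - verify the exact schema against the linked prompt formatting docs:
```yaml
# config.yaml -- sketch: custom prompt template for one proxied model
model_list:
  - model_name: mistral-7b                 # alias exposed by the proxy (placeholder)
    litellm_params:
      model: huggingface/mistralai/Mistral-7B-Instruct-v0.1
      api_base: your_api_base              # url where the model is deployed
      # custom prompt template (assumed keys)
      initial_prompt_value: "\n"
      roles:
        system:
          pre_message: "<|im_start|>system\n"
          post_message: "<|im_end|>\n"
        user:
          pre_message: "<|im_start|>user\n"
          post_message: "<|im_end|>\n"
        assistant:
          pre_message: "<|im_start|>assistant\n"
          post_message: "<|im_end|>\n"
      final_prompt_value: "\n"
      bos_token: "<s>"
      eos_token: "</s>"
```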
@@ -480,7 +453,6 @@ model_list:
api_base: your_api_base # url where model is deployed
```
## Proxy CLI Arguments
#### --host