fix(proxy/rules.md): add docs on setting post-call rules on the proxy
parent 54653f9a4a
commit 0f7d03f761

3 changed files with 44 additions and 0 deletions

43 docs/my-website/docs/proxy/rules.md Normal file

@@ -0,0 +1,43 @@

# Post-Call Rules

Use this to fail a request based on the output of an LLM API call.

## Quick Start

### Step 1: Create a file (e.g. post_call_rules.py)

```python
def my_custom_rule(input):  # receives the model response
    if len(input) < 5:  # trigger fallback if the model response is too short
        return False
    return True
```
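
Because the rule is plain Python, it can encode any failure condition you want. Here is a minimal sketch of a slightly richer rule; the refusal-phrase list and its contents are assumptions made up for this example, not values litellm ships with:

```python
# post_call_rules.py - a richer rule sketch. The phrase list below is an
# illustrative assumption for this example, not something litellm provides.
REFUSAL_PHRASES = ["i'm sorry", "i cannot help"]

def my_custom_rule(input):  # receives the model response as a string
    text = input.lower()
    if len(text) < 5:  # too short to be a useful answer
        return False
    if any(phrase in text for phrase in REFUSAL_PHRASES):  # looks like a refusal
        return False
    return True
```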

### Step 2. Point it to your proxy

In your proxy `config.yaml`:

```yaml
litellm_settings:
  post_call_rules: post_call_rules.my_custom_rule
  num_retries: 3
```
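
For context, here is a sketch of a complete `config.yaml` with these settings in place; the `model_list` entry is a hypothetical placeholder, swap in your own model and credentials:

```yaml
# Hypothetical complete config.yaml - the model_list entry is a placeholder
# for this example; use your own deployment details.
model_list:
  - model_name: deepseek-coder
    litellm_params:
      model: deepseek/deepseek-coder
      api_key: os.environ/DEEPSEEK_API_KEY

litellm_settings:
  post_call_rules: post_call_rules.my_custom_rule
  num_retries: 3
```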

### Step 3. Start + test your proxy

```bash
$ litellm /path/to/config.yaml
```

```bash
curl --location 'http://0.0.0.0:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-1234' \
--data '{
    "model": "deepseek-coder",
    "messages": [{"role": "user", "content": "What llm are you?"}],
    "temperature": 0.7,
    "max_tokens": 10
}'
```
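
Since the proxy speaks the OpenAI format, you can issue the same test from Python. A minimal sketch, assuming the openai v1 client and the key/URL from the curl example above:

```python
# Sketch: send the same request through the openai Python client (v1+).
# The base_url and api_key mirror the curl example; adjust to your setup.
from openai import OpenAI

client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:8000")

response = client.chat.completions.create(
    model="deepseek-coder",
    messages=[{"role": "user", "content": "What llm are you?"}],
    temperature=0.7,
    max_tokens=10,
)
print(response.choices[0].message.content)
```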

---

This will now check that each response is at least 5 characters long; if the check fails, the call is retried up to 3 times before an error is returned.

@@ -112,6 +112,7 @@ const sidebars = {
         "proxy/reliability",
         "proxy/health",
         "proxy/call_hooks",
+        "proxy/rules",
         "proxy/caching",
         "proxy/alerting",
         "proxy/logging",