docs(batches.md): add loadbalancing multiple azure deployments on batches api to docs
### [👉 Health Check Azure Batch models](./proxy/health.md#batch-models-azure-only)

### [BETA] Loadbalance Multiple Azure Deployments

In your config.yaml, set `enable_loadbalancing_on_batch_endpoints: true`
```yaml
model_list:
  - model_name: "batch-gpt-4o-mini"
    litellm_params:
      model: "azure/gpt-4o-mini"
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
    model_info:
      mode: batch

litellm_settings:
  enable_loadbalancing_on_batch_endpoints: true # 👈 KEY CHANGE
```
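Load balancing applies when several entries share the same `model_name`. A sketch of the same config with a hypothetical second Azure deployment added (the `AZURE_API_KEY_2` / `AZURE_API_BASE_2` environment variable names are assumptions, not part of the original example):

```yaml
model_list:
  - model_name: "batch-gpt-4o-mini"
    litellm_params:
      model: "azure/gpt-4o-mini"
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
    model_info:
      mode: batch
  - model_name: "batch-gpt-4o-mini" # same model_name -> requests are load balanced across both
    litellm_params:
      model: "azure/gpt-4o-mini"
      api_key: os.environ/AZURE_API_KEY_2
      api_base: os.environ/AZURE_API_BASE_2
    model_info:
      mode: batch

litellm_settings:
  enable_loadbalancing_on_batch_endpoints: true
```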

Note: This works on `{PROXY_BASE_URL}/v1/files` and `{PROXY_BASE_URL}/v1/batches`.

Note: Responses are in the OpenAI format.

1. Upload a file

Just set `model: batch-gpt-4o-mini` in your .jsonl.

```bash
curl http://localhost:4000/v1/files \
  -H "Authorization: Bearer sk-1234" \
  -F purpose="batch" \
  -F file="@mydata.jsonl"
```

**Example File**

Note: `model` should match the `model_name` set in your config.yaml (here, `batch-gpt-4o-mini`).

```json
{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}
```
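For larger jobs it is easier to generate the input file programmatically than to hand-write each line. A minimal stdlib-only sketch that writes the same three tasks as the example above (the file name `mydata.jsonl` and the questions are taken from the example; everything else is illustrative):

```python
import json

# Each line of the batch input file is one chat/completions request.
# "model" matches the model_name from config.yaml ("batch-gpt-4o-mini").
questions = [
    "When was Microsoft founded?",
    "When was the first XBOX released?",
    "What is Altair Basic?",
]

with open("mydata.jsonl", "w") as f:
    for i, q in enumerate(questions):
        task = {
            "custom_id": f"task-{i}",
            "method": "POST",
            "url": "/chat/completions",
            "body": {
                "model": "batch-gpt-4o-mini",
                "messages": [
                    {"role": "system", "content": "You are an AI assistant that helps people find information."},
                    {"role": "user", "content": q},
                ],
            },
        }
        # One JSON object per line, as the batches endpoint expects.
        f.write(json.dumps(task) + "\n")
```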

Expected Response (OpenAI-compatible)

```bash
{"id":"file-f0be81f654454113a922da60acb0eea6",...}
```

2. Create a batch

```bash
curl http://0.0.0.0:4000/v1/batches \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-f0be81f654454113a922da60acb0eea6",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "model": "batch-gpt-4o-mini"
  }'
```
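Hand-writing the `-d` body invites quoting slips (a misplaced quote around a key makes the JSON invalid). One way to avoid that is to build the payload with `json.dumps`; a sketch reusing the ids from the example above:

```python
import json

# Building the create-batch body programmatically guarantees valid JSON.
payload = json.dumps({
    "input_file_id": "file-f0be81f654454113a922da60acb0eea6",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "model": "batch-gpt-4o-mini",  # model_name from config.yaml
})
print(payload)
```

The resulting string can be passed as the request body in place of the inline `-d '{...}'` above.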

Expected Response:

```bash
{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
```

3. Retrieve a batch

```bash
curl http://0.0.0.0:4000/v1/batches/batch_94e43f0a-d805-477d-adf9-bbb9c50910ed \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json"
```

Expected Response:

```bash
{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
```
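Batches run asynchronously, so the retrieve call is typically repeated until the batch's `status` field reaches a terminal value. A minimal polling sketch: `wait_for_batch` and `fetch_batch` are hypothetical helpers (not part of LiteLLM), `fetch_batch` stands in for the GET request above, and the terminal statuses are the standard OpenAI batch statuses:

```python
import time

def wait_for_batch(fetch_batch, batch_id, interval_s=2.0, timeout_s=600.0):
    """Poll fetch_batch(batch_id) until the batch reaches a terminal status."""
    terminal = {"completed", "failed", "expired", "cancelled"}
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        batch = fetch_batch(batch_id)  # returns the batch object as a dict
        if batch["status"] in terminal:
            return batch
        time.sleep(interval_s)
    raise TimeoutError(f"batch {batch_id} not done after {timeout_s}s")
```

In practice `fetch_batch` would issue the authenticated GET shown above and parse the JSON response.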

4. List batches

```bash
curl http://0.0.0.0:4000/v1/batches?limit=2 \
  -H "Authorization: Bearer $LITELLM_API_KEY" \
  -H "Content-Type: application/json"
```

Expected Response:

```bash
{"data":[{"id":"batch_R3V...}
```