diff --git a/docs/my-website/docs/batches.md b/docs/my-website/docs/batches.md
index 898738f63..144873928 100644
--- a/docs/my-website/docs/batches.md
+++ b/docs/my-website/docs/batches.md
@@ -222,4 +222,101 @@ curl http://0.0.0.0:4000/v1/batches?limit=2 \
     -H "Content-Type: application/json"
 ```
 
-### [👉 Health Check Azure Batch models](./proxy/health.md#batch-models-azure-only)
\ No newline at end of file
+### [👉 Health Check Azure Batch models](./proxy/health.md#batch-models-azure-only)
+
+
+### [BETA] Loadbalance Multiple Azure Deployments
+In your config.yaml, set `enable_loadbalancing_on_batch_endpoints: true`
+
+```yaml
+model_list:
+  - model_name: "batch-gpt-4o-mini"
+    litellm_params:
+      model: "azure/gpt-4o-mini"
+      api_key: os.environ/AZURE_API_KEY
+      api_base: os.environ/AZURE_API_BASE
+    model_info:
+      mode: batch
+
+litellm_settings:
+  enable_loadbalancing_on_batch_endpoints: true # 👈 KEY CHANGE
+```
+
+Note: This works on `{PROXY_BASE_URL}/v1/files` and `{PROXY_BASE_URL}/v1/batches`.
+Note: The response is in the OpenAI format.
+
+1. Upload a file
+
+Just set `model: batch-gpt-4o-mini` in your .jsonl.
+
+```bash
+curl http://localhost:4000/v1/files \
+    -H "Authorization: Bearer sk-1234" \
+    -F purpose="batch" \
+    -F file="@mydata.jsonl"
+```
+
+**Example File**
+
+Note: `model` should be the `model_name` from your config.yaml (here, `batch-gpt-4o-mini`), as in the example below.
+
+```json
+{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
+{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
+{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}
+```
+
+Expected Response (OpenAI-compatible):
+
+```bash
+{"id":"file-f0be81f654454113a922da60acb0eea6",...}
+```
+
+2. Create a batch
+
+```bash
+curl http://0.0.0.0:4000/v1/batches \
+    -H "Authorization: Bearer $LITELLM_API_KEY" \
+    -H "Content-Type: application/json" \
+    -d '{
+        "input_file_id": "file-f0be81f654454113a922da60acb0eea6",
+        "endpoint": "/v1/chat/completions",
+        "completion_window": "24h",
+        "model": "batch-gpt-4o-mini"
+    }'
+```
+
+Expected Response:
+
+```bash
+{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
+```
+
+3. Retrieve a batch
+
+```bash
+curl http://0.0.0.0:4000/v1/batches/batch_94e43f0a-d805-477d-adf9-bbb9c50910ed \
+    -H "Authorization: Bearer $LITELLM_API_KEY" \
+    -H "Content-Type: application/json"
+```
+
+
+Expected Response:
+
+```
+{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
+```
+
+4. List batches
+
+```bash
+curl http://0.0.0.0:4000/v1/batches?limit=2 \
+    -H "Authorization: Bearer $LITELLM_API_KEY" \
+    -H "Content-Type: application/json"
+```
+
+Expected Response:
+
+```bash
+{"data":[{"id":"batch_R3V...}
+```
\ No newline at end of file
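
The upload step in this diff relies on a `.jsonl` file where every request's `model` field carries the proxy alias (`batch-gpt-4o-mini`). As a minimal sketch of how such a file could be generated, assuming the request shape shown in the example file, the helpers below build one JSON object per line. The names `build_batch_lines` and `write_batch_file` are hypothetical and not part of LiteLLM; the alias constant is assumed to match `model_name` in your config.yaml.

```python
import json

# Assumption: this alias matches `model_name` in the proxy's config.yaml.
PROXY_MODEL_NAME = "batch-gpt-4o-mini"
SYSTEM_PROMPT = "You are an AI assistant that helps people find information."


def build_batch_lines(questions, model=PROXY_MODEL_NAME):
    """Return one JSON string per request, shaped like the example file."""
    lines = []
    for i, question in enumerate(questions):
        request = {
            "custom_id": f"task-{i}",  # unique id used to match results to requests
            "method": "POST",
            "url": "/chat/completions",
            "body": {
                "model": model,
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": question},
                ],
            },
        }
        lines.append(json.dumps(request))
    return lines


def write_batch_file(path, questions, model=PROXY_MODEL_NAME):
    """Write the requests to `path`, one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for line in build_batch_lines(questions, model=model):
            f.write(line + "\n")
```

Calling `write_batch_file("mydata.jsonl", ["When was Microsoft founded?"])` would produce a file that can be passed to the `curl ... -F file="@mydata.jsonl"` upload command from step 1.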