forked from phoenix/litellm-mirror
docs(batches.md): add loadbalancing multiple azure deployments on batches api to docs
parent ab6ddd1a49
commit cdfea7e5ae
1 changed files with 98 additions and 1 deletions
@ -222,4 +222,101 @@ curl http://0.0.0.0:4000/v1/batches?limit=2 \
    -H "Content-Type: application/json"
```

### [👉 Health Check Azure Batch models](./proxy/health.md#batch-models-azure-only)

### [BETA] Loadbalance Multiple Azure Deployments

To load balance batch requests across multiple Azure deployments, set `enable_loadbalancing_on_batch_endpoints: true` under `litellm_settings` in your config.yaml:

```yaml
model_list:
  - model_name: "batch-gpt-4o-mini"
    litellm_params:
      model: "azure/gpt-4o-mini"
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
    model_info:
      mode: batch

litellm_settings:
  enable_loadbalancing_on_batch_endpoints: true # 👈 KEY CHANGE
```

Note: This works on `{PROXY_BASE_URL}/v1/files` and `{PROXY_BASE_URL}/v1/batches`.

Note: Responses are in the OpenAI format.
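
The config above lists a single deployment. As a minimal sketch of what a load-balanced pool could look like, the snippet below mirrors that config as a Python dict and adds a second entry under the same `model_name`. The second entry and its environment variable names (`AZURE_API_KEY_2`, `AZURE_API_BASE_2`) are hypothetical, and `batch_pools` is an illustrative helper, not LiteLLM's router code.

```python
from collections import defaultdict

# Mirrors the YAML config as a dict. The second entry and its environment
# variable names are hypothetical, added only to illustrate a two-deployment pool.
config = {
    "model_list": [
        {
            "model_name": "batch-gpt-4o-mini",
            "litellm_params": {
                "model": "azure/gpt-4o-mini",
                "api_key": "os.environ/AZURE_API_KEY",
                "api_base": "os.environ/AZURE_API_BASE",
            },
            "model_info": {"mode": "batch"},
        },
        {
            "model_name": "batch-gpt-4o-mini",
            "litellm_params": {
                "model": "azure/gpt-4o-mini",
                "api_key": "os.environ/AZURE_API_KEY_2",
                "api_base": "os.environ/AZURE_API_BASE_2",
            },
            "model_info": {"mode": "batch"},
        },
    ],
    "litellm_settings": {"enable_loadbalancing_on_batch_endpoints": True},
}


def batch_pools(cfg):
    """Group batch-mode deployments by model_name (illustrative helper)."""
    pools = defaultdict(list)
    for entry in cfg["model_list"]:
        if entry.get("model_info", {}).get("mode") == "batch":
            pools[entry["model_name"]].append(entry["litellm_params"]["api_base"])
    return dict(pools)


pools = batch_pools(config)
print(pools)
```

Entries that share a `model_name` end up in the same pool, which is why requests only need to reference `batch-gpt-4o-mini` rather than a specific deployment.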

1. Upload a file

Set `model` to `batch-gpt-4o-mini` in each line of your `.jsonl` file.

```bash
curl http://localhost:4000/v1/files \
    -H "Authorization: Bearer sk-1234" \
    -F purpose="batch" \
    -F file="@mydata.jsonl"
```

**Example File**

Note: `model` should match the `model_name` from your config.yaml (`batch-gpt-4o-mini` here).

```json
{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}
```

Expected Response (OpenAI-compatible):

```bash
{"id":"file-f0be81f654454113a922da60acb0eea6",...}
```
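
The example file can also be generated rather than written by hand. The following is a minimal sketch using only the Python standard library; `build_batch_lines` is a hypothetical helper name, and the request shape simply copies the example file.

```python
import json

SYSTEM_PROMPT = "You are an AI assistant that helps people find information."


def build_batch_lines(questions, model="batch-gpt-4o-mini"):
    """Return one JSON string per request, shaped like the example file."""
    lines = []
    for i, question in enumerate(questions):
        request = {
            "custom_id": f"task-{i}",
            "method": "POST",
            "url": "/chat/completions",
            "body": {
                "model": model,
                "messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": question},
                ],
            },
        }
        lines.append(json.dumps(request))
    return lines


questions = [
    "When was Microsoft founded?",
    "When was the first XBOX released?",
    "What is Altair Basic?",
]
lines = build_batch_lines(questions)

# JSONL means one JSON document per line, with a trailing newline.
jsonl_text = "\n".join(lines) + "\n"
print(jsonl_text)
```

Writing `jsonl_text` to `mydata.jsonl` produces a file that can be passed to the upload command in step 1.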

2. Create a batch

```bash
curl http://0.0.0.0:4000/v1/batches \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "input_file_id": "file-f0be81f654454113a922da60acb0eea6",
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
        "model": "batch-gpt-4o-mini"
    }'
```

Expected Response:

```bash
{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
```
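
Hand-written JSON bodies are easy to get wrong (a single missing quote makes the request invalid), so it can help to build the body programmatically. Below is a minimal sketch with the Python standard library that assembles the same request as the curl command; `build_create_batch_request` is a hypothetical helper, and the base URL and key are the placeholder values used elsewhere on this page.

```python
import json
import urllib.request


def build_create_batch_request(base_url, api_key, input_file_id, model="batch-gpt-4o-mini"):
    """Build (but do not send) the POST /v1/batches request from the curl example."""
    payload = {
        "input_file_id": input_file_id,
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
        "model": model,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/batches",
        data=json.dumps(payload).encode("utf-8"),  # json.dumps guarantees valid quoting
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_batch_request(
    "http://0.0.0.0:4000",
    "sk-1234",
    "file-f0be81f654454113a922da60acb0eea6",
)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running proxy; the sketch stops short of that so it runs without one.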

3. Retrieve a batch

```bash
curl http://0.0.0.0:4000/v1/batches/batch_94e43f0a-d805-477d-adf9-bbb9c50910ed \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    -H "Content-Type: application/json"
```

Expected Response:

```bash
{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
```
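
Batches run asynchronously, so the retrieve call is usually repeated until the batch finishes. The sketch below shows one way to poll, assuming the response carries the `status` field of the OpenAI Batch object (the response above is truncated, so this is an assumption). `wait_for_batch` and `fetch_batch` are hypothetical names; `fetch_batch` stands in for whatever function performs the retrieve request and returns the parsed JSON.

```python
import time

# Terminal statuses of the OpenAI Batch object.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch_batch, batch_id, poll_seconds=30.0, max_polls=100, sleep=time.sleep):
    """Poll until the batch reaches a terminal status or max_polls is reached.

    fetch_batch is a hypothetical callable: batch_id -> parsed response dict.
    """
    batch = fetch_batch(batch_id)
    for _ in range(max_polls - 1):
        if batch.get("status") in TERMINAL_STATUSES:
            break
        sleep(poll_seconds)
        batch = fetch_batch(batch_id)
    return batch


# Demo with canned responses instead of a live proxy.
BATCH_ID = "batch_94e43f0a-d805-477d-adf9-bbb9c50910ed"
canned = iter([
    {"id": BATCH_ID, "status": "validating"},
    {"id": BATCH_ID, "status": "in_progress"},
    {"id": BATCH_ID, "status": "completed"},
])


def fake_fetch(batch_id):
    return next(canned)


result = wait_for_batch(fake_fetch, BATCH_ID, sleep=lambda seconds: None)
print(result["status"])
```

Injecting `fetch_batch` and `sleep` keeps the loop testable without network access or real delays.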

4. List batches

```bash
curl "http://0.0.0.0:4000/v1/batches?limit=2" \
    -H "Authorization: Bearer $LITELLM_API_KEY" \
    -H "Content-Type: application/json"
```

Expected Response:

```bash
{"data":[{"id":"batch_R3V...}
```