mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 12:07:34 +00:00
feat: Add optional idempotency support to batches API (#3171)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 0s
Test Llama Stack Build / build-single-provider (push) Failing after 2s
Pre-commit / pre-commit (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Test Llama Stack Build / generate-matrix (push) Failing after 5s
Test Llama Stack Build / build (push) Has been skipped
Vector IO Integration Tests / test-matrix (push) Failing after 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s
Python Package Build Test / build (3.13) (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Update ReadTheDocs / update-readthedocs (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 7s
Unit Tests / unit-tests (3.13) (push) Failing after 5s
UI Tests / ui-tests (22) (push) Failing after 6s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 14s
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 0s
Test Llama Stack Build / build-single-provider (push) Failing after 2s
Pre-commit / pre-commit (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Test Llama Stack Build / generate-matrix (push) Failing after 5s
Test Llama Stack Build / build (push) Has been skipped
Vector IO Integration Tests / test-matrix (push) Failing after 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s
Python Package Build Test / build (3.13) (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Update ReadTheDocs / update-readthedocs (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 7s
Unit Tests / unit-tests (3.13) (push) Failing after 5s
UI Tests / ui-tests (22) (push) Failing after 6s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 14s
Implements optional idempotency for batch creation using `idem_tok` parameter: * **Core idempotency**: Same token + parameters returns existing batch * **Conflict detection**: Same token + different parameters raises HTTP 409 ConflictError * **Metadata order independence**: Different key ordering doesn't affect idempotency **API changes:** - Add optional `idem_tok` parameter to `create_batch()` method - Enhanced API documentation with idempotency extensions **Implementation:** - Reference provider supports idempotent batch creation - ConflictError for proper HTTP 409 status code mapping - Comprehensive parameter validation **Testing:** - Unit tests: focused tests covering core scenarios with parametrized conflict detection - Integration tests: tests validating real OpenAI client behavior This enables client-side retry safety and prevents duplicate batch creation when using the same idempotency token, following REST API closes #3144
This commit is contained in:
parent
7519b73fcc
commit
cffc4edf47
7 changed files with 351 additions and 64 deletions
|
@ -5,6 +5,7 @@
|
|||
# the root directory of this source tree.
|
||||
|
||||
import asyncio
|
||||
import hashlib
|
||||
import itertools
|
||||
import json
|
||||
import time
|
||||
|
@ -136,28 +137,45 @@ class ReferenceBatchesImpl(Batches):
|
|||
endpoint: str,
|
||||
completion_window: Literal["24h"],
|
||||
metadata: dict[str, str] | None = None,
|
||||
idempotency_key: str | None = None,
|
||||
) -> BatchObject:
|
||||
"""
|
||||
Create a new batch for processing multiple API requests.
|
||||
|
||||
Error handling by levels -
|
||||
0. Input param handling, results in 40x errors before processing, e.g.
|
||||
- Wrong completion_window
|
||||
- Invalid metadata types
|
||||
- Unknown endpoint
|
||||
-> no batch created
|
||||
1. Errors preventing processing, result in BatchErrors aggregated in process_batch, e.g.
|
||||
- input_file_id missing
|
||||
- invalid json in file
|
||||
- missing custom_id, method, url, body
|
||||
- invalid model
|
||||
- streaming
|
||||
-> batch created, validation sends to failed status
|
||||
2. Processing errors, result in error_file_id entries, e.g.
|
||||
- Any error returned from inference endpoint
|
||||
-> batch created, goes to completed status
|
||||
This implementation provides optional idempotency: when an idempotency key
|
||||
(idempotency_key) is provided, a deterministic ID is generated based on the input
|
||||
parameters. If a batch with the same parameters already exists, it will be
|
||||
returned instead of creating a duplicate. Without an idempotency key,
|
||||
each request creates a new batch with a unique ID.
|
||||
|
||||
Args:
|
||||
input_file_id: The ID of an uploaded file containing requests for the batch.
|
||||
endpoint: The endpoint to be used for all requests in the batch.
|
||||
completion_window: The time window within which the batch should be processed.
|
||||
metadata: Optional metadata for the batch.
|
||||
idempotency_key: Optional idempotency key for enabling idempotent behavior.
|
||||
|
||||
Returns:
|
||||
The created or existing batch object.
|
||||
"""
|
||||
|
||||
# Error handling by levels -
|
||||
# 0. Input param handling, results in 40x errors before processing, e.g.
|
||||
# - Wrong completion_window
|
||||
# - Invalid metadata types
|
||||
# - Unknown endpoint
|
||||
# -> no batch created
|
||||
# 1. Errors preventing processing, result in BatchErrors aggregated in process_batch, e.g.
|
||||
# - input_file_id missing
|
||||
# - invalid json in file
|
||||
# - missing custom_id, method, url, body
|
||||
# - invalid model
|
||||
# - streaming
|
||||
# -> batch created, validation sends to failed status
|
||||
# 2. Processing errors, result in error_file_id entries, e.g.
|
||||
# - Any error returned from inference endpoint
|
||||
# -> batch created, goes to completed status
|
||||
|
||||
# TODO: set expiration time for garbage collection
|
||||
|
||||
if endpoint not in ["/v1/chat/completions"]:
|
||||
|
@ -171,6 +189,35 @@ class ReferenceBatchesImpl(Batches):
|
|||
)
|
||||
|
||||
batch_id = f"batch_{uuid.uuid4().hex[:16]}"
|
||||
|
||||
# For idempotent requests, use the idempotency key for the batch ID
|
||||
# This ensures the same key always maps to the same batch ID,
|
||||
# allowing us to detect parameter conflicts
|
||||
if idempotency_key is not None:
|
||||
hash_input = idempotency_key.encode("utf-8")
|
||||
hash_digest = hashlib.sha256(hash_input).hexdigest()[:24]
|
||||
batch_id = f"batch_{hash_digest}"
|
||||
|
||||
try:
|
||||
existing_batch = await self.retrieve_batch(batch_id)
|
||||
|
||||
if (
|
||||
existing_batch.input_file_id != input_file_id
|
||||
or existing_batch.endpoint != endpoint
|
||||
or existing_batch.completion_window != completion_window
|
||||
or existing_batch.metadata != metadata
|
||||
):
|
||||
raise ConflictError(
|
||||
f"Idempotency key '{idempotency_key}' was previously used with different parameters. "
|
||||
"Either use a new idempotency key or ensure all parameters match the original request."
|
||||
)
|
||||
|
||||
logger.info(f"Returning existing batch with ID: {batch_id}")
|
||||
return existing_batch
|
||||
except ResourceNotFoundError:
|
||||
# Batch doesn't exist, continue with creation
|
||||
pass
|
||||
|
||||
current_time = int(time.time())
|
||||
|
||||
batch = BatchObject(
|
||||
|
@ -185,6 +232,7 @@ class ReferenceBatchesImpl(Batches):
|
|||
)
|
||||
|
||||
await self.kvstore.set(f"batch:{batch_id}", batch.to_json())
|
||||
logger.info(f"Created new batch with ID: {batch_id}")
|
||||
|
||||
if self.process_batches:
|
||||
task = asyncio.create_task(self._process_batch(batch_id))
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue