feat: Add optional idempotency support to batches API (#3171)

Implements optional idempotency for batch creation using the `idempotency_key` parameter:

* **Core idempotency**: Same token + parameters returns existing batch
* **Conflict detection**: Same token + different parameters raises HTTP 409 ConflictError
* **Metadata order independence**: Different key ordering doesn't affect idempotency

**API changes:**
- Add optional `idempotency_key` parameter to `create_batch()` method
- Enhanced API documentation with idempotency extensions

**Implementation:**
- Reference provider supports idempotent batch creation
- ConflictError for proper HTTP 409 status code mapping
- Comprehensive parameter validation

**Testing:**
- Unit tests: focused tests covering core scenarios with parametrized conflict detection
- Integration tests: tests validating real OpenAI client behavior

This enables client-side retry safety and prevents duplicate batch creation when using the same idempotency token, following REST API best practices.

closes #3144
2025-12-03 09:53:45 +00:00 · 2025-08-22 17:50:40 -05:00 · cffc4edf47
commit cffc4edf47
parent 7519b73fcc
7 changed files with 351 additions and 64 deletions
--- a/llama_stack/apis/batches/batches.py
+++ b/llama_stack/apis/batches/batches.py
@@ -29,12 +29,16 @@ class ListBatchesResponse(BaseModel):

 @runtime_checkable
 class Batches(Protocol):
-    """Protocol for batch processing API operations.
-
+    """
    The Batches API enables efficient processing of multiple requests in a single operation,
    particularly useful for processing large datasets, batch evaluation workflows, and
    cost-effective inference at scale.

+    The API is designed to allow use of openai client libraries for seamless integration.
+
+    This API provides the following extensions:
+     - idempotent batch creation
+
    Note: This API is currently under active development and may undergo changes.
    """

@@ -45,6 +49,7 @@ class Batches(Protocol):
        endpoint: str,
        completion_window: Literal["24h"],
        metadata: dict[str, str] | None = None,
+        idempotency_key: str | None = None,
    ) -> BatchObject:
        """Create a new batch for processing multiple API requests.

@@ -52,6 +57,7 @@ class Batches(Protocol):
        :param endpoint: The endpoint to be used for all requests in the batch.
        :param completion_window: The time window within which the batch should be processed.
        :param metadata: Optional metadata for the batch.
+        :param idempotency_key: Optional idempotency key. When provided, enables idempotent behavior.
        :returns: The created batch object.
        """
        ...
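
The docstring only says that `idempotency_key` "enables idempotent behavior", so a small sketch may help make the semantics from the commit message concrete. This is not the reference provider's code: `InMemoryBatches`, the `ConflictError` stand-in, the `_fingerprint` helper, the dict-shaped batch record, and the `input_file_id` parameter (which is not visible in this hunk) are all assumptions made for illustration.

```python
import hashlib
import json
import uuid


class ConflictError(Exception):
    """Hypothetical stand-in for the error that maps to HTTP 409."""


class InMemoryBatches:
    """Illustrative in-memory store; not the reference provider."""

    def __init__(self):
        self._batches = {}  # batch id -> batch record
        self._by_key = {}   # idempotency_key -> (fingerprint, batch id)

    @staticmethod
    def _fingerprint(input_file_id, endpoint, completion_window, metadata):
        # sort_keys=True makes the hash independent of metadata key order.
        payload = {
            "input_file_id": input_file_id,
            "endpoint": endpoint,
            "completion_window": completion_window,
            "metadata": metadata,
        }
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def create_batch(self, input_file_id, endpoint, completion_window,
                     metadata=None, idempotency_key=None):
        fingerprint = self._fingerprint(
            input_file_id, endpoint, completion_window, metadata
        )
        if idempotency_key is not None and idempotency_key in self._by_key:
            stored_fingerprint, batch_id = self._by_key[idempotency_key]
            if stored_fingerprint != fingerprint:
                # Same key, different parameters: report a conflict.
                raise ConflictError(
                    f"idempotency_key {idempotency_key!r} was used with different parameters"
                )
            # Same key, same parameters: return the existing batch.
            return self._batches[batch_id]

        batch = {
            "id": f"batch_{uuid.uuid4().hex}",
            "input_file_id": input_file_id,
            "endpoint": endpoint,
            "completion_window": completion_window,
            "metadata": metadata,
        }
        self._batches[batch["id"]] = batch
        if idempotency_key is not None:
            self._by_key[idempotency_key] = (fingerprint, batch["id"])
        return batch
```

In this sketch a retry with the same key and the same parameters returns the original record, a retry with the same key and different parameters raises `ConflictError`, and calls without a key always create a new batch. Hashing a canonical JSON form is one way to get metadata order independence; the actual provider may compare parameters differently.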