mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-18 09:29:48 +00:00

Matthew Farrellee 68877f331e feat: Add optional idempotency support to batches API

Implements optional idempotency for batch creation using `idem_tok` parameter:

* **Core idempotency**: Same token + parameters returns existing batch
* **Conflict detection**: Same token + different parameters raises HTTP 409 ConflictError
* **Metadata order independence**: Different key ordering doesn't affect idempotency

**API changes:**
- Add optional `idem_tok` parameter to `create_batch()` method
- Enhanced API documentation with idempotency extensions

**Implementation:**
- Reference provider supports idempotent batch creation
- ConflictError for proper HTTP 409 status code mapping
- Comprehensive parameter validation

**Testing:**
- Unit tests: focused tests covering core scenarios with parametrized conflict detection
- Integration tests: tests validating real OpenAI client behavior

This enables client-side retry safety and prevents duplicate batch creation
when using the same idempotency token, following REST API

2025-08-08 08:08:08 -04:00

648 B


Batches

Overview

The Batches API enables efficient processing of multiple requests in a single operation, particularly useful for processing large datasets, batch evaluation workflows, and cost-effective inference at scale.

The API is designed to work with the OpenAI client libraries for seamless integration.

This API provides the following extensions:
 - idempotent batch creation
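
The idempotency behavior described in the commit message above (same token and parameters returns the existing batch, same token with different parameters is a conflict, metadata key order is ignored) can be illustrated with a minimal in-memory sketch. This is not the reference provider's implementation. The `InMemoryBatches` class and the local `ConflictError` are hypothetical stand-ins, and the `create_batch()` parameters other than `idem_tok` are assumed to follow the OpenAI batch creation fields.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


class ConflictError(Exception):
    """Hypothetical stand-in for an error that a server would map to HTTP 409."""

    status_code = 409


@dataclass
class Batch:
    id: str
    input_file_id: str
    endpoint: str
    completion_window: str
    metadata: Optional[dict] = None


class InMemoryBatches:
    """Illustrative store only; a real provider would persist batches."""

    def __init__(self) -> None:
        # Maps idempotency token -> (creation parameters, created batch).
        self._by_token: dict = {}

    def create_batch(
        self,
        input_file_id: str,
        endpoint: str,
        completion_window: str,
        metadata: Optional[dict] = None,
        idem_tok: Optional[str] = None,  # optional idempotency token
    ) -> Batch:
        params = {
            "input_file_id": input_file_id,
            "endpoint": endpoint,
            "completion_window": completion_window,
            # Copy so later caller-side mutation cannot change the stored parameters.
            "metadata": dict(metadata) if metadata is not None else None,
        }
        if idem_tok is not None and idem_tok in self._by_token:
            previous_params, batch = self._by_token[idem_tok]
            # Dict equality ignores key order, so metadata ordering does not matter.
            if previous_params != params:
                raise ConflictError(
                    f"idem_tok {idem_tok!r} was already used with different parameters"
                )
            return batch
        batch = Batch(id=f"batch_{uuid.uuid4().hex}", **params)
        if idem_tok is not None:
            self._by_token[idem_tok] = (params, batch)
        return batch


# Usage: a retried request with the same token yields the same batch.
batches = InMemoryBatches()
first = batches.create_batch(
    "file-1", "/v1/chat/completions", "24h", {"a": "1", "b": "2"}, idem_tok="tok-1"
)
retry = batches.create_batch(
    "file-1", "/v1/chat/completions", "24h", {"b": "2", "a": "1"}, idem_tok="tok-1"
)
```

In this sketch, omitting `idem_tok` always creates a new batch, which matches the "optional" nature of the extension: clients that do not send a token see unchanged behavior.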

Note: This API is currently under active development and may undergo changes.

This section contains documentation for all available providers for the batches API.

Providers

:maxdepth: 1

inline_reference
