chore: introduce write queue for response_store (#3497)

# What does this PR do?
Mirrors the changes made for inference_store:
https://github.com/llamastack/llama-stack/pull/3383

Will follow up with a shared internal API for managing these write
queues.
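
For readers unfamiliar with the pattern, here is a minimal sketch of a bounded write queue drained by a fixed number of background writers. It is not the code from this PR: the `WriteQueue` class, its methods, and `fake_write` are hypothetical names made up for illustration. Only the two knobs, `max_write_queue_size` and `num_writers`, come from the config added in this change.

```python
import asyncio


class WriteQueue:
    """Hypothetical bounded write queue; illustrative only, not the PR's code."""

    def __init__(self, write_fn, max_write_queue_size=10000, num_writers=4):
        self._write_fn = write_fn  # async callable that persists one item
        self._max_size = max_write_queue_size
        self._num_writers = num_writers
        self._queue = None
        self._workers = []

    async def start(self):
        # Create the queue and the workers inside the running event loop.
        self._queue = asyncio.Queue(maxsize=self._max_size)
        self._workers = [
            asyncio.create_task(self._worker()) for _ in range(self._num_writers)
        ]

    async def enqueue(self, item):
        # Awaits when the queue is full, which applies backpressure to callers.
        await self._queue.put(item)

    async def _worker(self):
        while True:
            item = await self._queue.get()
            try:
                await self._write_fn(item)
            except Exception as exc:
                # A failed write must not kill the worker.
                print(f"write failed: {exc}")
            finally:
                self._queue.task_done()

    async def flush(self):
        # Returns once every queued item has been processed.
        await self._queue.join()

    async def shutdown(self):
        await self.flush()
        for worker in self._workers:
            worker.cancel()
        await asyncio.gather(*self._workers, return_exceptions=True)
        self._workers = []


async def demo():
    stored = []

    async def fake_write(item):
        # Stand-in for a database write (assumption for this sketch).
        await asyncio.sleep(0)
        stored.append(item)

    queue = WriteQueue(fake_write, max_write_queue_size=8, num_writers=2)
    await queue.start()
    for i in range(20):
        await queue.enqueue(i)
    await queue.shutdown()
    return sorted(stored)
```

In this sketch a full queue makes `enqueue` wait rather than drop the write. Whether the real store blocks, drops, or falls back to a synchronous write when the queue is full is a design choice not described here.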

## Test Plan
existing tests
ehhuang 2025-09-29 10:36:16 -07:00 committed by GitHub
parent 7c466a7ec5
commit 8ab6684a94
4 changed files with 136 additions and 7 deletions

@ -433,6 +433,12 @@ class InferenceStoreConfig(BaseModel):
    num_writers: int = Field(default=4, description="Number of concurrent background writers")


class ResponsesStoreConfig(BaseModel):
    sql_store_config: SqlStoreConfig
    max_write_queue_size: int = Field(default=10000, description="Max queued writes for responses store")
    num_writers: int = Field(default=4, description="Number of concurrent background writers")


class StackRunConfig(BaseModel):
    version: int = LLAMA_STACK_RUN_CONFIG_VERSION