llama-stack-mirror/llama_stack/providers/utils/inference
Ashwin Bharambe f0b2ca1ebc fix: enable SQLite WAL mode to prevent database locking errors (#4048)
Fixes race condition causing "database is locked" errors during
concurrent writes to SQLite, particularly in streaming responses with
guardrails where multiple inference calls write simultaneously.

Enable Write-Ahead Logging (WAL) mode for SQLite which allows multiple
concurrent readers and one writer without blocking. Set busy_timeout to
5s so SQLite retries instead of failing immediately. Remove the logic
that disabled write queues for SQLite since WAL mode eliminates the
locking issues that prompted disabling them.

Fixes: test_output_safety_guardrails_safe_content[stream=True] flake
(cherry picked from commit 2381714904)
Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-11-24 14:10:20 -05:00
..
__init__.py chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
embedding_mixin.py fix(inference): enable routing of models with provider_data alone (backport #3928) (#4142) 2025-11-12 13:41:27 -08:00
inference_store.py fix: enable SQLite WAL mode to prevent database locking errors (#4048) 2025-11-24 14:10:20 -05:00
litellm_openai_mixin.py feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) 2025-10-12 19:01:52 -07:00
model_registry.py feat: use SecretStr for inference provider auth credentials (#3724) 2025-10-10 07:32:50 -07:00
openai_compat.py fix: Update watsonx.ai provider to use LiteLLM mixin and list all models (#3674) 2025-10-08 07:29:43 -04:00
openai_mixin.py fix(inference): enable routing of models with provider_data alone (backport #3928) (#4142) 2025-11-12 13:41:27 -08:00
prompt_adapter.py chore!: Safety api refactoring to use OpenAIMessageParam (#3796) 2025-10-12 08:01:00 -07:00