Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 09:53:45 +00:00)
SQL stores now share a single SqlAlchemySqlStoreImpl per backend, and kvstore_impl caches instances per (backend, namespace). This avoids opening multiple SQLite connections to the same file, which reduces lock contention and aligns the caching story across all backends (see the first sketch below).

Added an async upsert API (with SQLite- and Postgres-dialect inserts) and routed it through AuthorizedSqlStore, then switched conversations and responses to call it. Using native ON CONFLICT DO UPDATE eliminates the insert-then-update retry window that previously caused long WAL lock retries (see the upsert sketch below).

Introduced an opt-in conversation stress test that mirrors the recorded prompts from test_conversation_multi_turn_and_streaming while fanning them out across many threads. This gives us a fast local way to hammer the conversations/responses sync path when investigating lockups (a sketch follows the file listing below).
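The per-backend sharing boils down to a keyed instance cache. The following is a minimal sketch of that pattern, assuming hypothetical factory callables and key shapes rather than the actual llama-stack helpers:

```python
from threading import Lock
from typing import Any, Callable

# Illustrative module-level cache; the real implementation lives inside the
# sqlstore/kvstore factories and uses backend config objects as keys.
_instances: dict[tuple, Any] = {}
_lock = Lock()


def get_sql_store(backend_key: str, create: Callable[[], Any]) -> Any:
    """Return one shared SQL store per backend key, so every caller reuses the
    same SQLite connection/engine instead of opening a new one per request."""
    with _lock:
        key = ("sql", backend_key)
        if key not in _instances:
            _instances[key] = create()
        return _instances[key]


def get_kv_store(backend_key: str, namespace: str, create: Callable[[str], Any]) -> Any:
    """KV stores are cached per (backend, namespace): distinct namespaces keep
    their own store object while sharing the underlying backend connection."""
    with _lock:
        key = ("kv", backend_key, namespace)
        if key not in _instances:
            _instances[key] = create(namespace)
        return _instances[key]
```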
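The dialect-specific upsert can be expressed with SQLAlchemy's native ON CONFLICT support on both SQLite and Postgres. This is a minimal sketch assuming an illustrative table schema, not the actual conversations/responses schema or the AuthorizedSqlStore wiring:

```python
from sqlalchemy import Column, MetaData, String, Table
from sqlalchemy.dialects.postgresql import insert as pg_insert
from sqlalchemy.dialects.sqlite import insert as sqlite_insert
from sqlalchemy.ext.asyncio import AsyncEngine

metadata = MetaData()
conversations = Table(
    "conversations",
    metadata,
    Column("id", String, primary_key=True),
    Column("payload", String),
)


async def upsert_conversation(engine: AsyncEngine, conv_id: str, payload: str) -> None:
    """Insert or update in one statement via native ON CONFLICT DO UPDATE,
    avoiding the insert-then-update retry window described above."""
    values = {"id": conv_id, "payload": payload}
    if engine.dialect.name == "postgresql":
        stmt = pg_insert(conversations).values(**values)
    else:  # sqlite
        stmt = sqlite_insert(conversations).values(**values)
    stmt = stmt.on_conflict_do_update(index_elements=["id"], set_={"payload": payload})
    async with engine.begin() as conn:
        await conn.execute(stmt)
```

Because the conflict resolution happens inside a single statement, concurrent writers never hit the window where a failed INSERT has to be retried as an UPDATE while another connection holds the SQLite WAL lock.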
| Name |
|---|
| fixtures |
| recordings |
| __init__.py |
| helpers.py |
| streaming_assertions.py |
| test_basic_responses.py |
| test_conversation_responses.py |
| test_conversation_stress.py |
| test_file_search.py |
| test_tool_responses.py |
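For orientation, a hedged sketch of the fan-out shape used by test_conversation_stress.py follows. The openai_client fixture, the environment-variable gate, the prompt list, and the model id are assumptions for illustration; the real test replays the recorded prompts from test_conversation_multi_turn_and_streaming.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import pytest

# Opt-in switch and prompts are placeholders, not the actual test contents.
STRESS_ENABLED = os.environ.get("LLAMA_STACK_STRESS") == "1"
PROMPTS = ["tell me a joke", "now explain why it is funny", "summarize our chat"]
MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder model id


@pytest.mark.skipif(not STRESS_ENABLED, reason="opt-in stress test; set LLAMA_STACK_STRESS=1")
def test_conversation_stress(openai_client):
    def run_one_conversation(_worker: int) -> None:
        # Each worker drives its own conversation through the responses API,
        # exercising the conversations/responses sync path concurrently.
        conversation = openai_client.conversations.create()
        for prompt in PROMPTS:
            openai_client.responses.create(
                model=MODEL_ID,
                input=prompt,
                conversation=conversation.id,
            )

    # Fan the same prompts out across many threads to surface lock contention.
    with ThreadPoolExecutor(max_workers=32) as pool:
        list(pool.map(run_one_conversation, range(64)))
```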