mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-03 09:53:45 +00:00
fix: Vector store persistence across server restarts (#3977)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Python Package Build Test / build (3.13) (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Integration Tests (Replay) / generate-matrix (push) Successful in 21s
Unit Tests / unit-tests (3.12) (push) Failing after 18s
Pre-commit / pre-commit (push) Failing after 23s
Test External API and Providers / test-external (venv) (push) Failing after 22s
API Conformance Tests / check-schema-compatibility (push) Successful in 30s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 20s
UI Tests / ui-tests (22) (push) Successful in 1m10s
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 8s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Python Package Build Test / build (3.13) (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Integration Tests (Replay) / generate-matrix (push) Successful in 21s
Unit Tests / unit-tests (3.12) (push) Failing after 18s
Pre-commit / pre-commit (push) Failing after 23s
Test External API and Providers / test-external (venv) (push) Failing after 22s
API Conformance Tests / check-schema-compatibility (push) Successful in 30s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 20s
UI Tests / ui-tests (22) (push) Successful in 1m10s
# What does this PR do?
This PR fixes a bug in LlamaStack 0.3.0 where vector stores created via
the OpenAI-compatible API (`POST /v1/vector_stores`) would fail with
`VectorStoreNotFoundError` after server restart when attempting
operations like `vector_io.insert()` or `vector_io.query()`.
The bug affected **6 vector IO providers**: `pgvector`, `sqlite_vec`,
`chroma`, `milvus`, `qdrant`, and `weaviate`.
Created with the assistance of: claude-4.5-sonnet
## Root Cause
All affected providers had a broken
`_get_and_cache_vector_store_index()` method that:
1. Did not load existing vector stores from persistent storage during
initialization
2. Attempted to use `vector_store_table` (which was either `None` or a
`KVStore` without the required `get_vector_store()` method)
3. Could not reload vector stores after server restart or cache miss
## Solution
This PR implements a consistent pattern across all 6 providers:
1. **Load vector stores during initialization** - Pre-populate the cache
from KV store on startup
2. **Fix lazy loading** - Modified `_get_and_cache_vector_store_index()`
to load directly from KV store instead of relying on
`vector_store_table`
3. **Remove broken dependency** - Eliminated reliance on the
`vector_store_table` pattern
## Testing steps
### 1.1 Configure the stack
Create or use an existing configuration with a vector IO provider.
**Example `run.yaml`:**
```yaml
vector_io_store:
- provider_id: pgvector
provider_type: remote::pgvector
config:
host: localhost
port: 5432
db: llamastack
user: llamastack
password: llamastack
inference:
- provider_id: sentence-transformers
provider_type: inline::sentence-transformers
config:
model: sentence-transformers/all-MiniLM-L6-v2
```
### 1.2 Start the server
```bash
llama stack run run.yaml --port 5000
```
Wait for the server to fully start. You should see:
```
INFO: Started server process
INFO: Application startup complete
```
---
## Step 2: Create a Vector Store
### 2.1 Create via API
```bash
curl -X POST http://localhost:5000/v1/vector_stores \
-H "Content-Type: application/json" \
-d '{
"name": "test-persistence-store",
"extra_body": {
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"embedding_dimension": 384,
"provider_id": "pgvector"
}
}' | jq
```
### 2.2 Expected Response
```json
{
"id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"object": "vector_store",
"name": "test-persistence-store",
"status": "completed",
"created_at": 1730304000,
"file_counts": {
"total": 0,
"completed": 0,
"in_progress": 0,
"failed": 0,
"cancelled": 0
},
"usage_bytes": 0
}
```
**Save the `id` field** (e.g.,
`vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d`) — you’ll need it for the next
steps.
---
## Step 3: Insert Data (Before Restart)
### 3.1 Insert chunks into the vector store
```bash
export VS_ID="vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"
curl -X POST http://localhost:5000/vector-io/insert \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"chunks\": [
{
\"content\": \"Python is a high-level programming language known for its readability.\",
\"metadata\": {\"source\": \"doc1\", \"page\": 1}
},
{
\"content\": \"Machine learning enables computers to learn from data without explicit programming.\",
\"metadata\": {\"source\": \"doc2\", \"page\": 1}
},
{
\"content\": \"Neural networks are inspired by biological neurons in the brain.\",
\"metadata\": {\"source\": \"doc3\", \"page\": 1}
}
]
}"
```
### 3.2 Expected Response
Status: **200 OK**
Response: *Empty or success confirmation*
---
## Step 4: Query Data (Before Restart – Baseline)
### 4.1 Query the vector store
```bash
curl -X POST http://localhost:5000/vector-io/query \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"query\": \"What is machine learning?\"
}" | jq
```
### 4.2 Expected Response
```json
{
"chunks": [
{
"content": "Machine learning enables computers to learn from data without explicit programming.",
"metadata": {"source": "doc2", "page": 1}
},
{
"content": "Neural networks are inspired by biological neurons in the brain.",
"metadata": {"source": "doc3", "page": 1}
}
],
"scores": [0.85, 0.72]
}
```
**Checkpoint:** Works correctly before restart.
---
## Step 5: Restart the Server (Critical Test)
### 5.1 Stop the server
In the terminal where it’s running:
```
Ctrl + C
```
Wait for:
```
Shutting down...
```
### 5.2 Restart the server
```bash
llama stack run run.yaml --port 5000
```
Wait for:
```
INFO: Started server process
INFO: Application startup complete
```
The vector store cache is now empty, but data should persist.
---
## Step 6: Verify Vector Store Exists (After Restart)
### 6.1 List vector stores
```bash
curl http://localhost:5000/v1/vector_stores | jq
```
### 6.2 Expected Response
```json
{
"object": "list",
"data": [
{
"id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"name": "test-persistence-store",
"status": "completed"
}
]
}
```
**Checkpoint:** Vector store should be listed.
---
## Step 7: Insert Data (After Restart – THE BUG TEST)
### 7.1 Insert new chunks
```bash
curl -X POST http://localhost:5000/vector-io/insert \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"chunks\": [
{
\"content\": \"This chunk was inserted AFTER the server restart.\",
\"metadata\": {\"source\": \"post-restart\", \"test\": true}
}
]
}"
```
### 7.2 Expected Results
**With Fix (Correct):**
```
Status: 200 OK
Response: Success
```
**Without Fix (Bug):**
```json
{
"detail": "VectorStoreNotFoundError: Vector Store 'vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d' not found."
}
```
**Critical Test:** If insertion succeeds, the fix works.
---
## Step 8: Query Data (After Restart – Verification)
### 8.1 Query all data
```bash
curl -X POST http://localhost:5000/vector-io/query \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"query\": \"restart\"
}" | jq
```
### 8.2 Expected Response
```json
{
"chunks": [
{
"content": "This chunk was inserted AFTER the server restart.",
"metadata": {"source": "post-restart", "test": true}
}
],
"scores": [0.95]
}
```
**Checkpoint:** Both old and new data are queryable.
---
## Step 9: Multiple Restart Test (Extra Verification)
### 9.1 Restart again
```bash
Ctrl + C
llama stack run run.yaml --port 5000
```
### 9.2 Query after restart
```bash
curl -X POST http://localhost:5000/vector-io/query \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"query\": \"programming\"
}" | jq
```
**Expected:** Works correctly across multiple restarts.
---------
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
This commit is contained in:
parent
8f4c431370
commit
6147321083
8 changed files with 203 additions and 33 deletions
|
|
@ -92,6 +92,99 @@ async def test_persistence_across_adapter_restarts(vector_io_adapter):
|
|||
await vector_io_adapter.shutdown()
|
||||
|
||||
|
||||
async def test_vector_store_lazy_loading_from_kvstore(vector_io_adapter):
|
||||
"""
|
||||
Test that vector stores can be lazy-loaded from KV store when not in cache.
|
||||
|
||||
Verifies that clearing the cache doesn't break vector store access - they
|
||||
can be loaded on-demand from persistent storage.
|
||||
"""
|
||||
await vector_io_adapter.initialize()
|
||||
|
||||
vector_store_id = f"lazy_load_test_{np.random.randint(1e6)}"
|
||||
vector_store = VectorStore(
|
||||
identifier=vector_store_id,
|
||||
provider_id="test_provider",
|
||||
embedding_model="test_model",
|
||||
embedding_dimension=128,
|
||||
)
|
||||
await vector_io_adapter.register_vector_store(vector_store)
|
||||
assert vector_store_id in vector_io_adapter.cache
|
||||
|
||||
vector_io_adapter.cache.clear()
|
||||
assert vector_store_id not in vector_io_adapter.cache
|
||||
|
||||
loaded_index = await vector_io_adapter._get_and_cache_vector_store_index(vector_store_id)
|
||||
assert loaded_index is not None
|
||||
assert loaded_index.vector_store.identifier == vector_store_id
|
||||
assert vector_store_id in vector_io_adapter.cache
|
||||
|
||||
cached_index = await vector_io_adapter._get_and_cache_vector_store_index(vector_store_id)
|
||||
assert cached_index is loaded_index
|
||||
|
||||
await vector_io_adapter.shutdown()
|
||||
|
||||
|
||||
async def test_vector_store_preloading_on_initialization(vector_io_adapter):
|
||||
"""
|
||||
Test that vector stores are preloaded from KV store during initialization.
|
||||
|
||||
Verifies that after restart, all vector stores are automatically loaded into
|
||||
cache and immediately accessible without requiring lazy loading.
|
||||
"""
|
||||
await vector_io_adapter.initialize()
|
||||
|
||||
vector_store_ids = [f"preload_test_{i}_{np.random.randint(1e6)}" for i in range(3)]
|
||||
for vs_id in vector_store_ids:
|
||||
vector_store = VectorStore(
|
||||
identifier=vs_id,
|
||||
provider_id="test_provider",
|
||||
embedding_model="test_model",
|
||||
embedding_dimension=128,
|
||||
)
|
||||
await vector_io_adapter.register_vector_store(vector_store)
|
||||
|
||||
for vs_id in vector_store_ids:
|
||||
assert vs_id in vector_io_adapter.cache
|
||||
|
||||
await vector_io_adapter.shutdown()
|
||||
await vector_io_adapter.initialize()
|
||||
|
||||
for vs_id in vector_store_ids:
|
||||
assert vs_id in vector_io_adapter.cache
|
||||
|
||||
for vs_id in vector_store_ids:
|
||||
loaded_index = await vector_io_adapter._get_and_cache_vector_store_index(vs_id)
|
||||
assert loaded_index is not None
|
||||
assert loaded_index.vector_store.identifier == vs_id
|
||||
|
||||
await vector_io_adapter.shutdown()
|
||||
|
||||
|
||||
async def test_kvstore_none_raises_runtime_error(vector_io_adapter):
|
||||
"""
|
||||
Test that accessing vector stores with uninitialized kvstore raises RuntimeError.
|
||||
|
||||
Verifies proper RuntimeError is raised instead of assertions when kvstore is None.
|
||||
"""
|
||||
await vector_io_adapter.initialize()
|
||||
|
||||
vector_store_id = f"kvstore_none_test_{np.random.randint(1e6)}"
|
||||
vector_store = VectorStore(
|
||||
identifier=vector_store_id,
|
||||
provider_id="test_provider",
|
||||
embedding_model="test_model",
|
||||
embedding_dimension=128,
|
||||
)
|
||||
await vector_io_adapter.register_vector_store(vector_store)
|
||||
|
||||
vector_io_adapter.cache.clear()
|
||||
vector_io_adapter.kvstore = None
|
||||
|
||||
with pytest.raises(RuntimeError, match="KVStore not initialized"):
|
||||
await vector_io_adapter._get_and_cache_vector_store_index(vector_store_id)
|
||||
|
||||
|
||||
async def test_register_and_unregister_vector_store(vector_io_adapter):
|
||||
unique_id = f"foo_db_{np.random.randint(1e6)}"
|
||||
dummy = VectorStore(
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue