- Fix constructor call missing files_api parameter
- Add kvstore field to MilvusVectorIOConfig
- Resolves#2626
# What does this PR do?
[https://github.com/meta-llama/llama-stack/issues/2626]
## Problem
The `MilvusVectorIOAdapter` fails to initialize due to two missing
configuration issues:
1. Missing `files_api` parameter in the constructor call
2. Missing `kvstore` field in the `MilvusVectorIOConfig` class
## Root Cause
1. The adapter constructor expects 3 parameters `(config, inference_api,
files_api)` but the `get_adapter_impl` function only passes 2 parameters
2. The `MilvusVectorIOConfig` class lacks the `kvstore` field that the
adapter's `initialize()` method expects for metadata persistence
## Solution
- Added `files_api = deps.get(Api.files, None)` to safely retrieve files
API from dependencies
- Pass the files_api parameter to MilvusVectorIOAdapter constructor
- Added `kvstore: KVStoreConfig | None = None` field to
MilvusVectorIOConfig
- Maintains backward compatibility since both files_api and kvstore can
be None
Closes#2626
## Test Plan
- [x] Tested with Milvus configuration - server starts successfully
```yaml
vector_io:
- provider_id: milvus
provider_type: remote::milvus
config:
uri: http://localhost:19530
token: root:Milvus
kvstore:
type: sqlite
namespace: null
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/remote-vllm}/milvus_store.db
```
- [x] Vector operations work as expected
```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types.shared_params.document import Document as RAGDocument
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger as AgentEventLogger
import os
endpoint = os.getenv("LLAMA_STACK_ENDPOINT")
model = os.getenv("INFERENCE_MODEL")
# Initialize the client
client = LlamaStackClient(base_url=endpoint)
vector_db_id = "my_documents"
response = client.vector_dbs.register(
vector_db_id=vector_db_id,
embedding_model="all-MiniLM-L6-v2",
embedding_dimension=384,
provider_id="milvus",
)
urls = ["getting_started/Red_Hat_AI_Inference_Server-3.0-Getting_started-en-US.pdf", "vllm_server_arguments/Red_Hat_AI_Inference_Server-3.0-vLLM_server_arguments-en-US.pdf"]
documents = [
RAGDocument(
document_id=f"num-{i}",
content=f"https://docs.redhat.com/en/documentation/red_hat_ai_inference_server/3.0/pdf/{url}",
mime_type="application/pdf",
metadata={},
)
for i, url in enumerate(urls)
]
client.tool_runtime.rag_tool.insert(
documents=documents,
vector_db_id=vector_db_id,
chunk_size_in_tokens=512,
)
rag_agent = Agent(
client,
model=model,
# Define instructions for the agent (system prompt)
instructions="You are a helpful assistant",
enable_session_persistence=False,
# Define tools available to the agent
tools=[
{
"name": "builtin::rag/knowledge_search",
"args": {
"vector_db_ids": [vector_db_id],
},
}
],
)
session_id = rag_agent.create_session("test-session")
user_prompts = [
"How to start the AI Inference Server container image? use the knowledge_search tool to get information.",
]
for prompt in user_prompts:
print(f"User> {prompt}")
response = rag_agent.create_turn(
messages=[{"role": "user", "content": prompt}],
session_id=session_id,
)
for log in AgentEventLogger().log(response):
log.print()
```
server logs:
```
INFO 2025-07-04 22:18:30,385 __main__:577 server: Listening on ['::', '0.0.0.0']:5000
INFO: Started server process [769725]
INFO: Waiting for application startup.
INFO 2025-07-04 22:18:30,390 __main__:158 server: Starting up
INFO: Application startup complete.
INFO: Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit)
INFO 2025-07-04 22:18:52,193 llama_stack.distribution.routing_tables.common:200 core: Setting owner for vector_db 'my_documents' to
20:18:52.194 [START] /v1/vector-dbs
INFO: 192.168.1.249:64170 - "POST /v1/vector-dbs HTTP/1.1" 200 OK
20:18:52.216 [END] /v1/vector-dbs [StatusCode.OK] (21.89ms)
20:18:52.222 [START] /v1/tool-runtime/rag-tool/insert
INFO 2025-07-04 22:18:56,265 llama_stack.providers.utils.inference.embedding_mixin:102 uncategorized: Loading sentence transformer for
all-MiniLM-L6-v2...
WARNING 2025-07-04 22:18:59,214 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
INFO 2025-07-04 22:18:59,339 sentence_transformers.SentenceTransformer:219 uncategorized: Use pytorch device_name: cuda:0
INFO 2025-07-04 22:18:59,340 sentence_transformers.SentenceTransformer:227 uncategorized: Load pretrained SentenceTransformer: all-MiniLM-L6-v2
INFO: 192.168.1.249:64170 - "POST /v1/tool-runtime/rag-tool/insert HTTP/1.1" 200 OK
INFO: 192.168.1.249:64170 - "POST /v1/agents HTTP/1.1" 200 OK
INFO: 192.168.1.249:64170 - "GET /v1/tools?toolgroup_id=builtin%3A%3Arag%2Fknowledge_search HTTP/1.1" 200 OK
INFO: 192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session HTTP/1.1" 200 OK
20:19:01.834 [END] /v1/tool-runtime/rag-tool/insert [StatusCode.OK] (9612.06ms)
20:19:01.839 [START] /v1/agents
INFO: 192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session/d2706302-bb54-421d-a890-5e25df9cb47f/turn HTTP/1.1" 200 OK
20:19:01.839 [END] /v1/agents [StatusCode.OK] (0.18ms)
20:19:01.844 [START] /v1/tools
INFO 2025-07-04 22:19:01,853 llama_stack.providers.remote.inference.vllm.vllm:330 uncategorized: Initializing vLLM client with
base_url=http://192.168.1.183:8080/v1
20:19:01.858 [END] /v1/tools [StatusCode.OK] (14.92ms)
20:19:01.868 [START] /v1/agents/{agent_id}/session
20:19:01.868 [END] /v1/agents/{agent_id}/session [StatusCode.OK] (0.37ms)
20:19:01.873 [START] /v1/agents/{agent_id}/session/{session_id}/turn
20:19:01.885 [START] inference
20:19:05.506 [END] inference [StatusCode.OK] (3621.19ms)
INFO 2025-07-04 22:19:05,537 llama_stack.providers.inline.agents.meta_reference.agent_instance:890 agents: executing tool call: knowledge_search
with args: {'query': 'How to start the AI Inference Server container image'}
20:19:05.538 [START] tool_execution
20:19:05.928 [END] tool_execution [StatusCode.OK] (390.08ms)
20:19:05.538 [INFO] executing tool call: knowledge_search with args: {'query': 'How to start the AI Inference Server container image'}
20:19:05.935 [START] inference
20:19:17.539 [END] inference [StatusCode.OK] (11603.76ms)
20:19:17.560 [END] /v1/agents/{agent_id}/session/{session_id}/turn [StatusCode.OK] (15686.62ms)
```
- [x] No regressions in functionality
- [x] Configuration properly accepts kvstore settings
---------
Co-authored-by: Peter Gustafsson <peter.gustafsson6@gmail.com>
Co-authored-by: raghotham <rsm@meta.com>
Co-authored-by: Francisco Arceo <farceo@redhat.com>
# What does this PR do?
Simple approach to get some provider pages in the docs.
Add or update description fields in the provider configuration class
using Pydantic’s Field, ensuring these descriptions are clear and
complete, as they will be used to auto-generate provider documentation
via ./scripts/distro_codegen.py instead of editing the docs manually.
Signed-off-by: Sébastien Han <seb@redhat.com>
2025-06-30 15:13:20 +02:00
Renamed from docs/source/providers/vector_io/milvus.md (Browse further)