llama-stack-mirror/docs/source/providers/vector_io
pgustafs d39660afed
fix(remote:milvus): add missing files_api parameter and kvstore configuration (#2630)
- Fix constructor call missing files_api parameter
- Add kvstore field to MilvusVectorIOConfig
- Resolves #2626

# What does this PR do?
[https://github.com/meta-llama/llama-stack/issues/2626]
## Problem
The `MilvusVectorIOAdapter` fails to initialize due to two missing
configuration issues:
1. Missing `files_api` parameter in the constructor call
2. Missing `kvstore` field in the `MilvusVectorIOConfig` class

## Root Cause  
1. The adapter constructor expects 3 parameters `(config, inference_api,
files_api)` but the `get_adapter_impl` function only passes 2 parameters
2. The `MilvusVectorIOConfig` class lacks the `kvstore` field that the
adapter's `initialize()` method expects for metadata persistence

## Solution
- Added `files_api = deps.get(Api.files, None)` to safely retrieve files
API from dependencies
- Pass the files_api parameter to MilvusVectorIOAdapter constructor
- Added `kvstore: KVStoreConfig | None = None` field to
MilvusVectorIOConfig
- Maintains backward compatibility since both files_api and kvstore can
be None

Closes #2626

## Test Plan
- [x] Tested with Milvus configuration - server starts successfully 
```yaml
vector_io:
  - provider_id: milvus
    provider_type: remote::milvus
    config:
      uri: http://localhost:19530
      token: root:Milvus
      kvstore:
        type: sqlite
        namespace: null
        db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/remote-vllm}/milvus_store.db
```
- [x] Vector operations work as expected
```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types.shared_params.document import Document as RAGDocument
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger as AgentEventLogger
import os


endpoint =  os.getenv("LLAMA_STACK_ENDPOINT")
model =  os.getenv("INFERENCE_MODEL")

# Initialize the client
client = LlamaStackClient(base_url=endpoint)

vector_db_id = "my_documents"

response = client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id="milvus",
)

urls = ["getting_started/Red_Hat_AI_Inference_Server-3.0-Getting_started-en-US.pdf", "vllm_server_arguments/Red_Hat_AI_Inference_Server-3.0-vLLM_server_arguments-en-US.pdf"]
documents = [
    RAGDocument(
        document_id=f"num-{i}",
        content=f"https://docs.redhat.com/en/documentation/red_hat_ai_inference_server/3.0/pdf/{url}",
        mime_type="application/pdf",
        metadata={},
    )
    for i, url in enumerate(urls)
]

client.tool_runtime.rag_tool.insert(
    documents=documents,
    vector_db_id=vector_db_id,
    chunk_size_in_tokens=512,
)

rag_agent = Agent(
    client,
    model=model,
    # Define instructions for the agent (system prompt)
    instructions="You are a helpful assistant",
    enable_session_persistence=False,
    # Define tools available to the agent
    tools=[
        {
            "name": "builtin::rag/knowledge_search",
            "args": {
                "vector_db_ids": [vector_db_id],
            },
        }
    ],
)

session_id = rag_agent.create_session("test-session")

user_prompts = [
    "How to start the AI Inference Server container image? use the knowledge_search tool to get information.",
]

for prompt in user_prompts:
    print(f"User> {prompt}")
    response = rag_agent.create_turn(
        messages=[{"role": "user", "content": prompt}],
        session_id=session_id,
    )
    for log in AgentEventLogger().log(response):
        log.print()
```    

server logs:
```
INFO     2025-07-04 22:18:30,385 __main__:577 server: Listening on ['::', '0.0.0.0']:5000                                                             
INFO:     Started server process [769725]
INFO:     Waiting for application startup.
INFO     2025-07-04 22:18:30,390 __main__:158 server: Starting up                                                                                     
INFO:     Application startup complete.
INFO:     Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit)
INFO     2025-07-04 22:18:52,193 llama_stack.distribution.routing_tables.common:200 core: Setting owner for vector_db 'my_documents' to               
20:18:52.194 [START] /v1/vector-dbs
INFO:     192.168.1.249:64170 - "POST /v1/vector-dbs HTTP/1.1" 200 OK
20:18:52.216 [END] /v1/vector-dbs [StatusCode.OK] (21.89ms)
20:18:52.222 [START] /v1/tool-runtime/rag-tool/insert
INFO     2025-07-04 22:18:56,265 llama_stack.providers.utils.inference.embedding_mixin:102 uncategorized: Loading sentence transformer for            
         all-MiniLM-L6-v2...                                                                                                                          
WARNING  2025-07-04 22:18:59,214 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed                           
INFO     2025-07-04 22:18:59,339 sentence_transformers.SentenceTransformer:219 uncategorized: Use pytorch device_name: cuda:0                         
INFO     2025-07-04 22:18:59,340 sentence_transformers.SentenceTransformer:227 uncategorized: Load pretrained SentenceTransformer: all-MiniLM-L6-v2   
INFO:     192.168.1.249:64170 - "POST /v1/tool-runtime/rag-tool/insert HTTP/1.1" 200 OK
INFO:     192.168.1.249:64170 - "POST /v1/agents HTTP/1.1" 200 OK
INFO:     192.168.1.249:64170 - "GET /v1/tools?toolgroup_id=builtin%3A%3Arag%2Fknowledge_search HTTP/1.1" 200 OK
INFO:     192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session HTTP/1.1" 200 OK
20:19:01.834 [END] /v1/tool-runtime/rag-tool/insert [StatusCode.OK] (9612.06ms)
20:19:01.839 [START] /v1/agents
INFO:     192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session/d2706302-bb54-421d-a890-5e25df9cb47f/turn HTTP/1.1" 200 OK
20:19:01.839 [END] /v1/agents [StatusCode.OK] (0.18ms)
20:19:01.844 [START] /v1/tools
INFO     2025-07-04 22:19:01,853 llama_stack.providers.remote.inference.vllm.vllm:330 uncategorized: Initializing vLLM client with                    
         base_url=http://192.168.1.183:8080/v1                                                                                                        
20:19:01.858 [END] /v1/tools [StatusCode.OK] (14.92ms)
20:19:01.868 [START] /v1/agents/{agent_id}/session
20:19:01.868 [END] /v1/agents/{agent_id}/session [StatusCode.OK] (0.37ms)
20:19:01.873 [START] /v1/agents/{agent_id}/session/{session_id}/turn
20:19:01.885 [START] inference
20:19:05.506 [END] inference [StatusCode.OK] (3621.19ms)
INFO     2025-07-04 22:19:05,537 llama_stack.providers.inline.agents.meta_reference.agent_instance:890 agents: executing tool call: knowledge_search  
         with args: {'query': 'How to start the AI Inference Server container image'}                                                                 
20:19:05.538 [START] tool_execution
20:19:05.928 [END] tool_execution [StatusCode.OK] (390.08ms)
 20:19:05.538 [INFO] executing tool call: knowledge_search with args: {'query': 'How to start the AI Inference Server container image'}
20:19:05.935 [START] inference
20:19:17.539 [END] inference [StatusCode.OK] (11603.76ms)
20:19:17.560 [END] /v1/agents/{agent_id}/session/{session_id}/turn [StatusCode.OK] (15686.62ms)
```
- [x] No regressions in functionality
- [x] Configuration properly accepts kvstore settings

---------

Co-authored-by: Peter Gustafsson <peter.gustafsson6@gmail.com>
Co-authored-by: raghotham <rsm@meta.com>
Co-authored-by: Francisco Arceo <farceo@redhat.com>
2025-07-09 10:08:14 +02:00
..
index.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
inline_chromadb.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
inline_faiss.md fix: store configs (#2593) 2025-07-03 10:07:23 -07:00
inline_meta-reference.md fix: store configs (#2593) 2025-07-03 10:07:23 -07:00
inline_milvus.md chore: Adding unit tests for Milvus and OpenAI compatibility (#2640) 2025-07-08 00:50:16 -07:00
inline_qdrant.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
inline_sqlite-vec.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
inline_sqlite_vec.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
remote_chromadb.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
remote_milvus.md fix(remote:milvus): add missing files_api parameter and kvstore configuration (#2630) 2025-07-09 10:08:14 +02:00
remote_pgvector.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
remote_qdrant.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00
remote_weaviate.md docs: auto generated documentation for providers (#2543) 2025-06-30 15:13:20 +02:00