llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-07-14 17:16:09 +00:00

Author	SHA1	Message	Date
Mustafa Elbehery	28343fea51	chore(api): add `mypy` coverage to `meta_reference_safety` (#2661) # What does this PR do? This PR adds static type coverage to `llama-stack`. Part of https://github.com/meta-llama/llama-stack/issues/2647 ## Test Plan Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-07-09 10:22:34 +02:00
pgustafs	d39660afed	fix(remote:milvus): add missing files_api parameter and kvstore configuration (#2630) - Fix constructor call missing files_api parameter - Add kvstore field to MilvusVectorIOConfig - Resolves #2626 # What does this PR do? [https://github.com/meta-llama/llama-stack/issues/2626] ## Problem The `MilvusVectorIOAdapter` fails to initialize because of two configuration gaps: 1. Missing `files_api` parameter in the constructor call 2. Missing `kvstore` field in the `MilvusVectorIOConfig` class ## Root Cause 1. The adapter constructor expects 3 parameters `(config, inference_api, files_api)` but the `get_adapter_impl` function only passes 2 parameters 2. The `MilvusVectorIOConfig` class lacks the `kvstore` field that the adapter's `initialize()` method expects for metadata persistence ## Solution - Added `files_api = deps.get(Api.files, None)` to safely retrieve files API from dependencies - Pass the files_api parameter to MilvusVectorIOAdapter constructor - Added `kvstore: KVStoreConfig | None = None` field to MilvusVectorIOConfig - Maintains backward compatibility since both files_api and kvstore can be None Closes #2626 ## Test Plan - [x] Tested with Milvus configuration - server starts successfully ```yaml vector_io: - provider_id: milvus provider_type: remote::milvus config: uri: http://localhost:19530 token: root:Milvus kvstore: type: sqlite namespace: null db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/remote-vllm}/milvus_store.db ``` - [x] Vector operations work as expected ```python from llama_stack_client import LlamaStackClient from llama_stack_client.types.shared_params.document import Document as RAGDocument from llama_stack_client.lib.agents.agent import Agent from llama_stack_client.lib.agents.event_logger import EventLogger as AgentEventLogger import os endpoint = os.getenv("LLAMA_STACK_ENDPOINT") model = os.getenv("INFERENCE_MODEL") # Initialize the client client = LlamaStackClient(base_url=endpoint) vector_db_id = 
"my_documents" response = client.vector_dbs.register( vector_db_id=vector_db_id, embedding_model="all-MiniLM-L6-v2", embedding_dimension=384, provider_id="milvus", ) urls = ["getting_started/Red_Hat_AI_Inference_Server-3.0-Getting_started-en-US.pdf", "vllm_server_arguments/Red_Hat_AI_Inference_Server-3.0-vLLM_server_arguments-en-US.pdf"] documents = [ RAGDocument( document_id=f"num-{i}", content=f"https://docs.redhat.com/en/documentation/red_hat_ai_inference_server/3.0/pdf/{url}", mime_type="application/pdf", metadata={}, ) for i, url in enumerate(urls) ] client.tool_runtime.rag_tool.insert( documents=documents, vector_db_id=vector_db_id, chunk_size_in_tokens=512, ) rag_agent = Agent( client, model=model, # Define instructions for the agent (system prompt) instructions="You are a helpful assistant", enable_session_persistence=False, # Define tools available to the agent tools=[ { "name": "builtin::rag/knowledge_search", "args": { "vector_db_ids": [vector_db_id], }, } ], ) session_id = rag_agent.create_session("test-session") user_prompts = [ "How to start the AI Inference Server container image? use the knowledge_search tool to get information.", ] for prompt in user_prompts: print(f"User> {prompt}") response = rag_agent.create_turn( messages=[{"role": "user", "content": prompt}], session_id=session_id, ) for log in AgentEventLogger().log(response): log.print() ``` server logs: ``` INFO 2025-07-04 22:18:30,385 __main__:577 server: Listening on ['::', '0.0.0.0']:5000 INFO: Started server process [769725] INFO: Waiting for application startup. INFO 2025-07-04 22:18:30,390 __main__:158 server: Starting up INFO: Application startup complete. 
INFO: Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit) INFO 2025-07-04 22:18:52,193 llama_stack.distribution.routing_tables.common:200 core: Setting owner for vector_db 'my_documents' to 20:18:52.194 [START] /v1/vector-dbs INFO: 192.168.1.249:64170 - "POST /v1/vector-dbs HTTP/1.1" 200 OK 20:18:52.216 [END] /v1/vector-dbs [StatusCode.OK] (21.89ms) 20:18:52.222 [START] /v1/tool-runtime/rag-tool/insert INFO 2025-07-04 22:18:56,265 llama_stack.providers.utils.inference.embedding_mixin:102 uncategorized: Loading sentence transformer for all-MiniLM-L6-v2... WARNING 2025-07-04 22:18:59,214 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed INFO 2025-07-04 22:18:59,339 sentence_transformers.SentenceTransformer:219 uncategorized: Use pytorch device_name: cuda:0 INFO 2025-07-04 22:18:59,340 sentence_transformers.SentenceTransformer:227 uncategorized: Load pretrained SentenceTransformer: all-MiniLM-L6-v2 INFO: 192.168.1.249:64170 - "POST /v1/tool-runtime/rag-tool/insert HTTP/1.1" 200 OK INFO: 192.168.1.249:64170 - "POST /v1/agents HTTP/1.1" 200 OK INFO: 192.168.1.249:64170 - "GET /v1/tools?toolgroup_id=builtin%3A%3Arag%2Fknowledge_search HTTP/1.1" 200 OK INFO: 192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session HTTP/1.1" 200 OK 20:19:01.834 [END] /v1/tool-runtime/rag-tool/insert [StatusCode.OK] (9612.06ms) 20:19:01.839 [START] /v1/agents INFO: 192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session/d2706302-bb54-421d-a890-5e25df9cb47f/turn HTTP/1.1" 200 OK 20:19:01.839 [END] /v1/agents [StatusCode.OK] (0.18ms) 20:19:01.844 [START] /v1/tools INFO 2025-07-04 22:19:01,853 llama_stack.providers.remote.inference.vllm.vllm:330 uncategorized: Initializing vLLM client with base_url=http://192.168.1.183:8080/v1 20:19:01.858 [END] /v1/tools [StatusCode.OK] (14.92ms) 20:19:01.868 [START] /v1/agents/{agent_id}/session 20:19:01.868 [END] /v1/agents/{agent_id}/session 
[StatusCode.OK] (0.37ms) 20:19:01.873 [START] /v1/agents/{agent_id}/session/{session_id}/turn 20:19:01.885 [START] inference 20:19:05.506 [END] inference [StatusCode.OK] (3621.19ms) INFO 2025-07-04 22:19:05,537 llama_stack.providers.inline.agents.meta_reference.agent_instance:890 agents: executing tool call: knowledge_search with args: {'query': 'How to start the AI Inference Server container image'} 20:19:05.538 [START] tool_execution 20:19:05.928 [END] tool_execution [StatusCode.OK] (390.08ms) 20:19:05.538 [INFO] executing tool call: knowledge_search with args: {'query': 'How to start the AI Inference Server container image'} 20:19:05.935 [START] inference 20:19:17.539 [END] inference [StatusCode.OK] (11603.76ms) 20:19:17.560 [END] /v1/agents/{agent_id}/session/{session_id}/turn [StatusCode.OK] (15686.62ms) ``` - [x] No regressions in functionality - [x] Configuration properly accepts kvstore settings --------- Co-authored-by: Peter Gustafsson <peter.gustafsson6@gmail.com> Co-authored-by: raghotham <rsm@meta.com> Co-authored-by: Francisco Arceo <farceo@redhat.com>	2025-07-09 10:08:14 +02:00
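The fix described in the Solution list above can be sketched roughly as follows. This is a simplified, synchronous illustration only: the `Api` enum, `KVStoreConfig`, config fields, and adapter body here are hypothetical stand-ins, not the project's real classes. Only the `deps.get(Api.files, None)` lookup and the optional `kvstore` field come from the PR description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class Api(Enum):
    # Stand-in for the project's Api enum; only the members used here.
    inference = "inference"
    files = "files"


@dataclass
class KVStoreConfig:
    # Hypothetical placeholder for the real key-value store config.
    type: str = "sqlite"
    db_path: str = "milvus_store.db"


@dataclass
class MilvusVectorIOConfig:
    uri: str
    token: Optional[str] = None
    # Optional, so configs written before this field existed still load.
    kvstore: Optional[KVStoreConfig] = None


class MilvusVectorIOAdapter:
    # The constructor takes three arguments, as stated in the PR description.
    def __init__(self, config: MilvusVectorIOConfig, inference_api: Any, files_api: Any) -> None:
        self.config = config
        self.inference_api = inference_api
        self.files_api = files_api


def get_adapter_impl(config: MilvusVectorIOConfig, deps: Dict[Api, Any]) -> MilvusVectorIOAdapter:
    # .get() returns None when no files provider is configured,
    # instead of raising KeyError.
    files_api = deps.get(Api.files, None)
    return MilvusVectorIOAdapter(config, deps[Api.inference], files_api)
```

Because both `files_api` and `kvstore` default to `None`, a deployment that configures neither still constructs the adapter, which is the backward-compatibility point the PR makes.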
Mustafa Elbehery	2d3d9664a7	chore(api): add `mypy` coverage to `prompts` (#2657) # What does this PR do? This PR adds static type coverage to `llama-stack`. Part of https://github.com/meta-llama/llama-stack/issues/2647 ## Test Plan Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>	2025-07-09 10:07:00 +02:00
ehhuang	84fa83b788	fix: update k8s templates (#2645) # What does this PR do? - fix env variables - use gpu for vllm - add eks/apply.py for aws - add template to set hf secret ## Test Plan bash apply.sh Co-authored-by: Eric Huang <erichuang@fb.com>	2025-07-08 15:57:01 -07:00
ehhuang	daf660c4ea	feat(auth,ui): support github sign-in in the UI (#2545 ) # What does this PR do? Uses NextAuth to add github sign in support. ## Test Plan Start server with auth configured as in https://github.com/meta-llama/llama-stack/pull/2509 https://github.com/user-attachments/assets/61ff7442-f601-4b39-8686-5d0afb3b45ac	2025-07-08 11:02:57 -07:00
ehhuang	c8bac888af	feat(auth): support github tokens (#2509) # What does this PR do? This PR adds GitHub OAuth authentication support to Llama Stack, allowing users to authenticate using their GitHub credentials (#2508). 1. support verifying github access tokens 2. support provider-specific auth error messages 3. opportunistically reorganized the auth configs for better ergonomics ## Test Plan Added unit tests. Also tested e2e manually: ``` server: port: 8321 auth: provider_config: type: github_token ``` ``` ~/projects/llama-stack/llama_stack/ui ❯ curl -v http://localhost:8321/v1/models * Host localhost:8321 was resolved. * IPv6: ::1 * IPv4: 127.0.0.1 * Trying [::1]:8321... * Connected to localhost (::1) port 8321 > GET /v1/models HTTP/1.1 > Host: localhost:8321 > User-Agent: curl/8.7.1 > Accept: */* > * Request completely sent off < HTTP/1.1 401 Unauthorized < date: Fri, 27 Jun 2025 21:51:25 GMT < server: uvicorn < content-type: application/json < x-trace-id: 5390c6c0654086c55d87c86d7cbf2f6a < Transfer-Encoding: chunked < * Connection #0 to host localhost left intact {"error": {"message": "Authentication required. 
Please provide a valid GitHub access token (https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) in the Authorization header (Bearer <token>)"}} ~/projects/llama-stack/llama_stack/ui ❯ ./scripts/unit-tests.sh ~/projects/llama-stack/llama_stack/ui ❯ curl "http://localhost:8321/v1/models" \ -H "Authorization: Bearer <token_obtained_from_github>" \ {"data":[{"identifier":"accounts/fireworks/models/llama-guard-3-11b-vision","provider_resource_id":"accounts/fireworks/models/llama-guard-3-11b-vision","provider_id":"fireworks","type":"model","metadata":{},"model_type":"llm"},{"identifier":"accounts/fireworks/models/llama-guard-3-8b","provider_resource_id":"accounts/fireworks/models/llama-guard-3-8b","provider_id":"fireworks","type":"model","metadata":{},"model_type":"llm"},{"identifier":"accounts/fireworks/models/llama-v3p1-405b-instruct","provider_resource_id":"accounts/f ``` --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-07-08 11:02:36 -07:00
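The authenticated `curl` call in the test plan above could be reproduced from Python along these lines. This is a minimal sketch using only the standard library. The helper name `build_models_request` is made up for illustration, and the request is only constructed here, not sent.

```python
import urllib.request


def build_models_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for /v1/models carrying a GitHub token as a Bearer credential."""
    url = f"{base_url.rstrip('/')}/v1/models"
    # Same header shape the 401 error message asks for: "Authorization: Bearer <token>".
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the result to `urllib.request.urlopen` would send it. Without the header, the server in the log above answers `401 Unauthorized`.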
Francisco Arceo	83c89265e0	chore: Adding unit tests for Milvus and OpenAI compatibility (#2640) # What does this PR do? - Enabling Unit tests for Milvus to start to test OpenAI compatibility and fixing a few bugs. - Also fixed an inconsistency in the Milvus config between remote and inline. - Added pymilvus to extras for testing in CI I'm going to refactor this later to include the other inline providers so that we can catch issues sooner. I have another PR where I've been testing to find other bugs in the implementation (and required changes drafted here: https://github.com/meta-llama/llama-stack/pull/2617). ## Test Plan --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-07-08 00:50:16 -07:00
Charlie Doern	27b3cd570f	fix: use `--template` flag for server (#2643) # What does this PR do? Currently, when a template is used, we still pass `--config`. `server.py` has a dedicated `--template` flag and logic, so use that instead. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-07-08 00:48:50 -07:00
ehhuang	e9926564bd	fix: authorized sql store with postgres (#2641) # What does this PR do? postgres has different json extract syntax from sqlite ## Test Plan added integration test	2025-07-07 19:36:34 -07:00
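To illustrate the syntax difference this fix refers to: SQLite reads a JSON field with `json_extract(col, '$.key')`, while PostgreSQL uses the `->>` operator. The helper below is a hypothetical sketch of dialect-aware SQL generation, not the project's actual implementation. It assumes the column and key names are trusted identifiers rather than user input.

```python
def json_text_extract(dialect: str, column: str, key: str) -> str:
    """Return a SQL fragment that reads a top-level JSON key from a column."""
    if dialect == "sqlite":
        # SQLite: JSON path expression passed to the json_extract() function.
        return f"json_extract({column}, '$.{key}')"
    if dialect == "postgresql":
        # PostgreSQL: ->> returns the field as text.
        return f"{column}->>'{key}'"
    raise ValueError(f"unsupported dialect: {dialect}")
```

One caveat with a sketch like this: `->>` always yields text, whereas SQLite's `json_extract` returns a value typed after the JSON content, so comparisons may need an explicit cast.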
Ben Browning	5bb3817c49	fix: Restore the nvidia distro (#2639) # What does this PR do? The `nvidia` distro was previously collapsed into the `starter` distro. However, the `nvidia` distro was set up specifically to use NVIDIA NeMo microservices as providers for all APIs and not just inference, which means it was doing quite a bit more than what the `starter` distro covers today. We should work with our friends at NVIDIA to determine the best place to maintain this distro long-term, but for now this restores the `nvidia` distro and its docs to where they were so that things continue to work for their users. ## Test Plan I ensured the `nvidia` distro could build and run, at least to the point of complaining that I didn't provide the necessary API keys. ``` uv run llama stack build --template nvidia --image-type venv uv run llama stack run llama_stack/templates/nvidia/run.yaml ``` I also made sure the docs website built and looked reasonable, with the `nvidia` distro docs at the same URL as before (because it has incoming links from official NVIDIA NeMo docs, among other places). ``` uv run --group docs sphinx-autobuild docs/source docs/build/html --write-all ``` Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-07-07 15:50:05 -07:00
Charlie Doern	d0ec5c3d3a	fix: print proper template path upon build (#2642) # What does this PR do? Rather than pointing to a dir in `llama_stack/templates` (the repo directory), we should point to `$BUILD_DIR/IMAGE_NAME-run.yaml` (`~/.llama/distributions/IMAGE_NAME/IMAGE_NAME-run.yaml`). Currently we are printing: ``` You can find the newly-built template here: /Users/charliedoern/projects/Documents/llama-stack/llama_stack/templates/starter/run.yaml You can run the new Llama Stack distro via: llama stack run /Users/charliedoern/projects/Documents/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv ``` but we should be printing things like: ``` You can find the newly-built template here: /Users/charliedoern/.llama/distributions/starter/starter-run.yaml You can run the new Llama Stack distro via: llama stack run /Users/charliedoern/.llama/distributions/starter/starter-run.yaml --image-type venv ``` Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-07-07 15:39:39 -07:00
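The path layout described in this commit, `~/.llama/distributions/IMAGE_NAME/IMAGE_NAME-run.yaml`, can be computed with a few lines of `pathlib`. The function name and the `distributions_dir` parameter below are assumptions made for illustration. The real build code may derive the directory differently.

```python
from pathlib import Path
from typing import Optional


def run_config_path(image_name: str, distributions_dir: Optional[Path] = None) -> Path:
    """Return the per-image run config path under the distributions directory."""
    # Default mirrors the ~/.llama/distributions location quoted in the commit message.
    base = distributions_dir if distributions_dir is not None else Path.home() / ".llama" / "distributions"
    return base / image_name / f"{image_name}-run.yaml"
```

For `image_name="starter"` and the default directory, this yields a path ending in `.llama/distributions/starter/starter-run.yaml`, matching the corrected output shown in the commit message.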
Sébastien Han	5561f1c36d	ci: error when a pipefails (#2635) # What does this PR do? The CI was failing but the error was eaten by the pipe. Now we run the task with pipefail. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-07 16:47:30 +02:00
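The behavior behind this fix can be demonstrated with a small sketch. By default a shell pipeline reports the exit status of its last command, so a failure earlier in the pipe is hidden. With `set -o pipefail`, the pipeline fails if any stage fails. The helper name is made up, and the example assumes `bash` is available on `PATH`.

```python
import subprocess


def pipeline_exit_code(pipeline: str, pipefail: bool) -> int:
    """Run a shell pipeline under bash and return its exit status."""
    # Prepend the option only when requested, to compare both behaviors.
    script = f"set -o pipefail; {pipeline}" if pipefail else pipeline
    return subprocess.run(["bash", "-c", script], capture_output=True).returncode


# `false` fails, but `cat` succeeds and is the last command in the pipe.
masked = pipeline_exit_code("false | cat", pipefail=False)
surfaced = pipeline_exit_code("false | cat", pipefail=True)
```

Here `masked` is 0 and `surfaced` is non-zero, which is the difference between a CI step that silently passes and one that reports the error.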
Wen Zhou	4bca4af3e4	refactor: set proper name for embedding all-minilm:l6-v2 and update to use "starter" in detailed_tutorial (#2627) # What does this PR do? - we are using `all-minilm:l6-v2` but the model we download from ollama is `all-minilm:latest` latest: https://ollama.com/library/all-minilm:latest 1b226e2802db l6-v2: https://ollama.com/library/all-minilm:l6-v2 pin 1b226e2802db - currently they are exactly the same model, but if [all-minilm:l12-v2](https://ollama.com/library/all-minilm:l12-v2) is updated, "latest" might no longer match l6-v2. - the only change in this PR is pinning the model id in ollama - also update detailed_tutorial with "starter" to replace deprecated "ollama". ## Test Plan ``` >INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" >llama stack build --run --template ollama --image-type venv ... Build Successful! You can find the newly-built template here: /home/wenzhou/zdtsw-forking/lls/llama-stack/llama_stack/templates/ollama/run.yaml .... 
- metadata: embedding_dimension: 384 model_id: all-MiniLM-L6-v2 model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType - embedding provider_id: ollama provider_model_id: all-minilm:l6-v2 ... ``` test ``` >llama-stack-client inference chat-completion --message "Write me a 2-sentence poem about the moon" INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models "HTTP/1.1 200 OK" INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions "HTTP/1.1 200 OK" OpenAIChatCompletion( id='chatcmpl-04f99071-3da2-44ba-a19f-03b5b7fc70b7', choices=[ OpenAIChatCompletionChoice( finish_reason='stop', index=0, message=OpenAIChatCompletionChoiceMessageOpenAIAssistantMessageParam( role='assistant', content="Here is a 2-sentence poem about the moon:\n\nSilver crescent in the midnight sky,\nLuna's gentle face, a beauty to the eye.", name=None, tool_calls=None, refusal=None, annotations=None, audio=None, function_call=None ), logprobs=None ) ], created=1751644429, model='llama3.2:3b-instruct-fp16', object='chat.completion', service_tier=None, system_fingerprint='fp_ollama', usage={'completion_tokens': 33, 'prompt_tokens': 36, 'total_tokens': 69, 'completion_tokens_details': None, 'prompt_tokens_details': None} ) ``` --------- Signed-off-by: Wen Zhou <wenzhou@redhat.com>	2025-07-06 09:07:37 +05:30
dependabot[bot]	2faec38724	chore(deps): bump next from 15.3.2 to 15.3.3 in /llama_stack/ui (#2632) Bumps [next](https://github.com/vercel/next.js) from 15.3.2 to 15.3.3. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/vercel/next.js/releases">next's releases</a>.</em></p> <blockquote> <h2>v15.3.3</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. 
It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>Reinstate <code>vary</code> (<a href="https://redirect.github.com/vercel/next.js/issues/79939">#79939</a>)</li> <li>fix(next-swc): Fix interestingness detection for React Compiler (<a href="https://redirect.github.com/vercel/next.js/issues/79558">#79558</a>)</li> <li>fix(next-swc): Fix react compiler usefulness detector (<a href="https://redirect.github.com/vercel/next.js/issues/79480">#79480</a>)</li> <li>fix(dev-overlay): Better handle edge-case file paths in launchEditor (<a href="https://redirect.github.com/vercel/next.js/issues/79526">#79526</a>)</li> <li>Client router should discard stale prefetch entries for static pages (<a href="https://redirect.github.com/vercel/next.js/issues/79362">#79362</a>)</li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/gaojude"><code>@gaojude</code></a>, <a href="https://github.com/kdy1"><code>@kdy1</code></a>, <a href="https://github.com/bgw"><code>@bgw</code></a>, and <a href="https://github.com/unstubbable"><code>@unstubbable</code></a> for helping!</p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`3ab8db7383`"><code>3ab8db7</code></a> v15.3.3</li> <li><a href="`18c8113ebd`"><code>18c8113</code></a> [backport] Reinstate <code>vary</code> (<a href="https://redirect.github.com/vercel/next.js/issues/79939">#79939</a>)</li> <li><a href="`e18212f546`"><code>e18212f</code></a> re-enable vary header deploy test (<a href="https://redirect.github.com/vercel/next.js/issues/79753">#79753</a>)</li> <li><a href="`ec202eccf0`"><code>ec202ec</code></a> Revert "[next-server] skip setting vary header for basic routes" (<a href="https://redirect.github.com/vercel/next.js/issues/79426">#79426</a>)</li> <li><a href="`e2f264fdce`"><code>e2f264f</code></a> fix(next-swc): Fix interestingness detection for React Compiler (15.3) (<a 
href="https://redirect.github.com/vercel/next.js/issues/79558">#79558</a>)</li> <li><a href="`562fac78da`"><code>562fac7</code></a> fix(next-swc): Fix react compiler usefulness detector (15.3) (<a href="https://redirect.github.com/vercel/next.js/issues/79480">#79480</a>)</li> <li><a href="`06097fd7bb`"><code>06097fd</code></a> fix(dev-overlay): Better handle edge-case file paths in launchEditor (<a href="https://redirect.github.com/vercel/next.js/issues/79526">#79526</a>)</li> <li><a href="`bda731fa96`"><code>bda731f</code></a> Client router should discard stale prefetch entries for static pages (<a href="https://redirect.github.com/vercel/next.js/issues/79362">#79362</a>)</li> <li>See full diff in <a href="https://github.com/vercel/next.js/compare/v15.3.2...v15.3.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=next&package-manager=npm_and_yarn&previous-version=15.3.2&new-version=15.3.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-07-05 00:13:33 -04:00
Wen Zhou	c025cab3a3	docs: update docs to use "starter" than "ollama" (#2629 )	2025-07-05 08:44:57 +05:30
Francisco Arceo	dc7df60d42	docs: Update starter docs to include milvus inline (#2631 )	2025-07-05 08:43:39 +05:30
Sébastien Han	ea966565f6	feat: improve telemetry (#2590 ) # What does this PR do? * Use a single env variable to setup OTEL endpoint * Update telemetry provider doc * Update general telemetry doc with the metric with generate * Left a script to setup telemetry for testing Closes: https://github.com/meta-llama/llama-stack/issues/783 Note to reviewer: the `setup_telemetry.sh` script was useful for me, it was nicely generated by AI, if we don't want it in the repo, and I can delete it, and I would understand. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-04 17:29:09 +02:00
Derek Higgins	4eae0cbfa4	fix(starter): Add missing faiss provider to build.yaml vector_io section (#2625 ) The starter template build.yaml was missing the inline::faiss provider in the vector_io section, while it was properly configured in run.yaml and starter.py's vector_io_providers list. Fixes: #2624 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-07-04 17:28:57 +02:00
Sébastien Han	df6ce8befa	fix: only load mcp when enabled in tool_group (#2621 ) # What does this PR do? The agent code is currently importing MCP modules even when MCP isn’t enabled. Do we consider this worth fixing, or are we treating MCP as a first-class dependency? I believe we should treat it as such. If everyone agrees, let’s go ahead and close this. Note: The current setup breaks if someone builds a distro without including MCP in tool_group but still serves the agent API. Also, we should bump the MCP version to support streamable responses, as SSE is being deprecated. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-04 20:27:05 +05:30
Sébastien Han	c4349f532b	feat: consolidate most distros into "starter" (#2516 ) # What does this PR do? * Removes a bunch of distros * Removed distros were added into the "starter" distribution * Doc for "starter" has been added * Partially reverts https://github.com/meta-llama/llama-stack/pull/2482 since inference providers are disabled by default and can be turned on manually via env variable. * Disables safety in starter distro Closes: https://github.com/meta-llama/llama-stack/issues/2502. ~Needs: https://github.com/meta-llama/llama-stack/pull/2482 for Ollama to work properly in the CI.~ TODO: - [ ] We can only update `install.sh` when we get a new release. - [x] Update providers documentation - [ ] Update notebooks to reference starter instead of ollama Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-04 15:58:03 +02:00
Derek Higgins	f77d4d91f5	fix: handle encoding errors when adding files to vector store (#2574 ) - Add try-catch block around data.decode() to handle UnicodeDecodeError - Implement UTF-8 fallback when detected encoding fails - Return empty string when both encodings fail - add unit tests Fixes #2572: UnicodeDecodeError when uploading files with problematic encodings Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-07-04 12:10:18 +02:00
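The decode fallback described in f77d4d91f5 might look roughly like the sketch below. It is an illustration of the three listed steps, not the code from the PR: the function name `decode_with_fallback` and its parameters are hypothetical, and catching `LookupError` (unknown encoding name) is an extra assumption beyond what the commit message states.

```python
from typing import Optional


def decode_with_fallback(data: bytes, detected_encoding: Optional[str]) -> str:
    """Hypothetical helper: try the detected encoding, then UTF-8, else ''."""
    # Try the detected encoding first (UTF-8 if nothing was detected),
    # then fall back to UTF-8 as the commit message describes.
    for encoding in (detected_encoding or "utf-8", "utf-8"):
        try:
            return data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # LookupError covers an unrecognized encoding name; this is an
            # assumption of the sketch, not something stated in the commit.
            continue
    # Both attempts failed: return an empty string instead of raising.
    return ""
```

Returning an empty string keeps a single badly encoded file from aborting the whole upload, at the cost of silently dropping its content.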
Ashwin Bharambe	f1c62e0af0	build: Bump version to 0.2.14	2025-07-04 12:12:12 +05:30
Matthew Farrellee	ef26259209	feat: add llama guard 4 model (#2579 ) add support for Llama Guard 4 model to the llama_guard safety provider test with - 0. NVIDIA_API_KEY=... llama stack build --image-type conda --image-name env-nvidia --providers inference=remote::nvidia,safety=inline::llama-guard --run 1. llama-stack-client models register meta-llama/Llama-Guard-4-12B --provider-model-id meta/llama-guard-4-12b 2. pytest tests/integration/safety/test_llama_guard.py Co-authored-by: raghotham <rsm@meta.com>	2025-07-03 22:29:04 -07:00
Derek Higgins	0422b4fc63	fix: CI flakiness in vector IO tests by pinning pymilvus>=2.4.10 (#2610 ) This occurred when marshmallow 4.0.0 was installed (which removed __version_info__) By pinning pymilvus to >=2.4.10, we ensure marshmallow doesn't get installed. Also set the dependency in InlineProviderSpec as this is the one that takes effect when using the "inline::milvus" provider. Fixes https://github.com/meta-llama/llama-stack/issues/2588 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-07-04 10:27:23 +05:30
Francisco Arceo	ea80ea63ac	chore: Updating chunk id generation to ensure uniqueness (#2618 ) # What does this PR do? This handles an edge case for `generate_chunk_id` if the concatenation of the `document_id` and `chunk_text` combination are not unique. Adding the window location ensures uniqueness. ## Test Plan Added unit test Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-07-04 10:26:35 +05:30
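The edge case in ea80ea63ac can be illustrated with a small sketch. It assumes a hash over `document_id` and `chunk_text`, with an optional window location mixed in. The name `sketch_chunk_id`, the `chunk_window` parameter, and the use of SHA-256 are all hypothetical and are not taken from the PR.

```python
import hashlib
from typing import Optional


def sketch_chunk_id(document_id: str, chunk_text: str, chunk_window: Optional[str] = None) -> str:
    """Hypothetical sketch: derive a deterministic chunk ID from its inputs."""
    hash_input = f"{document_id}:{chunk_text}"
    if chunk_window is not None:
        # Without the window, two identical text chunks from the same
        # document would hash to the same ID.
        hash_input += f":{chunk_window}"
    return hashlib.sha256(hash_input.encode("utf-8")).hexdigest()


# Two identical passages at different positions in one document.
first = sketch_chunk_id("doc-1", "repeated text", chunk_window="0-512")
second = sketch_chunk_id("doc-1", "repeated text", chunk_window="512-1024")
```

Here `first` and `second` differ only because the window location is part of the hash input; dropping `chunk_window` would make them collide.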
Francisco Arceo	4afd619c56	chore: Add support for vector-stores files api for Milvus (#2582 ) # What does this PR do? ### Summary This pull request implements support for the OpenAI Vector Store Files API for the Milvus vector store provider in `llama_stack`. It enables storing, loading, updating, and deleting file metadata and file contents in Milvus collections, allowing OpenAI vector store files to be managed directly within Milvus. ### Main Changes - Milvus Vector Store Files API Implementation - Implements all required methods for storing, loading, updating, and deleting vector store file metadata and contents (`_save_openai_vector_store_file`, `_load_openai_vector_store_file`, `_load_openai_vector_store_file_contents`, `_update_openai_vector_store_file`, `_delete_openai_vector_store_file_from_storage`). - Uses two Milvus collections: `openai_vector_store_files` (for metadata) and `openai_vector_store_files_contents` (for chunked file contents). - Collections are created dynamically if they do not exist, with appropriate schema definitions. - Collection Name Sanitization - Adds a `sanitize_collection_name` utility to ensure Milvus collection names only contain valid characters (letters, numbers, underscores). 
- Testing - Updates test skip logic to include `"inline::milvus"` for cases where the OpenAI Vector Store Files API is not supported, improving integration test accuracy. - Other Improvements - Passes `kvstore` to `MilvusIndex` for consistency. - Removes obsolete NotImplementedErrors and legacy code for file storage. ## Test Plan CI and tested via a test script ## Notes - `VectorDB` currently uses the `name` as the `identifier` in `openai_create_vector_store`. We need to add `name` as a field to `VectorDB` and generate the `identifier` upon creation. OpenAI is not idempotent with respect to the `name` field that they pass (i.e., you can pass the same name multiple times and OpenAI will generate a new identifier). I'll add a follow up PR for this. - The `Files` api needs to use `files-` as a prefix in the identifier. I have updated the Vector Store to use the OpenAI prefix `vs_*`. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-07-03 12:15:33 -07:00
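A utility like the `sanitize_collection_name` mentioned in 4afd619c56 could be as small as the sketch below. The body is a guess at the behavior described (only letters, numbers, and underscores survive). Replacing each invalid character with an underscore is an assumption; the actual PR may strip or encode them differently.

```python
import re


def sanitize_collection_name(name: str) -> str:
    """Sketch: replace every character that is not a letter, digit, or underscore."""
    # Assumption: invalid characters become "_" rather than being dropped.
    return re.sub(r"[^a-zA-Z0-9_]", "_", name)
```

Under that assumption, an identifier such as `vs_1234-abcd.efgh` would map to `vs_1234_abcd_efgh`. Note that two distinct names can collapse to the same sanitized name, which a real implementation may need to guard against.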
Sébastien Han	dae1fcd3c2	ci: let pytest run the distro server (#2586 ) # What does this PR do? * Use #2580 functionality to auto-start the server with the tests * Reduce timeout to 30sec * Print server logs on errors * Pytest logs are collected to a file pytest.log Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-03 10:51:46 -07:00
Akram Ben Aissi	f4950f4ef0	fix: AccessDeniedError leads to HTTP 500 instead of error 403 (#2595 ) Resolves access control error visibility issues where 500 errors were returned instead of proper 403 responses with actionable error messages. • Enhance AccessDeniedError with detailed context and improve exception handling • Enhanced AccessDeniedError class to include user, action, and resource context - Added constructor parameters for action, resource, and user - Generate detailed error messages showing user principal, attributes, and attempted resource - Backward compatible with existing usage (falls back to generic message) • Updated exception handling in server.py - Import AccessDeniedError from access_control module - Return proper 403 status codes with detailed error messages - Separate handling for PermissionError (generic) vs AccessDeniedError (detailed) • Enhanced error context at raise sites - Updated routing_tables/common.py to pass action, resource, and user context - Updated agents persistence to include context in access denied errors - Provides better debugging information for access control issues • Added comprehensive unit tests - Created tests/unit/server/test_server.py with 13 test cases - Covers AccessDeniedError with and without context - Tests all exception types (ValidationError, BadRequestError, AuthenticationRequiredError, etc.) - Validates proper HTTP status codes and error message formats # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. 
--> <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan ``` server: port: 8321 access_policy: - permit: principal: admin actions: [create, read, delete] when: user with admin in groups - permit: actions: [read] when: user with system:authenticated in roles ``` then: ``` curl --request POST --url http://localhost:8321/v1/vector-dbs \ --header "Authorization: Bearer your-bearer" \ --data '{ "vector_db_id": "my_demo_vector_db", "embedding_model": "ibm-granite/granite-embedding-125m-english", "embedding_dimension": 768, "provider_id": "milvus" }' ``` depending if user is in group admin or not, you should get the `AccessDeniedError`. Before this PR, this was leading to an error 500 and `Traceback` displayed in the logs. After the PR, logs display a simpler error (unless DEBUG logging is set) and a 403 Forbidden error is returned on the HTTP side. --------- Signed-off-by: Akram Ben Aissi <<akram.benaissi@gmail.com>>	2025-07-03 10:50:49 -07:00
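The error-handling change in f4950f4ef0 can be sketched in a framework-free form. Everything below is illustrative: the `User` dataclass, the message wording, the `RuntimeError` base class, and the `http_status_for` helper are hypothetical stand-ins for whatever `server.py` and the access-control module actually define.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    """Hypothetical stand-in for the authenticated user object."""
    principal: str
    attributes: dict[str, list[str]] = field(default_factory=dict)


class AccessDeniedError(RuntimeError):
    """Sketch of an access-denied error carrying optional context."""

    def __init__(self, action: Optional[str] = None, resource: Optional[str] = None,
                 user: Optional[User] = None):
        self.action, self.resource, self.user = action, resource, user
        if action and resource and user:
            message = (f"User '{user.principal}' (attributes: {user.attributes}) "
                       f"cannot perform action '{action}' on resource '{resource}'")
        else:
            # Backward-compatible generic message when no context is passed.
            message = "Insufficient permissions"
        super().__init__(message)


def http_status_for(exc: Exception) -> int:
    """Map access errors to 403 instead of letting them surface as 500."""
    if isinstance(exc, (AccessDeniedError, PermissionError)):
        return 403
    return 500
```

With this shape, `AccessDeniedError("create", "vector_db::my_demo_vector_db", User("alice"))` yields a message naming the user, action, and resource, while a bare `AccessDeniedError()` keeps the generic text.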
ehhuang	3c43a2f529	fix: store configs (#2593 ) # What does this PR do? https://github.com/meta-llama/llama-stack/pull/2490 broke postgres_demo, as the config expected a str but the value was converted to int. This PR: 1. Updates the type of port in sqlstore to be int 2. template generation uses `dict` instead of `StackRunConfig` so as to avoid failing pydantic typechecks. 3. Adds `replace_env_vars` to StackRunConfig instantiation in `configure.py` (not sure why this wasn't needed before). ## Test Plan `llama stack build --template postgres_demo --image-type conda --run`	2025-07-03 10:07:23 -07:00
Sébastien Han	aa273944fd	fix: add mcp dependency to agent provider (#2587 ) # What does this PR do? The agent depends on utils.tools.mcp. Closes: https://github.com/meta-llama/llama-stack/issues/2576 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-03 14:59:01 +02:00
Christian Zaccaria	b246b0660e	docs: Add quick_start.ipynb notebook equivalent of index.md Quickstart guide (#2128 ) # What does this PR do? - Adding a notebook equivalent of the [getting_started/index.md#Quickstart guide](https://github.com/meta-llama/llama-stack/blob/main/docs/source/getting_started/index.md). ## To discuss Note: works locally, but I am encountering issues when attempting to run through the notebook on Google Colab. Specifically, on the last step to run the demo, the `knowledge_search` tool doesn't seem to be called i.e.,: ``` rag_tool> Ingesting document: https://www.paulgraham.com/greatwork.html prompt> How do you do great work? inference> I don't have personal experiences or emotions, but I was trained on a large corpus of text data and use various techniques such as natural language processing (NLP) and machine learning algorithms to generate human-like responses. ``` I would expect to get something like: ``` rag_tool> Ingesting document: https://www.paulgraham.com/greatwork.html prompt> How do you do great work? 
inference> [knowledge_search(query="What is the key to doing great work")] tool_execution> Tool:knowledge_search Args:{'query': 'What is the key to doing great work'} tool_execution> Tool:knowledge_search Response:[TextContentItem(text='knowledge_search tool found 5 chunks: .... .... ```	2025-07-03 13:55:43 +02:00
Sumanth Kamenani	577ec382e1	fix(docs): update Agents101 notebook for builtin websearch (#2591 ) - Switch from BRAVE_SEARCH_API_KEY to TAVILY_SEARCH_API_KEY - Add provider_data to LlamaStackClient for API key passing - Use builtin::websearch toolgroup instead of manual tool config - Fix message types to use UserMessage instead of plain dict - Add streaming support with proper type casting - Remove async from EventLogger loop (bug fix) Fixes websearch functionality in agents tutorial by properly configuring Tavily search provider integration. # What does this PR do? Fixes the Agents101 tutorial notebook to work with the current Llama Stack websearch implementation. The tutorial was using outdated Brave Search configuration that no longer works with the current server setup. Key Changes: - Switch API provider: Change from `BRAVE_SEARCH_API_KEY` to `TAVILY_SEARCH_API_KEY` to match server configuration - Fix client setup: Add `provider_data` to `LlamaStackClient` to properly pass API keys to server - Modernize tool usage: Replace manual tool configuration with `tools=["builtin::websearch"]` - Fix type safety: Use `UserMessage` type instead of plain dictionaries for messages - Fix streaming: Add proper streaming support with `stream=True` and type casting - Fix EventLogger: Remove incorrect `async for` usage (should be `for`) Why needed: Users following the tutorial were getting 401 Unauthorized errors because the notebook wasn't properly configured for the Tavily search provider that the server actually uses. ## Test Plan Prerequisites: 1. Start Llama Stack server with Ollama template and `TAVILY_SEARCH_API_KEY` environment variable 2. Set `TAVILY_SEARCH_API_KEY` in your `.env` file Testing Steps: 1. Clone and setup: ```bash git checkout fix-2558-update-agents101 cd docs/zero_to_hero_guide/ ``` 2. 
Start server with API key: ```bash export TAVILY_SEARCH_API_KEY="your_tavily_api_key" podman run -it --network=host -v ~/.llama:/root/.llama:Z \ --env INFERENCE_MODEL=$INFERENCE_MODEL \ --env OLLAMA_URL=http://localhost:11434 \ --env TAVILY_SEARCH_API_KEY=$TAVILY_SEARCH_API_KEY \ llamastack/distribution-ollama --port $LLAMA_STACK_PORT ``` 3. Run the notebook: - Open `07_Agents101.ipynb` in Jupyter - Execute all cells in order - Cell 5 should run without errors and show successful web search results Expected Results: - ✅ No 401 Unauthorized errors - ✅ Agent successfully calls `brave_search.call()` with web results - ✅ Switzerland travel recommendations appear in output - ✅ Follow-up questions work correctly Before this fix: Users got `401 Unauthorized` errors and tutorial failed After this fix: Tutorial works end-to-end with proper web search functionality Tested with: - Tavily API key (free tier) - Ollama distribution template - Llama-3.2-3B-Instruct model	2025-07-03 11:14:51 +02:00
Wen Zhou	040424acf5	docs: update full list of providers with matched APIs and dockerhub images (#2452 ) # What does this PR do? - add model_type in example - change "Memory" to "VectorIO" as column name - update index.md and README.md ## Test Plan run pre-commit to catch changes. --------- Signed-off-by: Wen Zhou <wenzhou@redhat.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-07-03 10:12:56 +02:00
Nate Harada	5b07755556	docs: Minor spelling fix (#2592 ) # What does this PR do? Minor spelling fix in the comments ## Test Plan No code changes	2025-07-02 20:26:51 -04:00
Jorge	4d0d2d685f	fix: Set parameter usedforsecurity=False when calling hashlib.md5 in order to fix rag_tool.insert on FIPS clusters (#2577 ) # What does this PR do? Set parameter `usedforsecurity=False` when calling hashlib.md5 in order to fix rag_tool.insert on FIPS clusters Closes #2571 --------- Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>	2025-07-02 12:07:05 +02:00
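A minimal sketch of the kind of call the commit above describes, not the project's actual code: `content_fingerprint` is a hypothetical helper name. The `usedforsecurity` keyword is part of Python's `hashlib` constructors from Python 3.9 onward.

```python
import hashlib


def content_fingerprint(text: str) -> str:
    """Return an MD5 hex digest used only as a non-cryptographic identifier.

    Hypothetical helper for illustration; the name is not taken from llama-stack.
    """
    # usedforsecurity=False (Python 3.9+) marks the digest as non-security use,
    # which is what allows MD5 on FIPS-restricted builds.
    return hashlib.md5(text.encode("utf-8"), usedforsecurity=False).hexdigest()
```

The digest value is the same as a plain `hashlib.md5(...)` call; only the security marking differs.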
ehhuang	fc735a414e	test: Add one-step integration testing with server auto-start (#2580 ) ## Summary Add support for `server:<config>` format in `--stack-config` option to enable seamless one-step integration testing. This eliminates the need to manually start servers in separate terminals before running tests. 
## Key Features - Auto-start server: Automatically launches `llama stack run <config>` if target port is available - Smart reuse: Reuses existing server if port is already occupied - Health check polling: Waits up to 2 minutes for server readiness via `/v1/health` endpoint - Custom port support: Use `server:<config>:<port>` for non-default ports - Clean output: Server runs quietly in background without cluttering test output - Backward compatibility: All existing `--stack-config` formats continue to work ## Usage Examples ```bash # Auto-start server with default port 8321 pytest tests/integration/inference/ --stack-config=server:fireworks # Use custom port pytest tests/integration/safety/ --stack-config=server:together:8322 # Run multiple test suites seamlessly pytest tests/integration/inference/ tests/integration/agents/ --stack-config=server:starter ``` ## Implementation Details - Enhanced `llama_stack_client` fixture with server management - Updated documentation with cleaner organization and comprehensive examples - Added utility functions for port checking, server startup, and health verification ## Test Plan - Verified server auto-start when port 8321 is available - Verified server reuse when port 8321 is occupied - Tested health check polling via `/v1/health` endpoint - Confirmed custom port configuration works correctly - Verified backward compatibility with existing config formats ## Before/After Comparison Before (2 steps): ```bash # Terminal 1: Start server manually llama stack run fireworks --port 8321 # Terminal 2: Wait for startup, then run tests pytest tests/integration/inference/ --stack-config=http://localhost:8321 ``` After (1 step): ```bash # Single command handles everything pytest tests/integration/inference/ --stack-config=server:fireworks ```	2025-07-01 14:48:46 -07:00
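The `server:<config>` and `server:<config>:<port>` formats described in the commit above could be parsed roughly as follows. This is a sketch under assumptions, not the fixture's real implementation: `ServerSpec`, `parse_server_spec`, and `is_port_in_use` are hypothetical names, and only the default port 8321 comes from the commit message.

```python
import socket
from dataclasses import dataclass
from typing import Optional

DEFAULT_PORT = 8321  # default port named in the commit message


@dataclass
class ServerSpec:
    """Hypothetical container for a parsed --stack-config server value."""
    config: str
    port: int


def parse_server_spec(value: str) -> Optional[ServerSpec]:
    """Parse 'server:<config>' or 'server:<config>:<port>'.

    Returns None for any other format (for example an http:// URL), so the
    caller can fall back to its existing handling.
    """
    parts = value.split(":")
    if parts[0] != "server" or len(parts) not in (2, 3) or not parts[1]:
        return None
    port = int(parts[2]) if len(parts) == 3 else DEFAULT_PORT
    return ServerSpec(config=parts[1], port=port)


def is_port_in_use(port: int, host: str = "localhost") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0
```

A caller could reuse an existing server when `is_port_in_use(spec.port)` is true and start one otherwise, which mirrors the "smart reuse" behaviour listed in the commit.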
Wen Zhou	958600a5c1	fix: update zero_to_hero package and README (#2578 ) # What does this PR do? - update README.md format and python version - update package name: CustomTool was renamed to ClientTool in https://github.com/meta-llama/llama-stack-client-python/pull/73 Closes #2556 ## Test Plan Signed-off-by: Wen Zhou <wenzhou@redhat.com>	2025-07-01 11:08:55 -07:00
Nathan Weinberg	d165000bbc	docs: specify the ability to train non-Llama models (#2573 ) # What does this PR do? Clarifies that non-Llama models can be trained via the Post Training API ## Test Plan Build docs locally Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-01 19:29:06 +05:30
Sébastien Han	25268854bc	fix: allow default empty vars for conditionals (#2570 ) # What does this PR do? We were not using conditionals correctly: conditionals can only be used when the env variable is set, so `${env.ENVIRONMENT:+}` would return None if ENVIRONMENT is not set. If you want to create a conditional value, you need to use `${env.ENVIRONMENT:=}`; this picks the value of ENVIRONMENT if it is set and otherwise returns None. Closes: https://github.com/meta-llama/llama-stack/issues/2564 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-01 14:42:05 +02:00
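The difference between the `:=` and `:+` forms in the commit above can be illustrated with a small resolver. This is a sketch under assumed, shell-like semantics and is not the llama-stack implementation: `resolve_env_ref` is a hypothetical name, and it handles only a string that consists of a single `${env.NAME...}` reference.

```python
import re
from typing import Mapping, Optional

# Matches ${env.NAME}, ${env.NAME:=default} and ${env.NAME:+value}.
_REF = re.compile(r"^\$\{env\.(\w+)(?::([=+])(.*))?\}$")


def resolve_env_ref(ref: str, env: Mapping[str, str]) -> Optional[str]:
    """Resolve one env reference; an empty result is reported as None."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"not an env reference: {ref!r}")
    name, op, arg = match.groups()
    value = env.get(name)
    if op == "=":
        # Default form: use the variable if set, otherwise the default.
        result = value if value else arg
    elif op == "+":
        # Conditional form: use the given text only if the variable is set.
        result = arg if value else ""
    else:
        result = value
    return result or None
```

Under these assumptions `${env.ENVIRONMENT:=}` yields the variable's value when it is set and None otherwise, while `${env.ENVIRONMENT:+}` yields None in both cases because its replacement text is empty.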
Nathan Weinberg	faaeccc6fd	docs: update external provider guide and navigation (#2567 ) # What does this PR do? The external providers guide can now be accessed directly from the sidebar ## Test Plan Build locally to test the changes Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-01 09:42:32 +02:00
Francisco Arceo	0066135944	chore: Enabling VectorIO Integration tests for Milvus (#2546 )	2025-06-30 19:49:59 -07:00
Francisco Arceo	5785ccda35	fix: Fixing Milvus sample config and updating documentation (#2568 )	2025-06-30 19:25:23 -07:00
Matthew Farrellee	f6d91f45ba	fix: update zero-to-hero guide for modern llama stack (#2555 ) # What does this PR do? closes #2553 ## Test Plan run through notebooks w/ llama stack running on localhost:{8321,8322}	2025-06-30 18:09:33 -07:00
Matthew Farrellee	13aa367c8a	fix: default api_key from env must be a SecretStr (#2565 ) # What does this PR do? fixes the api_key type when read from env ## Test Plan run nvidia template w/o api_key in run.yaml and perform inference before change the inference will fail w/ - ``` File ".../llama-stack/llama_stack/providers/remote/inference/nvidia/nvidia.py", line 118, in _get_client_for_base_url api_key=(self._config.api_key.get_secret_value() if self._config.api_key else "NO KEY"), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'str' object has no attribute 'get_secret_value' ```	2025-06-30 18:08:44 -07:00
Nathan Weinberg	ba9acce93b	docs: fixed incorrect API list item (#2566 ) Current text did not match section in example Ollama distro: https://llama-stack.readthedocs.io/en/latest/distributions/configuration.html Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-06-30 18:08:19 -07:00
Ashwin Bharambe	b333a3c03a	fix(ollama): Download remote image URLs for Ollama (#2551 ) ## What does this PR do? Ollama does not support remote images. Only local file paths OR base64 inputs are supported. This PR ensures that the Stack downloads remote images and passes the base64 down to the inference engine. ## Test Plan Added a test case for Responses and ran it for both `fireworks` and `ollama` providers.	2025-06-30 20:36:11 +05:30
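The download-then-encode step described in the commit above can be sketched with the standard library alone. This is an illustration under assumptions, not the provider's code: `image_url_to_base64` and `_download` are hypothetical names, and the `download` parameter exists only so the function can be exercised without network access.

```python
import base64
from typing import Callable
from urllib.request import urlopen


def _download(url: str) -> bytes:
    """Fetch the raw bytes at url (hypothetical default downloader)."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def image_url_to_base64(url: str, download: Callable[[str], bytes] = _download) -> str:
    """Download a remote image and return its content as a base64 string."""
    return base64.b64encode(download(url)).decode("ascii")
```

The resulting string is what would be handed to an engine that accepts only local paths or base64 input.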
Sébastien Han	c9a49a80e8	docs: auto generated documentation for providers (#2543 ) # What does this PR do? Simple approach to get some provider pages in the docs. Add or update description fields in the provider configuration class using Pydantic’s Field, ensuring these descriptions are clear and complete, as they will be used to auto-generate provider documentation via ./scripts/distro_codegen.py instead of editing the docs manually. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-30 15:13:20 +02:00
Sébastien Han	8d8e90d78e	fix: add missing argument and methods (#2550 ) # What does this PR do? Resolves: ``` mypy.....................................................................Failed - hook id: mypy - exit code: 1 llama_stack/providers/utils/responses/responses_store.py:119: error: Missing positional argument "policy" in call to "fetch_one" of "AuthorizedSqlStore" [call-arg] llama_stack/providers/utils/responses/responses_store.py:122: error: "AuthorizedSqlStore" has no attribute "delete" [attr-defined] Found 2 errors in 1 file (checked 403 source files) ``` Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-30 14:55:37 +02:00
Krzysztof Malczuk	be9bf68246	feat: Add webmethod for deleting openai responses (#2160 ) # What does this PR do? This PR creates a webmethod for deleting OpenAI responses, adds an implementation for it, and adds an integration test for the OpenAI delete response method. [//]: # (If resolving an issue, uncomment and update the line below) # (Closes #2077) ## Test Plan Ran the standard tests, the pre-commit hooks, and the unit tests. # (## Documentation) For this PR I based the routes and implementation on the current get and create methods. The unit tests could not cover this case because the mock interface in use did not allow CRUD to be tested effectively. I instead created an integration test to match the existing ones in the test_openai_responses.	2025-06-30 11:28:02 +02:00
Wen Zhou	6fa5271807	docs: update document since container is not an option for "llama stack run" + update docs with current "usage" (#2531 ) # What does this PR do? - the change from https://github.com/meta-llama/llama-stack/issues/2110 requires a documentation update: "container" is not a valid value for --image-type - chore: updates from standard output ## Test Plan Signed-off-by: Wen Zhou <wenzhou@redhat.com>	2025-06-30 11:02:07 +05:30

2178 commits