llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-06 18:40:57 +00:00

Author	SHA1	Message	Date
Francisco Arceo	ea80ea63ac	chore: Updating chunk id generation to ensure uniqueness (#2618 ) # What does this PR do? This handles an edge case for `generate_chunk_id` if the concatenation of the `document_id` and `chunk_text` combination are not unique. Adding the window location ensures uniqueness. ## Test Plan Added unit test Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-07-04 10:26:35 +05:30
Francisco Arceo	4afd619c56	chore: Add support for vector-stores files api for Milvus (#2582 ) Some checks failed Integration Tests / test-matrix (server, 3.13, inference) (push) Failing after 10s Details Integration Tests / test-matrix (server, 3.13, post_training) (push) Failing after 9s Details Integration Tests / test-matrix (server, 3.13, datasets) (push) Failing after 12s Details Integration Tests / test-matrix (server, 3.13, scoring) (push) Failing after 7s Details Integration Tests / test-matrix (server, 3.13, inspect) (push) Failing after 13s Details Integration Tests / test-matrix (server, 3.13, providers) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 7s Details Integration Tests / test-matrix (server, 3.13, vector_io) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s Details Integration Tests / test-matrix (server, 3.13, tool_runtime) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 22s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 24s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 18s Details Test Llama Stack Build / generate-matrix (push) Successful in 20s Details Python Package Build Test / build (3.13) (push) Failing after 1s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 28s Details Unit Tests / unit-tests (3.12) (push) Failing after 3s Details Test Llama Stack Build / build (push) Failing after 4s Details Test External Providers / test-external-providers (venv) (push) Failing after 6s Details Update ReadTheDocs / update-readthedocs (push) Failing after 5s Details Unit Tests / unit-tests (3.13) (push) Failing after 9s Details Python Package Build Test / build (3.12) (push) Failing after 51s Details Test Llama Stack Build / build-single-provider (push) Failing after 55s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 54s Details Pre-commit / pre-commit (push) Successful in 1m44s Details # What does this PR do? ### Summary This pull request implements support for the OpenAI Vector Store Files API for the Milvus vector store provider in `llama_stack`. It enables storing, loading, updating, and deleting file metadata and file contents in Milvus collections, allowing OpenAI vector store files to be managed directly within Milvus. ### Main Changes - Milvus Vector Store Files API Implementation - Implements all required methods for storing, loading, updating, and deleting vector store file metadata and contents (`_save_openai_vector_store_file`, `_load_openai_vector_store_file`, `_load_openai_vector_store_file_contents`, `_update_openai_vector_store_file`, `_delete_openai_vector_store_file_from_storage`). - Uses two Milvus collections: `openai_vector_store_files` (for metadata) and `openai_vector_store_files_contents` (for chunked file contents). - Collections are created dynamically if they do not exist, with appropriate schema definitions. - Collection Name Sanitization - Adds a `sanitize_collection_name` utility to ensure Milvus collection names only contain valid characters (letters, numbers, underscores). - Testing - Updates test skip logic to include `"inline::milvus"` for cases where the OpenAI Vector Store Files API is not supported, improving integration test accuracy. - Other Improvements - Passes `kvstore` to `MilvusIndex` for consistency. - Removes obsolete NotImplementedErrors and legacy code for file storage. ## Test Plan CI and tested via a test script ## Notes - `VectorDB` currently uses the `name` as the `identifier` in `openai_create_vector_store`. We need to add `name` as a field to `VectorDB` and generate the `identifier` upon creation. OpenAI is not idempotent with respect to the `name` field that they pass (i.e., you can pass the same name multiple times and OpenAI will generate a new identifier). I'll add a follow up PR for this. - The `Files` api needs to use `files-` as a prefix in the identifier. I have updated the Vector Store to use the OpenAI prefix `vs_*`. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-07-03 12:15:33 -07:00
Sébastien Han	dae1fcd3c2	ci: let pytest run the distro server (#2586 ) # What does this PR do? * Use #2580 functionality to auto-start the server with the tests * Reduce timeout to 30sec * Print server logs on errors * Pytest logs are collected to a file pytest.log Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-03 10:51:46 -07:00
Akram Ben Aissi	f4950f4ef0	fix: AccessDeniedError leads to HTTP 500 instead of error 403 (#2595 ) Resolves access control error visibility issues where 500 errors were returned instead of proper 403 responses with actionable error messages. • Enhance AccessDeniedError with detailed context and improve exception handling • Enhanced AccessDeniedError class to include user, action, and resource context - Added constructor parameters for action, resource, and user - Generate detailed error messages showing user principal, attributes, and attempted resource - Backward compatible with existing usage (falls back to generic message) • Updated exception handling in server.py - Import AccessDeniedError from access_control module - Return proper 403 status codes with detailed error messages - Separate handling for PermissionError (generic) vs AccessDeniedError (detailed) • Enhanced error context at raise sites - Updated routing_tables/common.py to pass action, resource, and user context - Updated agents persistence to include context in access denied errors - Provides better debugging information for access control issues • Added comprehensive unit tests - Created tests/unit/server/test_server.py with 13 test cases - Covers AccessDeniedError with and without context - Tests all exception types (ValidationError, BadRequestError, AuthenticationRequiredError, etc.) - Validates proper HTTP status codes and error message formats # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan ``` server: port: 8321 access_policy: - permit: principal: admin actions: [create, read, delete] when: user with admin in groups - permit: actions: [read] when: user with system:authenticated in roles ``` then: ``` curl --request POST --url http://localhost:8321/v1/vector-dbs \ --header "Authorization: Bearer your-bearer" \ --data '{ "vector_db_id": "my_demo_vector_db", "embedding_model": "ibm-granite/granite-embedding-125m-english", "embedding_dimension": 768, "provider_id": "milvus" }' ``` depending if user is in group admin or not, you should get the `AccessDeniedError`. Before this PR, this was leading to an error 500 and `Traceback` displayed in the logs. After the PR, logs display a simpler error (unless DEBUG logging is set) and a 403 Forbidden error is returned on the HTTP side. --------- Signed-off-by: Akram Ben Aissi <<akram.benaissi@gmail.com>>	2025-07-03 10:50:49 -07:00
ehhuang	3c43a2f529	fix: store configs (#2593 ) # What does this PR do? https://github.com/meta-llama/llama-stack/pull/2490 broke postgres_demo, as the config expected a str but the value was converted to int. This PR: 1. Updates the type of port in sqlstore to be int 2. template generation uses `dict` instead of `StackRunConfig` so as to avoid failing pydantic typechecks. 3. Adds `replace_env_vars` to StackRunConfig instantiation in `configure.py` (not sure why this wasn't needed before). ## Test Plan `llama stack build --template postgres_demo --image-type conda --run`	2025-07-03 10:07:23 -07:00
Sébastien Han	aa273944fd	fix: add mcp dependency to agent provider (#2587 ) # What does this PR do? The agent depends on utils.tools.mcp. Closes: https://github.com/meta-llama/llama-stack/issues/2576 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-03 14:59:01 +02:00
Christian Zaccaria	b246b0660e	docs: Add quick_start.ipynb notebook equivalent of index.md Quickstart guide (#2128 ) Some checks failed Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 4s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 15s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 20s Details Python Package Build Test / build (3.12) (push) Failing after 9s Details Python Package Build Test / build (3.13) (push) Failing after 9s Details Test External Providers / test-external-providers (venv) (push) Failing after 8s Details Update ReadTheDocs / update-readthedocs (push) Failing after 5s Details Unit Tests / unit-tests (3.12) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 52s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 54s Details Unit Tests / unit-tests (3.13) (push) Failing after 50s Details Pre-commit / pre-commit (push) Successful in 1m51s Details # What does this PR do? - Adding a notebook equivalent of the [getting_started/index.md#Quickstart guide](https://github.com/meta-llama/llama-stack/blob/main/docs/source/getting_started/index.md). ## To discuss Note: works locally, but I am encountering issues when attempting to run through the notebook on Google Colab. Specifically, on the last step to run the demo, the `knowledge_search` tool doesn't seem to be called i.e.,: ``` rag_tool> Ingesting document: https://www.paulgraham.com/greatwork.html prompt> How do you do great work? inference> I don't have personal experiences or emotions, but I was trained on a large corpus of text data and use various techniques such as natural language processing (NLP) and machine learning algorithms to generate human-like responses. ``` I would expect to get something like: ``` rag_tool> Ingesting document: https://www.paulgraham.com/greatwork.html prompt> How do you do great work? inference> [knowledge_search(query="What is the key to doing great work")] tool_execution> Tool:knowledge_search Args:{'query': 'What is the key to doing great work'} tool_execution> Tool:knowledge_search Response:[TextContentItem(text='knowledge_search tool found 5 chunks: .... .... ```	2025-07-03 13:55:43 +02:00
Sumanth Kamenani	577ec382e1	fix(docs): update Agents101 notebook for builtin websearch (#2591 ) - Switch from BRAVE_SEARCH_API_KEY to TAVILY_SEARCH_API_KEY - Add provider_data to LlamaStackClient for API key passing - Use builtin::websearch toolgroup instead of manual tool config - Fix message types to use UserMessage instead of plain dict - Add streaming support with proper type casting - Remove async from EventLogger loop (bug fix) Fixes websearch functionality in agents tutorial by properly configuring Tavily search provider integration. # What does this PR do? Fixes the Agents101 tutorial notebook to work with the current Llama Stack websearch implementation. The tutorial was using outdated Brave Search configuration that no longer works with the current server setup. Key Changes: - Switch API provider: Change from `BRAVE_SEARCH_API_KEY` to `TAVILY_SEARCH_API_KEY` to match server configuration - Fix client setup: Add `provider_data` to `LlamaStackClient` to properly pass API keys to server - Modernize tool usage: Replace manual tool configuration with `tools=["builtin::websearch"]` - Fix type safety: Use `UserMessage` type instead of plain dictionaries for messages - Fix streaming: Add proper streaming support with `stream=True` and type casting - Fix EventLogger: Remove incorrect `async for` usage (should be `for`) Why needed: Users following the tutorial were getting 401 Unauthorized errors because the notebook wasn't properly configured for the Tavily search provider that the server actually uses. ## Test Plan Prerequisites: 1. Start Llama Stack server with Ollama template and `TAVILY_SEARCH_API_KEY` environment variable 2. Set `TAVILY_SEARCH_API_KEY` in your `.env` file Testing Steps: 1. Clone and setup: ```bash git checkout fix-2558-update-agents101 cd docs/zero_to_hero_guide/ ``` 2. Start server with API key: ```bash export TAVILY_SEARCH_API_KEY="your_tavily_api_key" podman run -it --network=host -v ~/.llama:/root/.llama:Z \ --env INFERENCE_MODEL=$INFERENCE_MODEL \ --env OLLAMA_URL=http://localhost:11434 \ --env TAVILY_SEARCH_API_KEY=$TAVILY_SEARCH_API_KEY \ llamastack/distribution-ollama --port $LLAMA_STACK_PORT ``` 3. Run the notebook: - Open `07_Agents101.ipynb` in Jupyter - Execute all cells in order - Cell 5 should run without errors and show successful web search results Expected Results: - ✅ No 401 Unauthorized errors - ✅ Agent successfully calls `brave_search.call()` with web results - ✅ Switzerland travel recommendations appear in output - ✅ Follow-up questions work correctly Before this fix: Users got `401 Unauthorized` errors and tutorial failed After this fix: Tutorial works end-to-end with proper web search functionality Tested with: - Tavily API key (free tier) - Ollama distribution template - Llama-3.2-3B-Instruct model	2025-07-03 11:14:51 +02:00
Wen Zhou	040424acf5	docs: update full list of providers with matched APIs and dockerhub images (#2452 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> - add model_type in example - change "Memory" to "VectorIO" as column name - update index.md and README.md <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> run pre-commit to catch changes. --------- Signed-off-by: Wen Zhou <wenzhou@redhat.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-07-03 10:12:56 +02:00
Nate Harada	5b07755556	docs: Minor spelling fix (#2592 ) Some checks failed Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 17s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 11s Details Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 23s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 22s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 21s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 19s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 18s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 34s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 33s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 33s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 33s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 31s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 20s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 23s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 30s Details Python Package Build Test / build (3.12) (push) Failing after 47s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 56s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 54s Details Python Package Build Test / build (3.13) (push) Failing after 42s Details Test External Providers / test-external-providers (venv) (push) Failing after 27s Details Unit Tests / unit-tests (3.13) (push) Failing after 36s Details Unit Tests / unit-tests (3.12) (push) Failing after 38s Details Pre-commit / pre-commit (push) Successful in 2m3s Details # What does this PR do? Minor spelling fix in the comments ## Test Plan No code changes	2025-07-02 20:26:51 -04:00
Jorge	4d0d2d685f	fix: Set parameter usedforsecurity=False when calling hashlib.md5 in order to fix rag_tool.insert on FIPS clusters (#2577 ) Some checks failed Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 18s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 26s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 25s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 24s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 26s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 23s Details Python Package Build Test / build (3.12) (push) Failing after 1s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 24s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 31s Details Unit Tests / unit-tests (3.12) (push) Failing after 5s Details Test External Providers / test-external-providers (venv) (push) Failing after 5s Details Unit Tests / unit-tests (3.13) (push) Failing after 4s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 34s Details Python Package Build Test / build (3.13) (push) Failing after 33s Details Pre-commit / pre-commit (push) Successful in 1m52s Details # What does this PR do? Set parameter `usedforsecurity=False` when calling hashlib.md5 in order to fix rag_tool.insert on FIPS clusters <!-- If resolving an issue, uncomment and update the line below --> Closes #2571 --------- Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>	2025-07-02 12:07:05 +02:00
ehhuang	fc735a414e	test: Add one-step integration testing with server auto-start (#2580 ) Some checks failed Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 14s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 18s Details Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 21s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 20s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 20s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 20s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 23s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s Details Python Package Build Test / build (3.12) (push) Failing after 1m3s Details Python Package Build Test / build (3.13) (push) Failing after 1m3s Details Test External Providers / test-external-providers (venv) (push) Failing after 1m7s Details Unit Tests / unit-tests (3.12) (push) Failing after 1m15s Details Unit Tests / unit-tests (3.13) (push) Failing after 19s Details Pre-commit / pre-commit (push) Successful in 2m42s Details ## Summary Add support for `server:<config>` format in `--stack-config` option to enable seamless one-step integration testing. This eliminates the need to manually start servers in separate terminals before running tests. ## Key Features - Auto-start server: Automatically launches `llama stack run <config>` if target port is available - Smart reuse: Reuses existing server if port is already occupied - Health check polling: Waits up to 2 minutes for server readiness via `/v1/health` endpoint - Custom port support: Use `server:<config>:<port>` for non-default ports - Clean output: Server runs quietly in background without cluttering test output - Backward compatibility: All existing `--stack-config` formats continue to work ## Usage Examples ```bash # Auto-start server with default port 8321 pytest tests/integration/inference/ --stack-config=server:fireworks # Use custom port pytest tests/integration/safety/ --stack-config=server:together:8322 # Run multiple test suites seamlessly pytest tests/integration/inference/ tests/integration/agents/ --stack-config=server:starter ``` ## Implementation Details - Enhanced `llama_stack_client` fixture with server management - Updated documentation with cleaner organization and comprehensive examples - Added utility functions for port checking, server startup, and health verification ## Test Plan - Verified server auto-start when port 8321 is available - Verified server reuse when port 8321 is occupied - Tested health check polling via `/v1/health` endpoint - Confirmed custom port configuration works correctly - Verified backward compatibility with existing config formats ## Before/After Comparison Before (2 steps): ```bash # Terminal 1: Start server manually llama stack run fireworks --port 8321 # Terminal 2: Wait for startup, then run tests pytest tests/integration/inference/ --stack-config=http://localhost:8321 ``` After (1 step): ```bash # Single command handles everything pytest tests/integration/inference/ --stack-config=server:fireworks ```	2025-07-01 14:48:46 -07:00
Wen Zhou	958600a5c1	fix: update zero_to_hero package and README (#2578 ) Some checks failed Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 6s Details Test Llama Stack Build / generate-matrix (push) Successful in 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s Details Python Package Build Test / build (3.13) (push) Failing after 3s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 8s Details Unit Tests / unit-tests (3.12) (push) Failing after 4s Details Test Llama Stack Build / build (push) Failing after 5s Details Unit Tests / unit-tests (3.13) (push) Failing after 6s Details Update ReadTheDocs / update-readthedocs (push) Failing after 7s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 36s Details Python Package Build Test / build (3.12) (push) Failing after 33s Details Test Llama Stack Build / build-single-provider (push) Failing after 37s Details Test External Providers / test-external-providers (venv) (push) Failing after 32s Details Pre-commit / pre-commit (push) Successful in 1m24s Details # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> - update REAMDE.md format and python version - update package name: CustomTool was renamed to ClientTool in https://github.com/meta-llama/llama-stack-client-python/pull/73 <!-- If resolving an issue, uncomment and update the line below --> Closes #2556 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Wen Zhou <wenzhou@redhat.com>	2025-07-01 11:08:55 -07:00
Nathan Weinberg	d165000bbc	docs: specify the ability to train non-Llama models (#2573 ) # What does this PR do? Clarifies that non-Llama models can be trained via the Post Training API ## Test Plan Build docs locally Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-01 19:29:06 +05:30
Sébastien Han	25268854bc	fix: allow default empty vars for conditionals (#2570 ) # What does this PR do? We were not using conditionals correctly, conditionals can only be used when the env variable is set, so `${env.ENVIRONMENT:+}` would return None is ENVIRONMENT is not set. If you want to create a conditional value, you need to do `${env.ENVIRONMENT:=}`, this will pick the value of ENVIRONMENT if set, otherwise will return None. Closes: https://github.com/meta-llama/llama-stack/issues/2564 Signed-off-by: Sébastien Han <seb@redhat.com>	2025-07-01 14:42:05 +02:00
Nathan Weinberg	faaeccc6fd	docs: update external provider guide and navigation (#2567 ) Some checks failed Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 25s Details Integration Tests / test-matrix (http, 3.13, agents) (push) Failing after 33s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 8s Details Integration Tests / test-matrix (http, 3.12, inspect) (push) Failing after 36s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 31s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 28s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 29s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 14s Details Python Package Build Test / build (3.12) (push) Failing after 9s Details Python Package Build Test / build (3.13) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s Details Test External Providers / test-external-providers (venv) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 15s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 14s Details Unit Tests / unit-tests (3.12) (push) Failing after 10s Details Unit Tests / unit-tests (3.13) (push) Failing after 8s Details Update ReadTheDocs / update-readthedocs (push) Failing after 6s Details Pre-commit / pre-commit (push) Successful in 1m23s Details # What does this PR do? The external providers guide can now be accessed directly from the sidebar ## Test Plan Build locally to test the changes Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-07-01 09:42:32 +02:00
Francisco Arceo	0066135944	chore: Enabling VectorIO Integration tests for Milvus (#2546 ) Some checks failed Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 17s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s Details Test Llama Stack Build / generate-matrix (push) Successful in 6s Details Python Package Build Test / build (3.13) (push) Failing after 1s Details Test External Providers / test-external-providers (venv) (push) Failing after 6s Details Test Llama Stack Build / build (push) Failing after 4s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s Details Update ReadTheDocs / update-readthedocs (push) Failing after 5s Details Unit Tests / unit-tests (3.12) (push) Failing after 8s Details Test Llama Stack Build / build-single-provider (push) Failing after 41s Details Python Package Build Test / build (3.12) (push) Failing after 35s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 41s Details Unit Tests / unit-tests (3.13) (push) Failing after 37s Details Pre-commit / pre-commit (push) Successful in 2m3s Details	2025-06-30 19:49:59 -07:00
Francisco Arceo	5785ccda35	fix: Fixing Milvus sample config and updating documentation (#2568 )	2025-06-30 19:25:23 -07:00
Matthew Farrellee	f6d91f45ba	fix: update zero-to-hero guide for modern llama stack (#2555 ) # What does this PR do? closes #2553 ## Test Plan run through notebooks w/ llama stack running on localhost:{8321,8322}	2025-06-30 18:09:33 -07:00
Matthew Farrellee	13aa367c8a	fix: default api_key from env must be a SecretStr (#2565 ) # What does this PR do? fixes the api_key type when read from env ## Test Plan run nvidia template w/o api_key in run.yaml and perform inference before change the inference will fail w/ - ``` File ".../llama-stack/llama_stack/providers/remote/inference/nvidia/nvidia.py", line 118, in _get_client_for_base_url api_key=(self._config.api_key.get_secret_value() if self._config.api_key else "NO KEY"), ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'str' object has no attribute 'get_secret_value' ```	2025-06-30 18:08:44 -07:00
Nathan Weinberg	ba9acce93b	docs: fixed incorrect API list item (#2566 ) Current text did not match section in example Ollama distro: https://llama-stack.readthedocs.io/en/latest/distributions/configuration.html Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-06-30 18:08:19 -07:00
Ashwin Bharambe	b333a3c03a	fix(ollama): Download remote image URLs for Ollama (#2551 ) Some checks failed Integration Tests / test-matrix (http, 3.13, post_training) (push) Failing after 16s Details Integration Tests / test-matrix (http, 3.13, agents) (push) Failing after 19s Details Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 15s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 46s Details Python Package Build Test / build (3.12) (push) Failing after 43s Details Test External Providers / test-external-providers (venv) (push) Failing after 40s Details Python Package Build Test / build (3.13) (push) Failing after 42s Details Unit Tests / unit-tests (3.13) (push) Failing after 22s Details Unit Tests / unit-tests (3.12) (push) Failing after 25s Details Update ReadTheDocs / update-readthedocs (push) Failing after 20s Details Pre-commit / pre-commit (push) Successful in 2m13s Details ## What does this PR do? Ollama does not support remote images. Only local file paths OR base64 inputs are supported. This PR ensures that the Stack downloads remote images and passes the base64 down to the inference engine. ## Test Plan Added a test cases for Responses and ran it for both `fireworks` and `ollama` providers.	2025-06-30 20:36:11 +05:30
Sébastien Han	c9a49a80e8	docs: auto generated documentation for providers (#2543 ) # What does this PR do? Simple approach to get some provider pages in the docs. Add or update description fields in the provider configuration class using Pydantic’s Field, ensuring these descriptions are clear and complete, as they will be used to auto-generate provider documentation via ./scripts/distro_codegen.py instead of editing the docs manually. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-30 15:13:20 +02:00
Sébastien Han	8d8e90d78e	fix: add missing argument and methods (#2550 ) # What does this PR do? Resolves: ``` mypy.....................................................................Failed - hook id: mypy - exit code: 1 llama_stack/providers/utils/responses/responses_store.py:119: error: Missing positional argument "policy" in call to "fetch_one" of "AuthorizedSqlStore" [call-arg] llama_stack/providers/utils/responses/responses_store.py:122: error: "AuthorizedSqlStore" has no attribute "delete" [attr-defined] Found 2 errors in 1 file (checked 403 source files) ``` Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-30 14:55:37 +02:00
Krzysztof Malczuk	be9bf68246	feat: Add webmethod for deleting openai responses (#2160 ) Some checks failed Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 16s Details Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 12s Details Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s Details Integration Tests / test-matrix (http, 3.12, providers) (push) Failing after 17s Details Integration Tests / test-matrix (http, 3.13, agents) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 16s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 21s Details Test External Providers / test-external-providers (venv) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 19s Details Unit Tests / unit-tests (3.12) (push) Failing after 9s Details Update ReadTheDocs / update-readthedocs (push) Failing after 7s Details Unit Tests / unit-tests (3.13) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 39s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 37s Details Python Package Build Test / build (3.13) (push) Failing after 33s Details Python Package Build Test / build (3.12) (push) Failing after 36s Details Pre-commit / pre-commit (push) Failing after 1m19s Details # What does this PR do? This PR creates a webmethod for deleting open AI responses, adds and implementation for it and makes an integration test for the OpenAI delete response method. [//]: # (If resolving an issue, uncomment and update the line below) # (Closes #2077) ## Test Plan Ran the standard tests and the pre-commit hooks and the unit tests. # (## Documentation) For this pr I made the routes and implementation based on the current get and create methods. The unit tests were not able to handle this test due to the mock interface in use, which did not allow for effective CRUD to be tested. I instead created an integration test to match the existing ones in the test_openai_responses.	2025-06-30 11:28:02 +02:00
Wen Zhou	6fa5271807	docs: update document since container is not an option for "llama stack run" + update docs with current "usage" (#2531 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> - change from https://github.com/meta-llama/llama-stack/issues/2110 need update documentation. "container" is not valid value for --image-type - chore: updates from standard output <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Wen Zhou <wenzhou@redhat.com>	2025-06-30 11:02:07 +05:30
dependabot[bot]	dc1b4a84c3	chore(github-deps): bump astral-sh/setup-uv from 6.3.0 to 6.3.1 (#2548 ) Some checks failed Integration Tests / test-matrix (http, 3.13, providers) (push) Failing after 13s Details Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 28s Details Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 18s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 18s Details Integration Tests / test-matrix (http, 3.13, inference) (push) Failing after 19s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 8s Details Integration Tests / test-matrix (http, 3.12, inspect) (push) Failing after 32s Details Integration Tests / test-matrix (http, 3.13, agents) (push) Failing after 31s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 42s Details Python Package Build Test / build (3.12) (push) Failing after 40s Details Python Package Build Test / build (3.13) (push) Failing after 38s Details Test External Providers / test-external-providers (venv) (push) Failing after 39s Details Unit Tests / unit-tests (3.12) (push) Failing after 21s Details Unit Tests / unit-tests (3.13) (push) Failing after 19s Details Pre-commit / pre-commit (push) Successful in 2m18s Details Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6.3.0 to 6.3.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's releases</a>.</em></p> <blockquote> <h2>v6.3.1 🌈 Do not warn when version not in manifest-file</h2> <h2>Changes</h2> <p>This is a hotfix to change the warning messages that a version could not be found in the local manifest-file to info level.</p> <p>A <code>setup-uv</code> release contains a version-manifest.json file with infos in all available <code>uv</code> releases. When a new <code>uv</code> version is released this is not contained in this file until the file gets updated and a new <code>setup-uv</code> release is made. We will overhaul this process in the future but for now the spamming of warnings is removed.</p> <h2>🐛 Bug fixes</h2> <ul> <li>Do not warn when version not in manifest-file <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/462">#462</a>)</li> </ul> <h2>🧰 Maintenance</h2> <ul> <li>chore: update known versions for 0.7.14 @<a href="https://github.com/apps/github-actions">github-actions[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/459">#459</a>)</li> <li>Revert "Set expected cache dir drive to C: on windows (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/451">#451</a>)" <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/460">#460</a>)</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`bd01e18f51`"><code>bd01e18</code></a> Do not warn when version not in manifest-file (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/462">#462</a>)</li> <li><a href="`c6a5ebaafe`"><code>c6a5eba</code></a> chore: update known versions for 0.7.14 (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/459">#459</a>)</li> <li><a href="`790df8f465`"><code>790df8f</code></a> Revert "Set expected cache dir drive to C: on windows (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/451">#451</a>)" (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/460">#460</a>)</li> <li>See full diff in <a href="`445689ea25...bd01e18f51`">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=astral-sh/setup-uv&package-manager=github_actions&previous-version=6.3.0&new-version=6.3.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-29 13:55:32 -04:00
Ashwin Bharambe	21669b14e7	fix(docs): add setuptools explicitly (#2547 ) Some checks failed Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 31s Details Integration Tests / test-matrix (http, 3.12, datasets) (push) Failing after 35s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s Details Test Llama Stack Build / generate-matrix (push) Successful in 3s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 4s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 6s Details Test Llama Stack Build / build-single-provider (push) Failing after 6s Details Unit Tests / unit-tests (3.12) (push) Failing after 4s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 7s Details Python Package Build Test / build (3.12) (push) Failing after 6s Details Update ReadTheDocs / update-readthedocs (push) Failing after 6s Details Test External Providers / test-external-providers (venv) (push) Failing after 8s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 11s Details Unit Tests / unit-tests (3.13) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 15s Details Test Llama Stack Build / build (push) Failing after 10s Details Python Package Build Test / build (3.13) (push) Failing after 30s Details Pre-commit / pre-commit (push) Successful in 1m23s Details Given the shift to python3.12, we need to explicitly depend on `setuptools` for the pkg_resources import ## Test Plan Run ``` cd local/llama-stack UV_PROJECT_ENVIRONMENT=/tmp/docs uv sync --frozen --group docs cd /tmp/docs uv run python -m sphinx -T -b html -d _build/doctrees -D language=en \ ~/local/llama-stack/docs/source/ \ /tmp/docs/html ```	2025-06-28 08:14:25 +05:30
github-actions[bot]	709eb7da33	build: Bump version to 0.2.13	2025-06-27 23:56:14 +00:00
Francisco Arceo	cc19b56c87	chore: OpenAI compatibility for Milvus (#2470 ) # What does this PR do? Closes https://github.com/meta-llama/llama-stack/issues/2461 ## Test Plan Tested with the `ollama` distriubtion template and updated the vector_io provider to: ```yaml vector_io: - provider_id: milvus provider_type: inline::milvus config: db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/ollama}/milvus_store.db kvstore: type: sqlite db_name: milvus_registry.db ``` Ran the stack ```bash llama stack run ./llama_stack/templates/ollama/run.yaml --image-type venv --env OLLAMA_URL="http://0.0.0.0:11434" ``` Ran the tests: ``` pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2 ``` Output passed. Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-06-27 16:00:36 -07:00
Charlie Doern	65b4fae51d	fix: proper checkpointing logic for HF trainer (#2429 ) # What does this PR do? currently only the last saved model is reported as a checkpoint and associated with the job UUID. since the HF trainer handles checkpoint collection during training, we need to add all of the `checkpoint-*` folders as Checkpoint objects. Adjust the save strategy to be per-epoch to make this easier and to use less storage Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-06-27 17:36:25 -04:00
Ramakrishna Reddy Yekulla	03e61e3fcc	fix: ValueError in faiss vector database serialization (resolves #2519 ) (#2526 ) Some checks failed Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 7s Details Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 22s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 8s Details Integration Tests / test-matrix (http, 3.12, datasets) (push) Failing after 22s Details Integration Tests / test-matrix (http, 3.13, inference) (push) Failing after 23s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 10s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 7s Details Python Package Build Test / build (3.12) (push) Failing after 15s Details Python Package Build Test / build (3.13) (push) Failing after 17s Details Test External Providers / test-external-providers (venv) (push) Failing after 20s Details Unit Tests / unit-tests (3.12) (push) Failing after 21s Details Unit Tests / unit-tests (3.13) (push) Failing after 11s Details Pre-commit / pre-commit (push) Successful in 1m12s Details The error message was misleading as it appeared to be an Ollama connectivity issue, but actually occurred during faiss vector database initialization. ## 🔍 Root Cause Analysis The issue was in the faiss vector database serialization logic in `llama_stack/providers/inline/vector_io/faiss/faiss.py`: 1. Saving: `faiss.serialize_index()` returns binary data (uint8 numpy array) 2. Bug: Code incorrectly used `np.savetxt()` which converts binary to text with scientific notation (e.g., `7.300000000000000000e+01`) 3. Loading: `np.loadtxt(buffer, dtype=np.uint8)` failed to parse scientific notation back to uint8 4. Result: Server crashed during initialization before reaching Ollama connectivity check ## ✅ Solution Replaced text-based serialization with proper binary serialization: ``` After (fixed): ```python # Saving - proper binary format np.save(buffer, np_index, allow_pickle=False) # Loading - proper binary format self.index = faiss.deserialize_index(np.load(buffer, allow_pickle=False)) ``` ## 🧪 Testing - ✅ Binary serialization/deserialization works correctly - ✅ Backward compatible with existing functionality - ✅ No security concerns (allow_pickle=False maintained) - ✅ Resolves the specific ValueError mentioned in the issue ## 📊 Impact This fix resolves: - ValueError during server startup with Ollama templates ## 🔗 Related Issues - Closes #2519 - Affects all users of Ollama template and faiss vector_io configurations ## 📝 Files Changed - `llama_stack/providers/inline/vector_io/faiss/faiss.py` - Fixed serialization methods in `initialize()` and `_save_index()` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Ben Browning <bbrownin@redhat.com>	2025-06-27 14:34:52 -04:00
Rohan Awhad	7cb5d3c60f	chore: standardize unsupported model error #2517 (#2518 ) # What does this PR do? - llama_stack/exceptions.py: Add UnsupportedModelError class - remote inference ollama.py and utils/inference/model_registry.py: Changed ValueError in favor of UnsupportedModelError - utils/inference/litellm_openai_mixin.py: remove `register_model` function implementation from `LiteLLMOpenAIMixin` class. Now uses the parent class `ModelRegistryHelper`'s function implementation Closes #2517 ## Test Plan 1. Create a new `test_run_openai.yaml` and paste the following config in it: ```yaml version: '2' image_name: test-image apis: - inference providers: inference: - provider_id: openai provider_type: remote::openai config: max_tokens: 8192 models: - metadata: {} model_id: "non-existent-model" provider_id: openai model_type: llm server: port: 8321 ``` And run the server with: ```bash uv run llama stack run test_run_openai.yaml ``` You should now get a `llama_stack.exceptions.UnsupportedModelError` with the supported list of models in the error message. --- Tested for the following remote inference providers, and they all raise the `UnsupportedModelError`: - Anthropic - Cerebras - Fireworks - Gemini - Groq - Ollama - OpenAI - SambaNova - Together - Watsonx --------- Co-authored-by: Rohan Awhad <rawhad@redhat.com>	2025-06-27 14:26:58 -04:00
Yuan Tang	9baa16e498	fix(security): Upgrade protobuf and aiohttp. Fixes CVE-2025-4565 (#2541 ) # What does this PR do? Fixes CVE-2025-4565 and the following warning: ``` warning: `aiohttp==3.11.13` is yanked (reason: "Regression: https://github.com/aio-libs/aiohttp/issues/10617") ``` Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-06-27 06:58:38 -07:00
Juanma	e7eb9f9adc	fix: dataset metadata without provider_id (#2527 ) # What does this PR do? Fixes an error when inferring dataset provider_id with metadata Closes #[2506](https://github.com/meta-llama/llama-stack/issues/2506) Signed-off-by: Juanma Barea <juanmabareamartinez@gmail.com>	2025-06-27 08:51:29 -04:00
Yuan Tang	40fdce79b3	fix(security): Upgrade urllib3 to v2.5.0. Fixes CVE-2025-50181 and CVE-2025-50182 (#2534 ) Some checks failed Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 16s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 8s Details Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 9s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 6s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 9s Details Test Llama Stack Build / generate-matrix (push) Successful in 3s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 2s Details Python Package Build Test / build (3.13) (push) Failing after 3s Details Python Package Build Test / build (3.12) (push) Failing after 4s Details Unit Tests / unit-tests (3.12) (push) Failing after 4s Details Test Llama Stack Build / build (push) Failing after 4s Details Unit Tests / unit-tests (3.13) (push) Failing after 4s Details Test Llama Stack Build / build-single-provider (push) Failing after 36s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 34s Details Test External Providers / test-external-providers (venv) (push) Failing after 32s Details Pre-commit / pre-commit (push) Successful in 1m21s Details This fixes CVE-2025-50181 and CVE-2025-50182. Changes via: ``` uv sync --upgrade-package urllib3 uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt ``` Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-06-27 10:46:47 +02:00
Wen Zhou	8c3f2762fb	build: update temp. created Containerfile (#2492 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> - conditionally created folder /.llama/providers.d if external_providers_dir is set - do not create /.cache folder, not in use anywhere - combine chmod and copy to one command <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> updated test: ``` export CONTAINER_BINARY=podman LLAMA_STACK_DIR=. uv run llama stack build --template remote-vllm --image-type container --image-name <name> ``` log: ``` Containerfile created successfully in /tmp/tmp.rPMunE39Aw/Containerfile FROM python:3.11-slim WORKDIR /app RUN apt-get update && apt-get install -y iputils-ping net-tools iproute2 dnsutils telnet curl wget telnet git procps psmisc lsof traceroute bubblewrap gcc && rm -rf /var/lib/apt/lists/* ENV UV_SYSTEM_PYTHON=1 RUN pip install uv RUN uv pip install --no-cache sentencepiece pillow pypdf transformers pythainlp faiss-cpu opentelemetry-sdk requests datasets chardet scipy nltk numpy matplotlib psycopg2-binary aiosqlite langdetect autoevals tree_sitter tqdm pandas chromadb-client opentelemetry-exporter-otlp-proto-http redis scikit-learn openai pymongo emoji sqlalchemy[asyncio] mcp aiosqlite fastapi fire httpx uvicorn opentelemetry-sdk opentelemetry-exporter-otlp-proto-http RUN uv pip install --no-cache sentence-transformers --no-deps RUN uv pip install --no-cache torch torchvision --index-url https://download.pytorch.org/whl/cpu # Allows running as non-root user RUN mkdir -p /.llama/providers.d /.cache RUN uv pip install --no-cache llama-stack RUN pip uninstall -y uv ENTRYPOINT ["python", "-m", "llama_stack.distribution.server.server", "--template", "remote-vllm"] RUN chmod -R g+rw /app /.llama /.cache PWD: /tmp/llama-stack Containerfile: /tmp/tmp.rPMunE39Aw/Containerfile + podman build --progress=plain --security-opt label=disable --platform linux/amd64 -t distribution-remote-vllm:0.2.12 -f /tmp/tmp.rPMunE39Aw/Containerfile /tmp/llama-stack .... Success! Build Successful! You can find the newly-built template here: /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml You can run the new Llama Stack distro via: llama stack run /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml --image-type container ``` ``` podman tag localhost/distribution-remote-vllm:dev quay.io/wenzhou/distribution-remote-vllm:2492_2 podman push quay.io/wenzhou/distribution-remote-vllm:2492_2 docker run --rm -p 8321:8321 -e INFERENCE_MODEL="meta-llama/Llama-2-7b-chat-hf" -e VLLM_URL="http://localhost:8000/v1" quay.io/wenzhou/distribution-remote-vllm:2492_2 --port 8321 INFO 2025-06-26 13:47:31,813 __main__:436 server: Using template remote-vllm config file: /app/llama-stack-source/llama_stack/templates/remote-vllm/run.yaml INFO 2025-06-26 13:47:31,818 __main__:438 server: Run configuration: INFO 2025-06-26 13:47:31,826 __main__:440 server: apis: - agents - datasetio - eval - inference - safety - scoring - telemetry - tool_runtime - vector_io benchmarks: [] container_image: null .... ``` ----- previous test: local run` >llama stack build --template remote-vllm --image-type container` image stored in `quay.io/wenzhou/distribution-remote-vllm:2492` --------- Signed-off-by: Wen Zhou <wenzhou@redhat.com>	2025-06-27 10:23:12 +02:00
Yuan Tang	0ddb293d77	docs: Add recent releases to CHANGELOG.md (#2533 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> Update changelog. --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-06-26 23:04:13 -04:00
Ben Browning	0883944bc3	fix: Some missed env variable changes from PR 2490 (#2538 ) Some checks failed Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 25s Details Integration Tests / test-matrix (http, 3.13, providers) (push) Failing after 23s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 17s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 4s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 28s Details Python Package Build Test / build (3.13) (push) Failing after 2s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 8s Details Test Llama Stack Build / generate-matrix (push) Successful in 6s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s Details Test External Providers / test-external-providers (venv) (push) Failing after 3s Details Unit Tests / unit-tests (3.12) (push) Failing after 5s Details Python Package Build Test / build (3.12) (push) Failing after 9s Details Test Llama Stack Build / build-single-provider (push) Failing after 11s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s Details Test Llama Stack Build / build (push) Failing after 6s Details Unit Tests / unit-tests (3.13) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 34s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 30s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 32s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 24s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 29s Details Pre-commit / pre-commit (push) Successful in 1m1s Details # What does this PR do? Some templates were still using the old environment variable substition syntax instead of the new one and were not getting substituted properly. Also, some places didn't handle the new None vs old empty string ("") values that come from the conditional environment variable substitution. This gets the starter and remote-vllm distributions starting again, and I tested various permutations of the starter as chroma and pgvector needed some adjustments to their config classes to handle the new possible `None` values. And, I had to tweak our `Provider` class to also handle `None` values, for cases where we disable providers in the starter config via environment variables. This may not have caught everything that was missed, but I did grep around quite a bit to try and find anything lingering. ## Test Plan The following permutations now all run (or attempt to run to the point of complaining that they can't connect to chroma, vllm, etc) when before they failed immediately on startup because of bad environment variable substitions: ``` uv run llama stack run llama_stack/templates/starter/run.yaml ENABLE_SQLITE_VEC=true uv run llama stack run llama_stack/templates/starter/run.yaml ENABLE_PGVECTOR=true uv run llama stack run llama_stack/templates/starter/run.yaml ENABLE_CHROMADB=true uv run llama stack run llama_stack/templates/starter/run.yaml uv run llama stack run llama_stack/templates/remote-vllm/run.yaml ``` <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: raghotham <rsm@meta.com>	2025-06-26 17:59:15 -07:00
Hardik Shah	eb01a3f1c5	ci: vector_io provider integration tests (#2537 ) Runs integration tests for `vector_io` across the provider matrix. This new workflow adds CI testing across - `inline::faiss`, `remote::chroma`.	2025-06-26 17:04:32 -07:00
grs	68d8f2186f	fix: fix test of root span to match what is being set (#2494 ) Some checks failed Integration Tests / test-matrix (http, 3.12, inspect) (push) Failing after 23s Details Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 13s Details Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 13s Details Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 22s Details Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 22s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 7s Details Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 14s Details Integration Tests / test-matrix (http, 3.13, inspect) (push) Failing after 11s Details Integration Tests / test-matrix (http, 3.13, providers) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.12, post_training) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 20s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.13, post_training) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 10s Details Python Package Build Test / build (3.12) (push) Failing after 7s Details Test External Providers / test-external-providers (venv) (push) Failing after 8s Details Unit Tests / unit-tests (3.13) (push) Failing after 9s Details Python Package Build Test / build (3.13) (push) Failing after 32s Details Unit Tests / unit-tests (3.12) (push) Failing after 48s Details Pre-commit / pre-commit (push) Successful in 1m32s Details # What does this PR do? I get errors when trying to query spans. It appears to be a result of traces being inserted where there is no root_span_id which causes a pydantic validation error on trying to load the data for a query response (and in any case having no span referenced undermines the purpose of the trace). The root cause as far as I can see is an invalid test in the code that inserts the trace, where it is testing for the string "true" against an object set to the python value True. <!-- If resolving an issue, uncomment and update the line below --> Closes #2493 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> With this change I can query spans. Signed-off-by: Gordon Sim <gsim@redhat.com>	2025-06-26 11:41:35 -04:00
Sébastien Han	dbdc811d16	chore: isolate bare minimum project dependencies (#2282 ) Some checks failed Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 12s Details Integration Tests / test-matrix (http, 3.12, datasets) (push) Failing after 20s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 14s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 7s Details Test Llama Stack Build / generate-matrix (push) Successful in 7s Details Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 16s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 16s Details Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 18s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 8s Details Python Package Build Test / build (3.12) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 17s Details Python Package Build Test / build (3.13) (push) Failing after 4s Details Test Llama Stack Build / build-single-provider (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 11s Details Integration Tests / test-matrix (http, 3.12, inference) (push) Failing after 26s Details Integration Tests / test-matrix (http, 3.12, scoring) (push) Failing after 19s Details Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 8s Details Test External Providers / test-external-providers (venv) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 10s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s Details Unit Tests / unit-tests (3.12) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 10s Details Unit Tests / unit-tests (3.13) (push) Failing after 6s Details Update ReadTheDocs / update-readthedocs (push) Failing after 4s Details Test Llama Stack Build / build (push) Failing after 7s Details Pre-commit / pre-commit (push) Successful in 48s Details # What does this PR do? The goal is to promote the minimal set of dependencies the project needs to run, this includes: * dependencies needed to work with the CLI * dependencies needed for the server to run with no providers This also: * Relocate redundant dependencies out of the core project and into the individual providers that actually require them. * Include all necessary server dependencies so the project can run standalone, even without any providers. <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan Build and run distro a server. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-26 10:14:27 +02:00
Sébastien Han	43c1f39bd6	refactor(env)!: enhanced environment variable substitution (#2490 ) # What does this PR do? This commit significantly improves the environment variable substitution functionality in Llama Stack configuration files: * The version field in configuration files has been changed from string to integer type for better type consistency across build and run configurations. * The environment variable substitution system for ${env.FOO:} was fixed and properly returns an error * The environment variable substitution system for ${env.FOO+} returns None instead of an empty strings, it better matches type annotations in config fields * The system includes automatic type conversion for boolean, integer, and float values. * The error messages have been enhanced to provide clearer guidance when environment variables are missing, including suggestions for using default values or conditional syntax. * Comprehensive documentation has been added to the configuration guide explaining all supported syntax patterns, best practices, and runtime override capabilities. * Multiple provider configurations have been updated to use the new conditional syntax for optional API keys, making the system more flexible for different deployment scenarios. The telemetry configuration has been improved to properly handle optional endpoints with appropriate validation, ensuring that required endpoints are specified when their corresponding sinks are enabled. * There were many instances of ${env.NVIDIA_API_KEY:} that should have caused the code to fail. However, due to a bug, the distro server was still being started, and early validation wasn’t triggered. As a result, failures were likely being handled downstream by the providers. I’ve maintained similar behavior by using ${env.NVIDIA_API_KEY:+}, though I believe this is incorrect for many configurations. I’ll leave it to each provider to correct it as needed. * Environment variable substitution now uses the same syntax as Bash parameter expansion. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-26 08:20:08 +05:30
Sébastien Han	36d70637b9	fix: finish conversion to StrEnum (#2514 ) # What does this PR do? We still had a few enum declared to behave like string as well as enum. Let's use StrEnum for those. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-26 08:01:26 +05:30
Sébastien Han	ac5fd57387	chore: remove nested imports (#2515 ) # What does this PR do? * Given that our API packages use "import " in `__init.py__` we don't need to do `from llama_stack.apis.models.models` but simply from llama_stack.apis.models. The decision to use `import ` is debatable and should probably be revisited at one point. * Remove unneeded Ruff F401 rule * Consolidate Ruff F403 rule in the pyprojectfrom llama_stack.apis.models.models Signed-off-by: Sébastien Han <seb@redhat.com>	2025-06-26 08:01:05 +05:30
Ben Browning	2d9fd041eb	fix: annotations list and web_search_preview in Responses (#2520 ) # What does this PR do? These are a couple of fixes to get an example LangChain app working with our OpenAI Responses API implementation. The Responses API spec requires an annotations array in `output[].content[].annotations` and we were not providing one. So, this adds that as an empty list, even though we don't do anything to populate it yet. This prevents an error from client libraries like Langchain that expect this field to always exist, even if an empty list. The other fix is `web_search_preview` is a valid name for the web search tool in the Responses API, but we only responded to `web_search` or `web_search_preview_2025_03_11`. ## Test Plan The existing Responses unit tests were expanded to test these cases, via: ``` pytest -sv tests/unit/providers/agents/meta_reference/test_openai_responses.py ``` The existing test_openai_responses.py integration tests still pass with this change, tested as below with Fireworks: ``` uv run llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ uv run pytest -sv tests/integration/agents/test_openai_responses.py \ --text-model accounts/fireworks/models/llama4-scout-instruct-basic ``` Lastly, this example LangChain app now works with Llama stack (tested with Ollama in the starter template in this case). This LangChain code is using the example snippets for using Responses API at https://python.langchain.com/docs/integrations/chat/openai/#responses-api ```python from langchain_openai import ChatOpenAI llm = ChatOpenAI( base_url="http://localhost:8321/v1/openai/v1", api_key="fake", model="ollama/meta-llama/Llama-3.2-3B-Instruct", ) tool = {"type": "web_search_preview"} llm_with_tools = llm.bind_tools([tool]) response = llm_with_tools.invoke("What was a positive news story from today?") print(response.content) ``` Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-06-26 07:59:33 +05:30
ehhuang	1d3f27fe5b	fix: resume responses with tool call output (#2524 ) Some checks failed Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 8s Details Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 10s Details Integration Tests / test-matrix (http, 3.12, inference) (push) Failing after 17s Details Integration Tests / test-matrix (http, 3.12, vector_io) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 11s Details Integration Tests / test-matrix (http, 3.13, inspect) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 6s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 8s Details Python Package Build Test / build (3.12) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 9s Details Unit Tests / unit-tests (3.12) (push) Failing after 5s Details Update ReadTheDocs / update-readthedocs (push) Failing after 3s Details Python Package Build Test / build (3.13) (push) Failing after 49s Details Test External Providers / test-external-providers (venv) (push) Failing after 49s Details Unit Tests / unit-tests (3.13) (push) Failing after 49s Details Pre-commit / pre-commit (push) Successful in 2m5s Details # What does this PR do? closes #2522 ## Test Plan added integration test LLAMA_STACK_CONFIG=http://localhost:8321 pytest -v tests/integration/agents/test_openai_responses.py --text-model "accounts/fireworks/models/llama-v3p3-70b-instruct" -vv -k 'function_call'	2025-06-25 14:43:37 -07:00
Francisco Arceo	82f13fe83e	feat: Add ChunkMetadata to Chunk (#2497 ) # What does this PR do? Adding `ChunkMetadata` so we can properly delete embeddings later. More specifically, this PR refactors and extends the chunk metadata handling in the vector database and introduces a distinction between metadata used for model context and backend-only metadata required for chunk management, storage, and retrieval. It also improves chunk ID generation and propagation throughout the stack, enhances test coverage, and adds new utility modules. ```python class ChunkMetadata(BaseModel): """ `ChunkMetadata` is backend metadata for a `Chunk` that is used to store additional information about the chunk that will NOT be inserted into the context during inference, but is required for backend functionality. Use `metadata` in `Chunk` for metadata that will be used during inference. """ document_id: str \| None = None chunk_id: str \| None = None source: str \| None = None created_timestamp: int \| None = None updated_timestamp: int \| None = None chunk_window: str \| None = None chunk_tokenizer: str \| None = None chunk_embedding_model: str \| None = None chunk_embedding_dimension: int \| None = None content_token_count: int \| None = None metadata_token_count: int \| None = None ``` Eventually we can migrate the document_id out of the `metadata` field. I've introduced the changes so that `ChunkMetadata` is backwards compatible with `metadata`. <!-- If resolving an issue, uncomment and update the line below --> Closes https://github.com/meta-llama/llama-stack/issues/2501 ## Test Plan Added unit tests --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-06-25 15:55:23 -04:00
Ben Browning	fa0b0c13d4	fix: Ollama should be optional in starter distro (#2482 ) Some checks failed Integration Tests / test-matrix (http, 3.13, vector_io) (push) Failing after 14s Details Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.12, tool_runtime) (push) Failing after 18s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (http, 3.13, inspect) (push) Failing after 16s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 12s Details Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 14s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 10s Details Test Llama Stack Build / generate-matrix (push) Successful in 7s Details Python Package Build Test / build (3.12) (push) Failing after 4s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 8s Details Update ReadTheDocs / update-readthedocs (push) Failing after 4s Details Unit Tests / unit-tests (3.12) (push) Failing after 6s Details Unit Tests / unit-tests (3.13) (push) Failing after 5s Details Test Llama Stack Build / build (push) Failing after 6s Details Test Llama Stack Build / build-single-provider (push) Failing after 1m10s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 1m8s Details Python Package Build Test / build (3.13) (push) Failing after 1m6s Details Test External Providers / test-external-providers (venv) (push) Failing after 1m4s Details Pre-commit / pre-commit (push) Successful in 2m33s Details # What does this PR do? Our starter distro required Ollama to be running (and a large list of models available in that Ollama) to successfully start. This adjusts things so that Ollama does not have to be running to use the starter template / distro. To accomplish this, a few changes were needed: * The Ollama provider is now configurable whether it raises an Exception or just logs a warning when it cannot reach the Ollama server on startup. The default is to raise an exception (same as previous behavior), but in the starter template we adjust this to just log a warning so that we can bring the stack up without needing a running Ollama server. * The starter template no longer specifies a default list of models for Ollama, as any models specified there need to actually be pulled and available in Ollama. Instead, it adds a new `OLLAMA_INFERENCE_MODEL` environment variable where users can provide an optional model to register with the Ollama provider on startup. Additional models can also be registered via the typical `models.register(...)` at runtime. * The vLLM template was adjusted to also allow an optional `VLLM_INFERENCE_MODEL` specified on startup, so that the behavior between vLLM and Ollama was consistent here to make it easy to get up and running quickly. * The default vector store was changed from sqlite-vec to faiss. sqlite-vec can enabled via setting the `ENABLE_SQLITE_VEC` environment variable, like we do for chromadb and pgvector. This is due to sqlite-vec not shipping proper arm64 binaries, like we previously fixed in #1530 for the ollama distribution. ## Test Plan With this change, the following scenarios now work with the starter template that did not before: * no Ollama running * Ollama running but not all of the Llama models pulled locally * Ollama running with a custom model registered on startup * vLLM running with a custom model registered on startup * running the starter template on linux/arm64, like when running containers on Mac without rosetta emulation --------- Signed-off-by: Ben Browning <bbrownin@redhat.com>	2025-06-25 15:54:00 +02:00
Varsha	cfee63bd0d	feat: Add search_mode support to OpenAI vector store API (#2500 ) Some checks failed Integration Tests / test-matrix (http, 3.13, scoring) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 11s Details Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s Details Integration Tests / test-matrix (http, 3.13, post_training) (push) Failing after 17s Details Python Package Build Test / build (3.13) (push) Failing after 5s Details Integration Tests / test-matrix (http, 3.13, providers) (push) Failing after 18s Details Test Llama Stack Build / build-single-provider (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 15s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 12s Details Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.13, tool_runtime) (push) Failing after 17s Details Unit Tests / unit-tests (3.12) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 13s Details Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 17s Details Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 16s Details Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 9s Details Integration Tests / test-matrix (http, 3.12, vector_io) (push) Failing after 18s Details Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 8s Details Unit Tests / unit-tests (3.13) (push) Failing after 8s Details Integration Tests / test-matrix (http, 3.13, datasets) (push) Failing after 19s Details Test Llama Stack Build / build (push) Failing after 5s Details Update ReadTheDocs / update-readthedocs (push) Failing after 44s Details Test External Providers / test-external-providers (venv) (push) Failing after 47s Details Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 50s Details Pre-commit / pre-commit (push) Successful in 2m12s Details # What does this PR do? Add search_mode parameter (vector/keyword/hybrid) to openai_search_vector_store method. Fixes OpenAPI code generation by using str instead of Literal type. Closes: #2459 ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>	2025-06-24 20:38:47 -04:00

1 2 3 4 5 ...

2154 commits