llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 18:00:36 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	fa2b361f46	Merge branch 'main' into add-mcp-authentication-param	2025-11-13 09:42:35 -08:00
Francisco Arceo	4442b24de7	chore: Fix docs so can be deployed (#4149) # What does this PR do? Building/Deploying docs is failing here: `5530320962 (step)`:8:49 Needs the playground file. Updated it to reflect current admin status. Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-11-13 09:15:32 -08:00
Derek Higgins	aeaf4eb3dd	fix: remove_disabled_providers filtering models with None fields (#4132) Fixed bug where models with no provider_model_id were incorrectly filtered from the startup config display. The function was checking multiple fields when it should only filter items with an explicitly disabled provider_id. Changes: o Modified remove_disabled_providers to only check provider_id field o Changed condition from checking multiple fields with None to only checking provider_id for "__disabled__", None or empty string o Added comprehensive unit tests Closes: #4131 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-11-13 07:24:05 -08:00
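The filtering rule described in this commit can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names `is_disabled` and `remove_disabled_items` are hypothetical, and treating an item with no `provider_id` key as enabled is an assumption.

```python
from typing import Any

# Marker string quoted in the commit message.
DISABLED_MARKER = "__disabled__"


def is_disabled(item: dict[str, Any]) -> bool:
    """Return True only when provider_id is the disabled marker, None, or empty."""
    if "provider_id" not in item:
        # Assumption: items that carry no provider_id at all are kept.
        return False
    provider_id = item["provider_id"]
    return provider_id is None or provider_id == "" or provider_id == DISABLED_MARKER


def remove_disabled_items(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: drop items whose provider is disabled, ignore other fields."""
    return [item for item in items if not is_disabled(item)]


models = [
    # provider_model_id is None, but the provider is enabled, so this stays.
    {"model_id": "a", "provider_id": "ollama", "provider_model_id": None},
    {"model_id": "b", "provider_id": "__disabled__", "provider_model_id": "x"},
    {"model_id": "c", "provider_id": "", "provider_model_id": "y"},
]
kept = remove_disabled_items(models)
```

Only the first entry survives: a `None` in any field other than `provider_id` no longer causes an item to be dropped.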
Ashwin Bharambe	1e81056a22	feat(tests): enable MCP tests in server mode (#4146) We would like to run all OpenAI compatibility tests using only the openai-client library. This is most friendly for contributors since they can run tests without needing to update the client-sdks (which is getting easier but still a long pole.) This is the first step in enabling that -- not using "library client" for any of the Responses tests. This seems like a reasonable trade-off since the usage of an embeddable library client for Responses (or any OpenAI-compatible) behavior seems to be not very common. To do this, we needed to enable MCP tests (which only worked in library client mode) for server mode.	2025-11-13 07:23:23 -08:00
Akram Ben Aissi	9eb81439d2	docs: Add comprehensive Files API and Vector Store integration doc (#3279) docs: Add comprehensive Files API and Vector Store integration documentation - Add Files API documentation with OpenAI-compatible endpoints - Create comprehensive guide for OpenAI-compatible file operations - Reorganize documentation structure: move file operations to files/ directory - Add vector store provider documentation for Milvus, SQLite-vec, FAISS - Clean up redundant files and improve navigation - Update cross-references and eliminate documentation duplication - Support for release 0.2.14 FileResponse and Vector Store API features	2025-11-13 08:50:06 -05:00
Omar Abdelwahab	1a6cb7041d	precommit	2025-11-12 19:02:54 -08:00
Omar Abdelwahab	66ca51ac0d	feat(tool-runtime): Add authorization parameter to list_runtime_tools Add authorization parameter to list_runtime_tools() method to support MCP servers that require authentication for listing tools. Changes: - Updated ToolRuntime protocol to include authorization parameter on list_runtime_tools() - Updated all provider implementations (MCP, Tavily, Brave, Bing, Wolfram Alpha) - Updated router and routing table to pass authorization through - Updated API recorder patched methods to include authorization parameter This enables authenticated tool listing for enterprise MCP deployments where IT administrators pre-configure connectors requiring authentication. Note: Client SDK will need to be regenerated from updated OpenAPI spec to support passing this parameter from client code. Tests will pass once client SDK is updated.	2025-11-12 17:27:03 -08:00
Omar Abdelwahab	e6ebbd8a7b	fix(tool-runtime): Remove authorization from list_runtime_tools in all providers Updated all tool runtime provider implementations to remove the authorization parameter from list_runtime_tools(): - tavily_search.py - brave_search.py - wolfram_alpha.py - bing_search.py These providers were missing in the previous commit. Tool listing typically doesn't require authentication - only invoke_tool() needs the authorization parameter for authenticated tool execution. This ensures all tool runtime providers have consistent signatures matching the updated protocol definition.	2025-11-12 16:20:53 -08:00
Omar Abdelwahab	18f197763b	fix(tool-runtime): Remove authorization from list_runtime_tools() The authorization parameter should only be on invoke_tool(), not on list_runtime_tools(). Tool listing typically doesn't require authentication, and the client SDK doesn't have this parameter yet. Changes: 1. Removed authorization parameter from ToolRuntime.list_runtime_tools() protocol method 2. Updated all implementations to remove the authorization parameter: - MCPProviderImpl.list_runtime_tools() - ToolRuntimeRouter.list_runtime_tools() - ToolGroupsRoutingTable.list_tools() and _index_tools() 3. Updated test to remove authorization from list_tools() call This ensures compatibility with the llama-stack-client SDK which doesn't support authorization on list_tools() yet. Only invoke_tool() requires and accepts the authorization parameter for authenticated tool execution.	2025-11-12 16:17:53 -08:00
Omar Abdelwahab	c0295a2495	revert(debug): Remove temporary debug logging from resolver Removing the debug logging that was added to diagnose signature mismatch errors. The logging served its purpose - it helped us identify that the error was coming from api_recorder.py patched methods, not the actual provider implementations. With the root cause now fixed in api_recorder.py, this debug logging is no longer needed and can be safely removed to keep the code clean.	2025-11-12 16:12:14 -08:00
Omar Abdelwahab	4a1fa139f1	revert(ci): Remove unnecessary CI workarounds from action.yml Now that we've fixed the actual root cause (api_recorder.py missing the authorization parameter), we can revert all the CI workarounds that were added during troubleshooting: Removed changes: - Cache clearing (venv, pycache, UV cache) - PYTHONDONTWRITEBYTECODE environment variable - --no-install-project flag - Force reinstalling llama-stack - Installing ci-tests distribution dependencies via llama CLI - Final bytecode cache cleanup These were all based on incorrect diagnosis (missing dependencies or module caching) and are no longer needed. The real fix was updating api_recorder.py to include the authorization parameter in patched tool runtime methods. Restoring the simpler, original CI setup that just runs 'uv sync --all-groups'.	2025-11-12 16:11:16 -08:00
Omar Abdelwahab	d156451890	fix(ci): Add authorization parameter to api_recorder tool runtime patches The ACTUAL root cause of the signature mismatch errors was found! The api_recorder.py module patches tool runtime invoke_tool methods for test recording/replay, but the patched methods were missing the new 'authorization' parameter. The debug logging revealed: Object method: patched_tavily_invoke_tool (from api_recorder module) Object method's module: llama_stack.testing.api_recorder Changes made: 1. Updated _patched_tool_invoke_method() to accept authorization parameter 2. Updated patched_tavily_invoke_tool() signature to include authorization 3. Added debug logging to resolver to help identify similar issues in the future This fix ensures that when tests run in record/replay mode, the patched methods preserve the full signature including the authorization parameter, allowing the protocol compliance checks to pass.	2025-11-12 16:06:29 -08:00
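The kind of mismatch this commit describes can be reproduced with `inspect.signature`. The sketch below is an illustration under stated assumptions: the `ToolRuntime` protocol here is a simplified stand-in, and `missing_parameters`, `stale_patch`, and `fixed_patch` are hypothetical names, not the resolver's or the recorder's real code.

```python
import inspect
from typing import Any, Optional, Protocol


class ToolRuntime(Protocol):
    """Simplified stand-in for a protocol whose method gained a parameter."""

    async def invoke_tool(
        self, tool_name: str, kwargs: dict, authorization: Optional[str] = None
    ) -> Any: ...


def missing_parameters(protocol_method: Any, impl_method: Any) -> set:
    """Return parameter names the protocol declares but the implementation lacks."""
    expected = set(inspect.signature(protocol_method).parameters)
    actual = set(inspect.signature(impl_method).parameters)
    return expected - actual


async def stale_patch(self, tool_name: str, kwargs: dict) -> Any:
    """A test-time replacement written before 'authorization' existed."""
    return None


async def fixed_patch(
    self, tool_name: str, kwargs: dict, authorization: Optional[str] = None
) -> Any:
    """The same replacement, updated to preserve the full signature."""
    return None


stale_gap = missing_parameters(ToolRuntime.invoke_tool, stale_patch)
fixed_gap = missing_parameters(ToolRuntime.invoke_tool, fixed_patch)
```

Because the check inspects whatever object is bound at runtime, a monkeypatched method with an old signature fails it even when the provider source is correct, which is why cache clearing and reinstalls had no effect.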
Omar Abdelwahab	bae5b14adf	debug: Add detailed logging for signature mismatch errors Adding comprehensive debug logging to understand what's causing the persistent signature mismatch errors in CI. The logging will show: - Provider class name and module - Both protocol and object signatures - The actual method object - The method's source module This will help us identify if the issue is: 1. A cached module being loaded 2. A parent class overriding the method 3. Some other source of the wrong signature Once we see the debug output, we can pinpoint the exact root cause.	2025-11-12 16:01:13 -08:00
Omar Abdelwahab	166c37bbbe	fix(ci): Prevent Python from caching old code during uv sync The signature mismatch error persists because 'uv sync' installs and potentially imports the llama-stack package, caching provider modules in memory BEFORE we do the editable install with fresh source code. This fix adds the --no-install-project flag to 'uv sync', which: 1. Installs all dependencies but skips installing the project itself 2. Prevents Python from importing and caching provider modules 3. Ensures the subsequent 'uv pip install -e .' loads fresh source code This should finally resolve the persistent signature mismatch errors in CI where the protocol has 'authorization' parameter but provider implementations appear not to.	2025-11-12 15:56:26 -08:00
Omar Abdelwahab	761a2a0ce3	fix(ci): Use 'uv run' to execute llama command in virtual environment The previous commit tried to run 'llama stack list-deps' directly, but the 'llama' command wasn't in PATH yet since the virtual environment hadn't been activated. This fix uses 'uv run llama' instead, which executes the command within the uv virtual environment context, ensuring the llama CLI is accessible.	2025-11-12 15:51:55 -08:00
Omar Abdelwahab	844a159219	fix(ci): Install ci-tests distribution dependencies to fix test failures The CI integration tests were failing with a signature mismatch error, but the root cause was missing dependencies (specifically the 'together' package). The signature mismatch was a misleading error that occurred because the provider modules failed to load properly due to missing dependencies. This fix adds a step to install all ci-tests distribution dependencies using: llama stack list-deps ci-tests | xargs -L1 uv pip install This ensures all required provider dependencies are installed before running tests.	2025-11-12 15:49:57 -08:00
Omar Abdelwahab	0754d59999	fix(ci): Add final bytecode cache clear after installations The issue was timing - we were clearing cache before installations, but uv sync/pip install were creating new .pyc files. This commit: 1. Adds PYTHONDONTWRITEBYTECODE=1 to prevent .pyc generation 2. Clears bytecode cache AFTER all installations complete 3. Ensures no stale .pyc files exist before tests run For editable installs (-e .), Python loads from source directory, so clearing cache after installation ensures the resolver sees the latest method signatures with the authorization parameter.	2025-11-12 15:28:49 -08:00
Omar Abdelwahab	6dc2d92232	fix(ci): Clear cached .venv directory to ensure fresh install The GitHub Actions cache was restoring a cached virtual environment (.venv) with old code. This commit clears all caching layers: 1. Removes cached .venv directory (the main culprit) 2. Clears Python bytecode cache (.pyc files) 3. Clears UV cache directory This forces uv sync to create a completely fresh virtual environment with the latest source code changes, ensuring the authorization parameter is picked up across all tool runtime providers.	2025-11-12 15:25:51 -08:00
Omar Abdelwahab	8b6588dc1e	fix(ci): Clear UV cache directory instead of lock file The previous approach of removing uv.lock caused dependency resolution failures. The real issue is the UV_CACHE_DIR that contains pre-built wheels with old code. This commit: 1. Keeps uv.lock (it's part of the project) 2. Clears UV_CACHE_DIR (where compiled wheels are cached) 3. Forces uv to rebuild wheels from source This ensures the latest source code changes are picked up without breaking dependency resolution.	2025-11-12 15:23:06 -08:00
Omar Abdelwahab	6aaf4ad080	fix(ci): Remove uv.lock before sync to ensure fresh dependency resolution The uv.lock file contains cached dependency resolutions that prevent source code changes from being picked up. By removing it before uv sync, we force a fresh resolution and rebuild of dependencies. This should fix the 73 CI test failures where the resolver was loading stale method signatures without the authorization parameter.	2025-11-12 15:20:48 -08:00
Omar Abdelwahab	1ea57b0a17	Fix CI: Clear Python bytecode cache before reinstall The real issue was stale .pyc bytecode files in __pycache__ directories. These cached files contained the old method signatures without the authorization parameter, causing signature mismatch errors even though the source .py files were correct. Now clearing all __pycache__ directories and .pyc files before the force-reinstall to ensure Python loads fresh bytecode from the updated source files.	2025-11-12 15:16:34 -08:00
Omar Abdelwahab	025c301a9a	Fix CI: Force reinstall llama-stack from source The CI was using a cached/stale version of the package that didn't include our authorization parameter changes. Add explicit force reinstall step to ensure the latest source code is used.	2025-11-12 15:12:42 -08:00
Omar Abdelwahab	778b7de9cb	fix: add authorization parameter to ToolRuntimeRouter and routing table The auto-routing layer was missing the authorization parameter: - ToolRuntimeRouter.invoke_tool() now accepts and passes authorization - ToolRuntimeRouter.list_runtime_tools() now accepts and passes authorization - ToolGroupsRoutingTable.list_tools() now accepts and forwards authorization - ToolGroupsRoutingTable._index_tools() now accepts and uses authorization This fixes the '__autorouted__' provider signature mismatch error in CI.	2025-11-12 15:08:00 -08:00
Omar Abdelwahab	bf28c215d1	chore: trigger CI - all provider signatures fixed All ToolRuntime provider implementations now have 'authorization' parameter. Verified locally that signatures are correct after fresh pip install. CI note: Ensure pip install -e . runs to pick up latest code changes.	2025-11-12 15:02:13 -08:00
Omar Abdelwahab	607e3cc05c	Merge branch 'main' into add-mcp-authentication-param	2025-11-12 14:55:23 -08:00
Omar Abdelwahab	7a823bc280	fix: remove syntax errors from test files caused by sed Fixed syntax errors in test files that were introduced by batch sed replacement: - test_tools_with_schemas.py: Removed leftover broken comments and closing brace - test_mcp_json_schema.py: Removed all instances of broken comment blocks The sed command left remnants that broke Python syntax.	2025-11-12 14:54:38 -08:00
Omar Abdelwahab	d804e37e01	chore: trigger CI rebuild with fresh Python cache	2025-11-12 14:51:38 -08:00
Omar Abdelwahab	d0ec3b07b5	fix: add authorization parameter to all ToolRuntime provider implementations Updated all ToolRuntime provider implementations to match the protocol signature: - BraveSearchToolRuntimeImpl - TavilySearchToolRuntimeImpl - BingSearchToolRuntimeImpl - WolframAlphaToolRuntimeImpl - MemoryToolRuntimeImpl This fixes the signature mismatch error in CI where protocol had 'authorization' parameter but implementations didn't.	2025-11-12 14:47:22 -08:00
Omar Abdelwahab	84baa5c406	feat: unify MCP authentication across Responses and Tool Runtime APIs - Add authorization parameter to Tool Runtime API signatures (list_runtime_tools, invoke_tool) - Update MCP provider implementation to use authorization from request body instead of provider-data - Deprecate mcp_authorization and mcp_headers from provider-data (MCPProviderDataValidator now empty) - Update all Tool Runtime tests to pass authorization as request body parameter - Responses API already uses request body authorization (no changes needed) This provides a single, consistent way to pass MCP authentication tokens across both APIs, addressing reviewer feedback about avoiding multiple configuration paths.	2025-11-12 14:41:00 -08:00
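A rough sketch of what "authorization in the request body" could look like on the provider side. Everything here is illustrative: `InvokeToolRequest` and `build_mcp_headers` are hypothetical names, and sending the token as a `Bearer` header is an assumption that the commit message does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InvokeToolRequest:
    """Hypothetical request shape: the token travels with the call itself."""

    tool_name: str
    kwargs: dict = field(default_factory=dict)
    authorization: Optional[str] = None


def build_mcp_headers(authorization: Optional[str]) -> dict:
    """Turn the per-request token into HTTP headers (Bearer scheme is assumed)."""
    if not authorization:
        return {}
    return {"Authorization": f"Bearer {authorization}"}


anonymous = InvokeToolRequest(tool_name="search", kwargs={"query": "llama"})
authenticated = InvokeToolRequest(
    tool_name="search", kwargs={"query": "llama"}, authorization="example-token"
)

anonymous_headers = build_mcp_headers(anonymous.authorization)
authenticated_headers = build_mcp_headers(authenticated.authorization)
```

Keeping the token on the request gives a single place to look for credentials, rather than a second path through provider data.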
Ashwin Bharambe	fcf649b97a	feat(storage): share sql/kv instances and add upsert support (#4140) A few changes to the storage layer to ensure we reduce unnecessary contention arising out of our design choices (and letting the database layer do its correct thing): - SQL stores now share a single `SqlAlchemySqlStoreImpl` per backend, and `kvstore_impl` caches instances per `(backend, namespace)`. This avoids spawning multiple SQLite connections for the same file, reducing lock contention and aligning the cache story for all backends. - Added an async upsert API (with SQLite/Postgres dialect inserts) and routed it through `AuthorizedSqlStore`, then switched conversations and responses to call it. Using native `ON CONFLICT DO UPDATE` eliminates the insert-then-update retry window that previously caused long WAL lock retries. ### Test Plan Existing tests, added a unit test for `upsert()`	2025-11-12 12:14:26 -08:00
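To illustrate the native upsert mentioned above, here is a small synchronous sketch using Python's standard `sqlite3` module in place of the async SQLAlchemy store the commit refers to. The `responses` table and the `upsert_response` helper are made up for the example.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (id TEXT PRIMARY KEY, body TEXT NOT NULL)")


def upsert_response(db: sqlite3.Connection, response_id: str, body: str) -> None:
    """Insert a row, or update it in the same statement if the id already exists."""
    db.execute(
        "INSERT INTO responses (id, body) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET body = excluded.body",
        (response_id, body),
    )
    db.commit()


upsert_response(conn, "resp-1", "first")   # inserts the row
upsert_response(conn, "resp-1", "second")  # hits the conflict and updates it
rows = conn.execute("SELECT id, body FROM responses").fetchall()
```

The second call neither fails with a uniqueness error nor needs a separate `UPDATE`, so there is no window between a failed insert and a follow-up update in which another writer can interleave.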
Ashwin Bharambe	492f79ca9b	fix: harden storage semantics (#4118) Fixes issues in the storage system by guaranteeing immediate durability for responses and ensuring background writers stay alive. Three related fixes: * Responses to the OpenAI-compatible API now write directly to Postgres/SQLite inside the request instead of detouring through an async queue that might never drain; this restores the expected read-after-write behavior and removes the "response not found" races reported by users. * The access-control shim was stamping owner_principal/access_attributes as SQL NULL, which Postgres interprets as non-public rows; fixing it to use the empty-string/JSON-null pattern means conversations and responses stored without an authenticated user stay queryable (matching SQLite). * The inference-store queue remains for batching, but its worker tasks now start lazily on the live event loop so server startup doesn't cancel them; writes keep flowing even when the stack is launched via llama stack run. Closes #4115 ### Test Plan Added a matrix entry to test our "base" suite against Postgres as the store.	2025-11-12 10:35:39 -08:00
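The third fix, starting worker tasks lazily on the live event loop, can be sketched with plain `asyncio`. `LazyWriteQueue` is a hypothetical class written for this example, and it appends to a list where a real store would write to a database.

```python
import asyncio
from typing import List, Optional


class LazyWriteQueue:
    """Batching queue whose workers are created on first use, not at construction."""

    def __init__(self, num_workers: int = 2) -> None:
        self._num_workers = num_workers
        self._queue: Optional[asyncio.Queue] = None
        self._workers: List[asyncio.Task] = []
        self.written: List[str] = []  # stand-in for the real storage backend

    def _ensure_started(self) -> None:
        # Runs inside a coroutine, so tasks attach to the loop that is serving requests.
        if self._queue is None:
            self._queue = asyncio.Queue()
            loop = asyncio.get_running_loop()
            self._workers = [
                loop.create_task(self._worker()) for _ in range(self._num_workers)
            ]

    async def enqueue(self, item: str) -> None:
        self._ensure_started()
        await self._queue.put(item)

    async def _worker(self) -> None:
        while True:
            item = await self._queue.get()
            try:
                self.written.append(item)
            finally:
                self._queue.task_done()

    async def flush_and_stop(self) -> None:
        if self._queue is None:
            return
        await self._queue.join()  # wait until every queued item is processed
        for task in self._workers:
            task.cancel()
        await asyncio.gather(*self._workers, return_exceptions=True)


async def demo() -> List[str]:
    queue = LazyWriteQueue()  # no tasks exist yet
    for i in range(5):
        await queue.enqueue(f"record-{i}")
    await queue.flush_and_stop()
    return sorted(queue.written)


written = asyncio.run(demo())
```

Constructing the object does not touch any event loop, so building it during startup on one loop and serving on another cannot orphan or cancel its workers.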
Derek Higgins	356f37b1ba	docs: clarify model identification uses provider_model_id not model_id (#4128) Updated documentation to accurately reflect current behavior where models are identified as provider_id/provider_model_id in the system. Changes: o Clarify that model_id is for configuration purposes only o Explain models are accessed as provider_id/provider_model_id o Remove outdated aliasing example that suggested model_id could be used as a custom identifier This corrects the documentation which previously suggested model_id could be used to create friendly aliases, which is not how the code actually works. Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-11-12 10:13:26 -08:00
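As a small illustration of the `provider_id/provider_model_id` form described above: the helper names are hypothetical and the provider and model values are made-up examples, not identifiers taken from the documentation.

```python
from typing import Tuple


def qualified_model_id(provider_id: str, provider_model_id: str) -> str:
    """Join the two parts into the identifier form described in the commit."""
    return f"{provider_id}/{provider_model_id}"


def split_model_id(identifier: str) -> Tuple[str, str]:
    """Split on the first slash only, since the provider's own id may contain slashes."""
    provider_id, _, provider_model_id = identifier.partition("/")
    return provider_id, provider_model_id


# Illustrative values only.
identifier = qualified_model_id("example-provider", "org/some-model")
parts = split_model_id(identifier)
```

Splitting on the first slash is a design choice for the sketch: it keeps a provider-side id such as `org/some-model` intact.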
Ken Dreyer	94e977c257	fix(docs): link to test replay-record docs for discoverability (#4134) Help users find the comprehensive integration testing docs by linking to the record-replay documentation. This clarifies that the technical README complements the main docs.	2025-11-12 10:04:56 -08:00
Francisco Arceo	eb3f9ac278	feat: allow returning embeddings and metadata from `/vector_stores/` methods; disallow changing Provider ID (#4046) # What does this PR do? - Updates `/vector_stores/{vector_store_id}/files/{file_id}/content` to allow returning `embeddings` and `metadata` using the `extra_query` - Updates the UI accordingly to display them. - Update UI to support CRUD operations in the Vector Stores section and adds a new modal exposing the functionality. - Updates Vector Store update to fail if a user tries to update Provider ID (which doesn't make sense to allow) ```python In [1]: client.vector_stores.files.content( vector_store_id=vector_store.id, file_id=file.id, extra_query={"include_embeddings": True, "include_metadata": True} ) Out [1]: FileContentResponse(attributes={}, content=[Content(text='This is a test document to check if embeddings are generated properly.\n', type='text', embedding=[0.33760684728622437, ...,], chunk_metadata={'chunk_id': '62a63ae0-c202-f060-1b86-0a688995b8d3', 'document_id': 'file-27291dbc679642ac94ffac6d2810c339', 'source': None, 'created_timestamp': 1762053437, 'updated_timestamp': 1762053437, 'chunk_window': '0-13', 'chunk_tokenizer': 'DEFAULT_TIKTOKEN_TOKENIZER', 'chunk_embedding_model': 'sentence-transformers/nomic-ai/nomic-embed-text-v1.5', 'chunk_embedding_dimension': 768, 'content_token_count': 13, 'metadata_token_count': 9}, metadata={'filename': 'test-embedding.txt', 'chunk_id': '62a63ae0-c202-f060-1b86-0a688995b8d3', 'document_id': 'file-27291dbc679642ac94ffac6d2810c339', 'token_count': 13, 'metadata_token_count': 9})], file_id='file-27291dbc679642ac94ffac6d2810c339', filename='test-embedding.txt') ``` Screenshots of UI (images not reproduced here): List Vector Store with Added "Create New Vector Store"; Create New Vector Store; Edit Vector Store; Vector Store Files Contents page (with Embeddings); Vector Store Files Contents Details page (with Embeddings). ## Test Plan Tests added for Middleware extension and Provider failures. Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-11-12 09:59:48 -08:00
Charlie Doern	37853ca558	fix(tests): add OpenAI client connection cleanup to prevent CI hangs (#4119) # What does this PR do? Add explicit connection cleanup and shorter timeouts to OpenAI client fixtures. Fixes CI deadlock after 25+ tests due to connection pool exhaustion. Also adds 60s timeout to test_conversation_context_loading as safety net. ## Test Plan tests pass Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-11-12 12:17:13 -05:00
Sam El-Borai	63137f9af1	chore(stainless): add config for file header (#4126) # What does this PR do? This PR adds Stainless config to specify the Meta copyright file header for generated files. Doing it via config instead of custom code will reduce the probability of git conflicts. ## Test Plan - review preview builds	2025-11-12 11:39:21 -05:00
Akshay Ghodake	539b9c08f3	chore(deps): update pypdf to fix DoS vulnerabilities (#4121) Update pypdf dependency to address vulnerabilities causing potential denial of service through infinite loops or excessive memory usage when handling malicious PDFs. The update remains fully backward compatible, with no changes to the PdfReader API. Fixes #4120 Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-11-12 10:24:19 +01:00
Charlie Doern	6ca2a67a9f	chore: remove dead code (#4125) # What does this PR do? build_image is not used because `llama stack build` is gone. Remove it. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-11-12 10:09:14 +01:00
Omar Abdelwahab	893e186d5c	Merge branch 'main' into add-mcp-authentication-param	2025-11-11 11:21:09 -08:00
ehhuang	71b328fc4b	chore(ui): add npm package and dockerfile (#4100) # What does this PR do? - sets up package.json for npm `llama-stack-ui` package (will update llama-stack-ops) - adds dockerfile for UI docker image ## Test Plan npx: npm build && npm pack LLAMA_STACK_UI_PORT=8322 npx /Users/erichuang/projects/ui/src/llama_stack_ui/llama-stack-ui-0.4.0-alpha.2.tgz docker: cd src/llama_stack_ui docker build . -f Dockerfile --tag test_ui --no-cache ❯ docker run -p 8322:8322 \ -e LLAMA_STACK_UI_PORT=8322 \ test_ui:latest	2025-11-11 10:40:31 -08:00
Omar Abdelwahab	945a2883de	Merge branch 'main' into add-mcp-authentication-param	2025-11-11 09:18:50 -08:00
paulengineer	e5a55f3677	docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (#4122) # What does this PR do? In the Detailed Tutorial, at Step 3, the Install with venv option creates a new virtual environment `client`, activates it then attempts to install the llama-stack-client using pip. ``` uv venv client --python 3.12 source client/bin/activate pip install llama-stack-client <- this is the problematic line ``` However, the pip command will likely fail because the `uv venv` command doesn't, by default, include adding the pip command to the virtual environment that is created. The pip command will error either because pip doesn't exist at all, or, if the pip command does exist outside of the virtual environment, return a different error message. The latter may be unclear to the user why it is failing. This PR changes 'pip' to 'uv pip', allowing the install action to function in the virtual environment as intended, and without the need for pip to be installed. ## Test Plan 1. Use linux or WSL (virtual environments on Windows use `Scripts` folder instead of `bin` [virtualenv #993ba13](`993ba1316a`) which doesn't align with the tutorial) 2. Clone the `llama-stack` repo 3. Run the following and verify success: ``` uv venv client --python 3.12 source client/bin/activate ``` 5. Run the updated command: ``` uv pip install llama-stack-client ``` 6. Observe the console output confirms that the virtual environment `client` was used: > Using Python 3.12.3 environment at: client	2025-11-11 07:49:03 -05:00
Omar Abdelwahab	30a544fb8c	Merge branch 'main' into add-mcp-authentication-param	2025-11-10 18:26:48 -08:00
Nathan Weinberg	97ccfb5e62	refactor: inspect routes now shows all non-deprecated APIs (#4116 ) # What does this PR do? The inspect API lacked any mechanism to get all non-deprecated APIs (v1, v1alpha, v1beta). Change the default to this behavior. The 'v1' filter can be used by users wanting a list of stable APIs. ## Test Plan 1. pull the PR 2. launch a LLS server 3. run `curl http://beanlab3.bss.redhat.com:8321/v1/inspect/routes` 4. note there are APIs for `v1`, `v1alpha`, and `v1beta` but no deprecated APIs Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-11-10 15:57:17 -08:00
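The filtering behavior described in 97ccfb5e62 can be sketched as follows. This is an illustration under stated assumptions, not the llama-stack implementation: the `Route` type, the `list_routes` function, and the example paths are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    path: str
    level: str  # "v1", "v1alpha", or "v1beta"
    deprecated: bool = False


def list_routes(routes: List[Route], api_filter: Optional[str] = None) -> List[Route]:
    """Hypothetical filter mirroring the commit message.

    With no filter, return every non-deprecated route at any level.
    With api_filter="v1", return only the stable, non-deprecated routes.
    """
    visible = [r for r in routes if not r.deprecated]
    if api_filter is None:
        return visible
    return [r for r in visible if r.level == api_filter]


# Hypothetical route table for illustration only.
routes = [
    Route("/v1/models", "v1"),
    Route("/v1alpha/example", "v1alpha"),
    Route("/v1beta/example", "v1beta"),
    Route("/v1/old-endpoint", "v1", deprecated=True),
]

print([r.path for r in list_routes(routes)])
print([r.path for r in list_routes(routes, api_filter="v1")])
```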
Charlie Doern	43adc23ef6	refactor: remove dead inference API code and clean up imports (#4093 ) # What does this PR do? Delete ~2,000 lines of dead code from the old bespoke inference API that was replaced by OpenAI-only API. This includes removing unused type conversion functions, dead provider methods, and event_logger.py. Clean up imports across the codebase to remove references to deleted types. This eliminates unnecessary code and dependencies, helping isolate the API package as a self-contained module. This is the last interdependency between the .api package and "exterior" packages, meaning that now every other package in llama stack imports the API, not the other way around. ## Test Plan this is a structural change, no tests needed. --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-11-10 15:29:24 -08:00
Omar Abdelwahab	5c6f713354	Merge branch 'main' into add-mcp-authentication-param	2025-11-10 15:13:45 -08:00
Shabana Baig	433438cfc0	feat: Implement the 'max_tool_calls' parameter for the Responses API (#4062 ) # Problem The Responses API uses the max_tool_calls parameter to limit the number of tool calls that can be generated in a response. Currently, the LLS implementation of the Responses API does not support this parameter. # What does this PR do? This pull request adds the max_tool_calls field to the response object definition and updates the inline provider. It also ensures that: - the total number of calls to built-in and mcp tools does not exceed max_tool_calls - an error is thrown if max_tool_calls < 1 (behavior seen with the OpenAI Responses API, but we can change this if needed) Closes #[3563](https://github.com/llamastack/llama-stack/issues/3563) ## Test Plan - Tested manually for change in model response w.r.t supplied max_tool_calls field. - Added integration tests to test invalid max_tool_calls parameter. - Added integration tests to check max_tool_calls parameter with built-in and function tools. - Added integration tests to check max_tool_calls parameter in the returned response object. - Recorded OpenAI Responses API behavior using a sample script: https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/max_tool_calls.py Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-11-10 13:21:27 -08:00
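The two guarantees listed in 433438cfc0 (reject `max_tool_calls < 1`, and never exceed the limit) can be sketched as a small counter. This is a hedged illustration, not the inline provider's code: the `ToolCallBudget` class and its method names are hypothetical, and treating an unset limit as "unlimited" is an assumption.

```python
from typing import Optional


class ToolCallBudget:
    """Hypothetical counter that caps the number of tool calls in one response."""

    def __init__(self, max_tool_calls: Optional[int] = None):
        # Per the commit message, values below 1 are rejected with an error.
        if max_tool_calls is not None and max_tool_calls < 1:
            raise ValueError("max_tool_calls must be >= 1")
        self.max_tool_calls = max_tool_calls
        self.used = 0

    def try_consume(self) -> bool:
        """Reserve one tool call; return False once the limit is reached."""
        if self.max_tool_calls is not None and self.used >= self.max_tool_calls:
            return False
        self.used += 1
        return True


budget = ToolCallBudget(max_tool_calls=2)
print([budget.try_consume() for _ in range(3)])
# prints "[True, True, False]"
```

A caller would check `try_consume()` before dispatching each built-in or MCP tool call and stop issuing calls once it returns `False`.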
Omar Abdelwahab	114ab693a5	Merge branch 'main' into add-mcp-authentication-param	2025-11-10 13:19:12 -08:00
Dennis Kennetz	209a78b618	feat: add oci genai service as chat inference provider (#3876 ) # What does this PR do? Adds OCI GenAI PaaS models for openai chat completion endpoints. ## Test Plan In an OCI tenancy with access to GenAI PaaS, perform the following steps: 1. Ensure you have IAM policies in place to use the service (check docs included in this PR) 2. For local development, [set up the OCI CLI](https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm) and configure the CLI with your region, tenancy, and auth [here](https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliconfigure.htm) 3. Once configured, go through llama-stack setup and run llama-stack (uses config based auth) like: ```bash OCI_AUTH_TYPE=config_file \ OCI_CLI_PROFILE=CHICAGO \ OCI_REGION=us-chicago-1 \ OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..aaaaaaaa5...5a \ llama stack run oci ``` 4. Hit the `models` endpoint to list models after the server is running: ```bash curl http://localhost:8321/v1/models | jq ... { "identifier": "meta.llama-4-scout-17b-16e-instruct", "provider_resource_id": "ocid1.generativeaimodel.oc1.us-chicago-1.am...q", "provider_id": "oci", "type": "model", "metadata": { "display_name": "meta.llama-4-scout-17b-16e-instruct", "capabilities": [ "CHAT" ], "oci_model_id": "ocid1.generativeaimodel.oc1.us-chicago-1.a...q" }, "model_type": "llm" }, ... ``` 5. Use the "display_name" field to use the model in a `/chat/completions` request: ```bash # Streaming result curl -X POST http://localhost:8321/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "meta.llama-4-scout-17b-16e-instruct", "stream": true, "temperature": 0.9, "messages": [ { "role": "system", "content": "You are a funny comedian. You can be crass." }, { "role": "user", "content": "Tell me a funny joke about programming." } ] }' # Non-streaming result curl -X POST http://localhost:8321/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "meta.llama-4-scout-17b-16e-instruct", "stream": false, "temperature": 0.9, "messages": [ { "role": "system", "content": "You are a funny comedian. You can be crass." }, { "role": "user", "content": "Tell me a funny joke about programming." } ] }' ``` 6. Try out other models from the `/models` endpoint.	2025-11-10 16:16:24 -05:00
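The non-streaming `curl` request in step 5 of 209a78b618 can also be expressed in Python using only the standard library. This is a sketch under assumptions: `build_chat_request` is a hypothetical helper, and the base URL and model name are simply the values that appear in the test plan.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str, stream: bool = False):
    """Hypothetical helper: build a POST to the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,  # the "display_name" value returned by /v1/models
        "stream": stream,
        "temperature": 0.9,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request(
    "http://localhost:8321",
    "meta.llama-4-scout-17b-16e-instruct",
    "Tell me a funny joke about programming.",
)
print(req.get_method(), req.full_url)
# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```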
Ashwin Bharambe	fadf17daf3	feat(api)!: deprecate register/unregister resource APIs (#4099 ) Mark all register_* / unregister_* APIs as deprecated across models, shields, tool groups, datasets, benchmarks, and scoring functions. This is the first step toward moving resource mutations to an `/admin` namespace as outlined in https://github.com/llamastack/llama-stack/issues/3809#issuecomment-3492931585. The deprecation flag will be reflected in the OpenAPI schema to warn API users that these endpoints are being phased out. Next step will be implementing the `/admin` route namespace for these resource management operations. - `register_model` / `unregister_model` - `register_shield` / `unregister_shield` - `register_tool_group` / `unregister_toolgroup` - `register_dataset` / `unregister_dataset` - `register_benchmark` / `unregister_benchmark` - `register_scoring_function` / `unregister_scoring_function`	2025-11-10 10:36:33 -08:00
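Commit fadf17daf3 says the deprecation flag is reflected in the OpenAPI schema. In OpenAPI, an operation is marked this way with a boolean `deprecated` field. The sketch below shows one way such a flag could flow from a handler to a schema fragment; the `deprecated_endpoint` decorator, the `_deprecated` attribute, and `operation_schema` are hypothetical names, not llama-stack's actual mechanism.

```python
def deprecated_endpoint(func):
    """Hypothetical decorator: tag a handler as deprecated without changing it."""
    func._deprecated = True
    return func


def operation_schema(func, summary: str) -> dict:
    """Build a minimal OpenAPI operation object for a handler."""
    operation = {"summary": summary}
    if getattr(func, "_deprecated", False):
        # OpenAPI operation objects use a boolean `deprecated` field.
        operation["deprecated"] = True
    return operation


@deprecated_endpoint
def register_model(model_id: str) -> str:
    # Placeholder body for illustration only.
    return model_id


def list_models() -> list:
    # Placeholder body for illustration only.
    return []


print(operation_schema(register_model, "Register a model"))
print(operation_schema(list_models, "List models"))
```

The decorated handler still runs normally; only the generated schema changes, so clients and documentation tools can surface the warning.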

3208 commits