llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Author	SHA1	Message	Date
mergify[bot]	0df6d4601f	fix(docs): fix glob vulnerability (backport #4193) (#4227) Add npm override so the docs workspace resolves glob@10.5+. This is an automatic backport of pull request #4193 done by [Mergify](https://mergify.com). Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-11-24 11:31:15 -08:00
mergify[bot]	b9299a20ed	fix: enable SQLite WAL mode to prevent database locking errors (backport #4048 ) (#4226 ) Fixes race condition causing "database is locked" errors during concurrent writes to SQLite, particularly in streaming responses with guardrails where multiple inference calls write simultaneously. Enable Write-Ahead Logging (WAL) mode for SQLite which allows multiple concurrent readers and one writer without blocking. Set busy_timeout to 5s so SQLite retries instead of failing immediately. Remove the logic that disabled write queues for SQLite since WAL mode eliminates the locking issues that prompted disabling them. Fixes: test_output_safety_guardrails_safe_content[stream=True] flake<hr>This is an automatic backport of pull request #4048 done by [Mergify](https://mergify.com). Signed-off-by: Charlie Doern <cdoern@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-11-24 11:30:57 -08:00
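The two settings named in b9299a20ed can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch, not the code from the PR: the helper name `open_wal_connection` and the temporary database path are assumptions made for the example, and only the 5 s timeout value comes from the commit message.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_connection(db_path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a SQLite connection with WAL journaling and a busy timeout.

    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(db_path)
    # WAL lets readers keep reading while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to busy_timeout_ms for a lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout={busy_timeout_ms}")
    return conn


# WAL requires a file-backed database, so use a temporary file.
db_file = str(Path(tempfile.mkdtemp()) / "demo.db")
conn = open_wal_connection(db_file)
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
```

WAL still allows only one writer at a time, so the busy timeout is what turns a brief writer collision into a short wait rather than a "database is locked" error.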
mergify[bot]	46bd95e453	fix: Vector store persistence across server restarts (backport #3977 ) (#4225 ) # What does this PR do? This PR fixes a bug in LlamaStack 0.3.0 where vector stores created via the OpenAI-compatible API (`POST /v1/vector_stores`) would fail with `VectorStoreNotFoundError` after server restart when attempting operations like `vector_io.insert()` or `vector_io.query()`. The bug affected 6 vector IO providers: `pgvector`, `sqlite_vec`, `chroma`, `milvus`, `qdrant`, and `weaviate`. Created with the assistance of: claude-4.5-sonnet ## Root Cause All affected providers had a broken `_get_and_cache_vector_store_index()` method that: 1. Did not load existing vector stores from persistent storage during initialization 2. Attempted to use `vector_store_table` (which was either `None` or a `KVStore` without the required `get_vector_store()` method) 3. Could not reload vector stores after server restart or cache miss ## Solution This PR implements a consistent pattern across all 6 providers: 1. Load vector stores during initialization - Pre-populate the cache from KV store on startup 2. Fix lazy loading - Modified `_get_and_cache_vector_store_index()` to load directly from KV store instead of relying on `vector_store_table` 3. Remove broken dependency - Eliminated reliance on the `vector_store_table` pattern ## Testing steps ### 1.1 Configure the stack Create or use an existing configuration with a vector IO provider. Example `run.yaml`: ```yaml vector_io_store: - provider_id: pgvector provider_type: remote::pgvector config: host: localhost port: 5432 db: llamastack user: llamastack password: llamastack inference: - provider_id: sentence-transformers provider_type: inline::sentence-transformers config: model: sentence-transformers/all-MiniLM-L6-v2 ``` ### 1.2 Start the server ```bash llama stack run run.yaml --port 5000 ``` Wait for the server to fully start. 
You should see: ``` INFO: Started server process INFO: Application startup complete ``` --- ## Step 2: Create a Vector Store ### 2.1 Create via API ```bash curl -X POST http://localhost:5000/v1/vector_stores \ -H "Content-Type: application/json" \ -d '{ "name": "test-persistence-store", "extra_body": { "embedding_model": "sentence-transformers/all-MiniLM-L6-v2", "embedding_dimension": 384, "provider_id": "pgvector" } }' \| jq ``` ### 2.2 Expected Response ```json { "id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d", "object": "vector_store", "name": "test-persistence-store", "status": "completed", "created_at": 1730304000, "file_counts": { "total": 0, "completed": 0, "in_progress": 0, "failed": 0, "cancelled": 0 }, "usage_bytes": 0 } ``` Save the `id` field (e.g., `vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d`) — you’ll need it for the next steps. --- ## Step 3: Insert Data (Before Restart) ### 3.1 Insert chunks into the vector store ```bash export VS_ID="vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d" curl -X POST http://localhost:5000/vector-io/insert \ -H "Content-Type: application/json" \ -d "{ \"vector_store_id\": \"$VS_ID\", \"chunks\": [ { \"content\": \"Python is a high-level programming language known for its readability.\", \"metadata\": {\"source\": \"doc1\", \"page\": 1} }, { \"content\": \"Machine learning enables computers to learn from data without explicit programming.\", \"metadata\": {\"source\": \"doc2\", \"page\": 1} }, { \"content\": \"Neural networks are inspired by biological neurons in the brain.\", \"metadata\": {\"source\": \"doc3\", \"page\": 1} } ] }" ``` ### 3.2 Expected Response Status: 200 OK Response: Empty or success confirmation --- ## Step 4: Query Data (Before Restart – Baseline) ### 4.1 Query the vector store ```bash curl -X POST http://localhost:5000/vector-io/query \ -H "Content-Type: application/json" \ -d "{ \"vector_store_id\": \"$VS_ID\", \"query\": \"What is machine learning?\" }" \| jq ``` ### 4.2 Expected Response ```json { "chunks": 
[ { "content": "Machine learning enables computers to learn from data without explicit programming.", "metadata": {"source": "doc2", "page": 1} }, { "content": "Neural networks are inspired by biological neurons in the brain.", "metadata": {"source": "doc3", "page": 1} } ], "scores": [0.85, 0.72] } ``` Checkpoint: Works correctly before restart. --- ## Step 5: Restart the Server (Critical Test) ### 5.1 Stop the server In the terminal where it’s running: ``` Ctrl + C ``` Wait for: ``` Shutting down... ``` ### 5.2 Restart the server ```bash llama stack run run.yaml --port 5000 ``` Wait for: ``` INFO: Started server process INFO: Application startup complete ``` The vector store cache is now empty, but data should persist. --- ## Step 6: Verify Vector Store Exists (After Restart) ### 6.1 List vector stores ```bash curl http://localhost:5000/v1/vector_stores \| jq ``` ### 6.2 Expected Response ```json { "object": "list", "data": [ { "id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d", "name": "test-persistence-store", "status": "completed" } ] } ``` Checkpoint: Vector store should be listed. --- ## Step 7: Insert Data (After Restart – THE BUG TEST) ### 7.1 Insert new chunks ```bash curl -X POST http://localhost:5000/vector-io/insert \ -H "Content-Type: application/json" \ -d "{ \"vector_store_id\": \"$VS_ID\", \"chunks\": [ { \"content\": \"This chunk was inserted AFTER the server restart.\", \"metadata\": {\"source\": \"post-restart\", \"test\": true} } ] }" ``` ### 7.2 Expected Results With Fix (Correct): ``` Status: 200 OK Response: Success ``` Without Fix (Bug): ```json { "detail": "VectorStoreNotFoundError: Vector Store 'vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d' not found." } ``` Critical Test: If insertion succeeds, the fix works. 
--- ## Step 8: Query Data (After Restart – Verification) ### 8.1 Query all data ```bash curl -X POST http://localhost:5000/vector-io/query \ -H "Content-Type: application/json" \ -d "{ \"vector_store_id\": \"$VS_ID\", \"query\": \"restart\" }" \| jq ``` ### 8.2 Expected Response ```json { "chunks": [ { "content": "This chunk was inserted AFTER the server restart.", "metadata": {"source": "post-restart", "test": true} } ], "scores": [0.95] } ``` Checkpoint: Both old and new data are queryable. --- ## Step 9: Multiple Restart Test (Extra Verification) ### 9.1 Restart again ```bash Ctrl + C llama stack run run.yaml --port 5000 ``` ### 9.2 Query after restart ```bash curl -X POST http://localhost:5000/vector-io/query \ -H "Content-Type: application/json" \ -d "{ \"vector_store_id\": \"$VS_ID\", \"query\": \"programming\" }" \| jq ``` Expected: Works correctly across multiple restarts. <hr>This is an automatic backport of pull request #3977 done by [Mergify](https://mergify.com). Signed-off-by: Charlie Doern <cdoern@redhat.com> Co-authored-by: Juan Pérez de Algaba <124347725+jperezdealgaba@users.noreply.github.com> Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>	2025-11-24 11:30:21 -08:00
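The pattern described in 46bd95e453 (pre-populate the cache on startup, then fall back to the KV store on a cache miss) can be sketched as follows. All names here (`InMemoryKVStore`, `DemoProvider`, the `vector_stores:` key prefix) are hypothetical stand-ins, and the real providers are not reproduced; only the error name `VectorStoreNotFoundError` appears in the commit message.

```python
import json
from typing import Optional


class InMemoryKVStore:
    """Hypothetical stand-in for a persistent key-value store."""

    def __init__(self) -> None:
        self._data: dict = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def values_with_prefix(self, prefix: str) -> list:
        return [v for k, v in self._data.items() if k.startswith(prefix)]


class VectorStoreNotFoundError(Exception):
    pass


PREFIX = "vector_stores:"  # hypothetical key prefix


class DemoProvider:
    def __init__(self, kvstore: InMemoryKVStore) -> None:
        self.kvstore = kvstore
        self.cache: dict = {}

    def initialize(self) -> None:
        # Step 1: pre-populate the cache from persistent storage on startup.
        for raw in self.kvstore.values_with_prefix(PREFIX):
            store = json.loads(raw)
            self.cache[store["id"]] = store

    def register(self, store_id: str, name: str) -> None:
        store = {"id": store_id, "name": name}
        self.kvstore.set(PREFIX + store_id, json.dumps(store))
        self.cache[store_id] = store

    def get_and_cache(self, store_id: str) -> dict:
        # Step 2: on a cache miss, load directly from the KV store.
        if store_id not in self.cache:
            raw = self.kvstore.get(PREFIX + store_id)
            if raw is None:
                raise VectorStoreNotFoundError(store_id)
            self.cache[store_id] = json.loads(raw)
        return self.cache[store_id]


kv = InMemoryKVStore()
DemoProvider(kv).register("vs_1", "test-persistence-store")

# Simulate a restart: a fresh provider instance over the same storage.
restarted = DemoProvider(kv)
restarted.initialize()
```

Either step alone would survive a restart in this toy version; having both means a store written by another process after startup is still found on first access.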
mergify[bot]	f216eb99be	fix: allowed_models config did not filter models (backport #4030) (#4223) # What does this PR do? Closes #4022. ## Test Plan CI with new tests. This is an automatic backport of pull request #4030 done by [Mergify](https://mergify.com). Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-11-24 11:29:53 -08:00
github-actions[bot]	49a290e53e	chore: update lockfiles for 0.3.2	2025-11-12 23:21:28 +00:00
github-actions[bot]	1536b8e890	build: Bump version to 0.3.2	2025-11-12 23:19:12 +00:00
github-actions[bot]	dbef00de28	chore: update lockfiles for 0.3.2rc3	2025-11-12 22:48:00 +00:00
github-actions[bot]	01ff0cb9e2	Release candidate 0.3.2rc3	2025-11-12 22:33:56 +00:00
github-actions[bot]	56a723c800	Release candidate 0.3.2rc2	2025-11-12 22:12:27 +00:00
github-actions[bot]	096a3c6013	Release candidate 0.3.2rc1	2025-11-12 21:43:15 +00:00
mergify[bot]	641d5144be	fix(inference): enable routing of models with provider_data alone (backport #3928 ) (#4142 ) This PR enables routing of fully qualified model IDs of the form `provider_id/model_id` even when the models are not registered with the Stack. Here's the situation: assume a remote inference provider which works only when users provide their own API keys via `X-LlamaStack-Provider-Data` header. By definition, we cannot list models and hence update our routing registry. But because we _require_ a provider ID in the models now, we can identify which provider to route to and let that provider decide. Note that we still try to look up our registry since it may have a pre-registered alias. Just that we don't outright fail when we are not able to look it up. Also, updated inference router so that the responses have the _exact_ model that the request had. ## Test Plan Added an integration test Closes #3929<hr>This is an automatic backport of pull request #3928 done by [Mergify](https://mergify.com). --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>	2025-11-12 13:41:27 -08:00
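The routing rule described in 641d5144be (try the registry first, then fall back to the provider prefix of a `provider_id/model_id` string) might look roughly like the sketch below. The function name `resolve_provider`, the registry shape, and the sample model names are assumptions for illustration; the actual router code is not shown in this log.

```python
def resolve_provider(model: str, registry: dict, known_providers: set) -> tuple:
    """Return (provider_id, provider_model_id) for the model string in a request.

    Hypothetical helper: `registry` maps a model identifier to a
    (provider_id, provider_model_id) pair.
    """
    # The registry is still consulted first, since it may hold a pre-registered alias.
    if model in registry:
        return registry[model]
    # Not registered: split on the first "/" and let the named provider decide.
    provider_id, sep, provider_model_id = model.partition("/")
    if sep and provider_model_id and provider_id in known_providers:
        return provider_id, provider_model_id
    raise ValueError(f"Cannot route model '{model}'")


providers = {"demo-remote", "demo-local"}
registry = {"demo-local/alias": ("demo-local", "real-model-name")}

unregistered = resolve_provider("demo-remote/org/some-model", registry, providers)
aliased = resolve_provider("demo-local/alias", registry, providers)
```

Splitting on the first slash only keeps provider-side model names that themselves contain slashes intact.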
mergify[bot]	a6c3a9cadf	fix: harden storage semantics (backport #4118) (#4138) Fixes issues in the storage system by guaranteeing immediate durability for responses and ensuring background writers stay alive. Three related fixes: * Responses to the OpenAI-compatible API now write directly to Postgres/SQLite inside the request instead of detouring through an async queue that might never drain; this restores the expected read-after-write behavior and removes the "response not found" races reported by users. * The access-control shim was stamping owner_principal/access_attributes as SQL NULL, which Postgres interprets as non-public rows; fixing it to use the empty-string/JSON-null pattern means conversations and responses stored without an authenticated user stay queryable (matching SQLite). * The inference-store queue remains for batching, but its worker tasks now start lazily on the live event loop so server startup doesn't cancel them; writes keep flowing even when the stack is launched via llama stack run. Closes #4115 ### Test Plan Added a matrix entry to test our "base" suite against Postgres as the store. This is an automatic backport of pull request #4118 done by [Mergify](https://mergify.com). 
--------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-11-12 13:01:21 -08:00
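The third fix in a6c3a9cadf, starting queue workers lazily on the live event loop, can be illustrated with plain `asyncio`. This is a sketch under assumptions: `LazyWriteQueue` and its methods are invented for the example, and a list stands in for the database.

```python
import asyncio


class LazyWriteQueue:
    """Hypothetical batching queue whose worker starts on first use."""

    def __init__(self, sink: list) -> None:
        self._sink = sink
        self._queue = None
        self._worker = None

    def _ensure_worker(self) -> None:
        # Create the queue and task inside the running loop, not at construction,
        # so a loop that is torn down during startup cannot cancel the worker.
        if self._worker is None:
            self._queue = asyncio.Queue()
            self._worker = asyncio.create_task(self._run())

    async def enqueue(self, item) -> None:
        self._ensure_worker()
        await self._queue.put(item)

    async def flush(self) -> None:
        if self._queue is not None:
            await self._queue.join()

    async def close(self) -> None:
        if self._worker is not None:
            self._worker.cancel()
            try:
                await self._worker
            except asyncio.CancelledError:
                pass
            self._worker = None

    async def _run(self) -> None:
        while True:
            item = await self._queue.get()
            try:
                self._sink.append(item)  # stand-in for a database write
            finally:
                self._queue.task_done()


async def demo(queue: LazyWriteQueue) -> None:
    await queue.enqueue("a")
    await queue.enqueue("b")
    await queue.flush()
    await queue.close()


written: list = []
queue = LazyWriteQueue(written)  # constructed before any event loop exists
asyncio.run(demo(queue))
```

Constructing the queue outside `asyncio.run` mirrors the situation in the commit message: the object exists at startup, but no task is created until a request arrives on the loop that actually serves traffic.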
mergify[bot]	56d87f5133	chore(ci): remove unused recordings (backport #4074 ) (#4141 ) Added a script to cleanup recordings. While doing this, moved the CI matrix generation to a separate script so there is a single source of truth for the matrix. Ran the cleanup script as: ``` PYTHONPATH=. python scripts/cleanup_recordings.py ``` Also added this as part of the pre-commit workflow to ensure that the recordings are always up to date and that no stale recordings are left in the repo. <hr>This is an automatic backport of pull request #4074 done by [Mergify](https://mergify.com). --------- Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-11-12 12:36:28 -08:00
mergify[bot]	0d525d9a24	docs: clarify model identification uses provider_model_id not model_id (backport #4128) (#4137) Updated documentation to accurately reflect current behavior where models are identified as provider_id/provider_model_id in the system. Changes: * Clarify that model_id is for configuration purposes only * Explain models are accessed as provider_id/provider_model_id * Remove outdated aliasing example that suggested model_id could be used as a custom identifier This corrects the documentation which previously suggested model_id could be used to create friendly aliases, which is not how the code actually works. This is an automatic backport of pull request #4128 done by [Mergify](https://mergify.com). Signed-off-by: Derek Higgins <derekh@redhat.com> Co-authored-by: Derek Higgins <derekh@redhat.com>	2025-11-12 10:41:23 -08:00
mergify[bot]	bae22060de	docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (backport #4122 ) (#4136 ) # What does this PR do? In the Detailed Tutorial, at Step 3, the Install with venv option creates a new virtual environment `client`, activates it then attempts to install the llama-stack-client using pip. ``` uv venv client --python 3.12 source client/bin/activate pip install llama-stack-client <- this is the problematic line ``` However, the pip command will likely fail because the `uv venv` command doesn't, by default, include adding the pip command to the virtual environment that is created. The pip command will error either because pip doesn't exist at all, or, if the pip command does exist outside of the virtual environment, return a different error message. The latter may be unclear to the user why it is failing. This PR changes 'pip' to 'uv pip', allowing the install action to function in the virtual environment as intended, and without the need for pip to be installed. ## Test Plan 1. Use linux or WSL (virtual environments on Windows use `Scripts` folder instead of `bin` [virtualenv #993ba13](`993ba1316a`) which doesn't align with the tutorial) 2. Clone the `llama-stack` repo 3. Run the following and verify success: ``` uv venv client --python 3.12 source client/bin/activate ``` 5. Run the updated command: ``` uv pip install llama-stack-client ``` 6. Observe the console output confirms that the virtual environment `client` was used: > Using Python 3.12.3 environment at: client<hr>This is an automatic backport of pull request #4122 done by [Mergify](https://mergify.com). Co-authored-by: paulengineer <154521137+paulengineer@users.noreply.github.com>	2025-11-12 10:41:15 -08:00
mergify[bot]	a380b5fcb1	fix: print help for list-deps if no args (backport #4078) (#4083) # What does this PR do? list-deps takes either a positional argument or options such as --providers. Since either one may be specified, both need to be optional. Add a check to list-deps that tests `if not args.providers and not args.config`. If this is true, help is printed and we exit. 
resolves #4075 ## Test Plan before: ``` ╰─ llama stack list-deps Traceback (most recent call last): File "/Users/charliedoern/projects/Documents/llama-stack/venv/bin/llama", line 10, in <module> sys.exit(main()) ^^^^^^ File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 52, in main parser.run(args) File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 43, in run args.func(args) File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/list_deps.py", line 51, in _run_stack_list_deps_command return run_stack_list_deps_command(args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/_list_deps.py", line 135, in run_stack_list_deps_command normal_deps, special_deps, external_provider_dependencies = get_provider_dependencies(build_config) ^^^^^^^^^^^^ UnboundLocalError: cannot access local variable 'build_config' where it is not associated with a value ``` after: ``` ╰─ llama stack list-deps usage: llama stack list-deps [-h] [--providers PROVIDERS] [--format {uv,deps-only}] [config \| distro] list the dependencies for a llama stack distribution positional arguments: config \| distro Path to config file to use or name of known distro (llama stack list for a list). (default: None) options: -h, --help show this help message and exit --providers PROVIDERS sync dependencies for a list of providers and only those providers. This list is formatted like: api1=provider1,api2=provider2. Where there can be multiple providers per API. (default: None) --format {uv,deps-only} Output format: 'uv' shows shell commands, 'deps-only' shows just the list of dependencies without `uv` (default) (default: deps-only) ``` <hr>This is an automatic backport of pull request #4078 done by [Mergify](https://mergify.com). 
Signed-off-by: Charlie Doern <cdoern@redhat.com> Co-authored-by: Charlie Doern <cdoern@redhat.com>	2025-11-05 14:58:47 -08:00
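The check quoted in a380b5fcb1 can be reproduced in a small `argparse` sketch. The parser below is a simplified stand-in with only the two arguments named in the condition, and the `run` function and its boolean return value are assumptions for the example rather than the CLI's real behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="list-deps-demo")
    # Both inputs are optional because either one may be given on its own.
    parser.add_argument("config", nargs="?", default=None)
    parser.add_argument("--providers", default=None)
    return parser


def run(argv: list) -> bool:
    """Return True if the command would proceed, False if help was printed."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Same condition as in the commit message: neither input was supplied.
    if not args.providers and not args.config:
        parser.print_help()
        return False
    return True


no_args = run([])
with_providers = run(["--providers", "inference=demo"])
with_config = run(["starter"])
```

Without the guard, code further down would assume that one of the two inputs had produced a build config, which is the situation the `UnboundLocalError` in the "before" output reflects.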
Ashwin Bharambe	8b878e9d48	fix(ci): export UV_INDEX_STRATEGY to current shell before running uv sync (#4019) Fixes #4017 follow-up issue where UV_INDEX_STRATEGY was only exported to GITHUB_ENV but not to the current shell. The commit `e0bb7529` fixed the empty string issue but introduced a new bug: UV_INDEX_STRATEGY was only exported to GITHUB_ENV (for subsequent steps), not to the current shell environment. Since uv sync runs in the same step, it never saw the variable. This caused all CI runs on release-0.3.x to fail with dependency resolution errors like: ``` setuptools was found on https://test.pypi.org/simple/, but not at the requested version. A compatible version may be available on PyPI. Use --index-strategy unsafe-best-match. ``` This fix adds `export UV_INDEX_STRATEGY=unsafe-best-match` to make the variable available in the current shell before running uv commands. Note: Main branch doesn't hit this bug because UV_EXTRA_INDEX_URL is only set on release branches.	2025-11-01 12:54:19 -07:00
Ashwin Bharambe	e0bb7529ed	fix: only set UV_INDEX_STRATEGY when UV_EXTRA_INDEX_URL is present (#4017) Cherry-pick of bc12fe6c4 to release-0.3.x Fixes GitHub Actions workflows failing with UV index strategy errors when testing on RC tags and non-release branches. The issue was that UV_INDEX_STRATEGY was being set to an empty string in the environment, causing UV to fail with "error: a value is required for '--index-strategy'". The fix removes UV_INDEX_STRATEGY from the env block and only sets it to 'unsafe-best-match' when UV_EXTRA_INDEX_URL is actually present.	2025-10-31 16:22:01 -07:00
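The conditional in e0bb7529ed lives in workflow configuration, which is not shown here. As a rough Python illustration of the same rule, assuming a hypothetical helper `build_uv_env` that prepares the environment for a `uv` invocation:

```python
def build_uv_env(base_env: dict) -> dict:
    """Return a copy of base_env with UV_INDEX_STRATEGY set only when needed."""
    # Drop any inherited strategy so an empty string can never leak through.
    env = {k: v for k, v in base_env.items() if k != "UV_INDEX_STRATEGY"}
    # A non-empty extra index is the only case that needs the strategy.
    if env.get("UV_EXTRA_INDEX_URL"):
        env["UV_INDEX_STRATEGY"] = "unsafe-best-match"
    return env


with_extra = build_uv_env(
    {"UV_EXTRA_INDEX_URL": "https://pypi.org/simple/", "UV_INDEX_STRATEGY": ""}
)
without_extra = build_uv_env({"UV_INDEX_STRATEGY": ""})
```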
github-actions[bot]	bdd330a94a	chore: update lockfiles for 0.3.1	2025-10-31 22:56:35 +00:00
github-actions[bot]	033c1abf29	build: Bump version to 0.3.1	2025-10-31 22:54:10 +00:00
github-actions[bot]	dd6aee179d	chore: update lockfiles for 0.3.1rc5	2025-10-31 22:43:57 +00:00
github-actions[bot]	dc3674a82d	Release candidate 0.3.1rc5	2025-10-31 22:35:23 +00:00
Ashwin Bharambe	7ac81f69fe	build: fix uv.lock	2025-10-31 15:31:01 -07:00
github-actions[bot]	7fdfeef44a	Release candidate 0.3.1rc4	2025-10-31 22:15:04 +00:00
Ashwin Bharambe	bf1693c2ee	build: fix uv.lock	2025-10-31 15:07:39 -07:00
github-actions[bot]	13c6695fd3	Release candidate 0.3.1rc3	2025-10-31 21:29:01 +00:00
Ashwin Bharambe	e8a3dfbe96	docs: A getting started notebook featuring simple agent examples (#4015 ) Cherry-pick of #3955 to release-0.3.x Adds a getting started notebook with simple agent examples to help users get started with llama-stack agents. Co-authored-by: Omar Abdelwahab <omaryashraf10@gmail.com> Co-authored-by: Omar Abdelwahab <omara@fb.com>	2025-10-31 14:27:22 -07:00
Ashwin Bharambe	637f8bef9c	build: fix uv.lock	2025-10-31 14:01:45 -07:00
github-actions[bot]	a5372dbdf5	Release candidate 0.3.1rc2	2025-10-31 20:50:32 +00:00
Ashwin Bharambe	9f1e4a07c9	feat: support `workers` in run config (#4014 ) Cherry-pick of #3992 to release-0.3.x Adds support for configuring the number of workers in run.yaml configuration files. Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>	2025-10-31 13:48:55 -07:00
Ashwin Bharambe	b088665227	fix(ci): unset empty UV index env vars to prevent uv errors (#4013 ) Cherry-pick of #4012 to release-0.3.x Fixes container builds failing with UV index strategy errors when build args are passed with empty values. Docker ARGs declared with empty defaults (ARG UV_INDEX_STRATEGY="") become environment variables with empty string values in RUN commands. UV interprets these as if --index-strategy "" was passed on the command line, causing build failures with "error: a value is required for '--index-strategy <UV_INDEX_STRATEGY>'". This is a footgun because empty string ≠ unset variable, and ARGs silently propagate to all RUN commands, only failing when declared with empty defaults. The fix unsets UV_EXTRA_INDEX_URL and UV_INDEX_STRATEGY at the start of RUN blocks, saves the values early, and only restores them for editable installs with RC dependencies. All other install modes (PyPI, test-pypi, client) now run with a clean environment.	2025-10-31 13:45:47 -07:00
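The "empty string ≠ unset variable" point in b088665227 applies to any process environment, not only Docker. A minimal Python sketch, assuming a hypothetical helper `drop_empty_uv_vars` that cleans an environment mapping before it is handed to a subprocess:

```python
UV_INDEX_VARS = ("UV_EXTRA_INDEX_URL", "UV_INDEX_STRATEGY")


def drop_empty_uv_vars(env: dict) -> dict:
    """Return a copy of env with set-but-empty UV index variables removed."""
    cleaned = dict(env)
    for name in UV_INDEX_VARS:
        # "" means the variable is set but empty; only that case is removed.
        if cleaned.get(name) == "":
            del cleaned[name]
    return cleaned


example = drop_empty_uv_vars(
    {"UV_INDEX_STRATEGY": "", "UV_EXTRA_INDEX_URL": "https://test.pypi.org/simple/"}
)
```

This is simpler than the fix described in the commit, which unsets both variables and restores them only for one install mode; the sketch only shows why the empty value has to be removed rather than left in place.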
Ashwin Bharambe	73d70546d4	chore(release-0.3.x): handle missing external_providers_dir (#4011 ) Cherry-pick of #3974 to release-0.3.x branch. ## Summary - Fixes handling of missing external_providers_dir in stack configuration ## Original PR Fixes from #3974 Signed-off-by: Doug Edgar <dedgar@redhat.com> Co-authored-by: Doug Edgar <dedgar@redhat.com>	2025-10-31 12:55:34 -07:00
Ashwin Bharambe	a488d8ce10	fix(ci): install client from release branch before uv sync (#4002 ) Backport of #4001 to release-0.3.x branch. Fixes CI failures on release branches where uv sync can't resolve RC dependencies. ## The Problem On release branches like `release-0.3.x`, pyproject.toml requires `llama-stack-client>=0.3.1rc1`. RC versions only exist on test.pypi, not PyPI. This causes multiple CI failures: 1. `uv sync` fails because it can't resolve RC versions from PyPI 2. pre-commit hooks (uv-lock, codegen) fail for the same reason 3. mypy workflow section needs uv installed ## The Solution Configure UV to use test.pypi when on release branches: - Set `UV_INDEX_URL=https://test.pypi.org/simple/` (primary) - Set `UV_EXTRA_INDEX_URL=https://pypi.org/simple/` (fallback) - Set `UV_INDEX_STRATEGY=unsafe-best-match` to check both indexes This allows `uv sync` to resolve common packages from PyPI and RC versions from test.pypi. ## Additional Fixes - Export UV env vars to `GITHUB_ENV` so pre-commit hooks inherit them - Install uv in pre-commit workflow for mypy section - Handle missing `type_checking` dependency group on release-0.3.x - Regenerate uv.lock with RC versions for the release branch ## Changes - Created reusable `install-llama-stack-client` action for configuration - Modified `setup-runner` to set UV environment variables before sync - Modified `pre-commit` workflow to configure client and export env vars - Updated uv.lock with RC versions from test.pypi This is a cherry-pick of commits `afa9f0882`, `c86e6e906`, `626639bee`, and `081566321` from main, plus additional fixes for release branch compatibility.	2025-10-31 11:44:05 -07:00
github-actions[bot]	f8272b2faf	Release candidate 0.3.1rc1	2025-10-31 04:54:54 +00:00
Ashwin Bharambe	39f33f7f12	feat(cherry-pick): fixes for 0.3.1 release (#3998 ) ## Summary Cherry-picks 5 critical fixes from main to the release-0.3.x branch for the v0.3.1 release, plus CI workflow updates. Note: This recreates the cherry-picks from the closed PR #3991, now targeting the renamed `release-0.3.x` branch (previously `release-0.3.x-maint`). ## Commits 1. `2c56a8560` - fix(context): prevent provider data leak between streaming requests (#3924) - CRITICAL SECURITY FIX: Prevents provider credentials from leaking between requests - Fixed import path for 0.3.0 compatibility 2. `ddd32b187` - fix(inference): enable routing of models with provider_data alone (#3928) - Enables routing for fully qualified model IDs with provider_data - Resolved merge conflicts, adapted for 0.3.0 structure 3. `f7c2973aa` - fix: Avoid BadRequestError due to invalid max_tokens (#3667) - Fixes failures with Gemini and other providers that reject max_tokens=0 - Non-breaking API change 4. `d7f9da616` - fix(responses): sync conversation before yielding terminal events in streaming (#3888) - Ensures conversation sync executes even when streaming consumers break early 5. `0ffa8658b` - fix(logging): ensure logs go to stderr, loggers obey levels (#3885) - Fixes logging infrastructure 6. `75b49cb3c` - ci: support release branches and match client branch (#3990) - Updates CI workflows to support release-X.Y.x branches - Matches client branch from llama-stack-client-python for release testing - Fixes artifact name collisions ## Adaptations for 0.3.0 - Fixed import paths: `llama_stack.core.telemetry.tracing` → `llama_stack.providers.utils.telemetry.tracing` - Fixed import paths: `llama_stack.core.telemetry.telemetry` → `llama_stack.apis.telemetry` - Changed `self.telemetry_enabled` → `self.telemetry` (0.3.0 attribute name) - Removed `rerank()` method that doesn't exist in 0.3.0 ## Testing All imports verified and tests should pass once CI is set up.	2025-10-30 21:51:42 -07:00
github-actions[bot]	bf091306fe	build: Bump version to 0.3.0	2025-10-21 23:58:10 +00:00
github-actions[bot]	aabd7ac897	Release candidate 0.3.0rc6	2025-10-21 23:43:32 +00:00
Ashwin Bharambe	c0c0e337d9	misc(tests): add recordings for responses tests	2025-10-21 16:39:08 -07:00
Ashwin Bharambe	557b1b8c2d	fix(logs): restore uvicorn and llama_stack logger settings	2025-10-21 15:47:55 -07:00
slekkala1	eb2b240594	fix: remove consistency checks (#3881) # What does this PR do? Metadata conflicts with the default embedding model set on the server side via extra body, so this removes the check and lets metadata take precedence over extra body. The check previously raised: `ValueError: Embedding model inconsistent between metadata ('text-embedding-3-small') and extra_body ('sentence-transformers/nomic-ai/nomic-embed-text-v1.5')` ## Test Plan CI	2025-10-21 14:40:14 -07:00
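The precedence rule in eb2b240594 amounts to a simple fallback. In the sketch below, the function name and the `embedding_model` key are assumptions for illustration; the model names are the ones from the error message above.

```python
from typing import Optional


def choose_embedding_model(metadata: dict, extra_body: dict) -> Optional[str]:
    """Pick the embedding model, letting metadata win over extra_body."""
    # No consistency check: a differing extra_body value is simply ignored.
    return metadata.get("embedding_model") or extra_body.get("embedding_model")


server_default = {"embedding_model": "sentence-transformers/nomic-ai/nomic-embed-text-v1.5"}
from_metadata = choose_embedding_model({"embedding_model": "text-embedding-3-small"}, server_default)
from_default = choose_embedding_model({}, server_default)
```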
Alexey Rybak	4c718523fa	docs: fix the building distro file (#3880 ) # What does this PR do? * Fixes the doc server build (which expects a blank line after imports) ## Test Plan * `cd docs && npm run build`	2025-10-21 14:26:35 -07:00
slekkala1	cb6a5e2687	fix: fix segfault in load model (#3879) # What does this PR do? Fix segfault when loading the model. The cc-vec integration failed with a segfault when used with the default embedding model on macOS (`model_id: nomic-ai/nomic-embed-text-v1.5` and `provider_id: sentence-transformers`). The crash report shows this is due to torch OpenMP settings. Constraining to 1 thread works without crashes. ## Test Plan Tested with the cc-vec integration: 1. Start the server with `llama stack run starter`. 2. Do the setup in https://github.com/raghotham/cc-vec to set env variables and try `uv run cc-vec index --url-patterns "%.github.io" --vector-store-name "ml-research" --limit 50 --chunk-size 800 --overlap 400`	2025-10-21 12:21:06 -07:00
ehhuang	1ec7216c3f	chore: update quick_start (#3878)	2025-10-21 11:33:23 -07:00
Ashwin Bharambe	bd3c473208	revert: "chore(cleanup)!: remove tool_runtime.rag_tool" (#3877) Reverts llamastack/llama-stack#3871. That PR broke RAG (even from Responses; there _is_ a dependency).	2025-10-21 11:22:06 -07:00
ehhuang	eb3e9b85f9	chore: update getting_started (#3875)	2025-10-21 11:09:45 -07:00
Ashwin Bharambe	71ead88bce	fix(logging): move module-level initialization to explicit setup calls (#3874 ) - Moved environment variable parsing and `setup_logging()` call from module level to proper initialization points - Added explicit `setup_logging()` calls in `server.py::create_app()` and `library_client.py::AsyncLlamaStackAsLibraryClient.__init__()` Module-level side effects are bad practice and can cause issues with import order, testing, and circular dependencies. The previous implementation ran logging setup on every import of the log module, which is unpredictable and difficult to control. --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-10-21 11:08:25 -07:00
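The change in 71ead88bce, replacing import-time logging setup with explicit calls, follows a common Python pattern. The sketch below is an assumption-laden illustration: the logger name `demo_app`, the `DEMO_LOG_LEVEL` variable, and this `create_app` are invented, and only the function name `setup_logging` comes from the commit message.

```python
import logging
import os
from typing import Optional

_configured = False


def setup_logging(level_name: Optional[str] = None) -> None:
    """Configure the demo logger once, only when called explicitly."""
    global _configured
    if _configured:
        return
    # Environment parsing happens here, not at import time.
    level_name = level_name or os.environ.get("DEMO_LOG_LEVEL", "INFO")
    level = getattr(logging, level_name.upper(), logging.INFO)
    logger = logging.getLogger("demo_app")
    logger.addHandler(logging.StreamHandler())  # StreamHandler defaults to stderr
    logger.setLevel(level)
    _configured = True


def create_app() -> dict:
    # The entry point decides when logging is configured.
    setup_logging()
    return {"name": "demo"}


app = create_app()
```

Importing a module written this way has no side effects, so tests and library users can import it without having handlers installed behind their backs.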
Ashwin Bharambe	9191005ca1	fix(ci): dump server/container logs when tests fail (#3873) Output last 100 lines of server.log or docker container logs when integration tests fail to aid debugging.	2025-10-20 22:28:55 -07:00
Ashwin Bharambe	0e96279bee	chore(cleanup)!: remove tool_runtime.rag_tool (#3871 ) Kill the `builtin::rag` tool group completely since it is no longer targeted. We use the Responses implementation for knowledge_search which uses the `openai_vector_stores` pathway. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-10-20 22:26:21 -07:00
Ashwin Bharambe	5aaf1a8bca	fix(ci): improve workflow logging and bot notifications (#3872 ) ## Summary - Link pre-commit bot comment to workflow run instead of PR for better debugging - Dump docker container logs before removal to ensure logs are actually captured ## Changes 1. Pre-commit bot: Changed the initial bot comment to link "pre-commit hooks" text to the actual workflow run URL instead of just having the PR number auto-link 2. Docker logs: Moved docker container log dumping from GitHub Actions to the integration-tests.sh script's stop_container() function, ensuring logs are captured before container removal ## Test plan - Pre-commit bot comment will now have a clickable link to the workflow run - Docker container logs will be successfully captured in CI runs	2025-10-20 22:08:15 -07:00
Ashwin Bharambe	122de785c4	chore(cleanup)!: kill vector_db references as far as possible (#3864 ) There should not be "vector db" anywhere.	2025-10-20 20:06:16 -07:00


3006 commits