Commit graph

3021 commits

Author SHA1 Message Date
mergify[bot]
01736b1f5c
chore: bump mcp package version (backport #4287) (#4288)
# What does this PR do?

Address

https://github.com/modelcontextprotocol/python-sdk/security/advisories/GHSA-9h52-p55h-vw2f

<hr>This is an automatic backport of pull request #4287 done by
[Mergify](https://mergify.com).

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
2025-12-03 17:48:59 +01:00
github-actions[bot]
2682916d6d chore: update lockfiles for 0.3.4rc2 2025-12-03 15:39:31 +00:00
github-actions[bot]
18ed8cc0c9 Release candidate 0.3.4rc2 2025-12-03 15:30:46 +00:00
mergify[bot]
0899f78943
fix: Avoid model_limits KeyError (backport #4060) (#4283)
# What does this PR do?
It avoids a `model_limits` KeyError when retrieving embedding models for
Watsonx.
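
A minimal sketch of the defensive lookup this implies (the dict shape and helper name are hypothetical, for illustration only):

```python
# Hypothetical sketch: tolerate catalog entries that have no "model_limits"
# key instead of indexing the dict directly and raising KeyError.
def embedding_dimension(model_spec: dict) -> int | None:
    limits = model_spec.get("model_limits")  # None when the key is absent
    if not limits:
        return None
    return limits.get("embedding_dimension")
```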


Closes https://github.com/llamastack/llama-stack/issues/4059

## Test Plan

Start server with watsonx distro:
```bash
llama stack list-deps watsonx | xargs -L1 uv pip install
uv run llama stack run watsonx
```
Run 
```python
client = LlamaStackClient(base_url=base_url)
client.models.list()
```
Check if there is any embedding model available (currently there is not
a single one)<hr>This is an automatic backport of pull request #4060
done by [Mergify](https://mergify.com).

Co-authored-by: Wojciech-Rebisz <147821486+Wojciech-Rebisz@users.noreply.github.com>
2025-12-03 10:56:24 +01:00
mergify[bot]
9b68b38c55
fix: Add policies to adapters (backport #4277) (#4279)
The configured policy wasn't being passed in; the default was being used
instead (e.g. in the S3 files provider).
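
A rough sketch of the pattern being fixed, assuming a provider config that carries the access-control policy (the class and attribute names are illustrative, not the actual adapter API):

```python
# Illustrative only: thread the configured policy through to the adapter
# instead of silently falling back to the built-in default policy.
class FilesAdapter:
    def __init__(self, config, default_policy) -> None:
        # before the fix, effectively: self.policy = default_policy
        self.policy = config.policy if config.policy is not None else default_policy
```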

Closes: #4276
<hr>This is an automatic backport of pull request #4277 done by
[Mergify](https://mergify.com).

Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Derek Higgins <derekh@redhat.com>
2025-12-02 13:27:54 -08:00
github-actions[bot]
63e2e7534f chore: update lockfiles for 0.3.4rc1
2025-12-02 14:50:13 +00:00
github-actions[bot]
6eac1005ab Release candidate 0.3.4rc1 2025-12-02 14:41:51 +00:00
Sébastien Han
384981094a
fix: uninitialised enable_write_queue (#4264)
# What does this PR do?

- Fix uv.lock
- Fix uninitialised variable

**Against stable branch, main does not have this issue.**

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-12-02 09:37:21 -05:00
mergify[bot]
c7fd3c4151
chore: bump starlette version (backport #4158) (#4248)
# What does this PR do?

Require at least 0.49.1 which fixes a security vulnerability in the
parsing logic of the Range header in FileResponse. Release note:
https://github.com/Kludex/starlette/releases/tag/0.49.1
<hr>This is an automatic backport of pull request #4158 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-12-01 10:21:16 -08:00
github-actions[bot]
1d251b489a chore: update lockfiles for 0.3.3
2025-11-24 21:15:11 +00:00
github-actions[bot]
2424a3d3b2 build: Bump version to 0.3.3 2025-11-24 21:12:52 +00:00
github-actions[bot]
ff6d8d5a50 chore: update lockfiles for 0.3.3rc1 2025-11-24 20:55:14 +00:00
github-actions[bot]
4f19fac36e Release candidate 0.3.3rc1 2025-11-24 20:10:59 +00:00
mergify[bot]
2d5ed5d0f5
fix: update hard-coded google model names (backport #4212) (#4229)
# What does this PR do?
When we send model names to Google's OpenAI-compatible API, we must use the
"google" name prefix. Google does not recognize the "vertexai" model
names.
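
A hedged sketch of the renaming implied above (the helper and its behavior are illustrative; the actual change updates hard-coded model entries):

```python
# Illustrative: strip the Stack-side "vertexai/" namespace so the upstream
# request uses the "google/..." name that Google's endpoint recognizes.
def to_upstream_model_name(stack_model_id: str) -> str:
    provider_id, _, provider_model_id = stack_model_id.partition("/")
    if provider_id == "vertexai":
        # e.g. "vertexai/google/gemini-2.5-flash" -> "google/gemini-2.5-flash"
        return provider_model_id
    return stack_model_id
```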

Closes #4211

## Test Plan
```bash
uv venv --python python312
. .venv/bin/activate
llama stack list-deps starter | xargs -L1 uv pip install
llama stack run starter
```

Test that this shows the gemini models with their correct names:
```bash
curl http://127.0.0.1:8321/v1/models | jq '.data | map(select(.custom_metadata.provider_id == "vertexai"))'
```

Test that this chat completion works:
```bash
curl -X POST   -H "Content-Type: application/json"   "http://127.0.0.1:8321/v1/chat/completions"   -d '{
        "model": "vertexai/google/gemini-2.5-flash",
        "messages": [
          {
            "role": "system",
            "content": "You are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hello! Can you tell me a joke?"
          }
        ],
        "temperature": 1.0,
        "max_tokens": 256
      }'
```<hr>This is an automatic backport of pull request #4212 done by
[Mergify](https://mergify.com).

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ken Dreyer <kdreyer@redhat.com>
2025-11-24 11:32:14 -08:00
mergify[bot]
05b4394cf9
fix: enforce allowed_models during inference requests (backport #4197) (#4228)
The `allowed_models` configuration was only being applied when listing
models via the `/v1/models` endpoint, but the actual inference requests
weren't checking this restriction. This meant users could directly
request any model the provider supports by specifying it in their
inference call, completely bypassing the intended cost controls.

The fix adds validation to all three inference methods (chat
completions, completions, and embeddings) that checks the requested
model against the allowed_models list before making the provider API
call.
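
A minimal sketch of that check (names are illustrative, not the exact router code):

```python
# Illustrative guard, run before each provider call (chat completions,
# completions, and embeddings); allowed_models=None means "no restriction".
def check_model_allowed(model_id: str, allowed_models: list[str] | None) -> None:
    if allowed_models is not None and model_id not in allowed_models:
        raise ValueError(f"model '{model_id}' is not in this provider's allowed_models")
```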

### Test plan

Added unit tests <hr>This is an automatic backport of pull request #4197
done by [Mergify](https://mergify.com).

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-24 11:31:36 -08:00
mergify[bot]
0df6d4601f
fix(docs): fix glob vulnerability (backport #4193) (#4227)
Add an npm override so the docs workspace resolves glob@10.5+.
<hr>This is an automatic backport of pull request #4193 done by
[Mergify](https://mergify.com).

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-24 11:31:15 -08:00
mergify[bot]
b9299a20ed
fix: enable SQLite WAL mode to prevent database locking errors (backport #4048) (#4226)
Fixes race condition causing "database is locked" errors during
concurrent writes to SQLite, particularly in streaming responses with
guardrails where multiple inference calls write simultaneously.

Enable Write-Ahead Logging (WAL) mode for SQLite which allows multiple
concurrent readers and one writer without blocking. Set busy_timeout to
5s so SQLite retries instead of failing immediately. Remove the logic
that disabled write queues for SQLite since WAL mode eliminates the
locking issues that prompted disabling them.
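
For reference, the two PRAGMAs involved look roughly like this with the standard `sqlite3` module (the provider itself goes through an async wrapper, so treat this as a sketch):

```python
import sqlite3

conn = sqlite3.connect("llama_stack.db")  # path is illustrative
# WAL allows concurrent readers alongside a single writer.
conn.execute("PRAGMA journal_mode=WAL;")
# Wait up to 5s for a locked database instead of failing immediately.
conn.execute("PRAGMA busy_timeout=5000;")
```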

Fixes: test_output_safety_guardrails_safe_content[stream=True]
flake<hr>This is an automatic backport of pull request #4048 done by
[Mergify](https://mergify.com).

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-24 11:30:57 -08:00
mergify[bot]
46bd95e453
fix: Vector store persistence across server restarts (backport #3977) (#4225)
# What does this PR do?

This PR fixes a bug in LlamaStack 0.3.0 where vector stores created via
the OpenAI-compatible API (`POST /v1/vector_stores`) would fail with
`VectorStoreNotFoundError` after server restart when attempting
operations like `vector_io.insert()` or `vector_io.query()`.

The bug affected **6 vector IO providers**: `pgvector`, `sqlite_vec`,
`chroma`, `milvus`, `qdrant`, and `weaviate`.

Created with the assistance of: claude-4.5-sonnet

## Root Cause

All affected providers had a broken
`_get_and_cache_vector_store_index()` method that:
1. Did not load existing vector stores from persistent storage during
initialization
2. Attempted to use `vector_store_table` (which was either `None` or a
`KVStore` without the required `get_vector_store()` method)
3. Could not reload vector stores after server restart or cache miss

## Solution

This PR implements a consistent pattern across all 6 providers:

1. **Load vector stores during initialization** - Pre-populate the cache
from KV store on startup
2. **Fix lazy loading** - Modified `_get_and_cache_vector_store_index()`
to load directly from KV store instead of relying on
`vector_store_table`
3. **Remove broken dependency** - Eliminated reliance on the
`vector_store_table` pattern
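
A simplified sketch of the lazy-loading fix (points 1 and 2 above), assuming a KV store keyed by vector store id; the key format, helper names, and surrounding class are illustrative:

```python
# Illustrative sketch of _get_and_cache_vector_store_index(): on a cache miss,
# reload the vector store from the KV store so it survives server restarts.
# self.cache, self.kvstore, self._build_index, and VectorStoreNotFoundError
# are assumed to exist on the provider.
import json

async def _get_and_cache_vector_store_index(self, vector_store_id: str):
    if vector_store_id in self.cache:
        return self.cache[vector_store_id]
    stored = await self.kvstore.get(f"vector_stores:{vector_store_id}")
    if stored is None:
        raise VectorStoreNotFoundError(vector_store_id)
    index = await self._build_index(json.loads(stored))  # provider-specific rebuild
    self.cache[vector_store_id] = index
    return index
```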

## Testing steps

### 1.1 Configure the stack

Create or use an existing configuration with a vector IO provider.

**Example `run.yaml`:**

```yaml
vector_io_store:
  - provider_id: pgvector
    provider_type: remote::pgvector
    config:
      host: localhost
      port: 5432
      db: llamastack
      user: llamastack
      password: llamastack

inference:
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config:
      model: sentence-transformers/all-MiniLM-L6-v2
```

### 1.2 Start the server

```bash
llama stack run run.yaml --port 5000
```

Wait for the server to fully start. You should see:

```
INFO: Started server process
INFO: Application startup complete
```

---

## Step 2: Create a Vector Store

### 2.1 Create via API

```bash
curl -X POST http://localhost:5000/v1/vector_stores \
  -H "Content-Type: application/json" \
  -d '{
    "name": "test-persistence-store",
    "extra_body": {
      "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
      "embedding_dimension": 384,
      "provider_id": "pgvector"
    }
  }' | jq
```

### 2.2 Expected Response

```json
{
  "id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
  "object": "vector_store",
  "name": "test-persistence-store",
  "status": "completed",
  "created_at": 1730304000,
  "file_counts": {
    "total": 0,
    "completed": 0,
    "in_progress": 0,
    "failed": 0,
    "cancelled": 0
  },
  "usage_bytes": 0
}
```

**Save the `id` field** (e.g.,
`vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d`) — you’ll need it for the next
steps.

---

## Step 3: Insert Data (Before Restart)

### 3.1 Insert chunks into the vector store

```bash
export VS_ID="vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"

curl -X POST http://localhost:5000/vector-io/insert \
  -H "Content-Type: application/json" \
  -d "{
    \"vector_store_id\": \"$VS_ID\",
    \"chunks\": [
      {
        \"content\": \"Python is a high-level programming language known for its readability.\",
        \"metadata\": {\"source\": \"doc1\", \"page\": 1}
      },
      {
        \"content\": \"Machine learning enables computers to learn from data without explicit programming.\",
        \"metadata\": {\"source\": \"doc2\", \"page\": 1}
      },
      {
        \"content\": \"Neural networks are inspired by biological neurons in the brain.\",
        \"metadata\": {\"source\": \"doc3\", \"page\": 1}
      }
    ]
  }"
```

### 3.2 Expected Response

Status: **200 OK**  
Response: *Empty or success confirmation*

---

## Step 4: Query Data (Before Restart – Baseline)

### 4.1 Query the vector store

```bash
curl -X POST http://localhost:5000/vector-io/query \
  -H "Content-Type: application/json" \
  -d "{
    \"vector_store_id\": \"$VS_ID\",
    \"query\": \"What is machine learning?\"
  }" | jq
```

### 4.2 Expected Response

```json
{
  "chunks": [
    {
      "content": "Machine learning enables computers to learn from data without explicit programming.",
      "metadata": {"source": "doc2", "page": 1}
    },
    {
      "content": "Neural networks are inspired by biological neurons in the brain.",
      "metadata": {"source": "doc3", "page": 1}
    }
  ],
  "scores": [0.85, 0.72]
}
```

**Checkpoint:** Works correctly before restart.

---

## Step 5: Restart the Server (Critical Test)

### 5.1 Stop the server

In the terminal where it’s running:

```
Ctrl + C
```

Wait for:

```
Shutting down...
```

### 5.2 Restart the server

```bash
llama stack run run.yaml --port 5000
```

Wait for:

```
INFO: Started server process
INFO: Application startup complete
```

The vector store cache is now empty, but data should persist.

---

## Step 6: Verify Vector Store Exists (After Restart)

### 6.1 List vector stores

```bash
curl http://localhost:5000/v1/vector_stores | jq
```

### 6.2 Expected Response

```json
{
  "object": "list",
  "data": [
    {
      "id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
      "name": "test-persistence-store",
      "status": "completed"
    }
  ]
}
```

**Checkpoint:** Vector store should be listed.

---

## Step 7: Insert Data (After Restart – THE BUG TEST)

### 7.1 Insert new chunks

```bash
curl -X POST http://localhost:5000/vector-io/insert \
  -H "Content-Type: application/json" \
  -d "{
    \"vector_store_id\": \"$VS_ID\",
    \"chunks\": [
      {
        \"content\": \"This chunk was inserted AFTER the server restart.\",
        \"metadata\": {\"source\": \"post-restart\", \"test\": true}
      }
    ]
  }"
```

### 7.2 Expected Results

**With Fix (Correct):**
```
Status: 200 OK
Response: Success
```

**Without Fix (Bug):**
```json
{
  "detail": "VectorStoreNotFoundError: Vector Store 'vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d' not found."
}
```

**Critical Test:** If insertion succeeds, the fix works.

---

## Step 8: Query Data (After Restart – Verification)

### 8.1 Query all data

```bash
curl -X POST http://localhost:5000/vector-io/query \
  -H "Content-Type: application/json" \
  -d "{
    \"vector_store_id\": \"$VS_ID\",
    \"query\": \"restart\"
  }" | jq
```

### 8.2 Expected Response

```json
{
  "chunks": [
    {
      "content": "This chunk was inserted AFTER the server restart.",
      "metadata": {"source": "post-restart", "test": true}
    }
  ],
  "scores": [0.95]
}
```

**Checkpoint:** Both old and new data are queryable.

---

## Step 9: Multiple Restart Test (Extra Verification)

### 9.1 Restart again

```bash
Ctrl + C
llama stack run run.yaml --port 5000
```

### 9.2 Query after restart

```bash
curl -X POST http://localhost:5000/vector-io/query \
  -H "Content-Type: application/json" \
  -d "{
    \"vector_store_id\": \"$VS_ID\",
    \"query\": \"programming\"
  }" | jq
```

**Expected:** Works correctly across multiple restarts.



<hr>This is an automatic backport of pull request #3977 done by
[Mergify](https://mergify.com).

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Juan Pérez de Algaba <124347725+jperezdealgaba@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-11-24 11:30:21 -08:00
mergify[bot]
f216eb99be
fix: allowed_models config did not filter models (backport #4030) (#4223)
# What does this PR do?

closes #4022 

## Test Plan

ci w/ new tests<hr>This is an automatic backport of pull request #4030
done by [Mergify](https://mergify.com).

Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-24 11:29:53 -08:00
github-actions[bot]
49a290e53e chore: update lockfiles for 0.3.2
2025-11-12 23:21:28 +00:00
github-actions[bot]
1536b8e890 build: Bump version to 0.3.2 2025-11-12 23:19:12 +00:00
github-actions[bot]
dbef00de28 chore: update lockfiles for 0.3.2rc3 2025-11-12 22:48:00 +00:00
github-actions[bot]
01ff0cb9e2 Release candidate 0.3.2rc3 2025-11-12 22:33:56 +00:00
github-actions[bot]
56a723c800 Release candidate 0.3.2rc2 2025-11-12 22:12:27 +00:00
github-actions[bot]
096a3c6013 Release candidate 0.3.2rc1 2025-11-12 21:43:15 +00:00
mergify[bot]
641d5144be
fix(inference): enable routing of models with provider_data alone (backport #3928) (#4142)
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.

Here's the situation: assume a remote inference provider that works
only when users provide their own API keys via the
`X-LlamaStack-Provider-Data` header. By definition, we cannot list
models and hence cannot update our routing registry. But because we now
_require_ a provider ID in model IDs, we can identify which provider to
route to and let that provider decide.

Note that we still consult our registry, since it may contain a
pre-registered alias; we just don't fail outright when the lookup misses.

Also updated the inference router so that responses carry the _exact_
model name the request used.
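
Roughly, the routing decision described above looks like this (a hedged sketch with illustrative names; the real router handles more cases):

```python
# Illustrative sketch: resolve "provider_id/model_id" even when the model was
# never registered, by falling back to the provider named in the prefix.
def resolve_provider(model: str, registry: dict[str, str]) -> str:
    if model in registry:  # a pre-registered alias still wins
        return registry[model]
    provider_id, sep, _rest = model.partition("/")
    if not sep:
        raise ValueError(f"unknown model '{model}': not registered and no provider prefix")
    return provider_id  # let the named provider decide whether it can serve it
```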

## Test Plan

Added an integration test

Closes #3929<hr>This is an automatic backport of pull request #3928 done
by [Mergify](https://mergify.com).

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
2025-11-12 13:41:27 -08:00
mergify[bot]
a6c3a9cadf
fix: harden storage semantics (backport #4118) (#4138)
Fixes issues in the storage system by guaranteeing immediate durability
for responses and ensuring background writers stay alive. Three related
fixes:

* Responses to the OpenAI-compatible API now write directly to
Postgres/SQLite inside the request instead of detouring through an async
queue that might never drain; this restores the expected
read-after-write behavior and removes the "response not found" races
reported by users.

* The access-control shim was stamping owner_principal/access_attributes
as SQL NULL, which Postgres interprets as non-public rows; fixing it to
use the empty-string/JSON-null pattern means conversations and responses
stored without an authenticated user stay queryable (matching SQLite).

* The inference-store queue remains for batching, but its worker tasks
now start lazily on the live event loop so server startup doesn't cancel
them—writes keep flowing even when the stack is launched via llama stack
run.
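
Below is a minimal sketch of the lazy-start pattern for the background writer described in the last point, assuming plain asyncio (names are illustrative, not the actual store code):

```python
import asyncio
from typing import Any

class LazyWriteQueue:
    """Illustrative: the drain task is created on first use, on the live loop."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue[Any] = asyncio.Queue()
        self._worker: asyncio.Task | None = None

    async def put(self, item: Any) -> None:
        # Start the worker lazily on the event loop that is serving requests,
        # so server startup can't cancel it before it ever runs.
        if self._worker is None or self._worker.done():
            self._worker = asyncio.create_task(self._drain())
        await self._queue.put(item)

    async def _drain(self) -> None:
        while True:
            item = await self._queue.get()
            ...  # flush the item (or a batch) to Postgres/SQLite
```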

Closes #4115 

### Test Plan

Added a matrix entry to test our "base" suite against Postgres as the
store.<hr>This is an automatic backport of pull request #4118 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-12 13:01:21 -08:00
mergify[bot]
56d87f5133
chore(ci): remove unused recordings (backport #4074) (#4141)
Added a script to clean up recordings. While doing this, moved the CI
matrix generation to a separate script so there is a single source of
truth for the matrix.

Ran the cleanup script as:
```
PYTHONPATH=. python scripts/cleanup_recordings.py
```

Also added this as part of the pre-commit workflow to ensure that the
recordings are always up to date and that no stale recordings are left
in the repo.
<hr>This is an automatic backport of pull request #4074 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-12 12:36:28 -08:00
mergify[bot]
0d525d9a24
docs: clarify model identification uses provider_model_id not model_id (backport #4128) (#4137)
Updated documentation to accurately reflect current behavior where
models are identified as provider_id/provider_model_id in the system.

Changes:

- Clarify that model_id is for configuration purposes only
- Explain models are accessed as provider_id/provider_model_id
- Remove outdated aliasing example that suggested model_id could be used
  as a custom identifier

This corrects the documentation which previously suggested model_id
could be used to create friendly aliases, which is not how the code
actually works.
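
For illustration, a hedged example of what this means in practice (the client usage is assumed, not taken from the PR):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")
# Models are identified as "<provider_id>/<provider_model_id>",
# e.g. "vertexai/google/gemini-2.5-flash"; model_id is not a friendly alias.
for model in client.models.list():
    print(model)
```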
<hr>This is an automatic backport of pull request #4128 done by
[Mergify](https://mergify.com).

Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Derek Higgins <derekh@redhat.com>
2025-11-12 10:41:23 -08:00
mergify[bot]
bae22060de
docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (backport #4122) (#4136)
# What does this PR do?
In the **Detailed Tutorial**, at **Step 3**, the **Install with venv**
option creates a new virtual environment `client`, activates it, then
attempts to install the llama-stack-client using pip.
```
uv venv client --python 3.12
source client/bin/activate
pip install llama-stack-client    <- this is the problematic line
```
However, the pip command will likely fail because `uv venv` does not, by
default, install pip into the virtual environment it creates. The pip
command will error either because pip doesn't exist at all, or, if a pip
does exist outside of the virtual environment, it will return a different
error message; in the latter case it may be unclear to the user why the
install is failing.

This PR changes 'pip' to 'uv pip', allowing the install action to
function in the virtual environment as intended, and without the need
for pip to be installed.




## Test Plan
1. Use linux or WSL (virtual environments on Windows use `Scripts`
folder instead of `bin` [virtualenv
#993ba13](993ba1316a)
which doesn't align with the tutorial)
2. Clone the `llama-stack` repo
3. Run the following and verify success:
```
uv venv client --python 3.12
source client/bin/activate
```
4. Run the updated command:
```
uv pip install llama-stack-client
```
5. Observe that the console output confirms the virtual environment
`client` was used:

> Using Python 3.12.3 environment at: **client**<hr>This is an automatic
backport of pull request #4122 done by [Mergify](https://mergify.com).

Co-authored-by: paulengineer <154521137+paulengineer@users.noreply.github.com>
2025-11-12 10:41:15 -08:00
mergify[bot]
a380b5fcb1
fix: print help for list-deps if no args (backport #4078) (#4083)
# What does this PR do?

list-deps takes positional args OR options like `--providers`.

The issue is that these args need to be optional, since by nature one or
the other can be specified.

Add a check to list-deps for `if not args.providers and not
args.config`. If this is true, help is printed and we exit.
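
A minimal sketch of that guard (argument names follow the CLI help shown in the test plan below; treat the function as illustrative, not the exact code):

```python
# Illustrative: bail out with help when neither a config/distro nor
# --providers was supplied, instead of falling through to an unbound
# build_config and crashing.
def run_stack_list_deps_command(args, parser) -> int:
    if not args.providers and not args.config:
        parser.print_help()
        return 1
    ...  # resolve build_config and print the provider dependencies
    return 0
```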

resolves #4075

## Test Plan
before:

```
╰─ llama stack list-deps
Traceback (most recent call last):
  File "/Users/charliedoern/projects/Documents/llama-stack/venv/bin/llama", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 52, in main
    parser.run(args)
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 43, in run
    args.func(args)
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/list_deps.py", line 51, in _run_stack_list_deps_command
    return run_stack_list_deps_command(args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/_list_deps.py", line 135, in run_stack_list_deps_command
    normal_deps, special_deps, external_provider_dependencies = get_provider_dependencies(build_config)
                                                                                          ^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'build_config' where it is not associated with a value

```

after:

```
╰─ llama stack list-deps
usage: llama stack list-deps [-h] [--providers PROVIDERS] [--format {uv,deps-only}] [config | distro]

list the dependencies for a llama stack distribution

positional arguments:
  config | distro       Path to config file to use or name of known distro (llama stack list for a list). (default: None)

options:
  -h, --help            show this help message and exit
  --providers PROVIDERS
                        sync dependencies for a list of providers and only those providers. This list is formatted like: api1=provider1,api2=provider2. Where there can be multiple
                        providers per API. (default: None)
  --format {uv,deps-only}
                        Output format: 'uv' shows shell commands, 'deps-only' shows just the list of dependencies without `uv` (default) (default: deps-only)
 ```
<hr>This is an automatic backport of pull request #4078 done by [Mergify](https://mergify.com).

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Charlie Doern <cdoern@redhat.com>
2025-11-05 14:58:47 -08:00
Ashwin Bharambe
8b878e9d48
fix(ci): export UV_INDEX_STRATEGY to current shell before running uv sync (#4019)
Fixes #4017 follow-up issue where UV_INDEX_STRATEGY was only exported to
GITHUB_ENV but not to the current shell.

The commit e0bb7529 fixed the empty string issue but introduced a new
bug: UV_INDEX_STRATEGY was only exported to GITHUB_ENV (for subsequent
steps), not to the current shell environment. Since uv sync runs in the
same step, it never saw the variable.

This caused all CI runs on release-0.3.x to fail with dependency
resolution errors like:

```
setuptools was found on https://test.pypi.org/simple/, but not at the requested version.
A compatible version may be available on PyPI. Use --index-strategy unsafe-best-match.
```

This fix adds `export UV_INDEX_STRATEGY=unsafe-best-match` to make the
variable available in the current shell before running uv commands.

Note: Main branch doesn't hit this bug because UV_EXTRA_INDEX_URL is
only set on release branches.
2025-11-01 12:54:19 -07:00
Ashwin Bharambe
e0bb7529ed
fix: only set UV_INDEX_STRATEGY when UV_EXTRA_INDEX_URL is present (#4017)
Cherry-pick of bc12fe6c4 to release-0.3.x

Fixes GitHub Actions workflows failing with UV index strategy errors
when testing on RC tags and non-release branches.

The issue was that UV_INDEX_STRATEGY was being set to an empty string in
the environment, causing UV to fail with "error: a value is required for
'--index-strategy'".

The fix removes UV_INDEX_STRATEGY from the env block and only sets it to
'unsafe-best-match' when UV_EXTRA_INDEX_URL is actually present.
2025-10-31 16:22:01 -07:00
github-actions[bot]
bdd330a94a chore: update lockfiles for 0.3.1 2025-10-31 22:56:35 +00:00
github-actions[bot]
033c1abf29 build: Bump version to 0.3.1 2025-10-31 22:54:10 +00:00
github-actions[bot]
dd6aee179d chore: update lockfiles for 0.3.1rc5 2025-10-31 22:43:57 +00:00
github-actions[bot]
dc3674a82d Release candidate 0.3.1rc5 2025-10-31 22:35:23 +00:00
Ashwin Bharambe
7ac81f69fe build: fix uv.lock 2025-10-31 15:31:01 -07:00
github-actions[bot]
7fdfeef44a Release candidate 0.3.1rc4 2025-10-31 22:15:04 +00:00
Ashwin Bharambe
bf1693c2ee build: fix uv.lock 2025-10-31 15:07:39 -07:00
github-actions[bot]
13c6695fd3 Release candidate 0.3.1rc3 2025-10-31 21:29:01 +00:00
Ashwin Bharambe
e8a3dfbe96
docs: A getting started notebook featuring simple agent examples (#4015)
Cherry-pick of #3955 to release-0.3.x

Adds a getting started notebook with simple agent examples to help users
get started with llama-stack agents.

Co-authored-by: Omar Abdelwahab <omaryashraf10@gmail.com>
Co-authored-by: Omar Abdelwahab <omara@fb.com>
2025-10-31 14:27:22 -07:00
Ashwin Bharambe
637f8bef9c build: fix uv.lock 2025-10-31 14:01:45 -07:00
github-actions[bot]
a5372dbdf5 Release candidate 0.3.1rc2 2025-10-31 20:50:32 +00:00
Ashwin Bharambe
9f1e4a07c9
feat: support workers in run config (#4014)
Cherry-pick of #3992 to release-0.3.x

Adds support for configuring the number of workers in run.yaml
configuration files.

Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
2025-10-31 13:48:55 -07:00
Ashwin Bharambe
b088665227
fix(ci): unset empty UV index env vars to prevent uv errors (#4013)
Cherry-pick of #4012 to release-0.3.x

Fixes container builds failing with UV index strategy errors when build
args are passed with empty values.

Docker ARGs declared with empty defaults (ARG UV_INDEX_STRATEGY="")
become environment variables with empty string values in RUN commands.
UV interprets these as if --index-strategy "" was passed on the command
line, causing build failures with "error: a value is required for
'--index-strategy <UV_INDEX_STRATEGY>'".

This is a footgun because empty string ≠ unset variable, and ARGs
silently propagate to all RUN commands, only failing when declared with
empty defaults.

The fix unsets UV_EXTRA_INDEX_URL and UV_INDEX_STRATEGY at the start of
RUN blocks, saves the values early, and only restores them for editable
installs with RC dependencies. All other install modes (PyPI, test-pypi,
client) now run with a clean environment.
2025-10-31 13:45:47 -07:00
Ashwin Bharambe
73d70546d4
chore(release-0.3.x): handle missing external_providers_dir (#4011)
Cherry-pick of #3974 to release-0.3.x branch.

## Summary
- Fixes handling of missing external_providers_dir in stack
configuration

## Original PR
Fixes from #3974

Signed-off-by: Doug Edgar <dedgar@redhat.com>
Co-authored-by: Doug Edgar <dedgar@redhat.com>
2025-10-31 12:55:34 -07:00
Ashwin Bharambe
a488d8ce10
fix(ci): install client from release branch before uv sync (#4002)
Backport of #4001 to release-0.3.x branch.

Fixes CI failures on release branches where uv sync can't resolve RC
dependencies.

## The Problem

On release branches like `release-0.3.x`, pyproject.toml requires
`llama-stack-client>=0.3.1rc1`. RC versions only exist on test.pypi, not
PyPI. This causes multiple CI failures:

1. `uv sync` fails because it can't resolve RC versions from PyPI
2. pre-commit hooks (uv-lock, codegen) fail for the same reason  
3. mypy workflow section needs uv installed

## The Solution

Configure UV to use test.pypi when on release branches:

- Set `UV_INDEX_URL=https://test.pypi.org/simple/` (primary)
- Set `UV_EXTRA_INDEX_URL=https://pypi.org/simple/` (fallback)
- Set `UV_INDEX_STRATEGY=unsafe-best-match` to check both indexes

This allows `uv sync` to resolve common packages from PyPI and RC
versions from test.pypi.

## Additional Fixes

- Export UV env vars to `GITHUB_ENV` so pre-commit hooks inherit them
- Install uv in pre-commit workflow for mypy section
- Handle missing `type_checking` dependency group on release-0.3.x
- Regenerate uv.lock with RC versions for the release branch

## Changes

- Created reusable `install-llama-stack-client` action for configuration
- Modified `setup-runner` to set UV environment variables before sync
- Modified `pre-commit` workflow to configure client and export env vars
- Updated uv.lock with RC versions from test.pypi

This is a cherry-pick of commits afa9f0882, c86e6e906, 626639bee, and
081566321 from main, plus additional fixes for release branch
compatibility.
2025-10-31 11:44:05 -07:00
github-actions[bot]
f8272b2faf Release candidate 0.3.1rc1 2025-10-31 04:54:54 +00:00
Ashwin Bharambe
39f33f7f12
feat(cherry-pick): fixes for 0.3.1 release (#3998)
## Summary

Cherry-picks 5 critical fixes from main to the release-0.3.x branch for
the v0.3.1 release, plus CI workflow updates.

**Note**: This recreates the cherry-picks from the closed PR #3991, now
targeting the renamed `release-0.3.x` branch (previously
`release-0.3.x-maint`).

## Commits

1. **2c56a8560** - fix(context): prevent provider data leak between
streaming requests (#3924)
- **CRITICAL SECURITY FIX**: Prevents provider credentials from leaking
between requests
   - Fixed import path for 0.3.0 compatibility

2. **ddd32b187** - fix(inference): enable routing of models with
provider_data alone (#3928)
   - Enables routing for fully qualified model IDs with provider_data
   - Resolved merge conflicts, adapted for 0.3.0 structure

3. **f7c2973aa** - fix: Avoid BadRequestError due to invalid max_tokens
(#3667)
- Fixes failures with Gemini and other providers that reject
max_tokens=0
   - Non-breaking API change

4. **d7f9da616** - fix(responses): sync conversation before yielding
terminal events in streaming (#3888)
- Ensures conversation sync executes even when streaming consumers break
early

5. **0ffa8658b** - fix(logging): ensure logs go to stderr, loggers obey
levels (#3885)
   - Fixes logging infrastructure

6. **75b49cb3c** - ci: support release branches and match client branch
(#3990)
   - Updates CI workflows to support release-X.Y.x branches
- Matches client branch from llama-stack-client-python for release
testing
   - Fixes artifact name collisions

## Adaptations for 0.3.0

- Fixed import paths: `llama_stack.core.telemetry.tracing` →
`llama_stack.providers.utils.telemetry.tracing`
- Fixed import paths: `llama_stack.core.telemetry.telemetry` →
`llama_stack.apis.telemetry`
- Changed `self.telemetry_enabled` → `self.telemetry` (0.3.0 attribute
name)
- Removed `rerank()` method that doesn't exist in 0.3.0

## Testing

All imports verified and tests should pass once CI is set up.
2025-10-30 21:51:42 -07:00