llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-16 22:02:37 +00:00

Author	SHA1	Message	Date
Roy Belio	c574db5f1d	fix(inference): AttributeError in streaming response cleanup (#4236) This PR fixes issue #3185. The code calls `await event_gen.aclose()`, but OpenAI's `AsyncStream` doesn't have an `aclose()` method; it has `close()` (which is async). When clients cancel streaming requests, the server tries to clean up with: ```python await event_gen.aclose() # ❌ AsyncStream doesn't have aclose()! ``` But `AsyncStream` has never had a public `aclose()` method. The error message says as much: ``` AttributeError: 'AsyncStream' object has no attribute 'aclose'. Did you mean: 'close'? ^^^^^^^^ ``` ## Verification * The reproduction script [`reproduce_issue_3185.sh`](https://gist.github.com/r-bit-rry/dea4f8fbb81c446f5db50ea7abd6379b) can be used to verify the fix. * Manual checks and validation against the original OpenAI library code	2025-12-14 07:51:09 -05:00
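The cleanup problem described in c574db5f1d can be illustrated with a small helper that closes whichever kind of stream it is handed. This is only a sketch, not the code from the PR: `close_stream` and `FakeAsyncStream` are hypothetical names, and the fake class merely imitates an SDK stream that exposes an async `close()` but no `aclose()`.

```python
import asyncio
import inspect


async def close_stream(stream) -> str:
    """Close a stream with aclose() if it exists, otherwise close().

    Returns the name of the method that was used ("none" if neither exists).
    """
    for name in ("aclose", "close"):
        closer = getattr(stream, name, None)
        if closer is None:
            continue
        result = closer()
        # close() may be sync or async depending on the object; await if needed.
        if inspect.isawaitable(result):
            await result
        return name
    return "none"


class FakeAsyncStream:
    """Hypothetical stand-in for an SDK stream with an async close() only."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def numbers():
    # A plain async generator: these do provide aclose().
    yield 1
    yield 2


async def demo():
    fake = FakeAsyncStream()
    used_for_fake = await close_stream(fake)

    gen = numbers()
    await gen.__anext__()  # start the generator, then abandon it mid-stream
    used_for_gen = await close_stream(gen)
    return used_for_fake, fake.closed, used_for_gen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Probing for the method at call time keeps the cleanup path working for both native async generators and wrapper objects, at the cost of a little duck typing.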
Omar Abdelwahab	dfb9f6743a	docs: Adding initial updates to the RAG documentation and examples (#4377) # What does this PR do? This PR updates the RAG examples included in docs/quick_start.ipynb, docs/getting_started/demo_script.py, rag.mdx and index.md to remove references to the deprecated vector_io and vector_db APIs and to add examples that use /v1/vector_stores with responses and completions. --------- Co-authored-by: Omar Abdelwahab <omara@fb.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>	2025-12-12 22:59:39 -05:00
Varsha	75ef052545	docs: Add details on model registration and refresh_models (#4383) Document the refresh_models configuration option for remote providers that use RemoteInferenceProviderConfig. - Add "Automatic vs Explicit Model Registration" section to resources.mdx - Include examples for registering custom embedding models Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>	2025-12-12 22:41:28 -05:00
Robert Riley (OCI)	10c878d782	feat: added oci-s3 compatibility (#4374) # What does this PR do? The PR validates and allows access to OCI object storage through the S3 compatibility API. Additional documentation for OCI is supplied in notebook form as well. --------- Co-authored-by: raghotham <rsm@meta.com>	2025-12-11 15:13:55 -08:00
Shabana Baig	805abf573f	feat!: Implement include parameter specifically for adding logprobs in the output message (#4261) # Problem As an Application Developer, I want to use the include parameter with the value message.output_text.logprobs, so that I can receive log probabilities for output tokens to assess the model's confidence in its response. # What does this PR do? - Updates the include parameter in various resource definitions - Updates the inline provider to return logprobs when "message.output_text.logprobs" is passed in the include parameter - Converts the logprobs returned by the inference provider from chat completion format to responses format Closes #[4260](https://github.com/llamastack/llama-stack/issues/4260) ## Test Plan - Created a script to explore OpenAI behavior: https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/include.py - Added integration tests and new recordings --------- Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-12-11 11:11:21 -08:00
Jaideep Rao	76e47d811a	feat(api): add readonly connectors API (#4258) # What does this PR do? Adds a new API for connectors and MCP registry support, along with the required types. It does not include any implementation. Closes #4235 and #4061 (partially) ## Test Plan No tests included. --------- Signed-off-by: Jaideep Rao <jrao@redhat.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>	2025-12-11 10:19:55 -08:00
Sébastien Han	470fe55e87	fix(inference): respect table_name config in InferenceStore (#4371) # What does this PR do? The InferenceStore class was ignoring the table_name field from InferenceStoreReference and always using the hardcoded value "chat_completions". This meant that any custom table_name configured in the run config (e.g., "inference_store" in run-with-postgres-store.yaml) was silently ignored. This change updates all SQL operations in InferenceStore to use self.reference.table_name instead of the hardcoded string, ensuring the configured table name is properly respected. A new test has been added to verify that custom table names work correctly for storing, retrieving, and listing chat completions. ## Test Plan CI Signed-off-by: Sébastien Han <seb@redhat.com>	2025-12-11 14:50:23 +01:00
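The bug class fixed in 470fe55e87 (a configured table name being ignored in favor of a hardcoded one) can be shown with a toy store. This is a minimal sketch using the standard-library `sqlite3` module, not the project's actual `InferenceStore`: `StoreReference`, `TinyInferenceStore`, and `table_names` are hypothetical names, and only the default `"chat_completions"` and the custom `"inference_store"` values are taken from the commit message.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class StoreReference:
    """Hypothetical stand-in for a store reference carrying a table name."""

    table_name: str = "chat_completions"


class TinyInferenceStore:
    """Toy store in which every SQL statement uses the configured table name."""

    def __init__(self, conn: sqlite3.Connection, reference: StoreReference) -> None:
        # Table names cannot be bound as SQL parameters, so validate before
        # interpolating the name into statements.
        if not reference.table_name.isidentifier():
            raise ValueError(f"invalid table name: {reference.table_name!r}")
        self.conn = conn
        self.reference = reference
        self.conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{self.reference.table_name}" '
            "(id TEXT PRIMARY KEY, model TEXT)"
        )

    def store(self, completion_id: str, model: str) -> None:
        self.conn.execute(
            f'INSERT INTO "{self.reference.table_name}" (id, model) VALUES (?, ?)',
            (completion_id, model),
        )

    def list_ids(self) -> list:
        rows = self.conn.execute(
            f'SELECT id FROM "{self.reference.table_name}" ORDER BY id'
        ).fetchall()
        return [row[0] for row in rows]


def table_names(conn: sqlite3.Connection) -> list:
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store = TinyInferenceStore(conn, StoreReference(table_name="inference_store"))
    store.store("chatcmpl-1", "example-model")
    print(store.list_ids(), table_names(conn))
```

A test along these lines, asserting that only the custom table exists after a write, is the kind of check that catches a leftover hardcoded name.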
Charlie Doern	7308c8aef1	feat: add workflow_dispatch and self-trigger to stainless builds (#4361) # What does this PR do? It is currently impossible to test workflow changes (pull_request_target uses the base branch definition) or to manually trigger SDK builds. This adds both capabilities. - Add workflow_dispatch with pr_number input for manual testing - Add workflow file to path triggers for automatic testing - Fetch PR details via gh CLI for manual runs - Update jobs to use computed PR data for both trigger types ## Test Plan Unfortunately, this is impossible to test until it merges. I am doing this in a smaller PR so that I can use it immediately in a follow-up. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-12-10 12:48:27 -08:00
Francisco Javier Arceo	95b2948d11	feat: Add support for query rewrite in vector_store.search (#4171) # What does this PR do? Actualize query rewrite in search API, add `default_query_expansion_model` and `query_expansion_prompt` in `VectorStoresConfig`. Makes `rewrite_query` parameter functional in vector store search. - `rewrite_query=false` (default): Use original query - `rewrite_query=true`: Expand query via LLM, or fail gracefully if no LLM available Adds 4 parameters to `VectorStoresConfig`: - `default_query_expansion_model`: LLM model for query expansion (optional) - `query_expansion_prompt`: Custom prompt template (optional, uses built-in default) - `query_expansion_max_tokens`: Configurable token limit (default: 100) - `query_expansion_temperature`: Configurable temperature (default: 0.3) Enabled `run.yaml`: ```yaml vector_stores: rewrite_query_params: model: provider_id: "ollama" model_id: "llama3.2:3b-instruct-fp16" # prompt defaults to built-in # max_tokens defaults to 100 # temperature defaults to 0.3 ``` Fully customized `run.yaml`: ```yaml vector_stores: default_provider_id: faiss default_embedding_model: provider_id: sentence-transformers model_id: nomic-ai/nomic-embed-text-v1.5 rewrite_query_params: model: provider_id: ollama model_id: llama3.2:3b-instruct-fp16 prompt: "Rewrite this search query to improve retrieval results by expanding it with relevant synonyms and related terms: {query}" max_tokens: 100 temperature: 0.3 ``` ## Test Plan Added test and recording Example script as well: ```python import asyncio from llama_stack_client import LlamaStackClient from io import BytesIO def gen_file(client, text: str=""): file_buffer = BytesIO(text.encode('utf-8')) file_buffer.name = "my_file.txt" uploaded_file = client.files.create( file=file_buffer, purpose="assistants" ) return uploaded_file async def test_query_rewriting(): client = LlamaStackClient(base_url="http://0.0.0.0:8321/") uploaded_file = gen_file(client, "banana banana apple") uploaded_file2 = gen_file(client, "orange orange kiwi") vs = client.vector_stores.create() xf_vs = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file.id) xf_vs1 = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file2.id) response1 = client.vector_stores.search( vector_store_id=vs.id, query="apple", max_num_results=3, rewrite_query=False ) response2 = client.vector_stores.search( vector_store_id=vs.id, query="kiwi", max_num_results=3, rewrite_query=True, ) print(f"\n🔵 Response 1 (rewrite_query=False):\n\033[94m{response1}\033[0m") print(f"\n🟢 Response 2 (rewrite_query=True):\n\033[92m{response2}\033[0m") for f in [uploaded_file.id, uploaded_file2.id]: client.files.delete(file_id=f) client.vector_stores.delete(vector_store_id=vs.id) if __name__ == "__main__": asyncio.run(test_query_rewriting()) ``` And see the screenshot of the server logs showing it worked. <img width="1111" height="826" alt="Screenshot 2025-11-19 at 1 16 03 PM" src="https://github.com/user-attachments/assets/2d188b44-1fef-4df5-b465-2d6728ca49ce" /> Notice the log: ```bash Query rewritten: 'kiwi' → 'kiwi, a small brown or green fruit native to New Zealand, or a person having a fuzzy brown outer skin similar in appearance.' ``` So `kiwi` was expanded. --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>	2025-12-10 10:06:19 -05:00
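The `rewrite_query` behavior described in 95b2948d11 can be sketched as a small pure function. This is an illustration under stated assumptions, not the server implementation: `resolve_query` and the `expand` callable are hypothetical, and "fail gracefully" is interpreted here as raising a descriptive error when no expansion model is configured (the commit message does not spell out the exact behavior). The prompt text is the one quoted in the customized `run.yaml` above.

```python
from typing import Callable, Optional

# Prompt template quoted in the commit message's customized run.yaml example.
DEFAULT_PROMPT = (
    "Rewrite this search query to improve retrieval results by expanding it "
    "with relevant synonyms and related terms: {query}"
)


def resolve_query(
    query: str,
    rewrite_query: bool = False,
    expand: Optional[Callable[[str], str]] = None,
    prompt: str = DEFAULT_PROMPT,
) -> str:
    """Return the query to search with.

    `expand` is a hypothetical callable that sends a prompt to an LLM and
    returns its text output.
    """
    if not rewrite_query:
        # Default path: search with the original query unchanged.
        return query
    if expand is None:
        # Assumption: surface a clear error instead of silently ignoring the flag.
        raise ValueError("rewrite_query=True requires a configured query expansion model")
    rewritten = expand(prompt.format(query=query)).strip()
    # Fall back to the original query if the model returned nothing usable.
    return rewritten or query


if __name__ == "__main__":
    fake_llm = lambda text: "kiwi fruit, kiwifruit"  # stand-in for a model call
    print(resolve_query("kiwi"))
    print(resolve_query("kiwi", rewrite_query=True, expand=fake_llm))
```

Keeping the decision in one function makes the three cases (flag off, flag on with a model, flag on without a model) easy to test without a running server.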
Sébastien Han	ff375f1abb	feat: convert Benchmarks API to use FastAPI router (#4309) # What does this PR do? Convert the Benchmarks API from @webmethod decorators to FastAPI router pattern, matching the Batches API structure. One notable change is the update of stack.py to handle request models in register_resources(). Closes: #4308 ## Test Plan CI and `curl http://localhost:8321/v1/inspect/routes | jq '.data[] | select(.route | contains("benchmark"))'` --------- Signed-off-by: Sébastien Han <seb@redhat.com>	2025-12-10 15:04:27 +01:00
Charlie Doern	661985e240	feat: remove usage of build yaml (#4192) # What does this PR do? The build.yaml is only used in the following ways: 1. list-deps 2. distribution code-gen Since `llama stack build` no longer exists, I found myself asking "why do we need two different files for list-deps and run"? Removing the BuildConfig and altering the usage of the DistributionTemplate in llama stack list-deps is the first step in removing the build yaml entirely. Removing the BuildConfig and build.yaml cuts the files users need to maintain in half, and allows us to focus on the stability of _just_ the run.yaml. This PR removes the build.yaml, the BuildConfig datatype, and its usage throughout the codebase. Users are now expected to point to run.yaml files when running list-deps, and our codebase automatically uses these types now for things like `get_provider_registry`. Additionally, two renames: `StackRunConfig` -> `StackConfig` and `run.yaml` -> `config.yaml`. The build.yaml made sense when we were managing the build process for the user and actually _producing_ a run.yaml _from_ the build.yaml, but now that we are simply getting the provider registry and listing the deps, switching to config.yaml simplifies the scope here greatly. ## Test Plan Existing list-deps usage should work in the tests. --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-12-10 10:12:12 +01:00
Varsha	17e6912288	docs: Fix vector_store_create params (#4364)	2025-12-09 19:48:43 -05:00
Francisco Javier Arceo	fcea9893a4	feat(UI): Adding Files API to Admin UI (#4319) # What does this PR do? ## Files Admin Page <img width="1919" height="1238" alt="Screenshot 2025-12-09 at 10 33 06 AM" src="https://github.com/user-attachments/assets/3dd545f0-32bc-45be-af2b-1823800015f2" /> ## Files Upload Modal <img width="1919" height="1287" alt="Screenshot 2025-12-09 at 10 33 38 AM" src="https://github.com/user-attachments/assets/776bb372-75d3-4ccd-b6b5-c9dfb3fcb350" /> ## Files Detail <img width="1918" height="1099" alt="Screenshot 2025-12-09 at 10 34 26 AM" src="https://github.com/user-attachments/assets/f256dbf8-4047-4d79-923d-404161b05f36" /> Note, content preview has some handling for JSON, CSV, and PDF to enable nicer rendering. Pure text rendering is trivial. ### Files Detail File Content Preview (TXT) <img width="1918" height="1341" alt="Screenshot 2025-12-09 at 10 41 20 AM" src="https://github.com/user-attachments/assets/4fa0ddb7-ffff-424b-b764-0bd4af6ed976" /> ### Files Detail File Content Preview (JSON) <img width="1909" height="1233" alt="Screenshot 2025-12-09 at 10 39 57 AM" src="https://github.com/user-attachments/assets/b912f07a-2dff-483b-b73c-2f69dd0d87ad" /> ### Files Detail File Content Preview (HTML) <img width="1916" height="1348" alt="Screenshot 2025-12-09 at 10 40 27 AM" src="https://github.com/user-attachments/assets/17ebec0a-8754-4552-977d-d3c44f7f6973" /> ### Files Detail File Content Preview (CSV) <img width="1919" height="1177" alt="Screenshot 2025-12-09 at 10 34 50 AM" src="https://github.com/user-attachments/assets/20bd0755-1757-4a3a-99d2-fbd072f81f49" /> ### Files Detail File Content Preview (PDF) <img width="1917" height="1154" alt="Screenshot 2025-12-09 at 10 36 48 AM" src="https://github.com/user-attachments/assets/2873e6fe-4da3-4cbd-941b-7d903270b749" /> Closes https://github.com/llamastack/llama-stack/issues/4144 ## Test Plan Added Tests Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>	2025-12-09 16:28:05 -05:00
Robert Riley (OCI)	6ad5fb5577	feat: Adding OCI Embeddings (#4300) # What does this PR do? Enabling usage of OCI embedding models. ## Test Plan Testing embedding model: `OCI_COMPARTMENT_OCID="" OCI_REGION="us-chicago-1" OCI_AUTH_TYPE=config_file pytest -sv tests/integration/inference/test_openai_embeddings.py --stack-config oci --embedding-model oci/openai.text-embedding-3-small --inference-mode live` Testing chat model: `OCI_COMPARTMENT_OCID="" OCI_REGION="us-chicago-1" OCI_AUTH_TYPE=config_file pytest -sv tests/integration/inference/ --stack-config oci --text-model oci/openai.gpt-4.1-nano-2025-04-14 --inference-mode live` Testing curl for embeddings: `curl -X POST http://localhost:8321/v1/embeddings -H "Content-Type: application/json" -d '{ "model": "oci/openai.text-embedding-3-small", "input": ["First text", "Second text"], "encoding_format": "float" }'` `{"object":"list","data":[{"object":"embedding","embedding":[-0.017190756...0.025272394],"index":1}],"model":"oci/openai.text-embedding-3-small","usage":{"prompt_tokens":4,"total_tokens":4}}` --------- Co-authored-by: Omar Abdelwahab <omaryashraf10@gmail.com>	2025-12-08 13:05:39 -08:00
Sébastien Han	d82a2cd6f8	fix: httpcore deadlock in CI by properly closing streaming responses (#4335) # What does this PR do? The test_conversation_error_handling test was timing out in CI with a deadlock in httpcore's connection pool. The root cause was the preceding test_conversation_multi_turn_and_streaming test, which broke out of the streaming response iterator early without properly closing the underlying HTTP connection. When a streaming response iterator is abandoned mid-stream, the HTTP connection remains in an incomplete state. Since the openai_client fixture is session-scoped, subsequent tests reuse the same httpcore connection pool. The dangling connection causes the pool's internal lock to deadlock when the next test attempts to acquire a new connection. The fix wraps the streaming response in a context manager, which ensures the connection is properly closed when exiting the with block, even when breaking out of the loop early. This is a best practice when working with streaming HTTP responses that may not be fully consumed. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-12-08 16:38:46 +01:00
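The pattern described in d82a2cd6f8 (closing a stream even when iteration stops early) can be sketched with the standard library's `contextlib.closing`. This is an illustration only, not the test code from the PR: `FakeStreamingResponse` and `first_chunk` are hypothetical, and the fake class simply records whether `close()` was called instead of holding a real HTTP connection.

```python
from contextlib import closing


class FakeStreamingResponse:
    """Hypothetical stand-in for a streaming response that holds a connection."""

    def __init__(self, chunks) -> None:
        self._chunks = list(chunks)
        self.closed = False

    def __iter__(self):
        for chunk in self._chunks:
            if self.closed:
                return
            yield chunk

    def close(self) -> None:
        # A real response would release its pooled HTTP connection here.
        self.closed = True


def first_chunk(response):
    """Read only the first chunk, but always close the response afterwards."""
    # closing() calls response.close() on exit from the with block,
    # including the early return below.
    with closing(response) as stream:
        for chunk in stream:
            return chunk
    return None


if __name__ == "__main__":
    response = FakeStreamingResponse(["a", "b", "c"])
    print(first_chunk(response), response.closed)
```

Without the `with` block, the early return would leave the response open, which is the situation the commit message identifies as the source of the dangling pooled connection.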
dependabot[bot]	20c11d8fd4	chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.7.0 to 1.7.1 (#4334 ) Some checks failed SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 4s Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Integration Tests (Replay) / generate-matrix (push) Successful in 6s Details API Conformance Tests / check-schema-compatibility (push) Successful in 18s Details Python Package Build Test / build (3.12) (push) Successful in 18s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 30s Details Test llama stack list-deps / generate-matrix (push) Successful in 33s Details Test Llama Stack Build / generate-matrix (push) Successful in 36s Details Test llama stack list-deps / show-single-provider (push) Successful in 33s Details Python Package Build Test / build (3.13) (push) Successful in 59s Details Test llama stack list-deps / list-deps-from-config (push) Successful in 1m8s Details Test Llama Stack Build / build-single-provider (push) Successful in 1m12s Details Test External API and Providers / test-external (venv) (push) Failing after 1m9s Details Vector IO Integration Tests / test-matrix (push) Failing after 1m24s Details UI Tests / ui-tests (22) (push) Successful in 1m29s Details Test Llama Stack Build / build (push) Successful in 1m0s Details Test llama stack list-deps / list-deps (push) Failing after 1m23s Details Unit Tests / unit-tests (3.13) (push) Failing after 2m42s Details Unit Tests / unit-tests (3.12) (push) Failing after 2m51s Details Test Llama Stack Build / build-custom-container-distribution (push) Successful in 3m47s Details Pre-commit / pre-commit (22) (push) Successful in 3m55s Details Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m7s Details Test Llama Stack 
Build / build-ubi9-container-distribution (push) Successful in 4m43s Details Bumps [stainless-api/upload-openapi-spec-action](https://github.com/stainless-api/upload-openapi-spec-action) from 1.7.0 to 1.7.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/stainless-api/upload-openapi-spec-action/releases">stainless-api/upload-openapi-spec-action's releases</a>.</em></p> <blockquote> <h2>v1.7.1</h2> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.7.0...v1.7.1">1.7.1</a> (2025-12-01)</h2> <h3>Bug Fixes</h3> <ul> <li>improve getMergeBase to handle shallow clones more robustly (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/138">#138</a>) (<a href="`3687845465`">3687845</a>)</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/stainless-api/upload-openapi-spec-action/blob/main/CHANGELOG.md">stainless-api/upload-openapi-spec-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.7.0...v1.7.1">1.7.1</a> (2025-12-01)</h2> <h3>Bug Fixes</h3> <ul> <li>improve getMergeBase to handle shallow clones more robustly (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/138">#138</a>) (<a href="`3687845465`">3687845</a>)</li> </ul> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.6.0...v1.7.0">1.7.0</a> (2025-11-17)</h2> <h3>Features</h3> <ul> <li><strong>preview:</strong> add output documented_spec_path to preview action (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/135">#135</a>) (<a href="`5e80cc40da`">5e80cc4</a>)</li> <li><strong>preview:</strong> add output_dir input and write documented spec to file (<a 
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/137">#137</a>) (<a href="`d30490c89b`">d30490c</a>)</li> </ul> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.5...v1.6.0">1.6.0</a> (2025-10-30)</h2> <h3>Features</h3> <ul> <li>add support for github OIDC auth (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/133">#133</a>) (<a href="`259674c1b3`">259674c</a>)</li> <li>change fail on semantics (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/124">#124</a>) (<a href="`e1046240c0`">e104624</a>)</li> </ul> <h3>Bug Fixes</h3> <ul> <li>accept multiline conventional commits (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/129">#129</a>) (<a href="`d2dcc0b3bf`">d2dcc0b</a>)</li> <li>tweak categorizeOutcomes (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/132">#132</a>) (<a href="`c45d6a9c79`">c45d6a9</a>)</li> </ul> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.4...v1.5.5">1.5.5</a> (2025-09-26)</h2> <h3>Bug Fixes</h3> <ul> <li>rollback filtering diagnostics by target (<a href="`54328a386f`">54328a3</a>)</li> </ul> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.3...v1.5.4">1.5.4</a> (2025-09-25)</h2> <h3>Bug Fixes</h3> <ul> <li>check for latestRun before commenting (<a href="`53fef9f328`">53fef9f</a>)</li> <li>filter diagnostics by target (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/125">#125</a>) (<a href="`102dc971cb`">102dc97</a>)</li> </ul> <h2><a href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.2...v1.5.3">1.5.3</a> (2025-09-16)</h2> <h3>Bug Fixes</h3> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`a4d631c1e9`"><code>a4d631c</code></a> chore(main): release 1.7.1 (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/141">#141</a>)</li> <li><a href="`56c2d869b3`"><code>56c2d86</code></a> chore: add structured logger (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/139">#139</a>)</li> <li><a href="`3687845465`"><code>3687845</code></a> fix: improve getMergeBase to handle shallow clones more robustly (<a href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/138">#138</a>)</li> <li>See full diff in <a href="`9133735bca...a4d631c1e9`">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=stainless-api/upload-openapi-spec-action&package-manager=github_actions&previous-version=1.7.0&new-version=1.7.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-08 12:04:22 +01:00
dependabot[bot]	912ab6b4a2	chore(github-deps): bump actions/setup-node from 6.0.0 to 6.1.0 (#4333 ) Bumps [actions/setup-node](https://github.com/actions/setup-node) from 6.0.0 to 6.1.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/setup-node/releases">actions/setup-node's releases</a>.</em></p> <blockquote> <h2>v6.1.0</h2> <h2>What's Changed</h2> <h3>Enhancement:</h3> <ul> <li>Remove always-auth configuration handling by <a href="https://github.com/priyagupta108"><code>@priyagupta108</code></a> in <a href="https://redirect.github.com/actions/setup-node/pull/1436">actions/setup-node#1436</a></li> </ul> <h3>Dependency updates:</h3> <ul> <li>Upgrade <code>@actions/cache</code> from 4.0.3 to 4.1.0 by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/actions/setup-node/pull/1384">actions/setup-node#1384</a></li> <li>Upgrade actions/checkout from 5 to 6 by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/actions/setup-node/pull/1439">actions/setup-node#1439</a></li> <li>Upgrade js-yaml from 3.14.1 to 3.14.2 by <a href="https://github.com/dependabot"><code>@dependabot</code></a>[bot] in <a href="https://redirect.github.com/actions/setup-node/pull/1435">actions/setup-node#1435</a></li> </ul> <h3>Documentation update:</h3> <ul> <li>Add example for restore-only cache in documentation by <a href="https://github.com/aparnajyothi-y"><code>@aparnajyothi-y</code></a> in <a href="https://redirect.github.com/actions/setup-node/pull/1419">actions/setup-node#1419</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-node/compare/v6...v6.1.0">https://github.com/actions/setup-node/compare/v6...v6.1.0</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`395ad32622`"><code>395ad32</code></a> Bump js-yaml from 3.14.1 to 3.14.2 (<a 
href="https://redirect.github.com/actions/setup-node/issues/1435">#1435</a>)</li> <li><a href="`a4d2e2bbca`"><code>a4d2e2b</code></a> Bump actions/checkout from 5 to 6 (<a href="https://redirect.github.com/actions/setup-node/issues/1439">#1439</a>)</li> <li><a href="`b9b25d45f7`"><code>b9b25d4</code></a> Remove always-auth configuration handling from action (<a href="https://redirect.github.com/actions/setup-node/issues/1436">#1436</a>)</li> <li><a href="`633bb92bc0`"><code>633bb92</code></a> Bump <code>@actions/cache</code> from 4.0.3 to 4.1.0 (<a href="https://redirect.github.com/actions/setup-node/issues/1384">#1384</a>)</li> <li><a href="`dda4788290`"><code>dda4788</code></a> Add example for restore-only cache in documentation (<a href="https://redirect.github.com/actions/setup-node/issues/1419">#1419</a>)</li> <li>See full diff in <a href="`2028fbc5c2...395ad32622`">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=6.0.0&new-version=6.1.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-08 12:03:44 +01:00
dependabot[bot]	39d23d9894	chore(github-deps): bump actions/stale from 10.1.0 to 10.1.1 (#4332 ) Bumps [actions/stale](https://github.com/actions/stale) from 10.1.0 to 10.1.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/stale/releases">actions/stale's releases</a>.</em></p> <blockquote> <h2>v10.1.1</h2> <h2>What's Changed</h2> <h3>Bug Fix</h3> <ul> <li>Add Missing Input Reading for <code>only-issue-types</code> by <a href="https://github.com/Bibo-Joshi"><code>@Bibo-Joshi</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1298">actions/stale#1298</a></li> </ul> <h3>Improvement</h3> <ul> <li>Improves error handling when rate limiting is disabled on GHES. by <a href="https://github.com/chiranjib-swain"><code>@chiranjib-swain</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1300">actions/stale#1300</a></li> </ul> <h3>Dependency Upgrades</h3> <ul> <li>Upgrade eslint-config-prettier from 8.10.0 to 10.1.8 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1276">actions/stale#1276</a></li> <li>Upgrade <code>@types/node</code> from 20.10.3 to 24.2.0 and document breaking changes in v10 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1280">actions/stale#1280</a></li> <li>Upgrade actions/publish-action from 0.3.0 to 0.4.0 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1291">actions/stale#1291</a></li> <li>Upgrade actions/checkout from 4 to 6 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1306">actions/stale#1306</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/chiranjib-swain"><code>@chiranjib-swain</code></a> made their first 
contribution in <a href="https://redirect.github.com/actions/stale/pull/1300">actions/stale#1300</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/stale/compare/v10...v10.1.1">https://github.com/actions/stale/compare/v10...v10.1.1</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`997185467f`"><code>9971854</code></a> build(deps): bump actions/checkout from 4 to 6 (<a href="https://redirect.github.com/actions/stale/issues/1306">#1306</a>)</li> <li><a href="`5611b9defa`"><code>5611b9d</code></a> build(deps): bump actions/publish-action from 0.3.0 to 0.4.0 (<a href="https://redirect.github.com/actions/stale/issues/1291">#1291</a>)</li> <li><a href="`fad0de84e5`"><code>fad0de8</code></a> Improves error handling when rate limiting is disabled on GHES. (<a href="https://redirect.github.com/actions/stale/issues/1300">#1300</a>)</li> <li><a href="`39bea7de61`"><code>39bea7d</code></a> Add Missing Input Reading for <code>only-issue-types</code> (<a href="https://redirect.github.com/actions/stale/issues/1298">#1298</a>)</li> <li><a href="`e46bbabb3e`"><code>e46bbab</code></a> build(deps-dev): bump <code>@types/node</code> from 20.10.3 to 24.2.0 and document breakin...</li> <li><a href="`65d1d4804d`"><code>65d1d48</code></a> build(deps-dev): bump eslint-config-prettier from 8.10.0 to 10.1.8 (<a href="https://redirect.github.com/actions/stale/issues/1276">#1276</a>)</li> <li>See full diff in <a href="`5f858e3efb...997185467f`">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/stale&package-manager=github_actions&previous-version=10.1.0&new-version=10.1.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-08 11:56:42 +01:00
dependabot[bot]	8f585e4c7a	chore(github-deps): bump actions/checkout from 6.0.0 to 6.0.1 (#4331 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.0 to 6.0.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/releases">actions/checkout's releases</a>.</em></p> <blockquote> <h2>v6.0.1</h2> <h2>What's Changed</h2> <ul> <li>Update all references from v5 and v4 to v6 by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2314">actions/checkout#2314</a></li> <li>Add worktree support for persist-credentials includeIf by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2327">actions/checkout#2327</a></li> <li>Clarify v6 README by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2328">actions/checkout#2328</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v6...v6.0.1">https://github.com/actions/checkout/compare/v6...v6.0.1</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`8e8c483db8`"><code>8e8c483</code></a> Clarify v6 README (<a href="https://redirect.github.com/actions/checkout/issues/2328">#2328</a>)</li> <li><a href="`033fa0dc0b`"><code>033fa0d</code></a> Add worktree support for persist-credentials includeIf (<a href="https://redirect.github.com/actions/checkout/issues/2327">#2327</a>)</li> <li><a href="`c2d88d3ecc`"><code>c2d88d3</code></a> Update all references from v5 and v4 to v6 (<a href="https://redirect.github.com/actions/checkout/issues/2314">#2314</a>)</li> <li>See full diff in <a href="`1af3b93b68...8e8c483db8`">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=6.0.0&new-version=6.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-08 11:56:25 +01:00
Varsha	3ca0481e43	fix(ui): Fix model dropdown not displaying models in chat playground (#4329 )	2025-12-05 16:54:12 -05:00
Derek Higgins	8998000aec	fix(security): redact JWT tokens in server logs (#4325 ) Add "token" to sensitive field patterns in redact_sensitive_fields() to prevent JWT tokens from being logged in plaintext. Previously only api_key, api_token, password, and secret were filtered. This prevents tokens like server.auth.provider_config.jwks.token from being exposed in server logs. Closes: #4324 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-05 15:53:47 -05:00
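The redaction change described above can be illustrated with a minimal sketch. This is not the project's actual `redact_sensitive_fields()` implementation: the function name `redact_sensitive_fields_sketch`, the pattern tuple, the mask string, and the recursive traversal are all assumptions made for illustration. Only the list of field names (`api_key`, `api_token`, `password`, `secret`, and the newly added `token`) comes from the commit message.

```python
import re
from typing import Any

# Field-name patterns named in the commit message; "token" is the new entry.
# The tuple itself and the mask string are hypothetical.
SENSITIVE_PATTERNS = ("api_key", "api_token", "password", "secret", "token")
_SENSITIVE_RE = re.compile("|".join(SENSITIVE_PATTERNS), re.IGNORECASE)
REDACTED = "********"


def redact_sensitive_fields_sketch(data: Any) -> Any:
    """Return a copy of nested dicts/lists with sensitive values masked.

    Hypothetical helper: a key is treated as sensitive when its name
    contains any of the patterns above (case-insensitive).
    """
    if isinstance(data, dict):
        return {
            key: (
                REDACTED
                if isinstance(key, str) and _SENSITIVE_RE.search(key)
                else redact_sensitive_fields_sketch(value)
            )
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [redact_sensitive_fields_sketch(item) for item in data]
    return data


# Example input shaped like the path mentioned in the commit
# (server.auth.provider_config.jwks.token); the values are made up.
config = {
    "server": {
        "auth": {
            "provider_config": {
                "jwks": {"token": "made-up-jwt", "uri": "https://example.invalid/jwks"}
            }
        }
    }
}
safe_to_log = redact_sensitive_fields_sketch(config)
```

Matching on substrings of the key name means a broad pattern such as `token` also covers `api_token`; whether the real helper matches the same way is not stated in the commit.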
Derek Higgins	fc4fc03606	chore: Small Auth CI refactor (#4322 ) In preparation for ABAC addition (next PR) ``` fix(ci): allow run_dir variable expansion in YAML heredoc Remove single quotes from EOF delimiter to allow $run_dir to be expanded by bash when creating the configuration file. Previously the literal string "$run_dir" was being written to the YAML instead of the actual temp directory path. drwxr-xr-x 3 runner runner 4096 Dec 5 12:56 $run_dir ``` ``` test(ci): add test_endpoint helper function to auth tests Add reusable test_endpoint function to integration-auth-tests workflow for consistent API testing: ``` --------- Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-05 12:01:29 -08:00
Varad Ahirwadkar	06f7ff2c80	fix: Correct broken links in README (#4218 ) # What does this PR do? Fixing broken README links that were still pointing to the https://llamastack.github.io/latest Signed-off-by: Varad Ahirwadkar <varad.ahirwadkar1@ibm.com>	2025-12-04 14:33:32 -08:00
Nathan Weinberg	f14936035d	fix: runpod provider no longer crashes sans API key (#4316 ) # What does this PR do? Previously the runpod provider would fail if `RUNPOD_API_TOKEN` was not set. Modify the implementation to default to an empty string, aligning with similar providers' behavior. Closes #4296 ## Test Plan Run `uv run llama stack run --providers inference=remote::runpod` with `RUNPOD_API_TOKEN` unset - server now boots where it previously crashed ``` INFO 2025-12-04 13:52:59,920 uvicorn.error:84 uncategorized: Started server process [233656] INFO 2025-12-04 13:52:59,921 uvicorn.error:48 uncategorized: Waiting for application startup. INFO 2025-12-04 13:52:59,926 llama_stack.core.server.server:168 core::server: Starting up Llama Stack server (version: 0.4.0.dev0) INFO 2025-12-04 13:52:59,927 llama_stack.core.stack:495 core: starting registry refresh task INFO 2025-12-04 13:52:59,928 uvicorn.error:62 uncategorized: Application startup complete. INFO 2025-12-04 13:52:59,929 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit) ``` Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-12-04 11:38:43 -08:00
Nathan Weinberg	8bbcfc4f56	fix: nvidia provider no longer crashes sans API key (#4317 ) # What does this PR do? Previously the nvidia provider would throw an exception if a hosted instance was being used but no API key was set. Modify this behavior to instead log an error informing users that a key is needed to use a hosted NIM, while still allowing the server to boot. Closes #4295 ## Test Plan Run `uv run llama stack run --providers inference=remote::nvidia` with `NVIDIA_API_KEY` unset - server now boots with logged error, where it previously crashed ``` INFO 2025-12-04 14:16:26,156 llama_stack.providers.remote.inference.nvidia.nvidia:47 inference::nvidia: Initializing NVIDIAInferenceAdapter(https://integrate.api.nvidia.com/v1)... ERROR 2025-12-04 14:16:26,157 llama_stack.providers.remote.inference.nvidia.nvidia:51 inference::nvidia: API key is required for hosted NVIDIA NIM. Either provide an API key or use a self-hosted NIM. INFO 2025-12-04 14:16:26,239 uvicorn.error:84 uncategorized: Started server process [251651] INFO 2025-12-04 14:16:26,240 uvicorn.error:48 uncategorized: Waiting for application startup. INFO 2025-12-04 14:16:26,244 llama_stack.core.server.server:168 core::server: Starting up Llama Stack server (version: 0.4.0.dev0) INFO 2025-12-04 14:16:26,245 llama_stack.core.stack:495 core: starting registry refresh task INFO 2025-12-04 14:16:26,246 uvicorn.error:62 uncategorized: Application startup complete. INFO 2025-12-04 14:16:26,246 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit) ``` Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-12-04 11:38:16 -08:00
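The "log instead of raise" behavior described in this commit might look roughly like the sketch below. It is an illustration only, not the adapter's real code: the function `check_api_key_sketch`, the logger name, and the way a hosted endpoint is detected (exact match against the URL seen in the log output) are assumptions. The error text and the hosted URL are taken from the log lines quoted in the commit.

```python
import logging
from typing import Optional

# Hypothetical logger name; the real adapter logs under "inference::nvidia".
logger = logging.getLogger("nvidia_sketch")

# URL as it appears in the commit's log output.
HOSTED_URL = "https://integrate.api.nvidia.com/v1"


def check_api_key_sketch(base_url: str, api_key: Optional[str]) -> bool:
    """Return True when the configuration is usable.

    For a hosted endpoint without a key, log an error and return False
    rather than raising, so that server startup is not aborted.
    """
    is_hosted = base_url.rstrip("/") == HOSTED_URL  # assumed detection rule
    if is_hosted and not api_key:
        logger.error(
            "API key is required for hosted NVIDIA NIM. "
            "Either provide an API key or use a self-hosted NIM."
        )
        return False
    return True


# Startup continues in all three cases; only the first logs an error.
hosted_without_key = check_api_key_sketch(HOSTED_URL, None)
hosted_with_key = check_api_key_sketch(HOSTED_URL, "made-up-key")
self_hosted = check_api_key_sketch("http://localhost:8000/v1", None)
```

Returning a flag rather than raising leaves the decision to the caller; a request made later against the hosted endpoint would presumably still fail without a key.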
Derek Higgins	686065fe27	fix: access control to fail-closed when owner attributes are missing (#4273 )	2025-12-04 08:38:32 -08:00
Charlie Doern	b4903d6766	fix: llama_stack_api inspect API rename (#4311 ) # What does this PR do? When publishing llama_stack_api, `inspect.py` causes issues because it gets confused with the builtin stdlib `inspect` module. This is due to the top-level `__init__.py` we have. We need to rename `inspect.py` to `inspect_api.py` to avoid this conflict. Also, uv sync `1993161624` for reference. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-12-04 10:12:55 -05:00
Bwook (Byoungwook) Kim	c4c6d39c54	feat: Implement `keyword search` and `delete_chunk` at ChromaDB (#3057 )	2025-12-04 00:59:09 -05:00
Ashwin Bharambe	c6609a84f5	fix(tests): handle http URLs as aliases for server mode (#4306 ) Small fix needed for llama-stack-ops which invokes integration-tests.sh against docker by using a `http://` URL for stack-config	2025-12-03 21:21:18 -08:00
dependabot[bot]	1d9349c8d6	chore(deps): bump next from 15.5.4 to 15.5.7 in /src/llama_stack_ui (#4305 ) Bumps [next](https://github.com/vercel/next.js) from 15.5.4 to 15.5.7. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/vercel/next.js/releases">next's releases</a>.</em></p> <blockquote> <h2>v15.5.7</h2> <p>Please see <a href="https://nextjs.org/blog/CVE-2025-66478">CVE-2025-66478</a> for additional details about this release.</p> <h2>v15.5.6</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. 
It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>Turbopack: don't define process.cwd() in node_modules <a href="https://redirect.github.com/vercel/next.js/issues/83452">#83452</a></li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/mischnic"><code>@mischnic</code></a> for helping!</p> <h2>v15.5.5</h2> <blockquote> <p>[!NOTE]<br /> This release is backporting bug fixes. It does <strong>not</strong> include all pending features/changes on canary.</p> </blockquote> <h3>Core Changes</h3> <ul> <li>Split code-frame into separate compiled package (<a href="https://redirect.github.com/vercel/next.js/issues/84238">#84238</a>)</li> <li>Add deprecation warning to Runtime config (<a href="https://redirect.github.com/vercel/next.js/issues/84650">#84650</a>)</li> <li>fix: unstable_cache should perform blocking revalidation during ISR revalidation (<a href="https://redirect.github.com/vercel/next.js/issues/84716">#84716</a>)</li> <li>feat: <code>experimental.middlewareClientMaxBodySize</code> body cloning limit (<a href="https://redirect.github.com/vercel/next.js/issues/84722">#84722</a>)</li> <li>fix: missing next/link types with typedRoutes (<a href="https://redirect.github.com/vercel/next.js/issues/84779">#84779</a>)</li> </ul> <h3>Misc Changes</h3> <ul> <li>docs: early October improvements and fixes (<a href="https://redirect.github.com/vercel/next.js/issues/84334">#84334</a>)</li> </ul> <h3>Credits</h3> <p>Huge thanks to <a href="https://github.com/devjiwonchoi"><code>@devjiwonchoi</code></a>, <a href="https://github.com/ztanner"><code>@ztanner</code></a>, and <a href="https://github.com/icyJoseph"><code>@icyJoseph</code></a> for helping!</p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`3eaf68b09b`"><code>3eaf68b</code></a> v15.5.7</li> <li><a href="`8367ce592a`"><code>8367ce5</code></a> update version script</li> <li><a 
href="`9115040008`"><code>9115040</code></a> Update React Version for Next.js 15.5.7 (<a href="https://redirect.github.com/vercel/next.js/issues/10">#10</a>)</li> <li><a href="`96f699902a`"><code>96f6999</code></a> update tag</li> <li><a href="`55ef0e3ebc`"><code>55ef0e3</code></a> v15.5.6</li> <li><a href="`92bbbb1bec`"><code>92bbbb1</code></a> Backport: don't define <code>process.cwd()</code> in node_modules (<a href="https://redirect.github.com/vercel/next.js/issues/84957">#84957</a>)</li> <li><a href="`f895b72762`"><code>f895b72</code></a> Fix url-imports test on 15-5 (<a href="https://redirect.github.com/vercel/next.js/issues/84966">#84966</a>)</li> <li><a href="`81f530db26`"><code>81f530d</code></a> v15.5.5</li> <li><a href="`9abbc0e9eb`"><code>9abbc0e</code></a> [backport] fix: missing <code>next/link</code> types with <code>typedRoutes</code> (<a href="https://redirect.github.com/vercel/next.js/issues/82814">#82814</a>) (<a href="https://redirect.github.com/vercel/next.js/issues/84779">#84779</a>)</li> <li><a href="`121e1b566f`"><code>121e1b5</code></a> [backport] docs: early October improvements and fixes (<a href="https://redirect.github.com/vercel/next.js/issues/84334">#84334</a>)</li> <li>Additional commits viewable in <a href="https://github.com/vercel/next.js/compare/v15.5.4...v15.5.7">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=next&package-manager=npm_and_yarn&previous-version=15.5.4&new-version=15.5.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
[//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/llamastack/llama-stack/network/alerts). </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-03 20:53:33 -05:00
Nathan Weinberg	2bdcbe7963	fix(ci): standardize CI on node 22 (#4302 ) # What does this PR do? CI was previously using both node 20 and 22; this standardizes on node 22. Closes #4294 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-12-03 19:10:40 -05:00
Nathan Weinberg	c57c2ae562	fix(ci): use latest version of setup-uv and remove pin (#4299 ) # What does this PR do? This commit aligns all 'setup-uv' instances to the latest version and removes the pin that kept several actions on a very old version. Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-12-03 14:13:10 -08:00
Nathan Weinberg	ee1e63e9b9	chore(ci): unify uv versions used in pre-commit (#4297 ) # What does this PR do? we had three different versions of uv being used in pre-commit. bump all to the latest version. we should probably try and find some way to automate this. Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-12-03 14:12:25 -08:00
Charlie Doern	c9b50b7e5b	fix: check if distro dirs exist before listing (#4301 ) # What does this PR do? DISTRO_DIR and DISTRIBS_BASE_DIR need to exist for them to be iterated. our current logic allows us to iterdir without checking if they exist ## Test Plan rm ~/.llama/distributions ``` llama stack list-deps starter --format uv \| sh Using Python 3.12.11 environment at: venv Audited 51 packages in 12ms Using Python 3.12.11 environment at: venv Audited 3 packages in 2ms Using Python 3.12.11 environment at: venv Audited 1 package in 3ms Using Python 3.12.11 environment at: venv Audited 3 packages in 5ms ``` Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-12-03 14:05:47 -08:00
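The guard described in #4301 can be sketched roughly as below. This is an illustration under assumptions, not the code from the PR: `list_distro_names` and its arguments are hypothetical names, and only the standard-library `pathlib` calls are real.

```python
from pathlib import Path


def list_distro_names(*base_dirs: Path) -> list[str]:
    """Collect sub-directory names, skipping base dirs that do not exist.

    Hypothetical helper for illustration only.
    """
    names: list[str] = []
    for base in base_dirs:
        # Path.iterdir() raises FileNotFoundError for a missing directory,
        # so check before iterating.
        if not base.is_dir():
            continue
        names.extend(p.name for p in base.iterdir() if p.is_dir())
    return sorted(names)
```

With a check like this, removing a directory such as `~/.llama/distributions` leads to an empty contribution from that directory rather than an exception.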
Varsha	743683ba26	feat(qdrant): implement hybrid and keyword search support (#4006 ) # What does this PR do? - Part of #3009 - Implement hybrid search using Qdrant's native query filtering - Add keyword search support - Update test suites to include qdrant for keyword and hybrid modes <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> ``` pytest -sv tests/unit/providers/vector_io/ ....... ============================================================================================== slowest 10 durations =============================================================================================== 0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[qdrant] 0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[pgvector] 0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[sqlite_vec] 0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[faiss] 0.06s setup tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_insert_chunks_with_missing_document_id[pgvector] 0.04s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_tie_breaking 0.04s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_weighted_reranker_parametrization 0.03s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_score_selection 0.03s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_edge_cases 0.03s setup 
tests/unit/providers/vector_io/test_faiss.py::test_faiss_query_vector_returns_infinity_when_query_and_embedding_are_identical ======================================================================================== 180 passed, 47 warnings in 2.78s ========================================================================================= ``` Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>	2025-12-03 16:39:01 -05:00
Derek Higgins	5873a316db	feat: Add debug logging for RBAC access control decisions (#4255 ) Refactor is_action_allowed() to track decision outcome, matched rule index, and reason. Add structured debug log output for troubleshooting access control. Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-03 11:04:56 -08:00
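A rough sketch of what tracking the decision outcome, matched rule index, and reason might look like. Everything here is assumed for illustration: the `Rule` shape, the default-deny fallback, the logger name, and the log format are hypothetical and not taken from #4255.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("rbac_sketch")  # hypothetical logger name


@dataclass
class Rule:
    principal: str
    action: str
    allow: bool


def is_action_allowed(rules: list[Rule], principal: str, action: str) -> bool:
    # Assumption of this sketch: deny when no rule matches.
    decision, matched_index, reason = False, None, "no rule matched"
    for index, rule in enumerate(rules):
        if rule.principal == principal and rule.action == action:
            decision, matched_index = rule.allow, index
            reason = "explicit permit" if rule.allow else "explicit forbid"
            break
    # One structured line per decision makes policy debugging easier.
    logger.debug(
        "decision=%s rule_index=%s principal=%s action=%s reason=%s",
        decision, matched_index, principal, action, reason,
    )
    return decision
```

Recording the matched index alongside the outcome means a surprising denial can be traced back to a specific rule in the configured policy.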
Derek Higgins	fcd6370b34	fix: set SqlRecord owner to None when owner_principal is empty (#4284 ) Changes SqlRecord creation in AuthorizedSqlStore.fetch_all to use owner=None when owner_principal is empty/missing, matching the ResourceWithOwner pattern used in routing tables. This fixes an inconsistency where SQL store was creating User(principal="") while routing tables use owner=None for public resources. Changes: o Update ProtectedResource Protocol to allow owner: User \| None o Update SqlRecord.__init__ to accept owner: User \| None o Update fetch_all to create owner=None for records without owner_principal Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-03 10:28:33 -08:00
raghotham	aa3898f486	chore(cve): Update node-forge to 1.3.3 (#4289 ) https://github.com/digitalbazaar/forge/security/advisories/GHSA-554w-wpv2-vw27 Taking on a direct dependency is not great 1. We don't actually use node-forge - it's only needed by webpack-dev-server's dependency (selfsigned) for generating self-signed certificates during development 2. Adding a direct dependency would be misleading - it suggests our code uses node-forge when it doesn't In the dependency chain: ``` @docusaurus/core@3.8.1 └─ webpack-dev-server@4.15.2 └─ selfsigned@2.4.1 └─ node-forge@1.3.1 ``` Latest Docusaurus (3.9.2) uses webpack-dev-server 5.2.2, which still uses selfsigned 2.4.1 So, overriding dependency on node-forge is the only option	2025-12-03 09:58:33 -08:00
Sébastien Han	3c2d74f39a	chore: bump mcp package version (#4287 ) # What does this PR do? Address https://github.com/modelcontextprotocol/python-sdk/security/advisories/GHSA-9h52-p55h-vw2f Signed-off-by: Sébastien Han <seb@redhat.com>	2025-12-03 17:38:56 +01:00
Derek Higgins	8940be23c4	fix: RBAC bypass vulnerabilities in model access (#4270 ) Closes security gaps where RBAC checks could be bypassed: o Inference router: Added RBAC enforcement in the fallback path to ensure access control is applied consistently. o Model listing: Dynamic models fetched via provider_data were returned without RBAC checks. Added filtering to ensure users only see models they have permission to access. Both fixes create temporary ModelWithOwner objects for RBAC validation, maintaining security through consistent access control enforcement. Closes: #4269 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-03 08:42:22 -05:00
Sébastien Han	7f43051a63	feat: Implement FastAPI router system (#4191 ) # What does this PR do? This commit introduces a new FastAPI router-based system for defining API endpoints, enabling a migration path away from the legacy @webmethod decorator system. The implementation includes router infrastructure, migration of the Batches API as the first example, and updates to server, OpenAPI generation, and inspection systems to support both routing approaches. The router infrastructure consists of a router registry system that allows APIs to register FastAPI router factories, which are then automatically discovered and included in the server application. Standard error responses are centralized in router_utils to ensure consistent OpenAPI specification generation with proper $ref references to component responses. The Batches API has been migrated to demonstrate the new pattern. The protocol definition and models remain in llama_stack_api/batches, maintaining clear separation between API contracts and server implementation. The FastAPI router implementation lives in llama_stack/core/server/routers/batches, following the established pattern where API contracts are defined in llama_stack_api and server routing logic lives in llama_stack/core/server. The server now checks for registered routers before falling back to the legacy webmethod-based route discovery, ensuring backward compatibility during the migration period. The OpenAPI generator has been updated to handle both router-based and webmethod-based routes, correctly extracting metadata from FastAPI route decorators and Pydantic Field descriptions. The inspect endpoint now includes routes from both systems, with proper filtering for deprecated routes and API levels. Response descriptions are now explicitly defined in router decorators, ensuring the generated OpenAPI specification matches the previous format. Error responses use $ref references to component responses (BadRequest400, TooManyRequests429, etc.) 
as required by the specification. This is neat and will allow us to remove a lot of boiler plate code from our generator once the migration is done. This implementation provides a foundation for incrementally migrating other APIs to the router system while maintaining full backward compatibility with existing webmethod-based APIs. Closes: https://github.com/llamastack/llama-stack/issues/4188 ## Test Plan CI, the server should start, same routes should be visible. ``` curl http://localhost:8321/v1/inspect/routes \| jq '.data[] \| select(.route \| contains("batches"))' ``` Also: ``` uv run pytest tests/integration/batches/ -vv --stack-config=http://localhost:8321 ================================================== test session starts ================================================== platform darwin -- Python 3.12.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3 cachedir: .pytest_cache metadata: {'Python': '3.12.8', 'Platform': 'macOS-26.0.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}} rootdir: /Users/leseb/Documents/AI/llama-stack configfile: pyproject.toml plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0 asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function collected 24 items tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_creation_and_retrieval[None] SKIPPED [ 4%] tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_listing[None] SKIPPED [ 8%] tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_immediate_cancellation[None] SKIPPED [ 12%] 
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_chat_completions[None] SKIPPED [ 16%] tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_completions[None] SKIPPED [ 20%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_endpoint[None] SKIPPED [ 25%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_completed[None] SKIPPED [ 29%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_missing_required_fields[None] SKIPPED [ 33%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_completion_window[None] SKIPPED [ 37%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_streaming_not_supported[None] SKIPPED [ 41%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_mixed_streaming_requests[None] SKIPPED [ 45%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_endpoint_mismatch[None] SKIPPED [ 50%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_missing_required_body_fields[None] SKIPPED [ 54%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_metadata_types[None] SKIPPED [ 58%] tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_embeddings[None] SKIPPED [ 62%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_nonexistent_file_id PASSED [ 66%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_malformed_jsonl PASSED [ 70%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[empty] XFAIL [ 75%] 
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[malformed] XFAIL [ 79%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_retrieve_nonexistent PASSED [ 83%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_nonexistent PASSED [ 87%] tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_error_handling_invalid_model PASSED [ 91%] tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotent_batch_creation_successful PASSED [ 95%] tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotency_conflict_with_different_params PASSED [100%] ================================================= slowest 10 durations ================================================== 1.01s call tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotent_batch_creation_successful 0.21s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_nonexistent_file_id 0.17s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_malformed_jsonl 0.12s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_error_handling_invalid_model 0.05s setup tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_creation_and_retrieval[None] 0.02s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[empty] 0.01s call tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotency_conflict_with_different_params 0.01s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[malformed] 0.01s call 
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_retrieve_nonexistent 0.00s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_nonexistent ======================================= 7 passed, 15 skipped, 2 xfailed in 1.78s ======================================== ``` --------- Signed-off-by: Sébastien Han <seb@redhat.com>	2025-12-03 12:25:54 +01:00
Adrian Cole	4237eb4aaa	feat: Add opt-in OpenTelemetry auto-instrumentation to Docker images (#4281 ) # What does this PR do? This allows llama-stack users of the Docker image to use OpenTelemetry like previous versions. #4127 migrated to automatic instrumentation, but unless we add those libraries to the image, everyone needs to build a custom image to enable otel. 
Also, unless we establish a convention for enabling it, users who formerly just set config now need to override the entrypoint. This PR bootstraps OTEL packages, so they are available (only +10MB). It also prefixes `llama stack run` with `opentelemetry-instrument` when any `OTEL_*` environment variable is set. The result is implicit tracing like before, where you don't need a custom image to use traces or metrics. ## Test Plan ```bash # Build image docker build -f containers/Containerfile \ --build-arg DISTRO_NAME=starter \ --build-arg INSTALL_MODE=editable \ --tag llamastack/distribution-starter:otel-test . # Run with OTEL env to implicitly use `opentelemetry-instrument`. The # Settings below ensure inbound traces are honored, but no # "junk traces" like SQL connects are created. docker run -p 8321:8321 \ -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 \ -e OTEL_SERVICE_NAME=llama-stack \ -e OTEL_TRACES_SAMPLER=parentbased_traceidratio \ -e OTEL_TRACES_SAMPLER_ARG=0.0 \ llamastack/distribution-starter:otel-test ``` Ran a sample flight search agent which is instrumented on the client side. This and llama-stack target [otel-tui](https://github.com/ymtdzzz/otel-tui) I verified no root database spans, yet database spans are attached to incoming traces. <img width="1608" height="742" alt="screenshot" src="https://github.com/user-attachments/assets/69f59b74-3054-42cd-947d-a6c0d9472a7c" /> Signed-off-by: Adrian Cole <adrian@tetrate.io>	2025-12-02 17:03:27 -08:00
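The opt-in rule from #4281 (prefix `llama stack run` with `opentelemetry-instrument` when any `OTEL_*` variable is set) can be illustrated in Python. The actual image entrypoint is presumably a shell script; `build_command` is a hypothetical helper that only models the decision.

```python
from collections.abc import Mapping


def build_command(env: Mapping[str, str]) -> list[str]:
    """Return the launch command, instrumented only if OTEL_* is present."""
    cmd = ["llama", "stack", "run"]
    # Any variable with the OTEL_ prefix counts as opting in.
    if any(name.startswith("OTEL_") for name in env):
        cmd = ["opentelemetry-instrument", *cmd]
    return cmd
```

Keying off the environment keeps the default image behavior unchanged for users who never set an `OTEL_*` variable.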
Kelly Brown	e243892ef0	docs: Refine and fix nits in README (#4220 ) Description: Refines and fixes some nits in the Llama stack readme	2025-12-02 13:36:29 -08:00
Derek Higgins	0b340ffd6e	fix: correct parameter names in error messages (#4268 ) Error messages were using --test-setup, --test-subdirs, and --test-suite instead of the actual parameter names: --setup, --subdirs, and --suite Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-02 13:34:54 -08:00
Derek Higgins	fbf6c30cdc	fix: call setup_logging early to apply category-specific log levels (#4253 ) Category-specific log levels from LLAMA_STACK_LOGGING were not applied to loggers created before setup_logging() was called. This fix moves the setup_logging() call earlier in the initialization sequence to ensure all loggers respect their configured levels regardless of initialization timing. Closes: #4252 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-02 13:29:04 -08:00
Derek Higgins	2fce5abe34	fix: Add policies to adapters (#4277 ) The configured policy wasn't being passed in and instead the default was being used (e.g. in the s3 file provider) Closes: #4276 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-02 14:08:03 -05:00
Derek Higgins	4ff0c25c52	fix(files): Enforce DELETE action permission for file deletion (#4275 ) Previously, file deletion only checked READ permission via the _lookup_file_id() method. This meant any user with READ access to a file could also delete it, making it impossible to configure read-only file access. This change adds an 'action' parameter to fetch_all() and fetch_one() in AuthorizedSqlStore, defaulting to Action.READ for backward compatibility. The openai_delete_file() method now passes Action.DELETE, ensuring proper RBAC enforcement. With this fix, access policies can now grant users permission to read/list files without also letting them delete those files. Closes: #4274 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-02 09:56:59 -08:00
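A simplified sketch of the pattern in #4275: a lookup that takes an `action` argument defaulting to read, with the delete path passing the delete action explicitly. The `POLICY` table, the in-memory `files` dict, and both functions are hypothetical stand-ins, not the `AuthorizedSqlStore` API.

```python
from enum import Enum


class Action(Enum):
    READ = "read"
    DELETE = "delete"


# Hypothetical policy: principal -> actions that principal may perform.
POLICY = {"alice": {Action.READ, Action.DELETE}, "bob": {Action.READ}}


def fetch_one(files: dict, file_id: str, principal: str,
              action: Action = Action.READ):
    """Return the file only if `principal` may perform `action`."""
    if action not in POLICY.get(principal, set()):
        return None
    return files.get(file_id)


def delete_file(files: dict, file_id: str, principal: str) -> bool:
    # Look the file up with DELETE rather than relying on the READ default.
    if fetch_one(files, file_id, principal, action=Action.DELETE) is None:
        return False
    del files[file_id]
    return True
```

Because the default stays at read, existing callers behave as before, while the delete path becomes subject to its own permission.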
Omar Abdelwahab	ee107aadd6	fix(docs): Updated the LS documentation to point users to the correct docker container (#4267 ) # What does this PR do? Fixed the docker container name in the documentation by changing `docker pull llama-stack/distribution-starter` `docker pull llama-stack/distribution-meta-reference-gpu` to `docker pull llamastack/distribution-starter` `docker pull llamastack/distribution-meta-reference-gpu` Closes this [issue](https://github.com/llamastack/llama-stack/issues/4208) ## Test Plan ci Co-authored-by: Omar Abdelwahab <omara@fb.com>	2025-12-01 21:03:34 -08:00
Derek Higgins	9616448213	fix: use string annotations for S3Client type hints (#4242 ) fix: use string annotations for S3Client type hints Remove future annotations import and use quoted string annotations for S3Client to avoid import issues. 
Changes: o Remove __future__ annotations import o Use "S3Client" string annotations in type hints closes: #4241 Signed-off-by: Derek Higgins <derekh@redhat.com>	2025-12-01 15:47:35 -08:00
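A small sketch of quoted annotations as described in #4242. The stub import path (`mypy_boto3_s3`) is an assumption about which typing package is used, and `describe_client` is a made-up function; the point is only that a quoted annotation is stored as a plain string and is not resolved when the function is defined.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; never imported at runtime.
    # Assumed stub package, not confirmed by the commit message.
    from mypy_boto3_s3.client import S3Client


def describe_client(client: "S3Client") -> str:
    """Hypothetical function; the quoted hint is not evaluated here."""
    return type(client).__name__
```

Since the name is never looked up at definition time, the module imports cleanly even when the stub package is not installed.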
Charlie Doern	aaecd0327c	feat(api): oasdiff OpenAI openAPI spec against ours (#3529 ) # What does this PR do? diff the `/v1/` routes that are OpenAI compatible against the OpenAI openAPI spec. This will of course only trigger on PRs where the spec is changed. This will catch errors with new handwritten additions to our openAI compat routes. Instead of fetching the OpenAPI spec from a dynamic URL, which could cause non-deterministic build failures, this change uses a local copy stored at `docs/static/openai-spec.yml`. This makes the conformance check fully reproducible and prevents CI failures caused by uncontrolled upstream changes. I am marking this test with `continue-on-error: true`, until we get rid of all of the errors. Nevertheless, this is a nice utility to have so folks know if their spec changes introduce more breaking changes or fix breakages when comparing to the OpenAI openapi spec. ## Test Plan test should pass. Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-12-01 15:27:08 -08:00


3232 commits