Commit graph

3232 commits

Author SHA1 Message Date
Roy Belio
c574db5f1d
fix(inference): AttributeError in streaming response cleanup (#4236)
This PR fixes issue #3185 
The code calls `await event_gen.aclose()` but OpenAI's `AsyncStream`
doesn't have an `aclose()` method - it has `close()` (which is async).
when clients cancel streaming requests, the server tries to clean up
with:

```python
await event_gen.aclose()  #  AsyncStream doesn't have aclose()!
```

But `AsyncStream` has never had a public `aclose()` method. The error
message literally tells us:

```
AttributeError: 'AsyncStream' object has no attribute 'aclose'. Did you mean: 'close'?
                                                                            ^^^^^^^^
```

## Verification
* Reproduction script
[`reproduce_issue_3185.sh`](https://gist.github.com/r-bit-rry/dea4f8fbb81c446f5db50ea7abd6379b)
can be used to verify the fix.
*   Manual checks, validation against original OpenAI library code
2025-12-14 07:51:09 -05:00
Omar Abdelwahab
dfb9f6743a
docs: Adding initial updates to the RAG documentation and examples (#4377)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 2s
Integration Tests (Replay) / generate-matrix (push) Successful in 4s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
API Conformance Tests / check-schema-compatibility (push) Successful in 12s
Python Package Build Test / build (3.12) (push) Successful in 18s
Python Package Build Test / build (3.13) (push) Successful in 22s
Test External API and Providers / test-external (venv) (push) Failing after 37s
Vector IO Integration Tests / test-matrix (push) Failing after 46s
UI Tests / ui-tests (22) (push) Successful in 1m23s
Unit Tests / unit-tests (3.12) (push) Failing after 1m48s
Unit Tests / unit-tests (3.13) (push) Failing after 1m50s
Pre-commit / pre-commit (22) (push) Successful in 3m31s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m20s
# What does this PR do?
This PR updates the RAG examples included in docs/quick_start.ipynb,
docs/getting_started/demo_script.py, rag.mdx and index.md to remove
references to the deprecated vector_io and vector_db APIs and to add
examples that use /v1/vector_stores with responses and completions.

---------

Co-authored-by: Omar Abdelwahab <omara@fb.com>
Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
2025-12-12 22:59:39 -05:00
Varsha
75ef052545
docs: Add details on model registration and refresh_models (#4383)
Document the refresh_models configuration option for remote providers
that use RemoteInferenceProviderConfig.

- Add "Automatic vs Explicit Model Registration" section to
resources.mdx
- Include examples for registering custom embedding models

# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
2025-12-12 22:41:28 -05:00
Robert Riley (OCI)
10c878d782
feat: added oci-s3 compatibility (#4374)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 5s
API Conformance Tests / check-schema-compatibility (push) Successful in 14s
Python Package Build Test / build (3.12) (push) Successful in 16s
Python Package Build Test / build (3.13) (push) Successful in 17s
Test External API and Providers / test-external (venv) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (push) Failing after 50s
UI Tests / ui-tests (22) (push) Successful in 1m1s
Unit Tests / unit-tests (3.12) (push) Failing after 1m39s
Unit Tests / unit-tests (3.13) (push) Failing after 1m43s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m47s
Pre-commit / pre-commit (22) (push) Successful in 3m42s
# What does this PR do?
The PR validates and allow access to OCI object-storage through the S3
compatibility API. Additional documentation for OCI is supplied, in
notebook form, as well.

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

---------

Co-authored-by: raghotham <rsm@meta.com>
2025-12-11 15:13:55 -08:00
Shabana Baig
805abf573f
feat!: Implement include parameter specifically for adding logprobs in the output message (#4261)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 15s
Python Package Build Test / build (3.12) (push) Successful in 17s
Python Package Build Test / build (3.13) (push) Successful in 18s
Test External API and Providers / test-external (venv) (push) Failing after 28s
Vector IO Integration Tests / test-matrix (push) Failing after 43s
UI Tests / ui-tests (22) (push) Successful in 52s
Unit Tests / unit-tests (3.13) (push) Failing after 1m45s
Unit Tests / unit-tests (3.12) (push) Failing after 1m58s
Pre-commit / pre-commit (22) (push) Successful in 3m9s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m5s
# Problem
As an Application Developer, I want to use the include parameter with
the value message.output_text.logprobs, so that I can receive log
probabilities for output tokens to assess the model's confidence in its
response.

# What does this PR do?

- Updates the include parameter in various resource definitions
- Updates the inline provider to return logprobs when
"message.output_text.logprobs" is passed in the include parameter
- Converts the logprobs returned by the inference provider from chat
completion format to responses format

Closes #[4260](https://github.com/llamastack/llama-stack/issues/4260)

## Test Plan

- Created a script to explore OpenAI behavior:
https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/include.py
- Added integration tests and new recordings

---------

Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-12-11 11:11:21 -08:00
Jaideep Rao
76e47d811a
feat(api): add readonly connectors API (#4258)
# What does this PR do?
Adds a new API for connectors and MCP registry support along with
required types.
Does not include any implementation for it

<!-- If resolving an issue, uncomment and update the line below -->
Closes #4235 and #4061 (partially)

## Test Plan
no tests included

---------

Signed-off-by: Jaideep Rao <jrao@redhat.com>
Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
2025-12-11 10:19:55 -08:00
Sébastien Han
470fe55e87
fix(inference): respect table_name config in InferenceStore (#4371)
# What does this PR do?

The InferenceStore class was ignoring the table_name field from
InferenceStoreReference and always using the hardcoded value
"chat_completions". This meant that any custom table_name configured in
the run config (e.g., "inference_store" in run-with-postgres-store.yaml)
was silently ignored.

This change updates all SQL operations in InferenceStore to use
self.reference.table_name instead of the hardcoded string, ensuring the
configured table name is properly respected.

A new test has been added to verify that custom table names work
correctly for storing, retrieving, and listing chat completions.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan

CI

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-12-11 14:50:23 +01:00
Charlie Doern
7308c8aef1
feat: add workflow_dispatch and self-trigger to stainless builds (#4361)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 12s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 17s
Test External API and Providers / test-external (venv) (push) Failing after 30s
Vector IO Integration Tests / test-matrix (push) Failing after 48s
UI Tests / ui-tests (22) (push) Successful in 1m36s
Unit Tests / unit-tests (3.13) (push) Failing after 1m43s
Unit Tests / unit-tests (3.12) (push) Failing after 1m54s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m24s
Pre-commit / pre-commit (22) (push) Successful in 4m22s
# What does this PR do?

 Currently impossible to test workflow changes (pull_request_target uses
 base branch definition) or manually trigger SDK builds. This adds both
 capabilities.

  - Add workflow_dispatch with pr_number input for manual testing
  - Add workflow file to path triggers for automatic testing
  - Fetch PR details via gh CLI for manual runs
  - Update jobs to use computed PR data for both trigger types

## Test Plan

impossible to test until it merges unfortunately. I am doing this in a
smaller PR so that I can use it immediately in a follow up.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-12-10 12:48:27 -08:00
Francisco Javier Arceo
95b2948d11
feat: Add support for query rewrite in vector_store.search (#4171)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 20s
Test External API and Providers / test-external (venv) (push) Failing after 41s
Vector IO Integration Tests / test-matrix (push) Failing after 49s
UI Tests / ui-tests (22) (push) Successful in 51s
Unit Tests / unit-tests (3.13) (push) Failing after 1m27s
Unit Tests / unit-tests (3.12) (push) Failing after 1m45s
Pre-commit / pre-commit (22) (push) Failing after 2m30s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m22s
# What does this PR do?

Actualize query rewrite in search API, add
`default_query_expansion_model` and `query_expansion_prompt` in
`VectorStoresConfig`.

Makes `rewrite_query` parameter functional in vector store search.
  - `rewrite_query=false` (default): Use original query
- `rewrite_query=true`: Expand query via LLM, or fail gracefully if no
LLM available

Adds 4 parameters to`VectorStoresConfig`:
- `default_query_expansion_model`: LLM model for query expansion
(optional)
- `query_expansion_prompt`: Custom prompt template (optional, uses
built-in default)
- `query_expansion_max_tokens`: Configurable token limit (default: 100)
- `query_expansion_temperature`: Configurable temperature (default: 0.3)

Enabled `run.yaml`:
```yaml
  vector_stores:
    rewrite_query_params:
      model:
        provider_id: "ollama"
        model_id: "llama3.2:3b-instruct-fp16"
      # prompt defaults to built-in
      # max_tokens defaults to 100
      # temperature defaults to 0.3
```

  Fully customized `run.yaml`:
```yaml
  vector_stores:
    default_provider_id: faiss
    default_embedding_model:
      provider_id: sentence-transformers
      model_id: nomic-ai/nomic-embed-text-v1.5
    rewrite_query_params:
      model:
        provider_id: ollama
        model_id: llama3.2:3b-instruct-fp16
      prompt: "Rewrite this search query to improve retrieval results by expanding it with relevant synonyms and related terms: {query}"
      max_tokens: 100
      temperature: 0.3
```

## Test Plan
Added test and recording

Example script as well:

```python
import asyncio
from llama_stack_client import LlamaStackClient
from io import BytesIO

def gen_file(client, text: str=""):
    file_buffer = BytesIO(text.encode('utf-8'))
    file_buffer.name = "my_file.txt"

    uploaded_file = client.files.create(
        file=file_buffer,
        purpose="assistants"
    )
    return uploaded_file

async def test_query_rewriting():
    client = LlamaStackClient(base_url="http://0.0.0.0:8321/")
    uploaded_file = gen_file(client, "banana banana apple")
    uploaded_file2 = gen_file(client, "orange orange kiwi")

    vs = client.vector_stores.create()
    xf_vs = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file.id)
    xf_vs1 = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file2.id)
    response1 = client.vector_stores.search(
                vector_store_id=vs.id,
                query="apple",
                max_num_results=3,
                rewrite_query=False
            )
    response2 = client.vector_stores.search(
                vector_store_id=vs.id,
                query="kiwi",
                max_num_results=3,
                rewrite_query=True,
            )

    print(f"\n🔵 Response 1 (rewrite_query=False):\n\033[94m{response1}\033[0m")
    print(f"\n🟢 Response 2 (rewrite_query=True):\n\033[92m{response2}\033[0m")

    for f in [uploaded_file.id, uploaded_file2.id]:
        client.files.delete(file_id=f)
    client.vector_stores.delete(vector_store_id=vs.id)

if __name__ == "__main__":
    asyncio.run(test_query_rewriting())
```

And see the screen shot of the server logs showing it worked. 
<img width="1111" height="826" alt="Screenshot 2025-11-19 at 1 16 03 PM"
src="https://github.com/user-attachments/assets/2d188b44-1fef-4df5-b465-2d6728ca49ce"
/>

Notice the log:
```bash
 Query rewritten:
         'kiwi' → 'kiwi, a small brown or green fruit native to New Zealand, or a person having a fuzzy brown outer skin similar in appearance.'
```
So `kiwi` was expanded.

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
2025-12-10 10:06:19 -05:00
Sébastien Han
ff375f1abb
feat: convert Benchmarks API to use FastAPI router (#4309)
# What does this PR do?

Convert the Benchmarks API from @webmethod decorators to FastAPI router
pattern, matching the Batches API structure.

One notable change is the update of stack.py to handle request models in
register_resources().

Closes: #4308 

## Test Plan

CI and `curl http://localhost:8321/v1/inspect/routes | jq '.data[] |
select(.route | contains("benchmark"))'`

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-12-10 15:04:27 +01:00
Charlie Doern
661985e240
feat: remove usage of build yaml (#4192)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Failing after 3s
Test Llama Stack Build / build (push) Has been skipped
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test llama stack list-deps / generate-matrix (push) Failing after 3s
Test llama stack list-deps / list-deps (push) Has been skipped
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.13) (push) Successful in 19s
Python Package Build Test / build (3.12) (push) Successful in 23s
Test Llama Stack Build / build-single-provider (push) Successful in 33s
Test llama stack list-deps / show-single-provider (push) Successful in 36s
Test llama stack list-deps / list-deps-from-config (push) Successful in 44s
Vector IO Integration Tests / test-matrix (push) Failing after 57s
Test External API and Providers / test-external (venv) (push) Failing after 1m37s
Unit Tests / unit-tests (3.12) (push) Failing after 1m56s
UI Tests / ui-tests (22) (push) Successful in 2m2s
Unit Tests / unit-tests (3.13) (push) Failing after 2m35s
Pre-commit / pre-commit (22) (push) Successful in 3m16s
Test Llama Stack Build / build-custom-container-distribution (push) Successful in 3m34s
Test Llama Stack Build / build-ubi9-container-distribution (push) Successful in 3m59s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m30s
# What does this PR do?

the build.yaml is only used in the following ways:

1. list-deps
2. distribution code-gen

since `llama stack build` no longer exists, I found myself asking "why
do we need two different files for list-deps and run"?

Removing the BuildConfig and altering the usage of the
DistributionTemplate in llama stack list-deps is the first step in
removing the build yaml entirely.

Removing the BuildConfig and build.yaml cuts the files users need to
maintain in half, and allows us to focus on the stability of _just_ the
run.yaml

This PR removes the build.yaml, BuildConfig datatype, and its usage
throughout the codebase. Users are now expected to point to run.yaml
files when running list-deps, and our codebase automatically uses these
types now for things like `get_provider_registry`.

**Additionally, two renames: `StackRunConfig` -> `StackConfig` and
`run.yaml` -> `config.yaml`.**

The build.yaml made sense for when we were managing the build process
for the user and actually _producing_ a run.yaml _from_ the build.yaml,
but now that we are simply just getting the provider registry and
listing the deps, switching to config.yaml simplifies the scope here
greatly.

## Test Plan

existing list-deps usage should work in the tests.

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-12-10 10:12:12 +01:00
Varsha
17e6912288
docs: Fix vector_store_create params (#4364)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 14s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 17s
Test External API and Providers / test-external (venv) (push) Failing after 31s
Vector IO Integration Tests / test-matrix (push) Failing after 38s
UI Tests / ui-tests (22) (push) Successful in 44s
Unit Tests / unit-tests (3.12) (push) Failing after 1m30s
Unit Tests / unit-tests (3.13) (push) Failing after 1m29s
Pre-commit / pre-commit (22) (push) Successful in 2m59s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m38s
2025-12-09 19:48:43 -05:00
Francisco Javier Arceo
fcea9893a4
feat(UI): Adding Files API to Admin UI (#4319)
# What does this PR do?


## Files Admin Page

<img width="1919" height="1238" alt="Screenshot 2025-12-09 at 10 33
06 AM"
src="https://github.com/user-attachments/assets/3dd545f0-32bc-45be-af2b-1823800015f2"
/>

## Files Upload Modal
<img width="1919" height="1287" alt="Screenshot 2025-12-09 at 10 33
38 AM"
src="https://github.com/user-attachments/assets/776bb372-75d3-4ccd-b6b5-c9dfb3fcb350"
/>

## Files Detail
<img width="1918" height="1099" alt="Screenshot 2025-12-09 at 10 34
26 AM"
src="https://github.com/user-attachments/assets/f256dbf8-4047-4d79-923d-404161b05f36"
/>

Note, content preview has some handling for JSON, CSV, and PDF to enable
nicer rendering. Pure text rendering is trivial.

### Files Detail File Content Preview (TXT)
<img width="1918" height="1341" alt="Screenshot 2025-12-09 at 10 41
20 AM"
src="https://github.com/user-attachments/assets/4fa0ddb7-ffff-424b-b764-0bd4af6ed976"
/>

### Files Detail File Content Preview (JSON)
<img width="1909" height="1233" alt="Screenshot 2025-12-09 at 10 39
57 AM"
src="https://github.com/user-attachments/assets/b912f07a-2dff-483b-b73c-2f69dd0d87ad"
/>

### Files Detail File Content Preview (HTML)
<img width="1916" height="1348" alt="Screenshot 2025-12-09 at 10 40
27 AM"
src="https://github.com/user-attachments/assets/17ebec0a-8754-4552-977d-d3c44f7f6973"
/>

### Files Detail File Content Preview (CSV)
<img width="1919" height="1177" alt="Screenshot 2025-12-09 at 10 34
50 AM"
src="https://github.com/user-attachments/assets/20bd0755-1757-4a3a-99d2-fbd072f81f49"
/>

### Files Detail File Content Preview (PDF)
<img width="1917" height="1154" alt="Screenshot 2025-12-09 at 10 36
48 AM"
src="https://github.com/user-attachments/assets/2873e6fe-4da3-4cbd-941b-7d903270b749"
/>


Closes https://github.com/llamastack/llama-stack/issues/4144

## Test Plan
Added Tests

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-12-09 16:28:05 -05:00
Robert Riley (OCI)
6ad5fb5577
feat: Adding OCI Embeddings (#4300)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 11s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 18s
Test External API and Providers / test-external (venv) (push) Failing after 30s
UI Tests / ui-tests (22) (push) Successful in 56s
Vector IO Integration Tests / test-matrix (push) Failing after 1m1s
Unit Tests / unit-tests (3.13) (push) Failing after 1m44s
Unit Tests / unit-tests (3.12) (push) Failing after 1m48s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m17s
Pre-commit / pre-commit (22) (push) Successful in 3m22s
# What does this PR do?
Enabling usage of OCI embedding models.

## Test Plan
Testing embedding model:
`OCI_COMPARTMENT_OCID="" OCI_REGION="us-chicago-1"
OCI_AUTH_TYPE=config_file pytest -sv
tests/integration/inference/test_openai_embeddings.py --stack-config oci
--embedding-model oci/openai.text-embedding-3-small --inference-mode
live`

Testing chat model:
`OCI_COMPARTMENT_OCID="" OCI_REGION="us-chicago-1"
OCI_AUTH_TYPE=config_file pytest -sv tests/integration/inference/
--stack-config oci --text-model oci/openai.gpt-4.1-nano-2025-04-14
--inference-mode live`

Testing curl for embeddings:
`curl -X POST http://localhost:8321/v1/embeddings -H "Content-Type:
application/json" -d '{
  "model": "oci/openai.text-embedding-3-small",
  "input": ["First text", "Second text"],
  "encoding_format": "float"
}'`

`{"object":"list","data":[{"object":"embedding","embedding":[-0.017190756...0.025272394],"index":1}],"model":"oci/openai.text-embedding-3-small","usage":{"prompt_tokens":4,"total_tokens":4}}`

---------

Co-authored-by: Omar Abdelwahab <omaryashraf10@gmail.com>
2025-12-08 13:05:39 -08:00
Sébastien Han
d82a2cd6f8
fix: httpcore deadlock in CI by properly closing streaming responses (#4335)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 4s
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Python Package Build Test / build (3.13) (push) Successful in 17s
Python Package Build Test / build (3.12) (push) Successful in 18s
Test External API and Providers / test-external (venv) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (push) Failing after 33s
UI Tests / ui-tests (22) (push) Successful in 1m13s
Unit Tests / unit-tests (3.12) (push) Failing after 1m37s
Unit Tests / unit-tests (3.13) (push) Failing after 2m11s
Pre-commit / pre-commit (22) (push) Successful in 3m39s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m1s
# What does this PR do?

The test_conversation_error_handling test was timing out in CI with a
deadlock in httpcore's connection pool. The root cause was the preceding
test_conversation_multi_turn_and_streaming test, which broke out of the
streaming response iterator early without properly closing the
underlying HTTP connection.

When a streaming response iterator is abandoned mid-stream, the HTTP
connection remains in an incomplete state. Since the openai_client
fixture is session-scoped, subsequent tests reuse the same httpcore
connection pool. The dangling connection causes the pool's internal lock
to deadlock when the next test attempts to acquire a new connection.

The fix wraps the streaming response in a context manager, which ensures
the connection is properly closed when exiting the with block, even when
breaking out of the loop early. This is a best practice when working
with streaming HTTP responses that may not be fully consumed.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-12-08 16:38:46 +01:00
dependabot[bot]
20c11d8fd4
chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.7.0 to 1.7.1 (#4334)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 4s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 6s
API Conformance Tests / check-schema-compatibility (push) Successful in 18s
Python Package Build Test / build (3.12) (push) Successful in 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 30s
Test llama stack list-deps / generate-matrix (push) Successful in 33s
Test Llama Stack Build / generate-matrix (push) Successful in 36s
Test llama stack list-deps / show-single-provider (push) Successful in 33s
Python Package Build Test / build (3.13) (push) Successful in 59s
Test llama stack list-deps / list-deps-from-config (push) Successful in 1m8s
Test Llama Stack Build / build-single-provider (push) Successful in 1m12s
Test External API and Providers / test-external (venv) (push) Failing after 1m9s
Vector IO Integration Tests / test-matrix (push) Failing after 1m24s
UI Tests / ui-tests (22) (push) Successful in 1m29s
Test Llama Stack Build / build (push) Successful in 1m0s
Test llama stack list-deps / list-deps (push) Failing after 1m23s
Unit Tests / unit-tests (3.13) (push) Failing after 2m42s
Unit Tests / unit-tests (3.12) (push) Failing after 2m51s
Test Llama Stack Build / build-custom-container-distribution (push) Successful in 3m47s
Pre-commit / pre-commit (22) (push) Successful in 3m55s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4m7s
Test Llama Stack Build / build-ubi9-container-distribution (push) Successful in 4m43s
Bumps
[stainless-api/upload-openapi-spec-action](https://github.com/stainless-api/upload-openapi-spec-action)
from 1.7.0 to 1.7.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/stainless-api/upload-openapi-spec-action/releases">stainless-api/upload-openapi-spec-action's
releases</a>.</em></p>
<blockquote>
<h2>v1.7.1</h2>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.7.0...v1.7.1">1.7.1</a>
(2025-12-01)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>improve getMergeBase to handle shallow clones more robustly (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/138">#138</a>)
(<a
href="3687845465">3687845</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/stainless-api/upload-openapi-spec-action/blob/main/CHANGELOG.md">stainless-api/upload-openapi-spec-action's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.7.0...v1.7.1">1.7.1</a>
(2025-12-01)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>improve getMergeBase to handle shallow clones more robustly (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/138">#138</a>)
(<a
href="3687845465">3687845</a>)</li>
</ul>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.6.0...v1.7.0">1.7.0</a>
(2025-11-17)</h2>
<h3>Features</h3>
<ul>
<li><strong>preview:</strong> add output documented_spec_path to preview
action (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/135">#135</a>)
(<a
href="5e80cc40da">5e80cc4</a>)</li>
<li><strong>preview:</strong> add output_dir input and write documented
spec to file (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/137">#137</a>)
(<a
href="d30490c89b">d30490c</a>)</li>
</ul>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.5...v1.6.0">1.6.0</a>
(2025-10-30)</h2>
<h3>Features</h3>
<ul>
<li>add support for github OIDC auth (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/133">#133</a>)
(<a
href="259674c1b3">259674c</a>)</li>
<li>change fail on semantics (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/124">#124</a>)
(<a
href="e1046240c0">e104624</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>accept multiline conventional commits (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/129">#129</a>)
(<a
href="d2dcc0b3bf">d2dcc0b</a>)</li>
<li>tweak categorizeOutcomes (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/132">#132</a>)
(<a
href="c45d6a9c79">c45d6a9</a>)</li>
</ul>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.4...v1.5.5">1.5.5</a>
(2025-09-26)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>rollback filtering diagnostics by target (<a
href="54328a386f">54328a3</a>)</li>
</ul>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.3...v1.5.4">1.5.4</a>
(2025-09-25)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>check for latestRun before commenting (<a
href="53fef9f328">53fef9f</a>)</li>
<li>filter diagnostics by target (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/125">#125</a>)
(<a
href="102dc971cb">102dc97</a>)</li>
</ul>
<h2><a
href="https://github.com/stainless-api/upload-openapi-spec-action/compare/v1.5.2...v1.5.3">1.5.3</a>
(2025-09-16)</h2>
<h3>Bug Fixes</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a4d631c1e9"><code>a4d631c</code></a>
chore(main): release 1.7.1 (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/141">#141</a>)</li>
<li><a
href="56c2d869b3"><code>56c2d86</code></a>
chore: add structured logger (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/139">#139</a>)</li>
<li><a
href="3687845465"><code>3687845</code></a>
fix: improve getMergeBase to handle shallow clones more robustly (<a
href="https://redirect.github.com/stainless-api/upload-openapi-spec-action/issues/138">#138</a>)</li>
<li>See full diff in <a
href="9133735bca...a4d631c1e9">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=stainless-api/upload-openapi-spec-action&package-manager=github_actions&previous-version=1.7.0&new-version=1.7.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-08 12:04:22 +01:00
dependabot[bot]
912ab6b4a2
chore(github-deps): bump actions/setup-node from 6.0.0 to 6.1.0 (#4333)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from
6.0.0 to 6.1.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v6.1.0</h2>
<h2>What's Changed</h2>
<h3>Enhancement:</h3>
<ul>
<li>Remove always-auth configuration handling by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1436">actions/setup-node#1436</a></li>
</ul>
<h3>Dependency updates:</h3>
<ul>
<li>Upgrade <code>@​actions/cache</code> from 4.0.3 to 4.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1384">actions/setup-node#1384</a></li>
<li>Upgrade actions/checkout from 5 to 6 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1439">actions/setup-node#1439</a></li>
<li>Upgrade js-yaml from 3.14.1 to 3.14.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1435">actions/setup-node#1435</a></li>
</ul>
<h3>Documentation update:</h3>
<ul>
<li>Add example for restore-only cache in documentation by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1419">actions/setup-node#1419</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v6...v6.1.0">https://github.com/actions/setup-node/compare/v6...v6.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="395ad32622"><code>395ad32</code></a>
Bump js-yaml from 3.14.1 to 3.14.2 (<a
href="https://redirect.github.com/actions/setup-node/issues/1435">#1435</a>)</li>
<li><a
href="a4d2e2bbca"><code>a4d2e2b</code></a>
Bump actions/checkout from 5 to 6 (<a
href="https://redirect.github.com/actions/setup-node/issues/1439">#1439</a>)</li>
<li><a
href="b9b25d45f7"><code>b9b25d4</code></a>
Remove always-auth configuration handling from action (<a
href="https://redirect.github.com/actions/setup-node/issues/1436">#1436</a>)</li>
<li><a
href="633bb92bc0"><code>633bb92</code></a>
Bump <code>@​actions/cache</code> from 4.0.3 to 4.1.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1384">#1384</a>)</li>
<li><a
href="dda4788290"><code>dda4788</code></a>
Add example for restore-only cache in documentation (<a
href="https://redirect.github.com/actions/setup-node/issues/1419">#1419</a>)</li>
<li>See full diff in <a
href="2028fbc5c2...395ad32622">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=6.0.0&new-version=6.1.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-08 12:03:44 +01:00
dependabot[bot]
39d23d9894
chore(github-deps): bump actions/stale from 10.1.0 to 10.1.1 (#4332)
Bumps [actions/stale](https://github.com/actions/stale) from 10.1.0 to
10.1.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/stale/releases">actions/stale's
releases</a>.</em></p>
<blockquote>
<h2>v10.1.1</h2>
<h2>What's Changed</h2>
<h3>Bug Fix</h3>
<ul>
<li>Add Missing Input Reading for <code>only-issue-types</code> by <a
href="https://github.com/Bibo-Joshi"><code>@​Bibo-Joshi</code></a> in <a
href="https://redirect.github.com/actions/stale/pull/1298">actions/stale#1298</a></li>
</ul>
<h3>Improvement</h3>
<ul>
<li>Improves error handling when rate limiting is disabled on GHES. by
<a
href="https://github.com/chiranjib-swain"><code>@​chiranjib-swain</code></a>
in <a
href="https://redirect.github.com/actions/stale/pull/1300">actions/stale#1300</a></li>
</ul>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade eslint-config-prettier from 8.10.0 to 10.1.8 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/stale/pull/1276">actions/stale#1276</a></li>
<li>Upgrade <code>@​types/node</code> from 20.10.3 to 24.2.0 and
document breaking changes in v10 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/stale/pull/1280">actions/stale#1280</a></li>
<li>Upgrade actions/publish-action from 0.3.0 to 0.4.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/stale/pull/1291">actions/stale#1291</a></li>
<li>Upgrade actions/checkout from 4 to 6 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/stale/pull/1306">actions/stale#1306</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/chiranjib-swain"><code>@​chiranjib-swain</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/stale/pull/1300">actions/stale#1300</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/stale/compare/v10...v10.1.1">https://github.com/actions/stale/compare/v10...v10.1.1</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="997185467f"><code>9971854</code></a>
build(deps): bump actions/checkout from 4 to 6 (<a
href="https://redirect.github.com/actions/stale/issues/1306">#1306</a>)</li>
<li><a
href="5611b9defa"><code>5611b9d</code></a>
build(deps): bump actions/publish-action from 0.3.0 to 0.4.0 (<a
href="https://redirect.github.com/actions/stale/issues/1291">#1291</a>)</li>
<li><a
href="fad0de84e5"><code>fad0de8</code></a>
Improves error handling when rate limiting is disabled on GHES. (<a
href="https://redirect.github.com/actions/stale/issues/1300">#1300</a>)</li>
<li><a
href="39bea7de61"><code>39bea7d</code></a>
Add Missing Input Reading for <code>only-issue-types</code> (<a
href="https://redirect.github.com/actions/stale/issues/1298">#1298</a>)</li>
<li><a
href="e46bbabb3e"><code>e46bbab</code></a>
build(deps-dev): bump <code>@​types/node</code> from 20.10.3 to 24.2.0
and document breakin...</li>
<li><a
href="65d1d4804d"><code>65d1d48</code></a>
build(deps-dev): bump eslint-config-prettier from 8.10.0 to 10.1.8 (<a
href="https://redirect.github.com/actions/stale/issues/1276">#1276</a>)</li>
<li>See full diff in <a
href="5f858e3efb...997185467f">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/stale&package-manager=github_actions&previous-version=10.1.0&new-version=10.1.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-08 11:56:42 +01:00
dependabot[bot]
8f585e4c7a
chore(github-deps): bump actions/checkout from 6.0.0 to 6.0.1 (#4331)
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.0
to 6.0.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Update all references from v5 and v4 to v6 by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2314">actions/checkout#2314</a></li>
<li>Add worktree support for persist-credentials includeIf by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2327">actions/checkout#2327</a></li>
<li>Clarify v6 README by <a
href="https://github.com/ericsciple"><code>@​ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2328">actions/checkout#2328</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v6...v6.0.1">https://github.com/actions/checkout/compare/v6...v6.0.1</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8e8c483db8"><code>8e8c483</code></a>
Clarify v6 README (<a
href="https://redirect.github.com/actions/checkout/issues/2328">#2328</a>)</li>
<li><a
href="033fa0dc0b"><code>033fa0d</code></a>
Add worktree support for persist-credentials includeIf (<a
href="https://redirect.github.com/actions/checkout/issues/2327">#2327</a>)</li>
<li><a
href="c2d88d3ecc"><code>c2d88d3</code></a>
Update all references from v5 and v4 to v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2314">#2314</a>)</li>
<li>See full diff in <a
href="1af3b93b68...8e8c483db8">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=6.0.0&new-version=6.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-08 11:56:25 +01:00
Varsha
3ca0481e43
fix(ui): Fix model dropdown not displaying models in chat playground (#4329)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 18s
Test External API and Providers / test-external (venv) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (push) Failing after 34s
UI Tests / ui-tests (22) (push) Successful in 41s
Unit Tests / unit-tests (3.13) (push) Failing after 1m18s
Unit Tests / unit-tests (3.12) (push) Failing after 1m26s
Pre-commit / pre-commit (22) (push) Successful in 2m53s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m8s
2025-12-05 16:54:12 -05:00
Derek Higgins
8998000aec
fix(security): redact JWT tokens in server logs (#4325)
Add "token" to sensitive field patterns in redact_sensitive_fields() to
prevent JWT tokens from being logged in plaintext. Previously only
api_key, api_token, password, and secret were filtered.

This prevents tokens like server.auth.provider_config.jwks.token from
being exposed in server logs.

Closes: #4324

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-05 15:53:47 -05:00
Derek Higgins
fc4fc03606
chore: Small Auth CI refactor (#4322)
In preperation for ABAC addition (next PR)
```
    fix(ci): allow run_dir variable expansion in YAML heredoc
    
    Remove single quotes from EOF delimiter to allow $run_dir to
    be expanded by bash when creating the configuration file.
    Previously the literal string "$run_dir" was being written
    to the YAML instead of the actual temp directory path.
    
    drwxr-xr-x  3 runner runner   4096 Dec  5 12:56 $run_dir
```    
```
    test(ci): add test_endpoint helper function to auth tests
    
    Add reusable test_endpoint function to integration-auth-tests
    workflow for consistent API testing:
```

---------

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-05 12:01:29 -08:00
Varad Ahirwadkar
06f7ff2c80
fix: Correct broken links in README (#4218)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 17s
API Conformance Tests / check-schema-compatibility (push) Successful in 22s
Vector IO Integration Tests / test-matrix (push) Failing after 33s
UI Tests / ui-tests (22) (push) Successful in 38s
Test External API and Providers / test-external (venv) (push) Failing after 43s
Unit Tests / unit-tests (3.12) (push) Failing after 1m23s
Unit Tests / unit-tests (3.13) (push) Failing after 1m38s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m49s
Pre-commit / pre-commit (22) (push) Successful in 5m8s
# What does this PR do?
Fixing broken README links that were still pointing to the
https://llamastack.github.io/latest

Signed-off-by: Varad Ahirwadkar <varad.ahirwadkar1@ibm.com>
2025-12-04 14:33:32 -08:00
Nathan Weinberg
f14936035d
fix: runpod provider no longer crashes sans API key (#4316)
# What does this PR do?
previously the runpod provider would fail if the
RUNPOD_API_TOKEN was not set

modify the impl to default to an empty string to
align with similar providers' behavior

Closes #4296

## Test Plan
Run `uv run llama stack run --providers inference=remote::runpod` with
`RUNPOD_API_TOKEN` unset - server now boots where it previously crashed
```
INFO     2025-12-04 13:52:59,920 uvicorn.error:84 uncategorized: Started server process [233656]                        
INFO     2025-12-04 13:52:59,921 uvicorn.error:48 uncategorized: Waiting for application startup.                       
INFO     2025-12-04 13:52:59,926 llama_stack.core.server.server:168 core::server: Starting up Llama Stack server        
         (version: 0.4.0.dev0)                                                                                          
INFO     2025-12-04 13:52:59,927 llama_stack.core.stack:495 core: starting registry refresh task                        
INFO     2025-12-04 13:52:59,928 uvicorn.error:62 uncategorized: Application startup complete.                          
INFO     2025-12-04 13:52:59,929 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321      
         (Press CTRL+C to quit)
```

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-12-04 11:38:43 -08:00
Nathan Weinberg
8bbcfc4f56
fix: nvidia provider no longer crashes sans API key (#4317)
# What does this PR do?
previously the nvidia provider would throw an exception if a hosted
instance was being used but no API key was set

modify this behavior to instead log an error informing users that a key
is needed to use a hosted NIM but still allow the server to boot

Closes #4295

## Test Plan
Run `uv run llama stack run --providers inference=remote::nvidia` with
`NVIDIA_API_KEY` unset - server now boots with logged error, where it
previously crashed
```
INFO     2025-12-04 14:16:26,156 llama_stack.providers.remote.inference.nvidia.nvidia:47 inference::nvidia: Initializing
         NVIDIAInferenceAdapter(https://integrate.api.nvidia.com/v1)...                                                 
ERROR    2025-12-04 14:16:26,157 llama_stack.providers.remote.inference.nvidia.nvidia:51 inference::nvidia: API key is  
         required for hosted NVIDIA NIM. Either provide an API key or use a self-hosted NIM.                            
INFO     2025-12-04 14:16:26,239 uvicorn.error:84 uncategorized: Started server process [251651]                        
INFO     2025-12-04 14:16:26,240 uvicorn.error:48 uncategorized: Waiting for application startup.                       
INFO     2025-12-04 14:16:26,244 llama_stack.core.server.server:168 core::server: Starting up Llama Stack server        
         (version: 0.4.0.dev0)                                                                                          
INFO     2025-12-04 14:16:26,245 llama_stack.core.stack:495 core: starting registry refresh task                        
INFO     2025-12-04 14:16:26,246 uvicorn.error:62 uncategorized: Application startup complete.                          
INFO     2025-12-04 14:16:26,246 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321      
         (Press CTRL+C to quit)
```

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-12-04 11:38:16 -08:00
Derek Higgins
686065fe27
fix: access control to fail-closed when owner attributes are missing (#4273)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Python Package Build Test / build (3.12) (push) Successful in 16s
Python Package Build Test / build (3.13) (push) Successful in 17s
Vector IO Integration Tests / test-matrix (push) Failing after 35s
UI Tests / ui-tests (22) (push) Successful in 39s
Test External API and Providers / test-external (venv) (push) Failing after 44s
Unit Tests / unit-tests (3.13) (push) Failing after 1m26s
Unit Tests / unit-tests (3.12) (push) Failing after 1m28s
Pre-commit / pre-commit (22) (push) Successful in 3m28s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m12s
2025-12-04 08:38:32 -08:00
Charlie Doern
b4903d6766
fix: llama_stack_api inspect API rename (#4311)
# What does this PR do?

when publishing llama_stack_api, `inspect.py` causes issues and gets
confused to be the builtin stdlib inspect module.

This is due to the top level __init__.py we have. We need to rename
inspect.py to inspect_api.py to avoid this conflict.

Also, uv sync

1993161624
for reference .

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-12-04 10:12:55 -05:00
Bwook (Byoungwook) Kim
c4c6d39c54
feat: Implement keyword search and delete_chunk at ChromaDB (#3057)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 18s
Python Package Build Test / build (3.13) (push) Successful in 17s
Integration Tests (Replay) / generate-matrix (push) Successful in 23s
Test External API and Providers / test-external (venv) (push) Failing after 26s
Python Package Build Test / build (3.12) (push) Successful in 32s
Vector IO Integration Tests / test-matrix (push) Failing after 40s
UI Tests / ui-tests (22) (push) Successful in 44s
Unit Tests / unit-tests (3.13) (push) Failing after 1m21s
Unit Tests / unit-tests (3.12) (push) Failing after 1m39s
Pre-commit / pre-commit (22) (push) Successful in 3m23s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m8s
2025-12-04 00:59:09 -05:00
Ashwin Bharambe
c6609a84f5
fix(tests): handle http URLs as aliases for server mode (#4306)
Small fix needed for llama-stack-ops which invokes integration-tests.sh
against docker by using a `http://` URL for stack-config
2025-12-03 21:21:18 -08:00
dependabot[bot]
1d9349c8d6
chore(deps): bump next from 15.5.4 to 15.5.7 in /src/llama_stack_ui (#4305)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests (Replay) / generate-matrix (push) Successful in 4s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Python Package Build Test / build (3.12) (push) Successful in 15s
Python Package Build Test / build (3.13) (push) Successful in 19s
Vector IO Integration Tests / test-matrix (push) Failing after 31s
UI Tests / ui-tests (22) (push) Successful in 33s
Test External API and Providers / test-external (venv) (push) Failing after 48s
Unit Tests / unit-tests (3.12) (push) Failing after 1m30s
Unit Tests / unit-tests (3.13) (push) Failing after 1m31s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m58s
Pre-commit / pre-commit (22) (push) Successful in 3m40s
Bumps [next](https://github.com/vercel/next.js) from 15.5.4 to 15.5.7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vercel/next.js/releases">next's
releases</a>.</em></p>
<blockquote>
<h2>v15.5.7</h2>
<p>Please see <a
href="https://nextjs.org/blog/CVE-2025-66478">CVE-2025-66478</a> for
additional details about this release.</p>
<h2>v15.5.6</h2>
<blockquote>
<p>[!NOTE]<br />
This release is backporting bug fixes. It does <strong>not</strong>
include all pending features/changes on canary.</p>
</blockquote>
<h3>Core Changes</h3>
<ul>
<li>Turbopack: don't define process.cwd() in node_modules <a
href="https://redirect.github.com/vercel/next.js/issues/83452">#83452</a></li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/mischnic"><code>@​mischnic</code></a> for
helping!</p>
<h2>v15.5.5</h2>
<blockquote>
<p>[!NOTE]<br />
This release is backporting bug fixes. It does <strong>not</strong>
include all pending features/changes on canary.</p>
</blockquote>
<h3>Core Changes</h3>
<ul>
<li>Split code-frame into separate compiled package (<a
href="https://redirect.github.com/vercel/next.js/issues/84238">#84238</a>)</li>
<li>Add deprecation warning to Runtime config (<a
href="https://redirect.github.com/vercel/next.js/issues/84650">#84650</a>)</li>
<li>fix: unstable_cache should perform blocking revalidation during ISR
revalidation (<a
href="https://redirect.github.com/vercel/next.js/issues/84716">#84716</a>)</li>
<li>feat: <code>experimental.middlewareClientMaxBodySize</code> body
cloning limit (<a
href="https://redirect.github.com/vercel/next.js/issues/84722">#84722</a>)</li>
<li>fix: missing next/link types with typedRoutes (<a
href="https://redirect.github.com/vercel/next.js/issues/84779">#84779</a>)</li>
</ul>
<h3>Misc Changes</h3>
<ul>
<li>docs: early October improvements and fixes (<a
href="https://redirect.github.com/vercel/next.js/issues/84334">#84334</a>)</li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/devjiwonchoi"><code>@​devjiwonchoi</code></a>,
<a href="https://github.com/ztanner"><code>@​ztanner</code></a>, and <a
href="https://github.com/icyJoseph"><code>@​icyJoseph</code></a> for
helping!</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3eaf68b09b"><code>3eaf68b</code></a>
v15.5.7</li>
<li><a
href="8367ce592a"><code>8367ce5</code></a>
update version script</li>
<li><a
href="9115040008"><code>9115040</code></a>
Update React Version for Next.js 15.5.7 (<a
href="https://redirect.github.com/vercel/next.js/issues/10">#10</a>)</li>
<li><a
href="96f699902a"><code>96f6999</code></a>
update tag</li>
<li><a
href="55ef0e3ebc"><code>55ef0e3</code></a>
v15.5.6</li>
<li><a
href="92bbbb1bec"><code>92bbbb1</code></a>
Backport: don't define <code>process.cwd()</code> in node_modules (<a
href="https://redirect.github.com/vercel/next.js/issues/84957">#84957</a>)</li>
<li><a
href="f895b72762"><code>f895b72</code></a>
Fix url-imports test on 15-5 (<a
href="https://redirect.github.com/vercel/next.js/issues/84966">#84966</a>)</li>
<li><a
href="81f530db26"><code>81f530d</code></a>
v15.5.5</li>
<li><a
href="9abbc0e9eb"><code>9abbc0e</code></a>
[backport] fix: missing <code>next/link</code> types with
<code>typedRoutes</code> (<a
href="https://redirect.github.com/vercel/next.js/issues/82814">#82814</a>)
(<a
href="https://redirect.github.com/vercel/next.js/issues/84779">#84779</a>)</li>
<li><a
href="121e1b566f"><code>121e1b5</code></a>
[backport] docs: early October improvements and fixes (<a
href="https://redirect.github.com/vercel/next.js/issues/84334">#84334</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/vercel/next.js/compare/v15.5.4...v15.5.7">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=next&package-manager=npm_and_yarn&previous-version=15.5.4&new-version=15.5.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/llamastack/llama-stack/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-03 20:53:33 -05:00
Nathan Weinberg
2bdcbe7963
fix(ci): standardize CI on node 22 (#4302)
# What does this PR do?
CI was previously using both node 20 and 22
standardize on node 22

Closes #4294

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-12-03 19:10:40 -05:00
Nathan Weinberg
c57c2ae562
fix(ci): use latest version of setup-uv and remove pin (#4299)
# What does this PR do?
this commit puts aligns all 'setup-uv' instances to the latest version
and removes the pin keeping several actions on a very old version

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-12-03 14:13:10 -08:00
Nathan Weinberg
ee1e63e9b9
chore(ci): unify uv versions used in pre-commit (#4297)
# What does this PR do?
we had three different versions of uv being used
in pre-commit. bump all to the latest version.

we should probably try and find some way to automate this.

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-12-03 14:12:25 -08:00
Charlie Doern
c9b50b7e5b
fix: check if distro dirs exist before listing (#4301)
# What does this PR do?

DISTRO_DIR and DISTRIBS_BASE_DIR need to exist for them to be iterated.
our current logic allows us to iterdir without checking if they exist

## Test Plan

rm ~/.llama/distributions

```
llama stack list-deps starter --format uv | sh
Using Python 3.12.11 environment at: venv
Audited 51 packages in 12ms
Using Python 3.12.11 environment at: venv
Audited 3 packages in 2ms
Using Python 3.12.11 environment at: venv
Audited 1 package in 3ms
Using Python 3.12.11 environment at: venv
Audited 3 packages in 5ms
```

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-12-03 14:05:47 -08:00
Varsha
743683ba26
feat(qdrant): implement hybrid and keyword search support (#4006)
# What does this PR do?
-  Part of #3009
- Implement hybrid search using Qdrant's native query filtering
- Add keyword search support
- Update test suites to include qdrant for keyword and hybrid modes


<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
```
pytest -sv tests/unit/providers/vector_io/

.......
============================================================================================== slowest 10 durations ===============================================================================================
0.20s call     tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[qdrant]
0.20s call     tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[pgvector]
0.20s call     tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[sqlite_vec]
0.20s call     tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[faiss]
0.06s setup    tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_insert_chunks_with_missing_document_id[pgvector]
0.04s call     tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_tie_breaking
0.04s call     tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_weighted_reranker_parametrization
0.03s call     tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_score_selection
0.03s call     tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_edge_cases
0.03s setup    tests/unit/providers/vector_io/test_faiss.py::test_faiss_query_vector_returns_infinity_when_query_and_embedding_are_identical
======================================================================================== 180 passed, 47 warnings in 2.78s =========================================================================================
```

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
2025-12-03 16:39:01 -05:00
Derek Higgins
5873a316db
feat: Add debug logging for RBAC access control decisions (#4255)
Refactor is_action_allowed() to track decision outcome, matched rule
index, and reason. Add structured debug log output for troubleshooting
access control.

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-03 11:04:56 -08:00
Derek Higgins
fcd6370b34
fix: set SqlRecord owner to None when owner_principal is empty (#4284)
Changes SqlRecord creation in AuthorizedSqlStore.fetch_all to use
owner=None when owner_principal is empty/missing, matching the
ResourceWithOwner pattern used in routing tables. This fixes an
inconsistency where SQL store was creating User(principal="") while
routing tables use owner=None for public resources.

Changes:
o Update ProtectedResource Protocol to allow owner: User | None 
o Update SqlRecord.__init__ to accept owner: User | None 
o Update fetch_all to create owner=None for records without
owner_principal

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-03 10:28:33 -08:00
raghotham
aa3898f486
chore(cve): Update node-forge to 1.3.3 (#4289)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.12) (push) Successful in 18s
Python Package Build Test / build (3.13) (push) Successful in 19s
Test External API and Providers / test-external (venv) (push) Failing after 28s
UI Tests / ui-tests (22) (push) Successful in 33s
Vector IO Integration Tests / test-matrix (push) Failing after 40s
Unit Tests / unit-tests (3.13) (push) Failing after 1m19s
Unit Tests / unit-tests (3.12) (push) Failing after 1m46s
Pre-commit / pre-commit (push) Successful in 2m49s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m42s
https://github.com/digitalbazaar/forge/security/advisories/GHSA-554w-wpv2-vw27

Taking on a direct dependency is not great
1. We don't actually use node-forge - it's only needed by
webpack-dev-server's dependency (selfsigned) for generating self-signed
certificates during development
2. Adding a direct dependency would be misleading - it suggests our code
uses node-forge when it doesn't

In the dependency chain:

```
@docusaurus/core@3.8.1
  └─ webpack-dev-server@4.15.2
      └─ selfsigned@2.4.1
          └─ node-forge@1.3.1
```
Latest Docusaurus (3.9.2) uses webpack-dev-server 5.2.2, which still
uses selfsigned 2.4.1

So, overriding dependency on node-forge is the only option
2025-12-03 09:58:33 -08:00
Sébastien Han
3c2d74f39a
chore: bump mcp package version (#4287)
# What does this PR do?

Address

https://github.com/modelcontextprotocol/python-sdk/security/advisories/GHSA-9h52-p55h-vw2f

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-12-03 17:38:56 +01:00
Derek Higgins
8940be23c4
fix: RBAC bypass vulnerabilities in model access (#4270)
Closes security gaps where RBAC checks could be bypassed:

o Inference router: Added RBAC enforcement in the fallback
  path to ensure access control is applied consistently.

o Model listing: Dynamic models fetched via provider_data were returned
  without RBAC checks. Added filtering to ensure users only see models
  they have permission to access.

Both fixes create temporary ModelWithOwner objects for RBAC validation,
maintaining security through consistent access control enforcement.

Closes: #4269

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-03 08:42:22 -05:00
Sébastien Han
7f43051a63
feat: Implement FastAPI router system (#4191)
# What does this PR do?

This commit introduces a new FastAPI router-based system for defining
API endpoints, enabling a migration path away from the legacy @webmethod
decorator system. The implementation includes router infrastructure,
migration of the Batches API as the first example, and updates to
server, OpenAPI generation, and inspection systems to support both
routing approaches.

The router infrastructure consists of a router registry system that
allows APIs to register FastAPI router factories, which are then
automatically discovered and included in the server application.
Standard error responses are centralized in router_utils to ensure
consistent OpenAPI specification generation with proper $ref references
to component responses.

The Batches API has been migrated to demonstrate the new pattern. The
protocol definition and models remain in llama_stack_api/batches,
maintaining clear separation between API contracts and server
implementation. The FastAPI router implementation lives in
llama_stack/core/server/routers/batches, following the established
pattern where API contracts are defined in llama_stack_api and server
routing logic lives in
llama_stack/core/server.

The server now checks for registered routers before falling back to the
legacy webmethod-based route discovery, ensuring backward compatibility
during the migration period. The OpenAPI generator has been updated to
handle both router-based and webmethod-based routes, correctly
extracting metadata from FastAPI route decorators and Pydantic Field
descriptions. The inspect endpoint now includes routes from both
systems, with proper filtering for deprecated routes and API levels.

Response descriptions are now explicitly defined in router decorators,
ensuring the generated OpenAPI specification matches the previous
format. Error responses use $ref references to component responses
(BadRequest400, TooManyRequests429, etc.) as required by the
specification. This is neat and will allow us to remove a lot of boiler
plate code from our generator once the migration is done.

This implementation provides a foundation for incrementally migrating
other APIs to the router system while maintaining full backward
compatibility with existing webmethod-based APIs.

Closes: https://github.com/llamastack/llama-stack/issues/4188

## Test Plan

CI, the server should start, same routes should be visible.

```
curl http://localhost:8321/v1/inspect/routes | jq '.data[] | select(.route | contains("batches"))'
```

Also:

```
 uv run pytest tests/integration/batches/ -vv --stack-config=http://localhost:8321
================================================== test session starts ==================================================
platform darwin -- Python 3.12.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.12.8', 'Platform': 'macOS-26.0.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/leseb/Documents/AI/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 24 items                                                                                                      

tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_creation_and_retrieval[None] SKIPPED [  4%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_listing[None] SKIPPED               [  8%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_immediate_cancellation[None] SKIPPED [ 12%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_chat_completions[None] SKIPPED  [ 16%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_completions[None] SKIPPED       [ 20%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_endpoint[None] SKIPPED [ 25%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_completed[None] SKIPPED [ 29%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_missing_required_fields[None] SKIPPED [ 33%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_completion_window[None] SKIPPED [ 37%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_streaming_not_supported[None] SKIPPED [ 41%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_mixed_streaming_requests[None] SKIPPED [ 45%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_endpoint_mismatch[None] SKIPPED [ 50%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_missing_required_body_fields[None] SKIPPED [ 54%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_metadata_types[None] SKIPPED [ 58%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_embeddings[None] SKIPPED        [ 62%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_nonexistent_file_id PASSED [ 66%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_malformed_jsonl PASSED     [ 70%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[empty] XFAIL [ 75%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[malformed] XFAIL [ 79%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_retrieve_nonexistent PASSED [ 83%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_nonexistent PASSED  [ 87%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_error_handling_invalid_model PASSED [ 91%]
tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotent_batch_creation_successful PASSED [ 95%]
tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotency_conflict_with_different_params PASSED [100%]

================================================= slowest 10 durations ==================================================
1.01s call     tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotent_batch_creation_successful
0.21s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_nonexistent_file_id
0.17s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_malformed_jsonl
0.12s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_error_handling_invalid_model
0.05s setup    tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_creation_and_retrieval[None]
0.02s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[empty]
0.01s call     tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotency_conflict_with_different_params
0.01s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[malformed]
0.01s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_retrieve_nonexistent
0.00s call     tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_nonexistent
======================================= 7 passed, 15 skipped, 2 xfailed in 1.78s ========================================
```

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-12-03 12:25:54 +01:00
Adrian Cole
4237eb4aaa
feat: Add opt-in OpenTelemetry auto-instrumentation to Docker images (#4281)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.12) (push) Successful in 17s
Python Package Build Test / build (3.13) (push) Successful in 21s
Test Llama Stack Build / build-single-provider (push) Successful in 27s
Test External API and Providers / test-external (venv) (push) Failing after 28s
Vector IO Integration Tests / test-matrix (push) Failing after 37s
Test Llama Stack Build / build (push) Successful in 40s
UI Tests / ui-tests (22) (push) Successful in 1m18s
Unit Tests / unit-tests (3.12) (push) Failing after 1m50s
Unit Tests / unit-tests (3.13) (push) Failing after 2m9s
Test Llama Stack Build / build-custom-container-distribution (push) Successful in 2m41s
Test Llama Stack Build / build-ubi9-container-distribution (push) Successful in 2m51s
Pre-commit / pre-commit (push) Successful in 2m54s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m42s
# What does this PR do?

This allows llama-stack users of the Docker image to use OpenTelemetry
like previous versions.

#4127 migrated to automatic instrumentation, but unless we add those
libraries to the image, everyone needs to build a custom image to enable
otel. Also, unless we establish a convention for enabling it, users who
formerly just set config now need to override the entrypoint.

This PR bootstraps OTEL packages, so they are available (only +10MB). It
also prefixes `llama stack run` with `opentelemetry-instrument` when any
`OTEL_*` environment variable is set.

The result is implicit tracing like before, where you don't need a
custom image to use traces or metrics.

## Test Plan

```bash
# Build image
docker build -f containers/Containerfile \
  --build-arg DISTRO_NAME=starter \
  --build-arg INSTALL_MODE=editable \
  --tag llamastack/distribution-starter:otel-test .

# Run with OTEL env to implicitly use `opentelemetry-instrument`. The
# Settings below ensure inbound traces are honored, but no
# "junk traces" like SQL connects are created.
docker run -p 8321:8321 \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 \
  -e OTEL_SERVICE_NAME=llama-stack \
  -e OTEL_TRACES_SAMPLER=parentbased_traceidratio \
  -e OTEL_TRACES_SAMPLER_ARG=0.0 \
  llamastack/distribution-starter:otel-test
```

Ran a sample flight search agent which is instrumented on the client
side. This and llama-stack target
[otel-tui](https://github.com/ymtdzzz/otel-tui) I verified no root
database spans, yet database spans are attached to incoming traces.


<img width="1608" height="742" alt="screenshot"
src="https://github.com/user-attachments/assets/69f59b74-3054-42cd-947d-a6c0d9472a7c"
/>

Signed-off-by: Adrian Cole <adrian@tetrate.io>
2025-12-02 17:03:27 -08:00
Kelly Brown
e243892ef0
docs: Refine and fix nits in README (#4220)
Description: Refines and fixes some nits in the Llama stack readme
2025-12-02 13:36:29 -08:00
Derek Higgins
0b340ffd6e
fix: correct parameter names in error messages (#4268)
Error messages were using --test-setup, --test-subdirs, and --test-suite
instead of the actual parameter names: --setup, --subdirs, and --suite

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-02 13:34:54 -08:00
Derek Higgins
fbf6c30cdc
fix: call setup_logging early to apply category-specific log levels (#4253)
Category-specific log levels from LLAMA_STACK_LOGGING were not applied
to
loggers created before setup_logging() was called. This fix moves the
setup_logging() call earlier in the initialization sequence to ensure
all
loggers respect their configured levels regardless of initialization
timing.
    
    Closes: #4252

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-02 13:29:04 -08:00
Derek Higgins
2fce5abe34
fix: Add policies to adapters (#4277)
The configured policy wasn't being passed in and instead the default was
being used (e.g. in the s3 file provider)

Closes: #4276

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-02 14:08:03 -05:00
Derek Higgins
4ff0c25c52
fix(files): Enforce DELETE action permission for file deletion (#4275)
Previously, file deletion only checked READ permission via the
_lookup_file_id() method. This meant any user with READ access to a file
could also delete it, making it impossible to configure read-only file
access.

This change adds an 'action' parameter to fetch_all() and fetch_one() in
AuthorizedSqlStore, defaulting to Action.READ for backward
compatibility. The openai_delete_file() method now passes Action.DELETE,
ensuring proper RBAC enforcement.

With this fix, access policies can now distinguish between Users who can
read/list files but not delete them

Closes: #4274

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-02 09:56:59 -08:00
Omar Abdelwahab
ee107aadd6
fix(docs): Updated the LS documentation to point users to the correct docker container (#4267)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.12) (push) Successful in 16s
Python Package Build Test / build (3.13) (push) Successful in 18s
Test External API and Providers / test-external (venv) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (push) Failing after 42s
UI Tests / ui-tests (22) (push) Successful in 1m15s
Unit Tests / unit-tests (3.13) (push) Failing after 1m20s
Unit Tests / unit-tests (3.12) (push) Failing after 1m21s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m15s
Pre-commit / pre-commit (push) Successful in 3m51s
# What does this PR do?
Fixed the docker container name in the documentation by changing 
`docker pull llama-stack/distribution-starter`
`docker pull llama-stack/distribution-meta-reference-gpu`
to
`docker pull llamastack/distribution-starter`
`docker pull llamastack/distribution-meta-reference-gpu`


Closes this
[issue](https://github.com/llamastack/llama-stack/issues/4208)

## Test Plan
ci

Co-authored-by: Omar Abdelwahab <omara@fb.com>
2025-12-01 21:03:34 -08:00
Derek Higgins
9616448213
fix: use string annotations for S3Client type hints (#4242)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 4s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 5s
Test Llama Stack Build / generate-matrix (push) Successful in 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 15s
Test Llama Stack Build / build-single-provider (push) Successful in 21s
Test External API and Providers / test-external (venv) (push) Failing after 25s
Python Package Build Test / build (3.13) (push) Successful in 34s
Python Package Build Test / build (3.12) (push) Successful in 41s
Vector IO Integration Tests / test-matrix (push) Failing after 57s
UI Tests / ui-tests (22) (push) Successful in 57s
Test Llama Stack Build / build (push) Successful in 57s
Unit Tests / unit-tests (3.13) (push) Failing after 1m49s
Test Llama Stack Build / build-ubi9-container-distribution (push) Successful in 2m0s
Test Llama Stack Build / build-custom-container-distribution (push) Successful in 2m16s
Unit Tests / unit-tests (3.12) (push) Failing after 2m13s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2m20s
Pre-commit / pre-commit (push) Successful in 4m5s
fix: use string annotations for S3Client type hints
    
Remove future annotations import and use quoted string annotations for
S3Client to avoid import issues.
    
    Changes:
    o Remove __future__ annotations import
    o Use "S3Client" string annotations in type hints

closes: #4241

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-12-01 15:47:35 -08:00
Charlie Doern
aaecd0327c
feat(api): oasdiff OpenAI openAPI spec against ours (#3529)
# What does this PR do?

diff the `/v1/` routes that are OpenAI compatible against the OpenAI
openAPI spec. This will of course only trigger on PRs where the spec is
changed.

This will catch errors with new handwritten additions to our openAI
compat routes.

Instead of fetching the OpenAPI spec from a dynamic URL, which could
cause non-deterministic build failures,

this change uses a local copy stored at `docs/static/openai-spec.yml`.

This makes the conformance check fully reproducible and prevents CI
failures caused by uncontrolled upstream changes.

I am marking this test with `continue-on-error: true`, until we get rid
of all of the errors. Nevertheless, this is a nice utility to have so
folks know if their spec changes introduce more breaking changes or fix
breakages when comparing to the OpenAI openapi spec.

## Test Plan

test should pass.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-12-01 15:27:08 -08:00