Bumps [@radix-ui/react-select](https://github.com/radix-ui/primitives)
from 2.2.5 to 2.2.6.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
the recorder mocks the openai-python interface, which allows NOT_GIVEN
as an input option. this change properly handles NOT_GIVEN.
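A minimal sketch of the idea, assuming the recorder filters request
kwargs before hashing/storing them (the helper name is illustrative, not
the actual recorder API):
```python
from openai import NOT_GIVEN

def normalize_params(params: dict) -> dict:
    # Drop NOT_GIVEN values so recorded requests hash and replay consistently.
    return {k: v for k, v in params.items() if v is not NOT_GIVEN}

print(normalize_params({"model": "gpt-4o", "temperature": NOT_GIVEN}))
# -> {'model': 'gpt-4o'}
```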
## Test Plan
ci (coverage for chat, completions, embeddings)
# What does this PR do?
Migrates MD5 and SHA-1 hash algorithms to SHA-256.
In particular, replaces:
- MD5 in chunk ID generation.
- MD5 in file verification.
- SHA-1 in model identifier digests.
And updates all related test expectations.
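As a rough illustration of the chunk ID change (the function name and
input format here are hypothetical, not the repo's actual helper):
```python
import hashlib

def generate_chunk_id(document_id: str, chunk_text: str) -> str:
    # A SHA-256 digest replaces the previous MD5-based chunk ID.
    return hashlib.sha256(f"{document_id}:{chunk_text}".encode()).hexdigest()

print(generate_chunk_id("doc-1", "hello world"))
```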
Original discussion:
https://github.com/llamastack/llama-stack/discussions/3413
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3424.
## Test Plan
Unit tests from scripts/unit-tests.sh were updated to match the new hash
output, and ran to verify the tests pass.
Signed-off-by: Doug Edgar <dedgar@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
fix: Improve pre-commit workflow error handling and feedback
- Add explicit step to check pre-commit results and provide clear error
messages
- Improve verification steps with better error messages and file
listings
- Use GitHub Actions annotations (::error:: and ::warning::) for better
visibility
- Maintain continue-on-error for pre-commit step but add proper failure
handling
This addresses the issue where pre-commit failures were silent but still
caused workflow failures later, making it difficult to understand what
needed to be fixed.
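For reference, GitHub Actions turns ::error::/::warning:: lines printed
to stdout into annotations, so a check step can fail loudly with
pointers to the offending files. A hedged sketch (illustrative, not the
workflow's actual code):
```python
import sys

def report_precommit_result(failed_files: list[str]) -> None:
    # Lines of the form "::error file=PATH::MESSAGE" become file annotations.
    if failed_files:
        for path in failed_files:
            print(f"::error file={path}::pre-commit modified or flagged this file")
        sys.exit(1)  # fail the step with a clear message instead of silently
    print("::notice::pre-commit passed cleanly")

report_precommit_result([])  # example invocation with no failures
```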
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
# What does this PR do?
only run conformance tests when the spec is changed.
Also, cache oasdiff such that it is not installed every time the test is
run
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The notebook was
reverted (https://github.com/llamastack/llama-stack/pull/3259) as it had
some local paths I missed correcting. Trying again with corrections now.
## Test Plan
Ran the Jupyter notebook
# What does this PR do?
update the async detection test for vllm
- remove a network access from unit tests
- remove direct logging use
the idea behind the test is to mock inference with a sleep, initiate
concurrent inference calls, and verify the total execution time is close
to the sleep time. in a non-async environment the total time would be
closer to sleep * num_concurrent_calls.
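a self-contained sketch of that timing check (not the repo's actual test
code):
```python
import asyncio
import time

SLEEP, N = 0.2, 5  # mock latency and number of concurrent calls

async def mock_inference() -> str:
    await asyncio.sleep(SLEEP)  # stands in for a model call
    return "ok"

async def main() -> None:
    start = time.perf_counter()
    await asyncio.gather(*(mock_inference() for _ in range(N)))
    elapsed = time.perf_counter() - start
    # async: elapsed ~= SLEEP; serial: elapsed ~= SLEEP * N
    assert elapsed < SLEEP * 2, f"calls appear to run serially: {elapsed:.2f}s"

asyncio.run(main())
```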
## Test Plan
ci
# What does this PR do?
update vLLM inference provider to use OpenAIMixin for openai-compat
functions
inference recordings from Qwen3-0.6B and vLLM 0.8.3 -
```
docker run --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -p 8000:8000 --ipc=host \
vllm/vllm-openai:latest \
--model Qwen/Qwen3-0.6B --enable-auto-tool-choice --tool-call-parser hermes
```
## Test Plan
```
./scripts/integration-tests.sh --stack-config server:ci-tests --setup vllm --subdirs inference
```
# What does this PR do?
- Updating documentation on migration from RAG Tool to Vector Stores and
Files APIs
- Adding exception handling for Vector Stores in RAG Tool
- Add more tests on migration from RAG Tool to Vector Stores
- Migrate off of inference_api for context_retriever for RAG
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Integration and unit tests added
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
some providers do not produce spec compliant outputs. when this happens
the replay infra will fail to construct the proper types and will return
a dict to the client. the client likely does not expect a dict.
this was discovered with tgi, which returns finish_reason="" when valid
values are "stop", "length" or "content_filter"
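one possible shape of the fix, sketched below (the coercion target and
helper name are assumptions, not necessarily what the replay infra
does):
```python
VALID_FINISH_REASONS = {"stop", "length", "content_filter"}

def sanitize_finish_reason(value: str | None) -> str:
    # Coerce non-compliant values (e.g. tgi's "") to a valid literal so the
    # typed response model can still be constructed.
    return value if value in VALID_FINISH_REASONS else "stop"

assert sanitize_finish_reason("") == "stop"
assert sanitize_finish_reason("length") == "length"
```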
## Test Plan
ci
Fixes #3370
AWS switched to requiring region-prefixed inference profile IDs instead
of foundation model IDs for on-demand throughput. This was causing
ValidationException errors.
Added auto-detection based on boto3 client region to convert model IDs
like meta.llama3-1-70b-instruct-v1:0 to
us.meta.llama3-1-70b-instruct-v1:0 depending on the detected region.
Also handles edge cases like ARNs, case insensitive regions, and None
regions.
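A hedged sketch of that auto-detection (the provider's real helper may
differ in name and edge-case handling):
```python
def to_inference_profile_id(model_id: str, region: str | None) -> str:
    # ARNs and already-prefixed IDs pass through unchanged.
    if model_id.startswith("arn:") or model_id.split(".")[0] in {"us", "eu", "apac"}:
        return model_id
    if region is None:
        return model_id  # no region detected; leave the ID alone
    region = region.lower()  # regions are matched case-insensitively
    for aws_prefix, profile_prefix in (("us-", "us"), ("eu-", "eu"), ("ap-", "apac")):
        if region.startswith(aws_prefix):
            return f"{profile_prefix}.{model_id}"
    return model_id

assert to_inference_profile_id("meta.llama3-1-70b-instruct-v1:0", "US-EAST-1") == (
    "us.meta.llama3-1-70b-instruct-v1:0"
)
```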
Tested with this request.
```json
{
  "model_id": "meta.llama3-1-8b-instruct-v1:0",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "tell me a riddle"
    }
  ],
  "sampling_params": {
    "strategy": {
      "type": "top_p",
      "temperature": 0.7,
      "top_p": 0.9
    },
    "max_tokens": 512
  }
}
```
<img width="1488" height="878" alt="image"
src="https://github.com/user-attachments/assets/0d61beec-3869-4a31-8f37-9f554c280b88"
/>
# What does this PR do?
The openai package is already a dependency of the llama-stack project
itself, so let the project dictate which openai version we need and
avoid potential breakage with unsatisfiable dependency resolution.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Duplicate chat completion IDs can be generated during tests, especially
when replaying recorded responses across different tests. No need to
warn or error under those circumstances. In the wild, this is unlikely
to happen at all (no evidence), so we aren't really hiding any problem.
Bumps [openai](https://github.com/openai/openai-python) from 1.102.0 to
1.106.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/releases">openai's
releases</a>.</em></p>
<blockquote>
<h2>v1.106.1</h2>
<h2>1.106.1 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.106.0...v1.106.1">v1.106.0...v1.106.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>internal:</strong> move mypy configurations to
<code>pyproject.toml</code> file (<a
href="ca413a2774">ca413a2</a>)</li>
</ul>
<h2>v1.106.0</h2>
<h2>1.106.0 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.105.0...v1.106.0">v1.105.0...v1.106.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>client:</strong> support callable api_key (<a
href="https://redirect.github.com/openai/openai-python/issues/2588">#2588</a>)
(<a
href="e1bad015b8">e1bad01</a>)</li>
<li>improve future compat with pydantic v3 (<a
href="6645d9317a">6645d93</a>)</li>
</ul>
<h2>v1.105.0</h2>
<h2>1.105.0 (2025-09-03)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.2...v1.105.0">v1.104.2...v1.105.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Add gpt-realtime models (<a
href="8502041480">8502041</a>)</li>
</ul>
<h2>v1.104.2</h2>
<h2>1.104.2 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.1...v1.104.2">v1.104.1...v1.104.2</a></p>
<h3>Bug Fixes</h3>
<ul>
<li><strong>types:</strong> add aliases back for web search tool types
(<a
href="2521cd8445">2521cd8</a>)</li>
</ul>
<h2>v1.104.1</h2>
<h2>1.104.1 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.0...v1.104.1">v1.104.0...v1.104.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>api:</strong> manual updates for ResponseInputAudio (<a
href="0db5061966">0db5061</a>)</li>
</ul>
<h2>v1.104.0</h2>
<h2>1.104.0 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.103.0...v1.104.0">v1.103.0...v1.104.0</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/blob/main/CHANGELOG.md">openai's
changelog</a>.</em></p>
<blockquote>
<h2>1.106.1 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.106.0...v1.106.1">v1.106.0...v1.106.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>internal:</strong> move mypy configurations to
<code>pyproject.toml</code> file (<a
href="ca413a2774">ca413a2</a>)</li>
</ul>
<h2>1.106.0 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.105.0...v1.106.0">v1.105.0...v1.106.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>client:</strong> support callable api_key (<a
href="https://redirect.github.com/openai/openai-python/issues/2588">#2588</a>)
(<a
href="e1bad015b8">e1bad01</a>)</li>
<li>improve future compat with pydantic v3 (<a
href="6645d9317a">6645d93</a>)</li>
</ul>
<h2>1.105.0 (2025-09-03)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.2...v1.105.0">v1.104.2...v1.105.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Add gpt-realtime models (<a
href="8502041480">8502041</a>)</li>
</ul>
<h2>1.104.2 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.1...v1.104.2">v1.104.1...v1.104.2</a></p>
<h3>Bug Fixes</h3>
<ul>
<li><strong>types:</strong> add aliases back for web search tool types
(<a
href="2521cd8445">2521cd8</a>)</li>
</ul>
<h2>1.104.1 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.0...v1.104.1">v1.104.0...v1.104.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>api:</strong> manual updates for ResponseInputAudio (<a
href="0db5061966">0db5061</a>)</li>
</ul>
<h2>1.104.0 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.103.0...v1.104.0">v1.103.0...v1.104.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>types:</strong> replace List[str] with SequenceNotStr in
params (<a
href="bc00bda880">bc00bda</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2adf111129"><code>2adf111</code></a>
release: 1.106.1</li>
<li><a
href="c4f9d0b997"><code>c4f9d0b</code></a>
chore(internal): move mypy configurations to <code>pyproject.toml</code>
file</li>
<li><a
href="2de8d7cde5"><code>2de8d7c</code></a>
release: 1.106.0</li>
<li><a
href="2cf4ed5072"><code>2cf4ed5</code></a>
feat: improve future compat with pydantic v3</li>
<li><a
href="25d16be18b"><code>25d16be</code></a>
feat(client): support callable api_key (<a
href="https://redirect.github.com/openai/openai-python/issues/2588">#2588</a>)</li>
<li><a
href="8672413735"><code>8672413</code></a>
release: 1.105.0</li>
<li><a
href="2c60d78b37"><code>2c60d78</code></a>
feat(api): Add gpt-realtime models</li>
<li><a
href="a52463c932"><code>a52463c</code></a>
release: 1.104.2</li>
<li><a
href="5a6931dafd"><code>5a6931d</code></a>
fix(types): add aliases back for web search tool types</li>
<li><a
href="fb152d967e"><code>fb152d9</code></a>
release: 1.104.1</li>
<li>Additional commits viewable in <a
href="https://github.com/openai/openai-python/compare/v1.102.0...v1.106.1">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [locust](https://github.com/locustio/locust) from 2.39.1 to
2.40.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/releases">locust's
releases</a>.</em></p>
<blockquote>
<h2>2.40.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Pytest plugin: Delay imports to avoid monkey patching until someone
uses the fixtures by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3204">locustio/locust#3204</a></li>
<li>Move pytest plugin to its own directory, to prevent accidental
import by <a href="https://github.com/cyberw"><code>@cyberw</code></a>
in <a
href="https://redirect.github.com/locustio/locust/pull/3205">locustio/locust#3205</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.40.0...2.40.1">https://github.com/locustio/locust/compare/2.40.0...2.40.1</a></p>
<h2>2.40.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Refactor FastHttpSession to be more like HttpSession by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3198">locustio/locust#3198</a></li>
<li>Update Dockerfile base to Python 3.13 by <a
href="https://github.com/adaamz"><code>@adaamz</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3193">locustio/locust#3193</a></li>
<li>Avoid exception in HttpUser if requests has lost track of the
request it made by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3201">locustio/locust#3201</a></li>
<li>Support pytests as locustfiles by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3200">locustio/locust#3200</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/adaamz"><code>@adaamz</code></a> made
their first contribution in <a
href="https://redirect.github.com/locustio/locust/pull/3193">locustio/locust#3193</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.39.1...2.40.0">https://github.com/locustio/locust/compare/2.39.1...2.40.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/blob/master/CHANGELOG.md">locust's
changelog</a>.</em></p>
<blockquote>
<h1>Detailed changelog</h1>
<p>The most important changes can also be found in <a
href="https://docs.locust.io/en/latest/changelog.html">the
documentation</a>.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5df19da06a"><code>5df19da</code></a>
Merge pull request <a
href="https://redirect.github.com/locustio/locust/issues/3205">#3205</a>
from locustio/move-pytest-plugin-to-own-directory</li>
<li><a
href="d41141bedd"><code>d41141b</code></a>
Move pytest plugin to its own directory, to prevent accidental import of
locu...</li>
<li><a
href="6422848afd"><code>6422848</code></a>
mention that only one locustfile can be distributed</li>
<li><a
href="aa3da739fe"><code>aa3da73</code></a>
Merge pull request <a
href="https://redirect.github.com/locustio/locust/issues/3204">#3204</a>
from locustio/delay-imports-in-pytest-plugin-to-avoi...</li>
<li><a
href="12050dedfd"><code>12050de</code></a>
Pytest plugin: Delay imports to avoid monkey patching until someone
actually ...</li>
<li><a
href="488d1f8491"><code>488d1f8</code></a>
docs</li>
<li><a
href="439b7ab91b"><code>439b7ab</code></a>
docs fix</li>
<li><a
href="fcd76a8ac3"><code>fcd76a8</code></a>
docs: rephrase</li>
<li><a
href="70c7e9b2d8"><code>70c7e9b</code></a>
docs: move pytest further up</li>
<li><a
href="06dbf98013"><code>06dbf98</code></a>
docs: fix link</li>
<li>Additional commits viewable in <a
href="https://github.com/locustio/locust/compare/2.39.1...2.40.1">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.4.1 to
8.4.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest/releases">pytest's
releases</a>.</em></p>
<blockquote>
<h2>8.4.2</h2>
<h1>pytest 8.4.2 (2025-09-03)</h1>
<h2>Bug fixes</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13478">#13478</a>:
Fixed a crash when using
<code>console_output_style</code> with <code>times</code> and a module is
skipped.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13530">#13530</a>:
Fixed a crash when using <code>pytest.approx</code> and
<code>decimal.Decimal</code> instances with the
<code>decimal.FloatOperation</code> trap set.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13549">#13549</a>:
No longer evaluate type annotations in Python <code>3.14</code> when
inspecting function signatures.</p>
<p>This prevents crashes during module collection when modules do not
explicitly use <code>from __future__ import annotations</code> and
import types for annotations within a <code>if TYPE_CHECKING:</code>
block.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13559">#13559</a>:
Added missing <code>int</code> and <code>float</code> variants to the
<code>Literal</code> type annotation of the <code>type</code>
parameter in <code>pytest.Parser.addini</code>.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13563">#13563</a>:
<code>pytest.approx</code> now
only imports <code>numpy</code> if NumPy is already in
<code>sys.modules</code>. This fixes unconditional import behavior
introduced in <code>8.4.0</code>.</p>
</li>
</ul>
<h2>Improved documentation</h2>
<ul>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13577">#13577</a>:
Clarify that <code>pytest_generate_tests</code> is discovered in test
modules/classes; other hooks must be in <code>conftest.py</code> or
plugins.</li>
</ul>
<h2>Contributor-facing changes</h2>
<ul>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13480">#13480</a>:
Self-testing: fixed a few test failures when run with
<code>-Wdefault</code> or a similar override.</li>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13547">#13547</a>:
Self-testing: corrected expected message for
<code>test_doctest_unexpected_exception</code> in Python
<code>3.14</code>.</li>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13684">#13684</a>:
Make pytest's own testsuite insensitive to the presence of the
<code>CI</code> environment variable -- by
<code>ogrisel</code>.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bfae4224fd"><code>bfae422</code></a>
Prepare release version 8.4.2</li>
<li><a
href="89905381a1"><code>8990538</code></a>
Fix passenv CI in tox ini and make tests insensitive to the presence of
the C...</li>
<li><a
href="ca676bfe00"><code>ca676bf</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13687">#13687</a>
from pytest-dev/patchback/backports/8.4.x/e63f6e51c...</li>
<li><a
href="975a60a63c"><code>975a60a</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13686">#13686</a>
from pytest-dev/patchback/backports/8.4.x/12bde8af6...</li>
<li><a
href="7723ce84b8"><code>7723ce8</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13683">#13683</a>
from even-even/fix_Exeption_to_Exception_in_errorMe...</li>
<li><a
href="b7f05680d1"><code>b7f0568</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13685">#13685</a>
from CoretexShadow/fix/docs-pytest-generate-tests</li>
<li><a
href="2c94c4a694"><code>2c94c4a</code></a>
add missing colon (<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13640">#13640</a>)
(<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13641">#13641</a>)</li>
<li><a
href="c3d7684bc0"><code>c3d7684</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13606">#13606</a>
from pytest-dev/patchback/backports/8.4.x/5f9938563...</li>
<li><a
href="dc6e3be2dd"><code>dc6e3be</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13605">#13605</a>
from The-Compiler/training-update-2025-07</li>
<li><a
href="f87289c36c"><code>f87289c</code></a>
Fix crash with <code>times</code> output style and skipped module (<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13573">#13573</a>)
(<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13579">#13579</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest/compare/8.4.1...8.4.2">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
* Adds a horizontal nav bar for easy access to the API reference and the
Llama Stack GitHub repo
<img width="2696" height="520" alt="image"
src="https://github.com/user-attachments/assets/82daffe1-c206-4e20-b95b-1e090011eecc"
/>
## Test Plan
* Built the docs and ran the local HTML server to verify changes
# What does this PR do?
Adds a write worker queue for writes to inference store. This avoids
overwhelming request processing with slow inference writes.
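A minimal sketch of the pattern (names are illustrative; this assumes
construction happens inside the server's running event loop):
```python
import asyncio

class InferenceWriteQueue:
    """Buffers inference-store writes so request handling never blocks on them."""

    def __init__(self, store) -> None:
        self.store = store
        self.queue: asyncio.Queue = asyncio.Queue()
        # Drain in the background; create_task requires a running loop.
        self._worker = asyncio.create_task(self._drain())

    async def _drain(self) -> None:
        while True:
            record = await self.queue.get()
            await self.store.write(record)  # slow write happens off the hot path
            self.queue.task_done()

    def enqueue(self, record) -> None:
        self.queue.put_nowait(record)  # returns immediately to the request path
```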
## Test Plan
Benchmark:
```
cd /docs/source/distributions/k8s-benchmark
# start mock server
python openai-mock-server.py --port 8000
# start stack server
LLAMA_STACK_LOGGING="all=WARNING" uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml
# run benchmark script
uv run python3 benchmark.py --duration 120 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```
## RPS from 21 -> 57
# What does this PR do?
- Use BackgroundLogger when logging metric events.
- Reuse event loop in BackgroundLogger
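Sketched, the loop-reuse idea looks roughly like this (illustrative, not
the actual BackgroundLogger implementation):
```python
import asyncio

class BackgroundLogger:
    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        # Capture the running loop once and reuse it for every metric event,
        # instead of creating a fresh loop per call.
        self._loop = asyncio.get_running_loop()
        self._loop.create_task(self._drain())

    def log_metric(self, event: dict) -> None:
        # Thread-safe handoff onto the cached loop; returns immediately.
        self._loop.call_soon_threadsafe(self._queue.put_nowait, event)

    async def _drain(self) -> None:
        while True:
            event = await self._queue.get()
            print("metric:", event)  # stand-in for the real telemetry sink
```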
## Test Plan
```
cd /docs/source/distributions/k8s-benchmark
# start mock server
python openai-mock-server.py --port 8000
# start stack server
LLAMA_STACK_LOGGING="all=WARNING" uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml
# run benchmark script
uv run python3 benchmark.py --duration 120 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```
### RPS from 57 -> 62
# What does this PR do?
Fix fireworks chat completion broken due to telemetry expecting
response.usage
Closes https://github.com/llamastack/llama-stack/issues/3391
## Test Plan
1. `uv run --with llama-stack llama stack build --distro starter
--image-type venv --run`
Try
```
curl -X POST http://0.0.0.0:8321/v1/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
```
{"id":"chatcmpl-ee922a08-0df0-4974-b0d3-b322113e8bc0","choices":[{"message":{"role":"assistant","content":"Hello! How can I assist you today?","name":null,"tool_calls":null},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","created":1757456375,"model":"fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct"}%
```
Without fix fails as mentioned in
https://github.com/llamastack/llama-stack/issues/3391
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
update VertexAI inference provider to use openai-python for
openai-compat functions
## Test Plan
```
$ VERTEX_AI_PROJECT=... uv run llama stack build --image-type venv --providers inference=remote::vertexai --run
...
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model vertexai/vertex_ai/gemini-2.5-flash tests/integration/inference/test_openai_completion.py
...
```
I don't have an account to test this. `get_api_key` may also need to be
updated per
https://cloud.google.com/vertex-ai/generative-ai/docs/start/openai
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Fix pre-commit issues: non executable shebang file, @pytest.mark.asyncio
decorator
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The test_query_adds_vector_db_id_to_chunk_metadata test was failing
because MemoryToolRuntimeImpl.__init__() now requires a files_api
parameter.
Fixes failing unit tests for Python 3.12 and 3.13.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
When running RAG in a multi vector DB setting, it can be difficult to
trace where retrieved chunks originate from. This PR adds the
`vector_db_id` into each chunk’s metadata, making it easier to
understand which database a given chunk came from. This is helpful for
debugging and for analyzing retrieval behavior of multiple DBs.
Relevant code:
```python
for vector_db_id, result in zip(vector_db_ids, results):
    for chunk, score in zip(result.chunks, result.scores):
        # Tag each retrieved chunk with the vector DB it came from.
        if not hasattr(chunk, "metadata") or chunk.metadata is None:
            chunk.metadata = {}
        chunk.metadata["vector_db_id"] = vector_db_id
        chunks.append(chunk)
        scores.append(score)
```
## Test Plan
* Ran Llama Stack in debug mode.
* Verified that `vector_db_id` was added to each chunk’s metadata.
* Confirmed that the metadata was printed in the console when using the
RAG tool.
---------
Co-authored-by: are-ces <cpompeia@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
enables completions storage when using `llama stack build --providers`:
- GET /v1/chat/completions
- GET /v1/chat/completions/{id}
todo: llama stack build and distro codegen should use the same code
paths
## Test Plan
ci
This PR refactors the integration test system to use global "setups"
which provides better separation of concerns:
**suites = what to test, setups = how to configure.**
NOTE: if you have naming suggestions, please provide feedback
Changes:
- New `tests/integration/setups.py` with global, reusable configurations
(ollama, vllm, gpt, claude)
- Modified `scripts/integration-tests.sh` options to match with the
underlying pytest options
- Updated documentation to reflect the new global setup system
The main benefit is that setups can be reused across multiple suites
(e.g., use "gpt" with any suite), even though sometimes they may be
specifically tailored to a suite (vision <> ollama-vision). It is now
easier to add new configurations without modifying existing suites.
Usage examples:
- `pytest tests/integration --suite=responses --setup=gpt`
- `pytest tests/integration --suite=vision` # auto-selects
"ollama-vision" setup
- `pytest tests/integration --suite=base --setup=vllm`
# What does this PR do?
This PR adds support for OpenAI Prompts API.
Note, OpenAI does not explicitly expose the Prompts API but instead
makes it available in the Responses API and in the [Prompts
Dashboard](https://platform.openai.com/docs/guides/prompting#create-a-prompt).
I have added the following APIs:
- CREATE
- GET
- LIST
- UPDATE
- Set Default Version
The Set Default Version API is made available only in the Prompts
Dashboard and configures which prompt version is returned in the GET
(the latest version is the default).
Overall, the expected functionality in Responses will look like this:
```python
from openai import OpenAI

client = OpenAI()
response = client.responses.create(
    prompt={
        "id": "pmpt_68b0c29740048196bd3a6e6ac3c4d0e20ed9a13f0d15bf5e",
        "version": "2",
        "variables": {
            "city": "San Francisco",
            "age": 30,
        },
    }
)
```
### Resolves https://github.com/llamastack/llama-stack/issues/3276
## Test Plan
Unit tests added. Integration tests can be added after client
generation.
## Next Steps
1. Update Responses API to support Prompt API
2. I'll enhance the UI to implement the Prompt Dashboard.
3. Add cache for lower latency
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Add Kubernetes authentication provider support
- Add KubernetesAuthProvider class for token validation using Kubernetes
SelfSubjectReview API
- Add KubernetesAuthProviderConfig with configurable API server URL, TLS
settings, and claims mapping
- Implement authentication via POST requests to
/apis/authentication.k8s.io/v1/selfsubjectreviews endpoint
- Add support for parsing Kubernetes SelfSubjectReview response format
to extract user information
- Add KUBERNETES provider type to AuthProviderType enum
- Update create_auth_provider factory function to handle 'kubernetes'
provider type
- Add comprehensive unit tests for KubernetesAuthProvider functionality
- Add documentation with configuration examples and usage instructions
The provider validates tokens by sending SelfSubjectReview requests to
the Kubernetes API server and extracts user information from the
userInfo structure in the response.
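A hedged sketch of that validation flow using httpx (the provider's
actual class structure differs; this only shows the request/response
shape described above):
```python
import httpx

async def validate_token(api_server_url: str, token: str) -> dict:
    """POST a SelfSubjectReview and return userInfo if the token is valid."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"{api_server_url}/apis/authentication.k8s.io/v1/selfsubjectreviews",
            json={"apiVersion": "authentication.k8s.io/v1", "kind": "SelfSubjectReview"},
            headers={"Authorization": f"Bearer {token}"},
        )
    if resp.status_code != 201:  # create returns 201 on success
        raise ValueError(f"token rejected by API server: {resp.status_code}")
    return resp.json()["status"]["userInfo"]
```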
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
What this verifies:
- Authentication header validation
- Token validation via the Kubernetes SelfSubjectReview API endpoint
- Error handling for invalid tokens and HTTP errors
- Request payload structure and headers
```
python -m pytest tests/unit/server/test_auth.py -k "kubernetes" -v
```
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>