Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.10
to 2.9.11.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/psycopg/psycopg2/blob/master/NEWS">psycopg2-binary's
changelog</a>.</em></p>
<blockquote>
<h2>Current release</h2>
<p>What's new in psycopg 2.9.11
^^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Add support for Python 3.14.</li>
<li>Avoid a segfault passing more arguments than placeholders if Python
is built
with assertions enabled
(🎫<code>[#1791](https://github.com/psycopg/psycopg2/issues/1791)</code>).</li>
<li><code>~psycopg2.errorcodes</code> map and
<code>~psycopg2.errors</code> classes updated to
PostgreSQL 18.</li>
<li>Drop support for Python 3.8.</li>
</ul>
<p>What's new in psycopg 2.9.10
^^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Add support for Python 3.13.</li>
<li>Receive notifications on commit
(🎫<code>[#1728](https://github.com/psycopg/psycopg2/issues/1728)</code>).</li>
<li><code>~psycopg2.errorcodes</code> map and
<code>~psycopg2.errors</code> classes updated to
PostgreSQL 17.</li>
<li>Drop support for Python 3.7.</li>
</ul>
<p>What's new in psycopg 2.9.9
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Add support for Python 3.12.</li>
<li>Drop support for Python 3.6.</li>
</ul>
<p>What's new in psycopg 2.9.8
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Wheel package bundled with PostgreSQL 16 libpq in order to add
support for
recent features, such as <code>sslcertmode</code>.</li>
</ul>
<p>What's new in psycopg 2.9.7
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Fix propagation of exceptions raised during module initialization
(🎫<code>[#1598](https://github.com/psycopg/psycopg2/issues/1598)</code>).</li>
<li>Fix building when pg_config returns an empty string
(🎫<code>[#1599](https://github.com/psycopg/psycopg2/issues/1599)</code>).</li>
<li>Wheel package bundled with OpenSSL 1.1.1v.</li>
</ul>
<p>What's new in psycopg 2.9.6
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="fd9ae8cad2"><code>fd9ae8c</code></a>
chore: bump to version 2.9.11</li>
<li><a
href="d923840546"><code>d923840</code></a>
chore: update docs requirements</li>
<li><a
href="d42dc7169d"><code>d42dc71</code></a>
Merge branch 'fix-1791'</li>
<li><a
href="4fde6560c3"><code>4fde656</code></a>
fix: avoid failed assert passing more arguments than placeholders</li>
<li><a
href="8308c19d6a"><code>8308c19</code></a>
fix: drop warning about the use of deprecated PyWeakref_GetObject
function</li>
<li><a
href="1a1eabf098"><code>1a1eabf</code></a>
build(deps): bump actions/github-script from 7 to 8</li>
<li><a
href="897af8b38b"><code>897af8b</code></a>
build(deps): bump peter-evans/repository-dispatch from 3 to 4</li>
<li><a
href="ceefd30511"><code>ceefd30</code></a>
build(deps): bump actions/checkout from 4 to 5</li>
<li><a
href="4dc585430c"><code>4dc5854</code></a>
build(deps): bump actions/setup-python from 5 to 6</li>
<li><a
href="1945788dcf"><code>1945788</code></a>
Merge pull request <a
href="https://redirect.github.com/psycopg/psycopg2/issues/1802">#1802</a>
from edgarrmondragon/cp314-wheels</li>
<li>Additional commits viewable in <a
href="https://github.com/psycopg/psycopg2/compare/2.9.10...2.9.11">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.8.0 to 7.0.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v7.0.0 🌈 node24 and a lot of bugfixes</h2>
<h2>Changes</h2>
<p>This release comes with a load of bug fixes and a speed up. Because
of switching from node20 to node24 it is also a breaking change. If you
are running on GitHub hosted runners this will just work, if you are
using self-hosted runners make sure, that your runners are up to date.
If you followed the normal installation instructions your self-hosted
runner will keep itself updated.</p>
<p>This release also removes the deprecated input
<code>server-url</code> which was used to download uv releases from a
different server.
The <a
href="https://github.com/astral-sh/setup-uv?tab=readme-ov-file#manifest-file">manifest-file</a>
input supersedes that functionality by adding a flexible way to define
available versions and where they should be downloaded from.</p>
<h3>Fixes</h3>
<ul>
<li>The action now respects when the environment variable
<code>UV_CACHE_DIR</code> is already set and does not overwrite it. It
now also finds <a
href="https://docs.astral.sh/uv/reference/settings/#cache-dir">cache-dir</a>
settings in config files if you set them.</li>
<li>Some users encountered problems that <a
href="https://github.com/astral-sh/setup-uv?tab=readme-ov-file#disable-cache-pruning">cache
pruning</a> took forever because they had some <code>uv</code> processes
running in the background. Starting with uv version <code>0.8.24</code>
this action uses <code>uv cache prune --ci --force</code> to ignore the
running processes</li>
<li>If you just want to install uv but not have it available in path,
this action now respects <code>UV_NO_MODIFY_PATH</code></li>
<li>Some other actions also set the env var <code>UV_CACHE_DIR</code>.
This action can now deal with that but as this could lead to unwanted
behavior in some edgecases a warning is now displayed.</li>
</ul>
<h3>Improvements</h3>
<p>If you are using minimum version specifiers for the version of uv to
install for example</p>
<pre lang="toml"><code>[tool.uv]
required-version = ">=0.8.17"
</code></pre>
<p>This action now detects that and directly uses the latest version.
Previously it would download all available releases from the uv repo
to determine the highest matching candidate for the version specifier,
which took much more time.</p>
<p>If you are using other specifiers like <code>0.8.x</code> this action
still needs to download all available releases because the specifier
defines an upper bound (not 0.9.0 or later) and "latest" would
possibly not satisfy that.</p>
<h2>🚨 Breaking changes</h2>
<ul>
<li>Use node24 instead of node20 <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/608">#608</a>)</li>
<li>Remove deprecated input server-url <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/607">#607</a>)</li>
</ul>
<h2>🐛 Bug fixes</h2>
<ul>
<li>Respect UV_CACHE_DIR and cache-dir <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/612">#612</a>)</li>
<li>Use --force when pruning cache <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/611">#611</a>)</li>
<li>Respect UV_NO_MODIFY_PATH <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/603">#603</a>)</li>
<li>Warn when <code>UV_CACHE_DIR</code> has changed <a
href="https://github.com/jamesbraza"><code>@jamesbraza</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/601">#601</a>)</li>
</ul>
<h2>🚀 Enhancements</h2>
<ul>
<li>Shortcut to latest version for minimum version specifier <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/598">#598</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>Bump dependencies <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/613">#613</a>)</li>
<li>Fix test-uv-no-modify-path <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/604">#604</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="eb1897b8dc"><code>eb1897b</code></a>
Bump dependencies (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/613">#613</a>)</li>
<li><a
href="d78d791822"><code>d78d791</code></a>
Bump github/codeql-action from 3.30.5 to 3.30.6 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/605">#605</a>)</li>
<li><a
href="535dc2664c"><code>535dc26</code></a>
Respect UV_CACHE_DIR and cache-dir (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/612">#612</a>)</li>
<li><a
href="f610be5ff9"><code>f610be5</code></a>
Use --force when pruning cache (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/611">#611</a>)</li>
<li><a
href="3deccc0075"><code>3deccc0</code></a>
Use node24 instead of node20 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/608">#608</a>)</li>
<li><a
href="d9ee7e2f26"><code>d9ee7e2</code></a>
Remove deprecated input server-url (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/607">#607</a>)</li>
<li><a
href="59a0868fea"><code>59a0868</code></a>
Bump github/codeql-action from 3.30.3 to 3.30.5 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/594">#594</a>)</li>
<li><a
href="c952556164"><code>c952556</code></a>
Bump <code>@renovatebot/pep440</code> from 4.2.0 to 4.2.1 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/581">#581</a>)</li>
<li><a
href="51c3328db2"><code>51c3328</code></a>
Fix test-uv-no-modify-path (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/604">#604</a>)</li>
<li><a
href="f2859da213"><code>f2859da</code></a>
Respect UV_NO_MODIFY_PATH (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/603">#603</a>)</li>
<li>Additional commits viewable in <a
href="d0cc045d04...eb1897b8dc">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
Removes VectorDBs from API surface and our tests.
Moves tests to Vector Stores.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Allows passing through extra_body parameters to inference providers.
With this, we removed the 2 vllm-specific parameters from completions
API into `extra_body`.
Before/After
<img width="1883" height="324" alt="image"
src="https://github.com/user-attachments/assets/acb27c08-c748-46c9-b1da-0de64e9908a1"
/>
closes#2720
## Test Plan
CI and added new test
```
❯ uv run pytest -s -v tests/integration/ --stack-config=server:starter --inference-mode=record -k 'not( builtin_tool or safety_with_image or code_interpreter or test_rag ) and test_openai_completion_guided_choice' --setup=vllm --suite=base --color=yes
Uninstalled 3 packages in 125ms
Installed 3 packages in 19ms
INFO 2025-10-10 14:29:54,317 tests.integration.conftest:118 tests: Applying setup 'vllm' for suite base
INFO 2025-10-10 14:29:54,331 tests.integration.conftest:47 tests: Test stack config type: server
(stack_config=server:starter)
============================================================================================================== test session starts ==============================================================================================================
platform darwin -- Python 3.12.11, pytest-8.4.2, pluggy-1.6.0 -- /Users/erichuang/projects/llama-stack-1/.venv/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/erichuang/projects/llama-stack-1
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 285 items / 284 deselected / 1 selected
tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
instantiating llama_stack_client
Starting llama stack server with config 'starter' on port 8321...
Waiting for server at http://localhost:8321... (0.0s elapsed)
Waiting for server at http://localhost:8321... (0.5s elapsed)
Waiting for server at http://localhost:8321... (5.1s elapsed)
Waiting for server at http://localhost:8321... (5.6s elapsed)
Waiting for server at http://localhost:8321... (10.1s elapsed)
Waiting for server at http://localhost:8321... (10.6s elapsed)
Server is ready at http://localhost:8321
llama_stack_client instantiated in 11.773s
PASSEDTerminating llama stack server process...
Terminating process 98444 and its group...
Server process and children terminated gracefully
============================================================================================================= slowest 10 durations ==============================================================================================================
11.88s setup tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
3.02s call tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
0.01s teardown tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
================================================================================================ 1 passed, 284 deselected, 3 warnings in 16.21s =================================================================================================
```
# What does this PR do?
Converts openai(_chat)_completions params to pydantic BaseModel to
reduce code duplication across all providers.
## Test Plan
CI
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3761).
* #3777
* __->__ #3761
The AuthenticationMiddleware was blocking all requests without an
Authorization header, including health and version endpoints that are
needed by monitoring tools, load balancers, and Kubernetes probes.
This commit allows endpoints ending in /health or /version to bypass
authentication, enabling operational tooling to function properly
without requiring credentials.
Closes: #3735
Signed-off-by: Derek Higgins <derekh@redhat.com>
Implementats usage accumulation to StreamingResponseOrchestrator.
The most important part was to pass `stream_options = { "include_usage":
true }` to the chat_completion call. This means I will have to record
all responses tests again because request hash will change :)
Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
There are many changes to responses which are landing. They are
introducing fundamental new types. This means re-recordings even from
the inference calls. Let's avoid that for now.
Once everything lands I will re-record everything, make things pass and
re-enable.