Bumps [requests](https://github.com/psf/requests) from 2.32.4 to 2.32.5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/psf/requests/releases">requests's
releases</a>.</em></p>
<blockquote>
<h2>v2.32.5</h2>
<h2>2.32.5 (2025-08-18)</h2>
<p><strong>Bugfixes</strong></p>
<ul>
<li>The SSLContext caching feature originally introduced in 2.32.0 has
created
a new class of issues in Requests that have had negative impact across a
number
of use cases. The Requests team has decided to revert this feature as
long term
maintenance of it is proving to be unsustainable in its current
iteration.</li>
</ul>
<p><strong>Deprecations</strong></p>
<ul>
<li>Added support for Python 3.14.</li>
<li>Dropped support for Python 3.8 following its end of support.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/psf/requests/blob/main/HISTORY.md">requests's
changelog</a>.</em></p>
<blockquote>
<h2>2.32.5 (2025-08-18)</h2>
<p><strong>Bugfixes</strong></p>
<ul>
<li>The SSLContext caching feature originally introduced in 2.32.0 has
created
a new class of issues in Requests that have had negative impact across a
number
of use cases. The Requests team has decided to revert this feature as
long term
maintenance of it is proving to be unsustainable in its current
iteration.</li>
</ul>
<p><strong>Deprecations</strong></p>
<ul>
<li>Added support for Python 3.14.</li>
<li>Dropped support for Python 3.8 following its end of support.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b25c87d7cb"><code>b25c87d</code></a>
v2.32.5</li>
<li><a
href="131e506079"><code>131e506</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/7010">#7010</a>
from psf/dependabot/github_actions/actions/checkout-...</li>
<li><a
href="b336cb2bc6"><code>b336cb2</code></a>
Bump actions/checkout from 4.2.0 to 5.0.0</li>
<li><a
href="46e939b552"><code>46e939b</code></a>
Update publish workflow to use <code>artifact-id</code> instead of
<code>name</code></li>
<li><a
href="4b9c546aa3"><code>4b9c546</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/6999">#6999</a>
from psf/dependabot/github_actions/step-security/har...</li>
<li><a
href="7618dbef01"><code>7618dbe</code></a>
Bump step-security/harden-runner from 2.12.0 to 2.13.0</li>
<li><a
href="2edca11103"><code>2edca11</code></a>
Add support for Python 3.14 and drop support for Python 3.8 (<a
href="https://redirect.github.com/psf/requests/issues/6993">#6993</a>)</li>
<li><a
href="fec96cd597"><code>fec96cd</code></a>
Update Makefile rules (<a
href="https://redirect.github.com/psf/requests/issues/6996">#6996</a>)</li>
<li><a
href="d58d8aa2f4"><code>d58d8aa</code></a>
docs: clarify timeout parameter uses seconds in Session.request (<a
href="https://redirect.github.com/psf/requests/issues/6994">#6994</a>)</li>
<li><a
href="91a3eabd3d"><code>91a3eab</code></a>
Bump github/codeql-action from 3.28.5 to 3.29.0</li>
<li>Additional commits viewable in <a
href="https://github.com/psf/requests/compare/v2.32.4...v2.32.5">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Uses test_id in request hashes and test-scoped subdirectories to prevent
cross-test contamination. Model list endpoints exclude test_id to enable
merging recordings from different servers.
Additionally, this PR adds a `record-if-missing` mode (which we will use
instead of `record` which records everything) which is very useful.
🤖 Co-authored with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
# What does this PR do?
Added missing configuration files
## Test Plan
run ./scripts/telemetry/setup_telemetry.sh
```
OTEL_SERVICE_NAME=llama_stack OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 TELEMETRY_SINKS=otel_trace,otel_metric uv run --with llama-stack llama stack build --distro=starter --image-type=venv --run
```
Navigate to grafana localhost:3000, query metrics and traces
IDs are now deterministic hashes based on request content, and
timestamps are normalized to constants, eliminating spurious changes
when re-recording tests.
## Changes
- Updated `inference_recorder.py` to normalize IDs and timestamps during
recording
- Added `scripts/normalize_recordings.py` utility to re-normalize
existing recordings
- Created documentation in `tests/integration/recordings/README.md`
- Normalized 350 existing recording files
# What does this PR do?
`uv add "weaviate-client>=4.16.4" --group unit`
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
## Summary
Introduce `ExtraBodyField` annotation to enable parameters that arrive
via extra_body in client SDKs but are accessible server-side with full
typing.
These parameters are documented in OpenAPI specs under
**`x-llama-stack-extra-body-params`** but excluded from generated SDK
signatures.
Add `shields` parameter to `create_openai_response` as the first
implementation using this pattern.
## Test Plan
- added an integration test which checks that shields parameter passed
via extra_body reaches server implementation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
This PR adds a comment-triggered GitHub Actions workflow that allows
running pre-commit hooks on-demand for any pull request. When someone
comments `@github-actions run precommit` on a PR, the bot automatically
runs all pre-commit hooks and commits any formatting or linting fixes
directly to the PR branch.
The implementation uses a secure two-workflow approach: a trigger
workflow validates permissions and dispatches to an execution workflow
that runs pre-commit in a privileged context. This works safely for both
same-repo and fork PRs, with permission checks ensuring only PR authors
or repository collaborators can trigger the bot.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
# What does this PR do?
on the path to maintainable impls of inference providers. make all
configs instances of RemoteInferenceProviderConfig.
## Test Plan
ci
# What does this PR do?
Initial implementation for `Conversations` and `ConversationItems` using
`AuthorizedSqlStore` with endpoints to:
- CREATE
- UPDATE
- GET/RETRIEVE/LIST
- DELETE
Set `level=LLAMA_STACK_API_V1`.
NOTE: This does not currently incorporate changes for Responses, that'll
be done in a subsequent PR.
Closes https://github.com/llamastack/llama-stack/issues/3235
## Test Plan
- Unit tests
- Integration tests
Also comparison of [OpenAPI spec for OpenAI
API](https://github.com/openai/openai-openapi/tree/manual_spec)
```bash
oasdiff breaking --fail-on ERR docs/static/llama-stack-spec.yaml https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/manual_spec/openapi.yaml --strip-prefix-base "/v1/openai/v1" \
--match-path '(^/v1/openai/v1/conversations.*|^/conversations.*)'
```
Note I still have some uncertainty about this, I borrowed this info from
@cdoern on https://github.com/llamastack/llama-stack/pull/3514 but need
to spend more time to confirm it's working, at the moment it suggests it
does.
UPDATE on `oasdiff`, I investigated the OpenAI spec further and it looks
like currently the spec does not list Conversations, so that analysis is
useless. Noting for future reference.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
now that we consolidated the providerspec types and got rid of
`AdapterSpec`, adjust external.md
BREAKING CHANGE: external providers must update their
`get_provider_spec` function to use `RemoteProviderSpec` properly
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
remove unused chat_completion implementations
vllm features ported -
- requires max_tokens be set, use config value
- set tool_choice to none if no tools provided
## Test Plan
ci
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- This PR implements keyword and hybrid search for Weaviate DB based on
its inbuilt functions.
- Added fixtures to conftest.py for Weaviate.
- Enabled integration tests for remote Weaviate on all 3 search modes.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#3010
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Unit tests and integration tests should pass on this PR.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Addresses Issue #3271 - "Starting LLS server locally on a terminal with
120 chars width results in an output with empty lines".
This removes the specific 150-character width limit specified for the
Console, and will now auto-detect the terminal width instead. Now the
formatting of Console output is consistent across different sizes of
terminal windows.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#3271
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Launching the server with several different sizes of terminal windows
results in Console output without unexpected spacing. e.g. `python -m
llama_stack.core.server.server /tmp/run.yaml --port 8321`
---------
Signed-off-by: Doug Edgar <dedgar@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
add ModelsProtocolPrivate methods to OpenAIMixin
this will allow providers using OpenAIMixin to use a common interface
## Test Plan
ci w/ new tests
# What does this PR do?
closes#3268closes#3498
When resuming from previous response ID, currently we attempt to convert
from the stored responses input to chat completion messages, which is
not always possible, e.g. for tool calls where some data is lost once
converted from chat completion message to repsonses input format.
This PR stores the chat completion messages that correspond to the
_last_ call to chat completion, which is sufficient to be resumed from
in the next responses API call, where we load these saved messages and
skip conversion entirely.
Separate issue to optimize storage:
https://github.com/llamastack/llama-stack/issues/3646
## Test Plan
existing CI tests
This is a sweeping change to clean up some gunk around our "Tool"
definitions.
First, we had two types `Tool` and `ToolDef`. The first of these was a
"Resource" type for the registry but we had stopped registering tools
inside the Registry long back (and only registered ToolGroups.) The
latter was for specifying tools for the Agents API. This PR removes the
former and adds an optional `toolgroup_id` field to the latter.
Secondly, as pointed out by @bbrowning in
https://github.com/llamastack/llama-stack/pull/3003#issuecomment-3245270132,
we were doing a lossy conversion from a full JSON schema from the MCP
tool specification into our ToolDefinition to send it to the model.
There is no necessity to do this -- we ourselves aren't doing any
execution at all but merely passing it to the chat completions API which
supports this. By doing this (and by doing it poorly), we encountered
limitations like not supporting array items, or not resolving $refs,
etc.
To fix this, we replaced the `parameters` field by `{ input_schema,
output_schema }` which can be full blown JSON schemas.
Finally, there were some types in our llama-related chat format
conversion which needed some cleanup. We are taking this opportunity to
clean those up.
This PR is a substantial breaking change to the API. However, given our
window for introducing breaking changes, this suits us just fine. I will
be landing a concurrent `llama-stack-client` change as well since API
shapes are changing.
# What does this PR do?
* Adds stainless-llama-stack-spec.yaml for Stainless client generation,
which comprises stable + experimental APIs
## Test Plan
* Manual generation
## Description
Currently, the docs page has the home book opened by default. This PR
updates the .ts so that the sidebar books are collapsed when you first
open the webpage
# What does this PR do?
this was broken by #3631, re-enable this ability by only using oasdiff
when .skip != 'true'
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
* Updates code snippets for Dell distribution, fixing specific user home
directory in code (replacing with $HOME) and updates docker instructions
to use `docker` instead of `podman`.
## Test Plan
N.A.
Co-authored-by: Connor Hack <connorhack@fb.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Spammy
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
n/a
# What does this PR do?
the LiteLLMOpenAIMixin provides support for reading key from provider
data (headers users send).
this adds the same functionality to the OpenAIMixin.
this is infrastructure for migrating providers.
## Test Plan
ci w/ new tests
# What does this PR do?
Adds supplementary static content to root API spec pages. This is useful for giving context behind a specific API group, adding information on supported features or work in progress, etc.
This PR introduces supplementary information for Agents (experimental, deprecated) and Responses (stable) APIs.
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Documentation server renders rich static content for the Agents API group:

<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
First step towards cleaning up the API reference section of the docs.
- Separates API reference into 3 sections: stable (`v1`), experimental (`v1alpha` and `v1beta`), and deprecated (`deprecated=True`)
- Each section is accessible via the dropdown menu and `docs/api-overview`
<img width="1237" height="321" alt="Screenshot 2025-09-30 at 5 47 30 PM" src="https://github.com/user-attachments/assets/fe0e498c-b066-46ed-a48e-4739d3b6724c" />
<img width="860" height="510" alt="Screenshot 2025-09-30 at 5 47 49 PM" src="https://github.com/user-attachments/assets/a92a8d8c-94bf-42d5-9f5b-b47bb2b14f9c" />
- Deprecated APIs: Added styling to the sidebar, and a notice on the endpoint pages
<img width="867" height="428" alt="Screenshot 2025-09-30 at 5 47 43 PM" src="https://github.com/user-attachments/assets/9e6e050d-c782-461b-8084-5ff6496d7bd9" />
Closes#3628
TODO in follow-up PRs:
- Add the ability to annotate API groups with supplementary content (so we can have longer descriptions of complex APIs like Responses)
- Clean up docstrings to show API endpoints (or short semantic titles) in the sidebar
## Test Plan
- Local testing
- Made sure API conformance test still passes
# What does this PR do?
Given the rapidly changing nature of Llama Stack's APIs and the need to have clean, user-friendly API documentation, we want to split the API reference into 3 main buckets; stable, experimental and deprecated. The most straightforward way to do it is to have several automatically generated doctrees, which introduces some complexity in testing APIs for backwards compatibility.
This PR updates the API conformance test to handle cases where the API schema is split into several files; it does not change the testing criteria.
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
No developer-facing changes (all existing tests should pass)
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- categories like "core::server" is not recognized so it's level is not
set by 'all=debug'
- removed spammy telemetry debug logging
## Test Plan
test server launched with LLAMA_STACK_LOGGING='all=debug'
# What does this PR do?
if the PR title has `!` or the footer of the commit has `BREAKING
CHANGE:`, skip conformance. This is documented in the API leveling
proposal
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
level the following APIs, keeping their old routes around as well until
0.4.0
1. datasetio to v1beta: used primarily by eval and training. Given that
training is v1alpha, and eval is v1alpha, datasetio is likely to change
in structure as real usages of the API spin up. Register,unregister, and
iter dataset is sparsely implemented meaning the shape of that route is
likely to change.
2. telemetry to v1alpha: telemetry has been going through many changes.
for example query_metrics was not even implemented until recently and
had to change its shape to work. putting this in v1beta will allow us to
fix functionality like OTEL, sqlite, etc. The routes themselves are set,
but the structure might change a bit
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
When a model decides to use an MCP tool call that requires no arguments,
it sets the `arguments` field to `None`. This causes the user to see a
`400 bad requst error` due to validation errors down the stack because
this field gets removed when being parsed by an openai compatible
inference provider like vLLM
This PR ensures that, as soon as the tool call args are accumulated
while streaming, we check to ensure no tool call function arguments are
set to None - if they are we replace them with "{}"
<!-- If resolving an issue, uncomment and update the line below -->
Closes#3456
## Test Plan
Added new unit test to verify that any tool calls with function
arguments set to `None` get handled correctly
---------
Signed-off-by: Jaideep Rao <jrao@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>