Updated documentation to accurately reflect current behavior where
models are identified as provider_id/provider_model_id in the system.
Changes:
- Clarify that model_id is for configuration purposes only
- Explain that models are accessed as provider_id/provider_model_id
- Remove the outdated aliasing example that suggested model_id could be used as a custom identifier
This corrects the documentation which previously suggested model_id
could be used to create friendly aliases, which is not how the code
actually works.
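
As a minimal sketch of the naming rule described above, the snippet below builds the identifier under which a model would be addressed. The `qualified_model_id` helper and the example provider and model names are hypothetical and used only for illustration; they are not part of the llama-stack API.

```python
def qualified_model_id(provider_id: str, provider_model_id: str) -> str:
    """Return an identifier in provider_id/provider_model_id form (hypothetical helper)."""
    if not provider_id or not provider_model_id:
        raise ValueError("both provider_id and provider_model_id are required")
    return f"{provider_id}/{provider_model_id}"


# Example values are made up for illustration.
identifier = qualified_model_id("ollama", "llama3.2:3b")
print(identifier)
```
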
<hr>This is an automatic backport of pull request #4128 done by
[Mergify](https://mergify.com).
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
In the **Detailed Tutorial**, at **Step 3**, the **Install with venv**
option creates a new virtual environment `client`, activates it then
attempts to install the llama-stack-client using pip.
```
uv venv client --python 3.12
source client/bin/activate
pip install llama-stack-client  # <- this is the problematic line
```
However, the pip command will likely fail because `uv venv` does not, by
default, install pip into the virtual environment it creates. The command
then fails either because pip doesn't exist at all or, if a pip exists
outside the virtual environment, with a different error message. In the
latter case it may be unclear to the user why the install is failing.
This PR changes 'pip' to 'uv pip', allowing the install action to
function in the virtual environment as intended, and without the need
for pip to be installed.
## Test Plan
1. Use Linux or WSL (virtual environments on Windows use a `Scripts`
folder instead of `bin` [virtualenv
#993ba13](993ba1316a),
which doesn't align with the tutorial)
2. Clone the `llama-stack` repo
3. Run the following and verify success:
```
uv venv client --python 3.12
source client/bin/activate
```
4. Run the updated command:
```
uv pip install llama-stack-client
```
5. Confirm that the console output shows the virtual environment
`client` was used:
> Using Python 3.12.3 environment at: **client**

<hr>This is an automatic backport of pull request #4122 done by
[Mergify](https://mergify.com).
Co-authored-by: paulengineer <154521137+paulengineer@users.noreply.github.com>
# What does this PR do?
`list-deps` takes either a positional argument (`config | distro`) or an
option such as `--providers`. Because either one may be supplied, both
have to be optional, which means the command can be invoked with neither.
Add a check to `list-deps` that tests `if not args.providers and not
args.config`. If this is true, the help text is printed and the command exits.
Resolves #4075
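
The guard described above can be sketched with `argparse` as follows. This is an illustration under assumptions, not the actual `list-deps` implementation: the program name, the `run` function, and the non-zero return code are hypothetical; only the `if not args.providers and not args.config` condition comes from the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="list-deps-sketch")
    # Both inputs are optional because either one may be supplied.
    parser.add_argument("config", nargs="?", default=None)
    parser.add_argument("--providers", default=None)
    return parser


def run(argv: list) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Guard: with neither input there is nothing to resolve, so print help and stop.
    if not args.providers and not args.config:
        parser.print_help()
        return 1  # assumed exit code, not taken from the PR
    source = args.config if args.config else args.providers
    print(f"would list dependencies for: {source}")
    return 0
```

Calling `run([])` prints the usage text instead of reaching the dependency lookup, which is the code path that previously raised `UnboundLocalError`.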
## Test Plan
before:
```
╰─ llama stack list-deps
Traceback (most recent call last):
  File "/Users/charliedoern/projects/Documents/llama-stack/venv/bin/llama", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 52, in main
    parser.run(args)
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 43, in run
    args.func(args)
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/list_deps.py", line 51, in _run_stack_list_deps_command
    return run_stack_list_deps_command(args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/_list_deps.py", line 135, in run_stack_list_deps_command
    normal_deps, special_deps, external_provider_dependencies = get_provider_dependencies(build_config)
                                                                                          ^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'build_config' where it is not associated with a value
```
after:
```
╰─ llama stack list-deps
usage: llama stack list-deps [-h] [--providers PROVIDERS] [--format {uv,deps-only}] [config | distro]

list the dependencies for a llama stack distribution

positional arguments:
  config | distro       Path to config file to use or name of known distro (llama stack list for a list). (default: None)

options:
  -h, --help            show this help message and exit
  --providers PROVIDERS
                        sync dependencies for a list of providers and only those providers. This list is formatted like: api1=provider1,api2=provider2. Where there can be multiple
                        providers per API. (default: None)
  --format {uv,deps-only}
                        Output format: 'uv' shows shell commands, 'deps-only' shows just the list of dependencies without `uv` (default) (default: deps-only)
```
<hr>This is an automatic backport of pull request #4078 done by [Mergify](https://mergify.com).
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Charlie Doern <cdoern@redhat.com>
Fixes #4017 follow-up issue where UV_INDEX_STRATEGY was exported only to
GITHUB_ENV and not to the current shell.
The commit e0bb7529 fixed the empty string issue but introduced a new
bug: UV_INDEX_STRATEGY was only exported to GITHUB_ENV (for subsequent
steps), not to the current shell environment. Since uv sync runs in the
same step, it never saw the variable.
This caused all CI runs on release-0.3.x to fail with dependency
resolution errors like:
```
setuptools was found on https://test.pypi.org/simple/, but not at the requested version.
A compatible version may be available on PyPI. Use --index-strategy unsafe-best-match.
```
This fix adds `export UV_INDEX_STRATEGY=unsafe-best-match` to make the
variable available in the current shell before running uv commands.
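
The difference between the two export targets can be illustrated with a small Python model. This is a sketch under assumptions: the actual fix is a shell `export` in the workflow, and the `export_variable` helper, the temporary file standing in for `$GITHUB_ENV`, and the `step_env` dictionary standing in for the current shell are all hypothetical.

```python
import os
import tempfile


def export_variable(name: str, value: str, github_env_path: str, current_env: dict) -> None:
    # Later steps: the runner reads NAME=value lines from the GITHUB_ENV file
    # once the current step has finished.
    with open(github_env_path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")
    # Current step: commands launched from this step only see the live environment,
    # so the variable has to be set there as well.
    current_env[name] = value


with tempfile.TemporaryDirectory() as tmp:
    env_file = os.path.join(tmp, "github_env")  # stand-in for $GITHUB_ENV
    step_env = {}  # stand-in for the current shell's environment
    export_variable("UV_INDEX_STRATEGY", "unsafe-best-match", env_file, step_env)
    with open(env_file, encoding="utf-8") as handle:
        recorded = handle.read()
```

Writing only to the file corresponds to the earlier bug: `uv sync`, running in the same step, would look at `step_env` and find nothing.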
Note: Main branch doesn't hit this bug because UV_EXTRA_INDEX_URL is
only set on release branches.
Cherry-pick of bc12fe6c4 to release-0.3.x
Fixes GitHub Actions workflows failing with UV index strategy errors
when testing on RC tags and non-release branches.
The issue was that UV_INDEX_STRATEGY was being set to an empty string in
the environment, causing UV to fail with "error: a value is required for
'--index-strategy'".
The fix removes UV_INDEX_STRATEGY from the env block and only sets it to
'unsafe-best-match' when UV_EXTRA_INDEX_URL is actually present.
Cherry-pick of #3955 to release-0.3.x
Adds a getting started notebook with simple agent examples to help users
get started with llama-stack agents.
Co-authored-by: Omar Abdelwahab <omaryashraf10@gmail.com>
Co-authored-by: Omar Abdelwahab <omara@fb.com>
Cherry-pick of #3992 to release-0.3.x
Adds support for configuring the number of workers in run.yaml
configuration files.
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Cherry-pick of #4012 to release-0.3.x
Fixes container builds failing with UV index strategy errors when build
args are passed with empty values.
Docker ARGs declared with empty defaults (ARG UV_INDEX_STRATEGY="")
become environment variables with empty string values in RUN commands.
UV interprets these as if --index-strategy "" was passed on the command
line, causing build failures with "error: a value is required for
'--index-strategy <UV_INDEX_STRATEGY>'".
This is a footgun because empty string ≠ unset variable, and ARGs
silently propagate to all RUN commands, only failing when declared with
empty defaults.
The fix unsets UV_EXTRA_INDEX_URL and UV_INDEX_STRATEGY at the start of
RUN blocks, saves the values early, and only restores them for editable
installs with RC dependencies. All other install modes (PyPI, test-pypi,
client) now run with a clean environment.
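
The save, unset, and conditionally restore logic can be sketched in Python as below. This is only a model of the idea: the real change lives in the container build's RUN blocks, and the `clean_uv_env` helper and its `restore` flag are hypothetical names introduced for illustration.

```python
def clean_uv_env(env: dict, restore: bool) -> dict:
    """Return a copy of env without the UV index settings, optionally restoring them."""
    names = ("UV_EXTRA_INDEX_URL", "UV_INDEX_STRATEGY")
    # Save the values early; an empty string counts as "not provided".
    saved = {name: env[name] for name in names if env.get(name)}
    # Unset both variables so empty values can never reach the installer.
    cleaned = {key: value for key, value in env.items() if key not in names}
    if restore:
        # Only the install mode that needs RC dependencies gets the values back.
        cleaned.update(saved)
    return cleaned
```

With this shape, an empty `UV_INDEX_STRATEGY` is dropped in every mode, while real values come back only when `restore` is true.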
Cherry-pick of #3974 to release-0.3.x branch.
## Summary
- Fixes handling of missing external_providers_dir in stack
configuration
## Original PR
Fixes from #3974
Signed-off-by: Doug Edgar <dedgar@redhat.com>
Co-authored-by: Doug Edgar <dedgar@redhat.com>
Backport of #4001 to release-0.3.x branch.
Fixes CI failures on release branches where uv sync can't resolve RC
dependencies.
## The Problem
On release branches like `release-0.3.x`, pyproject.toml requires
`llama-stack-client>=0.3.1rc1`. RC versions only exist on test.pypi, not
PyPI. This causes multiple CI failures:
1. `uv sync` fails because it can't resolve RC versions from PyPI
2. pre-commit hooks (uv-lock, codegen) fail for the same reason
3. mypy workflow section needs uv installed
## The Solution
Configure UV to use test.pypi when on release branches:
- Set `UV_INDEX_URL=https://test.pypi.org/simple/` (primary)
- Set `UV_EXTRA_INDEX_URL=https://pypi.org/simple/` (fallback)
- Set `UV_INDEX_STRATEGY=unsafe-best-match` to check both indexes
This allows `uv sync` to resolve common packages from PyPI and RC
versions from test.pypi.
## Additional Fixes
- Export UV env vars to `GITHUB_ENV` so pre-commit hooks inherit them
- Install uv in pre-commit workflow for mypy section
- Handle missing `type_checking` dependency group on release-0.3.x
- Regenerate uv.lock with RC versions for the release branch
## Changes
- Created reusable `install-llama-stack-client` action for configuration
- Modified `setup-runner` to set UV environment variables before sync
- Modified `pre-commit` workflow to configure client and export env vars
- Updated uv.lock with RC versions from test.pypi
This is a cherry-pick of commits afa9f0882, c86e6e906, 626639bee, and
081566321 from main, plus additional fixes for release branch
compatibility.
## Summary
Cherry-picks 5 critical fixes from main to the release-0.3.x branch for
the v0.3.1 release, plus CI workflow updates.
**Note**: This recreates the cherry-picks from the closed PR #3991, now
targeting the renamed `release-0.3.x` branch (previously
`release-0.3.x-maint`).
## Commits
1. **2c56a8560** - fix(context): prevent provider data leak between streaming requests (#3924)
   - **CRITICAL SECURITY FIX**: Prevents provider credentials from leaking between requests
   - Fixed import path for 0.3.0 compatibility
2. **ddd32b187** - fix(inference): enable routing of models with provider_data alone (#3928)
   - Enables routing for fully qualified model IDs with provider_data
   - Resolved merge conflicts, adapted for 0.3.0 structure
3. **f7c2973aa** - fix: Avoid BadRequestError due to invalid max_tokens (#3667)
   - Fixes failures with Gemini and other providers that reject max_tokens=0
   - Non-breaking API change
4. **d7f9da616** - fix(responses): sync conversation before yielding terminal events in streaming (#3888)
   - Ensures conversation sync executes even when streaming consumers break early
5. **0ffa8658b** - fix(logging): ensure logs go to stderr, loggers obey levels (#3885)
   - Fixes logging infrastructure
6. **75b49cb3c** - ci: support release branches and match client branch (#3990)
   - Updates CI workflows to support release-X.Y.x branches
   - Matches client branch from llama-stack-client-python for release testing
   - Fixes artifact name collisions
## Adaptations for 0.3.0
- Fixed import paths: `llama_stack.core.telemetry.tracing` →
`llama_stack.providers.utils.telemetry.tracing`
- Fixed import paths: `llama_stack.core.telemetry.telemetry` →
`llama_stack.apis.telemetry`
- Changed `self.telemetry_enabled` → `self.telemetry` (0.3.0 attribute
name)
- Removed `rerank()` method that doesn't exist in 0.3.0
## Testing
All imports verified and tests should pass once CI is set up.
# What does this PR do?
The embedding model given in metadata can conflict with the default
embedding model that the server sets via extra body, which raised the
error below. This PR removes the check and lets metadata take precedence
over extra body.
`ValueError: Embedding model inconsistent between metadata
('text-embedding-3-small') and extra_body
('sentence-transformers/nomic-ai/nomic-embed-text-v1.5')`
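
A minimal sketch of the precedence rule, assuming both sources are plain dictionaries. The `resolve_embedding_model` function and the `embedding_model` key are hypothetical names chosen for illustration and may not match the actual code.

```python
from typing import Optional


def resolve_embedding_model(metadata: Optional[dict], extra_body: Optional[dict]) -> Optional[str]:
    metadata = metadata or {}
    extra_body = extra_body or {}
    # Metadata wins; extra_body is only a fallback. No consistency check is made,
    # so differing values no longer raise a ValueError.
    return metadata.get("embedding_model") or extra_body.get("embedding_model")


chosen = resolve_embedding_model(
    {"embedding_model": "text-embedding-3-small"},
    {"embedding_model": "sentence-transformers/nomic-ai/nomic-embed-text-v1.5"},
)
```
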
## Test Plan
CI
# What does this PR do?
Fix a segfault when loading the model.
The cc-vec integration failed with a segfault when used with the default
embedding model on macOS
(`model_id: nomic-ai/nomic-embed-text-v1.5` and `provider_id:
sentence-transformers`).
The crash report shows this is due to torch OPENMP settings.
Constraining to 1 thread works without crashes.
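
One common way to constrain OpenMP to a single thread is to set `OMP_NUM_THREADS` before the OpenMP-backed library is imported. The sketch below shows that approach as an assumption; the PR may apply the limit differently (for example through torch's own thread settings), and the `limit_openmp_threads` helper is a hypothetical name.

```python
import os


def limit_openmp_threads(env: dict, threads: int = 1) -> None:
    """Cap the OpenMP thread count (assumed approach, not confirmed by the PR)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # OpenMP runtimes read this variable when they initialize, so it has to be
    # set before the library that uses OpenMP is imported.
    env["OMP_NUM_THREADS"] = str(threads)


limit_openmp_threads(os.environ)
```
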
## Test Plan
Tested with cc-vec integration
1. Start the server: `llama stack run starter`
2. Follow the setup in https://github.com/raghotham/cc-vec to set the
environment variables, then run
`uv run cc-vec index --url-patterns "%.github.io" --vector-store-name "ml-research" --limit 50 --chunk-size 800 --overlap 400`
- Moved environment variable parsing and `setup_logging()` call from
module level to proper initialization points
- Added explicit `setup_logging()` calls in `server.py::create_app()`
and `library_client.py::AsyncLlamaStackAsLibraryClient.__init__()`
Module-level side effects are bad practice and can cause issues with
import order, testing, and circular dependencies. The previous
implementation ran logging setup on every import of the log module,
which is unpredictable and difficult to control.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Kill the `builtin::rag` tool group completely since it is no longer
targeted. We use the Responses implementation for knowledge_search which
uses the `openai_vector_stores` pathway.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
## Summary
- Link pre-commit bot comment to workflow run instead of PR for better
debugging
- Dump docker container logs before removal to ensure logs are actually
captured
## Changes
1. **Pre-commit bot**: Changed the initial bot comment to link
"pre-commit hooks" text to the actual workflow run URL instead of just
having the PR number auto-link
2. **Docker logs**: Moved docker container log dumping from GitHub
Actions to the integration-tests.sh script's stop_container() function,
ensuring logs are captured before container removal
## Test plan
- Pre-commit bot comment will now have a clickable link to the workflow
run
- Docker container logs will be successfully captured in CI runs
# What does this PR do?
Similarly to `alpha:`, move the `v1beta` routes under a `beta` group so
the client will have `client.beta`.
From what I can tell, the openapi.stainless.yml file is handwritten,
while the openapi.yml file is generated and copied using the shell
script, so I made this change by hand.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 24.3.0 to 24.8.1.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.116.1 to
0.119.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.119.0</h2>
<p>FastAPI now (temporarily) supports both Pydantic v2 models and
<code>pydantic.v1</code> models at the same time in the same app, to
make it easier for any FastAPI apps still using Pydantic v1 to gradually
but quickly <strong>migrate to Pydantic v2</strong>.</p>
<pre lang="Python"><code>from fastapi import FastAPI
from pydantic import BaseModel as BaseModelV2
from pydantic.v1 import BaseModel
class Item(BaseModel):
    name: str
    description: str | None = None
class ItemV2(BaseModelV2):
    title: str
    summary: str | None = None
app = FastAPI()
@app.post("/items/", response_model=ItemV2)
def create_item(item: Item):
    return {"title": item.name, "summary": item.description}
</code></pre>
<p>Adding this feature was a big effort with the main objective of
making it easier for the few applications still stuck in Pydantic v1 to
migrate to Pydantic v2.</p>
<p>And with this, support for <strong>Pydantic v1 is now
deprecated</strong> and will be <strong>removed</strong> from FastAPI in
a future version soon.</p>
<p><strong>Note</strong>: have in mind that the Pydantic team already
stopped supporting Pydantic v1 for recent versions of Python, starting
with Python 3.14.</p>
<p>You can read in the docs more about how to <a
href="https://fastapi.tiangolo.com/how-to/migrate-from-pydantic-v1-to-pydantic-v2/">Migrate
from Pydantic v1 to Pydantic v2</a>.</p>
<h3>Features</h3>
<ul>
<li>✨ Add support for <code>from pydantic.v1 import BaseModel</code>,
mixed Pydantic v1 and v2 models in the same app. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14168">#14168</a>
by <a
href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li>
</ul>
<h2>0.118.3</h2>
<h3>Upgrades</h3>
<ul>
<li>⬆️ Add support for Python 3.14. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14165">#14165</a>
by <a
href="https://github.com/svlandeg"><code>@svlandeg</code></a>.</li>
</ul>
<h2>0.118.2</h2>
<h3>Fixes</h3>
<ul>
<li>🐛 Fix tagged discriminated union not recognized as body field. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/12942">#12942</a>
by <a
href="https://github.com/frankie567"><code>@frankie567</code></a>.</li>
</ul>
<h3>Internal</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2e721e1b02"><code>2e721e1</code></a>
🔖 Release version 0.119.0</li>
<li><a
href="fc7a0686af"><code>fc7a068</code></a>
📝 Update release notes</li>
<li><a
href="3a3879b2c3"><code>3a3879b</code></a>
📝 Update release notes</li>
<li><a
href="d34918abf0"><code>d34918a</code></a>
✨ Add support for <code>from pydantic.v1 import BaseModel</code>, mixed
Pydantic v1 and ...</li>
<li><a
href="352dbefc63"><code>352dbef</code></a>
🔖 Release version 0.118.3</li>
<li><a
href="96e7d6eaa4"><code>96e7d6e</code></a>
📝 Update release notes</li>
<li><a
href="3611c3fc5b"><code>3611c3f</code></a>
⬆️ Add support for Python 3.14 (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14165">#14165</a>)</li>
<li><a
href="942fce394b"><code>942fce3</code></a>
🔖 Release version 0.118.2</li>
<li><a
href="13b067c9b6"><code>13b067c</code></a>
📝 Update release notes</li>
<li><a
href="185cecd891"><code>185cecd</code></a>
🐛 Fix tagged discriminated union not recognized as body field (<a
href="https://redirect.github.com/fastapi/fastapi/issues/12942">#12942</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/fastapi/fastapi/compare/0.116.1...0.119.0">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>