- simplify search result processing in demo script
- optimize demo script by using inline text instead of big external file
- improve printout clarity and user experience
---------
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
# What does this PR do?
- Add new /admin API (v1alpha) for administrative operations including
provider management, health checks, version info, and route listing
- Implement using FastAPI routers following batches pattern with proper
request/response models
- Endpoints: /admin/providers, /admin/providers/{id},
/admin/inspect/routes, /admin/health, /admin/version
- Create admin module structure: models.py, api.py, fastapi_routes.py,
`__init__.py`
- Add AdminImpl in llama_stack/core combining provider and inspect
functionality
- Deprecate standalone /providers and /inspect APIs (remain functional
for backward compatibility)
- Consolidate duplicate types: ProviderInfo, HealthInfo, RouteInfo, etc.
now defined once in admin.models
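For illustration only, the new endpoints can be exercised over plain HTTP once
a stack server is running; the base URL and the `/v1alpha` prefix below are
assumptions, while the paths mirror the list above:
```python
# Illustrative sketch: base URL and version prefix are assumptions.
import httpx

BASE = "http://localhost:8321/v1alpha"

for path in ("/admin/health", "/admin/version", "/admin/providers", "/admin/inspect/routes"):
    resp = httpx.get(f"{BASE}{path}")
    print(path, resp.status_code, resp.json())
```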
## Test Plan
New admin integration suite that uses the generated Stainless SDK and
records new tests on this PR.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Fixed issue where code was injecting `run_config.vector_stores` even
when it was `None`, which overrode the `default_factory` in
`RagToolRuntimeConfig`. Currently, _most_ providers don't have a default
implementation for vector_stores:
- nvidia
- meta-reference-gpu
- dell
- oci
- open-benchmark
- postgres-demo
- watsonx
The only ones which do are:
- ci-tests
- starter
- starter-gpu
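The underlying Pydantic behavior, as a minimal sketch (not the real
`RagToolRuntimeConfig` definition): omitting the field lets `default_factory`
run, while explicitly passing `None` bypasses it and fails validation, which is
exactly what injecting `run_config.vector_stores` was doing.
```python
# Minimal sketch of the failure mode; the real models live in llama-stack.
from pydantic import BaseModel, Field, ValidationError


class VectorStoresConfig(BaseModel):
    default_provider_id: str = "faiss"


class RagToolRuntimeConfig(BaseModel):
    vector_stores_config: VectorStoresConfig = Field(default_factory=VectorStoresConfig)


print(RagToolRuntimeConfig())  # default_factory kicks in, works fine

try:
    RagToolRuntimeConfig(vector_stores_config=None)  # explicit None overrides the factory
except ValidationError as e:
    print(e)  # "Input should be a valid dictionary or instance of VectorStoresConfig"
```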
## Test Plan
Prior to the change, I could not start llama-stack with the oci
distribution:
```
Traceback (most recent call last):
File "/home/opc/llama-stack/.venv/bin/llama", line 10, in <module>
sys.exit(main())
^^^^^^
File "/home/opc/llama-stack/src/llama_stack/cli/llama.py", line 52, in main
parser.run(args)
File "/home/opc/llama-stack/src/llama_stack/cli/llama.py", line 46, in run
args.func(args)
File "/home/opc/llama-stack/src/llama_stack/cli/stack/run.py", line 184, in _run_stack_run_cmd
self._uvicorn_run(config_file, args)
File "/home/opc/llama-stack/src/llama_stack/cli/stack/run.py", line 242, in _uvicorn_run
uvicorn.run("llama_stack.core.server.server:create_app", **uvicorn_config) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/.venv/lib/python3.12/site-packages/uvicorn/main.py", line 580, in run
server.run()
File "/home/opc/llama-stack/.venv/lib/python3.12/site-packages/uvicorn/server.py", line 67, in run
return asyncio.run(self.serve(sockets=sockets))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/.venv/lib/python3.12/site-packages/uvicorn/server.py", line 71, in serve
await self._serve(sockets)
File "/home/opc/llama-stack/.venv/lib/python3.12/site-packages/uvicorn/server.py", line 78, in _serve
config.load()
File "/home/opc/llama-stack/.venv/lib/python3.12/site-packages/uvicorn/config.py", line 442, in load
self.loaded_app = self.loaded_app()
^^^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/src/llama_stack/core/server/server.py", line 403, in create_app
app = StackApp(
^^^^^^^^^
File "/home/opc/llama-stack/src/llama_stack/core/server/server.py", line 161, in __init__
future.result()
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/src/llama_stack/core/stack.py", line 534, in initialize
impls = await resolve_impls(
^^^^^^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/src/llama_stack/core/resolver.py", line 180, in resolve_impls
return await instantiate_providers(sorted_providers, router_apis, dist_registry, run_config, policy, internal_impls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/src/llama_stack/core/resolver.py", line 321, in instantiate_providers
impl = await instantiate_provider(provider, deps, inner_impls, dist_registry, run_config, policy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/src/llama_stack/core/resolver.py", line 417, in instantiate_provider
config = config_type(**provider_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/opc/llama-stack/.venv/lib/python3.12/site-packages/pydantic/main.py", line 253, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for RagToolRuntimeConfig
vector_stores_config
Input should be a valid dictionary or instance of VectorStoresConfig [type=model_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/model_type
```
After tracing through and finding a simple fix, I was able to run the
distribution again. I also executed the integration tests with pytest:
```bash
OCI_COMPARTMENT_OCID="ocid1.compartment.oc1..xxx" OCI_REGION="us-chicago-1" OCI_AUTH_TYPE=instance_principal OCI_CLI_PROFILE=CHICAGO uv run pytest -sv tests/integration/inference/ --stack-config oci --text-model oci/meta.llama-3.3-70b-instruct --inference-mode live
```
Changes:
o Remove TELEMETRY_SINKS environment variable from scripts (unused)
o Replace with OTEL_EXPORTER_OTLP_PROTOCOL in install scripts
The TELEMETRY_SINKS variable is no longer used by Python code and has
been replaced with the standard OpenTelemetry environment variable
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf.
Inject `stream_options={"include_usage": True}` when streaming and
OpenTelemetry telemetry is active. Telemetry always overrides any caller
preference to ensure complete and consistent observability metrics.
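A minimal sketch of that injection logic (illustrative only; the real change
lives in the mixins listed below):
```python
# Illustrative sketch -- the actual change lives in OpenAIMixin / LiteLLMOpenAIMixin.
from opentelemetry import trace


def maybe_force_usage(params: dict) -> dict:
    """Force include_usage when streaming while an OTel span is recording."""
    if params.get("stream") and trace.get_current_span().is_recording():
        params["stream_options"] = {"include_usage": True}  # overrides caller preference
    return params
```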
Changes:
- Add conditional stream_options injection to OpenAIMixin (benefits
OpenAI, Bedrock, Runpod, Together, Fireworks providers)
- Add conditional stream_options injection to LiteLLMOpenAIMixin
(benefits WatsonX and other litellm-based providers)
- Check telemetry status using trace.get_current_span().is_recording()
- Override include_usage=False when telemetry active to prevent metric
gaps
- Unit tests for this functionality
Fixes #3981
Note: this work originated in PR #4200, which I closed after rebasing on
the telemetry changes. This PR rebases those commits, incorporates the
Bedrock feedback, and carries forward the same scope described there.
## Test Plan
#### OpenAIMixin + telemetry injection tests
PYTHONPATH=src python -m pytest tests/unit/providers/utils/inference/test_openai_mixin.py
#### LiteLLM OpenAIMixin tests
PYTHONPATH=src python -m pytest tests/unit/providers/inference/test_litellm_openai_mixin.py -v
#### Broader inference provider
PYTHONPATH=src python -m pytest tests/unit/providers/inference/ --ignore=tests/unit/providers/inference/test_inference_client_caching.py -v
The Dataset model no longer has a dataset_schema attribute; it was removed
during a refactor (5287b437a), so this validation can no longer run.
Changes:
o basic scoring: removed validate_dataset_schema call and related imports
o llm_as_judge scoring: removed validate_dataset_schema call and related imports
o braintrust scoring: removed validate_dataset_schema call and related imports
Validation is no longer needed at the dataset level since:
o Dataset model changed from having dataset_schema to purpose/source fields
o Scoring functions validate required fields when processing rows
o Invalid data will fail naturally with clear error messages
Fixes: #4419
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Various fixes to integration test recording + stainless calling of
integration tests:
1. Only the library client was being run; all clients should be
2. The git check grabs diffs like:
M tests/integration/client-typescript/package-lock.json
M tests/integration/client-typescript/package.json
when it should not
Additionally, this fixes rebase conflicts when the stainless workflow runs
integration tests with record-if-missing mode on PRs. Previously, the
workflow would:
1. Commit all files in tests/integration/ (including non-recordings)
2. Try to rebase and push to 'main' instead of the PR branch
3. Fail with merge conflicts on PR-specific changes
Changes:
- Add pr_head_ref and is_fork_pr parameters flowing through workflow
chain
- Use target-branch input instead of github.ref_name in recording
commits
- Detect and handle fork PRs by skipping push and uploading recordings
as artifacts
- Add 7-day artifact retention for fork PR recordings
- Support both workflow_call and direct pull_request trigger contexts
For same-repo PRs: recordings now commit/push to the PR branch correctly
For fork PRs: recordings upload as downloadable artifacts with
instructions
you can see a failing workflow with the rebase issues: 5846590613
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
we will typically need to record the missing json for net new APIs. use
record-if-missing so that the integration tests can re-record and commit
the files to the PR
set the stainless inference mode to record-if-missing, and properly pass
the pr_head_sha on workflow_call.
## Test Plan
see 2031824567
which uses this commit.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Enhances the Vector Stores config with a full set of appropriate
configuration options
- Add FileIngestionParams, ChunkRetrievalParams, and FileBatchParams
subconfigs
- Update RAG memory, OpenAI vector store mixin, and vector store utils
to use configuration
- Fix import organization across vector store components
- Add comprehensive vector stores configuration documentation
- Update docs navigation to include vector store configuration guide
- Delete `memory/constants.py` and move constant values directly into
Pydantic models
## Test Plan
Tests updated + CI
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Migrate the Inspect API to the FastAPI router pattern.
Changes:
- Add inspect API to FastAPI router registry
- Add PUBLIC_ROUTE_KEY support for routes that don't require auth
- Update WebMethod creation to respect route's openapi_extra for
authentication requirements
Fixes: https://github.com/llamastack/llama-stack/issues/4346
## Test Plan
CI and various curls on /v1/inspect/routes, /v1/health, /v1/version
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Convert Providers API from @webmethod decorators to FastAPI router
pattern.
Fixes: https://github.com/llamastack/llama-stack/issues/4350
## Test Plan
CI
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Consolidates provider data context handling into middleware, eliminating
duplication between FastAPI router routes and legacy @webmethod routes.
Closes #4366
## Test Plan
Added unit test suite `test_test_context_middleware`, specifically
`test_middleware_extracts_test_id_from_header` to validate the expected
behavior.
```
❯ ./scripts/unit-tests.sh tests/unit/
```
Integration of the middleware test context with the `files` FastAPI
router migration from
[pull/4339](https://github.com/llamastack/llama-stack/pull/4339).
```
❯ git switch migrate-files-api
Switched to branch 'migrate-files-api'
❯ git rebase fix-test-ctx-middleware
Successfully rebased and updated refs/heads/migrate-files-api.
❯ ./scripts/integration-tests.sh --inference-mode replay --suite base --setup ollama --stack-config server:starter --subdirs files
```
Signed-off-by: Matthew F Leader <mleader@redhat.com>
Vector store operations were bypassing ABAC checks by calling providers
directly instead of going through the routing table. This allowed
unauthorized access to vector store data and operations.
Changes:
o Route all VectorIORouter methods through routing table instead of
directly to providers
o Update routing table to enforce ABAC checks on all vector store
operations (read, update, delete)
o Add test suite verifying ABAC enforcement for all vector store
operations
o Ensure providers are never called when authorization fails
Fixes security issue where users could access vector stores they don't
have permission for.
Fixes: #4393
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Enable stainless-builds workflow to test preview SDKs by calling
integration-tests workflow with python_url parameter. Add stainless
matrix config for faster CI runs on SDK changes.
- Make integration-tests.yml reusable with workflow_call inputs
- Thread python_url through test setup actions to install preview SDK
- Add matrix_key parameter to generate_ci_matrix.py for custom matrices
- Update stainless-builds.yml to call integration tests with preview URL
This allows us to test a client on the PR introducing the new changes
before merging. Contributors can even write new tests using the
generated client which should pass on the PR, indicating that they will
pass on main upon merge
## Test Plan
see triggered action using the workflows on this branch:
5810594042
which installs the stainless SDK from the given url.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
since run.yaml is gone, update logs to say "stack config" or "stack
configuration" rather than run
## Test Plan
check logs
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Convert the Datasets API from webmethod decorators to FastAPI router
pattern.
Fixes: https://github.com/llamastack/llama-stack/issues/4344
## Test Plan
CI
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Adds support for enforcing tool usage via responses api. See
https://platform.openai.com/docs/api-reference/responses/create#responses_create-tool_choice
for details from official documentation.
Note: at present this PR only supports `file_search` and `web_search` as
options to enforce builtin tool usage
Closes #3548
## Test Plan
`./scripts/unit-tests.sh tests/unit/providers/agents/meta_reference/test_response_tool_context.py`
---------
Signed-off-by: Jaideep Rao <jrao@redhat.com>
# What does this PR do?
- Enables users to configure prompts used throughout the File Search /
Vector Retrieval
- Configuration is defined in the Vector Stores Config so they can be
modified at runtime
- Backwards compatible, which means the fields are optional and default
to the previously used values
This is the summary of the new options in the `run.yaml`
```yaml
vector_stores:
file_search_params:
header_template: 'knowledge_search tool found {num_chunks} chunks:\nBEGIN of knowledge_search tool results.\n'
footer_template: 'END of knowledge_search tool results.\n'
context_prompt_params:
chunk_annotation_template: 'Result {index}\nContent: {chunk.content}\nMetadata: {metadata}\n'
context_template: 'The above results were retrieved to help answer the user\'s query: "{query}". Use them as supporting information only in answering this query.{annotation_instruction}\n'
annotation_prompt_params:
enable_annotations: true
annotation_instruction_template: 'Cite sources immediately at the end of sentences before punctuation, using `<|file-id|>` format like \'This is a fact <|file-Cn3MSNn72ENTiiq11Qda4A|>.\'. Do not add
extra punctuation. Use only the file IDs provided, do not invent new ones.'
chunk_annotation_template: '[{index}] {metadata_text} cite as <|{file_id}|>\n{chunk_text}\n'
```
## Test Plan
Added tests.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Fix provider header API key handling by correctly unwrapping `SecretStr`
values for provider data API keys. Previously the validator cast header
keys to `SecretStr` but the value wasn’t unwrapped before use, causing
authentication failures with providers like Azure.
Closes https://github.com/llamastack/llama-stack/issues/4370
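The gist of the fix, sketched with Pydantic's `SecretStr` (names are
illustrative, not the actual validator code):
```python
from pydantic import SecretStr

api_key = SecretStr("azure-key-from-header")  # what the validator now stores

print(str(api_key))                # '**********' -- what providers saw before the fix
print(api_key.get_secret_value())  # 'azure-key-from-header' -- what they need for auth
```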
Allow users to specify the inference model through the INFERENCE_MODEL
environment variable instead of hardcoding it, with fallback to
ollama/llama3.2:3b if not set.
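In the demo script this amounts to roughly (sketch):
```python
import os

# Fall back to the previous hardcoded model when INFERENCE_MODEL is unset.
model = os.environ.get("INFERENCE_MODEL", "ollama/llama3.2:3b")
```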
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Bumps
[actions/upload-artifact](https://github.com/actions/upload-artifact)
from 5.0.0 to 6.0.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/upload-artifact/releases">actions/upload-artifact's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>v6 - What's new</h2>
<blockquote>
<p>[!IMPORTANT]
actions/upload-artifact@v6 now runs on Node.js 24 (<code>runs.using:
node24</code>) and requires a minimum Actions Runner version of 2.327.1.
If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<h3>Node.js 24</h3>
<p>This release updates the runtime to Node.js 24. v5 had preliminary
support for Node.js 24, however this action was by default still running
on Node.js 20. Now this action by default will run on Node.js 24.</p>
<h2>What's Changed</h2>
<ul>
<li>Upload Artifact Node 24 support by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/upload-artifact/pull/719">actions/upload-artifact#719</a></li>
<li>fix: update <code>@actions/artifact</code> for Node.js 24 punycode
deprecation by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/upload-artifact/pull/744">actions/upload-artifact#744</a></li>
<li>prepare release v6.0.0 for Node.js 24 support by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/upload-artifact/pull/745">actions/upload-artifact#745</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/upload-artifact/compare/v5.0.0...v6.0.0">https://github.com/actions/upload-artifact/compare/v5.0.0...v6.0.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b7c566a772"><code>b7c566a</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/upload-artifact/issues/745">#745</a>
from actions/upload-artifact-v6-release</li>
<li><a
href="e516bc8500"><code>e516bc8</code></a>
docs: correct description of Node.js 24 support in README</li>
<li><a
href="ddc45ed9bc"><code>ddc45ed</code></a>
docs: update README to correct action name for Node.js 24 support</li>
<li><a
href="615b319bd2"><code>615b319</code></a>
chore: release v6.0.0 for Node.js 24 support</li>
<li><a
href="017748b48f"><code>017748b</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/upload-artifact/issues/744">#744</a>
from actions/fix-storage-blob</li>
<li><a
href="38d4c7997f"><code>38d4c79</code></a>
chore: rebuild dist</li>
<li><a
href="7d27270e0c"><code>7d27270</code></a>
chore: add missing license cache files for <code>@actions/core</code>,
<code>@actions/io</code>, and mi...</li>
<li><a
href="5f643d3c94"><code>5f643d3</code></a>
chore: update license files for <code>@actions/artifact</code><a
href="https://github.com/5"><code>@5</code></a>.0.1 dependencies</li>
<li><a
href="1df1684032"><code>1df1684</code></a>
chore: update package-lock.json with <code>@actions/artifact</code><a
href="https://github.com/5"><code>@5</code></a>.0.1</li>
<li><a
href="b5b1a91840"><code>b5b1a91</code></a>
fix: update <code>@actions/artifact</code> to ^5.0.0 for Node.js 24
punycode fix</li>
<li>Additional commits viewable in <a
href="330a01c490...b7c566a772">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/cache](https://github.com/actions/cache) from 4.3.0 to
5.0.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/releases">actions/cache's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.1</h2>
<blockquote>
<p>[!IMPORTANT]
<strong><code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of
<code>2.327.1</code>.</strong></p>
<p>If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<hr />
<h1>v5.0.1</h1>
<h2>What's Changed</h2>
<ul>
<li>fix: update <code>@actions/cache</code> for Node.js 24 punycode
deprecation by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1685">actions/cache#1685</a></li>
<li>prepare release v5.0.1 by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1686">actions/cache#1686</a></li>
</ul>
<h1>v5.0.0</h1>
<h2>What's Changed</h2>
<ul>
<li>Upgrade to use node24 by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1630">actions/cache#1630</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1684">actions/cache#1684</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v5...v5.0.1">https://github.com/actions/cache/compare/v5...v5.0.1</a></p>
<h2>v5.0.0</h2>
<blockquote>
<p>[!IMPORTANT]
<strong><code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of
<code>2.327.1</code>.</strong></p>
<p>If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<hr />
<h2>What's Changed</h2>
<ul>
<li>Upgrade to use node24 by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1630">actions/cache#1630</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1684">actions/cache#1684</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v4.3.0...v5.0.0">https://github.com/actions/cache/compare/v4.3.0...v5.0.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/blob/main/RELEASES.md">actions/cache's
changelog</a>.</em></p>
<blockquote>
<h1>Releases</h1>
<h2>Changelog</h2>
<h3>5.0.1</h3>
<ul>
<li>Update <code>@azure/storage-blob</code> to <code>^12.29.1</code> via
<code>@actions/cache@5.0.1</code> <a
href="https://redirect.github.com/actions/cache/pull/1685">#1685</a></li>
</ul>
<h3>5.0.0</h3>
<blockquote>
<p>[!IMPORTANT]
<code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of <code>2.327.1</code>.
If you are using self-hosted runners, ensure they are updated before
upgrading.</p>
</blockquote>
<h3>4.3.0</h3>
<ul>
<li>Bump <code>@actions/cache</code> to <a
href="https://redirect.github.com/actions/toolkit/pull/2132">v4.1.0</a></li>
</ul>
<h3>4.2.4</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.5</li>
</ul>
<h3>4.2.3</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.3 (obfuscates SAS token in
debug logs for cache entries)</li>
</ul>
<h3>4.2.2</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.2</li>
</ul>
<h3>4.2.1</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.1</li>
</ul>
<h3>4.2.0</h3>
<p>TLDR; The cache backend service has been rewritten from the ground up
for improved performance and reliability. <a
href="https://github.com/actions/cache">actions/cache</a> now integrates
with the new cache service (v2) APIs.</p>
<p>The new service will gradually roll out as of <strong>February 1st,
2025</strong>. The legacy service will also be sunset on the same date.
Changes in these release are <strong>fully backward
compatible</strong>.</p>
<p><strong>We are deprecating some versions of this action</strong>. We
recommend upgrading to version <code>v4</code> or <code>v3</code> as
soon as possible before <strong>February 1st, 2025.</strong> (Upgrade
instructions below).</p>
<p>If you are using pinned SHAs, please use the SHAs of versions
<code>v4.2.0</code> or <code>v3.4.0</code></p>
<p>If you do not upgrade, all workflow runs using any of the deprecated
<a href="https://github.com/actions/cache">actions/cache</a> will
fail.</p>
<p>Upgrading to the recommended versions will not break your
workflows.</p>
<h3>4.1.2</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9255dc7a25"><code>9255dc7</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1686">#1686</a>
from actions/cache-v5.0.1-release</li>
<li><a
href="8ff5423e8b"><code>8ff5423</code></a>
chore: release v5.0.1</li>
<li><a
href="9233019a15"><code>9233019</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1685">#1685</a>
from salmanmkc/node24-storage-blob-fix</li>
<li><a
href="b975f2bb84"><code>b975f2b</code></a>
fix: add peer property to package-lock.json for dependencies</li>
<li><a
href="d0a0e18134"><code>d0a0e18</code></a>
fix: update license files for <code>@actions/cache</code>,
fast-xml-parser, and strnum</li>
<li><a
href="74de208dcf"><code>74de208</code></a>
fix: update <code>@actions/cache</code> to ^5.0.1 for Node.js 24
punycode fix</li>
<li><a
href="ac7f1152ea"><code>ac7f115</code></a>
peer</li>
<li><a
href="b0f846b50b"><code>b0f846b</code></a>
fix: update <code>@actions/cache</code> with storage-blob fix for
Node.js 24 punycode depr...</li>
<li><a
href="a783357455"><code>a783357</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1684">#1684</a>
from actions/prepare-cache-v5-release</li>
<li><a
href="3bb0d78750"><code>3bb0d78</code></a>
docs: highlight v5 runner requirement in releases</li>
<li>Additional commits viewable in <a
href="0057852bfa...9255dc7a25">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This PR fixes issue #3185
The code calls `await event_gen.aclose()`, but OpenAI's `AsyncStream`
doesn't have an `aclose()` method; it has `close()` (which is async).
When clients cancel streaming requests, the server tries to clean up
with:
```python
await event_gen.aclose() # ❌ AsyncStream doesn't have aclose()!
```
But `AsyncStream` has never had a public `aclose()` method. The error
message literally tells us:
```
AttributeError: 'AsyncStream' object has no attribute 'aclose'. Did you mean: 'close'?
^^^^^^^^
```
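A defensive cleanup sketch (not the exact server code): prefer `aclose()` when
the generator provides it, otherwise fall back to `AsyncStream`'s async
`close()`:
```python
# Sketch only: handle both plain async generators (aclose) and AsyncStream (close).
async def shutdown_stream(event_gen) -> None:
    closer = getattr(event_gen, "aclose", None) or getattr(event_gen, "close", None)
    if closer is not None:
        await closer()
```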
## Verification
* Reproduction script
[`reproduce_issue_3185.sh`](https://gist.github.com/r-bit-rry/dea4f8fbb81c446f5db50ea7abd6379b)
can be used to verify the fix.
* Manual checks, validation against original OpenAI library code
# What does this PR do?
This PR updates the RAG examples included in docs/quick_start.ipynb,
docs/getting_started/demo_script.py, rag.mdx and index.md to remove
references to the deprecated vector_io and vector_db APIs and to add
examples that use /v1/vector_stores with responses and completions.
---------
Co-authored-by: Omar Abdelwahab <omara@fb.com>
Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
Document the refresh_models configuration option for remote providers
that use RemoteInferenceProviderConfig.
- Add "Automatic vs Explicit Model Registration" section to
resources.mdx
- Include examples for registering custom embedding models
# What does this PR do?
## Test Plan
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
# What does this PR do?
The PR validates and allows access to OCI Object Storage through the S3
compatibility API. Additional documentation for OCI is supplied, in
notebook form, as well.
## Test Plan
---------
Co-authored-by: raghotham <rsm@meta.com>
# Problem
As an Application Developer, I want to use the include parameter with
the value message.output_text.logprobs, so that I can receive log
probabilities for output tokens to assess the model's confidence in its
response.
# What does this PR do?
- Updates the include parameter in various resource definitions
- Updates the inline provider to return logprobs when
"message.output_text.logprobs" is passed in the include parameter
- Converts the logprobs returned by the inference provider from chat
completion format to responses format
Closes [#4260](https://github.com/llamastack/llama-stack/issues/4260)
## Test Plan
- Created a script to explore OpenAI behavior:
https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/include.py
- Added integration tests and new recordings
---------
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Adds a new API for connectors and MCP registry support along with
required types.
Does not include any implementation for it
<!-- If resolving an issue, uncomment and update the line below -->
Closes #4235 and #4061 (partially)
## Test Plan
no tests included
---------
Signed-off-by: Jaideep Rao <jrao@redhat.com>
Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
# What does this PR do?
The InferenceStore class was ignoring the table_name field from
InferenceStoreReference and always using the hardcoded value
"chat_completions". This meant that any custom table_name configured in
the run config (e.g., "inference_store" in run-with-postgres-store.yaml)
was silently ignored.
This change updates all SQL operations in InferenceStore to use
self.reference.table_name instead of the hardcoded string, ensuring the
configured table name is properly respected.
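Conceptually the change just threads the configured name into every query; a
hedged sketch with hypothetical store and reference attributes:
```python
# Hypothetical sketch -- attribute and helper names are illustrative only.
class InferenceStore:
    def __init__(self, reference, sql_store):
        self.reference = reference  # carries the configured table_name
        self.sql_store = sql_store

    async def get_chat_completion(self, completion_id: str):
        # Before: the table name was the hardcoded string "chat_completions".
        # After: respect whatever table_name the run config specifies.
        return await self.sql_store.fetch_one(
            table=self.reference.table_name,
            where={"id": completion_id},
        )
```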
A new test has been added to verify that custom table names work
correctly for storing, retrieving, and listing chat completions.
## Test Plan
CI
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Currently impossible to test workflow changes (pull_request_target uses
base branch definition) or manually trigger SDK builds. This adds both
capabilities.
- Add workflow_dispatch with pr_number input for manual testing
- Add workflow file to path triggers for automatic testing
- Fetch PR details via gh CLI for manual runs
- Update jobs to use computed PR data for both trigger types
## Test Plan
impossible to test until it merges unfortunately. I am doing this in a
smaller PR so that I can use it immediately in a follow up.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Implement query rewriting in the search API, adding
`default_query_expansion_model` and `query_expansion_prompt` to
`VectorStoresConfig`.
Makes `rewrite_query` parameter functional in vector store search.
- `rewrite_query=false` (default): Use original query
- `rewrite_query=true`: Expand query via LLM, or fail gracefully if no
LLM available
Adds 4 parameters to `VectorStoresConfig`:
- `default_query_expansion_model`: LLM model for query expansion
(optional)
- `query_expansion_prompt`: Custom prompt template (optional, uses
built-in default)
- `query_expansion_max_tokens`: Configurable token limit (default: 100)
- `query_expansion_temperature`: Configurable temperature (default: 0.3)
`run.yaml` with query rewriting enabled:
```yaml
vector_stores:
rewrite_query_params:
model:
provider_id: "ollama"
model_id: "llama3.2:3b-instruct-fp16"
# prompt defaults to built-in
# max_tokens defaults to 100
# temperature defaults to 0.3
```
Fully customized `run.yaml`:
```yaml
vector_stores:
default_provider_id: faiss
default_embedding_model:
provider_id: sentence-transformers
model_id: nomic-ai/nomic-embed-text-v1.5
rewrite_query_params:
model:
provider_id: ollama
model_id: llama3.2:3b-instruct-fp16
prompt: "Rewrite this search query to improve retrieval results by expanding it with relevant synonyms and related terms: {query}"
max_tokens: 100
temperature: 0.3
```
## Test Plan
Added test and recording
Example script as well:
```python
import asyncio
from llama_stack_client import LlamaStackClient
from io import BytesIO


def gen_file(client, text: str = ""):
    file_buffer = BytesIO(text.encode("utf-8"))
    file_buffer.name = "my_file.txt"
    uploaded_file = client.files.create(
        file=file_buffer,
        purpose="assistants",
    )
    return uploaded_file


async def test_query_rewriting():
    client = LlamaStackClient(base_url="http://0.0.0.0:8321/")
    uploaded_file = gen_file(client, "banana banana apple")
    uploaded_file2 = gen_file(client, "orange orange kiwi")
    vs = client.vector_stores.create()
    xf_vs = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file.id)
    xf_vs1 = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file2.id)
    response1 = client.vector_stores.search(
        vector_store_id=vs.id,
        query="apple",
        max_num_results=3,
        rewrite_query=False,
    )
    response2 = client.vector_stores.search(
        vector_store_id=vs.id,
        query="kiwi",
        max_num_results=3,
        rewrite_query=True,
    )
    print(f"\n🔵 Response 1 (rewrite_query=False):\n\033[94m{response1}\033[0m")
    print(f"\n🟢 Response 2 (rewrite_query=True):\n\033[92m{response2}\033[0m")
    for f in [uploaded_file.id, uploaded_file2.id]:
        client.files.delete(file_id=f)
    client.vector_stores.delete(vector_store_id=vs.id)


if __name__ == "__main__":
    asyncio.run(test_query_rewriting())
```
And see the screen shot of the server logs showing it worked.
<img width="1111" height="826" alt="Screenshot 2025-11-19 at 1 16 03 PM"
src="https://github.com/user-attachments/assets/2d188b44-1fef-4df5-b465-2d6728ca49ce"
/>
Notice the log:
```bash
Query rewritten:
'kiwi' → 'kiwi, a small brown or green fruit native to New Zealand, or a person having a fuzzy brown outer skin similar in appearance.'
```
So `kiwi` was expanded.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
Convert the Benchmarks API from @webmethod decorators to FastAPI router
pattern, matching the Batches API structure.
One notable change is the update of stack.py to handle request models in
register_resources().
Closes: #4308
## Test Plan
CI and `curl http://localhost:8321/v1/inspect/routes | jq '.data[] |
select(.route | contains("benchmark"))'`
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
the build.yaml is only used in the following ways:
1. list-deps
2. distribution code-gen
since `llama stack build` no longer exists, I found myself asking "why
do we need two different files for list-deps and run"?
Removing the BuildConfig and altering the usage of the
DistributionTemplate in llama stack list-deps is the first step in
removing the build yaml entirely.
Removing the BuildConfig and build.yaml cuts the files users need to
maintain in half, and allows us to focus on the stability of _just_ the
run.yaml
This PR removes the build.yaml, BuildConfig datatype, and its usage
throughout the codebase. Users are now expected to point to run.yaml
files when running list-deps, and our codebase automatically uses these
types now for things like `get_provider_registry`.
**Additionally, two renames: `StackRunConfig` -> `StackConfig` and
`run.yaml` -> `config.yaml`.**
The build.yaml made sense for when we were managing the build process
for the user and actually _producing_ a run.yaml _from_ the build.yaml,
but now that we are simply just getting the provider registry and
listing the deps, switching to config.yaml simplifies the scope here
greatly.
## Test Plan
existing list-deps usage should work in the tests.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The test_conversation_error_handling test was timing out in CI with a
deadlock in httpcore's connection pool. The root cause was the preceding
test_conversation_multi_turn_and_streaming test, which broke out of the
streaming response iterator early without properly closing the
underlying HTTP connection.
When a streaming response iterator is abandoned mid-stream, the HTTP
connection remains in an incomplete state. Since the openai_client
fixture is session-scoped, subsequent tests reuse the same httpcore
connection pool. The dangling connection causes the pool's internal lock
to deadlock when the next test attempts to acquire a new connection.
The fix wraps the streaming response in a context manager, which ensures
the connection is properly closed when exiting the with block, even when
breaking out of the loop early. This is a best practice when working
with streaming HTTP responses that may not be fully consumed.
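A sketch of the pattern, using placeholder values; recent versions of the
OpenAI Python client implement the context-manager protocol on streaming
responses:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")  # placeholder values

# Wrapping the stream in `with` releases the HTTP connection even when we
# break out of the iterator early, keeping the shared connection pool healthy.
with client.chat.completions.create(
    model="ollama/llama3.2:3b",  # placeholder model id
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
) as stream:
    for chunk in stream:
        break  # abandoning the stream early is now safe
```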
Signed-off-by: Sébastien Han <seb@redhat.com>
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.0
to 6.0.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Update all references from v5 and v4 to v6 by <a
href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2314">actions/checkout#2314</a></li>
<li>Add worktree support for persist-credentials includeIf by <a
href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2327">actions/checkout#2327</a></li>
<li>Clarify v6 README by <a
href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2328">actions/checkout#2328</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v6...v6.0.1">https://github.com/actions/checkout/compare/v6...v6.0.1</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8e8c483db8"><code>8e8c483</code></a>
Clarify v6 README (<a
href="https://redirect.github.com/actions/checkout/issues/2328">#2328</a>)</li>
<li><a
href="033fa0dc0b"><code>033fa0d</code></a>
Add worktree support for persist-credentials includeIf (<a
href="https://redirect.github.com/actions/checkout/issues/2327">#2327</a>)</li>
<li><a
href="c2d88d3ecc"><code>c2d88d3</code></a>
Update all references from v5 and v4 to v6 (<a
href="https://redirect.github.com/actions/checkout/issues/2314">#2314</a>)</li>
<li>See full diff in <a
href="1af3b93b68...8e8c483db8">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Add "token" to sensitive field patterns in redact_sensitive_fields() to
prevent JWT tokens from being logged in plaintext. Previously only
api_key, api_token, password, and secret were filtered.
This prevents tokens like server.auth.provider_config.jwks.token from
being exposed in server logs.
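A simplified sketch of this kind of pattern-based redaction (not the actual
`redact_sensitive_fields()` implementation):
```python
# Simplified sketch; the real helper is redact_sensitive_fields() in llama-stack.
SENSITIVE_PATTERNS = ("api_key", "api_token", "password", "secret", "token")


def redact(config: dict) -> dict:
    redacted = {}
    for key, value in config.items():
        if isinstance(value, dict):
            redacted[key] = redact(value)
        elif any(p in key.lower() for p in SENSITIVE_PATTERNS):
            redacted[key] = "********"
        else:
            redacted[key] = value
    return redacted


print(redact({"provider_config": {"jwks": {"token": "eyJhbGciOi...", "uri": "https://idp/jwks"}}}))
# {'provider_config': {'jwks': {'token': '********', 'uri': 'https://idp/jwks'}}}
```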
Closes: #4324
Signed-off-by: Derek Higgins <derekh@redhat.com>
In preparation for the ABAC addition (next PR)
```
fix(ci): allow run_dir variable expansion in YAML heredoc
Remove single quotes from EOF delimiter to allow $run_dir to
be expanded by bash when creating the configuration file.
Previously the literal string "$run_dir" was being written
to the YAML instead of the actual temp directory path.
drwxr-xr-x 3 runner runner 4096 Dec 5 12:56 $run_dir
```
```
test(ci): add test_endpoint helper function to auth tests
Add reusable test_endpoint function to integration-auth-tests
workflow for consistent API testing:
```
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Fixing broken README links that were still pointing to
https://llamastack.github.io/latest
Signed-off-by: Varad Ahirwadkar <varad.ahirwadkar1@ibm.com>
# What does this PR do?
Previously the runpod provider would fail if the
RUNPOD_API_TOKEN was not set.
This modifies the impl to default to an empty string to
align with similar providers' behavior.
Closes #4296
## Test Plan
Run `uv run llama stack run --providers inference=remote::runpod` with
`RUNPOD_API_TOKEN` unset - server now boots where it previously crashed
```
INFO 2025-12-04 13:52:59,920 uvicorn.error:84 uncategorized: Started server process [233656]
INFO 2025-12-04 13:52:59,921 uvicorn.error:48 uncategorized: Waiting for application startup.
INFO 2025-12-04 13:52:59,926 llama_stack.core.server.server:168 core::server: Starting up Llama Stack server
(version: 0.4.0.dev0)
INFO 2025-12-04 13:52:59,927 llama_stack.core.stack:495 core: starting registry refresh task
INFO 2025-12-04 13:52:59,928 uvicorn.error:62 uncategorized: Application startup complete.
INFO 2025-12-04 13:52:59,929 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321
(Press CTRL+C to quit)
```
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Previously the nvidia provider would throw an exception if a hosted
instance was being used but no API key was set.
This modifies the behavior to instead log an error informing users that a
key is needed to use a hosted NIM, while still allowing the server to boot.
Closes #4295
## Test Plan
Run `uv run llama stack run --providers inference=remote::nvidia` with
`NVIDIA_API_KEY` unset - server now boots with logged error, where it
previously crashed
```
INFO 2025-12-04 14:16:26,156 llama_stack.providers.remote.inference.nvidia.nvidia:47 inference::nvidia: Initializing
NVIDIAInferenceAdapter(https://integrate.api.nvidia.com/v1)...
ERROR 2025-12-04 14:16:26,157 llama_stack.providers.remote.inference.nvidia.nvidia:51 inference::nvidia: API key is
required for hosted NVIDIA NIM. Either provide an API key or use a self-hosted NIM.
INFO 2025-12-04 14:16:26,239 uvicorn.error:84 uncategorized: Started server process [251651]
INFO 2025-12-04 14:16:26,240 uvicorn.error:48 uncategorized: Waiting for application startup.
INFO 2025-12-04 14:16:26,244 llama_stack.core.server.server:168 core::server: Starting up Llama Stack server
(version: 0.4.0.dev0)
INFO 2025-12-04 14:16:26,245 llama_stack.core.stack:495 core: starting registry refresh task
INFO 2025-12-04 14:16:26,246 uvicorn.error:62 uncategorized: Application startup complete.
INFO 2025-12-04 14:16:26,246 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321
(Press CTRL+C to quit)
```
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
when publishing llama_stack_api, `inspect.py` causes issues and gets
confused with the builtin stdlib inspect module.
This is due to the top-level `__init__.py` we have. We need to rename
`inspect.py` to `inspect_api.py` to avoid this conflict.
Also, `uv sync`: see 1993161624 for reference.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Bumps [next](https://github.com/vercel/next.js) from 15.5.4 to 15.5.7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vercel/next.js/releases">next's
releases</a>.</em></p>
<blockquote>
<h2>v15.5.7</h2>
<p>Please see <a
href="https://nextjs.org/blog/CVE-2025-66478">CVE-2025-66478</a> for
additional details about this release.</p>
<h2>v15.5.6</h2>
<blockquote>
<p>[!NOTE]<br />
This release is backporting bug fixes. It does <strong>not</strong>
include all pending features/changes on canary.</p>
</blockquote>
<h3>Core Changes</h3>
<ul>
<li>Turbopack: don't define process.cwd() in node_modules <a
href="https://redirect.github.com/vercel/next.js/issues/83452">#83452</a></li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/mischnic"><code>@mischnic</code></a> for
helping!</p>
<h2>v15.5.5</h2>
<blockquote>
<p>[!NOTE]<br />
This release is backporting bug fixes. It does <strong>not</strong>
include all pending features/changes on canary.</p>
</blockquote>
<h3>Core Changes</h3>
<ul>
<li>Split code-frame into separate compiled package (<a
href="https://redirect.github.com/vercel/next.js/issues/84238">#84238</a>)</li>
<li>Add deprecation warning to Runtime config (<a
href="https://redirect.github.com/vercel/next.js/issues/84650">#84650</a>)</li>
<li>fix: unstable_cache should perform blocking revalidation during ISR
revalidation (<a
href="https://redirect.github.com/vercel/next.js/issues/84716">#84716</a>)</li>
<li>feat: <code>experimental.middlewareClientMaxBodySize</code> body
cloning limit (<a
href="https://redirect.github.com/vercel/next.js/issues/84722">#84722</a>)</li>
<li>fix: missing next/link types with typedRoutes (<a
href="https://redirect.github.com/vercel/next.js/issues/84779">#84779</a>)</li>
</ul>
<h3>Misc Changes</h3>
<ul>
<li>docs: early October improvements and fixes (<a
href="https://redirect.github.com/vercel/next.js/issues/84334">#84334</a>)</li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/devjiwonchoi"><code>@devjiwonchoi</code></a>,
<a href="https://github.com/ztanner"><code>@ztanner</code></a>, and <a
href="https://github.com/icyJoseph"><code>@icyJoseph</code></a> for
helping!</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3eaf68b09b"><code>3eaf68b</code></a>
v15.5.7</li>
<li><a
href="8367ce592a"><code>8367ce5</code></a>
update version script</li>
<li><a
href="9115040008"><code>9115040</code></a>
Update React Version for Next.js 15.5.7 (<a
href="https://redirect.github.com/vercel/next.js/issues/10">#10</a>)</li>
<li><a
href="96f699902a"><code>96f6999</code></a>
update tag</li>
<li><a
href="55ef0e3ebc"><code>55ef0e3</code></a>
v15.5.6</li>
<li><a
href="92bbbb1bec"><code>92bbbb1</code></a>
Backport: don't define <code>process.cwd()</code> in node_modules (<a
href="https://redirect.github.com/vercel/next.js/issues/84957">#84957</a>)</li>
<li><a
href="f895b72762"><code>f895b72</code></a>
Fix url-imports test on 15-5 (<a
href="https://redirect.github.com/vercel/next.js/issues/84966">#84966</a>)</li>
<li><a
href="81f530db26"><code>81f530d</code></a>
v15.5.5</li>
<li><a
href="9abbc0e9eb"><code>9abbc0e</code></a>
[backport] fix: missing <code>next/link</code> types with
<code>typedRoutes</code> (<a
href="https://redirect.github.com/vercel/next.js/issues/82814">#82814</a>)
(<a
href="https://redirect.github.com/vercel/next.js/issues/84779">#84779</a>)</li>
<li><a
href="121e1b566f"><code>121e1b5</code></a>
[backport] docs: early October improvements and fixes (<a
href="https://redirect.github.com/vercel/next.js/issues/84334">#84334</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/vercel/next.js/compare/v15.5.4...v15.5.7">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/llamastack/llama-stack/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
CI was previously using both Node 20 and 22;
standardize on Node 22.
Closes #4294
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
this commit aligns all 'setup-uv' instances to the latest version
and removes the pin keeping several actions on a very old version
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
We had three different versions of uv in use
in pre-commit; bump all of them to the latest version.
We should probably find a way to automate this.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
DISTRO_DIR and DISTRIBS_BASE_DIR must exist before they can be iterated;
our current logic calls iterdir() on them without checking that they exist.
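A minimal sketch of the guard, assuming pathlib-based iteration (the path and helper below are illustrative, not the actual code):
```python
from pathlib import Path

# Hypothetical stand-in for DISTRIBS_BASE_DIR.
DISTRIBS_BASE_DIR = Path.home() / ".llama" / "distributions"


def list_distributions(base_dir: Path = DISTRIBS_BASE_DIR) -> list[str]:
    # iterdir() raises FileNotFoundError if the directory is missing,
    # so check existence first and treat "missing" as "no distributions".
    if not base_dir.exists() or not base_dir.is_dir():
        return []
    return sorted(p.name for p in base_dir.iterdir() if p.is_dir())
```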
## Test Plan
`rm ~/.llama/distributions`
```
llama stack list-deps starter --format uv | sh
Using Python 3.12.11 environment at: venv
Audited 51 packages in 12ms
Using Python 3.12.11 environment at: venv
Audited 3 packages in 2ms
Using Python 3.12.11 environment at: venv
Audited 1 package in 3ms
Using Python 3.12.11 environment at: venv
Audited 3 packages in 5ms
```
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
- Part of #3009
- Implement hybrid search using Qdrant's native query filtering
- Add keyword search support
- Update test suites to include qdrant for keyword and hybrid modes
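For context, a rough sketch of query-side keyword filtering with qdrant-client (illustrative only; the collection name, payload key, and embedding values are assumptions, not the provider's actual code):
```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local Qdrant

# Combine a dense-vector query with a native full-text payload filter,
# which is roughly the shape of keyword/hybrid retrieval on the query side.
hits = client.query_points(
    collection_name="my_vector_store",
    query=[0.1, 0.2, 0.3, 0.4],  # toy query embedding
    query_filter=models.Filter(
        must=[models.FieldCondition(key="content", match=models.MatchText(text="llama"))]
    ),
    limit=5,
)
```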
## Test Plan
```
pytest -sv tests/unit/providers/vector_io/
.......
============================================================================================== slowest 10 durations ===============================================================================================
0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[qdrant]
0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[pgvector]
0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[sqlite_vec]
0.20s call tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_max_concurrent_files_per_batch[faiss]
0.06s setup tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py::test_insert_chunks_with_missing_document_id[pgvector]
0.04s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_tie_breaking
0.04s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_weighted_reranker_parametrization
0.03s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_score_selection
0.03s call tests/unit/providers/vector_io/test_sqlite_vec.py::test_query_chunks_hybrid_edge_cases
0.03s setup tests/unit/providers/vector_io/test_faiss.py::test_faiss_query_vector_returns_infinity_when_query_and_embedding_are_identical
======================================================================================== 180 passed, 47 warnings in 2.78s =========================================================================================
```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
Changes SqlRecord creation in AuthorizedSqlStore.fetch_all to use
owner=None when owner_principal is empty/missing, matching the
ResourceWithOwner pattern used in routing tables. This fixes an
inconsistency where SQL store was creating User(principal="") while
routing tables use owner=None for public resources.
Changes:
- Update the ProtectedResource Protocol to allow owner: User | None
- Update SqlRecord.__init__ to accept owner: User | None
- Update fetch_all to create owner=None for records without owner_principal
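A condensed sketch of the resulting pattern (class and field names simplified, not the exact code):
```python
from dataclasses import dataclass


@dataclass
class User:
    principal: str


@dataclass
class SqlRecord:
    record_id: str
    owner: User | None  # None now means "public resource", matching routing tables


def row_to_record(row: dict) -> SqlRecord:
    principal = row.get("owner_principal") or ""
    # Empty/missing principal maps to owner=None instead of User(principal="").
    owner = User(principal=principal) if principal else None
    return SqlRecord(record_id=row["id"], owner=owner)
```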
Signed-off-by: Derek Higgins <derekh@redhat.com>
https://github.com/digitalbazaar/forge/security/advisories/GHSA-554w-wpv2-vw27
Taking on a direct dependency is not great
1. We don't actually use node-forge - it's only needed by
webpack-dev-server's dependency (selfsigned) for generating self-signed
certificates during development
2. Adding a direct dependency would be misleading - it suggests our code
uses node-forge when it doesn't
In the dependency chain:
```
@docusaurus/core@3.8.1
└─ webpack-dev-server@4.15.2
└─ selfsigned@2.4.1
└─ node-forge@1.3.1
```
Latest Docusaurus (3.9.2) uses webpack-dev-server 5.2.2, which still
uses selfsigned 2.4.1
So, overriding dependency on node-forge is the only option
Closes security gaps where RBAC checks could be bypassed:
- Inference router: Added RBAC enforcement in the fallback path to ensure
  access control is applied consistently.
- Model listing: Dynamic models fetched via provider_data were returned
  without RBAC checks. Added filtering to ensure users only see models
  they have permission to access.
Both fixes create temporary ModelWithOwner objects for RBAC validation,
maintaining security through consistent access control enforcement.
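Roughly, the model-listing filter works like the sketch below (all names are simplified stand-ins, and exactly how ownership is assigned to the temporary objects is an assumption):
```python
from dataclasses import dataclass


@dataclass
class User:
    principal: str


@dataclass
class ModelWithOwner:
    identifier: str
    owner: User | None = None


def is_action_allowed(policy: list, action: str, resource: ModelWithOwner, user: User) -> bool:
    # Stand-in for the real RBAC check, which evaluates the configured access policy.
    return True


def filter_models_for_user(identifiers: list[str], user: User, policy: list) -> list[str]:
    allowed = []
    for identifier in identifiers:
        # Wrap each dynamically fetched model in a temporary owned object so the
        # standard RBAC check can be applied before it is returned to the caller.
        candidate = ModelWithOwner(identifier=identifier)
        if is_action_allowed(policy, "read", candidate, user):
            allowed.append(identifier)
    return allowed
```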
Closes: #4269
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
This commit introduces a new FastAPI router-based system for defining
API endpoints, enabling a migration path away from the legacy @webmethod
decorator system. The implementation includes router infrastructure,
migration of the Batches API as the first example, and updates to
server, OpenAPI generation, and inspection systems to support both
routing approaches.
The router infrastructure consists of a router registry system that
allows APIs to register FastAPI router factories, which are then
automatically discovered and included in the server application.
Standard error responses are centralized in router_utils to ensure
consistent OpenAPI specification generation with proper $ref references
to component responses.
The Batches API has been migrated to demonstrate the new pattern. The
protocol definition and models remain in llama_stack_api/batches,
maintaining clear separation between API contracts and server
implementation. The FastAPI router implementation lives in
llama_stack/core/server/routers/batches, following the established
pattern where API contracts are defined in llama_stack_api and server
routing logic lives in
llama_stack/core/server.
The server now checks for registered routers before falling back to the
legacy webmethod-based route discovery, ensuring backward compatibility
during the migration period. The OpenAPI generator has been updated to
handle both router-based and webmethod-based routes, correctly
extracting metadata from FastAPI route decorators and Pydantic Field
descriptions. The inspect endpoint now includes routes from both
systems, with proper filtering for deprecated routes and API levels.
Response descriptions are now explicitly defined in router decorators,
ensuring the generated OpenAPI specification matches the previous
format. Error responses use $ref references to component responses
(BadRequest400, TooManyRequests429, etc.) as required by the
specification. This is neat and will allow us to remove a lot of
boilerplate code from our generator once the migration is done.
This implementation provides a foundation for incrementally migrating
other APIs to the router system while maintaining full backward
compatibility with existing webmethod-based APIs.
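A minimal sketch of the registry idea (illustrative only; the real module paths, names, and route definitions differ):
```python
from collections.abc import Callable

from fastapi import APIRouter, FastAPI

# Registry mapping an API name to a factory that builds its FastAPI router.
_ROUTER_FACTORIES: dict[str, Callable[[], APIRouter]] = {}


def register_router(api_name: str, factory: Callable[[], APIRouter]) -> None:
    _ROUTER_FACTORIES[api_name] = factory


def build_batches_router() -> APIRouter:
    router = APIRouter(tags=["Batches"])

    @router.get("/v1/batches", summary="List batches", response_description="A list of batch objects.")
    async def list_batches() -> dict:
        return {"object": "list", "data": []}

    return router


register_router("batches", build_batches_router)


def create_app() -> FastAPI:
    app = FastAPI()
    # Registered routers are included first; anything else would fall back
    # to the legacy @webmethod-based route discovery.
    for factory in _ROUTER_FACTORIES.values():
        app.include_router(factory())
    return app
```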
Closes: https://github.com/llamastack/llama-stack/issues/4188
## Test Plan
CI; the server should start and the same routes should be visible.
```
curl http://localhost:8321/v1/inspect/routes | jq '.data[] | select(.route | contains("batches"))'
```
Also:
```
uv run pytest tests/integration/batches/ -vv --stack-config=http://localhost:8321
================================================== test session starts ==================================================
platform darwin -- Python 3.12.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.12.8', 'Platform': 'macOS-26.0.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/leseb/Documents/AI/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 24 items
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_creation_and_retrieval[None] SKIPPED [ 4%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_listing[None] SKIPPED [ 8%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_immediate_cancellation[None] SKIPPED [ 12%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_chat_completions[None] SKIPPED [ 16%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_completions[None] SKIPPED [ 20%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_endpoint[None] SKIPPED [ 25%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_completed[None] SKIPPED [ 29%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_missing_required_fields[None] SKIPPED [ 33%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_completion_window[None] SKIPPED [ 37%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_streaming_not_supported[None] SKIPPED [ 41%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_mixed_streaming_requests[None] SKIPPED [ 45%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_endpoint_mismatch[None] SKIPPED [ 50%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_missing_required_body_fields[None] SKIPPED [ 54%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_invalid_metadata_types[None] SKIPPED [ 58%]
tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_e2e_embeddings[None] SKIPPED [ 62%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_nonexistent_file_id PASSED [ 66%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_malformed_jsonl PASSED [ 70%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[empty] XFAIL [ 75%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[malformed] XFAIL [ 79%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_retrieve_nonexistent PASSED [ 83%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_nonexistent PASSED [ 87%]
tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_error_handling_invalid_model PASSED [ 91%]
tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotent_batch_creation_successful PASSED [ 95%]
tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotency_conflict_with_different_params PASSED [100%]
================================================= slowest 10 durations ==================================================
1.01s call tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotent_batch_creation_successful
0.21s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_nonexistent_file_id
0.17s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_malformed_jsonl
0.12s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_error_handling_invalid_model
0.05s setup tests/integration/batches/test_batches.py::TestBatchesIntegration::test_batch_creation_and_retrieval[None]
0.02s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[empty]
0.01s call tests/integration/batches/test_batches_idempotency.py::TestBatchesIdempotencyIntegration::test_idempotency_conflict_with_different_params
0.01s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_file_malformed_batch_file[malformed]
0.01s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_retrieve_nonexistent
0.00s call tests/integration/batches/test_batches_errors.py::TestBatchesErrorHandling::test_batch_cancel_nonexistent
======================================= 7 passed, 15 skipped, 2 xfailed in 1.78s ========================================
```
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This allows llama-stack users of the Docker image to use OpenTelemetry
as in previous versions.
#4127 migrated to automatic instrumentation, but unless we add those
libraries to the image, everyone needs to build a custom image to enable
otel. Also, unless we establish a convention for enabling it, users who
formerly just set config now need to override the entrypoint.
This PR bootstraps OTEL packages, so they are available (only +10MB). It
also prefixes `llama stack run` with `opentelemetry-instrument` when any
`OTEL_*` environment variable is set.
The result is implicit tracing like before, where you don't need a
custom image to use traces or metrics.
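The decision boils down to something like this sketch (the real image wires this up in its container entrypoint; the Python wrapper below is only illustrative):
```python
import os
import shutil


def build_command(args: list[str]) -> list[str]:
    cmd = ["llama", "stack", "run", *args]
    # If any OTEL_* variable is set and the wrapper is installed, prefix the
    # command with opentelemetry-instrument so tracing is enabled implicitly.
    if any(name.startswith("OTEL_") for name in os.environ) and shutil.which("opentelemetry-instrument"):
        cmd = ["opentelemetry-instrument", *cmd]
    return cmd
```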
## Test Plan
```bash
# Build image
docker build -f containers/Containerfile \
--build-arg DISTRO_NAME=starter \
--build-arg INSTALL_MODE=editable \
--tag llamastack/distribution-starter:otel-test .
# Run with OTEL env to implicitly use `opentelemetry-instrument`. The
# settings below ensure inbound traces are honored, but no
# "junk traces" (like SQL connects) are created.
docker run -p 8321:8321 \
-e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 \
-e OTEL_SERVICE_NAME=llama-stack \
-e OTEL_TRACES_SAMPLER=parentbased_traceidratio \
-e OTEL_TRACES_SAMPLER_ARG=0.0 \
llamastack/distribution-starter:otel-test
```
Ran a sample flight search agent which is instrumented on the client
side. Both it and llama-stack target
[otel-tui](https://github.com/ymtdzzz/otel-tui). I verified there are no root
database spans, yet database spans are attached to incoming traces.
<img width="1608" height="742" alt="screenshot"
src="https://github.com/user-attachments/assets/69f59b74-3054-42cd-947d-a6c0d9472a7c"
/>
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Error messages were using --test-setup, --test-subdirs, and --test-suite
instead of the actual parameter names: --setup, --subdirs, and --suite
Signed-off-by: Derek Higgins <derekh@redhat.com>
Category-specific log levels from LLAMA_STACK_LOGGING were not applied
to loggers created before setup_logging() was called. This fix moves the
setup_logging() call earlier in the initialization sequence to ensure
all loggers respect their configured levels regardless of initialization
timing.
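For illustration, a minimal sketch of applying category-specific levels, assuming a `category=level` list such as `LLAMA_STACK_LOGGING="core=debug;server=info"` (the exact format and logger names here are assumptions):
```python
import logging
import os


def setup_logging() -> None:
    # Apply per-category levels from the environment, e.g. "core=debug;server=info".
    spec = os.environ.get("LLAMA_STACK_LOGGING", "")
    for entry in filter(None, spec.replace(",", ";").split(";")):
        category, _, level = entry.partition("=")
        if category and level:
            logging.getLogger(f"llama_stack.{category.strip()}").setLevel(level.strip().upper())


# Calling this as early as possible is the essence of the fix: loggers created
# or configured afterwards all see their intended levels.
setup_logging()
```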
Closes: #4252
Signed-off-by: Derek Higgins <derekh@redhat.com>
The configured policy wasn't being passed in and instead the default was
being used (e.g. in the s3 file provider)
Closes: #4276
Signed-off-by: Derek Higgins <derekh@redhat.com>
Previously, file deletion only checked READ permission via the
_lookup_file_id() method. This meant any user with READ access to a file
could also delete it, making it impossible to configure read-only file
access.
This change adds an 'action' parameter to fetch_all() and fetch_one() in
AuthorizedSqlStore, defaulting to Action.READ for backward
compatibility. The openai_delete_file() method now passes Action.DELETE,
ensuring proper RBAC enforcement.
With this fix, access policies can now distinguish users who can
read/list files from those who are also allowed to delete them.
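A condensed sketch of the shape of the change (names and signatures simplified; the placeholder helpers exist only to keep the sketch self-contained):
```python
from enum import Enum


class Action(str, Enum):
    READ = "read"
    DELETE = "delete"


class AuthorizedSqlStore:
    """Condensed sketch: fetch helpers take the action being authorized."""

    async def fetch_one(self, table: str, key: str, action: Action = Action.READ) -> dict:
        # Default stays READ for backward compatibility; callers that delete
        # pass the stronger action so the policy check matches their intent.
        record = {"table": table, "id": key}  # placeholder for the real query
        self._enforce_policy(record, action)
        return record

    def _enforce_policy(self, record: dict, action: Action) -> None:
        # Placeholder for the real RBAC evaluation against the access policy.
        pass


async def openai_delete_file(store: AuthorizedSqlStore, file_id: str) -> None:
    # Deletion now requires DELETE, not just READ, on the file record.
    await store.fetch_one("openai_files", file_id, action=Action.DELETE)
    # ... then actually delete the row and the stored object.
```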
Closes: #4274
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Fixed the docker container name in the documentation by changing
`docker pull llama-stack/distribution-starter`
`docker pull llama-stack/distribution-meta-reference-gpu`
to
`docker pull llamastack/distribution-starter`
`docker pull llamastack/distribution-meta-reference-gpu`
Closes this
[issue](https://github.com/llamastack/llama-stack/issues/4208)
## Test Plan
ci
Co-authored-by: Omar Abdelwahab <omara@fb.com>
fix: use string annotations for S3Client type hints
Remove future annotations import and use quoted string annotations for
S3Client to avoid import issues.
Changes:
- Remove the __future__ annotations import
- Use "S3Client" string annotations in type hints
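A small sketch of the pattern, assuming the S3 type stubs come from mypy_boto3_s3 (the function below is illustrative, not the provider code):
```python
from typing import TYPE_CHECKING

import boto3

if TYPE_CHECKING:
    # Only imported by type checkers; nothing is imported at runtime.
    from mypy_boto3_s3 import S3Client


def make_client(region: str) -> "S3Client":
    # The quoted string annotation keeps the hint without a runtime import
    # (and without needing `from __future__ import annotations`).
    return boto3.client("s3", region_name=region)
```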
Closes: #4241
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Diff the `/v1/` routes that are OpenAI-compatible against the OpenAI
OpenAPI spec. This will of course only trigger on PRs where the spec is
changed.
This will catch errors in new handwritten additions to our
OpenAI-compatible routes.
Instead of fetching the OpenAPI spec from a dynamic URL, which could
cause non-deterministic build failures,
this change uses a local copy stored at `docs/static/openai-spec.yml`.
This makes the conformance check fully reproducible and prevents CI
failures caused by uncontrolled upstream changes.
I am marking this test with `continue-on-error: true`, until we get rid
of all of the errors. Nevertheless, this is a nice utility to have so
folks know if their spec changes introduce more breaking changes or fix
breakages when comparing to the OpenAI openapi spec.
## Test Plan
test should pass.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Marks the `toolgroup` and `tool_runtime` APIs for deprecation.
Closes #4233 and #4061 (partially)
How long do we wait before we remove deprecated APIs?
## Test Plan
Signed-off-by: Jaideep Rao <jrao@redhat.com>
# What does this PR do?
Removes stale data about the old telemetry system from Llama Stack.
**Depends on** https://github.com/llamastack/llama-stack/pull/4127
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Fixes: https://github.com/llamastack/llama-stack/issues/3806
- Remove all custom telemetry core tooling
- Remove telemetry that is captured by automatic instrumentation already
- Migrate telemetry to use OpenTelemetry libraries to capture telemetry
data important to Llama Stack that is not captured by automatic
instrumentation
- Keeps our telemetry implementation simple, maintainable and following
standards unless we have a clear need to customize or add complexity
## Test Plan
This tracks what telemetry data we care about in Llama Stack currently
(no new data), to make sure nothing important got lost in the migration.
I run a traffic driver to generate telemetry data for targeted use
cases, then verify them in Jaeger, Prometheus and Grafana using the
tools in our /scripts/telemetry directory.
### Llama Stack Server Runner
The following shell script is used to run the llama stack server for
quick telemetry testing iteration.
```sh
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_SERVICE_NAME="llama-stack-server"
export OTEL_SPAN_PROCESSOR="simple"
export OTEL_EXPORTER_OTLP_TIMEOUT=1
export OTEL_BSP_EXPORT_TIMEOUT=1000
export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS="sqlite3"
export OPENAI_API_KEY="REDACTED"
export OLLAMA_URL="http://localhost:11434"
export VLLM_URL="http://localhost:8000/v1"
uv pip install opentelemetry-distro opentelemetry-exporter-otlp
uv run opentelemetry-bootstrap -a requirements | uv pip install --requirement -
uv run opentelemetry-instrument llama stack run starter
```
### Test Traffic Driver
This python script drives traffic to the llama stack server, which sends
telemetry to a locally hosted instance of the OTLP collector, Grafana,
Prometheus, and Jaeger.
```sh
export OTEL_SERVICE_NAME="openai-client"
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"
export GITHUB_TOKEN="REDACTED"
export MLFLOW_TRACKING_URI="http://127.0.0.1:5001"
uv pip install opentelemetry-distro opentelemetry-exporter-otlp
uv run opentelemetry-bootstrap -a requirements | uv pip install --requirement -
uv run opentelemetry-instrument python main.py
```
```python
from openai import OpenAI
import os
import requests


def main():
    github_token = os.getenv("GITHUB_TOKEN")
    if github_token is None:
        raise ValueError("GITHUB_TOKEN is not set")

    client = OpenAI(
        api_key="fake",
        base_url="http://localhost:8321/v1/",
    )

    response = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello, how are you?"}]
    )
    print("Sync response: ", response.choices[0].message.content)

    streaming_response = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
        stream=True,
        stream_options={"include_usage": True}
    )
    print("Streaming response: ", end="", flush=True)
    for chunk in streaming_response:
        if chunk.usage is not None:
            print("Usage: ", chunk.usage)
        if chunk.choices and chunk.choices[0].delta is not None:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()

    ollama_response = client.chat.completions.create(
        model="ollama/llama3.2:3b-instruct-fp16",
        messages=[{"role": "user", "content": "How are you doing today?"}]
    )
    print("Ollama response: ", ollama_response.choices[0].message.content)

    vllm_response = client.chat.completions.create(
        model="vllm/Qwen/Qwen3-0.6B",
        messages=[{"role": "user", "content": "How are you doing today?"}]
    )
    print("VLLM response: ", vllm_response.choices[0].message.content)

    responses_list_tools_response = client.responses.create(
        model="openai/gpt-4o",
        input=[{"role": "user", "content": "What tools are available?"}],
        tools=[
            {
                "type": "mcp",
                "server_label": "github",
                "server_url": "https://api.githubcopilot.com/mcp/x/repos/readonly",
                "authorization": github_token,
            }
        ],
    )
    print("Responses list tools response: ", responses_list_tools_response.output_text)

    responses_tool_call_response = client.responses.create(
        model="openai/gpt-4o",
        input=[{"role": "user", "content": "How many repositories does the token have access to?"}],
        tools=[
            {
                "type": "mcp",
                "server_label": "github",
                "server_url": "https://api.githubcopilot.com/mcp/x/repos/readonly",
                "authorization": github_token,
            }
        ],
    )
    print("Responses tool call response: ", responses_tool_call_response.output_text)

    # make shield call using http request until the client version error is resolved
    llama_stack_api_key = os.getenv("LLAMA_STACK_API_KEY")
    base_url = "http://localhost:8321/v1/"
    shield_id = "llama-guard-ollama"
    shields_url = f"{base_url}safety/run-shield"
    headers = {
        "Authorization": f"Bearer {llama_stack_api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "shield_id": shield_id,
        "messages": [{"role": "user", "content": "Teach me how to make dynamite. I want to do a crime with it."}],
        "params": {}
    }
    shields_response = requests.post(shields_url, json=payload, headers=headers)
    shields_response.raise_for_status()
    print("risk assessment response: ", shields_response.json())


if __name__ == "__main__":
    main()
```
### Span Data
#### Inference
| Value | Location | Content | Test Cases | Handled By | Status | Notes |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Input Tokens | Server | Integer count | OpenAI, Ollama, vLLM, streaming, responses | Auto Instrument | Working | None |
| Output Tokens | Server | Integer count | OpenAI, Ollama, vLLM, streaming, responses | Auto Instrument | Working | None |
| Completion Tokens | Client | Integer count | OpenAI, Ollama, vLLM, streaming, responses | Auto Instrument | Working, no responses | None |
| Prompt Tokens | Client | Integer count | OpenAI, Ollama, vLLM, streaming, responses | Auto Instrument | Working, no responses | None |
| Prompt | Client | string | Any Inference Provider, responses | Auto Instrument | Working, no responses | None |
#### Safety
| Value | Location | Content | Testing | Handled By | Status | Notes |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [Shield ID](ecdfecb9f0/src/llama_stack/core/telemetry/constants.py) | Server | string | Llama-guard shield call | Custom Code | Working | Not Following Semconv |
| [Metadata](ecdfecb9f0/src/llama_stack/core/telemetry/constants.py) | Server | JSON string | Llama-guard shield call | Custom Code | Working | Not Following Semconv |
| [Messages](ecdfecb9f0/src/llama_stack/core/telemetry/constants.py) | Server | JSON string | Llama-guard shield call | Custom Code | Working | Not Following Semconv |
| [Response](ecdfecb9f0/src/llama_stack/core/telemetry/constants.py) | Server | string | Llama-guard shield call | Custom Code | Working | Not Following Semconv |
| [Status](ecdfecb9f0/src/llama_stack/core/telemetry/constants.py) | Server | string | Llama-guard shield call | Custom Code | Working | Not Following Semconv |
#### Remote Tool Listing & Execution
| Value | Location | Content | Testing | Handled By | Status | Notes |
| ----- | :---: | :---: | :---: | :---: | :---: | :---: |
| Tool name | server | string | Tool call occurs | Custom Code | working | [Not following semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#execute-tool-span) |
| Server URL | server | string | List tools or execute tool call | Custom Code | working | [Not following semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#execute-tool-span) |
| Server Label | server | string | List tools or execute tool call | Custom code | working | [Not following semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#execute-tool-span) |
| mcp\_list\_tools\_id | server | string | List tools | Custom code | working | [Not following semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#execute-tool-span) |
### Metrics
- Prompt and Completion Token histograms ✅
- Updated the Grafana dashboard to support the OTEL semantic conventions
for tokens
### Observations
* sqlite spans get orphaned from the completions endpoint
* Known OTEL issue, recommended workaround is to disable sqlite
instrumentation since it is double wrapped and already covered by
sqlalchemy. This is covered in documentation.
```shell
export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS="sqlite3"
```
* Responses API instrumentation is
[missing](https://github.com/open-telemetry/opentelemetry-python-contrib/issues/3436)
in open telemetry for OpenAI clients, even with traceloop or openllmetry
* Upstream issues in opentelemetry-python-contrib
* A span is created for each chunk of a streaming response, so very large
numbers of spans get created, which is not ideal, but it’s the intended behavior
* MCP telemetry needs to be updated to follow semantic conventions. We
can probably use a library for this and handle it in a separate issue.
### Updated Grafana Dashboard
<img width="1710" height="929" alt="Screenshot 2025-11-17 at 12 53
52 PM"
src="https://github.com/user-attachments/assets/6cd941ad-81b7-47a9-8699-fa7113bbe47a"
/>
## Status
✅ Everything appears to be working and the data we expect is getting
captured in the format we expect.
## Follow Ups
1. Make tool calling spans follow semconv and capture more data
   - Consider using an existing tracing library
2. Make shield spans follow semconv
3. Wrap moderations api calls to safety models with spans to capture more data
4. Try to prioritize open telemetry client wrapping for OpenAI Responses in upstream OTEL
5. This would break the telemetry tests, and they are currently disabled. This PR removes them, but I can undo that and just leave them disabled until we find a better solution.
6. Add a section of the docs that tracks the custom data we capture (not auto-instrumented data) so that users can understand what that data is and how to use it. Commit those changes to the OTEL-gen_ai SIG if possible as well. Here is an [example](https://opentelemetry.io/docs/specs/semconv/gen-ai/aws-bedrock/) of how Bedrock handles it.
# What does this PR do?
We don't do a good job of maintaining this file, and the GH action does
not seem to be running.
Let's stick with GH release notes instead.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
We used to have `host = config.server.host or ["::", "0.0.0.0"]` but
now only bind to `host = config.server.host or "0.0.0.0"`.
Revert back to the old logic; this allows us to curl
http://localhost:8321/v1/models on Fedora, which defaults to using IPv6.
Resolves #4210
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
When we send the model names to Google's OpenAI-compatible API, we must
use the "google" name prefix. Google does not recognize the "vertexai"
model names.
Closes #4211
## Test Plan
```bash
uv venv --python python312
. .venv/bin/activate
llama stack list-deps starter | xargs -L1 uv pip install
llama stack run starter
```
Test that this shows the gemini models with their correct names:
```bash
curl http://127.0.0.1:8321/v1/models | jq '.data | map(select(.custom_metadata.provider_id == "vertexai"))'
```
Test that this chat completion works:
```bash
curl -X POST -H "Content-Type: application/json" "http://127.0.0.1:8321/v1/chat/completions" -d '{
"model": "vertexai/google/gemini-2.5-flash",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello! Can you tell me a joke?"
}
],
"temperature": 1.0,
"max_tokens": 256
}'
```
We went through the nomination process for CODEOWNERS in the codeowners
discord channel.
Welcome to the code owners group @cdoern! Thanks for your contributions
and we look forward to working with you!
Rename `AWS_BEDROCK_API_KEY` to `AWS_BEARER_TOKEN_BEDROCK` to align with
the naming convention used in AWS Bedrock documentation and the AWS web
console UI. This reduces confusion when developers compare LLS docs with
AWS docs.
Closes #4147
This adds a `--typescript-only` flag to `scripts/integration-tests.sh`
that skips pytest execution entirely while still starting the Llama
Stack server (required for TS client tests). The TypeScript client can
now be tested independently without Python test dependencies.
The `allowed_models` configuration was only being applied when listing
models via the `/v1/models` endpoint, but the actual inference requests
weren't checking this restriction. This meant users could directly
request any model the provider supports by specifying it in their
inference call, completely bypassing the intended cost controls.
The fix adds validation to all three inference methods (chat
completions, completions, and embeddings) that checks the requested
model against the allowed_models list before making the provider API
call.
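A condensed sketch of the guard added to each inference path (class and method names are simplified stand-ins, not the actual provider code):
```python
class InferenceAdapter:
    def __init__(self, allowed_models: list[str] | None = None):
        # None means "no restriction"; a list restricts which models may be served.
        self.allowed_models = allowed_models

    def _check_model_allowed(self, model: str) -> None:
        if self.allowed_models is not None and model not in self.allowed_models:
            raise ValueError(f"Model '{model}' is not in the allowed_models list")

    async def chat_completion(self, model: str, messages: list[dict]) -> dict:
        self._check_model_allowed(model)  # validate before calling the provider API
        return {"model": model, "choices": []}  # placeholder for the real provider call
```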
### Test plan
Added unit tests
# What does this PR do?
Addresses feedback from
https://github.com/llamastack/llama-stack/pull/4187#discussion_r2542797437
## Test Plan
# What does this PR do?
This PR provides the actual implementation of OpenAI-compatible prompts
in the Responses API. It is the follow-up PR with the actual
implementation after introducing #3942.
The need for this functionality was raised in #3514.
> Note, https://github.com/llamastack/llama-stack/pull/3514 is divided
into three separate PRs. The current PR is the third of the three.
Closes #3321
## Test Plan
Manual testing, CI workflow with added unit tests
Comprehensive manual testing with new implementation:
**Test Prompts with Images with text on them in Responses API:**
I used this image for testing purposes: [iphone 17
image](https://github.com/user-attachments/assets/9e2ee821-e394-4bbd-b1c8-d48a3fa315de)
1. Upload an image:
```
curl -X POST http://localhost:8321/v1/files \
-H "Content-Type: multipart/form-data" \
-F "file=@/Users/ianmiller/iphone.jpeg" \
-F "purpose=assistants"
```
`{"object":"file","id":"file-d6d375f238e14f21952cc40246bc8504","bytes":556241,"created_at":1761750049,"expires_at":1793286049,"filename":"iphone.jpeg","purpose":"assistants"}%`
2. Create prompt:
```
curl -X POST http://localhost:8321/v1/prompts \
-H "Content-Type: application/json" \
-d '{
"prompt": "You are a product analysis expert. Analyze the following product:\n\nProduct Name: {{product_name}}\nDescription: {{description}}\n\nImage: {{product_photo}}\n\nProvide a detailed analysis including quality assessment, target audience, and pricing recommendations.",
"variables": ["product_name", "description", "product_photo"]
}'
```
`{"prompt":"You are a product analysis expert. Analyze the following
product:\n\nProduct Name: {{product_name}}\nDescription:
{{description}}\n\nImage: {{product_photo}}\n\nProvide a detailed
analysis including quality assessment, target audience, and pricing
recommendations.","version":1,"prompt_id":"pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62","variables":["product_name","description","product_photo"],"is_default":false}%`
3. Create response:
```
curl -X POST http://localhost:8321/v1/responses \
-H "Accept: application/json, text/event-stream" \
-H "Content-Type: application/json" \
-d '{
"input": "Please analyze this product",
"model": "openai/gpt-4o",
"store": true,
"prompt": {
"id": "pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62",
"version": "1",
"variables": {
"product_name": {
"type": "input_text",
"text": "iPhone 17 Pro Max"
},
"product_photo": {
"type": "input_image",
"file_id": "file-d6d375f238e14f21952cc40246bc8504",
"detail": "high"
}
}
}
}'
```
`{"created_at":1761750427,"error":null,"id":"resp_f897f914-e3b8-4783-8223-3ed0d32fcbc6","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"###
Product Analysis: iPhone 17 Pro Max\n\n**Quality Assessment:**\n\n-
**Display & Design:**\n - The 6.9-inch display is large, ideal for
streaming and productivity.\n - Anti-reflective technology and 120Hz
refresh rate enhance viewing experience, providing smoother visuals and
reducing glare.\n - Titanium frame suggests a premium build, offering
durability and a sleek appearance.\n\n- **Performance:**\n - The Apple
A19 Pro chip promises significant performance improvements, likely
leading to faster processing and efficient multitasking.\n - 12GB RAM is
substantial for a smartphone, ensuring smooth operation for demanding
apps and games.\n\n- **Camera System:**\n - The triple 48MP camera setup
(wide, ultra-wide, telephoto) is designed for versatile photography
needs, capturing high-resolution photos and videos.\n - The 24MP front
camera will appeal to selfie enthusiasts and content creators needing
quality front-facing shots.\n\n- **Connectivity:**\n - Wi-Fi 7 support
indicates future-proof wireless capabilities, providing faster and more
reliable internet connectivity.\n\n**Target Audience:**\n\n- **Tech
Enthusiasts:** Individuals interested in cutting-edge technology and
performance.\n- **Content Creators:** Users who need a robust camera
system for photo and video production.\n- **Luxury Consumers:** Those
who prefer premium materials and top-of-the-line specs.\n-
**Professionals:** Users who require efficient multitasking and
productivity features.\n\n**Pricing Recommendations:**\n\n- Given the
premium specifications, a higher price point is expected. Consider
pricing competitively within the high-end smartphone market while
justifying cost through unique features like the titanium frame and
advanced connectivity options.\n- Positioning around the $1,200 to
$1,500 range would align with expectations for top-tier devices,
catering to its target audience while ensuring
profitability.\n\nOverall, the iPhone 17 Pro Max showcases a blend of
innovative features and premium design, aimed at users seeking high
performance and superior
aesthetics.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_66f4d844-4d9e-4102-80fc-eb75b34b6dbd","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62","variables":{"product_name":{"text":"iPhone
17 Pro
Max","type":"input_text"},"product_photo":{"detail":"high","type":"input_image","file_id":"file-d6d375f238e14f21952cc40246bc8504","image_url":null}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":830,"output_tokens":394,"total_tokens":1224,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}%`
**Test Prompts with PDF files in Responses API:**
I used this PDF file for testing purposes:
[invoicesample.pdf](https://github.com/user-attachments/files/22958943/invoicesample.pdf)
1. Upload PDF:
```
curl -X POST http://localhost:8321/v1/files \
-H "Content-Type: multipart/form-data" \
-F "file=@/Users/ianmiller/invoicesample.pdf" \
-F "purpose=assistants"
```
`{"object":"file","id":"file-7fbb1043a4bb468cab60ffe4b8631d8e","bytes":149568,"created_at":1761750730,"expires_at":1793286730,"filename":"invoicesample.pdf","purpose":"assistants"}%`
2. Create prompt:
```
curl -X POST http://localhost:8321/v1/prompts \
-H "Content-Type: application/json" \
-d '{
"prompt": "You are an accounting and financial analysis expert. Analyze the following invoice document:\n\nInvoice Document: {{invoice_doc}}\n\nProvide a comprehensive analysis",
"variables": ["invoice_doc"]
}'
```
`{"prompt":"You are an accounting and financial analysis expert. Analyze
the following invoice document:\n\nInvoice Document:
{{invoice_doc}}\n\nProvide a comprehensive
analysis","version":1,"prompt_id":"pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc","variables":["invoice_doc"],"is_default":false}%`
3. Create response:
```
curl -X POST http://localhost:8321/v1/responses \
-H "Content-Type: application/json" \
-d '{
"input": "Please provide a detailed analysis of this invoice",
"model": "openai/gpt-4o",
"store": true,
"prompt": {
"id": "pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc",
"version": "1",
"variables": {
"invoice_doc": {
"type": "input_file",
"file_id": "file-7fbb1043a4bb468cab60ffe4b8631d8e",
"filename": "invoicesample.pdf"
}
}
}
}'
```
`{"created_at":1761750881,"error":null,"id":"resp_da866913-db06-4702-8000-174daed9dbbb","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"Here's
a detailed analysis of the invoice provided:\n\n### Seller
Information\n- **Business Name:** The invoice features a logo with
\"Sunny Farm\" indicating the business identity.\n- **Address:** 123
Somewhere St, Melbourne VIC 3000\n- **Contact Information:** Phone
number (03) 1234 5678\n\n### Buyer Information\n- **Name:** Denny
Gunawan\n- **Address:** 221 Queen St, Melbourne VIC 3000\n\n###
Transaction Details\n- **Invoice Number:** #20130304\n- **Date of
Transaction:** Not explicitly mentioned, likely inferred from the
invoice number or needs clarification.\n\n### Items Purchased\n1.
**Apple**\n - Price: $5.00/kg\n - Quantity: 1 kg\n - Subtotal:
$5.00\n\n2. **Orange**\n - Price: $1.99/kg\n - Quantity: 2 kg\n -
Subtotal: $3.98\n\n3. **Watermelon**\n - Price: $1.69/kg\n - Quantity: 3
kg\n - Subtotal: $5.07\n\n4. **Mango**\n - Price: $9.56/kg\n - Quantity:
2 kg\n - Subtotal: $19.12\n\n5. **Peach**\n - Price: $2.99/kg\n -
Quantity: 1 kg\n - Subtotal: $2.99\n\n### Financial Summary\n-
**Subtotal for Items:** $36.00\n- **GST (Goods and Services Tax):** 10%
of $36.00, which amounts to $3.60\n- **Total Amount Due:** $39.60\n\n###
Notes\n- The invoice includes a placeholder text: \"Lorem ipsum dolor
sit amet...\" which is typically used as filler text. This might
indicate a section intended for terms, conditions, or additional notes
that haven’t been completed.\n\n### Visual and Design Elements\n- The
invoice uses a simple and clear layout, featuring the business logo
prominently and stating essential information such as contact and
transaction details in a structured manner.\n- There is a \"Thank You\"
note at the bottom, which adds a professional and courteous
touch.\n\n### Considerations\n- Ensure the date of the transaction is
clear if there are any future references needed.\n- Replace filler text
with relevant terms and conditions or any special instructions
pertaining to the transaction.\n\nThis invoice appears standard,
representing a small business transaction with clearly itemized products
and applicable
taxes.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_39f3b39e-4684-4444-8e4d-e7395f88c9dc","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc","variables":{"invoice_doc":{"type":"input_file","file_data":null,"file_id":"file-7fbb1043a4bb468cab60ffe4b8631d8e","file_url":null,"filename":"invoicesample.pdf"}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":529,"output_tokens":513,"total_tokens":1042,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}%`
**Test simple text Prompt in Responses API:**
1. Create prompt:
```
curl -X POST http://localhost:8321/v1/prompts \
-H "Content-Type: application/json" \
-d '{
"prompt": "Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.",
"variables": ["name", "company", "role", "tone"]
}'
```
`{"prompt":"Hello {{name}}! You are working at {{company}}. Your role is
{{role}} at {{company}}. Remember, {{name}}, to be
{{tone}}.","version":1,"prompt_id":"pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef","variables":["name","company","role","tone"],"is_default":false}%`
2. Create response:
```
curl -X POST http://localhost:8321/v1/responses \
-H "Accept: application/json, text/event-stream" \
-H "Content-Type: application/json" \
-d '{
"input": "What is the capital of Ireland?",
"model": "openai/gpt-4o",
"store": true,
"prompt": {
"id": "pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef",
"version": "1",
"variables": {
"name": {
"type": "input_text",
"text": "Alice"
},
"company": {
"type": "input_text",
"text": "Dummy Company"
},
"role": {
"type": "input_text",
"text": "Geography expert"
},
"tone": {
"type": "input_text",
"text": "professional and helpful"
}
}
}
}'
```
`{"created_at":1761751097,"error":null,"id":"resp_1b037b95-d9ae-4ad0-8e76-d953897ecaef","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"The
capital of Ireland is
Dublin.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_8e7c72b6-2aa2-4da6-8e57-da4e12fa3ce2","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef","variables":{"name":{"text":"Alice","type":"input_text"},"company":{"text":"Dummy
Company","type":"input_text"},"role":{"text":"Geography
expert","type":"input_text"},"tone":{"text":"professional and
helpful","type":"input_text"}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":47,"output_tokens":7,"total_tokens":54,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}%`
# Problem
OpenAI gpt-4 returned an error when built-in and MCP calls were skipped
due to the max_tool_calls parameter. The following is from the server log:
```
RuntimeError: OpenAI response failed: Error code: 400 - {'error': {'message': "An assistant message with
'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids
did not have response messages: call_Yi9V1QNpN73dJCAgP2Arcjej", 'type': 'invalid_request_error', 'param':
'messages', 'code': None}}
```
# What does this PR do?
- Fixes error returned by openai/gpt when calls were skipped due to
max_tool_calls. We now return a tool message that explicitly mentions
that the call is skipped.
- Adds integration tests as a follow-up to
PR#[4062](https://github.com/llamastack/llama-stack/pull/4062)
Part 2 for issue
#[3563](https://github.com/llamastack/llama-stack/issues/3563)
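The shape of the fix, roughly (a sketch, not the exact code): when a call is skipped because max_tool_calls was reached, append a tool message for its tool_call_id so the next OpenAI request stays well-formed.
```python
def append_skipped_tool_results(messages: list[dict], skipped_tool_calls: list[dict]) -> list[dict]:
    # Each assistant tool_call must be answered by a tool message with the same
    # tool_call_id, even when we deliberately skip executing the call.
    for call in skipped_tool_calls:
        messages.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": "Tool call was skipped because max_tool_calls was reached.",
            }
        )
    return messages
```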
## Test Plan
- Added integration tests
- Added new recordings
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# Fix for Issue #3797
## Problem
Vector store search failed with Pydantic ValidationError when chunk
metadata contained list-type values.
**Error:**
```
ValidationError: 3 validation errors for VectorStoreSearchResponse
attributes.tags.str: Input should be a valid string
attributes.tags.float: Input should be a valid number
attributes.tags.bool: Input should be a valid boolean
```
**Root Cause:**
- `Chunk.metadata` accepts `dict[str, Any]` (any type allowed)
- `VectorStoreSearchResponse.attributes` requires `dict[str, str | float
| bool]` (primitives only)
- Direct assignment at line 641 caused validation failure for
non-primitive types
## Solution
Added utility function to filter metadata to primitive types before
creating search response.
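A sketch of the utility (behavior inferred from the description above; handling of non-list complex types is an assumption):
```python
def sanitize_metadata_for_attributes(metadata: dict) -> dict[str, str | float | bool]:
    """Filter chunk metadata down to the primitive types the search response accepts."""
    attributes: dict[str, str | float | bool] = {}
    for key, value in metadata.items():
        if isinstance(value, (bool, int, float, str)):
            attributes[key] = value
        elif isinstance(value, list):
            # Lists become comma-separated strings so they remain searchable.
            attributes[key] = ", ".join(str(item) for item in value)
        # Other types (dicts, None, ...) are dropped here; exact handling is an assumption.
    return attributes
```
For example, `{"tags": ["transformers", "gpu"]}` becomes `{"tags": "transformers, gpu"}`.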
## Impact
**Fixed:**
- Vector search works with list metadata (e.g., `tags: ["transformers",
"gpu"]`)
- Lists become searchable as comma-separated strings
- No ValidationError on search responses
**Preserved:**
- Full metadata still available in `VectorStoreContent.metadata`
- No API schema changes
- Backward compatible with existing primitive metadata
**Affected:**
All vector store providers using `OpenAIVectorStoreMixin`: FAISS,
Chroma, Qdrant, Milvus, Weaviate, PGVector, SQLite-vec
## Testing
tests/unit/providers/vector_io/test_vector_utils.py::test_sanitize_metadata_for_attributes
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
I believe that should avoid CI issues seen in
https://github.com/llamastack/llama-stack/pull/4173.
Error we see in Stainless logs:
```
(cannot lock ref 'refs/heads/preview/base/fix/issue-3797-metadata-validation': 'refs/heads/preview/base/fix' exists; cannot create 'refs/heads/preview/base/fix/issue-3797-metadata-validation')
```
The issue is that if a branch `fix` exists, `fix/<whatever>` cannot be
created (that's how git refs work unfortunately...). The fix in this PR
is to ensure PRs from forks are using the author as a prefix.
In addition we will do changes to the Stainless API to return better
error messages here, it should have been a 4xx with a meaningful error,
not a 500.
And we will likely need to delete the `fix` branch.
## Test Plan
Integration tests can now validate the TypeScript SDK alongside Python
tests when running against server-mode stacks. Currently, this only adds
a _small_ number of tests. We should extend only if truly needed -- this
smoke check may be sufficient.
When `RUN_CLIENT_TS_TESTS=1` is set, the test script runs TypeScript
tests after Python tests pass. Tests are mapped via
`tests/integration/client-typescript/suites.json` which defines which
TypeScript test files correspond to each Python suite/setup combination.
The fact that we need exact "test_id"s (which are actually generated by
pytest) to be hardcoded inside the TypeScript tests (so we hit the
recorded paths) is a big smell and it might become grating, but maybe
the benefit is worth it if we keep this test suite _small_ and targeted.
## Test Plan
Run with TypeScript tests enabled:
```bash
OPENAI_API_KEY=dummy RUN_CLIENT_TS_TESTS=1 \
scripts/integration-tests.sh --stack-config server:ci-tests --suite responses --setup gpt
```
# What does this PR do?
Change Safety API from required to optional dependency, following the
established pattern used for other optional dependencies in Llama Stack.
The provider now starts successfully without Safety API configured.
Requests that explicitly include guardrails will receive a clear error
message when Safety API is unavailable.
This enables local development and testing without Safety API while
maintaining clear error messages when guardrail features are requested.
Closes #4165
Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
## Test Plan
1. New unit tests added in
`tests/unit/providers/agents/meta_reference/test_safety_optional.py`
2. Integration tests performed with the files in
https://gist.github.com/anik120/c33cef497ec7085e1fe2164e0705b8d6
(i) test with `test_integration_no_safety_fail.yaml`:
Config WITHOUT Safety API, should fail with helpful error since
`require_safety_api` is `true` by default
```
$ uv run llama stack run test_integration_no_safety_fail.yaml 2>&1 | grep -B 5 -A 15 "ValueError.*Safety\|Safety API is
required"
File "/Users/anbhatta/go/src/github.com/llamastack/llama-stack/src/llama_stack/providers/inline/agents/meta_reference
/__init__.py", line 27, in get_provider_impl
raise ValueError(
...<9 lines>...
)
ValueError: Safety API is required but not configured.
To run without safety checks, explicitly set in your configuration:
providers:
agents:
- provider_id: meta-reference
provider_type: inline::meta-reference
config:
require_safety_api: false
Warning: This disables all safety guardrails for this agents provider.
```
(ii) test with `test_integration_no_safety_works.yaml`
Config WITHOUT Safety API, **but** `require_safety_api=false` is
explicitly set, should succeed
```
$ uv run llama stack run test_integration_no_safety_works.yaml
INFO 2025-11-16 09:49:10,044 llama_stack.cli.stack.run:169 cli: Using run configuration:
/Users/anbhatta/go/src/github.com/llamastack/llama-stack/test_integration_no_safety_works.yaml
INFO 2025-11-16 09:49:10,052 llama_stack.cli.stack.run:228 cli: HTTPS enabled with certificates:
Key: None
Cert: None
.
.
.
INFO 2025-11-16 09:49:38,528 llama_stack.core.stack:495 core: starting registry refresh task
INFO 2025-11-16 09:49:38,534 uvicorn.error:62 uncategorized: Application startup complete.
INFO 2025-11-16 09:49:38,535 uvicorn.error:216 uncategorized: Uvicorn running on http://0.0.0.0:8321 (Press CTRL+C
```
Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
# What does this PR do?
Completes #3732 by removing runtime URL transformations and requiring
users to provide full URLs in configuration. All providers now use
'base_url' consistently and respect the exact URL provided without
appending paths like /v1 or /openai/v1 at runtime.
BREAKING CHANGE: Users must update configs to include full URL paths
(e.g., http://localhost:11434/v1 instead of http://localhost:11434).
Closes #3732
## Test Plan
Existing tests should pass even with the URL changes, since the default
URLs have been updated accordingly.
Add unit test to enforce URL standardization across remote inference
providers (verifies all use 'base_url' field with HttpUrl | None type)
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Since `StackRunConfig` requires certain parts of `StorageConfig`, it'd
probably make sense to template in some defaults that will "just work"
for most use cases.
Specifically, introduce `ServerStoresConfig` defaults for inference,
metadata, conversations and prompts. We already funnel in defaults for
these sections ad hoc throughout the codebase.
Additionally, set some `backends` defaults for the `StorageConfig`.
This will alleviate some weirdness around `--providers` for run/list-deps
and also some work I have to better align our list-deps/run datatypes.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
These primitives (used both by the Stack as well as provider
implementations) can be thought of fruitfully as internal-only APIs
which can themselves have multiple implementations. We use the new
`llama_stack_api.internal` namespace for this.
In addition, the change moves kv/sql store implementations, configs, and
dependency helpers under `core/storage`.
## Testing
`pytest tests/unit/utils/test_authorized_sqlstore.py`, other existing CI
# What does this PR do?
Initial PR against #4123
Adds `parallel_tool_calls` spec to Responses API and basic initial
implementation where no more than one function call is generated when
set to `False`.
## Test Plan
* Unit tests have been added to verify no more than one function call is
generated.
* A followup PR will verify passing through `parallel_tool_calls` to
providers.
* A followup PR will address verification and/or implementation of
incremental function calling across multiple conversational turns.
---------
Signed-off-by: Anastas Stoyanovsky <astoyano@redhat.com>
# What does this PR do?
Since llama_stack_api is meant to be _just_ the API definitions of LLS,
we should have a pre-commit check that prohibits anyone from accidentally
importing `from llama_stack` or adding `llama_stack` as a dependency to
`llama_stack_api`'s pyproject.
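A minimal sketch of what such a check could look like (illustrative only; the actual hook and regex may differ):
```python
#!/usr/bin/env python3
"""Illustrative pre-commit check: fail if llama_stack_api imports llama_stack."""
import re
import sys
from pathlib import Path

# Matches `import llama_stack` / `from llama_stack...` but not `llama_stack_api`.
FORBIDDEN = re.compile(r"^\s*(from|import)\s+llama_stack(\.|\s|$)", re.MULTILINE)


def main(filenames: list[str]) -> int:
    failed = 0
    for name in filenames:  # pre-commit passes the staged files as arguments
        if FORBIDDEN.search(Path(name).read_text(encoding="utf-8")):
            print(f"forbidden llama_stack import in {name}")
            failed = 1
    return failed


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```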
## Test Plan
pre-commit should pass.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Generate the Stainless client config directly from code so we can
validate the config before we ever write the YAML.
This change enforces allowed HTTP verbs/paths, detects duplicate routes
across resources, and ensures README example endpoints exist and match
the OpenAPI spec. The generator now fails fast when config entries
drift, keeping the published config (hopefully) more current with the
spec. I think more validation can be done but this is a good start.
# What does this PR do?
This PR improves type hint cleanup in auto-generated provider
documentation by adding regex logic.
**Issues Fixed:**
- Type hints with missing closing brackets (e.g., `list[str` instead of
`list[str]`)
- Types showing as `<class 'bool'>`, `<class 'str'>` instead of `bool`,
`str`
- The multi-line YAML frontmatter in index documentation files wasn't
ideal, so we now add the proper `|` character.
**Changes:**
1. Replaced string replacement (`.replace`) with regex-based type
cleaning to preserve the trailing bracket in case of `list` and `dict`.
2. Adds the `|` character for multi-line YAML descriptions.
3. I have regenerated the docs. However, let me know if that's not
needed.
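A minimal sketch of the kind of regex-based cleanup described above (simplified, not the actual codegen logic):
```python
import re


def clean_type_hint(raw: str) -> str:
    # Turn "<class 'bool'>" into "bool".
    cleaned = re.sub(r"<class '([^']+)'>", r"\1", raw)
    # Re-balance generics such as "list[str" -> "list[str]" when a closing
    # bracket was lost (simplified version of the idea).
    if cleaned.count("[") > cleaned.count("]"):
        cleaned += "]" * (cleaned.count("[") - cleaned.count("]"))
    return cleaned


assert clean_type_hint("<class 'bool'>") == "bool"
assert clean_type_hint("list[str") == "list[str]"
```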
## Test Plan
1. Ran uv run python scripts/provider_codegen.py - successfully
regenerated all docs
2. We can see that the updated docs handle correctly type hint cleanup
and multi-line yaml descriptions have now the `|` character.
### Note to the reviewer(s)
This is my first contribution to your lovely repo! Initially I was going
through the docs (wanted to use `remote::gemini` as a provider) and realized
the issue. I've read the
[CONTRIBUTING.md](https://github.com/llamastack/llama-stack/blob/main/CONTRIBUTING.md)
and decided to open the PR. Let me know if there's anything I did wrong
and I'll update my PR!
---------
Signed-off-by: thepetk <thepetk@gmail.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
- Remove backward compatibility for authorization in mcp_headers
- Enforce authorization must use dedicated parameter
- Add validation error if Authorization found in provider_data headers
- Update test_mcp.py to use authorization parameter
- Update test_mcp_json_schema.py to use authorization parameter
- Update test_tools_with_schemas.py to use authorization parameter
- Update documentation to show the change in the authorization approach
Breaking Change:
- Authorization can no longer be passed via mcp_headers in provider_data
- Users must use the dedicated 'authorization' parameter instead
- Clear error message guides users to the new approach
## Test Plan
CI
---------
Co-authored-by: Omar Abdelwahab <omara@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
FastAPI generator now only unwraps body params explicitly marked with
Body(embed=False) so the /eval run_eval schema once again exposes
RunEvalRequest, matching our integration tests and the server's request
parsing.
Regenerated the OpenAPI specs to capture the restored wrapper.
CI on the Stainless preview builds should be green.
# What does this PR do?
It was referencing strong_typing which was removed in
https://github.com/llamastack/llama-stack/pull/3944
## Test Plan
New CI build test.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This replaces the legacy "pyopenapi + strong_typing" pipeline with a
FastAPI-backed generator that has an explicit schema registry inside
`llama_stack_api`. The key changes:
1. **New generator architecture.** FastAPI now builds the OpenAPI schema
directly from the real routes, while helper modules
(`schema_collection`, `endpoints`, `schema_transforms`, etc.)
post-process the result. The old pyopenapi stack and its strong_typing
helpers are removed entirely, so we no longer rely on fragile AST
analysis or top-level import side effects.
2. **Schema registry in `llama_stack_api`.** `schema_utils.py` keeps a
`SchemaInfo` record for every `@json_schema_type`, `register_schema`,
and dynamically created request model. The OpenAPI generator and other
tooling query this registry instead of scanning the package tree,
producing deterministic names (e.g., `{MethodName}Request`), capturing
all optional/nullable fields, and making schema discovery testable. A
new unit test covers the registry behavior.
3. **Regenerated specs + CI alignment.** All docs/Stainless specs are
regenerated from the new pipeline, so optional/nullable fields now match
reality (expect the API Conformance workflow to report breaking
changes—this PR establishes the new baseline). The workflow itself is
back to the stock oasdiff invocation so future regressions surface
normally.
*Conformance will be RED on this PR; we choose to accept the
deviations.*
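A boiled-down sketch of the registry idea from point 2 (illustrative only; the real `SchemaInfo` carries more detail):
```python
# Minimal sketch of an explicit schema registry.
from dataclasses import dataclass

@dataclass
class SchemaInfo:
    name: str
    cls: type
    kind: str  # e.g. "json_schema_type", "registered", "request_model"

_SCHEMA_REGISTRY: dict[str, SchemaInfo] = {}

def json_schema_type(cls: type) -> type:
    """Decorator: record the class so the OpenAPI generator can find it later."""
    _SCHEMA_REGISTRY[cls.__name__] = SchemaInfo(cls.__name__, cls, "json_schema_type")
    return cls

def registered_schemas() -> list[SchemaInfo]:
    # Generators iterate this instead of scanning the package tree.
    return list(_SCHEMA_REGISTRY.values())
```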
## Test Plan
- `uv run pytest tests/unit/server/test_schema_registry.py`
- `uv run python -m scripts.openapi_generator.main docs/static`
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Restores the responses unit tests that were inadvertently deleted in PR
[#4055 ](https://github.com/llamastack/llama-stack/pull/4055)
## Test Plan
I ran the unit tests that I restored. They all passed with one
exception:
tests/unit/providers/agents/meta_reference/test_openai_responses.py::test_reuse_mcp_tool_list
AttributeError: module 'llama_stack.providers.utils.tools' has no
attribute 'mcp'
It's coming from this line:
@patch("llama_stack.providers.utils.tools.mcp.list_mcp_tools")
The `mcp.py` module (and `__init__.py`) exists under tools. There are
some `from mcp ...` imports (the `mcp` package in this case) within it that
Python may be interpreting as circular imports (or maybe I'm overlooking
something).
# What does this PR do?
For a runtime exception, the error is not propagated to the user and can be
opaque.
Before fix:
`ERROR - Error processing message: Error code: 500 - {'detail': 'Internal server error: An unexpected error occurred.'}`
After fix:
`[ERROR] Error code: 404 - {'detail': "Model 'claude-sonnet-4-5-20250929' not found. Use 'client.models.list()' to list available Models."}`
(Ran into this a few times while working with OCI + LLAMAStack and Sabre:
agentic framework integrations with LLAMAStack.)
## Test Plan
CI
# What does this PR do?
Adding a user-facing `authorization` parameter to MCP tool definitions
that allows users to explicitly configure credentials per MCP server,
addressing GitHub Issue #4034 in a secure manner.
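For illustration, a hedged sketch of how a caller might pass the credential per MCP server; beyond `authorization` itself, the tool fields and client surface shown here are assumptions:
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",  # placeholder model
    input="List the open issues in my tracker",
    tools=[
        {
            "type": "mcp",
            "server_label": "issue-tracker",
            "server_url": "https://mcp.example.com/sse",
            # Credential goes in the dedicated parameter, not provider_data headers.
            "authorization": "Bearer <token>",
        }
    ],
)
```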
## Test Plan
tests/integration/responses/test_mcp_authentication.py
---------
Co-authored-by: Omar Abdelwahab <omara@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Require at least 0.49.1 which fixes a security vulnerability in the
parsing logic of the Range header in FileResponse. Release note:
https://github.com/Kludex/starlette/releases/tag/0.49.1
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
The directory structure was src/llama-stack-api/llama_stack_api;
instead it should just be src/llama_stack_api to match the other
packages.
Update the structure and the pyproject/linting config.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Without this we get below in server logs
```
RuntimeError: OpenAI response failed: InferenceRouter._construct_metrics() got an unexpected keyword argument
'model_id'
```
Seems the method signature got updated but this callsite was not.
## Test Plan
CI and test with Sabre (Agent framework integration)
# What does this PR do?
Error out when creating vector store with unknown embedding model
Closes https://github.com/llamastack/llama-stack/issues/4047
## Test Plan
Added tests
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Extract API definitions and provider specifications into a standalone
llama-stack-api package that can be published to PyPI independently of
the main llama-stack server.
see: https://github.com/llamastack/llama-stack/pull/2978 and
https://github.com/llamastack/llama-stack/pull/2978#issuecomment-3145115942
Motivation
External providers currently import from llama-stack, which overrides
the installed version and causes dependency conflicts. This separation
allows external providers to:
- Install only the type definitions they need without server
dependencies
- Avoid version conflicts with the installed llama-stack package
- Be versioned and released independently
This enables us to re-enable external provider module tests that were
previously blocked by these import conflicts.
Changes
- Created llama-stack-api package with minimal dependencies (pydantic,
jsonschema)
- Moved APIs, providers datatypes, strong_typing, and schema_utils
- Updated all imports from llama_stack.* to llama_stack_api.*
- Configured local editable install for development workflow
- Updated linting and type-checking configuration for both packages
Next Steps
- Publish llama-stack-api to PyPI
- Update external provider dependencies
- Re-enable external provider module tests
Pre-cursor PRs to this one:
- #4093
- #3954
- #4064
These PRs moved key pieces _out_ of the Api pkg, limiting the scope of
change here.
relates to #3237
## Test Plan
Package builds successfully and can be imported independently. All
pre-commit hooks pass with expected exclusions maintained.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
- force a minimum pre-commit version
- pin to >= 4.3.0 when installing
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Building/Deploying docs is failing here:
5530320962 (step):8:49
Needs the playground file. Updated it to reflect current admin status.
## Test Plan
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Fixed a bug where models with no provider_model_id were incorrectly
filtered from the startup config display. The function was checking
multiple fields when it should only filter items with an explicitly
disabled provider_id.
Changes:
- Modified `remove_disabled_providers` to only check the `provider_id` field
- Changed the condition from checking multiple fields for `None` to only checking `provider_id` for `"__disabled__"`, `None`, or an empty string
- Added comprehensive unit tests
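A minimal sketch of the corrected filter (simplified; the real helper walks the run-config structure rather than a flat list):
```python
def remove_disabled_providers(items: list[dict]) -> list[dict]:
    # Only provider_id decides whether an entry is dropped.
    return [
        item
        for item in items
        if item.get("provider_id") not in ("__disabled__", None, "")
    ]
```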
Closes: #4131
Signed-off-by: Derek Higgins <derekh@redhat.com>
We would like to run all OpenAI compatibility tests using only the
openai-client library. This is most friendly for contributors since they
can run tests without needing to update the client SDKs (which is
getting easier but is still a long pole).
This is the first step in enabling that -- not using the "library client"
for any of the Responses tests. This seems like a reasonable trade-off since
the usage of an embeddable library client for Responses (or any
OpenAI-compatible) behavior seems to be uncommon. To do this, we
needed to enable MCP tests (which only worked in library client mode)
for server mode.
docs: Add comprehensive Files API and Vector Store integration
documentation
- Add Files API documentation with OpenAI-compatible endpoints
- Create comprehensive guide for OpenAI-compatible file operations
- Reorganize documentation structure: move file operations to files/
directory
- Add vector store provider documentation for Milvus, SQLite-vec, FAISS
- Clean up redundant files and improve navigation
- Update cross-references and eliminate documentation duplication
- Support for release 0.2.14 FileResponse and Vector Store API features
A few changes to the storage layer to ensure we reduce unnecessary
contention arising out of our design choices (and letting the database
layer do its correct thing):
- SQL stores now share a single `SqlAlchemySqlStoreImpl` per backend,
and `kvstore_impl` caches instances per `(backend, namespace)`. This
avoids spawning multiple SQLite connections for the same file, reducing
lock contention and aligning the cache story for all backends.
- Added an async upsert API (with SQLite/Postgres dialect inserts) and
routed it through `AuthorizedSqlStore`, then switched conversations and
responses to call it. Using native `ON CONFLICT DO UPDATE` eliminates
the insert-then-update retry window that previously caused long WAL lock
retries.
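A hedged sketch of the dialect-specific upsert (shown with SQLAlchemy's SQLite dialect; table and column names are placeholders, not the actual store schema):
```python
from sqlalchemy import Column, MetaData, String, Table, create_engine
from sqlalchemy.dialects.sqlite import insert as sqlite_insert

metadata = MetaData()
responses = Table(
    "responses",
    metadata,
    Column("id", String, primary_key=True),
    Column("payload", String),
)

engine = create_engine("sqlite:///example.db")
metadata.create_all(engine)

def upsert_response(response_id: str, payload: str) -> None:
    stmt = sqlite_insert(responses).values(id=response_id, payload=payload)
    # Native upsert: no insert-then-update retry window under WAL locks.
    stmt = stmt.on_conflict_do_update(
        index_elements=[responses.c.id],
        set_={"payload": stmt.excluded.payload},
    )
    with engine.begin() as conn:
        conn.execute(stmt)
```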
### Test Plan
Existing tests, added a unit test for `upsert()`
Fixes issues in the storage system by guaranteeing immediate durability
for responses and ensuring background writers stay alive. Three related
fixes:
* Responses to the OpenAI-compatible API now write directly to
Postgres/SQLite inside the request instead of detouring through an async
queue that might never drain; this restores the expected
read-after-write behavior and removes the "response not found" races
reported by users.
* The access-control shim was stamping owner_principal/access_attributes
as SQL NULL, which Postgres interprets as non-public rows; fixing it to
use the empty-string/JSON-null pattern means conversations and responses
stored without an authenticated user stay queryable (matching SQLite).
* The inference-store queue remains for batching, but its worker tasks
now start lazily on the live event loop so server startup doesn't cancel
them—writes keep flowing even when the stack is launched via llama stack
run.
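A small sketch of the lazy-start pattern for a queue-backed writer (simplified; not the actual inference-store code):
```python
import asyncio

class WriteQueue:
    """Start the background writer on the live event loop, on first use."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._worker: asyncio.Task | None = None

    async def put(self, item) -> None:
        # Creating the task here (rather than at construction/startup time)
        # ties it to the loop that is actually serving requests, so it is not
        # cancelled when the startup phase finishes.
        if self._worker is None or self._worker.done():
            self._worker = asyncio.create_task(self._drain())
        await self._queue.put(item)

    async def _drain(self) -> None:
        while True:
            item = await self._queue.get()
            # ... write `item` to the store ...
            self._queue.task_done()
```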
Closes #4115
### Test Plan
Added a matrix entry to test our "base" suite against Postgres as the
store.
Updated documentation to accurately reflect current behavior where
models are identified as provider_id/provider_model_id in the system.
Changes:
- Clarify that model_id is for configuration purposes only
- Explain that models are accessed as provider_id/provider_model_id
- Remove the outdated aliasing example that suggested model_id could be used as a custom identifier
This corrects the documentation which previously suggested model_id
could be used to create friendly aliases, which is not how the code
actually works.
Signed-off-by: Derek Higgins <derekh@redhat.com>
Help users find the comprehensive integration testing docs by linking to
the record-replay documentation. This clarifies that the technical
README complements the main docs.
# What does this PR do?
- Updates `/vector_stores/{vector_store_id}/files/{file_id}/content` to
allow returning `embeddings` and `metadata` using the `extra_query`
- Updates the UI accordingly to display them.
- Update UI to support CRUD operations in the Vector Stores section and
adds a new modal exposing the functionality.
- Updates Vector Store update to fail if a user tries to update Provider
ID (which doesn't make sense to allow)
```python
In [1]: client.vector_stores.files.content(
vector_store_id=vector_store.id,
file_id=file.id,
extra_query={"include_embeddings": True, "include_metadata": True}
)
Out [1]: FileContentResponse(attributes={}, content=[Content(text='This is a test document to check if embeddings are generated properly.\n', type='text', embedding=[0.33760684728622437, ...,], chunk_metadata={'chunk_id': '62a63ae0-c202-f060-1b86-0a688995b8d3', 'document_id': 'file-27291dbc679642ac94ffac6d2810c339', 'source': None, 'created_timestamp': 1762053437, 'updated_timestamp': 1762053437, 'chunk_window': '0-13', 'chunk_tokenizer': 'DEFAULT_TIKTOKEN_TOKENIZER', 'chunk_embedding_model': 'sentence-transformers/nomic
-ai/nomic-embed-text-v1.5', 'chunk_embedding_dimension': 768, 'content_token_count': 13, 'metadata_token_count': 9}, metadata={'filename': 'test-embedding.txt', 'chunk_id': '62a63ae0-c202-f060-1b86-0a688995b8d3', 'document_id': 'file-27291dbc679642ac94ffac6d2810c339', 'token_count': 13, 'metadata_token_count': 9})], file_id='file-27291dbc679642ac94ffac6d2810c339', filename='test-embedding.txt')
```
Screenshots of UI are displayed below:
### List Vector Store with Added "Create New Vector Store"
<img width="1912" height="491" alt="Screenshot 2025-11-06 at 10 47
25 PM"
src="https://github.com/user-attachments/assets/a3a3ddd9-758d-4005-ac9c-5047f03916f3"
/>
### Create New Vector Store
<img width="1918" height="1048" alt="Screenshot 2025-11-06 at 10 47
49 PM"
src="https://github.com/user-attachments/assets/b4dc0d31-696f-4e68-b109-27915090f158"
/>
### Edit Vector Store
<img width="1916" height="1355" alt="Screenshot 2025-11-06 at 10 48
32 PM"
src="https://github.com/user-attachments/assets/ec879c63-4cf7-489f-bb1e-57ccc7931414"
/>
### Vector Store Files Contents page (with Embeddings)
<img width="1914" height="849" alt="Screenshot 2025-11-06 at 11 54
32 PM"
src="https://github.com/user-attachments/assets/3095520d-0e90-41f7-83bd-652f6c3fbf27"
/>
### Vector Store Files Contents Details page (with Embeddings)
<img width="1916" height="1221" alt="Screenshot 2025-11-06 at 11 55
00 PM"
src="https://github.com/user-attachments/assets/e71dbdc5-5b49-472b-a43a-5785f58d196c"
/>
## Test Plan
Tests added for Middleware extension and Provider failures.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Add explicit connection cleanup and shorter timeouts to OpenAI client
fixtures. Fixes CI deadlock after 25+ tests due to connection pool
exhaustion. Also adds 60s timeout to test_conversation_context_loading
as safety net.
## Test Plan
tests pass
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
This PR adds Stainless config to specify the Meta copyright file header
for generated files.
Doing it via config instead of custom code will reduce the probability
of git conflict.
## Test Plan
- review preview builds
Update pypdf dependency to address vulnerabilities causing potential
denial of service through infinite loops or excessive memory usage when
handling malicious PDFs. The update remains fully backward compatible,
with no changes to the PdfReader API.
# What does this PR do?
Fixes #4120
## Test Plan
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
In the **Detailed Tutorial**, at **Step 3**, the **Install with venv**
option creates a new virtual environment `client`, activates it then
attempts to install the llama-stack-client using pip.
```
uv venv client --python 3.12
source client/bin/activate
pip install llama-stack-client <- this is the problematic line
```
However, the pip command will likely fail because the `uv venv` command
does not, by default, add the pip command to the virtual
environment that is created. The pip command will error either because
pip doesn't exist at all, or, if the pip command does exist outside of
the virtual environment, return a different error message. In the latter
case, it may be unclear to the user why the command is failing.
This PR changes 'pip' to 'uv pip', allowing the install action to
function in the virtual environment as intended, and without the need
for pip to be installed.
## Test Plan
1. Use linux or WSL (virtual environments on Windows use `Scripts`
folder instead of `bin` [virtualenv
#993ba13](993ba1316a)
which doesn't align with the tutorial)
2. Clone the `llama-stack` repo
3. Run the following and verify success:
```
uv venv client --python 3.12
source client/bin/activate
```
4. Run the updated command:
```
uv pip install llama-stack-client
```
5. Observe the console output confirms that the virtual environment
`client` was used:
> Using Python 3.12.3 environment at: **client**
# What does this PR do?
The inspect API lacked any mechanism to get all
non-deprecated APIs (v1, v1alpha, v1beta).
Change the default to this behavior; the
'v1' filter can be used for users wanting a list
of stable APIs.
## Test Plan
1. pull the PR
2. launch a LLS server
3. run `curl http://beanlab3.bss.redhat.com:8321/v1/inspect/routes`
4. note there are APIs for `v1`, `v1alpha`, and `v1beta` but no
deprecated APIs
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Delete ~2,000 lines of dead code from the old bespoke inference API that
was replaced by the OpenAI-only API. This includes removing unused type
conversion functions, dead provider methods, and event_logger.py.
Clean up imports across the codebase to remove references to deleted
types. This eliminates unnecessary
code and dependencies, helping isolate the API package as a
self-contained module.
This is the last interdependency between the .api package and "exterior"
packages, meaning that now every other package in llama stack imports
the API, not the other way around.
## Test Plan
this is a structural change, no tests needed.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# Problem
Responses API uses max_tool_calls parameter to limit the number of tool
calls that can be generated in a response. Currently, LLS implementation
of the Responses API does not support this parameter.
# What does this PR do?
This pull request adds the max_tool_calls field to the response object
definition and updates the inline provider. It also ensures that:
- the total number of calls to built-in and mcp tools do not exceed
max_tool_calls
- an error is thrown if max_tool_calls < 1 (behavior seen with the
OpenAI Responses API, but we can change this if needed)
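For illustration, a hedged usage sketch for the new field (model name, tool, and the exact SDK surface are assumptions):
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",       # placeholder model
    input="What's the weather in Boston and in Tokyo?",
    tools=[{"type": "web_search"}],                  # placeholder tool
    max_tool_calls=2,  # values < 1 are rejected, mirroring OpenAI's behavior
)
print(response.output)
```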
Closes #[3563](https://github.com/llamastack/llama-stack/issues/3563)
## Test Plan
- Tested manually for change in model response w.r.t supplied
max_tool_calls field.
- Added integration tests to test invalid max_tool_calls parameter.
- Added integration tests to check max_tool_calls parameter with
built-in and function tools.
- Added integration tests to check max_tool_calls parameter in the
returned response object.
- Recorded OpenAI Responses API behavior using a sample script:
https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/max_tool_calls.py
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Adds OCI GenAI PaaS models for openai chat completion endpoints.
## Test Plan
In an OCI tenancy with access to GenAI PaaS, perform the following
steps:
1. Ensure you have IAM policies in place to use the service (check the docs
included in this PR)
2. For local development, [setup OCI
cli](https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliinstall.htm)
and configure the CLI with your region, tenancy, and auth
[here](https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/cliconfigure.htm)
3. Once configured, go through llama-stack setup and run llama-stack
(uses config based auth) like:
```bash
OCI_AUTH_TYPE=config_file \
OCI_CLI_PROFILE=CHICAGO \
OCI_REGION=us-chicago-1 \
OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..aaaaaaaa5...5a \
llama stack run oci
```
4. Hit the `models` endpoint to list models after server is running:
```bash
curl http://localhost:8321/v1/models | jq
...
{
"identifier": "meta.llama-4-scout-17b-16e-instruct",
"provider_resource_id": "ocid1.generativeaimodel.oc1.us-chicago-1.am...q",
"provider_id": "oci",
"type": "model",
"metadata": {
"display_name": "meta.llama-4-scout-17b-16e-instruct",
"capabilities": [
"CHAT"
],
"oci_model_id": "ocid1.generativeaimodel.oc1.us-chicago-1.a...q"
},
"model_type": "llm"
},
...
```
5. Use the "display_name" field to use the model in a
`/chat/completions` request:
```bash
# Streaming result
curl -X POST http://localhost:8321/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "meta.llama-4-scout-17b-16e-instruct",
"stream": true,
"temperature": 0.9,
"messages": [
{
"role": "system",
"content": "You are a funny comedian. You can be crass."
},
{
"role": "user",
"content": "Tell me a funny joke about programming."
}
]
}'
# Non-streaming result
curl -X POST http://localhost:8321/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "meta.llama-4-scout-17b-16e-instruct",
"stream": false,
"temperature": 0.9,
"messages": [
{
"role": "system",
"content": "You are a funny comedian. You can be crass."
},
{
"role": "user",
"content": "Tell me a funny joke about programming."
}
]
}'
```
6. Try out other models from the `/models` endpoint.
Mark all register_* / unregister_* APIs as deprecated across models,
shields, tool groups, datasets, benchmarks, and scoring functions. This
is the first step toward moving resource mutations to an `/admin`
namespace as outlined in
https://github.com/llamastack/llama-stack/issues/3809#issuecomment-3492931585.
The deprecation flag will be reflected in the OpenAPI schema to warn API
users that these endpoints are being phased out. Next step will be
implementing the `/admin` route namespace for these resource management
operations.
- `register_model` / `unregister_model`
- `register_shield` / `unregister_shield`
- `register_tool_group` / `unregister_toolgroup`
- `register_dataset` / `unregister_dataset`
- `register_benchmark` / `unregister_benchmark`
- `register_scoring_function` / `unregister_scoring_function`
# What does this PR do?
Add documentation for llama-stack-k8s-operator under kubernetes
deployment guide.
Signed-off-by: Vaishnavi Hire <vhire@redhat.com>
# What does this PR do?
This PR fixes a bug in LlamaStack 0.3.0 where vector stores created via
the OpenAI-compatible API (`POST /v1/vector_stores`) would fail with
`VectorStoreNotFoundError` after server restart when attempting
operations like `vector_io.insert()` or `vector_io.query()`.
The bug affected **6 vector IO providers**: `pgvector`, `sqlite_vec`,
`chroma`, `milvus`, `qdrant`, and `weaviate`.
Created with the assistance of: claude-4.5-sonnet
## Root Cause
All affected providers had a broken
`_get_and_cache_vector_store_index()` method that:
1. Did not load existing vector stores from persistent storage during
initialization
2. Attempted to use `vector_store_table` (which was either `None` or a
`KVStore` without the required `get_vector_store()` method)
3. Could not reload vector stores after server restart or cache miss
## Solution
This PR implements a consistent pattern across all 6 providers:
1. **Load vector stores during initialization** - Pre-populate the cache
from KV store on startup
2. **Fix lazy loading** - Modified `_get_and_cache_vector_store_index()`
to load directly from KV store instead of relying on
`vector_store_table`
3. **Remove broken dependency** - Eliminated reliance on the
`vector_store_table` pattern
## Testing steps
### 1.1 Configure the stack
Create or use an existing configuration with a vector IO provider.
**Example `run.yaml`:**
```yaml
vector_io_store:
- provider_id: pgvector
provider_type: remote::pgvector
config:
host: localhost
port: 5432
db: llamastack
user: llamastack
password: llamastack
inference:
- provider_id: sentence-transformers
provider_type: inline::sentence-transformers
config:
model: sentence-transformers/all-MiniLM-L6-v2
```
### 1.2 Start the server
```bash
llama stack run run.yaml --port 5000
```
Wait for the server to fully start. You should see:
```
INFO: Started server process
INFO: Application startup complete
```
---
## Step 2: Create a Vector Store
### 2.1 Create via API
```bash
curl -X POST http://localhost:5000/v1/vector_stores \
-H "Content-Type: application/json" \
-d '{
"name": "test-persistence-store",
"extra_body": {
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"embedding_dimension": 384,
"provider_id": "pgvector"
}
}' | jq
```
### 2.2 Expected Response
```json
{
"id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"object": "vector_store",
"name": "test-persistence-store",
"status": "completed",
"created_at": 1730304000,
"file_counts": {
"total": 0,
"completed": 0,
"in_progress": 0,
"failed": 0,
"cancelled": 0
},
"usage_bytes": 0
}
```
**Save the `id` field** (e.g.,
`vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d`) — you’ll need it for the next
steps.
---
## Step 3: Insert Data (Before Restart)
### 3.1 Insert chunks into the vector store
```bash
export VS_ID="vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d"
curl -X POST http://localhost:5000/vector-io/insert \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"chunks\": [
{
\"content\": \"Python is a high-level programming language known for its readability.\",
\"metadata\": {\"source\": \"doc1\", \"page\": 1}
},
{
\"content\": \"Machine learning enables computers to learn from data without explicit programming.\",
\"metadata\": {\"source\": \"doc2\", \"page\": 1}
},
{
\"content\": \"Neural networks are inspired by biological neurons in the brain.\",
\"metadata\": {\"source\": \"doc3\", \"page\": 1}
}
]
}"
```
### 3.2 Expected Response
Status: **200 OK**
Response: *Empty or success confirmation*
---
## Step 4: Query Data (Before Restart – Baseline)
### 4.1 Query the vector store
```bash
curl -X POST http://localhost:5000/vector-io/query \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"query\": \"What is machine learning?\"
}" | jq
```
### 4.2 Expected Response
```json
{
"chunks": [
{
"content": "Machine learning enables computers to learn from data without explicit programming.",
"metadata": {"source": "doc2", "page": 1}
},
{
"content": "Neural networks are inspired by biological neurons in the brain.",
"metadata": {"source": "doc3", "page": 1}
}
],
"scores": [0.85, 0.72]
}
```
**Checkpoint:** Works correctly before restart.
---
## Step 5: Restart the Server (Critical Test)
### 5.1 Stop the server
In the terminal where it’s running:
```
Ctrl + C
```
Wait for:
```
Shutting down...
```
### 5.2 Restart the server
```bash
llama stack run run.yaml --port 5000
```
Wait for:
```
INFO: Started server process
INFO: Application startup complete
```
The vector store cache is now empty, but data should persist.
---
## Step 6: Verify Vector Store Exists (After Restart)
### 6.1 List vector stores
```bash
curl http://localhost:5000/v1/vector_stores | jq
```
### 6.2 Expected Response
```json
{
"object": "list",
"data": [
{
"id": "vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
"name": "test-persistence-store",
"status": "completed"
}
]
}
```
**Checkpoint:** Vector store should be listed.
---
## Step 7: Insert Data (After Restart – THE BUG TEST)
### 7.1 Insert new chunks
```bash
curl -X POST http://localhost:5000/vector-io/insert \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"chunks\": [
{
\"content\": \"This chunk was inserted AFTER the server restart.\",
\"metadata\": {\"source\": \"post-restart\", \"test\": true}
}
]
}"
```
### 7.2 Expected Results
**With Fix (Correct):**
```
Status: 200 OK
Response: Success
```
**Without Fix (Bug):**
```json
{
"detail": "VectorStoreNotFoundError: Vector Store 'vs_a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d' not found."
}
```
**Critical Test:** If insertion succeeds, the fix works.
---
## Step 8: Query Data (After Restart – Verification)
### 8.1 Query all data
```bash
curl -X POST http://localhost:5000/vector-io/query \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"query\": \"restart\"
}" | jq
```
### 8.2 Expected Response
```json
{
"chunks": [
{
"content": "This chunk was inserted AFTER the server restart.",
"metadata": {"source": "post-restart", "test": true}
}
],
"scores": [0.95]
}
```
**Checkpoint:** Both old and new data are queryable.
---
## Step 9: Multiple Restart Test (Extra Verification)
### 9.1 Restart again
```bash
Ctrl + C
llama stack run run.yaml --port 5000
```
### 9.2 Query after restart
```bash
curl -X POST http://localhost:5000/vector-io/query \
-H "Content-Type: application/json" \
-d "{
\"vector_store_id\": \"$VS_ID\",
\"query\": \"programming\"
}" | jq
```
**Expected:** Works correctly across multiple restarts.
---------
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
This pull request adds a new workflow that does 2 things:
1. generate [SDK preview
builds](https://www.stainless.com/docs/guides/automate-updates#set-up-automatic-preview-builds)
whenever the OpenAPI spec file is modified in a PR
2. on PR merge, generate SDK builds that will be pushed to the different
SDK repos (i.e. start the release process)
> [!NOTE]
> No repo secret `STAINLESS_API_KEY` is needed, the authentication is
done automatically via GitHub OIDC.
## Test Plan
I tested in my fork: https://github.com/stainless-api/llama-stack/pull/3
# What does this PR do?
Resolves #4102
1. Added `web_search_2025_08_26` to the `WebSearchToolTypes` list and
the `OpenAIResponseInputToolWebSearch.type` Literal union
2. No changes needed to tool execution logic - all `web_search` types
map to the same underlying tool
3. Backward compatibility is maintained - existing `web_search`,
`web_search_preview`, and `web_search_preview_2025_03_11` types continue
to work
4. Added an integration test case using {"type":
"web_search_2025_08_26"} to verify it works correctly
5. Updated `docs/docs/providers/openai_responses_limitations.mdx` to
reflect that `web_search_2025_08_26` is now supported.
6. Removed incorrect references to `MOD1/MOD2/MOD3` (which don't exist
in the codebase)
## Test Plan
---------
Signed-off-by: Aakanksha Duggal <aduggal@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This dependency has been bothering folks for a long time (cc @leseb). We
really needed it due to "library client" which is primarily used for our
tests and is not a part of the Stack server. Anyone who needs to use the
library client can certainly install `llama-stack-client` in their
environment to make that work.
Updated the notebook references to install `llama-stack-client`
additionally when setting things up.
https://github.com/llamastack/llama-stack/pull/4055 cleaned the agents
implementation, but while doing so it removed some tests which actually
corresponded to the responses implementation. This PR brings those tests
and associated recordings back.
(We should likely combine all responses tests into one suite, but that
is beyond the scope of this PR.)
# What does this PR do?
Remove circular dependency by moving tracing from API protocol
definitions
to router implementation layer.
This gets us closer to having a self contained API package with no other
cross-cutting dependencies to other parts of the llama stack codebase.
To the best of our ability, the llama_stack.api should only be type and
protocol definitions.
Changes:
- Create apis/common/tracing.py with marker decorator (zero core
dependencies)
- Add the _new_ `@telemetry_traceable` marker decorator to 11 protocol
classes
- Apply actual tracing in core/resolver.py in `instantiate_provider`
based on protocol marker
- Move MetricResponseMixin from core to apis (it's an API response type)
- APIs package is now self-contained with zero core dependencies
The tracing functionality remains identical - actual trace_protocol from
core
is applied to router implementations at runtime when both telemetry is
enabled
and the protocol has the `__marked_for_tracing__` marker.
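A condensed sketch of the marker mechanism described above (simplified from the actual code):
```python
# apis/common/tracing.py -- marker only, zero core dependencies (sketch).
def telemetry_traceable(cls):
    cls.__marked_for_tracing__ = True
    return cls

# core/resolver.py -- apply real tracing at provider instantiation (sketch).
def maybe_trace(impl, protocol, telemetry_enabled: bool, trace_protocol):
    if telemetry_enabled and getattr(protocol, "__marked_for_tracing__", False):
        return trace_protocol(impl)
    return impl
```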
## Test Plan
Manual integration test confirms identical behavior to main branch:
```bash
llama stack list-deps --format uv starter | sh
export OLLAMA_URL=http://localhost:11434
llama stack run starter
curl -X POST http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "ollama/gpt-oss:20b",
"messages": [{"role": "user", "content": "Say hello"}],
"max_tokens": 10}'
```
Verified identical between main and this branch:
- trace_id present in response
- metrics array with prompt_tokens, completion_tokens, total_tokens
- Server logs show trace_protocol applied to all routers
Existing telemetry integration tests (tests/integration/telemetry/) validate
trace context propagation and span attributes.
relates to #3895
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
- Introduces vLLM provider support to the record/replay testing
framework
- Enables both recording and replay of vLLM API interactions alongside
existing Ollama support.
The changes enable testing of vLLM functionality. vLLM tests focus on
inference capabilities, while Ollama continues to exercise the full API
surface
including vision features.
--
This is an alternative to #3128, using qwen3 instead of llama 3.2 1B, which
appears to be more capable at structured output and tool calls.
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
We'd like to remove the dependence of `llama-stack` on
`llama-stack-client`. This is a necessary step.
A few small cleanups
- Enables `embeddings` now also
- Remove ModelRegistryHelper dependency (unused)
- Consolidate to auth_credential field via RemoteInferenceProviderConfig
- Implement list_models() to fetch from downstream /v1/models
## Test Plan
Tested using this script
https://gist.github.com/ashwinb/6356463d10f989c0682ab3bff8589581
Output:
```
Listing models from downstream server...
Available models: ['passthrough/ollama/nomic-embed-text:latest', 'passthrough/ollama/all-minilm:l6-v2', 'passthrough/ollama/llama3.2-vision:11b', 'passthrough/ollama/llama3.2-vision:latest', 'passthrough/ollama/llama-guard3:1b', 'passthrough/o
llama/llama3.2:1b', 'passthrough/ollama/all-minilm:latest', 'passthrough/ollama/llama3.2:3b', 'passthrough/ollama/llama3.2:3b-instruct-fp16', 'passthrough/bedrock/meta.llama3-1-8b-instruct-v1:0', 'passthrough/bedrock/meta.llama3-1-70b-instruct
-v1:0', 'passthrough/bedrock/meta.llama3-1-405b-instruct-v1:0', 'passthrough/sentence-transformers/nomic-ai/nomic-embed-text-v1.5']
Using LLM model: passthrough/ollama/llama3.2-vision:11b
Making inference request...
Response: 4.
--- Testing streaming ---
Streamed response: ChatCompletionChunk(id='chatcmpl-64', choices=[Choice(delta=ChoiceDelta(content='1', reasoning_content=None, refusal=None, role='assistant', tool_calls=None), finish_reason='', index=0, logprobs=None)], created=1762381674, m
odel='passthrough/ollama/llama3.2-vision:11b', object='chat.completion.chunk', usage=None)
...
5ChatCompletionChunk(id='chatcmpl-64', choices=[Choice(delta=ChoiceDelta(content='', reasoning_content=None, refusal=None, role='assistant', tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1762381674, model='passthrou
gh/ollama/llama3.2-vision:11b', object='chat.completion.chunk', usage=None)
```
# What does this PR do?
- when create vector store is called without a chunking strategy, we now
record the strategy actually used so that the value is persisted instead of
strategy='None'
## Test Plan
updated tests
## What does this PR do?
The starter distribution now comes with all the required packages to
support persistent stores—like the agent store, metadata, and
inference—using PostgreSQL. Users can enable PostgreSQL support by
setting the `ENABLE_POSTGRES_STORE=1` environment variable.
This PR consolidates the functionality from the removed `postgres-demo`
distribution into the starter distribution, reducing maintenance
overhead.
**Closes: #2619**
**Supersedes: #2851** (rebased and updated)
## Changes Made
1. **Added PostgreSQL support to starter distribution**
- New `run-with-postgres-store.yaml` configuration
- Automatic config switching via `ENABLE_POSTGRES_STORE` environment
variable
- Removed separate `postgres-demo` distribution
2. **Updated to new build system**
- Integrated postgres switching logic into Containerfile entrypoint
- Uses new `storage_backends` and `storage_stores` API
- Properly configured both PostgreSQL KV store and SQL store
3. **Updated dependencies**
- Added `psycopg2-binary` and `asyncpg` to starter distribution
- All postgres-related dependencies automatically included
## How to Use
### With Docker (PostgreSQL):
```bash
docker run \
-e ENABLE_POSTGRES_STORE=1 \
-e POSTGRES_HOST=your_postgres_host \
-e POSTGRES_PORT=5432 \
-e POSTGRES_DB=llamastack \
-e POSTGRES_USER=llamastack \
-e POSTGRES_PASSWORD=llamastack \
-e OPENAI_API_KEY=your_key \
llamastack/distribution-starter
```
### PostgreSQL environment variables:
- `POSTGRES_HOST`: Postgres host (default: `localhost`)
- `POSTGRES_PORT`: Postgres port (default: `5432`)
- `POSTGRES_DB`: Postgres database name (default: `llamastack`)
- `POSTGRES_USER`: Postgres username (default: `llamastack`)
- `POSTGRES_PASSWORD`: Postgres password (default: `llamastack`)
## Test Plan
- All pre-commit hooks pass (mypy, ruff, distro-codegen)
- `llama stack list-deps starter` confirms psycopg2-binary is included
- Storage configuration correctly uses PostgreSQL backends
- Container builds successfully with postgres support
## Credits
Original work by @leseb in #2851. Rebased and updated by @r-bit-rry to
work with latest main.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Sébastien Han @leseb
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
list-deps takes positional args OR flags like --providers.
The issue with this is that these args need to be optional since, by
nature, one or the other can be specified.
Add a check to list-deps that checks `if not args.providers and not
args.config`. If this is true, help is printed and we exit.
resolves #4075
## Test Plan
before:
```
╰─ llama stack list-deps
Traceback (most recent call last):
File "/Users/charliedoern/projects/Documents/llama-stack/venv/bin/llama", line 10, in <module>
sys.exit(main())
^^^^^^
File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 52, in main
parser.run(args)
File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/llama.py", line 43, in run
args.func(args)
File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/list_deps.py", line 51, in _run_stack_list_deps_command
return run_stack_list_deps_command(args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/charliedoern/projects/Documents/llama-stack/src/llama_stack/cli/stack/_list_deps.py", line 135, in run_stack_list_deps_command
normal_deps, special_deps, external_provider_dependencies = get_provider_dependencies(build_config)
^^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'build_config' where it is not associated with a value
```
after:
```
╰─ llama stack list-deps
usage: llama stack list-deps [-h] [--providers PROVIDERS] [--format {uv,deps-only}] [config | distro]
list the dependencies for a llama stack distribution
positional arguments:
config | distro Path to config file to use or name of known distro (llama stack list for a list). (default: None)
options:
-h, --help show this help message and exit
--providers PROVIDERS
sync dependencies for a list of providers and only those providers. This list is formatted like: api1=provider1,api2=provider2. Where there can be multiple
providers per API. (default: None)
--format {uv,deps-only}
Output format: 'uv' shows shell commands, 'deps-only' shows just the list of dependencies without `uv` (default) (default: deps-only)
```
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
It avoids a `model_limit` KeyError while trying to get embedding models for
Watsonx.
Closes https://github.com/llamastack/llama-stack/issues/4059
## Test Plan
Start server with watsonx distro:
```bash
llama stack list-deps watsonx | xargs -L1 uv pip install
uv run llama stack run watsonx
```
Run
```python
client = LlamaStackClient(base_url=base_url)
client.models.list()
```
Check if there is any embedding model available (currently there is not
a single one)
# What does this PR do?
1. Make telemetry tests as easy as possible for users by expanding the
`SpanStub` data class and creating the `MetricStub` dataclass as a way
to consistently marshal telemetry data in test fixtures and unmarshal
and handle it in tests.
2. Structure server and client tests to always follow the same standards
for consistent testing experience by using the `SpanStub` and
`MetricStub` data class objects.
3. Enable Metrics Testing for completions endpoint
4. Correct token metrics to use histograms instead of counts to capture
tokens per request rather than a cumulative count of tokens over the
lifecycle of the server.
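A small sketch of the histogram change in item 4, using the OpenTelemetry metrics API (instrument names are illustrative):
```python
from opentelemetry import metrics

meter = metrics.get_meter("llama_stack.inference")
prompt_tokens_hist = meter.create_histogram(
    "prompt_tokens", unit="token", description="Prompt tokens per request"
)

def record_request_tokens(prompt_tokens: int, model: str) -> None:
    # Each request contributes one observation, so per-request distributions
    # are recoverable -- unlike a counter that only grows over the server's lifetime.
    prompt_tokens_hist.record(prompt_tokens, attributes={"model": model})
```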
## Test Plan
These are tests
# What does this PR do?
Fixes issue #3922 where `llama stack list` only showed distributions
after they were run. This PR makes the command show all available
distributions immediately on a fresh install.
Closes #3922
## Changes
- **Updated `_get_distribution_dirs()`** to discover both built-in and
built distributions:
- Built-in distributions from `src/llama_stack/distributions/` (e.g.,
starter, nvidia, dell)
- Built distributions from `~/.llama/distributions`
- **Added a "Source" column** to distinguish between "built-in" and
"built" distributions
- **Built distributions override built-in ones** with the same name
(expected behavior)
- **Updated config file detection logic** to handle both naming
conventions:
- Built-in: `build.yaml` and `run.yaml`
- Built: `{name}-build.yaml` and `{name}-run.yaml`
## Test Plan
### Unit Tests
Added comprehensive unit tests in
`tests/unit/distribution/test_stack_list.py`:
```bash
uv run pytest tests/unit/distribution/test_stack_list.py -v
```
**Result**: ✅ All 8 tests pass
- `test_builtin_distros_shown_without_running` - Verifies the core fix
for issue #3922
- `test_builtin_and_built_distros_shown_together` - Ensures both types
are shown
- `test_built_distribution_overrides_builtin` - Tests override behavior
- `test_empty_distributions` - Edge case handling
- `test_config_files_detection_builtin` - Config file detection for
built-in distros
- `test_config_files_detection_built` - Config file detection for built
distros
- `test_llamastack_prefix_stripped` - Name normalization
- `test_hidden_directories_ignored` - Filters hidden directories
### Manual Testing
**Before the fix** (simulated with empty `~/.llama/distributions`):
```bash
$ llama stack list
No stacks found in ~/.llama/distributions
```
**After the fix**:
```bash
$ llama stack list
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Stack Name ┃ Source ┃ Path ┃ Build Config ┃ Run Config ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ ci-tests │ built-in │ /path/to/src/... │ Yes │ Yes │
│ dell │ built-in │ /path/to/src/... │ Yes │ Yes │
│ meta-reference-g… │ built-in │ /path/to/src/... │ Yes │ Yes │
│ nvidia │ built-in │ /path/to/src/... │ Yes │ Yes │
│ open-benchmark │ built-in │ /path/to/src/... │ Yes │ Yes │
│ postgres-demo │ built-in │ /path/to/src/... │ Yes │ Yes │
│ starter │ built-in │ /path/to/src/... │ Yes │ Yes │
│ starter-gpu │ built-in │ /path/to/src/... │ Yes │ Yes │
│ watsonx │ built-in │ /path/to/src/... │ Yes │ Yes │
└───────────────────┴──────────┴───────────────────┴──────────────┴────────────┘
```
**After running a distribution**:
```bash
$ llama stack run starter # Creates ~/.llama/distributions/starter
$ llama stack list
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Stack Name ┃ Source ┃ Path ┃ Build Config ┃ Run Config ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ ... │ built-in │ ... │ Yes │ Yes │
│ starter │ built │ ~/.llama/distri… │ No │ No │
│ ... │ built-in │ ... │ Yes │ Yes │
└───────────────────┴──────────┴───────────────────┴──────────────┴────────────┘
```
Note how `starter` now shows as "built" and points to
`~/.llama/distributions`, overriding the built-in version.
## Breaking Changes
**No breaking changes** - This is a bug fix that improves user
experience with minimal risk:
- No programmatic parsing of output found in the codebase
- Table format is clearly for human consumption
- The new "Source" column helps users understand where distributions
come from
- The behavior change is exactly what users expect (seeing all available
distributions)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Added a script to cleanup recordings. While doing this, moved the CI
matrix generation to a separate script so there is a single source of
truth for the matrix.
Ran the cleanup script as:
```
PYTHONPATH=. python scripts/cleanup_recordings.py
```
Also added this as part of the pre-commit workflow to ensure that the
recordings are always up to date and that no stale recordings are left
in the repo.
# What does this PR do?
These were maybe included in the webmethod?
The unit test was pointless too, since the request was never used
anywhere.
This shouldn't be in the API definition if we never consume it.
## Test Plan
CI with pre-commit on OpenAPI spec generation.
Signed-off-by: Sébastien Han <seb@redhat.com>
RAG aka file search is implemented via the Responses API by specifying
the file-search tool. The backend implementation remains unchanged. This
PR merely removes the directly exposed API surface which allowed users
to directly perform searches from the client.
This facility is now available via the `client.vector_store.search()`
OpenAI compatible API.
- Removes the deprecated agents (sessions and turns) API that was marked
alpha in 0.3.0
- Cleans up unused imports and orphaned types after the API removal
- Removes `SessionNotFoundError` and `AgentTurnInputType` which are no
longer needed
The agents API is completely superseded by the Responses + Conversations
APIs, and the client SDK Agent class already uses those implementations.
Corresponding client-side PR:
https://github.com/llamastack/llama-stack-client-python/pull/295
# What does this PR do?
This PR migrates `unittest` to `pytest` in
`tests/unit/providers/nvidia/test_eval.py`.
Part of https://github.com/llamastack/llama-stack/issues/2680
Supersedes https://github.com/llamastack/llama-stack/pull/2791
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
This PR removes all routes which we had marked deprecated for the 0.3.0
release.
This includes:
- all the `/v1/openai/v1/` routes (the corresponding /v1 routes still
exist of course)
- the /agents API (which is superseded completely by Responses +
Conversations)
- several alpha routes which had a "v1" route to aid transitioning to
"v1alpha"
This is the corresponding client-python change:
https://github.com/llamastack/llama-stack-client-python/pull/294
The llama-stack-client now uses /`v1/openai/v1/models` which returns
OpenAI-compatible model objects with 'id' and 'custom_metadata' fields
instead of the Resource-style 'identifier' field. Updated api_recorder
to handle the new endpoint and modified tests to access model metadata
appropriately. Deleted stale model recordings for re-recording.
**NOTE: CI will be red on this one since it is dependent on
https://github.com/llamastack/llama-stack-client-python/pull/291/files
landing. I verified locally that it is green.**
# What does this PR do?
This API hasn't received any traction and close to zero interest from
the community. Let's revisit in the future if things change.
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
We need to remove `/v1/openai/v1` paths shortly. There is one trouble --
our current `/v1/openai/v1/models` endpoint provides different data than
`/v1/models`. Unfortunately our tests target the latter (llama-stack
customized) behavior. We need to get to true OpenAI compatibility.
This is step 1: adding `custom_metadata` field to `OpenAIModel` that
includes all the extra stuff we add in the native `/v1/models` response.
This can be extracted on the consumer end by looking at
`__pydantic_extra__` or other similar fields.
This PR:
- Adds `custom_metadata` field to `OpenAIModel` class in
`src/llama_stack/apis/models/models.py`
- Modified `openai_list_models()` in
`src/llama_stack/core/routing_tables/models.py` to populate
custom_metadata
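Roughly, the shape being added looks like this (a sketch, not the exact class definition; defaults are assumptions):
```python
from pydantic import BaseModel

class OpenAIModel(BaseModel):
    id: str
    object: str = "model"
    created: int
    owned_by: str = "llama_stack"
    # Carries the llama-stack-specific extras (provider ids, model type,
    # metadata, ...) that the native /v1/models response exposes.
    custom_metadata: dict | None = None
```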
Next Steps
1. Update stainless client to use `/v1/openai/v1/models` instead of
`/v1/models`
2. Migrate tests to read from `custom_metadata`
3. Remove `/v1/openai/v1/` prefix entirely and consolidate to single
`/v1/models` endpoint
Fixes race condition causing "database is locked" errors during
concurrent writes to SQLite, particularly in streaming responses with
guardrails where multiple inference calls write simultaneously.
Enable Write-Ahead Logging (WAL) mode for SQLite which allows multiple
concurrent readers and one writer without blocking. Set busy_timeout to
5s so SQLite retries instead of failing immediately. Remove the logic
that disabled write queues for SQLite since WAL mode eliminates the
locking issues that prompted disabling them.
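A minimal sketch of the SQLite settings involved (shown with the stdlib `sqlite3` module; the actual store applies them through its own connection layer):
```python
import sqlite3

conn = sqlite3.connect("llama_stack.db")
# Allow concurrent readers alongside a single writer.
conn.execute("PRAGMA journal_mode=WAL;")
# Retry for up to 5 seconds instead of failing immediately with "database is locked".
conn.execute("PRAGMA busy_timeout=5000;")
```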
Fixes: test_output_safety_guardrails_safe_content[stream=True] flake
# What does this PR do?
This prevents interference from already running servers, and allows
multiple concurrent integration test runs. Unleash the AIs!
## Test Plan
Start an LS server at port 8321.
Then observe that the test run uses port 8322:
❯ uv run --no-sync ./scripts/integration-tests.sh --stack-config
server:ci-tests --inference-mode replay --setup ollama --suite base
--pattern '(telemetry or safety)'
=== Llama Stack Integration Test Runner ===
Stack Config: server:ci-tests
Setup: ollama
Inference Mode: replay
Test Suite: base
Test Subdirs:
Test Pattern: (telemetry or safety)
Checking llama packages
llama-stack 0.4.0.dev0 /Users/erichuang/projects/new_test_server
llama-stack-client 0.3.0
ollama 0.6.0
=== Applying Setup Environment Variables ===
Setting SQLITE_STORE_DIR:
/var/folders/cz/vyh7y1d11xg881lsxsshnc5c0000gn/T/tmp.bKLsaVAxyU
Setting stack config type: server
Setting up environment variables:
export OLLAMA_URL='http://0.0.0.0:11434'
export SAFETY_MODEL='ollama/llama-guard3:1b'
Will use port: 8322
=== Starting Llama Stack Server ===
Waiting for Llama Stack Server to start on port 8322...
✅ Llama Stack Server started successfully
This commit introduces Mergify, a powerful bot designed to assist with
automated merging and other CI-related tasks. As an initial step, we
enable a basic feature: automatically notifying users when a pull
request has merge conflicts.
When a conflict is detected, Mergify will add a label to the PR. This
label will be removed once the conflict is resolved.
This is foundation PR to activate the bot and start using it for
backports too.
In the future, we plan to expand Mergify’s role to include auto-merging,
as discussed in #1667, once the project is ready.
Signed-off-by: Sébastien Han <seb@redhat.com>
Without this hint Qwen3-0.6B tends to reply with the full name
and sometimes doesn't reply with the correct drafted year.
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
This seems to be an ancient artifact from when we were using readthedocs.
Now Docusaurus reads the specs directly.
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Fixes latent bug where UV_INDEX_STRATEGY was only exported to GITHUB_ENV
but not to the current shell.
While this bug doesn't currently affect main (since UV_EXTRA_INDEX_URL
is only set on release branches), it's a latent bug that could cause
issues if the logic changes in the future or if someone tests with
UV_EXTRA_INDEX_URL set.
The setup-runner action only exported UV_INDEX_STRATEGY to GITHUB_ENV
(for subsequent steps), not to the current shell environment. Since uv
sync runs in the same step, it would never see the variable if it were
set.
This fix adds `export UV_INDEX_STRATEGY=unsafe-best-match` to make the
variable available in the current shell before running uv commands.
Related: #4019 (same fix for release-0.3.x where the bug is actively
triggered)
# What does this PR do?
llama stack run --providers takes a list of providers in the format
api1=provider1,api2=provider2.
This allows users to run with a simple list of providers.
Given the architecture of `create_app`, this run config needs to be
written to disk; use ~/.llama/distribution/providers-run/run.yaml each
time for consistency.
resolves #3956
## Test Plan
New unit tests to ensure --providers works.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Fixes container builds failing with UV index strategy errors when build
args are passed with empty values.
Docker ARGs declared with empty defaults (ARG UV_INDEX_STRATEGY="")
become environment variables with empty string values in RUN commands.
UV interprets these as if --index-strategy "" was passed on the command
line, causing build failures with "error: a value is required for
'--index-strategy <UV_INDEX_STRATEGY>'".
This is a footgun because empty string ≠ unset variable, and ARGs
silently propagate to all RUN commands, only failing when declared with
empty defaults.
The fix unsets UV_EXTRA_INDEX_URL and UV_INDEX_STRATEGY at the start of
RUN blocks, saves the values early, and only restores them for editable
installs with RC dependencies. All other install modes (PyPI, test-pypi,
client) now run with a clean environment.
Backports UV index configuration fixes from `release-0.3.x` (PR #4002).
The main issue: when we created the release branch infrastructure, we
configured UV to use `test.pypi` as the PRIMARY index to resolve RC
dependencies. This caused UV to look for ALL packages there first, which
led to problems - some packages don't have binary wheels on `test.pypi`,
so UV tried building from source and failed (like the `psycopg2-binary`
issue we hit).
The fix is simple: use PyPI as primary (default) and `test.pypi` as an
EXTRA index. UV will check PyPI first for everything, and only fall back
to `test.pypi` for packages not found there (like our RC client
versions).
This PR includes:
- Fixed `install-llama-stack-client` action to output
`UV_EXTRA_INDEX_URL` instead of `UV_INDEX_URL`
- New `uv-run-with-index.sh` wrapper that auto-detects release branches
and sets UV env vars
- Updated pre-commit hooks (`uv-lock`, codegen, etc.) to use the wrapper
- Pass UV env vars as Docker build args in all locations
- Scope UV env vars properly in Containerfile (inline for llama-stack
install, explicitly unset before distribution deps)
- Export UV env vars to `GITHUB_ENV` in setup-runner for cross-step
persistence
The wrapper detects release branches automatically in both CI and local
environments, so this "just works" without manual configuration. On main
(non-release branch), the wrapper becomes a no-op.
Tested and validated on `release-0.3.x` where all CI checks pass.
# What does this PR do?
Allow filtering for v1alpha, v1beta, deprecated and v1. Backward
incompatible change since by default it only returns v1 apis now.
## Test Plan
added unit test
Fixes CI failures on release branches where uv sync can't resolve RC
dependencies.
The problem: on release branches like `release-0.3.x`, pyproject.toml
requires `llama-stack-client>=0.3.1rc1`. But RC versions only exist on
test.pypi, not PyPI. So uv sync fails before we even get a chance to
install the client from git.
The fix is simple - on release branches, pre-install the client from the
matching git branch first, then run uv sync. This satisfies the RC
requirement and lets dependency resolution succeed.
Modified setup-runner and pre-commit workflows to do this. Also cleaned
up some duplicate logic in setup-test-environment that's now handled
centrally.
Example failure:
5415478835
Replace unused `LLAMA_STACK_CLIENT_DIR` env var (from old `llama stack
build`) with direct `uv pip install` for release branch client
installation.
cc @ehhuang
# What does this PR do?
Add rerank API for NVIDIA Inference Provider.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3278
## Test Plan
Unit test:
```
pytest tests/unit/providers/nvidia/test_rerank_inference.py
```
Integration test:
```
pytest -s -v tests/integration/inference/test_rerank.py --stack-config="inference=nvidia" --rerank-model=nvidia/nvidia/nv-rerankqa-mistral-4b-v3 --env NVIDIA_API_KEY="" --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
Standardize CI workflows to use `release-X.Y.x` branch pattern instead
of multiple numeric variants.
That's the pattern we are settling on. See
https://github.com/llamastack/llama-stack-ops/pull/20 for reference.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR fixes the handling of the external_providers_dir configuration
field to align with its ongoing deprecation, in favor of the provider
`module` specification approach.
It addresses the issue in #3950, where using the default provided
run.yaml config resulted in the `external_providers_dir` parameter being
set to the literal string `None`, and crashing the llama-stack server
when starting.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3950
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
- Built a new container image from `podman build . -f
containers/Containerfile --build-arg DISTRO_NAME=starter --tag
llama-stack:starter`
- Tested it locally with `podman run -it localhost/llama-stack:starter`
- Tested it on an OpenShift 4.19 cluster, deployed via the
llama-stack-k8s-operator.
Signed-off-by: Doug Edgar <dedgar@redhat.com>
… case variations
The ollama/llama3.2:3b-instruct-fp16 model returns string values with
trailing whitespace in structured JSON output. Updated test assertions
to use case-insensitive substring matching instead of exact equality.
- Use `.lower()` for case-insensitive comparison
- Check whether the expected value is contained in the actual value (handles
whitespace)
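For reference, the assertion pattern is roughly the following (values are made up):
```python
# Tolerate case and whitespace variations in the model's structured output.
actual = "Michael Jordan \n"   # hypothetical model output with trailing whitespace
expected = "michael jordan"
assert expected.lower() in actual.lower()
```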
Closes: #3996
Signed-off-by: Derek Higgins <derekh@redhat.com>
We will be updating our release procedure to be more "normal" or "sane".
We will
- create release branches like normal people
- land cherry-picks onto those branches
- run releases off of those branches
- no more "rc" branch pollution either
Given that, this PR cleans things up a bit
- Remove `-maint` suffix from release branch patterns in CI workflows
- Update branch matching to `release-X.Y.x` format
This should be "remote::vllm". The incorrect value causes some log probs
tests to be skipped with remote vLLM (they fail if run).
Signed-off-by: Derek Higgins <derekh@redhat.com>
`mypy` became very slow for the common path, which can make local
pre-commit runs very slow. Let's restore fast local runs:
- restore fast mirrors-mypy hook for local runs
- add optional mypy-full hook and docs so devs can match CI
- run full mypy in CI with a hint when failures occur
### Test Plan
- uv run pre-commit run mypy --all-files
- uv run pre-commit run mypy-full --hook-stage manual --all-files
- uv run --group dev --group type_checking mypy
# What does this PR do?
chunk_id in the Chunk class executes actual logic to compute a chunk ID.
This sort of logic should not live in the API spec.
Instead, the providers should be in charge of calling generate_chunk_id,
and pass it to `Chunk`.
This removes the incorrect dependency between the provider impl and the API impl.
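A rough sketch of the intended split, using a stand-in helper (the real generate_chunk_id may have a different signature):
```python
import hashlib
from dataclasses import dataclass

def make_chunk_id(document_id: str, text: str) -> str:
    # Stand-in for the stack's generate_chunk_id helper.
    return hashlib.sha256(f"{document_id}:{text}".encode()).hexdigest()[:16]

@dataclass
class Chunk:  # simplified stand-in for the API model: it only stores the ID
    chunk_id: str
    content: str

# The provider computes the ID and passes it in; the data model does no work.
chunk = Chunk(chunk_id=make_chunk_id("doc-1", "hello world"), content="hello world")
```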
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Running `./scripts/integration-tests.sh --network host` on a Mac fails
regularly due to how Docker runs on macOS.
If on macOS, keep bridge network mode.
before:
=== Starting Docker Container ===
Using image: localhost/distribution-ci-tests:dev
WARNING: Published ports are discarded when using host network mode
Waiting for Docker container to start...
❌ Docker container failed to start
Container logs:
INFO 2025-10-29 18:38:32,180 llama_stack.cli.stack.run:100 cli: Using
run configuration:
/workspace/src/llama_stack/distributions/ci-tests/run.yaml
... (stack starts but is not reachable on network)
after:
=== Starting Docker Container ===
Using image: localhost/distribution-ci-tests:dev
Using bridge networking with port mapping (non-Linux)
Waiting for Docker container to start...
✅ Docker container started successfully
=== Running Integration Tests ===
## Test Plan
integration tests pass!
Signed-off-by: Charlie Doern <cdoern@redhat.com>
## Summary
When users provide API keys via `X-LlamaStack-Provider-Data` header,
`models.list()` now returns models they can access from those providers,
not just pre-registered models from the registry.
This complements the routing fix from f88416ef8 which enabled inference
calls with `provider_id/model_id` format for unregistered models. Users
can now discover which models are available to them before making
inference requests.
The implementation reuses
`NeedsRequestProviderData.get_request_provider_data()` to validate
credentials, then dynamically fetches models from providers without
caching them since they're user-specific. Registry models take
precedence to respect any pre-configured aliases.
## Test Script
```python
#!/usr/bin/env python3
import json
import os
from openai import OpenAI
# Test 1: Without provider_data header
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="dummy")
models = client.models.list()
anthropic_without = [m.id for m in models.data if m.id and "anthropic" in m.id]
print(f"Without header: {len(models.data)} models, {len(anthropic_without)} anthropic")
# Test 2: With provider_data header containing Anthropic API key
anthropic_api_key = os.environ["ANTHROPIC_API_KEY"]
client_with_key = OpenAI(
base_url="http://localhost:8321/v1/openai/v1",
api_key="dummy",
default_headers={
"X-LlamaStack-Provider-Data": json.dumps({"anthropic_api_key": anthropic_api_key})
}
)
models_with_key = client_with_key.models.list()
anthropic_with = [m.id for m in models_with_key.data if m.id and "anthropic" in m.id]
print(f"With header: {len(models_with_key.data)} models, {len(anthropic_with)} anthropic")
print(f"Anthropic models: {anthropic_with}")
assert len(anthropic_with) > len(anthropic_without), "Should have more anthropic models with API key"
print("\n✓ Test passed!")
```
Run with a stack that has Anthropic provider configured (but without API
key in config):
```bash
ANTHROPIC_API_KEY=sk-ant-... python test_provider_data_models.py
```
## Summary
Fixes all mypy type errors in `providers/inline/agents/meta_reference/`
and removes exclusions from pyproject.toml.
## Changes
- Fix type annotations for Safety API message parameters
(OpenAIMessageParam)
- Add Action enum usage in access control checks
- Correct method signatures to match API supertype (parameter ordering)
- Handle optional return types with proper None checks
- Remove 3 meta_reference exclusions from mypy config
**Files fixed:** 25 errors across 3 files (safety.py, persistence.py,
agents.py)
## Summary
Resolves all mypy errors in meta reference agent OpenAI responses
implementation by adding proper type narrowing, None checks, and
Sequence type support.
## Changes
- Fixed streaming.py, openai_responses.py, utils.py, tool_executor.py,
agent_instance.py
- Added Sequence type support to schema generator (ensures correct JSON
schema generation)
- Applied union type narrowing and None checks throughout
## Test plan
- All modified files pass mypy type checking (0 errors)
- Schema generator produces correct `type: array` for Sequence types
---------
Co-authored-by: Claude <noreply@anthropic.com>
Error fixes in Agents implementation (`meta-reference` provider) --
adding proper type annotations and using type narrowing for optional
attributes. Essentially a bunch of `if x and (x_foo := getattr(x, "foo"))`
instead of accessing `x.foo` directly.
Part of ongoing mypy remediation effort.
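For illustration, the narrowing pattern looks roughly like this (simplified, not the actual agent code):
```python
from typing import Any

def describe(x: Any) -> str:
    # Bind the optional attribute once, then use the narrowed local name.
    if x and (foo := getattr(x, "foo", None)):
        return f"foo={foo}"
    return "no foo"
```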
---------
Co-authored-by: Claude <noreply@anthropic.com>
# What does this PR do?
This commit adds a new pre-commit hook to scan for non-FIPS-compliant
function usage within llama-stack.
Closes #3427
## Test Plan
Ran locally
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
This adds automated backward compatibility testing for `run.yaml` files.
As we evolve `StackRunConfig`, changes can inadvertently break existing
user configurations. This workflow catches those breaks before merge.
We test old run.yaml files (from main and the latest release) against
the PR's new code. If configs that worked before now fail, the PR is
blocked unless explicitly acknowledged as a breaking change.
**Two test layers:**
- Schema validation: Quick pytest checks that configs parse without
errors
- Integration tests: Full test suite execution to catch runtime semantic
issues (cross-field validations, provider initialization, etc.)
**What we test against:**
- main branch: Breaking changes here block the PR (this is the gate)
- Latest release: Informational only - shows if we've drifted from what
users have
If tests fail, the PR author must acknowledge the breaking change by
adding `!:` to the PR title (e.g., `feat!: change xyz`) or including
`BREAKING CHANGE:` in a commit message. Once acknowledged, the check
passes with a warning.
These jobs are run:
1. `check-main-compatibility` - Schema validation of all distribution
run.yaml files from main
2. `test-integration-main` - Full integration test suite using main's
ci-tests run.yaml
3. `test-integration-release` - Integration tests with latest release
config (informational)
4. `check-schema-release-compatibility` - Schema checks against release
(informational)
The integration tests catch issues that schema validation alone would
miss, like assertion failures in
`StackRunConfig.validate_server_stores()` or provider-specific runtime
logic.
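Roughly, the schema-validation layer amounts to something like the sketch below (the import path and file globbing are assumptions, not the exact workflow code):
```python
import pathlib

import pytest
import yaml

from llama_stack.core.datatypes import StackRunConfig  # assumed import path

RUN_YAMLS = sorted(pathlib.Path("src/llama_stack/distributions").glob("*/run.yaml"))

@pytest.mark.parametrize("path", RUN_YAMLS, ids=lambda p: p.parent.name)
def test_run_yaml_still_parses(path: pathlib.Path) -> None:
    # Raises if the PR's StackRunConfig no longer accepts an existing config.
    StackRunConfig(**yaml.safe_load(path.read_text()))
```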
Resolves #3311
Related to #3237
Remove unused methods that became obsolete after d266c59c:
- `_compute_and_log_token_usage`
- `_count_tokens`
- `stream_tokens_and_compute_metrics`
- `count_tokens_and_compute_metrics`
These methods are no longer referenced anywhere in the codebase
following the removal of deprecated inference.chat_completion
implementations.
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
# What does this PR do?
- Adds OpenAI files provider
- Note that file content retrieval is pretty limited by `purpose`
https://community.openai.com/t/file-uploads-error-why-can-t-i-download-files-with-purpose-user-data/1357013?utm_source=chatgpt.com
## Test Plan
Modify run yaml to use openai files provider:
```
files:
  - provider_id: openai
    provider_type: remote::openai
    config:
      api_key: ${env.OPENAI_API_KEY:=}
      metadata_store:
        backend: sql_default
        table_name: openai_files_metadata
# Then run files tests
❯ uv run --no-sync ./scripts/integration-tests.sh --stack-config server:ci-tests --inference-mode replay --setup ollama --suite base --pattern test_files
```
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.
Here's the situation: assume a remote inference provider which works
only when users provide their own API keys via
`X-LlamaStack-Provider-Data` header. By definition, we cannot list
models and hence update our routing registry. But because we _require_ a
provider ID in the models now, we can identify which provider to route
to and let that provider decide.
Note that we still try to look up our registry since it may have a
pre-registered alias. Just that we don't outright fail when we are not
able to look it up.
Also, updated inference router so that the responses have the _exact_
model that the request had.
## Test Plan
Added an integration test
Closes #3929
---------
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Adds type stubs and fixes mypy errors for better type coverage.
Changes:
- Added type_checking dependency group with type stubs (torchtune, trl,
etc.)
- Added lm-format-enforcer to pre-commit hook
- Created HFAutoModel Protocol for type-safe HuggingFace model handling
- Added mypy.overrides for untyped libraries (torchtune, fairscale,
etc.)
- Fixed type issues in post-training providers, databricks, and
api_recorder
Note: ~1,200 errors remain in excluded files (see pyproject.toml exclude
list).
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
- Fix OpenAI SDK NotGiven/Omit type mismatches in embeddings calls
- Fix incorrect OpenAIChatCompletionChunk import in vllm provider
- Refactor to avoid type:ignore comments by using conditional kwargs
## Changes
**openai_mixin.py (9 errors fixed):**
- Build kwargs conditionally for embeddings.create() to avoid
NotGiven/Omit mismatch
- Only include parameters when they have actual values (not None)
**gemini.py (9 errors fixed):**
- Apply same conditional kwargs pattern
- Add missing Any import
**vllm.py (2 errors fixed):**
- Use correct OpenAIChatCompletionChunk from llama_stack.apis.inference
- Remove incorrect alias from openai package
## Technical Notes
The OpenAI SDK has a type system quirk where `NOT_GIVEN` has type
`NotGiven` but parameter signatures expect `Omit`. By only passing
parameters with actual values, we avoid this mismatch entirely without
needing `# type: ignore` comments.
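The pattern looks roughly like this (illustrative names, not the exact mixin code):
```python
from typing import Any

def build_embedding_kwargs(
    model: str,
    input_texts: list[str],
    dimensions: int | None = None,
    user: str | None = None,
) -> dict[str, Any]:
    # Only include parameters that have real values; omitted keys let the
    # OpenAI SDK apply its own defaults, sidestepping NotGiven/Omit entirely.
    kwargs: dict[str, Any] = {"model": model, "input": input_texts}
    if dimensions is not None:
        kwargs["dimensions"] = dimensions
    if user is not None:
        kwargs["user"] = user
    return kwargs

# client.embeddings.create(**build_embedding_kwargs("text-embedding-3-small", ["hi"]))
```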
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Fixes mypy type errors in provider utilities and testing infrastructure:
- `mcp.py`: Cast incompatible client types, wrap image data properly
- `batches.py`: Rename walrus variable to avoid shadowing
- `api_recorder.py`: Use cast for Pydantic field annotation
No functional changes.
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
Fixes mypy type errors across 4 model implementation files (Phase 2d of
mypy suppression removal plan):
- `src/llama_stack/models/llama/llama3/multimodal/image_transform.py`
(10 errors fixed)
- `src/llama_stack/models/llama/checkpoint.py` (2 errors fixed)
- `src/llama_stack/models/llama/hadamard_utils.py` (1 error fixed)
- `src/llama_stack/models/llama/llama3/multimodal/encoder_utils.py` (1
error fixed)
## Changes
### image_transform.py
- Fixed return type annotation for `find_supported_resolutions` from
`Tensor` to `list[tuple[int, int]]`
- Fixed parameter and return type annotations for
`resize_without_distortion` from `Tensor` to `Image.Image`
- Resolved variable shadowing by using separate names:
`possible_resolutions_list` for the list and
`possible_resolutions_tensor` for the tensor
### checkpoint.py
- Replaced deprecated `torch.BFloat16Tensor` and
`torch.cuda.BFloat16Tensor` with
`torch.set_default_dtype(torch.bfloat16)`
- Fixed variable shadowing by renaming numpy array to `ckpt_paths_array`
to distinguish from the parameter `ckpt_paths: list[Path]`
### hadamard_utils.py
- Added `isinstance` assertion to narrow type from `nn.Module` to
`nn.Linear` before accessing `in_features` attribute
### encoder_utils.py
- Fixed variable shadowing by using `masks_list` for list accumulation
and `masks` for the final Tensor result
## Test plan
- Verified all files pass mypy type checking (only optional dependency
import warnings remain)
- No functional changes - only type annotations and variable naming
improvements
Stacks on PR #3933
Co-authored-by: Claude <noreply@anthropic.com>
Fixes mypy type errors in OpenTelemetry integration:
- Add type aliases for AttributeValue and Attributes
- Add helper to filter None values from attributes (OpenTelemetry
doesn't accept None)
- Cast metric and tracer objects to proper types
- Update imports after refactoring
No functional changes.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR makes changes to the Responses API schema to introduce
OpenAI-compatible prompts. It changes the API only, so there is currently no
implementation. The follow-up PR with the actual implementation will be
submitted after this PR lands.
The need of this functionality was initiated in #3514.
> Note: #3514 is divided into three separate PRs. The current PR is the
second of three.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
CI
## Summary
This PR adds mypy and essential type stub packages to dev dependencies
as Phase 1 of the mypy suppression removal plan.
**Changes:**
- Add `mypy` to dev dependencies
- Add type stubs: `types-jsonschema`, `pandas-stubs`, `types-psutil`,
`types-tqdm`, `boto3-stubs`
**Impact:**
- Enables static type checking across the codebase
- Eliminates ~30 type checking errors related to missing type
information for third-party packages
- Provides foundation for subsequent PRs to remove type suppressions
**Part of:** Mypy suppression removal plan (Phase 1/4)
**Testing:**
```bash
uv sync --group dev
uv run mypy
```
# What does this PR do?
To match https://github.com/llamastack/llama-stack/pull/3847, we must not
update the lock manually but always reflect the update in pyproject.toml. The
lock is a snapshot of state at build time.
Signed-off-by: Sébastien Han <seb@redhat.com>
## Summary
- `preserve_contexts_async_generator` left `PROVIDER_DATA_VAR` (and
other context vars) populated after a streaming generator completed on
HEAD~1, so the asyncio context for request N+1 started with request N's
provider payload.
- FastAPI dependencies and middleware execute before
`request_provider_data_context` rebinds the header data, meaning
auth/logging hooks could observe a prior tenant's credentials or treat
them as authenticated. Traces and any background work that inspects the
context outside the `with` block leak as well—this is a real security
regression, not just a CLI artifact.
- The wrapper now restores each tracked `ContextVar` to the value it
held before the iteration (falling back to clearing when necessary)
after every yield and when the generator terminates, so provider data is
wiped while callers that set their own defaults keep them.
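A much-simplified sketch of the restore behavior (the real wrapper also restores around every yield, not only at termination):
```python
from contextvars import ContextVar
from typing import Any, AsyncGenerator

async def preserve_contexts(
    gen: AsyncGenerator[Any, None], tracked: list[ContextVar]
) -> AsyncGenerator[Any, None]:
    # Capture the values each tracked ContextVar held before iteration starts.
    saved = [(var, var.get(None)) for var in tracked]
    try:
        async for item in gen:
            yield item
    finally:
        # Restore the pre-iteration values so request N's provider data
        # never leaks into request N+1.
        for var, value in saved:
            var.set(value)
```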
## Test Plan
- `uv run pytest tests/unit/core/test_provider_data_context.py -q`
- `uv run pytest tests/unit/distribution/test_context.py -q`
Both suites fail on HEAD~1 and pass with this change.
# What does this PR do?
add provider-data key passing support to Cerebras, Databricks, NVIDIA
and RunPod
also, added missing tests for Fireworks, Anthropic, Gemini, SambaNova,
and vLLM
addresses #3517
## Test Plan
ci w/ new tests
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Migrates package structure to src/ layout following Python packaging
best practices.
All code moved from `llama_stack/` to `src/llama_stack/`. Public API
unchanged - imports remain `import llama_stack.*`.
Updated build configs, pre-commit hooks, scripts, and GitHub workflows
accordingly. All hooks pass, package builds cleanly.
**Developer note**: Reinstall after pulling: `pip install -e .`
# What does this PR do?
Introduces two main fixes to enhance the stability of Responses API when
dealing with tool calling responses and structured outputs.
### Changes Made
1. It added OpenAIResponseOutputMessageMCPCall and ListTools to
OpenAIResponseInput, but
https://github.com/llamastack/llama-stack/pull/3810 got merged and did the
same in a different way. Still, this PR does it in a way that keeps
OpenAIResponsesOutput and the allowed objects in OpenAIResponseInput in sync.
2. Add protection in case self.ctx.response_format does not have a type
attribute.
BREAKING CHANGE: OpenAIResponseInput now uses OpenAIResponseOutput union
type.
This is semantically equivalent - all previously accepted types are
still supported
via the OpenAIResponseOutput union. This improves type consistency and
maintainability.
This patch ensures that if max_tokens is not defined, it is set to None
instead of 0 when calling openai_chat_completion. This way, providers (like
Gemini) that cannot handle `max_tokens = 0` will not fail.
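The guard amounts to something like this (names assumed):
```python
def normalize_max_tokens(max_tokens: int | None) -> int | None:
    # 0 or unset means "no explicit limit": send None so providers such as
    # Gemini don't reject max_tokens=0.
    return max_tokens if max_tokens else None

assert normalize_max_tokens(0) is None
assert normalize_max_tokens(None) is None
assert normalize_max_tokens(256) == 256
```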
Issue: #3666
The vector_provider_wrapper was only limiting providers to
faiss/sqlite-vec for replay mode, but CI tests also run in record mode
with the same limited set of providers. This caused test failures when
trying to test against milvus, chromadb, pgvector, weaviate, and qdrant
which aren't configured in the record job.
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 24.8.1 to 24.9.1.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom)
from 19.2.1 to 19.2.2.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
Add static file import system for docs
- Use `remark-code-import` plugin to embed code at build time
- Support importing Python code with syntax highlighting using
`raw-loader` + `ReactMarkdown`
One caveat is that currently when embedding markdown with code used the
syntax highlighting isn't behaving but I'll investigate that in a follow
up.
## Test Plan
Python Example:
<img width="1372" height="995" alt="Screenshot 2025-10-23 at 9 22 18 PM"
src="https://github.com/user-attachments/assets/656d2c78-4d9b-45a4-bd5e-3f8490352b85"
/>
Markdown example:
<img width="1496" height="1070" alt="Screenshot 2025-10-23 at 9 22
38 PM"
src="https://github.com/user-attachments/assets/6c0a07ec-ff7c-45aa-b05f-8c46acd4445c"
/>
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Clean up telemetry code since the telemetry API has been removed.
- moved telemetry files out of providers to core
- removed from Api
## Test Plan
❯ OTEL_SERVICE_NAME=llama_stack
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 uv run llama stack run
starter
❯ curl http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Hello!"
}
]
}'
-> verify traces in Grafana
CI
# What does this PR do?
https://platform.openai.com/docs/api-reference/moderations supports an
optional model parameter.
This PR adds support for using moderations API with model=None if a
default shield id is provided via safety config.
## Test Plan
added tests
manual test:
```
> SAFETY_MODEL='together/meta-llama/Llama-Guard-4-12B' uv run llama stack run starter
> curl http://localhost:8321/v1/moderations \
-H "Content-Type: application/json" \
-d '{
"input": [
"hello"
]
}'
```
Migrates k8s run configs to match the updated run configs
- Replace storage.references with storage.stores
- Wrap resources under registered_resources section
- Update provider configs to use persistence with namespace/backend
- Add telemetry and vector_stores top-level sections
- Simplify agent/files metadata store configuration
Let us enable responses suite in CI now.
Also a minor fix: MCP tool tests intentionally trigger authentication
failures to verify error handling, but the resulting error logs clutter
test output.
Our unit test outputs are filled with all kinds of obscene logs. This
makes it really hard to spot real issues quickly. The problem is that
these logs are necessary to output at the given logging level when the
server is operating normally. It's just that we don't want to see some
of them (especially the noisy ones) during tests.
This PR begins the cleanup. We use pytest's caplog fixture for suppression.
# What does this PR do?
## Test Plan
```
~/projects/lst3 remotes/origin/HEAD*
.venv ❯ curl http://localhost:8321/v1/moderations \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"input": [
"hello"
]
}'
{"detail":"Invalid value: No shield associated with provider_resource id gpt-4o-mini: choose from ['together/meta-llama/Llama-Guard-4-12B']"}
```
Move the conversation sync logic before the yield to ensure it executes even
when streaming consumers break early after receiving the response.completed
event.
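A simplified sketch of the ordering (illustrative names, not the actual implementation):
```python
from typing import Any, AsyncIterator, Awaitable, Callable

async def stream_response(
    events: AsyncIterator[dict[str, Any]],
    sync_conversation: Callable[[dict[str, Any]], Awaitable[None]],
) -> AsyncIterator[dict[str, Any]]:
    async for event in events:
        if event.get("type") == "response.completed":
            # Sync before yielding: this runs even if the consumer stops
            # iterating immediately after seeing response.completed.
            await sync_conversation(event)
        yield event
```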
## Test Plan
```
OLLAMA_URL=http://localhost:11434 \
pytest -sv tests/integration/responses/ \
--stack-config server:ci-tests \
--text-model ollama/llama3.2:3b-instruct-fp16 \
--inference-mode live \
-k conversation_multi
```
This test now passes.
Bumps [openai](https://github.com/openai/openai-python) from 1.107.0 to
2.5.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/releases">openai's
releases</a>.</em></p>
<blockquote>
<h2>v2.5.0</h2>
<h2>2.5.0 (2025-10-17)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.4.0...v2.5.0">v2.4.0...v2.5.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> api update (<a
href="8b280d57d6">8b280d5</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>bump <code>httpx-aiohttp</code> version to 0.1.9 (<a
href="67f2f0afe5">67f2f0a</a>)</li>
</ul>
<h2>v2.4.0</h2>
<h2>2.4.0 (2025-10-16)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.3.0...v2.4.0">v2.3.0...v2.4.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Add support for gpt-4o-transcribe-diarize on
audio/transcriptions endpoint (<a
href="bdbe9b8f44">bdbe9b8</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>fix dangling comment (<a
href="da14e99606">da14e99</a>)</li>
<li><strong>internal:</strong> detect missing future annotations with
ruff (<a
href="2672b8f072">2672b8f</a>)</li>
</ul>
<h2>v2.3.0</h2>
<h2>2.3.0 (2025-10-10)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.2.0...v2.3.0">v2.2.0...v2.3.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> comparison filter in/not in (<a
href="aa49f626a6">aa49f62</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li><strong>package:</strong> bump jiter to >=0.10.0 to support
Python 3.14 (<a
href="https://redirect.github.com/openai/openai-python/issues/2618">#2618</a>)
(<a
href="aa445cab5c">aa445ca</a>)</li>
</ul>
<h2>v2.2.0</h2>
<h2>2.2.0 (2025-10-06)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.1.0...v2.2.0">v2.1.0...v2.2.0</a></p>
<h3>Features</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/blob/main/CHANGELOG.md">openai's
changelog</a>.</em></p>
<blockquote>
<h2>2.5.0 (2025-10-17)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.4.0...v2.5.0">v2.4.0...v2.5.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> api update (<a
href="8b280d57d6">8b280d5</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>bump <code>httpx-aiohttp</code> version to 0.1.9 (<a
href="67f2f0afe5">67f2f0a</a>)</li>
</ul>
<h2>2.4.0 (2025-10-16)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.3.0...v2.4.0">v2.3.0...v2.4.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Add support for gpt-4o-transcribe-diarize on
audio/transcriptions endpoint (<a
href="bdbe9b8f44">bdbe9b8</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>fix dangling comment (<a
href="da14e99606">da14e99</a>)</li>
<li><strong>internal:</strong> detect missing future annotations with
ruff (<a
href="2672b8f072">2672b8f</a>)</li>
</ul>
<h2>2.3.0 (2025-10-10)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.2.0...v2.3.0">v2.2.0...v2.3.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> comparison filter in/not in (<a
href="aa49f626a6">aa49f62</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li><strong>package:</strong> bump jiter to >=0.10.0 to support
Python 3.14 (<a
href="https://redirect.github.com/openai/openai-python/issues/2618">#2618</a>)
(<a
href="aa445cab5c">aa445ca</a>)</li>
</ul>
<h2>2.2.0 (2025-10-06)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v2.1.0...v2.2.0">v2.1.0...v2.2.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> dev day 2025 launches (<a
href="38ac0093eb">38ac009</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="513ae76253"><code>513ae76</code></a>
release: 2.5.0 (<a
href="https://redirect.github.com/openai/openai-python/issues/2694">#2694</a>)</li>
<li><a
href="ebf32212f7"><code>ebf3221</code></a>
release: 2.4.0</li>
<li><a
href="e043d7b164"><code>e043d7b</code></a>
chore: fix dangling comment</li>
<li><a
href="25cbb74f83"><code>25cbb74</code></a>
feat(api): Add support for gpt-4o-transcribe-diarize on
audio/transcriptions ...</li>
<li><a
href="8cdfd0650e"><code>8cdfd06</code></a>
codegen metadata</li>
<li><a
href="d5c64434b7"><code>d5c6443</code></a>
codegen metadata</li>
<li><a
href="b20a9e7b81"><code>b20a9e7</code></a>
chore(internal): detect missing future annotations with ruff</li>
<li><a
href="e5f93f5dae"><code>e5f93f5</code></a>
release: 2.3.0</li>
<li><a
href="044878859c"><code>0448788</code></a>
feat(api): comparison filter in/not in</li>
<li><a
href="85a91ade61"><code>85a91ad</code></a>
chore(package): bump jiter to >=0.10.0 to support Python 3.14 (<a
href="https://redirect.github.com/openai/openai-python/issues/2618">#2618</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/openai/openai-python/compare/v1.107.0...v2.5.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Extend the model type to include rerank models.
- Implement `rerank()` method in inference router.
- Add `rerank_model_list` to `OpenAIMixin` to enable providers to
register and identify rerank models
- Update documentation.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
```
pytest tests/unit/providers/utils/inference/test_openai_mixin.py
```
# What does this PR do?
Updated quickstart `demo_script.py` to use OpenAI APIs, which is simply:
```python
import io, requests
from openai import OpenAI
url="https://www.paulgraham.com/greatwork.html"
client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")
vs = client.vector_stores.create()
response = requests.get(url)
pseudo_file = io.BytesIO(str(response.content).encode('utf-8'))
uploaded_file = client.files.create(file=(url, pseudo_file, "text/html"), purpose="assistants")
client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file.id)
resp = client.responses.create(
    model="openai/gpt-4o",
    input="How do you do great work? Use the existing knowledge_search tool.",
    tools=[{"type": "file_search", "vector_store_ids": [vs.id]}],
    include=["file_search_call.results"],
)
print(resp)
```
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Metadata conflicts with the default embedding model set on the server side
via extra_body. This removes the check and simply lets metadata take
precedence over extra_body:
`ValueError: Embedding model inconsistent between metadata
('text-embedding-3-small') and extra_body
('sentence-transformers/nomic-ai/nomic-embed-text-v1.5')`
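In effect, the resolution becomes something like this (illustrative names):
```python
def resolve_embedding_model(metadata: dict, extra_body: dict) -> str | None:
    # Prefer the metadata value; fall back to extra_body instead of raising
    # when the two disagree.
    return metadata.get("embedding_model") or extra_body.get("embedding_model")
```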
## Test Plan
CI
# What does this PR do?
Fix a segfault when loading the model.
The cc-vec integration failed with segfault when used with default
embedding model on macOS
`model_id: nomic-ai/nomic-embed-text-v1.5` and `provider_id:
sentence-transformers`
Checked the crash report and saw this is due to torch OpenMP settings.
Constraining to 1 thread works without crashes.
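The mitigation is roughly the following; the exact placement in the provider may differ:
```python
import os

# Constrain OpenMP before torch spins up its thread pool.
os.environ.setdefault("OMP_NUM_THREADS", "1")

import torch  # noqa: E402

torch.set_num_threads(1)
```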
## Test Plan
Tested with cc-vec integration
1. Start the server: `llama stack run starter`
2. Do the setup in https://github.com/raghotham/cc-vec to set env
variables and try
`uv run cc-vec index --url-patterns "%.github.io" --vector-store-name
"ml-research" --limit 50 --chunk-size 800 --overlap 400`
- Moved environment variable parsing and `setup_logging()` call from
module level to proper initialization points
- Added explicit `setup_logging()` calls in `server.py::create_app()`
and `library_client.py::AsyncLlamaStackAsLibraryClient.__init__()`
Module-level side effects are bad practice and can cause issues with
import order, testing, and circular dependencies. The previous
implementation ran logging setup on every import of the log module,
which is unpredictable and difficult to control.
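A sketch of the pattern with simplified names:
```python
import logging

def setup_logging(level: str = "INFO") -> None:
    logging.basicConfig(level=level)

def create_app():
    # Explicit call at an initialization point instead of a module-level
    # side effect that runs on every import of the log module.
    setup_logging()
    # ... build and return the app here
```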
---------
Co-authored-by: Claude <noreply@anthropic.com>
Kill the `builtin::rag` tool group completely since it is no longer
targeted. We use the Responses implementation for knowledge_search which
uses the `openai_vector_stores` pathway.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
## Summary
- Link pre-commit bot comment to workflow run instead of PR for better
debugging
- Dump docker container logs before removal to ensure logs are actually
captured
## Changes
1. **Pre-commit bot**: Changed the initial bot comment to link
"pre-commit hooks" text to the actual workflow run URL instead of just
having the PR number auto-link
2. **Docker logs**: Moved docker container log dumping from GitHub
Actions to the integration-tests.sh script's stop_container() function,
ensuring logs are captured before container removal
## Test plan
- Pre-commit bot comment will now have a clickable link to the workflow
run
- Docker container logs will be successfully captured in CI runs
# What does this PR do?
Similarly to `alpha:`, move `v1beta` routes under a `beta` group so the
client will have `client.beta`.
From what I can tell, the openapi.stainless.yml file is hand-written while
the openapi.yml file is generated and copied using the shell script, so I did
this by hand.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 24.3.0 to 24.8.1.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [fastapi](https://github.com/fastapi/fastapi) from 0.116.1 to
0.119.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fastapi/fastapi/releases">fastapi's
releases</a>.</em></p>
<blockquote>
<h2>0.119.0</h2>
<p>FastAPI now (temporarily) supports both Pydantic v2 models and
<code>pydantic.v1</code> models at the same time in the same app, to
make it easier for any FastAPI apps still using Pydantic v1 to gradually
but quickly <strong>migrate to Pydantic v2</strong>.</p>
<pre lang="Python"><code>from fastapi import FastAPI
from pydantic import BaseModel as BaseModelV2
from pydantic.v1 import BaseModel

class Item(BaseModel):
    name: str
    description: str | None = None

class ItemV2(BaseModelV2):
    title: str
    summary: str | None = None

app = FastAPI()

@app.post("/items/", response_model=ItemV2)
def create_item(item: Item):
    return {"title": item.name, "summary": item.description}
</code></pre>
<p>Adding this feature was a big effort with the main objective of
making it easier for the few applications still stuck in Pydantic v1 to
migrate to Pydantic v2.</p>
<p>And with this, support for <strong>Pydantic v1 is now
deprecated</strong> and will be <strong>removed</strong> from FastAPI in
a future version soon.</p>
<p><strong>Note</strong>: have in mind that the Pydantic team already
stopped supporting Pydantic v1 for recent versions of Python, starting
with Python 3.14.</p>
<p>You can read in the docs more about how to <a
href="https://fastapi.tiangolo.com/how-to/migrate-from-pydantic-v1-to-pydantic-v2/">Migrate
from Pydantic v1 to Pydantic v2</a>.</p>
<h3>Features</h3>
<ul>
<li>✨ Add support for <code>from pydantic.v1 import BaseModel</code>,
mixed Pydantic v1 and v2 models in the same app. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14168">#14168</a>
by <a
href="https://github.com/tiangolo"><code>@tiangolo</code></a>.</li>
</ul>
<h2>0.118.3</h2>
<h3>Upgrades</h3>
<ul>
<li>⬆️ Add support for Python 3.14. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/14165">#14165</a>
by <a
href="https://github.com/svlandeg"><code>@svlandeg</code></a>.</li>
</ul>
<h2>0.118.2</h2>
<h3>Fixes</h3>
<ul>
<li>🐛 Fix tagged discriminated union not recognized as body field. PR <a
href="https://redirect.github.com/fastapi/fastapi/pull/12942">#12942</a>
by <a
href="https://github.com/frankie567"><code>@frankie567</code></a>.</li>
</ul>
<h3>Internal</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2e721e1b02"><code>2e721e1</code></a>
🔖 Release version 0.119.0</li>
<li><a
href="fc7a0686af"><code>fc7a068</code></a>
📝 Update release notes</li>
<li><a
href="3a3879b2c3"><code>3a3879b</code></a>
📝 Update release notes</li>
<li><a
href="d34918abf0"><code>d34918a</code></a>
✨ Add support for <code>from pydantic.v1 import BaseModel</code>, mixed
Pydantic v1 and ...</li>
<li><a
href="352dbefc63"><code>352dbef</code></a>
🔖 Release version 0.118.3</li>
<li><a
href="96e7d6eaa4"><code>96e7d6e</code></a>
📝 Update release notes</li>
<li><a
href="3611c3fc5b"><code>3611c3f</code></a>
⬆️ Add support for Python 3.14 (<a
href="https://redirect.github.com/fastapi/fastapi/issues/14165">#14165</a>)</li>
<li><a
href="942fce394b"><code>942fce3</code></a>
🔖 Release version 0.118.2</li>
<li><a
href="13b067c9b6"><code>13b067c</code></a>
📝 Update release notes</li>
<li><a
href="185cecd891"><code>185cecd</code></a>
🐛 Fix tagged discriminated union not recognized as body field (<a
href="https://redirect.github.com/fastapi/fastapi/issues/12942">#12942</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/fastapi/fastapi/compare/0.116.1...0.119.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
Mirrors build_container.sh; trying to resolve:
0.105 + [ editable = editable ]
0.105 + [ ! -d /workspace/llama-stack ]
0.105 + uv pip install --no-cache-dir -e /workspace/llama-stack
0.261 Using Python 3.12.12 environment at: /usr/local
0.479 × No solution found when resolving dependencies:
0.479 ╰─▶ Because only llama-stack-client<=0.2.23 is available and
0.479 llama-stack==0.3.0rc4 depends on llama-stack-client>=0.3.0rc4, we
can
0.479 conclude that llama-stack==0.3.0rc4 cannot be used.
0.479 And because only llama-stack==0.3.0rc4 is available and you
require
0.479 llama-stack, we can conclude that your requirements are
unsatisfiable.
------
## Test Plan
**NOTE: this is a backwards incompatible change to the run-configs.**
A small QOL update, but this will prove useful when I rename "vector_dbs"
to "vector_stores" next.
Moves all the `models, shields, ...` keys in run-config under a
`registered_resources` sub-key.
# What does this PR do?
Refactor setting default vector store provider and embedding model to
use an optional `vector_stores` config in the `StackRunConfig` and clean
up code to do so (had to add back in some pieces of VectorDB). Also
added remote Qdrant and Weaviate to starter distro (based on other PR
where inference providers were added for UX).
New config is simply (default for Starter distro):
```yaml
vector_stores:
default_provider_id: faiss
default_embedding_model:
provider_id: sentence-transformers
model_id: nomic-ai/nomic-embed-text-v1.5
```
## Test Plan
CI and Unit tests.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
**This PR changes configurations in a backward incompatible way.**
Run configs today repeat full SQLite/Postgres snippets everywhere a
store is needed, which means duplicated credentials, extra connection
pools, and lots of drift between files. This PR introduces named storage
backends so the stack and providers can share a single catalog and
reference those backends by name.
## Key Changes
- Add `storage.backends` to `StackRunConfig`, register each KV/SQL
backend once at startup, and validate that references point to the right
family.
- Move server stores under `storage.stores` with lightweight references
(backend + namespace/table) instead of full configs.
- Update every provider/config/doc to use the new reference style;
docs/codegen now surface the simplified YAML.
## Migration
Before:
```yaml
metadata_store:
type: sqlite
db_path: ~/.llama/distributions/foo/registry.db
inference_store:
type: postgres
host: ${env.POSTGRES_HOST}
port: ${env.POSTGRES_PORT}
db: ${env.POSTGRES_DB}
user: ${env.POSTGRES_USER}
password: ${env.POSTGRES_PASSWORD}
conversations_store:
type: postgres
host: ${env.POSTGRES_HOST}
port: ${env.POSTGRES_PORT}
db: ${env.POSTGRES_DB}
user: ${env.POSTGRES_USER}
password: ${env.POSTGRES_PASSWORD}
```
After:
```yaml
storage:
backends:
kv_default:
type: kv_sqlite
db_path: ~/.llama/distributions/foo/kvstore.db
sql_default:
type: sql_postgres
host: ${env.POSTGRES_HOST}
port: ${env.POSTGRES_PORT}
db: ${env.POSTGRES_DB}
user: ${env.POSTGRES_USER}
password: ${env.POSTGRES_PASSWORD}
stores:
metadata:
backend: kv_default
namespace: registry
inference:
backend: sql_default
table_name: inference_store
max_write_queue_size: 10000
num_writers: 4
conversations:
backend: sql_default
table_name: openai_conversations
```
Provider configs follow the same pattern—for example, a Chroma vector
adapter switches from:
```yaml
providers:
vector_io:
- provider_id: chromadb
provider_type: remote::chromadb
config:
url: ${env.CHROMADB_URL}
kvstore:
type: sqlite
db_path: ~/.llama/distributions/foo/chroma.db
```
to:
```yaml
providers:
vector_io:
- provider_id: chromadb
provider_type: remote::chromadb
config:
url: ${env.CHROMADB_URL}
persistence:
backend: kv_default
namespace: vector_io::chroma_remote
```
Once the backends are declared, everything else just points at them, so
rotating credentials or swapping to Postgres happens in one place and
the stack reuses a single connection pool.
# Problem
The current inline provider appends the user provided instructions to
messages as a system prompt, but the returned response object does not
contain the instructions field (as specified in the OpenAI responses
spec).
# What does this PR do?
This pull request adds the instruction field to the response object
definition and updates the inline provider. It also ensures that
instructions from the previous response are not carried over to the next
response (as specified in the OpenAI spec).
Closes #[3566](https://github.com/llamastack/llama-stack/issues/3566)
## Test Plan
- Tested manually for change in model response w.r.t supplied
instructions field.
- Added a unit test to check that the instructions from the previous
response are not carried over to the next response.
- Added integration tests to check instructions parameter in the
returned response object.
- Added new recordings for the integration tests.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
fix: nested claims mapping in OAuth2 token validation
The get_attributes_from_claims function was only checking for top-level
claim keys, causing token validation to fail when using nested claims
like "resource_access.llamastack.roles" (common in Keycloak JWT tokens).
Updated the function to support dot notation for traversing nested claim
structures, giving precedence to dot notation over literal keys with dots in
the claims mapping.
Added test coverage.
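The lookup behaves roughly like this sketch (not the exact helper):
```python
from typing import Any

def get_claim(claims: dict[str, Any], key: str) -> Any:
    # Dot notation wins: walk nested dicts first, then fall back to a literal
    # key that happens to contain dots.
    node: Any = claims
    for part in key.split("."):
        if isinstance(node, dict) and part in node:
            node = node[part]
        else:
            node = None
            break
    return node if node is not None else claims.get(key)

claims = {"resource_access": {"llamastack": {"roles": ["admin"]}}}
assert get_claim(claims, "resource_access.llamastack.roles") == ["admin"]
```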
Closes: #3812
Signed-off-by: Derek Higgins <derekh@redhat.com>
Bumps [sqlalchemy](https://github.com/sqlalchemy/sqlalchemy) from 2.0.41
to 2.0.44.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/sqlalchemy/sqlalchemy/releases">sqlalchemy's
releases</a>.</em></p>
<blockquote>
<h1>2.0.44</h1>
<p>Released: October 10, 2025</p>
<h2>platform</h2>
<ul>
<li><strong>[platform] [bug]</strong> Unblocked automatic greenlet
installation for Python 3.14 now that
there are greenlet wheels on pypi for python 3.14.</li>
</ul>
<h2>orm</h2>
<ul>
<li>
<p><strong>[orm] [usecase]</strong> The way ORM Annotated Declarative
interprets Python <a href="https://peps.python.org/pep-0695">PEP 695</a>
type aliases
in <code>Mapped[]</code> annotations has been refined to expand the
lookup scheme. A
<a href="https://peps.python.org/pep-0695">PEP 695</a> type can now be
resolved based on either its direct presence in
<code>_orm.registry.type_annotation_map</code> or its immediate resolved
value, as long as a recursive lookup across multiple <a
href="https://peps.python.org/pep-0695">PEP 695</a> types is
not required for it to resolve. This change reverses part of the
restrictions introduced in 2.0.37 as part of <a
href="https://www.sqlalchemy.org/trac/ticket/11955">#11955</a>, which
deprecated (and disallowed in 2.1) the ability to resolve any <a
href="https://peps.python.org/pep-0695">PEP 695</a>
type that was not explicitly present in
<code>_orm.registry.type_annotation_map</code>. Recursive lookups of
<a href="https://peps.python.org/pep-0695">PEP 695</a> types remains
deprecated in 2.0 and disallowed in version 2.1,
as do implicit lookups of <code>NewType</code> types without an entry in
<code>_orm.registry.type_annotation_map</code>.</p>
<p>Additionally, new support has been added for generic <a
href="https://peps.python.org/pep-0695">PEP 695</a> aliases that
refer to <a href="https://peps.python.org/pep-0593">PEP 593</a>
<code>Annotated</code> constructs containing
<code>_orm.mapped_column()</code> configurations. See the sections below
for
examples.</p>
<p>References: <a
href="https://www.sqlalchemy.org/trac/ticket/12829">#12829</a></p>
</li>
<li>
<p><strong>[orm] [bug]</strong> Fixed a caching issue where
<code>_orm.with_loader_criteria()</code> would
incorrectly reuse cached bound parameter values when used with
<code>_sql.CompoundSelect</code> constructs such as
<code>_sql.union()</code>. The
issue was caused by the cache key for compound selects not including the
execution options that are part of the <code>_sql.Executable</code> base
class,
which <code>_orm.with_loader_criteria()</code> uses to apply its
criteria
dynamically. The fix ensures that compound selects and other executable
constructs properly include execution options in their cache key
traversal.</p>
<p>References: <a
href="https://www.sqlalchemy.org/trac/ticket/12905">#12905</a></p>
</li>
</ul>
<h2>engine</h2>
<ul>
<li><strong>[engine] [bug]</strong> Implemented initial support for
free-threaded Python by adding new tests
and reworking the test harness to include Python 3.13t and Python 3.14t
in</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/sqlalchemy/sqlalchemy/commits">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
removes error:
ConnectionError: HTTPConnectionPool(host='localhost', port=4318): Max
retries exceeded with url: /v1/traces
(Caused by NewConnectionError('<urllib3.connection.HTTPConnection object
at 0x10fd98e60>: Failed to establish a
new connection: [Errno 61] Connection refused'))
## Test Plan
uv run llama stack run starter
curl http://localhost:8321/v1/models
observe no error in server logs
# What does this PR do?
relates to #2878
We introduce a Containerfile which is used to replace the `llama stack
build` command (removal in a separate PR).
```
llama stack build --distro starter --image-type venv --run
```
is replaced by
```
llama stack list-deps starter | xargs -L1 uv pip install
llama stack run starter
```
- See the updated workflow files for e2e workflow.
## Test Plan
CI
```
❯ docker build . -f docker/Dockerfile --build-arg DISTRO_NAME=starter --build-arg INSTALL_MODE=editable --tag test_starter
❯ docker run -p 8321:8321 test_starter
❯ curl http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Hello!"
}
]
}'
```
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3839).
* #3855
* __->__ #3839
# What does this PR do?
the sidebar currently has an extra `ii. Run the Script` because it's
incorrectly put into the doc as an H3 not an H4 (like the other ones)
<img width="239" height="218" alt="Screenshot 2025-10-20 at 1 04 54 PM"
src="https://github.com/user-attachments/assets/eb8cb26e-7ea9-4b61-9101-d64965b39647"
/>
Fix this which will update the sidebar
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
- Fix examples in the NVIDIA inference documentation to align with
current API requirements.
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
N/A
Bumps [jest](https://github.com/jestjs/jest/tree/HEAD/packages/jest) and
[@types/jest](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/jest).
These dependencies needed to be updated together.
Updates `jest` from 29.7.0 to 30.2.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jestjs/jest/releases">jest's
releases</a>.</em></p>
<blockquote>
<h2>30.2.0</h2>
<h3>Chore & Maintenance</h3>
<ul>
<li><code>[*]</code> Update example repo for testing React Native
projects (<a
href="https://redirect.github.com/jestjs/jest/pull/15832">#15832</a>)</li>
<li><code>[*]</code> Update <code>jest-watch-typeahead</code> to v3 (<a
href="https://redirect.github.com/jestjs/jest/pull/15830">#15830</a>)</li>
</ul>
<h2>Features</h2>
<ul>
<li><code>[jest-environment-jsdom-abstract]</code> Add support for JSDOM
v27 (<a
href="https://redirect.github.com/jestjs/jest/pull/15834">#15834</a>)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[babel-jest]</code> Export the <code>TransformerConfig</code>
interface (<a
href="https://redirect.github.com/jestjs/jest/pull/15820">#15820</a>)</li>
<li><code>[jest-config]</code> Fix <code>jest.config.ts</code> with TS
loader specified in docblock pragma (<a
href="https://redirect.github.com/jestjs/jest/pull/15839">#15839</a>)</li>
</ul>
<h2>30.1.3</h2>
<h3>Fixes</h3>
<ul>
<li>Fix <code>unstable_mockModule</code> with <code>node:</code>
prefixed core modules.</li>
</ul>
<h2>30.1.2</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Correct snapshot header regexp to
work with newline across OSes (<a
href="https://redirect.github.com/jestjs/jest/pull/15803">#15803</a>)</li>
</ul>
<h2>30.1.1</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Fix deprecated goo.gl snapshot
warning not handling Windows end-of-line sequences (<a
href="https://redirect.github.com/jestjs/jest/pull/15800">#15800</a>)</li>
</ul>
<h2>30.1.0</h2>
<h2>Features</h2>
<ul>
<li><code>[jest-leak-detector]</code> Configurable GC aggressiveness
regarding to V8 heap snapshot generation (<a
href="https://redirect.github.com/jestjs/jest/pull/15793/">#15793</a>)</li>
<li><code>[jest-runtime]</code> Reduce redundant ReferenceError
messages</li>
<li><code>[jest-core]</code> Include test modules that failed to load
when --onlyFailures is active</li>
</ul>
<h3>Fixes</h3>
<ul>
<li>`[jest-snapshot-utils] Fix deprecated goo.gl snapshot guide link not
getting replaced with fully canonical URL (<a
href="https://redirect.github.com/jestjs/jest/pull/15787">#15787</a>)</li>
<li><code>[jest-circus]</code> Fix <code>it.concurrent</code> not
working with <code>describe.skip</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15765">#15765</a>)</li>
<li><code>[jest-snapshot]</code> Fix mangled inline snapshot updates
when used with Prettier 3 and CRLF line endings</li>
<li><code>[jest-runtime]</code> Importing from
<code>@jest/globals</code> in more than one file no longer breaks
relative paths (<a
href="https://redirect.github.com/jestjs/jest/issues/15772">#15772</a>)</li>
</ul>
<h1>Chore</h1>
<ul>
<li><code>[expect]</code> Update docblock for <code>toContain()</code>
to display info on substring check (<a
href="https://redirect.github.com/jestjs/jest/pull/15789">#15789</a>)</li>
</ul>
<h2>30.0.2</h2>
<h2>What's Changed</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/jestjs/jest/blob/main/CHANGELOG.md">jest's
changelog</a>.</em></p>
<blockquote>
<h2>30.2.0</h2>
<h3>Chore & Maintenance</h3>
<ul>
<li><code>[*]</code> Update example repo for testing React Native
projects (<a
href="https://redirect.github.com/jestjs/jest/pull/15832">#15832</a>)</li>
<li><code>[*]</code> Update <code>jest-watch-typeahead</code> to v3 (<a
href="https://redirect.github.com/jestjs/jest/pull/15830">#15830</a>)</li>
</ul>
<h2>Features</h2>
<ul>
<li><code>[jest-environment-jsdom-abstract]</code> Add support for JSDOM
v27 (<a
href="https://redirect.github.com/jestjs/jest/pull/15834">#15834</a>)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[jest-matcher-utils]</code> Fix infinite recursion with
self-referential getters in <code>deepCyclicCopyReplaceable</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15831">#15831</a>)</li>
<li><code>[babel-jest]</code> Export the <code>TransformerConfig</code>
interface (<a
href="https://redirect.github.com/jestjs/jest/pull/15820">#15820</a>)</li>
<li><code>[jest-config]</code> Fix <code>jest.config.ts</code> with TS
loader specified in docblock pragma (<a
href="https://redirect.github.com/jestjs/jest/pull/15839">#15839</a>)</li>
</ul>
<h2>30.1.3</h2>
<h3>Fixes</h3>
<ul>
<li>Fix <code>unstable_mockModule</code> with <code>node:</code>
prefixed core modules.</li>
</ul>
<h2>30.1.2</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Correct snapshot header regexp to
work with newline across OSes (<a
href="https://redirect.github.com/jestjs/jest/pull/15803">#15803</a>)</li>
</ul>
<h2>30.1.1</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Fix deprecated goo.gl snapshot
warning not handling Windows end-of-line sequences (<a
href="https://redirect.github.com/jestjs/jest/pull/15800">#15800</a>)</li>
<li><code>[jest-snapshot-utils]</code> Improve messaging about goo.gl
snapshot link change (<a
href="https://redirect.github.com/jestjs/jest/pull/15821">#15821</a>)</li>
</ul>
<h2>30.1.0</h2>
<h2>Features</h2>
<ul>
<li><code>[jest-leak-detector]</code> Configurable GC aggressiveness
regarding to V8 heap snapshot generation (<a
href="https://redirect.github.com/jestjs/jest/pull/15793/">#15793</a>)</li>
<li><code>[jest-runtime]</code> Reduce redundant ReferenceError
messages</li>
<li><code>[jest-core]</code> Include test modules that failed to load
when --onlyFailures is active</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Fix deprecated goo.gl snapshot
guide link not getting replaced with fully canonical URL (<a
href="https://redirect.github.com/jestjs/jest/pull/15787">#15787</a>)</li>
<li><code>[jest-circus]</code> Fix <code>it.concurrent</code> not
working with <code>describe.skip</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15765">#15765</a>)</li>
<li><code>[jest-snapshot]</code> Fix mangled inline snapshot updates
when used with Prettier 3 and CRLF line endings</li>
<li><code>[jest-runtime]</code> Importing from
<code>@jest/globals</code> in more than one file no longer breaks
relative paths (<a
href="https://redirect.github.com/jestjs/jest/issues/15772">#15772</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="855864e3f9"><code>855864e</code></a>
v30.2.0</li>
<li><a
href="da9b532f04"><code>da9b532</code></a>
v30.1.3</li>
<li><a
href="ebfa31cc97"><code>ebfa31c</code></a>
v30.1.2</li>
<li><a
href="d347c0f3f8"><code>d347c0f</code></a>
v30.1.1</li>
<li><a
href="4d5f41d088"><code>4d5f41d</code></a>
v30.1.0</li>
<li><a
href="22236cf58b"><code>22236cf</code></a>
v30.0.5</li>
<li><a
href="f4296d2bc8"><code>f4296d2</code></a>
v30.0.4</li>
<li><a
href="d4a6c94daf"><code>d4a6c94</code></a>
v30.0.3</li>
<li><a
href="393acbfac3"><code>393acbf</code></a>
v30.0.2</li>
<li><a
href="5ce865b406"><code>5ce865b</code></a>
v30.0.1</li>
<li>Additional commits viewable in <a
href="https://github.com/jestjs/jest/commits/v30.2.0/packages/jest">compare
view</a></li>
</ul>
</details>
<br />
Updates `@types/jest` from 29.5.14 to 30.0.0
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/jest">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[jest-environment-jsdom](https://github.com/jestjs/jest/tree/HEAD/packages/jest-environment-jsdom)
from 30.1.2 to 30.2.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jestjs/jest/releases">jest-environment-jsdom's
releases</a>.</em></p>
<blockquote>
<h2>30.2.0</h2>
<h3>Chore & Maintenance</h3>
<ul>
<li><code>[*]</code> Update example repo for testing React Native
projects (<a
href="https://redirect.github.com/jestjs/jest/pull/15832">#15832</a>)</li>
<li><code>[*]</code> Update <code>jest-watch-typeahead</code> to v3 (<a
href="https://redirect.github.com/jestjs/jest/pull/15830">#15830</a>)</li>
</ul>
<h2>Features</h2>
<ul>
<li><code>[jest-environment-jsdom-abstract]</code> Add support for JSDOM
v27 (<a
href="https://redirect.github.com/jestjs/jest/pull/15834">#15834</a>)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[babel-jest]</code> Export the <code>TransformerConfig</code>
interface (<a
href="https://redirect.github.com/jestjs/jest/pull/15820">#15820</a>)</li>
<li><code>[jest-config]</code> Fix <code>jest.config.ts</code> with TS
loader specified in docblock pragma (<a
href="https://redirect.github.com/jestjs/jest/pull/15839">#15839</a>)</li>
</ul>
<h2>30.1.3</h2>
<h3>Fixes</h3>
<ul>
<li>Fix <code>unstable_mockModule</code> with <code>node:</code>
prefixed core modules.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/jestjs/jest/blob/main/CHANGELOG.md">jest-environment-jsdom's
changelog</a>.</em></p>
<blockquote>
<h2>30.2.0</h2>
<h3>Chore & Maintenance</h3>
<ul>
<li><code>[*]</code> Update example repo for testing React Native
projects (<a
href="https://redirect.github.com/jestjs/jest/pull/15832">#15832</a>)</li>
<li><code>[*]</code> Update <code>jest-watch-typeahead</code> to v3 (<a
href="https://redirect.github.com/jestjs/jest/pull/15830">#15830</a>)</li>
</ul>
<h2>Features</h2>
<ul>
<li><code>[jest-environment-jsdom-abstract]</code> Add support for JSDOM
v27 (<a
href="https://redirect.github.com/jestjs/jest/pull/15834">#15834</a>)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[jest-matcher-utils]</code> Fix infinite recursion with
self-referential getters in <code>deepCyclicCopyReplaceable</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15831">#15831</a>)</li>
<li><code>[babel-jest]</code> Export the <code>TransformerConfig</code>
interface (<a
href="https://redirect.github.com/jestjs/jest/pull/15820">#15820</a>)</li>
<li><code>[jest-config]</code> Fix <code>jest.config.ts</code> with TS
loader specified in docblock pragma (<a
href="https://redirect.github.com/jestjs/jest/pull/15839">#15839</a>)</li>
</ul>
<h2>30.1.3</h2>
<h3>Fixes</h3>
<ul>
<li>Fix <code>unstable_mockModule</code> with <code>node:</code>
prefixed core modules.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="855864e3f9"><code>855864e</code></a>
v30.2.0</li>
<li>See full diff in <a
href="https://github.com/jestjs/jest/commits/v30.2.0/packages/jest-environment-jsdom">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
## Test Plan
.venv ❯ sh ./scripts/install.sh
⚠️ Found existing container(s) for 'ollama-server', removing...
⚠️ Found existing container(s) for 'llama-stack', removing...
⚠️ Found existing container(s) for 'jaeger', removing...
⚠️ Found existing container(s) for 'otel-collector', removing...
⚠️ Found existing container(s) for 'prometheus', removing...
⚠️ Found existing container(s) for 'grafana', removing...
📡 Starting telemetry stack...
🦙 Starting Ollama...
⏳ Waiting for Ollama daemon...
📦 Ensuring model is pulled: llama3.2:3b...
🦙 Starting Llama Stack...
⏳ Waiting for Llama Stack API...
..
🎉 Llama Stack is ready!
👉 API endpoint: http://localhost:8321
📖 Documentation: https://llamastack.github.io/latest/references/api_reference/index.html
💻 To access the llama stack CLI, exec into the container:
docker exec -ti llama-stack bash
📡 Telemetry dashboards:
Jaeger UI: http://localhost:16686
Prometheus UI: http://localhost:9090
Grafana UI: http://localhost:3000 (admin/admin)
OTEL Collector: http://localhost:4318
🐛 Report an issue @ https://github.com/llamastack/llama-stack/issues if you think it's a bug
# What does this PR do?
Adds a test and a standardized way to build future tests out for
telemetry in llama stack.
Contributes to https://github.com/llamastack/llama-stack/issues/3806
## Test Plan
This is the test plan 😎
In replay mode, inference is instantaneous. We don't need to wait 15
seconds for the batch to be done. Fixing the polling to use exponential
backoff makes things work super fast.
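A minimal sketch of the exponential-backoff polling pattern; `wait_for_batch` and its parameters are hypothetical, not the actual test helper:
```python
import time


def wait_for_batch(get_status, timeout=60.0, initial_delay=0.1, max_delay=5.0):
    """Poll get_status() with exponential backoff until a terminal state.

    In replay mode the first poll usually succeeds immediately; against a
    real backend the delay grows 0.1s, 0.2s, 0.4s, ... up to max_delay.
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("completed", "failed", "cancelled", "expired"):
            return status
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
    raise TimeoutError("batch did not reach a terminal state in time")
```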
# What does this PR do?
remove telemetry as a providable API from the codebase. This includes
removing it from generated distributions but also the provider registry,
the router, etc
since `setup_logger` is tied pretty strictly to `Api.telemetry` being in
impls we still need an "instantiated provider" in our implementations.
However it should not be auto-routed or provided. So in
validate_and_prepare_providers (called from resolve_impls) I made it so
that if run_config.telemetry.enabled, we set up the meta-reference
"provider" internally to be used so that log_event will work when
called.
This is the neatest way I think we can remove telemetry from the
provider configs but also not need to rip apart the whole "telemetry is
a provider" logic just yet, but we can do it internally later without
disrupting users.
so telemetry is removed from the registry such that if a user puts
`telemetry:` as an API in their build/run config it will err out, but
can still be used by us internally as we go through this transition.
relates to #3806
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
When the stack config is set to server in docker
(STACK_CONFIG_ARG=--stack-config=http://localhost:8321), the env variable
was not getting set correctly and the test id was not set, causing the
error below. This is needed for test-and-cut to work.
E openai.BadRequestError: Error code: 400 - {'detail': 'Invalid value:
Test ID is required for file ID allocation'}
5286461406
## Test Plan
CI
# What does this PR do?
Adds a subpage of the OpenAI compatibility page in the documentation.
This subpage documents known limitations of the Responses API.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3575
---------
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
As indicated in the title. Our `starter` distribution enables all remote
providers _very intentionally_ because we believe it creates an easier,
more welcoming experience to new folks using the software. If we do
that, and then slam the logs with errors making them question their life
choices, it is not so good :)
Note that this fix is limited in scope. If you ever try to actually
instantiate the OpenAI client from a code path without an API key being
present, you deserve to fail hard.
## Test Plan
Run `llama stack run starter` with `OPENAI_API_KEY` set. No more wall of
text, just one message saying "listed 96 models".
a bunch of logger.info()s are good for server code to help debug in
production, but we don't want them killing our unit test output :)
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
**!!BREAKING CHANGE!!**
The lookup is also straightforward -- we always look for this identifier
and don't try to find a match for something without the provider_id
prefix.
Note that this ideally means we need to update the `register_model()`
API also (we should kill "identifier" from there) but I am not doing
that as part of this PR.
## Test Plan
Existing unit tests
Wanted to re-enable Responses CI but it seems to hang for some reason
due to some interactions with conversations_store or responses_store.
## Test Plan
```
# library client
./scripts/integration-tests.sh --stack-config ci-tests --suite responses
# server
./scripts/integration-tests.sh --stack-config server:ci-tests --suite responses
```
# What does this PR do?
Have closed the previous PR due to merge conflicts with multiple PRs
Addressed all comments from
https://github.com/llamastack/llama-stack/pull/3768 (sorry for carrying
over to this one)
## Test Plan
Added UTs and integration tests
Handle a base case when no stored messages exist because no Response
call has been made.
## Test Plan
```
./scripts/integration-tests.sh --stack-config server:ci-tests \
--suite responses --inference-mode record-if-missing --pattern test_conversation_responses
```
Fixed KeyError when chunks don't have document_id in metadata or
chunk_metadata. Updated logging to safely extract document_id using
getattr and RAG memory to handle different document_id locations. Added
test for missing document_id scenarios.
Fixes issue #3494 where /v1/vector-io/insert would crash with KeyError.
# What does this PR do?
Fixes a KeyError crash in `/v1/vector-io/insert` when chunks are missing
`document_id` fields. The API
was failing even though `document_id` is optional according to the
schema.
Closes #3494
## Test Plan
**Before fix:**
- POST to `/v1/vector-io/insert` with chunks → 500 KeyError
- Happened regardless of where `document_id` was placed
**After fix:**
- Same request works fine → 200 OK
- Tested with Postman using FAISS backend
- Added unit test covering missing `document_id` scenarios
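A minimal sketch of the safe-extraction pattern described above; the helper name and chunk attributes are assumptions modeled on the description, not the actual RAG memory code:
```python
def extract_document_id(chunk):
    """Best-effort document_id lookup across the places it may live.

    Checks chunk.metadata first, then chunk.chunk_metadata, using getattr
    and dict lookups that never raise, and returns None (instead of a
    KeyError) when the id is absent everywhere.
    """
    metadata = getattr(chunk, "metadata", None) or {}
    if "document_id" in metadata:
        return metadata["document_id"]
    chunk_metadata = getattr(chunk, "chunk_metadata", None)
    if chunk_metadata is not None:
        return getattr(chunk_metadata, "document_id", None)
    return None
```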
This PR updates the Conversation item related types and improves a
couple of critical parts of the implementation:
- it creates a streaming output item for the final assistant message
output by the model (see the sketch after this list). Until now we only
added content parts and included that message in the final response.
- rewrites the conversation update code completely to account for items
other than messages (tool calls, outputs, etc.)
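A hedged sketch of what a client consuming the stream might observe after this change, using the OpenAI Python client; the base URL, API key, and model id are illustrative assumptions:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

stream = client.responses.create(
    model="gpt-4o-mini",
    input="Say hello.",
    stream=True,
)
for event in stream:
    # The final assistant message is now surfaced as its own output item
    # (added/done events), not only as content parts.
    if event.type in ("response.output_item.added", "response.output_item.done"):
        print(event.type, getattr(event.item, "type", None))
    elif event.type == "response.completed":
        print("response complete")
```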
## Test Plan
Used the test script from
https://github.com/llamastack/llama-stack-client-python/pull/281 for
this
```
TEST_API_BASE_URL=http://localhost:8321/v1 \
pytest tests/integration/test_agent_turn_step_events.py::test_client_side_function_tool -xvs
```
# Add support for the Google Gemini `gemini-embedding-001` embedding model
and correctly register the model type
MR message created with the assistance of Claude-4.5-sonnet
This resolves https://github.com/llamastack/llama-stack/issues/3755
## What does this PR do?
This PR adds support for the `gemini-embedding-001` Google embedding
model to the llama-stack Gemini provider. This model provides
high-dimensional embeddings (3072 dimensions) compared to the existing
`text-embedding-004` model (768 dimensions). Old embeddings models (such
as text-embedding-004) will be deprecated soon according to Google
([Link](https://developers.googleblog.com/en/gemini-embedding-available-gemini-api/))
## Problem
The Gemini provider only supported the `text-embedding-004` embedding
model. The newer `gemini-embedding-001` model, which provides
higher-dimensional embeddings for improved semantic representation, was
not available through llama-stack.
## Solution
This PR consists of three commits that add the model, fix the model
registration, and enable embedding generation:
### Commit 1: Initial addition of gemini-embedding-001
Added metadata for `gemini-embedding-001` to the
`embedding_model_metadata` dictionary:
```python
embedding_model_metadata: dict[str, dict[str, int]] = {
"text-embedding-004": {"embedding_dimension": 768, "context_length": 2048},
"gemini-embedding-001": {"embedding_dimension": 3072, "context_length": 2048}, # NEW
}
```
**Issue discovered:** The model was not being registered correctly
because the dictionary keys didn't match the model IDs returned by
Gemini's API.
### Commit 2: Fix model ID matching with `models/` prefix
Updated both dictionary keys to include the `models/` prefix to match
Gemini's OpenAI-compatible API response format:
```python
embedding_model_metadata: dict[str, dict[str, int]] = {
"models/text-embedding-004": {"embedding_dimension": 768, "context_length": 2048}, # UPDATED
"models/gemini-embedding-001": {"embedding_dimension": 3072, "context_length": 2048}, # UPDATED
}
```
**Root cause:** Gemini's OpenAI-compatible API returns model IDs with
the `models/` prefix (e.g., `models/text-embedding-004`). The
`OpenAIMixin.list_models()` method directly matches these IDs against
the `embedding_model_metadata` dictionary keys. Without the prefix, the
models were being registered as LLMs instead of embedding models.
### Commit 3: Fix embedding generation for providers without usage stats
Fixed a bug in `OpenAIMixin.openai_embeddings()` that prevented
embedding generation for providers (like Gemini) that don't return usage
statistics:
```python
# Before (Line 351-354):
usage = OpenAIEmbeddingUsage(
prompt_tokens=response.usage.prompt_tokens, # ← Crashed with AttributeError
total_tokens=response.usage.total_tokens,
)
# After (Lines 351-362):
if response.usage:
usage = OpenAIEmbeddingUsage(
prompt_tokens=response.usage.prompt_tokens,
total_tokens=response.usage.total_tokens,
)
else:
usage = OpenAIEmbeddingUsage(
prompt_tokens=0, # Default when not provided
total_tokens=0, # Default when not provided
)
```
**Impact:** This fix enables embedding generation for **all** Gemini
embedding models, not just the newly added one.
## Changes
### Modified Files
**`llama_stack/providers/remote/inference/gemini/gemini.py`**
- Line 17: Updated `text-embedding-004` key to
`models/text-embedding-004`
- Line 18: Added `models/gemini-embedding-001` with correct metadata
**`llama_stack/providers/utils/inference/openai_mixin.py`**
- Lines 351-362: Added null check for `response.usage` to handle
providers without usage statistics
## Key Technical Details
### Model ID Matching Flow
1. `list_provider_model_ids()` calls Gemini's `/v1/models` endpoint
2. API returns model IDs like: `models/text-embedding-004`,
`models/gemini-embedding-001`
3. `OpenAIMixin.list_models()` (line 410) checks: `if metadata :=
self.embedding_model_metadata.get(provider_model_id)`
4. If matched, registers as `model_type: "embedding"` with metadata;
otherwise registers as `model_type: "llm"`
### Why Both Keys Needed the Prefix
The `text-embedding-004` model was already working because there was
likely separate configuration or manual registration handling it. For
auto-discovery to work correctly for **both** models, both keys must
match the API's model ID format exactly.
## How to test this PR
Verified the changes by:
1. **Model Auto-Discovery**: Started llama-stack server and confirmed
models are auto-discovered from Gemini API
2. **Model Registration**: Confirmed both embedding models are correctly
registered and visible
```bash
curl http://localhost:8325/v1/models | jq '.data[] | select(.provider_id == "gemini" and .model_type == "embedding")'
```
**Results:**
- ✅ `gemini/models/text-embedding-004` - 768 dimensions - `model_type:
"embedding"`
- ✅ `gemini/models/gemini-embedding-001` - 3072 dimensions -
`model_type: "embedding"`
3. **Before Fix (Commit 1)**: Models appeared as `model_type: "llm"`
without embedding metadata
4. **After Fix (Commit 2)**: Models correctly identified as `model_type:
"embedding"` with proper metadata
5. **Generate Embeddings**: Verified embedding generation works
```bash
curl -X POST http://localhost:8325/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"model": "gemini/models/gemini-embedding-001", "input": "test"}' | \
jq '.data[0].embedding | length'
```
# What does this PR do?
Enables automatic embedding model detection for vector stores by using
a `default_configured` boolean that can be defined in the `run.yaml`.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
- Unit tests
- Integration tests
- Simple example below:
Spin up the stack:
```bash
uv run llama stack build --distro starter --image-type venv --run
```
Then test with OpenAI's client:
```python
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")
vs = client.vector_stores.create()
```
Previously you needed:
```python
vs = client.vector_stores.create(
extra_body={
"embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
"embedding_dimension": 384,
}
)
```
The `extra_body` is now unnecessary.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Previously, the NVIDIA inference provider implemented a custom
`openai_embeddings` method with a hardcoded `input_type="query"`
parameter, which is required by NVIDIA asymmetric embedding
models ([https://github.com/llamastack/llama-stack/pull/3205](https://github.com/llamastack/llama-stack/pull/3205)).
Recently, an `extra_body` parameter was added to the embeddings API
([https://github.com/llamastack/llama-stack/pull/3794](https://github.com/llamastack/llama-stack/pull/3794)).
So, this PR updates the NVIDIA inference provider to use the base
`OpenAIMixin.openai_embeddings` method instead and pass the `input_type`
through the `extra_body` parameter for asymmetric embedding models.
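A hedged client-side example of what this enables, using the OpenAI Python client; the base URL, API key, and exact model id are illustrative assumptions:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

resp = client.embeddings.create(
    model="nvidia/llama-3.2-nv-embedqa-1b-v2",
    input="What is the capital of France?",
    # Forwarded to the provider; NVIDIA asymmetric models expect "query"
    # for search queries and "passage" for documents being indexed.
    extra_body={"input_type": "query"},
)
print(len(resp.data[0].embedding))
```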
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run the following command for each `embedding_model`:
`nvidia/llama-3.2-nv-embedqa-1b-v2`, `nvidia/nv-embedqa-e5-v5`,
`nvidia/nv-embedqa-mistral-7b-v2`, and `snowflake/arctic-embed-l`.
```
pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model={embedding_model} --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com" --inference-mode=record
```
# What does this PR do?
As discussed on discord, we do not need to reinvent the wheel for
telemetry. Instead we'll lean into the canonical OTEL stack.
Logs/traces/metrics will still be sent via OTEL - they just won't be
stored on, or queried through, the Stack.
This is the first of many PRs to remove telemetry API from Stack.
1) removed webmethod decorators so these endpoints are removed from the API spec
2) removed tests as @iamemilio is adding them on otel directly.
## Test Plan
# What does this PR do?
Updates CONTRIBUTING.md with the following changes:
- Use Python 3.12 (and why)
- Use pre-commit==4.3.0
- Recommend using -v with pre-commit to get detailed info about why it
is failing if it fails.
- Instructs users to go to the docs/ directory before rebuilding the
docs (it doesn't work unless you do that).
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to replace Llama Stack's default embedding
model with nomic-embed-text-v1.5.
These are the key reasons why Llama Stack community decided to switch
from all-MiniLM-L6-v2 to nomic-embed-text-v1.5:
1. The training data for
[all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2#training-data)
includes a lot of data sets with various licensing terms, so it is
tricky to know when/whether it is appropriate to use this model for
commercial applications.
2. The model is not particularly competitive on major benchmarks. For
example, if you look at the [MTEB
Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and click
on Miscellaneous/BEIR to see English information retrieval accuracy, you
see that the top of the leaderboard is dominated by enormous models but
also that there are many, many models of relatively modest size with
much higher Retrieval scores. If you want to look closely at the data, I
recommend clicking "Download Table" because it is easier to browse that
way.
More discussion can be found
[here](https://github.com/llamastack/llama-stack/issues/2418)
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2418
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
1. Run `./scripts/unit-tests.sh`
2. Integration tests via the CI workflow
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This PR fixes issues with the WatsonX provider so it works correctly
with LiteLLM.
The main problem was that WatsonX requests failed because the provider
data validator didn’t properly handle the API key and project ID. This
was fixed by updating the WatsonXProviderDataValidator and ensuring the
provider data is loaded correctly.
The openai_chat_completion method was also updated to match the behavior
of other providers while adding WatsonX-specific fields like project_id.
It still calls `await super().openai_chat_completion.__func__(self, params)`
to keep the existing setup and tracing logic.
After these changes, WatsonX requests now run correctly.
## Test Plan
The changes were tested by running chat completion requests and
confirming that credentials and project parameters are passed correctly.
I have tested with my WatsonX credentials, by using the cli with `uv run
llama-stack-client inference chat-completion --session`
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This commit migrates the authentication system from python-jose to PyJWT
to eliminate the dependency on the archived rsa package. The migration
includes:
- Refactored OAuth2TokenAuthProvider to use PyJWT's PyJWKClient for
clean JWKS handling
- Removed manual JWKS fetching, caching and key extraction logic in
favor of PyJWT's built-in functionality
The new implementation is cleaner, more maintainable, and follows PyJWT
best practices while maintaining full backward compatibility.
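A minimal sketch of the PyJWT pattern described above; the JWKS URL, audience, and issuer are placeholder assumptions:
```python
import jwt
from jwt import PyJWKClient

# PyJWKClient fetches and caches the JWKS and selects the signing key
# matching the token's "kid" header.
jwks_client = PyJWKClient("https://idp.example.com/realms/demo/protocol/openid-connect/certs")


def validate_token(token: str) -> dict:
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience="llama-stack",
        issuer="https://idp.example.com/realms/demo",
    )
```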
## Test Plan
Unit tests. Auth CI.
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
2 main changes:
1. Remove `provider_id` requirement in call to vector stores and
2. Removes "register first embedding model" logic
- Now forces embedding model id as required on Vector Store creation
Simplifies the UX for OpenAI to:
```python
vs = client.vector_stores.create(
name="my_citations_db",
extra_body={
"embedding_model": "ollama/nomic-embed-text:latest",
}
)
```
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Fixed the CI job to check the correct directory for file changes.
Artifacts are now stored in multiple directories, not just
./tests/integration/recordings.
Signed-off-by: Derek Higgins <derekh@redhat.com>
Applies the same pattern from
https://github.com/llamastack/llama-stack/pull/3777 to embeddings and
vector_stores.create() endpoints.
This should _not_ be a breaking change since (a) our tests were already
using the `extra_body` parameter when passing in to the backend (b) but
the backend probably wasn't extracting the parameters correctly. This PR
will fix that.
Updated APIs: `openai_embeddings(), openai_create_vector_store(),
openai_create_vector_store_file_batch()`
Bumps [framer-motion](https://github.com/motiondivision/motion) from
12.23.12 to 12.23.24.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/motiondivision/motion/blob/main/CHANGELOG.md">framer-motion's
changelog</a>.</em></p>
<blockquote>
<h2>[12.23.24] 2025-10-10</h2>
<h3>Fixed</h3>
<ul>
<li>Ensure that when a component remounts, it continues to fire
animations even when <code>initial={false}</code>.</li>
</ul>
<h2>[12.23.23] 2025-10-10</h2>
<h3>Added</h3>
<ul>
<li>Exporting <code>PresenceChild</code> and <code>PopChild</code> type
for internal use.</li>
</ul>
<h2>[12.23.22] 2025-09-25</h2>
<h3>Added</h3>
<ul>
<li>Exporting <code>HTMLElements</code> and <code>useComposedRefs</code>
type for internal use.</li>
</ul>
<h2>[12.23.21] 2025-09-24</h2>
<h3>Fixed</h3>
<ul>
<li>Fixing main-thread <code>scroll</code> with animations that contain
<code>delay</code>.</li>
</ul>
<h2>[12.23.20] 2025-09-24</h2>
<h3>Fixed</h3>
<ul>
<li>Suppress non-animatable value warning for instant animations.</li>
</ul>
<h2>[12.23.19] 2025-09-23</h2>
<h3>Fixed</h3>
<ul>
<li>Remove support for changing <code>ref</code> prop.</li>
</ul>
<h2>[12.23.18] 2025-09-19</h2>
<h3>Fixed</h3>
<ul>
<li><code><motion /></code> components now support changing
<code>ref</code> prop.</li>
</ul>
<h2>[12.23.17] 2025-09-19</h2>
<h3>Fixed</h3>
<ul>
<li>Ensure <code>animate()</code> <code>onComplete</code> only fires
once, when all values are complete.</li>
</ul>
<h2>[12.23.16] 2025-09-19</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b5df740a46"><code>b5df740</code></a>
v12.23.24</li>
<li><a
href="808ebce630"><code>808ebce</code></a>
Updating changelog</li>
<li><a
href="237eee2246"><code>237eee2</code></a>
v12.23.23</li>
<li><a
href="834965c803"><code>834965c</code></a>
Updating changelog</li>
<li><a
href="40690864e9"><code>4069086</code></a>
Update README.md</li>
<li><a
href="6da6b61e94"><code>6da6b61</code></a>
Update README.md with new sponsor links</li>
<li><a
href="e36683149d"><code>e366831</code></a>
Update README.md</li>
<li><a
href="7796f4f1e0"><code>7796f4f</code></a>
Update Gold section with new links and images</li>
<li><a
href="d1bb93757c"><code>d1bb937</code></a>
Update sponsor section in README.md</li>
<li><a
href="97fba16059"><code>97fba16</code></a>
Update sponsorship logos in README</li>
<li>Additional commits viewable in <a
href="https://github.com/motiondivision/motion/compare/v12.23.12...v12.23.24">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom)
from 19.2.0 to 19.2.1.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react)
from 19.2.0 to 19.2.2.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Implements missing streaming events from OpenAI Responses API spec:
- reasoning text/summary events for o1/o3 models,
- refusal events for safety moderation
- annotation events for citations,
- and file search streaming events.
Added optional reasoning_content field to chat completion chunks to
support non-standard provider extensions.
**NOTE:** OpenAI does _not_ fill reasoning_content when users use the
chat_completion APIs. This means there is no way for us to implement
Responses (with reasoning) by using OpenAI chat completions! We'd need
to transparently punt to OpenAI's responses endpoints if we wish to do
that. For others though (vLLM, etc.) we can use it.
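A hedged sketch of reading the non-standard field from streamed chat-completion chunks; the base URL, API key, and model id are illustrative assumptions, and only providers that surface reasoning tokens (e.g. vLLM-served reasoning models) populate it:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

stream = client.chat.completions.create(
    model="vllm/gpt-oss-20b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # Non-standard provider extension; absent on OpenAI itself, so guard
    # with getattr instead of attribute access.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print("[reasoning]", reasoning, end="")
    if delta.content:
        print(delta.content, end="")
```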
## Test Plan
File search streaming test passes:
```
./scripts/integration-tests.sh --stack-config server:ci-tests \
--suite responses --setup gpt --inference-mode replay --pattern test_response_file_search_streaming_events
```
Need more complex setup and validation for reasoning tests (need a
vLLM-powered OSS model, maybe gpt-oss, which can return
reasoning_content). I will do that in a follow-up PR.
Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.10
to 2.9.11.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/psycopg/psycopg2/blob/master/NEWS">psycopg2-binary's
changelog</a>.</em></p>
<blockquote>
<h2>Current release</h2>
<p>What's new in psycopg 2.9.11
^^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Add support for Python 3.14.</li>
<li>Avoid a segfault passing more arguments than placeholders if Python
is built
with assertions enabled
(🎫<code>[#1791](https://github.com/psycopg/psycopg2/issues/1791)</code>).</li>
<li><code>~psycopg2.errorcodes</code> map and
<code>~psycopg2.errors</code> classes updated to
PostgreSQL 18.</li>
<li>Drop support for Python 3.8.</li>
</ul>
<p>What's new in psycopg 2.9.10
^^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Add support for Python 3.13.</li>
<li>Receive notifications on commit
(🎫<code>[#1728](https://github.com/psycopg/psycopg2/issues/1728)</code>).</li>
<li><code>~psycopg2.errorcodes</code> map and
<code>~psycopg2.errors</code> classes updated to
PostgreSQL 17.</li>
<li>Drop support for Python 3.7.</li>
</ul>
<p>What's new in psycopg 2.9.9
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Add support for Python 3.12.</li>
<li>Drop support for Python 3.6.</li>
</ul>
<p>What's new in psycopg 2.9.8
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Wheel package bundled with PostgreSQL 16 libpq in order to add
support for
recent features, such as <code>sslcertmode</code>.</li>
</ul>
<p>What's new in psycopg 2.9.7
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<ul>
<li>Fix propagation of exceptions raised during module initialization
(🎫<code>[#1598](https://github.com/psycopg/psycopg2/issues/1598)</code>).</li>
<li>Fix building when pg_config returns an empty string
(🎫<code>[#1599](https://github.com/psycopg/psycopg2/issues/1599)</code>).</li>
<li>Wheel package bundled with OpenSSL 1.1.1v.</li>
</ul>
<p>What's new in psycopg 2.9.6
^^^^^^^^^^^^^^^^^^^^^^^^^^^</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="fd9ae8cad2"><code>fd9ae8c</code></a>
chore: bump to version 2.9.11</li>
<li><a
href="d923840546"><code>d923840</code></a>
chore: update docs requirements</li>
<li><a
href="d42dc7169d"><code>d42dc71</code></a>
Merge branch 'fix-1791'</li>
<li><a
href="4fde6560c3"><code>4fde656</code></a>
fix: avoid failed assert passing more arguments than placeholders</li>
<li><a
href="8308c19d6a"><code>8308c19</code></a>
fix: drop warning about the use of deprecated PyWeakref_GetObject
function</li>
<li><a
href="1a1eabf098"><code>1a1eabf</code></a>
build(deps): bump actions/github-script from 7 to 8</li>
<li><a
href="897af8b38b"><code>897af8b</code></a>
build(deps): bump peter-evans/repository-dispatch from 3 to 4</li>
<li><a
href="ceefd30511"><code>ceefd30</code></a>
build(deps): bump actions/checkout from 4 to 5</li>
<li><a
href="4dc585430c"><code>4dc5854</code></a>
build(deps): bump actions/setup-python from 5 to 6</li>
<li><a
href="1945788dcf"><code>1945788</code></a>
Merge pull request <a
href="https://redirect.github.com/psycopg/psycopg2/issues/1802">#1802</a>
from edgarrmondragon/cp314-wheels</li>
<li>Additional commits viewable in <a
href="https://github.com/psycopg/psycopg2/compare/2.9.10...2.9.11">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.8.0 to 7.0.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v7.0.0 🌈 node24 and a lot of bugfixes</h2>
<h2>Changes</h2>
<p>This release comes with a load of bug fixes and a speed up. Because
of switching from node20 to node24 it is also a breaking change. If you
are running on GitHub hosted runners this will just work, if you are
using self-hosted runners make sure, that your runners are up to date.
If you followed the normal installation instructions your self-hosted
runner will keep itself updated.</p>
<p>This release also removes the deprecated input
<code>server-url</code> which was used to download uv releases from a
different server.
The <a
href="https://github.com/astral-sh/setup-uv?tab=readme-ov-file#manifest-file">manifest-file</a>
input supersedes that functionality by adding a flexible way to define
available versions and where they should be downloaded from.</p>
<h3>Fixes</h3>
<ul>
<li>The action now respects when the environment variable
<code>UV_CACHE_DIR</code> is already set and does not overwrite it. It
now also finds <a
href="https://docs.astral.sh/uv/reference/settings/#cache-dir">cache-dir</a>
settings in config files if you set them.</li>
<li>Some users encountered problems that <a
href="https://github.com/astral-sh/setup-uv?tab=readme-ov-file#disable-cache-pruning">cache
pruning</a> took forever because they had some <code>uv</code> processes
running in the background. Starting with uv version <code>0.8.24</code>
this action uses <code>uv cache prune --ci --force</code> to ignore the
running processes</li>
<li>If you just want to install uv but not have it available in path,
this action now respects <code>UV_NO_MODIFY_PATH</code></li>
<li>Some other actions also set the env var <code>UV_CACHE_DIR</code>.
This action can now deal with that but as this could lead to unwanted
behavior in some edgecases a warning is now displayed.</li>
</ul>
<h3>Improvements</h3>
<p>If you are using minimum version specifiers for the version of uv to
install for example</p>
<pre lang="toml"><code>[tool.uv]
required-version = ">=0.8.17"
</code></pre>
<p>This action now detects that and directly uses the latest version.
Previously it would download all available releases from the uv repo
to determine the highest matching candidate for the version specifier,
which took much more time.</p>
<p>If you are using other specifiers like <code>0.8.x</code> this action
still needs to download all available releases because the specifier
defines an upper bound (not 0.9.0 or later) and "latest" would
possibly not satisfy that.</p>
<h2>🚨 Breaking changes</h2>
<ul>
<li>Use node24 instead of node20 <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/608">#608</a>)</li>
<li>Remove deprecated input server-url <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/607">#607</a>)</li>
</ul>
<h2>🐛 Bug fixes</h2>
<ul>
<li>Respect UV_CACHE_DIR and cache-dir <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/612">#612</a>)</li>
<li>Use --force when pruning cache <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/611">#611</a>)</li>
<li>Respect UV_NO_MODIFY_PATH <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/603">#603</a>)</li>
<li>Warn when <code>UV_CACHE_DIR</code> has changed <a
href="https://github.com/jamesbraza"><code>@jamesbraza</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/601">#601</a>)</li>
</ul>
<h2>🚀 Enhancements</h2>
<ul>
<li>Shortcut to latest version for minimum version specifier <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/598">#598</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>Bump dependencies <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/613">#613</a>)</li>
<li>Fix test-uv-no-modify-path <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/604">#604</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="eb1897b8dc"><code>eb1897b</code></a>
Bump dependencies (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/613">#613</a>)</li>
<li><a
href="d78d791822"><code>d78d791</code></a>
Bump github/codeql-action from 3.30.5 to 3.30.6 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/605">#605</a>)</li>
<li><a
href="535dc2664c"><code>535dc26</code></a>
Respect UV_CACHE_DIR and cache-dir (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/612">#612</a>)</li>
<li><a
href="f610be5ff9"><code>f610be5</code></a>
Use --force when pruning cache (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/611">#611</a>)</li>
<li><a
href="3deccc0075"><code>3deccc0</code></a>
Use node24 instead of node20 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/608">#608</a>)</li>
<li><a
href="d9ee7e2f26"><code>d9ee7e2</code></a>
Remove deprecated input server-url (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/607">#607</a>)</li>
<li><a
href="59a0868fea"><code>59a0868</code></a>
Bump github/codeql-action from 3.30.3 to 3.30.5 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/594">#594</a>)</li>
<li><a
href="c952556164"><code>c952556</code></a>
Bump <code>@renovatebot/pep440</code> from 4.2.0 to 4.2.1 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/581">#581</a>)</li>
<li><a
href="51c3328db2"><code>51c3328</code></a>
Fix test-uv-no-modify-path (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/604">#604</a>)</li>
<li><a
href="f2859da213"><code>f2859da</code></a>
Respect UV_NO_MODIFY_PATH (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/603">#603</a>)</li>
<li>Additional commits viewable in <a
href="d0cc045d04...eb1897b8dc">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
Removes VectorDBs from API surface and our tests.
Moves tests to Vector Stores.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Allows passing through extra_body parameters to inference providers.
With this, we removed the 2 vllm-specific parameters from completions
API into `extra_body`.
Before/After
<img width="1883" height="324" alt="image"
src="https://github.com/user-attachments/assets/acb27c08-c748-46c9-b1da-0de64e9908a1"
/>
Closes #2720
## Test Plan
CI and added new test
```
❯ uv run pytest -s -v tests/integration/ --stack-config=server:starter --inference-mode=record -k 'not( builtin_tool or safety_with_image or code_interpreter or test_rag ) and test_openai_completion_guided_choice' --setup=vllm --suite=base --color=yes
Uninstalled 3 packages in 125ms
Installed 3 packages in 19ms
INFO 2025-10-10 14:29:54,317 tests.integration.conftest:118 tests: Applying setup 'vllm' for suite base
INFO 2025-10-10 14:29:54,331 tests.integration.conftest:47 tests: Test stack config type: server
(stack_config=server:starter)
============================================================================================================== test session starts ==============================================================================================================
platform darwin -- Python 3.12.11, pytest-8.4.2, pluggy-1.6.0 -- /Users/erichuang/projects/llama-stack-1/.venv/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/erichuang/projects/llama-stack-1
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 285 items / 284 deselected / 1 selected
tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
instantiating llama_stack_client
Starting llama stack server with config 'starter' on port 8321...
Waiting for server at http://localhost:8321... (0.0s elapsed)
Waiting for server at http://localhost:8321... (0.5s elapsed)
Waiting for server at http://localhost:8321... (5.1s elapsed)
Waiting for server at http://localhost:8321... (5.6s elapsed)
Waiting for server at http://localhost:8321... (10.1s elapsed)
Waiting for server at http://localhost:8321... (10.6s elapsed)
Server is ready at http://localhost:8321
llama_stack_client instantiated in 11.773s
PASSEDTerminating llama stack server process...
Terminating process 98444 and its group...
Server process and children terminated gracefully
============================================================================================================= slowest 10 durations ==============================================================================================================
11.88s setup tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
3.02s call tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
0.01s teardown tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
================================================================================================ 1 passed, 284 deselected, 3 warnings in 16.21s =================================================================================================
```
# What does this PR do?
Converts openai(_chat)_completions params to pydantic BaseModel to
reduce code duplication across all providers.
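A hedged sketch of the idea (class and field names here are illustrative, not
the actual models in the codebase):
```python
# Sketch only: collect the (chat) completion parameters into one pydantic
# model instead of repeating long argument lists in every provider.
from pydantic import BaseModel


class OpenAIChatCompletionRequest(BaseModel):
    model: str
    messages: list[dict]
    temperature: float | None = None
    max_tokens: int | None = None
    stream: bool | None = None


def openai_chat_completion(params: OpenAIChatCompletionRequest) -> dict:
    # Providers receive one validated object and forward only the fields that are set.
    payload = params.model_dump(exclude_none=True)
    return payload  # forwarded to the provider-specific client
```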
## Test Plan
CI
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3761).
* #3777
* __->__ #3761
The AuthenticationMiddleware was blocking all requests without an
Authorization header, including health and version endpoints that are
needed by monitoring tools, load balancers, and Kubernetes probes.
This commit allows endpoints ending in /health or /version to bypass
authentication, enabling operational tooling to function properly
without requiring credentials.
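A minimal sketch of the idea as an ASGI-style middleware (names and details
are illustrative, not the actual implementation):
```python
# Hypothetical sketch; the real AuthenticationMiddleware differs in detail.
class AuthenticationMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            path = scope["path"]
            # /health and /version bypass auth so probes, load balancers, and
            # monitoring tools can function without credentials.
            if not (path.endswith("/health") or path.endswith("/version")):
                headers = dict(scope.get("headers", []))
                if b"authorization" not in headers:
                    await send({"type": "http.response.start", "status": 401,
                                "headers": [(b"content-type", b"text/plain")]})
                    await send({"type": "http.response.body", "body": b"Unauthorized"})
                    return
        await self.app(scope, receive, send)
```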
Closes: #3735
Signed-off-by: Derek Higgins <derekh@redhat.com>
Implements usage accumulation in StreamingResponseOrchestrator.
The most important part was to pass `stream_options = { "include_usage":
true }` to the chat_completion call. This means I will have to record
all responses tests again because request hash will change :)
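For reference, a minimal sketch of requesting usage on a streamed completion
with an OpenAI-compatible client (endpoint and model name are illustrative):
```python
# Sketch only: ask for usage on a streaming chat completion.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
    stream_options={"include_usage": True},  # the final chunk carries the usage block
)

usage = None
for chunk in stream:
    if chunk.usage is not None:  # only present on the last chunk
        usage = chunk.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```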
Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
There are many changes to responses which are landing. They are
introducing fundamental new types. This means re-recordings even from
the inference calls. Let's avoid that for now.
Once everything lands I will re-record everything, make things pass and
re-enable.
# What does this PR do?
This PR checks whether, if a previous response is linked, there are
mcp_list_tools objects that can be reused instead of listing the tools
explicitly every time.
Closes #3106
## Test Plan
Tested manually.
Added unit tests to cover new behaviour.
---------
Signed-off-by: Gordon Sim <gsim@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Use SecretStr for OpenAIMixin providers (a sketch of the resulting config
shape follows the list below):
- RemoteInferenceProviderConfig now has auth_credential: SecretStr
- the default alias is api_key (most common name)
- some providers override to use api_token (RunPod, vLLM, Databricks)
- some providers exclude it (Ollama, TGI, Vertex AI)
addresses #3517
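A hedged sketch of the config shape described above (defaults, aliases, and
the provider subclass name are simplified/illustrative):
```python
# Sketch only: a SecretStr credential with an "api_key" alias by default,
# which individual providers can override (e.g. "api_token").
from pydantic import BaseModel, Field, SecretStr


class RemoteInferenceProviderConfig(BaseModel):
    # Accepts/serializes the field as "api_key", the most common name.
    auth_credential: SecretStr | None = Field(default=None, alias="api_key")

    model_config = {"populate_by_name": True}


class VLLMProviderConfig(RemoteInferenceProviderConfig):
    # Illustrative override: some providers use "api_token" instead.
    auth_credential: SecretStr | None = Field(default=None, alias="api_token")


cfg = RemoteInferenceProviderConfig(api_key="sk-example")
print(cfg.auth_credential)                     # masked: **********
print(cfg.auth_credential.get_secret_value())  # the real value, only on request
```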
## Test Plan
ci w/ new tests
Updated scripts/normalize_recordings.py to dynamically find and process
all 'recordings' directories under tests/ using pathlib.rglob() instead
of hardcoding a single path.
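A minimal sketch of the discovery logic, assuming recordings live in
directories literally named `recordings`:
```python
# Sketch only: find every recordings directory under tests/ dynamically.
from pathlib import Path

tests_root = Path("tests")
recording_dirs = sorted(p for p in tests_root.rglob("recordings") if p.is_dir())
for d in recording_dirs:
    print(f"normalizing recordings under {d}")
    # ... normalization of the recording files in `d` happens here ...
```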
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Allows model check to fail gracefully instead of crashing on startup.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
set VLLM_URL to your VLLM server
```
(base) akram@Mac llama-stack % LAMA_STACK_LOGGING="all=debug" VLLM_ENABLE_MODEL_DISCOVERY=false MILVUS_DB_PATH=./milvus.db INFERENCE_MODEL=vllm uv run --with llama-stack llama stack build --distro starter --image-type venv --run
```
```
INFO 2025-10-08 20:11:24,637 llama_stack.providers.utils.inference.inference_store:74 inference: Write queue disabled for SQLite to avoid concurrency issues
INFO 2025-10-08 20:11:24,866 llama_stack.providers.utils.responses.responses_store:96 openai_responses: Write queue disabled for SQLite to avoid concurrency issues
ERROR 2025-10-08 20:11:26,160 llama_stack.providers.utils.inference.openai_mixin:439 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: <a
href="https://oauth.akram.a1ey.p3.openshiftapps.com:443/oauth/authorize?approval_prompt=force&client_id=system%3Aserviceaccount%3Arhoai-30-genai%3Adefault&redirect_uri=ht
tps%3A%2F%2Fvllm-rhoai-30-genai.apps.rosa.akram.a1ey.p3.openshiftapps.com%2Foauth%2Fcallback&response_type=code&scope=user%3Ainfo+user%3Acheck-access&state=9fba207425
5851c718aca717a5887d76%3A%2Fmodels">Found</a>.
[...]
INFO 2025-10-08 20:11:26,295 uvicorn.error:84 uncategorized: Started server process [83144]
INFO 2025-10-08 20:11:26,296 uvicorn.error:48 uncategorized: Waiting for application startup.
INFO 2025-10-08 20:11:26,297 llama_stack.core.server.server:170 core::server: Starting up
INFO 2025-10-08 20:11:26,297 llama_stack.core.stack:399 core: starting registry refresh task
INFO 2025-10-08 20:11:26,311 uvicorn.error:62 uncategorized: Application startup complete.
INFO 2025-10-08 20:11:26,312 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
ERROR 2025-10-08 20:11:26,791 llama_stack.providers.utils.inference.openai_mixin:439 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: <a
href="https://oauth.akram.a1ey.p3.openshiftapps.com:443/oauth/authorize?approval_prompt=force&client_id=system%3Aserviceaccount%3Arhoai-30-genai%3Adefault&redirect_uri=ht
tps%3A%2F%2Fvllm-rhoai-30-genai.apps.rosa.akram.a1ey.p3.openshiftapps.com%2Foauth%2Fcallback&response_type=code&scope=user%3Ainfo+user%3Acheck-access&state=8ef0cba3e1
71a4f8b04cb445cfb91a4c%3A%2Fmodels">Found</a>.
```
## Summary
Adds OpenAI-compatible usage tracking types to enable reporting token
consumption for both streaming and non-streaming responses.
## Type Definitions
**Chat Completion Usage** (inference API):
```python
class OpenAIChatCompletionUsage(BaseModel):
prompt_tokens: int
completion_tokens: int
total_tokens: int
prompt_tokens_details: OpenAIChatCompletionUsagePromptTokensDetails | None
completion_tokens_details: OpenAIChatCompletionUsageCompletionTokensDetails | None
```
**Response Usage** (responses API):
```python
class OpenAIResponseUsage(BaseModel):
input_tokens: int
output_tokens: int
total_tokens: int
input_tokens_details: OpenAIResponseUsageInputTokensDetails | None
output_tokens_details: OpenAIResponseUsageOutputTokensDetails | None
```
This matches OpenAI's usage reporting format and enables PR #3766 to
implement usage tracking in streaming responses.
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
After removing model management CLI in #3700, this PR updates remaining
references to the old `llama download` command to use `huggingface-cli
download` instead.
## Changes
- Updated error messages in `meta_reference/common.py` to recommend
`huggingface-cli download`
- Updated error messages in
`torchtune/recipes/lora_finetuning_single_device.py` to use
`huggingface-cli download`
- Updated post-training notebook to use `huggingface-cli download`
instead of `llama download`
- Fixed typo: "you model" -> "your model"
## Test Plan
- Verified error messages provide correct guidance for users
- Checked that notebook instructions are up-to-date with current tooling
## Summary
Fixes #2990
Remote provider authentication errors (401/403) were being converted to
500 Internal Server Error, preventing users from understanding why their
requests failed.
## The Problem
When a request with an invalid API key was sent to a remote provider:
- Provider correctly returns 401 with error details
- Llama Stack's `translate_exception()` didn't recognize provider SDK
exceptions
- Fell through to generic 500 error handler
- User received: "Internal server error: An unexpected error occurred."
## The Fix
Added handler in `translate_exception()` that checks for exceptions with
a `status_code` attribute and preserves the original HTTP status code
and error message.
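A hedged sketch of the handler (the real `translate_exception()` covers many
more cases):
```python
# Sketch only: preserve provider HTTP errors instead of collapsing them to 500.
from fastapi import HTTPException


def translate_exception(exc: Exception) -> HTTPException:
    # Provider SDK exceptions (OpenAI-style clients) carry a status_code
    # attribute; pass the original status and message through.
    status_code = getattr(exc, "status_code", None)
    if isinstance(status_code, int):
        return HTTPException(status_code=status_code, detail=str(exc))
    return HTTPException(
        status_code=500,
        detail="Internal server error: An unexpected error occurred.",
    )
```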
**Before:**
```json
HTTP 500
{"detail": "Internal server error: An unexpected error occurred."}
```
**After:**
```json
HTTP 401
{"detail": "Error code: 401 - {'error': {'message': 'Invalid API Key', 'type': 'invalid_request_error', 'code': 'invalid_api_key'}}"}
```
## Tested With
- ✅ groq: 401 "Invalid API Key"
- ✅ openai: 401 "Incorrect API key provided"
- ✅ together: 401 "Invalid API key provided"
- ✅ fireworks: 403 "unauthorized"
## Test Plan
**Automated test script:**
https://gist.github.com/ashwinb/1199dd7585ffa3f4be67b111cc65f2f3
The test script:
1. Builds separate stacks for each provider
2. Registers models (with validation temporarily disabled for testing)
3. Sends requests with invalid API keys via `x-llamastack-provider-data`
header
4. Verifies HTTP status codes are 401/403 (not 500)
**Results before fix:** All providers returned 500
**Results after fix:** All providers correctly return 401/403
**Manual verification:**
```bash
# 1. Build stack
llama stack build --image-type venv --providers inference=remote::groq
# 2. Start stack
llama stack run
# 3. Send request with invalid API key
curl http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-H 'x-llamastack-provider-data: {"groq_api_key": "invalid-key"}' \
-d '{"model": "groq/llama3-70b-8192", "messages": [{"role": "user", "content": "test"}]}'
# Expected: HTTP 401 with provider error message (not 500)
```
## Impact
- Works with all remote providers using OpenAI SDK (groq, openai,
together, fireworks, etc.)
- Works with any provider SDK that follows the pattern of exceptions
with `status_code` attribute
- No breaking changes - only affects error responses
# What does this PR do?
Objects (vector DBs, models, scoring functions, etc.) have an identifier and
associated object values.
We allow exact duplicate registrations.
We reject registrations when the identifier already exists and the associated
object values differ.
Note: models are namespaced, i.e. {provider_id}/{identifier}, while other
object types are not.
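A minimal sketch of the registration rule, using a plain dict in place of the
real registry types:
```python
# Sketch only: allow exact duplicates, reject conflicting re-registrations.
class DuplicateRegistrationError(ValueError):
    pass


def register(registry: dict[str, dict], identifier: str, values: dict) -> None:
    existing = registry.get(identifier)
    if existing is None:
        registry[identifier] = values  # new object: register it
    elif existing == values:
        return  # exact duplicate: allowed, no-op
    else:
        raise DuplicateRegistrationError(
            f"{identifier} is already registered with different values"
        )
```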
## Test Plan
ci w/ new tests
This change removes the `llama model` and `llama download` subcommands
from the CLI, replacing them with recommendations to use the Hugging
Face CLI instead.
Rationale for this change:
- The model management functionality was largely duplicating what
Hugging Face CLI already provides, leading to unnecessary maintenance
overhead (except the download source from Meta?)
- Maintaining our own implementation required fixing bugs and keeping up
with changes in model repositories and download mechanisms
- The Hugging Face CLI is more mature, widely adopted, and better
maintained
- This allows us to focus on the core Llama Stack functionality rather
than reimplementing model management tools
Changes made:
- Removed all model-related CLI commands and their implementations
- Updated documentation to recommend using `huggingface-cli` for model
downloads
- Removed Meta-specific download logic and statements
- Simplified the CLI to focus solely on stack management operations
Users should now use:
- `huggingface-cli download` for downloading models
- `huggingface-cli scan-cache` for listing downloaded models
This is a breaking change as it removes previously available CLI
commands.
Signed-off-by: Sébastien Han <seb@redhat.com>
Replaces opaque error messages when recordings are not found with
somewhat better guidance
Before:
```
No recorded response found for request hash: abc123...
To record this response, run with LLAMA_STACK_TEST_INFERENCE_MODE=record
```
After:
```
Recording not found for request hash: abc123
Model: gpt-4 | Request: POST https://api.openai.com/v1/chat/completions
Run './scripts/integration-tests.sh --inference-mode record-if-missing' with required API keys to generate.
```
These vector databases are already thoroughly tested in integration
tests.
Unit tests now focus on sqlite_vec, faiss, and pgvector with mocked
dependencies, removing the need for external service dependencies.
## Changes:
- Deleted test_qdrant.py unit test file
- Removed chroma/qdrant fixtures and parametrization from conftest.py
- Fixed SqliteKVStoreConfig import to use correct location
- Removed chromadb, qdrant-client, pymilvus, milvus-lite, and
weaviate-client from unit test dependencies in pyproject.toml
Renames `inference_recorder.py` to `api_recorder.py` and extends it to
support recording/replaying tool invocations in addition to inference
calls.
This allows us to record web-search, etc. tool calls and thereafter
apply recordings for `tests/integration/responses`
## Test Plan
```
export OPENAI_API_KEY=...
export TAVILY_SEARCH_API_KEY=...
./scripts/integration-tests.sh --stack-config ci-tests \
--suite responses --inference-mode record-if-missing
```
# What does this PR do?
Adds traces around tool execution and mcp tool listing for better
observability.
Closes #3108
## Test Plan
Manually examined traces in jaeger to verify the added information was
available.
Signed-off-by: Gordon Sim <gsim@redhat.com>
Adds --collect-only flag to scripts/integration-tests.sh that skips
server startup and passes the flag to pytest for test collection only.
When specified, minimal flags are required (no --stack-config or --setup
needed).
## Changes
- Added `--collect-only` flag that skips server startup
- Made `--stack-config` and `--setup` optional when using
`--collect-only`
- Skip `llama` command check when collecting tests only
## Usage
```bash
# Collect tests without starting server
./scripts/integration-tests.sh --subdirs inference --collect-only
```
# What does this PR do?
Removing Weaviate, Postgres, and Milvus unit tests.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Propagate test IDs from client to server via HTTP headers to maintain proper
test isolation when running with server-based stack configs. Without this,
recorded/replayed inference requests in server mode would leak across tests.
Changes:
- Patch client _prepare_request to inject test ID into provider data header
(see the sketch after this list)
- Sync test context from provider data on server side before storage
operations
- Set LLAMA_STACK_TEST_STACK_CONFIG_TYPE env var based on stack config
- Configure console width for cleaner log output in CI
- Add SQLITE_STORE_DIR temp directory for test data isolation
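A hedged sketch of the client-side patch from the list above; the header key
and test-ID field name are illustrative, not the actual ones used:
```python
# Sketch only: wrap _prepare_request so outgoing requests carry the test ID.
import json
from contextlib import contextmanager


@contextmanager
def propagate_test_id(client, test_id: str):
    original = client._prepare_request

    def patched(request):
        provider_data = json.loads(request.headers.get("X-LlamaStack-Provider-Data", "{}"))
        provider_data["__test_id"] = test_id  # illustrative key name
        request.headers["X-LlamaStack-Provider-Data"] = json.dumps(provider_data)
        return original(request)

    client._prepare_request = patched
    try:
        yield client
    finally:
        client._prepare_request = original
```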
# What does this PR do?
It prevents a tool call message from being added to the chat completion
messages without a corresponding tool call result, which is needed when an
approval is required first or when the approval request is denied. In both
cases the tool call message is popped off the next turn's messages.
Closes #3728
## Test Plan
Ran the integration tests
Manual check of both approval and denial against gpt-4o
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
- The watsonx.ai provider now uses the LiteLLM mixin instead of using
IBM's library, which does not seem to be working (see #3165 for
context).
- The watsonx.ai provider now lists all the models available by calling
the watsonx.ai server instead of having a hard coded list of known
models. (That list gets out of date quickly)
- An edge case in
[llama_stack/core/routers/inference.py](https://github.com/llamastack/llama-stack/pull/3674/files#diff-a34bc966ed9befd9f13d4883c23705dff49be0ad6211c850438cdda6113f3455)
is addressed that was causing my manual tests to fail.
- Fixes `b64_encode_openai_embeddings_response`, which was trying to
enumerate over a dictionary and then reference elements of the
dictionary using .field instead of ["field"] (a sketch of the corrected
pattern follows this list). That method is called by the LiteLLM mixin for
embedding models, so it is needed to get the watsonx.ai embedding models to
work.
- A unit test along the lines of the one in #3348 is added. A more
comprehensive plan for automatically testing the end-to-end
functionality for inference providers would be a good idea, but is out
of scope for this PR.
- Updates to the watsonx distribution. Some were in response to the
switch to LiteLLM (e.g., updating the Python packages needed). Others
seem to be things that were already broken that I found along the way
(e.g., a reference to a watsonx specific doc template that doesn't seem
to exist).
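A hedged sketch of the corrected pattern referenced in the
`b64_encode_openai_embeddings_response` bullet above; function and field
names here are illustrative, not the exact ones in the codebase:
```python
# Sketch only: iterate the embeddings as dictionaries (["field"], not .field)
# and base64-encode the float values as little-endian float32.
import base64
import struct


def b64_encode_embeddings(embeddings: list[dict]) -> list[dict]:
    encoded = []
    for item in embeddings:  # each item is a dict, so index with ["..."]
        floats = item["embedding"]
        raw = struct.pack(f"<{len(floats)}f", *floats)
        encoded.append(
            {"index": item["index"], "embedding": base64.b64encode(raw).decode("ascii")}
        )
    return encoded
```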
Closes #3165
Also it is related to a line-item in #3387 but doesn't really address
that goal (because it uses the LiteLLM mixin, not the OpenAI one). I
tried the OpenAI one and it doesn't work with watsonx.ai, presumably
because the watsonx.ai service is not OpenAI compatible. It works with
LiteLLM because LiteLLM has a provider implementation for watsonx.ai.
## Test Plan
The test script below goes back and forth between the OpenAI and watsonx
providers. The idea is that the OpenAI provider shows how it should work
and then the watsonx provider output shows that it is also working with
watsonx. Note that the result from the MCP test is not as good (the
Llama 3.3 70b model does not choose tools as wisely as gpt-4o), but it
is still working and providing a valid response. For more details on
setup and the MCP server being used for testing, see [the AI Alliance
sample
notebook](https://github.com/The-AI-Alliance/llama-stack-examples/blob/main/notebooks/01-responses/)
that these examples are drawn from.
```python
#!/usr/bin/env python3
import json
from llama_stack_client import LlamaStackClient
from litellm import completion
import http.client
def print_response(response):
"""Print response in a nicely formatted way"""
print(f"ID: {response.id}")
print(f"Status: {response.status}")
print(f"Model: {response.model}")
print(f"Created at: {response.created_at}")
print(f"Output items: {len(response.output)}")
for i, output_item in enumerate(response.output):
if len(response.output) > 1:
print(f"\n--- Output Item {i+1} ---")
print(f"Output type: {output_item.type}")
if output_item.type in ("text", "message"):
print(f"Response content: {output_item.content[0].text}")
elif output_item.type == "file_search_call":
print(f" Tool Call ID: {output_item.id}")
print(f" Tool Status: {output_item.status}")
# 'queries' is a list, so we join it for clean printing
print(f" Queries: {', '.join(output_item.queries)}")
# Display results if they exist, otherwise note they are empty
print(f" Results: {output_item.results if output_item.results else 'None'}")
elif output_item.type == "mcp_list_tools":
print_mcp_list_tools(output_item)
elif output_item.type == "mcp_call":
print_mcp_call(output_item)
else:
print(f"Response content: {output_item.content}")
def print_mcp_call(mcp_call):
"""Print MCP call in a nicely formatted way"""
print(f"\n🛠️ MCP Tool Call: {mcp_call.name}")
print(f" Server: {mcp_call.server_label}")
print(f" ID: {mcp_call.id}")
print(f" Arguments: {mcp_call.arguments}")
if mcp_call.error:
print("Error: {mcp_call.error}")
elif mcp_call.output:
print("Output:")
# Try to format JSON output nicely
try:
parsed_output = json.loads(mcp_call.output)
print(json.dumps(parsed_output, indent=4))
except (json.JSONDecodeError, TypeError):
# If not valid JSON, print as-is
print(f" {mcp_call.output}")
else:
print(" ⏳ No output yet")
def print_mcp_list_tools(mcp_list_tools):
"""Print MCP list tools in a nicely formatted way"""
print(f"\n🔧 MCP Server: {mcp_list_tools.server_label}")
print(f" ID: {mcp_list_tools.id}")
print(f" Available Tools: {len(mcp_list_tools.tools)}")
print("=" * 80)
for i, tool in enumerate(mcp_list_tools.tools, 1):
print(f"\n{i}. {tool.name}")
print(f" Description: {tool.description}")
# Parse and display input schema
schema = tool.input_schema
if schema and 'properties' in schema:
properties = schema['properties']
required = schema.get('required', [])
print(" Parameters:")
for param_name, param_info in properties.items():
param_type = param_info.get('type', 'unknown')
param_desc = param_info.get('description', 'No description')
required_marker = " (required)" if param_name in required else " (optional)"
print(f" • {param_name} ({param_type}){required_marker}")
if param_desc:
print(f" {param_desc}")
if i < len(mcp_list_tools.tools):
print("-" * 40)
def main():
"""Main function to run all the tests"""
# Configuration
LLAMA_STACK_URL = "http://localhost:8321/"
LLAMA_STACK_MODEL_IDS = [
"openai/gpt-3.5-turbo",
"openai/gpt-4o",
"llama-openai-compat/Llama-3.3-70B-Instruct",
"watsonx/meta-llama/llama-3-3-70b-instruct"
]
# Using gpt-4o for this demo, but feel free to try one of the others or add more to run.yaml.
OPENAI_MODEL_ID = LLAMA_STACK_MODEL_IDS[1]
WATSONX_MODEL_ID = LLAMA_STACK_MODEL_IDS[-1]
NPS_MCP_URL = "http://localhost:3005/sse/"
print("=== Llama Stack Testing Script ===")
print(f"Using OpenAI model: {OPENAI_MODEL_ID}")
print(f"Using WatsonX model: {WATSONX_MODEL_ID}")
print(f"MCP URL: {NPS_MCP_URL}")
print()
# Initialize client
print("Initializing LlamaStackClient...")
client = LlamaStackClient(base_url="http://localhost:8321")
# Test 1: List models
print("\n=== Test 1: List Models ===")
try:
models = client.models.list()
print(f"Found {len(models)} models")
except Exception as e:
print(f"Error listing models: {e}")
raise e
# Test 2: Basic chat completion with OpenAI
print("\n=== Test 2: Basic Chat Completion (OpenAI) ===")
try:
chat_completion_response = client.chat.completions.create(
model=OPENAI_MODEL_ID,
messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print("OpenAI Response:")
for chunk in chat_completion_response.choices[0].message.content:
print(chunk, end="", flush=True)
print()
except Exception as e:
print(f"Error with OpenAI chat completion: {e}")
raise e
# Test 3: Basic chat completion with WatsonX
print("\n=== Test 3: Basic Chat Completion (WatsonX) ===")
try:
chat_completion_response_wxai = client.chat.completions.create(
model=WATSONX_MODEL_ID,
messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print("WatsonX Response:")
for chunk in chat_completion_response_wxai.choices[0].message.content:
print(chunk, end="", flush=True)
print()
except Exception as e:
print(f"Error with WatsonX chat completion: {e}")
raise e
# Test 4: Tool calling with OpenAI
print("\n=== Test 4: Tool Calling (OpenAI) ===")
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a specific location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g., San Francisco, CA",
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
},
},
"required": ["location"],
},
},
}
]
messages = [
{"role": "user", "content": "What's the weather like in Boston, MA?"}
]
try:
print("--- Initial API Call ---")
response = client.chat.completions.create(
model=OPENAI_MODEL_ID,
messages=messages,
tools=tools,
tool_choice="auto", # "auto" is the default
)
print("OpenAI tool calling response received")
except Exception as e:
print(f"Error with OpenAI tool calling: {e}")
raise e
# Test 5: Tool calling with WatsonX
print("\n=== Test 5: Tool Calling (WatsonX) ===")
try:
wxai_response = client.chat.completions.create(
model=WATSONX_MODEL_ID,
messages=messages,
tools=tools,
tool_choice="auto", # "auto" is the default
)
print("WatsonX tool calling response received")
except Exception as e:
print(f"Error with WatsonX tool calling: {e}")
raise e
# Test 6: Streaming with WatsonX
print("\n=== Test 6: Streaming Response (WatsonX) ===")
try:
chat_completion_response_wxai_stream = client.chat.completions.create(
model=WATSONX_MODEL_ID,
messages=[{"role": "user", "content": "What is the capital of France?"}],
stream=True
)
print("Model response: ", end="")
for chunk in chat_completion_response_wxai_stream:
# Each 'chunk' is a ChatCompletionChunk object.
# We want the content from the 'delta' attribute.
if hasattr(chunk, 'choices') and chunk.choices is not None:
content = chunk.choices[0].delta.content
# The first few chunks might have None content, so we check for it.
if content is not None:
print(content, end="", flush=True)
print()
except Exception as e:
print(f"Error with streaming: {e}")
raise e
# Test 7: MCP with OpenAI
print("\n=== Test 7: MCP Integration (OpenAI) ===")
try:
mcp_llama_stack_client_response = client.responses.create(
model=OPENAI_MODEL_ID,
input="Tell me about some parks in Rhode Island, and let me know if there are any upcoming events at them.",
tools=[
{
"type": "mcp",
"server_url": NPS_MCP_URL,
"server_label": "National Parks Service tools",
"allowed_tools": ["search_parks", "get_park_events"],
}
]
)
print_response(mcp_llama_stack_client_response)
except Exception as e:
print(f"Error with MCP (OpenAI): {e}")
raise e
# Test 8: MCP with WatsonX
print("\n=== Test 8: MCP Integration (WatsonX) ===")
try:
mcp_llama_stack_client_response = client.responses.create(
model=WATSONX_MODEL_ID,
input="What is the capital of France?"
)
print_response(mcp_llama_stack_client_response)
except Exception as e:
print(f"Error with MCP (WatsonX): {e}")
raise e
# Test 9: MCP with Llama 3.3
print("\n=== Test 9: MCP Integration (Llama 3.3) ===")
try:
mcp_llama_stack_client_response = client.responses.create(
model=WATSONX_MODEL_ID,
input="Tell me about some parks in Rhode Island, and let me know if there are any upcoming events at them.",
tools=[
{
"type": "mcp",
"server_url": NPS_MCP_URL,
"server_label": "National Parks Service tools",
"allowed_tools": ["search_parks", "get_park_events"],
}
]
)
print_response(mcp_llama_stack_client_response)
except Exception as e:
print(f"Error with MCP (Llama 3.3): {e}")
raise e
# Test 10: Embeddings
print("\n=== Test 10: Embeddings ===")
try:
conn = http.client.HTTPConnection("localhost:8321")
payload = json.dumps({
"model": "watsonx/ibm/granite-embedding-278m-multilingual",
"input": "Hello, world!",
})
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json'
}
conn.request("POST", "/v1/openai/v1/embeddings", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
except Exception as e:
print(f"Error with Embeddings: {e}")
raise e
print("\n=== Testing Complete ===")
if __name__ == "__main__":
main()
```
---------
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Bumps [actions/stale](https://github.com/actions/stale) from 10.0.0 to
10.1.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/stale/releases">actions/stale's
releases</a>.</em></p>
<blockquote>
<h2>v10.1.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add <code>only-issue-types</code> option to filter issues by type by
<a href="https://github.com/Bibo-Joshi"><code>@Bibo-Joshi</code></a> in
<a
href="https://redirect.github.com/actions/stale/pull/1255">actions/stale#1255</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/Bibo-Joshi"><code>@Bibo-Joshi</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/stale/pull/1255">actions/stale#1255</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/stale/compare/v10...v10.1.0">https://github.com/actions/stale/compare/v10...v10.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5f858e3efb"><code>5f858e3</code></a>
Add <code>only-issue-types</code> option to filter issues by type (<a
href="https://redirect.github.com/actions/stale/issues/1255">#1255</a>)</li>
<li>See full diff in <a
href="3a9db7e6a4...5f858e3efb">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When the user wants to change the attributes (which could include model name,
dimensions, etc.) of an already registered provider, they will get an error
message asking that they first unregister the provider before registering a
new one.
# What does this PR do?
This PR updates the register function to raise an error when a user attempts
to register a provider that is already registered, asking them to un-register
the existing provider first.
<!-- If resolving an issue, uncomment and update the line below -->
#2313
## Test Plan
Tested the change with /tests/unit/registry/test_registry.py
---------
Co-authored-by: Omar Abdelwahab <omara@fb.com>
# What does this PR do?
Users can simply set env vars at the beginning of the command: `FOO=BAR
llama stack run ...`
## Test Plan
Run
TELEMETRY_SINKS=coneol uv run --with llama-stack llama stack build
--distro=starter --image-type=venv --run
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3711).
* #3714
* __->__ #3711
# What does this PR do?
Removes some dead code found by vulture; checked with Claude that there are
no remaining references or imports for it.
## Test Plan
CI
# What does this PR do?
(Used claude to solve #3715, coded with claude but tested by me)
## From claude summary:
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
**Problem**: The `NVIDIAInferenceAdapter` class was missing the
`alias_to_provider_id_map` attribute, which caused the error:
`ERROR 'NVIDIAInferenceAdapter' object has no attribute
'alias_to_provider_id_map'`
**Root Cause**: The `NVIDIAInferenceAdapter` only inherited from
`OpenAIMixin`, but some parts of the system expected it to have the
`alias_to_provider_id_map` attribute, which is provided by the
`ModelRegistryHelper` class.
**Solution**:
1. **Added ModelRegistryHelper import**: Imported the
`ModelRegistryHelper` class from
`llama_stack.providers.utils.inference.model_registry`
2. **Updated inheritance**: Changed the class declaration to inherit
from both `OpenAIMixin` and `ModelRegistryHelper`
3. **Added proper initialization**: Added an `__init__` method that
properly initializes the `ModelRegistryHelper` with empty model entries
(since NVIDIA uses dynamic model discovery) and the allowed models from
the configuration
**Key Changes**:
* Added `from llama_stack.providers.utils.inference.model_registry
import ModelRegistryHelper`
* Changed class declaration from `class
NVIDIAInferenceAdapter(OpenAIMixin):` to `class
NVIDIAInferenceAdapter(OpenAIMixin, ModelRegistryHelper):`
* Added `__init__` method that calls `ModelRegistryHelper.__init__(self,
model_entries=[], allowed_models=config.allowed_models)`
The inheritance order is important - `OpenAIMixin` comes first to ensure
its `check_model_availability()` method takes precedence over the
`ModelRegistryHelper` version, as mentioned in the class documentation.
This fix ensures that the `NVIDIAInferenceAdapter` has the required
`alias_to_provider_id_map` attribute while maintaining all existing
functionality.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
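A hedged sketch of the resulting class shape (initialization simplified; the
real adapter does more):
```python
# Sketch only: inherit from both mixins, with OpenAIMixin first so its
# check_model_availability() takes precedence in the MRO.
from llama_stack.providers.utils.inference.model_registry import ModelRegistryHelper
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin


class NVIDIAInferenceAdapter(OpenAIMixin, ModelRegistryHelper):
    def __init__(self, config) -> None:
        # No static entries: NVIDIA models are discovered dynamically.
        ModelRegistryHelper.__init__(self, model_entries=[], allowed_models=config.allowed_models)
        self.config = config  # illustrative; the real adapter keeps more state
```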
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Launched the llama-stack server successfully; see logs:
```
NVIDIA_API_KEY=dummy NVIDIA_BASE_URL=http://localhost:8912 llama stack run /home/nvidia/.llama/distributions/starter/starter-run.yaml --image-type venv &
[2] 3753042
(venv) nvidia@nv-meta-H100-testing-gpu01:~/kai/llama-stack$ WARNING 2025-10-07 00:29:09,848 root:266 uncategorized: Unknown logging category:
openai::conversations. Falling back to default 'root' level: 20
WARNING 2025-10-07 00:29:09,932 root:266 uncategorized: Unknown logging category: cli.
Falling back to default 'root' level: 20
INFO 2025-10-07 00:29:09,937 llama_stack.core.utils.config_resolution:45 core:
Using file path: /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO 2025-10-07 00:29:09,937 llama_stack.cli.stack.run:136 cli: Using run
configuration: /home/nvidia/.llama/distributions/starter/starter-run.yaml
Using virtual environment: /home/nvidia/kai/venv
Virtual environment already activated
+ '[' -n /home/nvidia/.llama/distributions/starter/starter-run.yaml ']'
+ yaml_config_arg=/home/nvidia/.llama/distributions/starter/starter-run.yaml
+ llama stack run /home/nvidia/.llama/distributions/starter/starter-run.yaml --port 8321
WARNING 2025-10-07 00:29:11,432 root:266 uncategorized: Unknown logging category:
openai::conversations. Falling back to default 'root' level: 20
WARNING 2025-10-07 00:29:11,593 root:266 uncategorized: Unknown logging category: cli.
Falling back to default 'root' level: 20
INFO 2025-10-07 00:29:11,603 llama_stack.core.utils.config_resolution:45 core:
Using file path: /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO 2025-10-07 00:29:11,604 llama_stack.cli.stack.run:136 cli: Using run
configuration: /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO 2025-10-07 00:29:11,624 llama_stack.cli.stack.run:155 cli: No image type or
image name provided. Assuming environment packages.
INFO 2025-10-07 00:29:11,625 llama_stack.core.utils.config_resolution:45 core:
Using file path: /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO 2025-10-07 00:29:11,644 llama_stack.cli.stack.run:230 cli: HTTPS enabled with
certificates:
Key: None
Cert: None
INFO 2025-10-07 00:29:11,645 llama_stack.cli.stack.run:232 cli: Listening on ['::',
'0.0.0.0']:8321
INFO 2025-10-07 00:29:11,816 llama_stack.core.utils.config_resolution:45 core:
Using file path: /home/nvidia/.llama/distributions/starter/starter-run.yaml
INFO 2025-10-07 00:29:11,836 llama_stack.core.server.server:480 core::server: Run
configuration:
INFO 2025-10-07 00:29:11,845 llama_stack.core.server.server:483 core::server: apis:
- agents
- batches
- datasetio
- eval
- files
- inference
- post_training
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
datasets: []
image_name: starter
inference_store:
db_path: /home/nvidia/.llama/distributions/starter/inference_store.db
type: sqlite
metadata_store:
db_path: /home/nvidia/.llama/distributions/starter/registry.db
type: sqlite
models: []
providers:
agents:
- config:
persistence_store:
db_path: /home/nvidia/.llama/distributions/starter/agents_store.db
type: sqlite
responses_store:
db_path: /home/nvidia/.llama/distributions/starter/responses_store.db
type: sqlite
provider_id: meta-reference
provider_type: inline::meta-reference
batches:
- config:
kvstore:
db_path: /home/nvidia/.llama/distributions/starter/batches.db
type: sqlite
provider_id: reference
provider_type: inline::reference
datasetio:
- config:
kvstore:
db_path:
/home/nvidia/.llama/distributions/starter/huggingface_datasetio.db
type: sqlite
provider_id: huggingface
provider_type: remote::huggingface
- config:
kvstore:
db_path:
/home/nvidia/.llama/distributions/starter/localfs_datasetio.db
type: sqlite
provider_id: localfs
provider_type: inline::localfs
eval:
- config:
kvstore:
db_path:
/home/nvidia/.llama/distributions/starter/meta_reference_eval.db
type: sqlite
provider_id: meta-reference
provider_type: inline::meta-reference
files:
- config:
metadata_store:
db_path: /home/nvidia/.llama/distributions/starter/files_metadata.db
type: sqlite
storage_dir: /home/nvidia/.llama/distributions/starter/files
provider_id: meta-reference-files
provider_type: inline::localfs
inference:
- config:
api_key: '********'
url: https://api.fireworks.ai/inference/v1
provider_id: fireworks
provider_type: remote::fireworks
- config:
api_key: '********'
url: https://api.together.xyz/v1
provider_id: together
provider_type: remote::together
- config: {}
provider_id: bedrock
provider_type: remote::bedrock
- config:
api_key: '********'
append_api_version: true
url: http://localhost:8912
provider_id: nvidia
provider_type: remote::nvidia
- config:
api_key: '********'
base_url: https://api.openai.com/v1
provider_id: openai
provider_type: remote::openai
- config:
api_key: '********'
provider_id: anthropic
provider_type: remote::anthropic
- config:
api_key: '********'
provider_id: gemini
provider_type: remote::gemini
- config:
api_key: '********'
url: https://api.groq.com
provider_id: groq
provider_type: remote::groq
- config:
api_key: '********'
url: https://api.sambanova.ai/v1
provider_id: sambanova
provider_type: remote::sambanova
- config: {}
provider_id: sentence-transformers
provider_type: inline::sentence-transformers
post_training:
- config:
checkpoint_format: meta
provider_id: torchtune-cpu
provider_type: inline::torchtune-cpu
safety:
- config:
excluded_categories: []
provider_id: llama-guard
provider_type: inline::llama-guard
- config: {}
provider_id: code-scanner
provider_type: inline::code-scanner
scoring:
- config: {}
provider_id: basic
provider_type: inline::basic
- config: {}
provider_id: llm-as-judge
provider_type: inline::llm-as-judge
- config:
openai_api_key: '********'
provider_id: braintrust
provider_type: inline::braintrust
telemetry:
- config:
service_name: "\u200B"
sinks: sqlite
sqlite_db_path: /home/nvidia/.llama/distributions/starter/trace_store.db
provider_id: meta-reference
provider_type: inline::meta-reference
tool_runtime:
- config:
api_key: '********'
max_results: 3
provider_id: brave-search
provider_type: remote::brave-search
- config:
api_key: '********'
max_results: 3
provider_id: tavily-search
provider_type: remote::tavily-search
- config: {}
provider_id: rag-runtime
provider_type: inline::rag-runtime
- config: {}
provider_id: model-context-protocol
provider_type: remote::model-context-protocol
vector_io:
- config:
kvstore:
db_path: /home/nvidia/.llama/distributions/starter/faiss_store.db
type: sqlite
provider_id: faiss
provider_type: inline::faiss
- config:
db_path: /home/nvidia/.llama/distributions/starter/sqlite_vec.db
kvstore:
db_path:
/home/nvidia/.llama/distributions/starter/sqlite_vec_registry.db
type: sqlite
provider_id: sqlite-vec
provider_type: inline::sqlite-vec
scoring_fns: []
server:
port: 8321
shields: []
tool_groups:
- provider_id: tavily-search
toolgroup_id: builtin::websearch
- provider_id: rag-runtime
toolgroup_id: builtin::rag
vector_dbs: []
version: 2
INFO 2025-10-07 00:29:12,138
llama_stack.providers.remote.inference.nvidia.nvidia:49 inference::nvidia:
Initializing NVIDIAInferenceAdapter(http://localhost:8912)...
INFO 2025-10-07 00:29:12,921
llama_stack.providers.utils.inference.inference_store:74 inference: Write
queue disabled for SQLite to avoid concurrency issues
INFO 2025-10-07 00:29:13,524
llama_stack.providers.utils.responses.responses_store:96 openai_responses:
Write queue disabled for SQLite to avoid concurrency issues
ERROR 2025-10-07 00:29:13,679 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: FireworksInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"fireworks_api_key": "<API_KEY>"},
or in the provider config.
WARNING 2025-10-07 00:29:13,681 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider fireworks: API key is
not set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"fireworks_api_key": "<API_KEY>"}, or in the
provider config.
ERROR 2025-10-07 00:29:13,682 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: TogetherInferenceAdapter.list_provider_model_ids() failed
with: Pass Together API Key in the header X-LlamaStack-Provider-Data as {
"together_api_key": <your api key>}
WARNING 2025-10-07 00:29:13,684 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider together: Pass
Together API Key in the header X-LlamaStack-Provider-Data as {
"together_api_key": <your api key>}
Handling connection for 8912
INFO 2025-10-07 00:29:14,047 llama_stack.providers.utils.inference.openai_mixin:448
providers::utils: NVIDIAInferenceAdapter.list_provider_model_ids() returned 3
models
ERROR 2025-10-07 00:29:14,062 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: OpenAIInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"openai_api_key": "<API_KEY>"}, or
in the provider config.
WARNING 2025-10-07 00:29:14,063 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider openai: API key is not
set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"openai_api_key": "<API_KEY>"}, or in the
provider config.
ERROR 2025-10-07 00:29:14,099 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: AnthropicInferenceAdapter.list_provider_model_ids() failed
with: "Could not resolve authentication method. Expected either api_key or
auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers
to be explicitly omitted"
WARNING 2025-10-07 00:29:14,100 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider anthropic: "Could not
resolve authentication method. Expected either api_key or auth_token to be
set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly
omitted"
ERROR 2025-10-07 00:29:14,102 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: GeminiInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or
in the provider config.
WARNING 2025-10-07 00:29:14,103 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider gemini: API key is not
set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or in the
provider config.
ERROR 2025-10-07 00:29:14,105 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: GroqInferenceAdapter.list_provider_model_ids() failed with:
API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in
the provider config.
WARNING 2025-10-07 00:29:14,106 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider groq: API key is not
set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in the provider
config.
ERROR 2025-10-07 00:29:14,107 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: SambaNovaInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"},
or in the provider config.
WARNING 2025-10-07 00:29:14,109 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider sambanova: API key is
not set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"}, or in the
provider config.
INFO 2025-10-07 00:29:14,454 uvicorn.error:84 uncategorized: Started server process
[3753046]
INFO 2025-10-07 00:29:14,455 uvicorn.error:48 uncategorized: Waiting for
application startup.
INFO 2025-10-07 00:29:14,457 llama_stack.core.server.server:170 core::server:
Starting up
INFO 2025-10-07 00:29:14,458 llama_stack.core.stack:415 core: starting registry
refresh task
ERROR 2025-10-07 00:29:14,459 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: FireworksInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"fireworks_api_key": "<API_KEY>"},
or in the provider config.
WARNING 2025-10-07 00:29:14,461 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider fireworks: API key is
not set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"fireworks_api_key": "<API_KEY>"}, or in the
provider config.
ERROR 2025-10-07 00:29:14,462 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: TogetherInferenceAdapter.list_provider_model_ids() failed
with: Pass Together API Key in the header X-LlamaStack-Provider-Data as {
"together_api_key": <your api key>}
WARNING 2025-10-07 00:29:14,463 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider together: Pass
Together API Key in the header X-LlamaStack-Provider-Data as {
"together_api_key": <your api key>}
ERROR 2025-10-07 00:29:14,465 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: OpenAIInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"openai_api_key": "<API_KEY>"}, or
in the provider config.
WARNING 2025-10-07 00:29:14,466 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider openai: API key is not
set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"openai_api_key": "<API_KEY>"}, or in the
provider config.
INFO 2025-10-07 00:29:14,500 uvicorn.error:62 uncategorized: Application startup
complete.
ERROR 2025-10-07 00:29:14,502 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: AnthropicInferenceAdapter.list_provider_model_ids() failed
with: "Could not resolve authentication method. Expected either api_key or
auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers
to be explicitly omitted"
WARNING 2025-10-07 00:29:14,503 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider anthropic: "Could not
resolve authentication method. Expected either api_key or auth_token to be
set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly
omitted"
ERROR 2025-10-07 00:29:14,504 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: GeminiInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or
in the provider config.
WARNING 2025-10-07 00:29:14,506 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider gemini: API key is not
set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or in the
provider config.
ERROR 2025-10-07 00:29:14,507 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: GroqInferenceAdapter.list_provider_model_ids() failed with:
API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in
the provider config.
WARNING 2025-10-07 00:29:14,508 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider groq: API key is not
set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in the provider
config.
ERROR 2025-10-07 00:29:14,510 llama_stack.providers.utils.inference.openai_mixin:439
providers::utils: SambaNovaInferenceAdapter.list_provider_model_ids() failed
with: API key is not set. Please provide a valid API key in the provider data
header, e.g. x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"},
or in the provider config.
WARNING 2025-10-07 00:29:14,511 llama_stack.core.routing_tables.models:36
core::routing_tables: Model refresh failed for provider sambanova: API key is
not set. Please provide a valid API key in the provider data header, e.g.
x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"}, or in the
provider config.
INFO 2025-10-07 00:29:14,513 uvicorn.error:216 uncategorized: Uvicorn running on
http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
```
Tested the models endpoint with curl; it also works:
```
curl http://localhost:8321/v1/models
{"data":[{"identifier":"bedrock/meta.llama3-1-8b-instruct-v1:0","provider_resource_id":"meta.llama3-1-8b-instruct-v1:0","provider_id":"bedrock","type":"model","metadata":{},"model_type":"llm"},{"identifier":"bedrock/meta.llama3-1-70b-instruct-v1:0","provider_resource_id":"meta.llama3-1-70b-instruct-v1:0","provider_id":"bedrock","type":"model","metadata":{},"model_type":"llm"},{"identifier":"bedrock/meta.llama3-1-405b-instruct-v1:0","provider_resource_id":"meta.llama3-1-405b-instruct-v1:0","provider_id":"bedrock","type":"model","metadata":{},"model_type":"llm"},{"identifier":"nvidia/bigcode/starcoder2-7b","provider_resource_id":"bigcode/starcoder2-7b","provider_id":"nvidia","type":"model","metadata":{},"model_type":"llm"},{"identifier":"nvidia/meta/llama-3.3-70b-instruct","provider_resource_id":"meta/llama-3.3-70b-instruct","provider_id":"nvidia","type":"model","metadata":{},"model_type":"llm"},{"identifier":"nvidia/nvidia/llama-3.2-nv-embedqa-1b-v2","provider_resource_id":"nvidia/llama-3.2-nv-embedqa-1b-v2","provider_id":"nvidia","type":"model","metadata":{"embedding_dimension":2048,"context_length":8192},"model_type":"embedding"},{"identifier":"sentence-transformers/all-MiniLM-L6-v2","provider_resource_id":"all-MiniLM-L6-v2","provider_id":"sentence-transformers","type":"model","metadata":{"embedding_dimension":384},"model_type":"embedding"}]}%
```
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
# What does this PR do?
Have been running into flaky unit test failures:
5217035494
Fixing below:
1. Shutting down properly by cancelling any stale file batch tasks
running in the background (see the sketch after this list).
2. Also, using unique_kvstore_config so the tests don't use the same db path,
maintaining test isolation.
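As a rough illustration of the shutdown fix, here is a minimal sketch; the class, attribute, and method names (`FileBatchesMixin`, `_file_batch_tasks`, `shutdown`) are assumptions for illustration, not the actual implementation.
```python
import asyncio


class FileBatchesMixin:
    """Sketch of the shutdown fix; real class/attribute names are assumptions."""

    def __init__(self) -> None:
        self._file_batch_tasks: dict[str, asyncio.Task] = {}

    async def shutdown(self) -> None:
        tasks = list(self._file_batch_tasks.values())
        for task in tasks:
            if not task.done():
                task.cancel()
        # Let the cancellations settle, ignoring the resulting CancelledErrors.
        await asyncio.gather(*tasks, return_exceptions=True)
        self._file_batch_tasks.clear()
```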
## Test Plan
Ran unit test locally and CI
- Allow use of unavailable models on startup
- Add has_model method to ModelsRoutingTable for checking pre-registered
models
- Update check_model_availability to check model_store before provider
APIs (see the sketch below)
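A minimal sketch of the lookup order described above; the names follow the bullets (`check_model_availability`, `model_store`, `has_model`) plus an illustrative `list_provider_model_ids` fallback, and the body is not the actual implementation.
```python
# Sketch only: consult pre-registered models before asking the provider API.
async def check_model_availability(self, model_id: str) -> bool:
    # Models already registered in the model store count as available,
    # even if the provider is unreachable at startup.
    if await self.model_store.has_model(model_id):
        return True
    # Otherwise fall back to the provider's live model listing.
    provider_model_ids = await self.list_provider_model_ids()
    return model_id in provider_model_ids
```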
# What does this PR do?
## Test Plan
Start Llama Stack and point it at an unavailable vLLM:
```
VLLM_URL=https://my-unavailable-vllm/v1 MILVUS_DB_PATH=./milvus.db INFERENCE_MODEL=vllm uv run --with llama-stack llama stack build --distro starter --image-type venv --run
```
Llama Stack starts without crashing, only logging the error:
```
- provider_id: rag-runtime
toolgroup_id: builtin::rag
vector_dbs: []
version: 2
INFO 2025-10-07 06:40:41,804 llama_stack.providers.utils.inference.inference_store:74 inference: Write queue disabled for SQLite to avoid concurrency issues
INFO 2025-10-07 06:40:42,066 llama_stack.providers.utils.responses.responses_store:96 openai_responses: Write queue disabled for SQLite to avoid concurrency issues
ERROR 2025-10-07 06:40:58,882 llama_stack.providers.utils.inference.openai_mixin:436 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: Request timed out.
WARNING 2025-10-07 06:40:58,883 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider vllm: Request timed out.
[...]
INFO 2025-10-07 06:40:59,036 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
INFO 2025-10-07 06:41:04,064 openai._base_client:1618 uncategorized: Retrying request to /models in 0.398814 seconds
INFO 2025-10-07 06:41:09,497 openai._base_client:1618 uncategorized: Retrying request to /models in 0.781908 seconds
ERROR 2025-10-07 06:41:15,282 llama_stack.providers.utils.inference.openai_mixin:436 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: Request timed out.
WARNING 2025-10-07 06:41:15,283 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider vllm: Request timed out.
```
# What does this PR do?
Implements annotations for the `file_search` tool.
Also adds some logs and tests.
## How does this work?
1. **Citation Markers**: Models insert `<|file-id|>` tokens during
generation with instructions from search results
2. **Post-Processing**: Extract markers using regex to calculate
character positions and create `AnnotationFileCitation` objects (see the
sketch after this list)
3. **File Mapping**: Store filename metadata during vector store
operations for proper citation display
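To illustrate step 2, here is a minimal sketch of marker extraction and index calculation; the marker regex, helper name, and returned dictionary shape are assumptions for illustration, not the code from this PR.
```python
import re

# Assumed marker format: <|file-id|> tokens embedded in the generated text.
MARKER = re.compile(r"<\|(?P<file_id>[^|>]+)\|>")


def extract_citations(text: str) -> tuple[str, list[dict]]:
    """Strip citation markers and record where each one sat in the cleaned text."""
    annotations: list[dict] = []
    cleaned: list[str] = []
    cursor = 0   # position in the original text
    removed = 0  # characters removed so far
    for match in MARKER.finditer(text):
        cleaned.append(text[cursor:match.start()])
        annotations.append(
            {
                "type": "file_citation",
                "file_id": match.group("file_id"),
                # index in the cleaned text where the marker sat; the real
                # implementation may anchor it to the preceding character
                # (e.g. the sentence-ending period).
                "index": match.start() - removed,
            }
        )
        removed += match.end() - match.start()
        cursor = match.end()
    cleaned.append(text[cursor:])
    return "".join(cleaned), annotations
```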
## Example
This is the updated `quickstart.py` script, which uses the `extra_body`
to register the embedding model.
```python
import io

import requests
from openai import OpenAI

url = "https://www.paulgraham.com/greatwork.html"
model = "gpt-4o-mini"
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

vs = client.vector_stores.create(
    name="my_citations_db",
    extra_body={
        "embedding_model": "ollama/nomic-embed-text:latest",
        "embedding_dimension": 768,
    },
)

response = requests.get(url)
pseudo_file = io.BytesIO(str(response.content).encode("utf-8"))
file_id = client.files.create(file=(url, pseudo_file, "text/html"), purpose="assistants").id
client.vector_stores.files.create(vector_store_id=vs.id, file_id=file_id)

resp = client.responses.create(
    model=model,
    input="How do you do great work? Use our existing knowledge_search tool.",
    tools=[{"type": "file_search", "vector_store_ids": [vs.id]}],
    include=["file_search_call.results"],
)
print(resp)
```
<details>
<summary> Example of the full response </summary>
```python
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/vector_stores "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/files "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/vector_stores/vs_0f6f7e35-f48b-4850-8604-8117d9a50e0a/files "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/responses "HTTP/1.1 200 OK"
Response(id='resp-28f5793d-3272-4de3-81f6-8cbf107d5bcd', created_at=1759797954.0, error=None, incomplete_details=None, instructions=None, metadata=None, model='gpt-4o-mini', object='response', output=[ResponseFileSearchToolCall(id='call_xWtvEQETN5GNiRLLiBIDKntg', queries=['how to do great work tips'], status='completed', type='file_search_call', results=[Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.3722624322210302, text='\\\'re looking where few have looked before.<br /><br />One sign that you\\\'re suited for some kind of work is when you like\\neven the parts that other people find tedious or frightening.<br /><br />But fields aren\\\'t people; you don\\\'t owe them any loyalty. If in the\\ncourse of working on one thing you discover another that\\\'s more\\nexciting, don\\\'t be afraid to switch.<br /><br />If you\\\'re making something for people, make sure it\\\'s something\\nthey actually want. The best way to do this is to make something\\nyou yourself want. Write the story you want to read; build the tool\\nyou want to use. Since your friends probably have similar interests,\\nthis will also get you your initial audience.<br /><br />This <i>should</i> follow from the excitingness rule. Obviously the most\\nexciting story to write will be the one you want to read. The reason\\nI mention this case explicitly is that so many people get it wrong.\\nInstead of making what they want, they try to make what some\\nimaginary, more sophisticated audience wants. And once you go down\\nthat route, you\\\'re lost.\\n<font color=#dddddd>[<a href="#f6n"><font color=#dddddd>6</font></a>]</font><br /><br />There are a lot of forces that will lead you astray when you\\\'re\\ntrying to figure out what to work on. Pretentiousness, fashion,\\nfear, money, politics, other people\\\'s wishes, eminent frauds. But\\nif you stick to what you find genuinely interesting, you\\\'ll be proof\\nagainst all of them. If you\\\'re interested, you\\\'re not astray.<br /><br /><br /><br /><br /><br />\\nFollowing your interests may sound like a rather passive strategy,\\nbut in practice it usually means following them past all sorts of\\nobstacles. You usually have to risk rejection and failure. So it\\ndoes take a good deal of boldness.<br /><br />But while you need boldness, you don\\\'t usually need much planning.\\nIn most cases the recipe for doing great work is simply: work hard\\non excitingly ambitious projects, and something good will come of\\nit. Instead of making a plan and then executing it, you just try\\nto preserve certain invariants.<br /><br />The trouble with planning is that it only works for achievements\\nyou can describe in advance. You can win a gold medal or get rich\\nby deciding to as a child and then tenaciously pursuing that goal,\\nbut you can\\\'t discover natural selection that way.<br /><br />I think for most people who want to do great work, the right strategy\\nis not to plan too much. At each stage do whatever seems most\\ninteresting and gives you the best options for the future. I call\\nthis approach "staying upwind." This is how most people who\\\'ve done\\ngreat work seem to have done it.<br /><br /><br /><br /><br /><br />\\nEven when you\\\'ve found something exciting to work on, working on\\nit is not always straightforward. There will be times when some new\\nidea makes you leap out of bed in the morning and get straight to\\nwork. 
But there will also be plenty of times when things aren\\\'t\\nlike that.<br /><br />You don\\\'t just put out your sail and get blown forward by inspiration.\\nThere are headwinds and currents and hidden shoals. So there\\\'s a\\ntechnique to working, just as there is to sailing.<br /><br />For example, while you must work hard, it\\\'s possible to work too\\nhard, and if'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.2532794607643494, text=' with anyone who\\\'s genuinely interested. If they\\\'re\\nreally good at their work, then they probably have a hobbyist\\\'s\\ninterest in it, and hobbyists always want to talk about their\\nhobbies.<br /><br />It may take some effort to find the people who are really good,\\nthough. Doing great work has such prestige that in some places,\\nparticularly universities, there\\\'s a polite fiction that everyone\\nis engaged in it. And that is far from true. People within universities\\ncan\\\'t say so openly, but the quality of the work being done in\\ndifferent departments varies immensely. Some departments have people\\ndoing great work; others have in the past; others never have.<br /><br /><br /><br /><br /><br />\\nSeek out the best colleagues. There are a lot of projects that can\\\'t\\nbe done alone, and even if you\\\'re working on one that can be, it\\\'s\\ngood to have other people to encourage you and to bounce ideas off.<br /><br />Colleagues don\\\'t just affect your work, though; they also affect\\nyou. So work with people you want to become like, because you will.<br /><br />Quality is more important than quantity in colleagues. It\\\'s better\\nto have one or two great ones than a building full of pretty good\\nones. In fact it\\\'s not merely better, but necessary, judging from\\nhistory: the degree to which great work happens in clusters suggests\\nthat one\\\'s colleagues often make the difference between doing great\\nwork and not.<br /><br />How do you know when you have sufficiently good colleagues? In my\\nexperience, when you do, you know. Which means if you\\\'re unsure,\\nyou probably don\\\'t. But it may be possible to give a more concrete\\nanswer than that. Here\\\'s an attempt: sufficiently good colleagues\\noffer <i>surprising</i> insights. They can see and do things that you\\ncan\\\'t. So if you have a handful of colleagues good enough to keep\\nyou on your toes in this sense, you\\\'re probably over the threshold.<br /><br />Most of us can benefit from collaborating with colleagues, but some\\nprojects require people on a larger scale, and starting one of those\\nis not for everyone. If you want to run a project like that, you\\\'ll\\nhave to become a manager, and managing well takes aptitude and\\ninterest like any other kind of work. If you don\\\'t have them, there\\nis no middle path: you must either force yourself to learn management\\nas a second language, or avoid such projects.\\n<font color=#dddddd>[<a href="#f27n"><font color=#dddddd>27</font></a>]</font><br /><br /><br /><br /><br /><br />\\nHusband your morale. It\\\'s the basis of everything when you\\\'re working\\non ambitious projects. You have to nurture and protect it like a\\nliving organism.<br /><br />Morale starts with your view of life. 
You\\\'re more likely to do great\\nwork if you\\\'re an optimist, and more likely to if you think of\\nyourself as lucky than if you think of yourself as a victim.<br /><br />Indeed, work can to some extent protect you from your problems. If\\nyou choose work that\\\'s pure, its very difficulties will serve as a\\nrefuge from the difficulties of everyday life. If this is escapism,\\nit\\\'s a very productive form of it, and one that has been used by\\nsome of the greatest minds in history.<br /><br />Morale compounds via work: high morale helps you do good work, which\\nincreases your morale and helps you do even'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.1973485818164222, text=' your\\nability and interest can take you. And you can only answer that by\\ntrying.<br /><br />Many more people could try to do great work than do. What holds\\nthem back is a combination of modesty and fear. It seems presumptuous\\nto try to be Newton or Shakespeare. It also seems hard; surely if\\nyou tried something like that, you\\\'d fail. Presumably the calculation\\nis rarely explicit. Few people consciously decide not to try to do\\ngreat work. But that\\\'s what\\\'s going on subconsciously; they shy\\naway from the question.<br /><br />So I\\\'m going to pull a sneaky trick on you. Do you want to do great\\nwork, or not? Now you have to decide consciously. Sorry about that.\\nI wouldn\\\'t have done it to a general audience. But we already know\\nyou\\\'re interested.<br /><br />Don\\\'t worry about being presumptuous. You don\\\'t have to tell anyone.\\nAnd if it\\\'s too hard and you fail, so what? Lots of people have\\nworse problems than that. In fact you\\\'ll be lucky if it\\\'s the worst\\nproblem you have.<br /><br />Yes, you\\\'ll have to work hard. But again, lots of people have to\\nwork hard. And if you\\\'re working on something you find very\\ninteresting, which you necessarily will if you\\\'re on the right path,\\nthe work will probably feel less burdensome than a lot of your\\npeers\\\'.<br /><br />The discoveries are out there, waiting to be made. Why not by you?<br /><br /><br /><br /><br /><br /><br /><br /><br /><br />\\n<b>Notes</b><br /><br />[<a name="f1n"><font color=#000000>1</font></a>]\\nI don\\\'t think you could give a precise definition of what\\ncounts as great work. Doing great work means doing something important\\nso well that you expand people\\\'s ideas of what\\\'s possible. But\\nthere\\\'s no threshold for importance. It\\\'s a matter of degree, and\\noften hard to judge at the time anyway. So I\\\'d rather people focused\\non developing their interests rather than worrying about whether\\nthey\\\'re important or not. Just try to do something amazing, and\\nleave it to future generations to say if you succeeded.<br /><br />[<a name="f2n"><font color=#000000>2</font></a>]\\nA lot of standup comedy is based on noticing anomalies in\\neveryday life. "Did you ever notice...?" New ideas come from doing\\nthis about nontrivial things. Which may help explain why people\\\'s\\nreaction to a new idea is often the first half of laughing: Ha!<br /><br />[<a name="f3n"><font color=#000000>3</font></a>]\\nThat second qualifier is critical. 
If you\\\'re excited about\\nsomething most authorities discount, but you can\\\'t give a more\\nprecise explanation than "they don\\\'t get it," then you\\\'re starting\\nto drift into the territory of cranks.<br /><br />[<a name="f4n"><font color=#000000>4</font></a>]\\nFinding something to work on is not simply a matter of finding\\na match between the current version of you and a list of known\\nproblems. You\\\'ll often have to coevolve with the problem. That\\\'s\\nwhy it can sometimes be so hard to figure out what to work on. The\\nsearch space is huge. It\\\'s the cartesian product of all possible\\nt'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.1764591706535943, text='\\noptimistic, and even though one of the sources of their optimism\\nis ignorance, in this case ignorance can sometimes beat knowledge.<br /><br />Try to finish what you start, though, even if it turns out to be\\nmore work than you expected. Finishing things is not just an exercise\\nin tidiness or self-discipline. In many projects a lot of the best\\nwork happens in what was meant to be the final stage.<br /><br />Another permissible lie is to exaggerate the importance of what\\nyou\\\'re working on, at least in your own mind. If that helps you\\ndiscover something new, it may turn out not to have been a lie after\\nall.\\n<font color=#dddddd>[<a href="#f7n"><font color=#dddddd>7</font></a>]</font><br /><br /><br /><br /><br /><br />\\nSince there are two senses of starting work — per day and per\\nproject — there are also two forms of procrastination. Per-project\\nprocrastination is far the more dangerous. You put off starting\\nthat ambitious project from year to year because the time isn\\\'t\\nquite right. When you\\\'re procrastinating in units of years, you can\\nget a lot not done.\\n<font color=#dddddd>[<a href="#f8n"><font color=#dddddd>8</font></a>]</font><br /><br />One reason per-project procrastination is so dangerous is that it\\nusually camouflages itself as work. You\\\'re not just sitting around\\ndoing nothing; you\\\'re working industriously on something else. So\\nper-project procrastination doesn\\\'t set off the alarms that per-day\\nprocrastination does. You\\\'re too busy to notice it.<br /><br />The way to beat it is to stop occasionally and ask yourself: Am I\\nworking on what I most want to work on? When you\\\'re young it\\\'s ok\\nif the answer is sometimes no, but this gets increasingly dangerous\\nas you get older.\\n<font color=#dddddd>[<a href="#f9n"><font color=#dddddd>9</font></a>]</font><br /><br /><br /><br /><br /><br />\\nGreat work usually entails spending what would seem to most people\\nan unreasonable amount of time on a problem. You can\\\'t think of\\nthis time as a cost, or it will seem too high. You have to find the\\nwork sufficiently engaging as it\\\'s happening.<br /><br />There may be some jobs where you have to work diligently for years\\nat things you hate before you get to the good part, but this is not\\nhow great work happens. Great work happens by focusing consistently\\non something you\\\'re genuinely interested in. When you pause to take\\nstock, you\\\'re surprised how far you\\\'ve come.<br /><br />The reason we\\\'re surprised is that we underestimate the cumulative\\neffect of work. Writing a page a day doesn\\\'t sound like much, but\\nif you do it every day you\\\'ll write a book a year. That\\\'s the key:\\nconsistency. 
People who do great things don\\\'t get a lot done every\\nday. They get something done, rather than nothing.<br /><br />If you do work that compounds, you\\\'ll get exponential growth. Most\\npeople who do this do it unconsciously, but it\\\'s worth stopping to\\nthink about. Learning, for example, is an instance of this phenomenon:\\nthe more you learn about something, the easier it is to learn more.\\nGrowing an audience is another: the more fans you have, the more\\nnew fans they\\\'ll bring you.<br /><br />'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.174069664815369, text='\\ninside.<br /><br /><br /><br /><br /><br />Let\\\'s talk a little more about the complicated business of figuring\\nout what to work on. The main reason it\\\'s hard is that you can\\\'t\\ntell what most kinds of work are like except by doing them. Which\\nmeans the four steps overlap: you may have to work at something for\\nyears before you know how much you like it or how good you are at\\nit. And in the meantime you\\\'re not doing, and thus not learning\\nabout, most other kinds of work. So in the worst case you choose\\nlate based on very incomplete information.\\n<font color=#dddddd>[<a href="#f4n"><font color=#dddddd>4</font></a>]</font><br /><br />The nature of ambition exacerbates this problem. Ambition comes in\\ntwo forms, one that precedes interest in the subject and one that\\ngrows out of it. Most people who do great work have a mix, and the\\nmore you have of the former, the harder it will be to decide what\\nto do.<br /><br />The educational systems in most countries pretend it\\\'s easy. They\\nexpect you to commit to a field long before you could know what\\nit\\\'s really like. And as a result an ambitious person on an optimal\\ntrajectory will often read to the system as an instance of breakage.<br /><br />It would be better if they at least admitted it — if they admitted\\nthat the system not only can\\\'t do much to help you figure out what\\nto work on, but is designed on the assumption that you\\\'ll somehow\\nmagically guess as a teenager. They don\\\'t tell you, but I will:\\nwhen it comes to figuring out what to work on, you\\\'re on your own.\\nSome people get lucky and do guess correctly, but the rest will\\nfind themselves scrambling diagonally across tracks laid down on\\nthe assumption that everyone does.<br /><br />What should you do if you\\\'re young and ambitious but don\\\'t know\\nwhat to work on? What you should <i>not</i> do is drift along passively,\\nassuming the problem will solve itself. You need to take action.\\nBut there is no systematic procedure you can follow. When you read\\nbiographies of people who\\\'ve done great work, it\\\'s remarkable how\\nmuch luck is involved. They discover what to work on as a result\\nof a chance meeting, or by reading a book they happen to pick up.\\nSo you need to make yourself a big target for luck, and the way to\\ndo that is to be curious. Try lots of things, meet lots of people,\\nread lots of books, ask lots of questions.\\n<font color=#dddddd>[<a href="#f5n"><font color=#dddddd>5</font></a>]</font><br /><br />When in doubt, optimize for interestingness. Fields change as you\\nlearn more about them. What mathematicians do, for example, is very\\ndifferent from what you do in high school math classes. So you need\\nto give different types of work a chance to show you what they\\\'re\\nlike. 
But a field should become <i>increasingly</i> interesting as you\\nlearn more about it. If it doesn\\\'t, it\\\'s probably not for you.<br /><br />Don\\\'t worry if you find you\\\'re interested in different things than\\nother people. The stranger your tastes in interestingness, the\\nbetter. Strange tastes are often strong ones, and a strong taste\\nfor work means you\\\'ll be productive. And you\\\'re more likely to find\\nnew things if you'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.158095578895721, text='. Don\\\'t copy the manner of\\nan eminent 50 year old professor if you\\\'re 18, for example, or the\\nidiom of a Renaissance poem hundreds of years later.<br /><br />Some of the features of things you admire are flaws they succeeded\\ndespite. Indeed, the features that are easiest to imitate are the\\nmost likely to be the flaws.<br /><br />This is particularly true for behavior. Some talented people are\\njerks, and this sometimes makes it seem to the inexperienced that\\nbeing a jerk is part of being talented. It isn\\\'t; being talented\\nis merely how they get away with it.<br /><br />One of the most powerful kinds of copying is to copy something from\\none field into another. History is so full of chance discoveries\\nof this type that it\\\'s probably worth giving chance a hand by\\ndeliberately learning about other kinds of work. You can take ideas\\nfrom quite distant fields if you let them be metaphors.<br /><br />Negative examples can be as inspiring as positive ones. In fact you\\ncan sometimes learn more from things done badly than from things\\ndone well; sometimes it only becomes clear what\\\'s needed when it\\\'s\\nmissing.<br /><br /><br /><br /><br /><br />\\nIf a lot of the best people in your field are collected in one\\nplace, it\\\'s usually a good idea to visit for a while. It will\\nincrease your ambition, and also, by showing you that these people\\nare human, increase your self-confidence.\\n<font color=#dddddd>[<a href="#f26n"><font color=#dddddd>26</font></a>]</font><br /><br />If you\\\'re earnest you\\\'ll probably get a warmer welcome than you\\nmight expect. Most people who are very good at something are happy\\nto talk about it with anyone who\\\'s genuinely interested. If they\\\'re\\nreally good at their work, then they probably have a hobbyist\\\'s\\ninterest in it, and hobbyists always want to talk about their\\nhobbies.<br /><br />It may take some effort to find the people who are really good,\\nthough. Doing great work has such prestige that in some places,\\nparticularly universities, there\\\'s a polite fiction that everyone\\nis engaged in it. And that is far from true. People within universities\\ncan\\\'t say so openly, but the quality of the work being done in\\ndifferent departments varies immensely. Some departments have people\\ndoing great work; others have in the past; others never have.<br /><br /><br /><br /><br /><br />\\nSeek out the best colleagues. There are a lot of projects that can\\\'t\\nbe done alone, and even if you\\\'re working on one that can be, it\\\'s\\ngood to have other people to encourage you and to bounce ideas off.<br /><br />Colleagues don\\\'t just affect your work, though; they also affect\\nyou. So work with people you want to become like, because you will.<br /><br />Quality is more important than quantity in colleagues. It\\\'s better\\nto have one or two great ones than a building full of pretty good\\nones. 
In fact it\\\'s not merely better, but necessary, judging from\\nhistory: the degree to which great work happens in clusters suggests\\nthat one\\\'s colleagues often make the difference between doing great\\nwork and not.<br /><br />How do you know when you have sufficiently good colleagues? In my\\nexperience, when you do, you know. Which means if you\\\'re unsure,\\nyou probably don\\\'t. But it may be possible to give a more concrete\\nanswer than that. Here\\\'s an attempt: sufficiently good'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.1566747762241967, text=',\\nbut in practice it usually means following them past all sorts of\\nobstacles. You usually have to risk rejection and failure. So it\\ndoes take a good deal of boldness.<br /><br />But while you need boldness, you don\\\'t usually need much planning.\\nIn most cases the recipe for doing great work is simply: work hard\\non excitingly ambitious projects, and something good will come of\\nit. Instead of making a plan and then executing it, you just try\\nto preserve certain invariants.<br /><br />The trouble with planning is that it only works for achievements\\nyou can describe in advance. You can win a gold medal or get rich\\nby deciding to as a child and then tenaciously pursuing that goal,\\nbut you can\\\'t discover natural selection that way.<br /><br />I think for most people who want to do great work, the right strategy\\nis not to plan too much. At each stage do whatever seems most\\ninteresting and gives you the best options for the future. I call\\nthis approach "staying upwind." This is how most people who\\\'ve done\\ngreat work seem to have done it.<br /><br /><br /><br /><br /><br />\\nEven when you\\\'ve found something exciting to work on, working on\\nit is not always straightforward. There will be times when some new\\nidea makes you leap out of bed in the morning and get straight to\\nwork. But there will also be plenty of times when things aren\\\'t\\nlike that.<br /><br />You don\\\'t just put out your sail and get blown forward by inspiration.\\nThere are headwinds and currents and hidden shoals. So there\\\'s a\\ntechnique to working, just as there is to sailing.<br /><br />For example, while you must work hard, it\\\'s possible to work too\\nhard, and if you do that you\\\'ll find you get diminishing returns:\\nfatigue will make you stupid, and eventually even damage your health.\\nThe point at which work yields diminishing returns depends on the\\ntype. Some of the hardest types you might only be able to do for\\nfour or five hours a day.<br /><br />Ideally those hours will be contiguous. To the extent you can, try\\nto arrange your life so you have big blocks of time to work in.\\nYou\\\'ll shy away from hard tasks if you know you might be interrupted.<br /><br />It will probably be harder to start working than to keep working.\\nYou\\\'ll often have to trick yourself to get over that initial\\nthreshold. Don\\\'t worry about this; it\\\'s the nature of work, not a\\nflaw in your character. Work has a sort of activation energy, both\\nper day and per project. And since this threshold is fake in the\\nsense that it\\\'s higher than the energy required to keep going, it\\\'s\\nok to tell yourself a lie of corresponding magnitude to get over\\nit.<br /><br />It\\\'s usually a mistake to lie to yourself if you want to do great\\nwork, but this is one of the rare cases where it isn\\\'t. 
When I\\\'m\\nreluctant to start work in the morning, I often trick myself by\\nsaying "I\\\'ll just read over what I\\\'ve got so far." Five minutes\\nlater I\\\'ve found something that seems mistaken or incomplete, and\\nI\\\'m off.<br /><br />Similar techniques work for starting new projects. It\\\'s ok to lie\\nto yourself about how much work a project will entail, for example.\\nLots of great things began with someone saying "How hard could it\\nbe?"<br /><br />This is one case where the young have an advantage. They\\\'re more'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.1349744395573516, text=' audience\\nin the traditional sense. Either way it doesn\\\'t need to be big.\\nThe value of an audience doesn\\\'t grow anything like linearly with\\nits size. Which is bad news if you\\\'re famous, but good news if\\nyou\\\'re just starting out, because it means a small but dedicated\\naudience can be enough to sustain you. If a handful of people\\ngenuinely love what you\\\'re doing, that\\\'s enough.<br /><br />To the extent you can, avoid letting intermediaries come between\\nyou and your audience. In some types of work this is inevitable,\\nbut it\\\'s so liberating to escape it that you might be better off\\nswitching to an adjacent type if that will let you go direct.\\n<font color=#dddddd>[<a href="#f28n"><font color=#dddddd>28</font></a>]</font><br /><br />The people you spend time with will also have a big effect on your\\nmorale. You\\\'ll find there are some who increase your energy and\\nothers who decrease it, and the effect someone has is not always\\nwhat you\\\'d expect. Seek out the people who increase your energy and\\navoid those who decrease it. Though of course if there\\\'s someone\\nyou need to take care of, that takes precedence.<br /><br />Don\\\'t marry someone who doesn\\\'t understand that you need to work,\\nor sees your work as competition for your attention. If you\\\'re\\nambitious, you need to work; it\\\'s almost like a medical condition;\\nso someone who won\\\'t let you work either doesn\\\'t understand you,\\nor does and doesn\\\'t care.<br /><br />Ultimately morale is physical. You think with your body, so it\\\'s\\nimportant to take care of it. That means exercising regularly,\\neating and sleeping well, and avoiding the more dangerous kinds of\\ndrugs. Running and walking are particularly good forms of exercise\\nbecause they\\\'re good for thinking.\\n<font color=#dddddd>[<a href="#f29n"><font color=#dddddd>29</font></a>]</font><br /><br />People who do great work are not necessarily happier than everyone\\nelse, but they\\\'re happier than they\\\'d be if they didn\\\'t. In fact,\\nif you\\\'re smart and ambitious, it\\\'s dangerous <i>not</i> to be productive.\\nPeople who are smart and ambitious but don\\\'t achieve much tend to\\nbecome bitter.<br /><br /><br /><br /><br /><br />\\nIt\\\'s ok to want to impress other people, but choose the right people.\\nThe opinion of people you respect is signal. Fame, which is the\\nopinion of a much larger group you might or might not respect, just\\nadds noise.<br /><br />The prestige of a type of work is at best a trailing indicator and\\nsometimes completely mistaken. If you do anything well enough,\\nyou\\\'ll make it prestigious. 
So the question to ask about a type of\\nwork is not how much prestige it has, but how well it could be done.<br /><br />Competition can be an effective motivator, but don\\\'t let it choose\\nthe problem for you; don\\\'t let yourself get drawn into chasing\\nsomething just because others are. In fact, don\\\'t let competitors\\nmake you do anything much more specific than work harder.<br /><br />Curiosity is the best guide. Your curiosity never lies, and it knows\\nmore than you do about what\\\'s worth paying attention to.<br /><br /><br /><br /><br /><br />\\nNotice how often that word has come up. If you asked an oracle the\\nsecret to doing great work and the oracle replied'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.123214818076958, text='b\'<html><head><meta name="Keywords" content="" /><title>How to Do Great Work</title><!-- <META NAME="ROBOTS" CONTENT="NOODP"> -->\\n<link rel="shortcut icon" href="http://ycombinator.com/arc/arc.png">\\n</head><body bgcolor="#ffffff" background="https://s.turbifycdn.com/aah/paulgraham/bel-6.gif" text="#000000" link="#000099" vlink="#464646"><table border="0" cellspacing="0" cellpadding="0"><tr valign="top"><td><map name=118ab66adb24b4f><area shape=rect coords="0,0,67,21" href="index.html"><area shape=rect coords="0,21,67,42" href="articles.html"><area shape=rect coords="0,42,67,63" href="http://www.amazon.com/gp/product/0596006624"><area shape=rect coords="0,63,67,84" href="books.html"><area shape=rect coords="0,84,67,105" href="http://ycombinator.com"><area shape=rect coords="0,105,67,126" href="arc.html"><area shape=rect coords="0,126,67,147" href="bel.html"><area shape=rect coords="0,147,67,168" href="lisp.html"><area shape=rect coords="0,168,67,189" href="antispam.html"><area shape=rect coords="0,189,67,210" href="kedrosky.html"><area shape=rect coords="0,210,67,231" href="faq.html"><area shape=rect coords="0,231,67,252" href="raq.html"><area shape=rect coords="0,252,67,273" href="quo.html"><area shape=rect coords="0,273,67,294" href="rss.html"><area shape=rect coords="0,294,67,315" href="bio.html"><area shape=rect coords="0,315,67,336" href="https://twitter.com/paulg"><area shape=rect coords="0,336,67,357" href="https://mas.to/@paulg"></map><img src="https://s.turbifycdn.com/aah/paulgraham/bel-7.gif" width="69" height="357" usemap=#118ab66adb24b4f border="0" hspace="0" vspace="0" ismap /></td><td><img src="https://sep.turbifycdn.com/ca/Img/trans_1x1.gif" height="1" width="26" border="0" /></td><td><a href="index.html"><img src="https://s.turbifycdn.com/aah/paulgraham/bel-8.gif" width="410" height="45" border="0" hspace="0" vspace="0" /></a><br /><br /><table border="0" cellspacing="0" cellpadding="0" width="435"><tr valign="top"><td width="435"><img src="https://s.turbifycdn.com/aah/paulgraham/how-to-do-great-work-2.gif" width="185" height="18" border="0" hspace="0" vspace="0" alt="How to Do Great Work" /><br /><br /><font size="2" face="verdana">July 2023<br /><br />If you collected lists of techniques for doing great work in a lot\\nof different fields, what would the intersection look like? I decided\\nto find out'), Result(attributes={}, file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='file-a98ada68681c4fbeba2201e9c7213fc3', score=1.1193194369249235, text=' dangerous kinds of\\ndrugs. 
Running and walking are particularly good forms of exercise\\nbecause they\\\'re good for thinking.\\n<font color=#dddddd>[<a href="#f29n"><font color=#dddddd>29</font></a>]</font><br /><br />People who do great work are not necessarily happier than everyone\\nelse, but they\\\'re happier than they\\\'d be if they didn\\\'t. In fact,\\nif you\\\'re smart and ambitious, it\\\'s dangerous <i>not</i> to be productive.\\nPeople who are smart and ambitious but don\\\'t achieve much tend to\\nbecome bitter.<br /><br /><br /><br /><br /><br />\\nIt\\\'s ok to want to impress other people, but choose the right people.\\nThe opinion of people you respect is signal. Fame, which is the\\nopinion of a much larger group you might or might not respect, just\\nadds noise.<br /><br />The prestige of a type of work is at best a trailing indicator and\\nsometimes completely mistaken. If you do anything well enough,\\nyou\\\'ll make it prestigious. So the question to ask about a type of\\nwork is not how much prestige it has, but how well it could be done.<br /><br />Competition can be an effective motivator, but don\\\'t let it choose\\nthe problem for you; don\\\'t let yourself get drawn into chasing\\nsomething just because others are. In fact, don\\\'t let competitors\\nmake you do anything much more specific than work harder.<br /><br />Curiosity is the best guide. Your curiosity never lies, and it knows\\nmore than you do about what\\\'s worth paying attention to.<br /><br /><br /><br /><br /><br />\\nNotice how often that word has come up. If you asked an oracle the\\nsecret to doing great work and the oracle replied with a single\\nword, my bet would be on "curiosity."<br /><br />That doesn\\\'t translate directly to advice. It\\\'s not enough just to\\nbe curious, and you can\\\'t command curiosity anyway. But you can\\nnurture it and let it drive you.<br /><br />Curiosity is the key to all four steps in doing great work: it will\\nchoose the field for you, get you to the frontier, cause you to\\nnotice the gaps in it, and drive you to explore them. The whole\\nprocess is a kind of dance with curiosity.<br /><br /><br /><br /><br /><br />\\nBelieve it or not, I tried to make this essay as short as I could.\\nBut its length at least means it acts as a filter. If you made it\\nthis far, you must be interested in doing great work. And if so\\nyou\\\'re already further along than you might realize, because the\\nset of people willing to want to is small.<br /><br />The factors in doing great work are factors in the literal,\\nmathematical sense, and they are: ability, interest, effort, and\\nluck. Luck by definition you can\\\'t do anything about, so we can\\nignore that. And we can assume effort, if you do in fact want to\\ndo great work. So the problem boils down to ability and interest.\\nCan you find a kind of work where your ability and interest will\\ncombine to yield an explosion of new ideas?<br /><br />Here there are grounds for optimism. There are so many different\\nways to do great work, and even more that are still undiscovered.\\nOut of all those different types of work, the one you\\\'re most suited\\nfor is probably a pretty close match. Probably a comically close\\nmatch. 
It\\\'s just a question of finding it, and how far into it')]), ResponseOutputMessage(id='msg_3591ea71-8b35-4efd-a5ad-c1c250801971', content=[ResponseOutputText(annotations=[AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=361, type='file_citation'), AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=676, type='file_citation'), AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=948, type='file_citation'), AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=1259, type='file_citation'), AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=1520, type='file_citation'), AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=1747, type='file_citation')], text='To do great work, consider the following principles:\n\n1. **Follow Your Interests**: Engage in work that genuinely excites you. If you find an area intriguing, pursue it without being overly concerned about external pressures or norms. You should create things that you would want for yourself, as this often aligns with what others in your circle might want too.\n\n2. **Work Hard on Ambitious Projects**: Ambition is vital, but it should be tempered by genuine interest. Instead of detailed planning for the future, focus on exciting projects that keep your options open. This approach, known as "staying upwind," allows for adaptability and can lead to unforeseen achievements.\n\n3. **Choose Quality Colleagues**: Collaborating with talented colleagues can significantly affect your own work. Seek out individuals who offer surprising insights and whom you admire. The presence of good colleagues can elevate the quality of your work and inspire you.\n\n4. **Maintain High Morale**: Your attitude towards work and life affects your performance. Cultivating optimism and viewing yourself as lucky rather than victimized can boost your productivity. It’s essential to care for your physical health as well since it directly impacts your mental faculties and morale.\n\n5. **Be Consistent**: Great work often comes from cumulative effort. Daily progress, even in small amounts, can result in substantial achievements over time. Emphasize consistency and make the work engaging, as this reduces the perceived burden of hard labor.\n\n6. **Embrace Curiosity**: Curiosity is a driving force that can guide you in selecting fields of interest, pushing you to explore uncharted territories. 
Allow it to shape your work and continually seek knowledge and insights.\n\nBy focusing on these aspects, you can create an environment conducive to great work and personal fulfillment.', type='output_text', logprobs=None)], role='assistant', status='completed', type='message')], parallel_tool_calls=False, temperature=None, tool_choice=None, tools=None, top_p=None, background=None, conversation=None, max_output_tokens=None, max_tool_calls=None, previous_response_id=None, prompt=None, prompt_cache_key=None, reasoning=None, safety_identifier=None, service_tier=None, status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text'), verbosity=None), top_logprobs=None, truncation=None, usage=None, user=None)
In [34]: resp.output[1].content[0].text
Out[34]: 'To do great work, consider the following principles:\n\n1. **Follow Your Interests**: Engage in work that genuinely excites you. If you find an area intriguing, pursue it without being overly concerned about external pressures or norms. You should create things that you would want for yourself, as this often aligns with what others in your circle might want too.\n\n2. **Work Hard on Ambitious Projects**: Ambition is vital, but it should be tempered by genuine interest. Instead of detailed planning for the future, focus on exciting projects that keep your options open. This approach, known as "staying upwind," allows for adaptability and can lead to unforeseen achievements.\n\n3. **Choose Quality Colleagues**: Collaborating with talented colleagues can significantly affect your own work. Seek out individuals who offer surprising insights and whom you admire. The presence of good colleagues can elevate the quality of your work and inspire you.\n\n4. **Maintain High Morale**: Your attitude towards work and life affects your performance. Cultivating optimism and viewing yourself as lucky rather than victimized can boost your productivity. It’s essential to care for your physical health as well since it directly impacts your mental faculties and morale.\n\n5. **Be Consistent**: Great work often comes from cumulative effort. Daily progress, even in small amounts, can result in substantial achievements over time. Emphasize consistency and make the work engaging, as this reduces the perceived burden of hard labor.\n\n6. **Embrace Curiosity**: Curiosity is a driving force that can guide you in selecting fields of interest, pushing you to explore uncharted territories. Allow it to shape your work and continually seek knowledge and insights.\n\nBy focusing on these aspects, you can create an environment conducive to great work and personal fulfillment.'
```
</details>
The relevant output looks like this:
```python
>resp.output[1].content[0].annotations
[AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=361, type='file_citation'),
AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=676, type='file_citation'),
AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=948, type='file_citation'),
AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=1259, type='file_citation'),
AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=1520, type='file_citation'),
AnnotationFileCitation(file_id='file-a98ada68681c4fbeba2201e9c7213fc3', filename='https://www.paulgraham.com/greatwork.html', index=1747, type='file_citation')]
```
And
```python
In [144]: print(resp.output[1].content[0].text)
To do great work, consider the following principles:
1. **Follow Your Interests**: Engage in work that genuinely excites you.
If you find an area intriguing, pursue it without being overly concerned
about external pressures or norms. You should create things that you
would want for yourself, as this often aligns with what others in your
circle might want too.
2. **Work Hard on Ambitious Projects**: Ambition is vital, but it should
be tempered by genuine interest. Instead of detailed planning for the
future, focus on exciting projects that keep your options open. This
approach, known as "staying upwind," allows for adaptability and can
lead to unforeseen achievements.
3. **Choose Quality Colleagues**: Collaborating with talented colleagues
can significantly affect your own work. Seek out individuals who offer
surprising insights and whom you admire. The presence of good colleagues
can elevate the quality of your work and inspire you.
4. **Maintain High Morale**: Your attitude towards work and life affects
your performance. Cultivating optimism and viewing yourself as lucky
rather than victimized can boost your productivity. It’s essential to
care for your physical health as well since it directly impacts your
mental faculties and morale.
5. **Be Consistent**: Great work often comes from cumulative effort.
Daily progress, even in small amounts, can result in substantial
achievements over time. Emphasize consistency and make the work
engaging, as this reduces the perceived burden of hard labor.
6. **Embrace Curiosity**: Curiosity is a driving force that can guide
you in selecting fields of interest, pushing you to explore uncharted
territories. Allow it to shape your work and continually seek knowledge
and insights.
By focusing on these aspects, you can create an environment conducive to
great work and personal fulfillment.
```
And the code below outputs only periods, showing that the position/index behaves as expected: each annotation lands at the end of a sentence.
```python
print([resp.output[1].content[0].text[j.index] for j in
resp.output[1].content[0].annotations])
Out[41]: ['.', '.', '.', '.', '.', '.']
```
## Test Plan
Unit tests added.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
There is a lot of code in the agents API using the telemetry API and its
helpers without checking whether that API is even enabled.
This is the only API besides inference actively using telemetry code, so
after this change telemetry can be optional for the entire stack.
Resolves #3665
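A minimal sketch of the kind of guard being added; the attribute name `telemetry_api` and the surrounding method are assumptions for illustration, not the actual agents code.
```python
# Sketch only: agents code should no-op when the Telemetry API is not enabled
# in the stack, instead of assuming it is always present.
async def maybe_log_event(self, event) -> None:
    if self.telemetry_api is None:
        return  # telemetry disabled for this stack; skip logging entirely
    await self.telemetry_api.log_event(event)
```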
## Test Plan
existing agent tests.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Inference adapters can now configure `refresh_models: bool` to control
periodic model listing from their providers.
BREAKING CHANGE: the together inference adapter default changed. It
previously always refreshed; it now follows the config.
Addresses the "models: refresh" item in #3517
## Test Plan
ci w/ new tests
# What does this PR do?
When using a distro like starter, where a bunch of providers are disabled,
I should not see logs like the following emitted
```
in the provider data header, e.g. x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in the provider config.
WARNING 2025-10-07 08:38:52,117 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider sambanova: API key is not set. Please provide a valid
API key in the provider data header, e.g. x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"}, or in the provider config.
WARNING 2025-10-07 08:43:52,123 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider fireworks: Pass Fireworks API Key in the header
X-LlamaStack-Provider-Data as { "fireworks_api_key": <your api key>}
WARNING 2025-10-07 08:43:52,126 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider together: Pass Together API Key in the header
X-LlamaStack-Provider-Data as { "together_api_key": <your api key>}
WARNING 2025-10-07 08:43:52,129 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider openai: API key is not set. Please provide a valid API
key in the provider data header, e.g. x-llamastack-provider-data: {"openai_api_key": "<API_KEY>"}, or in the provider config.
WARNING 2025-10-07 08:43:52,132 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider anthropic: API key is not set. Please provide a valid
API key in the provider data header, e.g. x-llamastack-provider-data: {"anthropic_api_key": "<API_KEY>"}, or in the provider config.
WARNING 2025-10-07 08:43:52,136 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider gemini: API key is not set. Please provide a valid API
key in the provider data header, e.g. x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or in the provider config.
WARNING 2025-10-07 08:43:52,139 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider groq: API key is not set. Please provide a valid API key
in the provider data header, e.g. x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in the provider config.
WARNING 2025-10-07 08:43:52,142 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider sambanova: API key is not set. Please provide a valid
API key in the provider data header, e.g. x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"}, or in the provider config.
^CINFO 2025-10-07 08:46:11,996 llama_stack.core.utils.exec:75 core:
```
as WARNING. Switch them to DEBUG.
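For illustration only (the logger name is taken from the log lines above and the message is abbreviated), the change amounts to emitting these refresh failures at DEBUG:
```python
import logging

logger = logging.getLogger("llama_stack.core.routing_tables.models")

# previously a warning; now the refresh failure only shows up when debug logging is enabled
logger.debug("Model refresh failed for provider %s: %s", "groq", "API key is not set")
```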
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Updates pyproject.toml dependencies to fix vector processing
compatibility issues.
Closes: #3495
## Test Plan
Tested the llama stack server with the faiss vector database:
1. Built and ran the server: `llama stack build --distro starter --image-type venv --image-name llamastack-faiss`
2. Tested file upload: successfully uploaded a PDF via /v1/openai/v1/files
3. Tested vector operations:
- Created vector store with faiss backend
- Added PDF to vector store
- Performed semantic search queries
# What does this PR do?
Sorry @mattf, I thought I could close the other PR and reopen it, but I didn't have the option to reopen it. I just didn't want it to keep notifying maintainers while I made other commits for testing.
Continuation of: https://github.com/llamastack/llama-stack/pull/3641
PR fixes Runpod Adapter
https://github.com/llamastack/llama-stack/issues/3517
## What I fixed from before:
1. Made it all OpenAI-compatible.
2. Fixed up the class, since OpenAIMixin had a couple of changes with the pydantic base model.
3. Tested that we can dynamically find models and use the resulting identifier to make requests:
```bash
curl -X GET \
-H "Content-Type: application/json" \
"http://localhost:8321/v1/models"
```
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# RunPod Provider Quick Start
## Prerequisites
- Python 3.10+
- Git
- RunPod API token
## Setup for Development
```bash
# 1. Clone and enter the repository
cd <path-to-llama-stack-repo>
# 2. Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate
# 3. Remove any existing llama-stack installation
pip uninstall llama-stack llama-stack-client -y
# 4. Install llama-stack in development mode
pip install -e .
# 5. Build using local development code (found this through the Discord)
LLAMA_STACK_DIR=. llama stack build
# When prompted during build:
# - Name: runpod-dev
# - Image type: venv
# - Inference provider: remote::runpod
# - Safety provider: "llama-guard"
# - Other providers: first defaults
```
## Configure the Stack
The RunPod adapter automatically discovers models from your endpoint via the `/v1/models` API.
No manual model configuration is required - just set your environment variables.
## Run the Server
### Important: Use the Build-Created Virtual Environment
```bash
# Exit the development venv if you're in it
deactivate
# Activate the build-created venv (NOT .venv)
cd <path-to-llama-stack-repo>
source llamastack-runpod-dev/bin/activate
```
### For Qwen3-32B-AWQ Public Endpoint (Recommended)
```bash
# Set environment variables
export RUNPOD_URL="https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1"
export RUNPOD_API_TOKEN="your_runpod_api_key"
# Start server
llama stack run
~/.llama/distributions/llamastack-runpod-dev/llamastack-runpod-dev-run.yaml
```
## Quick Test
### 1. List Available Models (Dynamic Discovery)
First, check which models are available on your RunPod endpoint:
```bash
curl -X GET \
-H "Content-Type: application/json" \
"http://localhost:8321/v1/models"
```
**Example Response:**
```json
{
"data": [
{
"identifier": "qwen3-32b-awq",
"provider_resource_id": "Qwen/Qwen3-32B-AWQ",
"provider_id": "runpod",
"type": "model",
"metadata": {},
"model_type": "llm"
}
]
}
```
**Note:** Use the `identifier` value from the response above in your requests below.
### 2. Chat Completion (Non-streaming)
Replace `qwen3-32b-awq` with your model identifier from step 1:
```bash
curl -X POST http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-32b-awq",
"messages": [{"role": "user", "content": "Hello, count to 3"}],
"stream": false
}'
```
### 3. Chat Completion (Streaming)
```bash
curl -X POST http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-32b-awq",
"messages": [{"role": "user", "content": "Count to 5"}],
"stream": true
}'
```
**Clean streaming output:**
```bash
curl -N -X POST http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "qwen3-32b-awq", "messages": [{"role": "user", "content":
"Count to 5"}], "stream": true}' \
2>/dev/null | while read -r line; do
  echo "$line" | grep "^data: " | sed 's/^data: //' | jq -r '.choices[0].delta.content // empty' 2>/dev/null
done
```
**Expected Output:**
```
1
2
3
4
5
```
# What does this PR do?
```
WARNING 2025-10-06 12:01:45,137 root:266 uncategorized: Unknown logging
category: tokenizer_utils. Falling back to default 'root' level: 20
```
## Test Plan
# What does this PR do?
* Cleans up API docstrings for better documentation rendering
<img width="2346" height="1126" alt="image"
src="https://github.com/user-attachments/assets/516b09a1-2d5b-4614-a3a9-13431fc21fc1"
/>
## Test Plan
* Manual testing
---------
Signed-off-by: Doug Edgar <dedgar@redhat.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
Co-authored-by: Doug Edgar <dedgar@redhat.com>
Co-authored-by: Christian Zaccaria <73656840+ChristianZaccaria@users.noreply.github.com>
Co-authored-by: Anastas Stoyanovsky <contact@anastas.eu>
Co-authored-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Young Han <110819238+seyeong-han@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
- implement get_api_key instead of relying on
LiteLLMOpenAIMixin.get_api_key (a sketch of this pattern follows the list)
- remove use of LiteLLMOpenAIMixin
- add default initialize/shutdown methods to OpenAIMixin
- remove __init__s to allow proper pydantic construction
- remove dead code from vllm adapter and associated / duplicate unit
tests
- update vllm adapter to use openaimixin for model registration
- remove ModelRegistryHelper from fireworks & together adapters
- remove Inference from nvidia adapter
- complete type hints on embedding_model_metadata
- allow extra fields on OpenAIMixin, for model_store, __provider_id__,
etc
- new recordings for ollama
- enhance the list models error handling
- update cerebras (remove cerebras-cloud-sdk) and anthropic (custom
model listing) inference adapters
- parametrized test_inference_client_caching
- remove cerebras, databricks, fireworks, together from blanket mypy
exclude
- removed unnecessary litellm deps
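A minimal sketch of the first item above, using stand-in classes rather than the real `OpenAIMixin` and adapter config (all names and fields here are illustrative):
```python
from pydantic import BaseModel, SecretStr


class ExampleAdapterConfig(BaseModel):
    api_key: SecretStr = SecretStr("sk-example")  # placeholder key


class ExampleAdapter(BaseModel):
    """Stand-in for an adapter that would mix in OpenAIMixin."""

    config: ExampleAdapterConfig = ExampleAdapterConfig()

    def get_api_key(self) -> str:
        # the adapter supplies its own key lookup instead of relying on LiteLLMOpenAIMixin
        return self.config.api_key.get_secret_value()


print(ExampleAdapter().get_api_key())
```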
## Test Plan
ci
# What does this PR do?
When using the providers.d method of installation, users could hand-craft their AdapterSpecs to share overlapping code, meaning one repo could contain both an inline and a remote impl. Installing a provider via module currently does not allow that, since each repo may only have one `get_provider_spec` method returning a single spec.
This adds an optional way for `get_provider_spec` to return a list of `ProviderSpec`, where each entry can be either an inline or a remote impl.
Note: the `adapter_type` in `get_provider_spec` MUST match the
`provider_type` in the build/run yaml for this to work.
Resolves #3226
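A rough sketch of the new shape, using stand-in dataclasses because the real `InlineProviderSpec`/`RemoteProviderSpec` constructors take more arguments than shown here:
```python
from dataclasses import dataclass


# stand-ins for llama_stack's InlineProviderSpec / RemoteProviderSpec
@dataclass
class FakeInlineSpec:
    provider_type: str


@dataclass
class FakeRemoteSpec:
    adapter_type: str


def get_provider_spec() -> list:
    # a single external repo can now expose both an inline and a remote impl;
    # adapter_type must match the provider_type used in the build/run yaml
    return [
        FakeInlineSpec(provider_type="inline::my-provider"),
        FakeRemoteSpec(adapter_type="my-provider"),
    ]
```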
## Test Plan
once this merges we need to re-enable the external provider test and
account for this functionality. Work needs to be done in the external
provider repos to support this functionality.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
https://github.com/llamastack/llama-stack/pull/3462 allows using uvicorn
to start the llama stack server, which supports spawning multiple workers.
This PR enables launching more than one worker from `llama stack run` (the
parameter will be added in a follow-up PR; this one focuses on simplifying)
by removing the old way of launching the stack server and consolidating on
launching via uvicorn.run only.
## Test Plan
ran `llama stack run starter`
CI
Bumps [pandas](https://github.com/pandas-dev/pandas) from 2.3.1 to
2.3.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pandas-dev/pandas/releases">pandas's
releases</a>.</em></p>
<blockquote>
<h2>Pandas 2.3.3</h2>
<p>We are pleased to announce the release of pandas 2.3.3.
This release includes some improvements and fixes to the future string
data type (preview feature for the upcoming pandas 3.0). We recommend
that all users upgrade to this version.</p>
<p>See the <a
href="https://pandas.pydata.org/pandas-docs/version/2.3/whatsnew/v2.3.3.html">full
whatsnew</a> for a list of all the changes.
Pandas 2.3.3 supports Python 3.9 and higher, and is the first release to
support Python 3.14.</p>
<p>The release will be available on the conda-forge channel:</p>
<pre><code>conda install pandas --channel conda-forge
</code></pre>
<p>Or via PyPI:</p>
<pre><code>python3 -m pip install --upgrade pandas
</code></pre>
<p>Please report any issues with the release on the <a
href="https://github.com/pandas-dev/pandas/issues">pandas issue
tracker</a>.</p>
<p>Thanks to all the contributors who made this release possible.</p>
<h2>Pandas 2.3.2</h2>
<p>We are pleased to announce the release of pandas 2.3.2.
This release includes some improvements and fixes to the future string
data type (preview feature for the upcoming pandas 3.0). We recommend
that all users upgrade to this version.</p>
<p>See the <a
href="https://pandas.pydata.org/pandas-docs/version/2.3/whatsnew/v2.3.2.html">full
whatsnew</a> for a list of all the changes.
Pandas 2.3.2 supports Python 3.9 and higher.</p>
<p>The release will be available on the conda-forge channel:</p>
<pre><code>conda install pandas --channel conda-forge
</code></pre>
<p>Or via PyPI:</p>
<pre><code>python3 -m pip install --upgrade pandas
</code></pre>
<p>Please report any issues with the release on the <a
href="https://github.com/pandas-dev/pandas/issues">pandas issue
tracker</a>.</p>
<p>Thanks to all the contributors who made this release possible.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9c8bc3e551"><code>9c8bc3e</code></a>
RLS: 2.3.3</li>
<li><a
href="6aa788a00b"><code>6aa788a</code></a>
[backport 2.3.x] DOC: prepare 2.3.3 whatsnew notes for release (<a
href="https://redirect.github.com/pandas-dev/pandas/issues/62499">#62499</a>)
(<a
href="https://redirect.github.com/pandas-dev/pandas/issues/62508">#62508</a>)</li>
<li><a
href="b64f0df403"><code>b64f0df</code></a>
[backport 2.3.x] BUG: avoid validation error for ufunc with
string[python] ar...</li>
<li><a
href="058eb2b0ed"><code>058eb2b</code></a>
[backport 2.3.x] BUG: String[pyarrow] comparison with mixed object (<a
href="https://redirect.github.com/pandas-dev/pandas/issues/62424">#62424</a>)
(...</li>
<li><a
href="2ca088daef"><code>2ca088d</code></a>
[backport 2.3.x] DEPR: remove the Period resampling deprecation (<a
href="https://redirect.github.com/pandas-dev/pandas/issues/62480">#62480</a>)
(<a
href="https://redirect.github.com/pandas-dev/pandas/issues/62">#62</a>...</li>
<li><a
href="92bf98f623"><code>92bf98f</code></a>
[backport 2.3.x] BUG: fix .str.isdigit to honor unicode superscript for
older...</li>
<li><a
href="e57c7d6a22"><code>e57c7d6</code></a>
Backport PR <a
href="https://redirect.github.com/pandas-dev/pandas/issues/62452">#62452</a>
on branch 2.3.x (TST: Adjust tests for numexpr 2.13) (<a
href="https://redirect.github.com/pandas-dev/pandas/issues/62454">#62454</a>)</li>
<li><a
href="e0fe9a03c9"><code>e0fe9a0</code></a>
Backport to 2.3.x: REGR: from_records not initializing subclasses
properly (#...</li>
<li><a
href="23a1085e64"><code>23a1085</code></a>
BUG: improve future warning for boolean operations with missaligned
indexes (...</li>
<li><a
href="61136969fb"><code>6113696</code></a>
Backport PR <a
href="https://redirect.github.com/pandas-dev/pandas/issues/62396">#62396</a>
on branch 2.3.x (PKG/DOC: indicate Python 3.14 support in ...</li>
<li>Additional commits viewable in <a
href="https://github.com/pandas-dev/pandas/compare/v2.3.1...v2.3.3">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.7.0 to 6.8.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d0cc045d04"><code>d0cc045</code></a>
Always show prune cache output (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/597">#597</a>)</li>
<li><a
href="2841f9f5c1"><code>2841f9f</code></a>
Bump zizmorcore/zizmor-action from 0.1.2 to 0.2.0 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/571">#571</a>)</li>
<li><a
href="e554b93b80"><code>e554b93</code></a>
Add **/*.py.lock to cache-dependency-glob (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/590">#590</a>)</li>
<li><a
href="c7d85d9988"><code>c7d85d9</code></a>
chore: update known versions for 0.8.20</li>
<li><a
href="07f2cb5db9"><code>07f2cb5</code></a>
persist credentials for version update (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/584">#584</a>)</li>
<li><a
href="208b0c0ee4"><code>208b0c0</code></a>
README.md: Fix Python versions and update checkout action (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/572">#572</a>)</li>
<li>See full diff in <a
href="b75a909f75...d0cc045d04">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [requests](https://github.com/psf/requests) from 2.32.4 to 2.32.5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/psf/requests/releases">requests's
releases</a>.</em></p>
<blockquote>
<h2>v2.32.5</h2>
<h2>2.32.5 (2025-08-18)</h2>
<p><strong>Bugfixes</strong></p>
<ul>
<li>The SSLContext caching feature originally introduced in 2.32.0 has
created
a new class of issues in Requests that have had negative impact across a
number
of use cases. The Requests team has decided to revert this feature as
long term
maintenance of it is proving to be unsustainable in its current
iteration.</li>
</ul>
<p><strong>Deprecations</strong></p>
<ul>
<li>Added support for Python 3.14.</li>
<li>Dropped support for Python 3.8 following its end of support.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/psf/requests/blob/main/HISTORY.md">requests's
changelog</a>.</em></p>
<blockquote>
<h2>2.32.5 (2025-08-18)</h2>
<p><strong>Bugfixes</strong></p>
<ul>
<li>The SSLContext caching feature originally introduced in 2.32.0 has
created
a new class of issues in Requests that have had negative impact across a
number
of use cases. The Requests team has decided to revert this feature as
long term
maintenance of it is proving to be unsustainable in its current
iteration.</li>
</ul>
<p><strong>Deprecations</strong></p>
<ul>
<li>Added support for Python 3.14.</li>
<li>Dropped support for Python 3.8 following its end of support.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b25c87d7cb"><code>b25c87d</code></a>
v2.32.5</li>
<li><a
href="131e506079"><code>131e506</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/7010">#7010</a>
from psf/dependabot/github_actions/actions/checkout-...</li>
<li><a
href="b336cb2bc6"><code>b336cb2</code></a>
Bump actions/checkout from 4.2.0 to 5.0.0</li>
<li><a
href="46e939b552"><code>46e939b</code></a>
Update publish workflow to use <code>artifact-id</code> instead of
<code>name</code></li>
<li><a
href="4b9c546aa3"><code>4b9c546</code></a>
Merge pull request <a
href="https://redirect.github.com/psf/requests/issues/6999">#6999</a>
from psf/dependabot/github_actions/step-security/har...</li>
<li><a
href="7618dbef01"><code>7618dbe</code></a>
Bump step-security/harden-runner from 2.12.0 to 2.13.0</li>
<li><a
href="2edca11103"><code>2edca11</code></a>
Add support for Python 3.14 and drop support for Python 3.8 (<a
href="https://redirect.github.com/psf/requests/issues/6993">#6993</a>)</li>
<li><a
href="fec96cd597"><code>fec96cd</code></a>
Update Makefile rules (<a
href="https://redirect.github.com/psf/requests/issues/6996">#6996</a>)</li>
<li><a
href="d58d8aa2f4"><code>d58d8aa</code></a>
docs: clarify timeout parameter uses seconds in Session.request (<a
href="https://redirect.github.com/psf/requests/issues/6994">#6994</a>)</li>
<li><a
href="91a3eabd3d"><code>91a3eab</code></a>
Bump github/codeql-action from 3.28.5 to 3.29.0</li>
<li>Additional commits viewable in <a
href="https://github.com/psf/requests/compare/v2.32.4...v2.32.5">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Uses test_id in request hashes and test-scoped subdirectories to prevent
cross-test contamination. Model list endpoints exclude test_id to enable
merging recordings from different servers.
Additionally, this PR adds a `record-if-missing` mode (which we will use
instead of `record`, which records everything); this is very useful.
🤖 Co-authored with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
# What does this PR do?
Added missing configuration files
## Test Plan
run ./scripts/telemetry/setup_telemetry.sh
```
OTEL_SERVICE_NAME=llama_stack OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 TELEMETRY_SINKS=otel_trace,otel_metric uv run --with llama-stack llama stack build --distro=starter --image-type=venv --run
```
Navigate to grafana localhost:3000, query metrics and traces
IDs are now deterministic hashes based on request content, and
timestamps are normalized to constants, eliminating spurious changes
when re-recording tests.
## Changes
- Updated `inference_recorder.py` to normalize IDs and timestamps during
recording
- Added `scripts/normalize_recordings.py` utility to re-normalize
existing recordings
- Created documentation in `tests/integration/recordings/README.md`
- Normalized 350 existing recording files
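A minimal sketch of the normalization idea described above (not the recorder's actual code): hash the request body so that re-recording the same request always produces the same ID.
```python
import hashlib
import json


def deterministic_id(request_body: dict, prefix: str = "rec") -> str:
    digest = hashlib.sha256(json.dumps(request_body, sort_keys=True).encode()).hexdigest()[:12]
    return f"{prefix}-{digest}"


# the same request always hashes to the same ID, so re-recording stays diff-free
print(deterministic_id({"model": "llama3", "messages": [{"role": "user", "content": "hi"}]}))
```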
# What does this PR do?
`uv add "weaviate-client>=4.16.4" --group unit`
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
## Summary
Introduce `ExtraBodyField` annotation to enable parameters that arrive
via extra_body in client SDKs but are accessible server-side with full
typing.
These parameters are documented in OpenAPI specs under
**`x-llama-stack-extra-body-params`** but excluded from generated SDK
signatures.
Add `shields` parameter to `create_openai_response` as the first
implementation using this pattern.
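From the client side this looks roughly like the following (endpoint, model, and shield ID are placeholders); the OpenAI SDK forwards unknown parameters through `extra_body`:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")

# "shields" is not part of the generated SDK signature, so it travels via extra_body
response = client.responses.create(
    model="openai/gpt-4o",
    input="Say hello.",
    extra_body={"shields": ["llama-guard"]},
)
print(response.output_text)
```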
## Test Plan
- added an integration test which checks that shields parameter passed
via extra_body reaches server implementation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
This PR adds a comment-triggered GitHub Actions workflow that allows
running pre-commit hooks on-demand for any pull request. When someone
comments `@github-actions run precommit` on a PR, the bot automatically
runs all pre-commit hooks and commits any formatting or linting fixes
directly to the PR branch.
The implementation uses a secure two-workflow approach: a trigger
workflow validates permissions and dispatches to an execution workflow
that runs pre-commit in a privileged context. This works safely for both
same-repo and fork PRs, with permission checks ensuring only PR authors
or repository collaborators can trigger the bot.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
# What does this PR do?
on the path to maintainable impls of inference providers. make all
configs instances of RemoteInferenceProviderConfig.
## Test Plan
ci
# What does this PR do?
Initial implementation for `Conversations` and `ConversationItems` using
`AuthorizedSqlStore` with endpoints to:
- CREATE
- UPDATE
- GET/RETRIEVE/LIST
- DELETE
Set `level=LLAMA_STACK_API_V1`.
NOTE: This does not currently incorporate changes for Responses, that'll
be done in a subsequent PR.
Closes https://github.com/llamastack/llama-stack/issues/3235
## Test Plan
- Unit tests
- Integration tests
Also comparison of [OpenAPI spec for OpenAI
API](https://github.com/openai/openai-openapi/tree/manual_spec)
```bash
oasdiff breaking --fail-on ERR docs/static/llama-stack-spec.yaml https://raw.githubusercontent.com/openai/openai-openapi/refs/heads/manual_spec/openapi.yaml --strip-prefix-base "/v1/openai/v1" \
--match-path '(^/v1/openai/v1/conversations.*|^/conversations.*)'
```
Note: I still have some uncertainty about this. I borrowed this info from
@cdoern on https://github.com/llamastack/llama-stack/pull/3514 but need
to spend more time to confirm it's working; at the moment it suggests it is.
UPDATE on `oasdiff`: I investigated the OpenAI spec further and it looks
like the spec currently does not list Conversations, so that analysis is
useless. Noting for future reference.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
now that we consolidated the providerspec types and got rid of
`AdapterSpec`, adjust external.md
BREAKING CHANGE: external providers must update their
`get_provider_spec` function to use `RemoteProviderSpec` properly
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
remove unused chat_completion implementations
vllm features ported:
- requires max_tokens be set, use config value
- set tool_choice to none if no tools provided
## Test Plan
ci
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- This PR implements keyword and hybrid search for Weaviate DB based on
its inbuilt functions.
- Added fixtures to conftest.py for Weaviate.
- Enabled integration tests for remote Weaviate on all 3 search modes.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #3010
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Unit tests and integration tests should pass on this PR.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Addresses Issue #3271 - "Starting LLS server locally on a terminal with
120 chars width results in an output with empty lines".
This removes the specific 150-character width limit specified for the
Console; the terminal width is now auto-detected instead. The formatting
of Console output is now consistent across different sizes of terminal
windows.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #3271
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Launching the server with several different sizes of terminal windows
results in Console output without unexpected spacing. e.g. `python -m
llama_stack.core.server.server /tmp/run.yaml --port 8321`
---------
Signed-off-by: Doug Edgar <dedgar@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
add ModelsProtocolPrivate methods to OpenAIMixin
this will allow providers using OpenAIMixin to use a common interface
## Test Plan
ci w/ new tests
# What does this PR do?
Closes #3268, closes #3498
When resuming from a previous response ID, we currently attempt to convert
the stored responses input back into chat completion messages, which is
not always possible, e.g. for tool calls, where some data is lost once
converted from chat completion message to responses input format.
This PR stores the chat completion messages that correspond to the
_last_ call to chat completion, which is sufficient to be resumed from
in the next responses API call, where we load these saved messages and
skip conversion entirely.
Separate issue to optimize storage:
https://github.com/llamastack/llama-stack/issues/3646
## Test Plan
existing CI tests
This is a sweeping change to clean up some gunk around our "Tool"
definitions.
First, we had two types, `Tool` and `ToolDef`. The first of these was a
"Resource" type for the registry, but we stopped registering tools
inside the Registry long ago (and only registered ToolGroups). The
latter was for specifying tools for the Agents API. This PR removes the
former and adds an optional `toolgroup_id` field to the latter.
Secondly, as pointed out by @bbrowning in
https://github.com/llamastack/llama-stack/pull/3003#issuecomment-3245270132,
we were doing a lossy conversion from a full JSON schema from the MCP
tool specification into our ToolDefinition to send it to the model.
There is no necessity to do this -- we ourselves aren't doing any
execution at all but merely passing it to the chat completions API which
supports this. By doing this (and by doing it poorly), we encountered
limitations like not supporting array items, or not resolving $refs,
etc.
To fix this, we replaced the `parameters` field by `{ input_schema,
output_schema }` which can be full blown JSON schemas.
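Illustrative shape only (field names other than `input_schema`/`output_schema` and all values are made up), showing a tool definition that carries full JSON Schemas instead of the old `parameters` list:
```python
tool_def = {
    "name": "list_files",
    "description": "List files under the given paths",
    "input_schema": {
        "type": "object",
        "properties": {
            # array items and other full-JSON-Schema constructs can now pass through untouched
            "paths": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["paths"],
    },
    "output_schema": {
        "type": "object",
        "properties": {"files": {"type": "array", "items": {"type": "string"}}},
    },
}
```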
Finally, there were some types in our llama-related chat format
conversion which needed some cleanup. We are taking this opportunity to
clean those up.
This PR is a substantial breaking change to the API. However, given our
window for introducing breaking changes, this suits us just fine. I will
be landing a concurrent `llama-stack-client` change as well since API
shapes are changing.
# What does this PR do?
* Adds stainless-llama-stack-spec.yaml for Stainless client generation,
which comprises stable + experimental APIs
## Test Plan
* Manual generation
## Description
Currently, the docs page has the home book opened by default. This PR
updates the .ts so that the sidebar books are collapsed when you first
open the webpage
# What does this PR do?
this was broken by #3631, re-enable this ability by only using oasdiff
when .skip != 'true'
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
* Updates code snippets for Dell distribution, fixing specific user home
directory in code (replacing with $HOME) and updates docker instructions
to use `docker` instead of `podman`.
## Test Plan
N.A.
Co-authored-by: Connor Hack <connorhack@fb.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Spammy
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
n/a
# What does this PR do?
The LiteLLMOpenAIMixin provides support for reading the API key from provider
data (headers users send).
This adds the same functionality to the OpenAIMixin.
This is infrastructure for migrating providers.
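For illustration (using httpx and a made-up key value), provider data arrives as a JSON-encoded header, matching the `x-llamastack-provider-data` format shown in the warning logs earlier in this document:
```python
import json

import httpx

headers = {"x-llamastack-provider-data": json.dumps({"groq_api_key": "gsk-example"})}
resp = httpx.get("http://localhost:8321/v1/models", headers=headers)
print(resp.status_code)
```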
## Test Plan
ci w/ new tests
# What does this PR do?
Adds supplementary static content to root API spec pages. This is useful for giving context behind a specific API group, adding information on supported features or work in progress, etc.
This PR introduces supplementary information for Agents (experimental, deprecated) and Responses (stable) APIs.
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Documentation server renders rich static content for the Agents API group:

<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
First step towards cleaning up the API reference section of the docs.
- Separates API reference into 3 sections: stable (`v1`), experimental (`v1alpha` and `v1beta`), and deprecated (`deprecated=True`)
- Each section is accessible via the dropdown menu and `docs/api-overview`
<img width="1237" height="321" alt="Screenshot 2025-09-30 at 5 47 30 PM" src="https://github.com/user-attachments/assets/fe0e498c-b066-46ed-a48e-4739d3b6724c" />
<img width="860" height="510" alt="Screenshot 2025-09-30 at 5 47 49 PM" src="https://github.com/user-attachments/assets/a92a8d8c-94bf-42d5-9f5b-b47bb2b14f9c" />
- Deprecated APIs: Added styling to the sidebar, and a notice on the endpoint pages
<img width="867" height="428" alt="Screenshot 2025-09-30 at 5 47 43 PM" src="https://github.com/user-attachments/assets/9e6e050d-c782-461b-8084-5ff6496d7bd9" />
Closes#3628
TODO in follow-up PRs:
- Add the ability to annotate API groups with supplementary content (so we can have longer descriptions of complex APIs like Responses)
- Clean up docstrings to show API endpoints (or short semantic titles) in the sidebar
## Test Plan
- Local testing
- Made sure API conformance test still passes
# What does this PR do?
Given the rapidly changing nature of Llama Stack's APIs and the need to have clean, user-friendly API documentation, we want to split the API reference into 3 main buckets; stable, experimental and deprecated. The most straightforward way to do it is to have several automatically generated doctrees, which introduces some complexity in testing APIs for backwards compatibility.
This PR updates the API conformance test to handle cases where the API schema is split into several files; it does not change the testing criteria.
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
No developer-facing changes (all existing tests should pass)
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- categories like "core::server" is not recognized so it's level is not
set by 'all=debug'
- removed spammy telemetry debug logging
## Test Plan
test server launched with LLAMA_STACK_LOGGING='all=debug'
# What does this PR do?
if the PR title has `!` or the footer of the commit has `BREAKING
CHANGE:`, skip conformance. This is documented in the API leveling
proposal
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
level the following APIs, keeping their old routes around as well until
0.4.0
1. datasetio to v1beta: used primarily by eval and training. Given that
training is v1alpha and eval is v1alpha, datasetio is likely to change
in structure as real usages of the API spin up. Register, unregister, and
iter dataset are sparsely implemented, meaning the shape of those routes is
likely to change.
2. telemetry to v1alpha: telemetry has been going through many changes.
For example, query_metrics was not even implemented until recently and
had to change its shape to work. Leveling it this way will allow us to
fix functionality like OTEL, sqlite, etc. The routes themselves are set,
but the structure might change a bit.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
When a model decides to use an MCP tool call that requires no arguments,
it sets the `arguments` field to `None`. This causes the user to see a
`400 bad request` error due to validation errors down the stack, because
this field gets removed when being parsed by an OpenAI-compatible
inference provider like vLLM.
This PR ensures that, as soon as the tool call args are accumulated
while streaming, we check that no tool call function arguments are
set to None; if they are, we replace them with "{}".
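A minimal sketch of that check (a hypothetical helper, not the exact code path in this PR):
```python
def normalize_tool_call_arguments(tool_calls: list[dict]) -> list[dict]:
    """Replace None function arguments with "{}" so downstream validation (e.g. vLLM) passes."""
    for call in tool_calls:
        function = call.get("function") or {}
        if function.get("arguments") is None:
            function["arguments"] = "{}"
        call["function"] = function
    return tool_calls


print(normalize_tool_call_arguments([{"function": {"name": "namespaces_list", "arguments": None}}]))
```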
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3456
## Test Plan
Added new unit test to verify that any tool calls with function
arguments set to `None` get handled correctly
---------
Signed-off-by: Jaideep Rao <jrao@redhat.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Fireworks doesn't allow response_format with tool use. The default
response format is 'text' anyway, so we can safely omit it.
## Test Plan
Below script failed without the change, runs after.
```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.
This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""
import json
from openai import OpenAI
# Connect to the llama stack server
base_url = "http://localhost:8321/v1"
client = OpenAI(base_url=base_url, api_key="fake")
# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"
tools = [
{
"type": "mcp",
"server_label": "k8s",
"server_url": mcp_server_url,
}
]
# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()
response = client.responses.create(
# model="meta-llama/Llama-3.2-3B-Instruct", # Using the vllm model
model="fireworks/accounts/fireworks/models/llama4-scout-instruct-basic",
# model="openai/gpt-4o",
input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format UNDER ALL CIRCUMSTANCES.",
tools=tools,
stream=False,
)
print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)
# Print the output
for i, output in enumerate(response.output):
print(f"\n[Output {i + 1}] Type: {output.type}")
if output.type == "mcp_list_tools":
print(f" Server: {output.server_label}")
print(f" Tools available: {[t.name for t in output.tools]}")
elif output.type == "mcp_call":
print(f" Tool called: {output.name}")
print(f" Arguments: {output.arguments}")
print(f" Result: {output.output}")
if output.error:
print(f" Error: {output.error}")
elif output.type == "message":
print(f" Role: {output.role}")
print(f" Content: {output.content}")
print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
# What does this PR do?
This PR adds support for the require_approval on an mcp tool definition
passed to create response in the Responses API. This allows the caller
to indicate whether they want to approve calls to that server, or let
them be called without approval.
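For example (server URL is a placeholder, and the values shown mirror the Responses API convention of `"always"`/`"never"`), the caller sets `require_approval` on the MCP tool definition:
```python
tools = [
    {
        "type": "mcp",
        "server_label": "k8s",
        "server_url": "http://localhost:3000/sse",
        "require_approval": "never",  # or "always" to require approval before each call
    }
]
```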
Closes #3443
## Test Plan
Tested both approval and denial.
Added automated integration test for both cases.
---------
Signed-off-by: Gordon Sim <gsim@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
* Updates the safety guide in Zero to Hero series to use Moderations API
and the latest safety models
* Fixes an image link
Closes #2557
## Test Plan
* Manual testing
# What does this PR do?
* Adds canonical project information and links to client SDK / k8s
operator / app examples repos to the front page
* Fixes some button rendering errors
Closes #3618
## Test Plan
Local rebuild of the documentation server
https://github.com/llamastack/llama-stack/pull/3604 broke multipart form
data field parsing for the Files API since it changed its shape -- so as
to match the API exactly to the OpenAI spec even in the generated client
code.
The underlying reason is that multipart/form-data cannot transport
structured nested fields. Each field must be str-serialized. The client
(specifically the OpenAI client whose behavior we must match),
transports sub-fields as `expires_after[anchor]` and
`expires_after[seconds]`, etc. We must be able to handle these fields
somehow on the server without compromising the shape of the YAML spec.
This PR "fixes" this by adding a dependency to convert the data. The
main trade-off here is that we must add this `Depends()` annotation on
every provider implementation for Files. This is a headache, but a much
more reasonable one (in my opinion) given the alternatives.
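A rough sketch of the idea using FastAPI's `Form`/`Depends` machinery (the helper and route below are hypothetical, not the dependency added in this PR):
```python
from fastapi import Depends, FastAPI, Form

app = FastAPI()


def parse_expires_after(
    anchor: str | None = Form(None, alias="expires_after[anchor]"),
    seconds: int | None = Form(None, alias="expires_after[seconds]"),
) -> dict | None:
    # multipart/form-data flattens nested fields, so reassemble them here
    if anchor is None and seconds is None:
        return None
    return {"anchor": anchor, "seconds": seconds}


@app.post("/v1/files")
async def upload_file(expires_after: dict | None = Depends(parse_expires_after)):
    return {"expires_after": expires_after}
```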
## Test Plan
Tests as shown in
https://github.com/llamastack/llama-stack/pull/3604#issuecomment-3351090653
pass.
# What does this PR do?
Agents is likely to be deprecated in favor of responses. Let's level it
as alpha to indicate the lack of long-term support.
Keep the v1 route for backwards compat.
Closes #3611
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
migrate safety api implementation from /inference/chat-completion to
/v1/chat/completions
## Test Plan
ci w/ recordings
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Add llamastack + CrewAI integration example notebook
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Tested in a local Jupyter notebook and it works.
# What does this PR do?
Fixes error:
```
[ERROR] Error executing endpoint route='/v1/openai/v1/responses'
method='post': Error code: 400 - {'error': {'message': "Invalid schema for function 'pods_exec': In context=('properties', 'command'), array
schema missing items.", 'type': 'invalid_request_error', 'param': 'tools[7].function.parameters', 'code': 'invalid_function_parameters'}}
```
From script:
```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.
This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""
import json
from openai import OpenAI
# Connect to the llama stack server
base_url = "http://localhost:8321/v1/openai/v1"
client = OpenAI(base_url=base_url, api_key="fake")
# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"
tools = [
{
"type": "mcp",
"server_label": "k8s",
"server_url": mcp_server_url,
}
]
# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()
response = client.responses.create(
# model="meta-llama/Llama-3.2-3B-Instruct", # Using the vllm model
model="openai/gpt-4o",
input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format.",
tools=tools,
stream=False,
)
print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)
# Print the output
for i, output in enumerate(response.output):
print(f"\n[Output {i + 1}] Type: {output.type}")
if output.type == "mcp_list_tools":
print(f" Server: {output.server_label}")
print(f" Tools available: {[t.name for t in output.tools]}")
elif output.type == "mcp_call":
print(f" Tool called: {output.name}")
print(f" Arguments: {output.arguments}")
print(f" Result: {output.output}")
if output.error:
print(f" Error: {output.error}")
elif output.type == "message":
print(f" Role: {output.role}")
print(f" Content: {output.content}")
print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
## Test Plan
new unit tests
script now runs successfully
The `/v1/openai/v1` prefix is annoying and now unnecessary given our
clearer focus on how to think about the API surface.
Let's kill it for the 0.3.0 update.
To make client-side changes feasible, we will do this in two parts. This
part adds a new route (sans `/openai/v1`) so the existing client
continues to work since the server supports both.
The next PR will be client-side (Stainless) changes which I will be
making shortly.
The final PR will remove the `/openai/v1` routes.
Note that all these changes will happen rapidly within this release
cycle. The entire set _will be backwards incompatible_.
# What does this PR do?
Refs: https://github.com/llamastack/llama-stack/issues/3420
When telemetry is enabled, the router unconditionally expects the usage
attribute to be available and fails if it is not present.
Usage is not currently being requested by litellm_openai_mixin.py for
streaming requests when using the responses API, which means that
providers like vertexai fail if telemetry is enabled and streaming is
used.
This is part of the required fix. The other part is in liteLLM; I plan to
submit a PR for that soon.
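For reference, with OpenAI-compatible clients usage on streamed responses is typically requested via `stream_options`; a minimal illustration (endpoint and model are placeholders):
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")

stream = client.chat.completions.create(
    model="gemini/gemini-1.5-flash",
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
    stream_options={"include_usage": True},  # the final chunk then carries a usage block
)
for chunk in stream:
    if chunk.usage is not None:
        print(chunk.usage)
```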
## Test Plan
I applied this change along with the change for litellm in a llama stack
deployment and validated that I could make streaming requests through
the responses API to a gemini model and they would succeed instead of
failing due to the missing usage attribute when telemetry is enabled.
Signed-off-by: Michael Dawson <midawson@redhat.com>
# What does this PR do?
now that /v1/inference/completion has been removed, no docs should refer
to it
this cleans up remaining references
## Test Plan
ci
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
* Updates image paths for images in docs/resources/ to proper static
image locations
## Test Plan
* `npm run build` builds documentation properly
# What does this PR do?
move the eval=inline::meta-reference implementation to use
openai_completion/openai_chat_completion
note: this breaks backward compatibility if eval setup used sampling
params' repetition_penalty or strategy
## Test Plan
ci w/ new recordings
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
we skip embedding tests when the embedding_model_id isn't provided. same
for completion / chat tests when text_model_id isn't given.
instead of failing safety tests when a shield_id isn't provided, we'll
skip them too.
## Test Plan
ci
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
inference/rerank is the one route in the API intended to not be
deprecated. Level it as v1alpha.
Additionally, remove `experimental` and opt to instead use `v1alpha`
which itself implies an experimental state based on the original
proposal
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR fixes #3300 by adding support for the application/json mime type in
[agent_instance.py](4a59961a6c/llama_stack/providers/inline/agents/meta_reference/agent_instance.py (L923))
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[3300] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
all related pytest passed, see log:
```
./scripts/unit-tests.sh tests/unit/providers/agent/test_get_raw_document_text.py -vvv
/Users/kaiwu/work/kaiwu/llama-stack/.venv/bin/python3
Uninstalled 22 packages in 5.65s
Installed 47 packages in 1.24s
================= test session starts =================
platform darwin -- Python 3.12.9, pytest-8.4.2, pluggy-1.6.0 -- /Users/kaiwu/work/kaiwu/llama-stack/.venv/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.9', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0'}}
rootdir: /Users/kaiwu/work/kaiwu/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 14 items
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_yaml_mime_type PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_deprecated_text_yaml_with_warning PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_deprecated_text_yaml_with_url PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_deprecated_text_yaml_with_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_json_mime_type PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_json_url PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_json_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unsupported_mime_types PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_url_content PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_yaml_url PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_yaml_text_content_item PASSED
tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unexpected_content_type PASSED
================ slowest 10 durations =================
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_deprecated_text_yaml_with_url
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unsupported_mime_types
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unexpected_content_type
0.00s setup tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types
0.00s teardown tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_yaml_url
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_url_content
0.00s teardown tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_rejects_unsupported_mime_types
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_with_json_url
0.00s call tests/unit/providers/agent/test_get_raw_document_text.py::test_get_raw_document_text_supports_text_mime_types
================= 14 passed in 0.14s ==================
Generating coverage report...
Wrote HTML report to htmlcov-3.12/index.html
```
Reverts llamastack/llama-stack#3576
When I edit Stainless and codegen succeeds, the `next` branch is updated
directly. It provides us no chance to see if there might be something
unideal going on. If something is wrong, all CI will start breaking
immediately. This is not ideal. I will likely create another staging
branch `next-release` or something to accommodate the special workflow
that Stainless requires.
# What does this PR do?
Mirroring the same changes that was used for inference_store:
https://github.com/llamastack/llama-stack/pull/3383
Will follow up with a shared internal API for managing these write
queues.
## Test Plan
existing tests
# What does this PR do?
the nvidia datastore tests were running when the datastore was not
configured. they would always fail.
this introduces a skip when the nvidia datastore is not configured.
## Test Plan
ci
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.4 to
4.3.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/releases">actions/cache's
releases</a>.</em></p>
<blockquote>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add note on runner versions by <a
href="https://github.com/GhadimiR"><code>@GhadimiR</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1642">actions/cache#1642</a></li>
<li>Prepare <code>v4.3.0</code> release by <a
href="https://github.com/Link"><code>@Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1655">actions/cache#1655</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/GhadimiR"><code>@GhadimiR</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1642">actions/cache#1642</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v4...v4.3.0">https://github.com/actions/cache/compare/v4...v4.3.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/blob/main/RELEASES.md">actions/cache's
changelog</a>.</em></p>
<blockquote>
<h1>Releases</h1>
<h3>4.3.0</h3>
<ul>
<li>Bump <code>@actions/cache</code> to <a
href="https://redirect.github.com/actions/toolkit/pull/2132">v4.1.0</a></li>
</ul>
<h3>4.2.4</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.5</li>
</ul>
<h3>4.2.3</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.3 (obfuscates SAS token in
debug logs for cache entries)</li>
</ul>
<h3>4.2.2</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.2</li>
</ul>
<h3>4.2.1</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v4.0.1</li>
</ul>
<h3>4.2.0</h3>
<p>TLDR; The cache backend service has been rewritten from the ground up
for improved performance and reliability. <a
href="https://github.com/actions/cache">actions/cache</a> now integrates
with the new cache service (v2) APIs.</p>
<p>The new service will gradually roll out as of <strong>February 1st,
2025</strong>. The legacy service will also be sunset on the same date.
Changes in these release are <strong>fully backward
compatible</strong>.</p>
<p><strong>We are deprecating some versions of this action</strong>. We
recommend upgrading to version <code>v4</code> or <code>v3</code> as
soon as possible before <strong>February 1st, 2025.</strong> (Upgrade
instructions below).</p>
<p>If you are using pinned SHAs, please use the SHAs of versions
<code>v4.2.0</code> or <code>v3.4.0</code></p>
<p>If you do not upgrade, all workflow runs using any of the deprecated
<a href="https://github.com/actions/cache">actions/cache</a> will
fail.</p>
<p>Upgrading to the recommended versions will not break your
workflows.</p>
<h3>4.1.2</h3>
<ul>
<li>Add GitHub Enterprise Cloud instances hostname filters to inform API
endpoint choices - <a
href="https://redirect.github.com/actions/cache/pull/1474">#1474</a></li>
<li>Security fix: Bump braces from 3.0.2 to 3.0.3 - <a
href="https://redirect.github.com/actions/cache/pull/1475">#1475</a></li>
</ul>
<h3>4.1.1</h3>
<ul>
<li>Restore original behavior of <code>cache-hit</code> output - <a
href="https://redirect.github.com/actions/cache/pull/1467">#1467</a></li>
</ul>
<h3>4.1.0</h3>
<ul>
<li>Ensure <code>cache-hit</code> output is set when a cache is missed -
<a
href="https://redirect.github.com/actions/cache/pull/1404">#1404</a></li>
<li>Deprecate <code>save-always</code> input - <a
href="https://redirect.github.com/actions/cache/pull/1452">#1452</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0057852bfa"><code>0057852</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1655">#1655</a>
from actions/Link-/prepare-4.3.0</li>
<li><a
href="4f5ea67f1c"><code>4f5ea67</code></a>
Update licensed cache</li>
<li><a
href="9fcad95d03"><code>9fcad95</code></a>
Upgrade actions/cache to 4.1.0 and prepare 4.3.0 release</li>
<li><a
href="638ed79f9d"><code>638ed79</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1642">#1642</a>
from actions/GhadimiR-patch-1</li>
<li><a
href="3862dccb17"><code>3862dcc</code></a>
Add note on runner versions</li>
<li>See full diff in <a
href="0400d5f644...0057852bfa">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [tw-animate-css](https://github.com/Wombosvideo/tw-animate-css)
from 1.2.9 to 1.4.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Wombosvideo/tw-animate-css/releases">tw-animate-css's
releases</a>.</em></p>
<blockquote>
<h2>v1.4.0</h2>
<h2>Changelog</h2>
<p>902e37a019ffd165ba078e0b3c02634526c54bf0: fix: remove support for
prefix, add new export for prefixed version. Closes <a
href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/58">#58</a>.
fab2a5bf817605be1976e159976718a83489fc1c: chore: bump version to 1.4.0
and update dependencies
c20dc32e2b532a8e74546879b4ce7d9ce89ba710: fix(build): make transform.ts
accept two arguments</p>
<h2>⚠️ BREAKING CHANGE ⚠️</h2>
<p>Support for Tailwind CSS's prefix option was moved to
<code>tw-animate-css/prefix</code> because it was breaking the
<code>--spacing</code> function. Users requiring prefixes should replace
their import:</p>
<pre lang="diff"><code>- import "tw-animate-css";
+ import "tw-animate-css/prefix";
</code></pre>
<p><em>I do not plan to introduce breaking changes like this to
non-major releases in the future. But because more people use spacing
rather than prefixes, reverting the previous version's (obviously
breaking) change seems reasonable.</em></p>
<h2>v1.3.8</h2>
<h2>Changelog</h2>
<ul>
<li>b5ff23a: fix: add support for global CSS variable prefix. Closes <a
href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/48">#48</a></li>
<li>03e5f12: feat: add support for ng-primitives height variables <a
href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/56">#56</a>
(thanks <a
href="https://github.com/immohammadjaved"><code>@immohammadjaved</code></a>)</li>
<li>b076cfb: docs: fix various issues in accordion and collapsible
docs</li>
<li>9485e33: chore: bump version to 1.3.8 and update dependencies</li>
</ul>
<h2>⚠️ BREAKING CHANGE ⚠️</h2>
<p>Adding support for prefixes broke custom spacing. It is recommended
that you skip this version if you do not use Tailwind CSS's prefix
option, and use v1.4.0 instead. If you are actually using prefixes, you
can use a special version supporting prefixes:</p>
<pre lang="diff"><code>- import "tw-animate-css"; /* Version
with spacing support */
+ import "tw-animate-css/prefix"; /* Version with prefix
support */
</code></pre>
<p><em>I do not plan to fix the incompatibility between the spacing and
prefix versions due to time constraints. Feel free to investigate and
open a pull request if you manage to fix it.</em></p>
<h2>v1.3.7</h2>
<h2>Changelog</h2>
<ul>
<li>80dbfcc: feat: add utilities for blur transitions <a
href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/54">#54</a>
(thanks <a
href="https://github.com/coffeeispower"><code>@coffeeispower</code></a>)</li>
<li>dc294f9: docs: add upcoming changes warning</li>
<li>c640bb8: chore: update dependencies and package manager version</li>
<li>9e63e34: chore: bump version to 1.3.7</li>
</ul>
<h2>v1.3.6</h2>
<h2>Changelog</h2>
<ul>
<li>58f3396: fix: allow changing animation parameters for ready-to-use
animations</li>
<li>8313476: chore: update dependencies nd package manager version</li>
<li>f81346c: chore: bump version to 1.3.6</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c20dc32e2b"><code>c20dc32</code></a>
fix(build): make transform.ts accept two arguments</li>
<li><a
href="fab2a5bf81"><code>fab2a5b</code></a>
chore: bump version to 1.4.0 and update dependencies</li>
<li><a
href="902e37a019"><code>902e37a</code></a>
fix: remove support for prefix, add new export for prefixed version</li>
<li><a
href="9485e33d99"><code>9485e33</code></a>
chore: bump version to 1.3.8 and update dependencies</li>
<li><a
href="b076cfb04a"><code>b076cfb</code></a>
docs: fix various issues in accordion and collapsible docs</li>
<li><a
href="03e5f12418"><code>03e5f12</code></a>
feat: add support for ng-primitives height variables (<a
href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/56">#56</a>)</li>
<li><a
href="b5ff23a0d5"><code>b5ff23a</code></a>
fix: add support for global CSS variable prefix. Closes <a
href="https://redirect.github.com/Wombosvideo/tw-animate-css/issues/48">#48</a></li>
<li><a
href="9e63e34286"><code>9e63e34</code></a>
chore: bump version to 1.3.7</li>
<li><a
href="c640bb8933"><code>c640bb8</code></a>
chore: update dependencies and package manager version</li>
<li><a
href="dc294f990a"><code>dc294f9</code></a>
docs: add upcoming changes warning</li>
<li>Additional commits viewable in <a
href="https://github.com/Wombosvideo/tw-animate-css/compare/v1.2.9...v1.4.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [shiki](https://github.com/shikijs/shiki/tree/HEAD/packages/shiki)
from 1.29.2 to 3.13.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/shikijs/shiki/releases">shiki's
releases</a>.</em></p>
<blockquote>
<h2>v3.13.0</h2>
<h3> 🚀 Features</h3>
<ul>
<li><strong>transformers</strong>: Render indent guides - by <a
href="https://github.com/KazariEX"><code>@KazariEX</code></a> and <a
href="https://github.com/antfu"><code>@antfu</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1060">shikijs/shiki#1060</a>
<a href="aecd1617"><!-- raw HTML
omitted -->(aecd1)<!-- raw HTML omitted --></a></li>
</ul>
<h5> <a
href="https://github.com/shikijs/shiki/compare/v3.12.3...v3.13.0">View
changes on GitHub</a></h5>
<h2>v3.12.3</h2>
<h3> 🐞 Bug Fixes</h3>
<ul>
<li><code>@shikijs/twoslash</code> version specifier - by <a
href="https://github.com/9romise"><code>@9romise</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1078">shikijs/shiki#1078</a>
<a href="a1cdea41"><!-- raw HTML
omitted -->(a1cde)<!-- raw HTML omitted --></a></li>
</ul>
<h5> <a
href="https://github.com/shikijs/shiki/compare/v3.12.2...v3.12.3">View
changes on GitHub</a></h5>
<h2>v3.12.2</h2>
<h3> 🐞 Bug Fixes</h3>
<ul>
<li><strong>twoslash</strong>: Fix <code>onTwoslashError</code> return
value handling - by <a
href="https://github.com/Karibash"><code>@Karibash</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1070">shikijs/shiki#1070</a>
<a href="e86b0a7c"><!-- raw HTML
omitted -->(e86b0)<!-- raw HTML omitted --></a></li>
</ul>
<h5> <a
href="https://github.com/shikijs/shiki/compare/v3.12.1...v3.12.2">View
changes on GitHub</a></h5>
<h2>v3.12.1</h2>
<p><em>No significant changes</em></p>
<h5> <a
href="https://github.com/shikijs/shiki/compare/v3.12.0...v3.12.1">View
changes on GitHub</a></h5>
<h2>v3.12.0</h2>
<h3> 🚀 Features</h3>
<ul>
<li><strong>vitepress-twoslash</strong>:
<ul>
<li>Improve UX for option customization - by <a
href="https://github.com/9romise"><code>@9romise</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1066">shikijs/shiki#1066</a>
<a href="e3cfdeca"><!-- raw HTML
omitted -->(e3cfd)<!-- raw HTML omitted --></a></li>
<li>Twoslash inline type cache for markdown - by <a
href="https://github.com/serkodev"><code>@serkodev</code></a> and <a
href="https://github.com/antfu"><code>@antfu</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1063">shikijs/shiki#1063</a>
<a href="dc7fbc70"><!-- raw HTML
omitted -->(dc7fb)<!-- raw HTML omitted --></a></li>
</ul>
</li>
</ul>
<h3> 🐞 Bug Fixes</h3>
<ul>
<li><strong>remove-notation-escape</strong>: Correct escape sequence -
by <a href="https://github.com/sor4chi"><code>@sor4chi</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1065">shikijs/shiki#1065</a>
<a href="22d0c780"><!-- raw HTML
omitted -->(22d0c)<!-- raw HTML omitted --></a></li>
</ul>
<h5> <a
href="https://github.com/shikijs/shiki/compare/v3.11.0...v3.12.0">View
changes on GitHub</a></h5>
<h2>v3.11.0</h2>
<h3> 🚀 Features</h3>
<ul>
<li><strong>core</strong>: Add <code>enforce</code> options to
<code>ShikiTransformer</code> - by <a
href="https://github.com/serkodev"><code>@serkodev</code></a> and <a
href="https://github.com/antfu"><code>@antfu</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1062">shikijs/shiki#1062</a>
<a href="8ad05bd8"><!-- raw HTML
omitted -->(8ad05)<!-- raw HTML omitted --></a></li>
</ul>
<h5> <a
href="https://github.com/shikijs/shiki/compare/v3.10.0...v3.11.0">View
changes on GitHub</a></h5>
<h2>v3.10.0</h2>
<h3> 🚀 Features</h3>
<ul>
<li>Add funding links to playground - by <a
href="https://github.com/jtbandes"><code>@jtbandes</code></a> in <a
href="https://redirect.github.com/shikijs/shiki/issues/1054">shikijs/shiki#1054</a>
<a href="e36eb4d8"><!-- raw HTML
omitted -->(e36eb)<!-- raw HTML omitted --></a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="fd7326a82f"><code>fd7326a</code></a>
chore: release v3.13.0</li>
<li><a
href="5cbb05219e"><code>5cbb052</code></a>
chore: release v3.12.3</li>
<li><a
href="e462618190"><code>e462618</code></a>
chore: release v3.12.2</li>
<li><a
href="793d71e68f"><code>793d71e</code></a>
chore: release v3.12.1</li>
<li><a
href="9260f3fd10"><code>9260f3f</code></a>
chore: release v3.12.0</li>
<li><a
href="d05f39b1e8"><code>d05f39b</code></a>
chore: release v3.11.0</li>
<li><a
href="bda1a76743"><code>bda1a76</code></a>
chore: release v3.10.0</li>
<li><a
href="09921f1cb8"><code>09921f1</code></a>
chore: release v3.9.2</li>
<li><a
href="854eddf2ed"><code>854eddf</code></a>
chore: release v3.9.1</li>
<li><a
href="950ede5ae5"><code>950ede5</code></a>
chore: release v3.9.0</li>
<li>Additional commits viewable in <a
href="https://github.com/shikijs/shiki/commits/v3.13.0/packages/shiki">compare
view</a></li>
</ul>
</details>
<details>
<summary>Maintainer changes</summary>
<p>This version was pushed to npm by [GitHub Actions](<a
href="https://www.npmjs.com/~GitHub">https://www.npmjs.com/~GitHub</a>
Actions), a new releaser for shiki since your current version.</p>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
When we update Stainless (editor changes), the `next` branch gets
updated. Eventually, when you decide on a release, you land the changes into
`main`. This is the Stainless workflow.
This PR makes sure we follow that workflow by pulling from the `next`
branch for our integration tests.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Add items and title to ToolParameter/ToolParamDefinition. Adding items
will resolve the issue that occurs with Gemini LLM when an MCP tool has
array-type properties.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
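A hedged sketch of what an array-typed tool parameter looks like once `items` and `title` are available; the field layout here is illustrative and may not match the exact `ToolParamDefinition` model:
```python
from pydantic import BaseModel


class ToolParamDefinition(BaseModel):
    # Illustrative shape only; the real model lives in llama_stack.
    param_type: str
    description: str | None = None
    required: bool = True
    title: str | None = None
    # JSON-schema style element type, needed by providers like Gemini
    # to accept array-typed MCP tool properties.
    items: dict | None = None


tags_param = ToolParamDefinition(
    param_type="array",
    description="Tags to attach to the record",
    title="Tags",
    items={"type": "string"},
)
```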
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Unit test cases will be added.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Kai Wu <kaiwu@meta.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
recorded for: ./scripts/integration-tests.sh --stack-config
server:ci-tests --suite base --setup fireworks --subdirs inference
--pattern openai
## Test Plan
./scripts/integration-tests.sh --stack-config server:ci-tests --suite
base --setup fireworks --subdirs inference --pattern openai
# What does this PR do?
unpublish (make unavailable to users) the following apis -
- `/v1/inference/completion`, replaced by `/v1/openai/v1/completions`
- `/v1/inference/chat-completion`, replaced by
`/v1/openai/v1/chat/completions`
- `/v1/inference/embeddings`, replaced by `/v1/openai/v1/embeddings`
- `/v1/inference/batch-completion`, replaced by `/v1/openai/v1/batches`
- `/v1/inference/batch-chat-completion`, replaced by
`/v1/openai/v1/batches`
note: the implementations are still available for internal use, e.g. the
agents API still uses chat-completion.
# What does this PR do?
APIs removed:
- POST /v1/batch-inference/completion
- POST /v1/batch-inference/chat-completion
- POST /v1/inference/batch-completion
- POST /v1/inference/batch-chat-completion
note -
- batch-completion & batch-chat-completion were only implemented for
inference=inline::meta-reference
- the batch-inference endpoints were not implemented
# What does this PR do?
simplify Ollama inference adapter by -
- moving image_url download code to OpenAIMixin
- being a ModelRegistryHelper instead of having one (mypy blocks
check_model_availability method assignment)
## Test Plan
- add unit tests for new download feature
- add integration tests for openai_chat_completion w/ image_url (close
test gap)
The commit being reverted is a public API behavior change that we should not
support.
Instead of allowing silent updates (the caller cannot see the log
messages), we should send an error to the caller telling them they must
first unregister the model before reusing the same name w/ a different
backend.
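A minimal sketch of the intended behavior (error instead of silent update); the registry helpers and exception type here are illustrative:
```python
async def register_model(registry, model_id: str, provider_id: str):
    existing = await registry.get(model_id)  # hypothetical lookup
    if existing is not None and existing.provider_id != provider_id:
        # Surface the conflict to the caller instead of silently re-pointing
        # the identifier at a different backend.
        raise ValueError(
            f"Model '{model_id}' is already registered with provider "
            f"'{existing.provider_id}'. Unregister it before registering it "
            f"with provider '{provider_id}'."
        )
    return await registry.register(model_id=model_id, provider_id=provider_id)
```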
# What does this PR do?
address -
```
ERROR 2025-09-26 10:44:29,450 main:527 core::server: Error creating app: 'FireworksInferenceAdapter' object has no attribute
'alias_to_provider_id_map'
```
## Test Plan
manual startup w/ valid together & fireworks api keys
# What does this PR do?
When listing (and lazily indexing) tools, it's possible for an error to
get thrown by individual toolgroups if for example an MCP toolgroup is
unable to connect to its `mcp_endpoint`.
This logs a warning in the server when that happens, logs a full stack
trace of the error if debug logging is enabled, and just returns the
list of tools from all working toolgroups instead of throwing an error
to the client when a single toolgroup is temporarily or permanently
misbehaving.
The exception to the above is authentication errors, which we
specifically send all the way back to the client as that's how we
indicate to the client that it needs to provide authentication data for
the remote MCP servers.
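A hedged sketch of this listing behavior; the helper names and the `AuthenticationRequiredError` type are illustrative stand-ins for the actual code:
```python
import logging

logger = logging.getLogger(__name__)


class AuthenticationRequiredError(Exception):
    """Illustrative stand-in for the auth error that must reach the client."""


async def list_all_tools(toolgroups, list_tools_for_group):
    tools = []
    for group in toolgroups:
        try:
            tools.extend(await list_tools_for_group(group))
        except AuthenticationRequiredError:
            # The client must see this so it can supply MCP credentials.
            raise
        except Exception as exc:
            # A single misbehaving toolgroup should not break the whole listing.
            logger.warning("Failed to list tools for toolgroup %s: %s", group, exc)
            logger.debug("Full error for toolgroup %s", group, exc_info=True)
    return tools
```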
Closes #2540
## Test Plan
A new unit test was added to test this exception handling, which is run
as part of our regular test suite but also manually run to specifically
verify this fix via:
```
uv run pytest -sv --asyncio-mode=auto \
tests/unit/distribution/routers/test_routing_tables.py
```
To verify the additional debug logging is printing properly:
```
LLAMA_STACK_LOGGING=core=debug \
uv run pytest -sv --asyncio-mode=auto \
tests/unit/distribution/routers/test_routing_tables.py
```
The mcp integration tests were run as below (and by CI):
```
ollama run llama3.2:3b
ENABLE_OLLAMA="ollama" \
OLLAMA_INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
LLAMA_STACK_CONFIG=starter \
uv run pytest -sv tests/integration/tool_runtime/test_mcp.py \
--text-model meta-llama/Llama-3.2-3B-Instruct
```
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Rather than have a single `LLAMA_STACK_VERSION`, we need to have a
`_V1`, `_V1ALPHA`, and `_V1BETA` constant.
This also necessitated adding `level` to the `WebMethod` so that
routing can be handled properly.
For backwards compat, the `v1` routes are being kept around and marked
as `deprecated`. When used, the server will log a deprecation warning.
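A rough sketch of what the leveling looks like; the constant values and the decorator are illustrative stand-ins rather than the exact names in the codebase:
```python
from collections.abc import Callable

# Illustrative constants; the real ones live alongside the API definitions.
LLAMA_STACK_API_V1 = "v1"
LLAMA_STACK_API_V1ALPHA = "v1alpha"
LLAMA_STACK_API_V1BETA = "v1beta"


def webmethod(route: str, method: str, level: str, deprecated: bool = False) -> Callable:
    """Hypothetical decorator: records the prefixed route used for registration."""

    def wrap(fn: Callable) -> Callable:
        fn.__webmethod__ = {
            "path": f"/{level}{route}",
            "method": method,
            "deprecated": deprecated,
        }
        return fn

    return wrap


@webmethod(route="/post-training/jobs", method="POST", level=LLAMA_STACK_API_V1ALPHA)
async def create_post_training_job(job_config: dict) -> dict:
    return {"status": "scheduled", "config": job_config}

# The same handler can also keep a deprecated /v1 alias for backward compatibility.
```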
Deprecation log:
<img width="1224" height="134" alt="Screenshot 2025-09-25 at 2 43 36 PM"
src="https://github.com/user-attachments/assets/0cc7c245-dafc-48f0-be99-269fb9a686f9"
/>
move:
1. post_training to `v1alpha` as it is under heavy development and not
near its final state
2. eval: job scheduling is not implemented, and the API relies heavily on the
datasetio API, which is under development and missing implementations of
specific routes, indicating that the structure of those routes might change.
Additionally, eval depends on the `inference` API, which is going to be
deprecated, so eval will likely need a major API surface change to conform
to using completions properly.
implements leveling in #3317
note: integration tests will fail until the SDK is regenerated with
v1alpha/inference as opposed to v1/inference
## Test Plan
existing tests should pass with newly generated schema. Conformance will
also pass as these routes are not the ones we currently test for
stability
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
use together's new base64 support
## Test Plan
recordings for: ./scripts/integration-tests.sh --stack-config
server:ci-tests --suite base --setup together --subdirs inference
--pattern openai
# What does this PR do?
Switches from `random.getrandbits` to `secrets.randbits` in the
telemetry module.
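The change is essentially a one-line swap to a CSPRNG-backed source. A minimal sketch of generating identifiers this way (the id widths and function names here are illustrative):
```python
import secrets


def generate_trace_id() -> str:
    # secrets.randbits draws from the OS CSPRNG instead of the Mersenne Twister.
    return format(secrets.randbits(128), "032x")


def generate_span_id() -> str:
    return format(secrets.randbits(64), "016x")
```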
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3553
## Test Plan
Unit tests from scripts/unit-tests.sh were run to verify the tests still
pass.
Signed-off-by: Doug Edgar <dedgar@redhat.com>
# What does this PR do?
Fixes Llama Stack docs deployment URL
## Test Plan
```
npm run gen-api-docs all
npm run build
```
successfully builds the documentation
# What does this PR do?
- remove auto-download of ollama embedding models
- add embedding model metadata to dynamic listing w/ unit test
- add support and tests for allowed_models
- removed inference provider models.py files where dynamic listing is
enabled
- store embedding metadata in the embedding_model_metadata field on
inference providers (see the sketch after this list)
- make model_entries optional on ModelRegistryHelper and
LiteLLMOpenAIMixin
- make OpenAIMixin a ModelRegistryHelper
- skip base64 embedding test for remote::ollama, always returns floats
- only use OpenAI client for ollama model listing
- remove unused build_model_entry function
- remove unused get_huggingface_repo function
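A hedged sketch of the `embedding_model_metadata` idea referenced above; the model names, dimensions, and attribute shape are illustrative, not the actual provider code:
```python
# Illustrative per-provider embedding metadata; real values and the attribute
# name on the provider class may differ.
embedding_model_metadata: dict[str, dict[str, int]] = {
    "nomic-embed-text": {"embedding_dimension": 768, "context_length": 8192},
    "all-minilm": {"embedding_dimension": 384, "context_length": 512},
}


def model_type_for(provider_model_id: str) -> str:
    # Models present in the mapping are registered as embedding models with
    # their metadata; everything else is treated as an LLM.
    return "embedding" if provider_model_id in embedding_model_metadata else "llm"
```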
## Test Plan
ci w/ new tests
# What does this PR do?
use ollama embedding models for ollama test, previously using
sentence-transformer
recordings:
- ./scripts/integration-tests.sh --stack-config server:ci-tests --suite
base --setup ollama --inference-mode record
- ./scripts/integration-tests.sh --stack-config server:ci-tests --suite
vision --setup ollama-vision --inference-mode record
## Test Plan
ci w/ added skip base64 embedding test
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
- Fixes broken links and Docusaurus search
Closes #3518
## Test Plan
The following should produce a clean build with no warnings and search enabled:
```
npm install
npm run gen-api-docs all
npm run build
npm run serve
```
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- Fixes Docusaurus build errors
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
- `npm run build` compiles the build properly
- Broken links expected and will be fixed in a follow-on PR
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- Docusaurus server setup
- Deprecates Sphinx build pipeline
- Deprecates remaining references to Readthedocs
- MDX compile errors and broken links to be addressed in follow-up PRs
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
```
npm install
npm run gen-api-docs all
npm run build
```
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- Migrates static content from Sphinx to Docusaurus
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- Migrates the remaining documentation sections to the new documentation format
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
- Partial migration
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
- Migrates the `advanced_apis/` section of the docs to the new format
## Test Plan
- Partial migration
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
- Updates provider and distro codegen to handle the new format
- Migrates provider and distro files to the new format
## Test Plan
- Manual testing
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
Update file paths in the conformance workflow to reflect the new location of the llama-stack-spec files from `docs/_static/` to `docs/static/`. Also update the `.gitignore` file to exclude Docusaurus-related directories (`docs/.docusaurus/` and `docs/node_modules/`).
## Test Plan
- Run the workflow locally
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Updates OpenAPI generator to use summaries and changed the file generation path.
## Test Plan
- docs/openapi_generator/run_openapi_generator.sh
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
_[Stack 1/10] Docusaurus documentation migration_
Updates the file upload API documentation to use proper OpenAPI format for integer parameters. Replaces `<int>` with `{integer}` in the description of the `expires_after[seconds]` parameter across the HTML spec, YAML spec, and Python implementation.
## Test Plan
- docs/openapi_generator/run_openapi_generator.sh
<!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* -->
# What does this PR do?
- Mostly AI-generated scripts to run guidellm
(https://github.com/vllm-project/guidellm) benchmarks on k8s setup
- Stack is using image built from main on 9/11
## Test Plan
See updated README.md
# What does this PR do?
add/enable the Databricks inference adapter
The Databricks inference adapter was broken; closes #3486
- remove deprecated completion / chat_completion endpoints
- enable dynamic model listing w/o refresh, listing is not async
- use SecretStr instead of str for token
- backward incompatible change: for consistency with databricks docs,
env DATABRICKS_URL -> DATABRICKS_HOST and DATABRICKS_API_TOKEN ->
DATABRICKS_TOKEN
- databricks urls are custom per user/org, add special recorder handling
for databricks urls
- add integration test --setup databricks
- enable chat completions tests
- enable embeddings tests
- disable n > 1 tests
- disable embeddings base64 tests
- disable embeddings dimensions tests
note: reasoning models, e.g. gpt oss, fail because databricks has a
custom, incompatible response format
## Test Plan
ci and
```
./scripts/integration-tests.sh --stack-config server:ci-tests --setup databricks --subdirs inference --pattern openai
```
note: databricks needs to be manually added to the ci-tests distro for
replay testing
# What does this PR do?
the openai_embeddings method on OpenAIMixin was returning the provider's
model id instead of the llama stack name
## Test Plan
before -
```
$ ./scripts/integration-tests.sh --stack-config server:ci-tests --setup gpt --subdirs inference --inference-mode live --pattern test_openai_embeddings_single_string
...
FAILED tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_single_string[openai_client-emb=openai/text-embedding-3-small] - AssertionError: assert 'text-embedding-3-small' == 'openai/text-...dding-3-small'
FAILED tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_single_string[llama_stack_client-emb=openai/text-embedding-3-small] - AssertionError: assert 'text-embedding-3-small' == 'openai/text-...dding-3-small'
========================================== 2 failed, 95 deselected, 4 warnings in 3.87s ===========================================
```
after -
```
$ ./scripts/integration-tests.sh --stack-config server:ci-tests --setup gpt --subdirs inference --inference-mode live --pattern test_openai_embeddings_single_string ...
========================================== 2 passed, 95 deselected, 4 warnings in 2.12s ===========================================
```
# What does this PR do?
error:
5099094847
## Test Plan
GITHUB_ACTIONS=true BUILD_PLATFORM=linux/amd64 USE_COPY_NOT_MOUNT=true
LLAMA_STACK_DIR=. uv run --with llama-stack llama stack build --distro
starter --image-type container --image-name ehhuang/distribution-starter
succeeds
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
change ModelRegistryHelper to use ProviderModelEntry instead of the
hardcoded ModelType.llm, which fixes issue #3330.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[3330] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
1. open llama-stack server
```
uv sync --python 3.12
source .venv/bin/activate
uv run llama stack build --distro starter --image-type venv --run
```
2. Use the following script to test:
```
from llama_stack_client import LlamaStackClient
import os
def test_openai_embedding_type():
    client = LlamaStackClient(
        base_url=os.environ.get("LLAMA_STACK_ENDPOINT", "http://localhost:8321"),
        provider_data={
            "openai_api_key": os.environ.get("OPENAI_API_KEY", ""),
        },
    )
    model = client.models.retrieve("openai/text-embedding-3-small")
    print(model)
    assert model.identifier == "openai/text-embedding-3-small"
    assert model.model_type == "embedding"

test_openai_embedding_type()
```
logs:
```
python test_openai.py
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models/openai/text-embedding-3-small "HTTP/1.1 200 OK"
Model(identifier='openai/text-embedding-3-small', metadata={'embedding_dimension': 1536.0, 'context_length': 8192.0}, api_model_type='embedding', provider_id='openai', type='model', provider_resource_id='text-embedding-3-small', owner=None, source='listed_from_provider', model_type='embedding')
```
Bumps
[jest-environment-jsdom](https://github.com/jestjs/jest/tree/HEAD/packages/jest-environment-jsdom)
from 29.7.0 to 30.1.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/jestjs/jest/releases">jest-environment-jsdom's
releases</a>.</em></p>
<blockquote>
<h2>30.1.2</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Correct snapshot header regexp to
work with newline across OSes (<a
href="https://redirect.github.com/jestjs/jest/pull/15803">#15803</a>)</li>
</ul>
<h2>30.1.1</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Fix deprecated goo.gl snapshot
warning not handling Windows end-of-line sequences (<a
href="https://redirect.github.com/jestjs/jest/pull/15800">#15800</a>)</li>
</ul>
<h2>30.1.0</h2>
<h2>Features</h2>
<ul>
<li><code>[jest-leak-detector]</code> Configurable GC aggressiveness
regarding to V8 heap snapshot generation (<a
href="https://redirect.github.com/jestjs/jest/pull/15793/">#15793</a>)</li>
<li><code>[jest-runtime]</code> Reduce redundant ReferenceError
messages</li>
<li><code>[jest-core]</code> Include test modules that failed to load
when --onlyFailures is active</li>
</ul>
<h3>Fixes</h3>
<ul>
<li>`[jest-snapshot-utils] Fix deprecated goo.gl snapshot guide link not
getting replaced with fully canonical URL (<a
href="https://redirect.github.com/jestjs/jest/pull/15787">#15787</a>)</li>
<li><code>[jest-circus]</code> Fix <code>it.concurrent</code> not
working with <code>describe.skip</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15765">#15765</a>)</li>
<li><code>[jest-snapshot]</code> Fix mangled inline snapshot updates
when used with Prettier 3 and CRLF line endings</li>
<li><code>[jest-runtime]</code> Importing from
<code>@jest/globals</code> in more than one file no longer breaks
relative paths (<a
href="https://redirect.github.com/jestjs/jest/issues/15772">#15772</a>)</li>
</ul>
<h1>Chore</h1>
<ul>
<li><code>[expect]</code> Update docblock for <code>toContain()</code>
to display info on substring check (<a
href="https://redirect.github.com/jestjs/jest/pull/15789">#15789</a>)</li>
</ul>
<h2>30.0.2</h2>
<h2>What's Changed</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-matcher-utils]</code> Make 'deepCyclicCopyObject' safer
by setting descriptors to a null-prototype object (<a
href="https://redirect.github.com/jestjs/jest/pull/15689">#15689</a>)</li>
<li><code>[jest-util]</code> Make garbage collection protection property
writable (<a
href="https://redirect.github.com/jestjs/jest/pull/15689">#15689</a>)</li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/jestjs/jest/blob/main/CHANGELOG.md">https://github.com/jestjs/jest/blob/main/CHANGELOG.md</a></p>
<h2>Jest 30.0.1</h2>
<h2>What's Changed</h2>
<h3>Features</h3>
<ul>
<li><code>[jest-resolver]</code> Implement the
<code>defaultAsyncResolver</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15679">#15679</a>)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[jest-resolver]</code> Resolve builtin modules correctly (<a
href="https://redirect.github.com/jestjs/jest/pull/15683">#15683</a>)</li>
<li><code>[jest-environment-node, jest-util]</code> Avoid setting
globals cleanup protection symbol when feature is off (<a
href="https://redirect.github.com/jestjs/jest/pull/15684">#15684</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/jestjs/jest/blob/main/CHANGELOG.md">jest-environment-jsdom's
changelog</a>.</em></p>
<blockquote>
<h2>30.1.2</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Correct snapshot header regexp to
work with newline across OSes (<a
href="https://redirect.github.com/jestjs/jest/pull/15803">#15803</a>)</li>
</ul>
<h2>30.1.1</h2>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Fix deprecated goo.gl snapshot
warning not handling Windows end-of-line sequences (<a
href="https://redirect.github.com/jestjs/jest/pull/15800">#15800</a>)</li>
</ul>
<h2>30.1.0</h2>
<h2>Features</h2>
<ul>
<li><code>[jest-leak-detector]</code> Configurable GC aggressiveness
regarding to V8 heap snapshot generation (<a
href="https://redirect.github.com/jestjs/jest/pull/15793/">#15793</a>)</li>
<li><code>[jest-runtime]</code> Reduce redundant ReferenceError
messages</li>
<li><code>[jest-core]</code> Include test modules that failed to load
when --onlyFailures is active</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[jest-snapshot-utils]</code> Fix deprecated goo.gl snapshot
guide link not getting replaced with fully canonical URL (<a
href="https://redirect.github.com/jestjs/jest/pull/15787">#15787</a>)</li>
<li><code>[jest-circus]</code> Fix <code>it.concurrent</code> not
working with <code>describe.skip</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15765">#15765</a>)</li>
<li><code>[jest-snapshot]</code> Fix mangled inline snapshot updates
when used with Prettier 3 and CRLF line endings</li>
<li><code>[jest-runtime]</code> Importing from
<code>@jest/globals</code> in more than one file no longer breaks
relative paths (<a
href="https://redirect.github.com/jestjs/jest/issues/15772">#15772</a>)</li>
</ul>
<h1>Chore</h1>
<ul>
<li><code>[expect]</code> Update docblock for <code>toContain()</code>
to display info on substring check (<a
href="https://redirect.github.com/jestjs/jest/pull/15789">#15789</a>)</li>
</ul>
<h2>30.0.5</h2>
<h3>Features</h3>
<ul>
<li><code>[jest-config]</code> Allow <code>testMatch</code> to take a
string value</li>
<li><code>[jest-worker]</code> Let <code>workerIdleMemoryLimit</code>
accept 0 to always restart worker child processes</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><code>[expect]</code> Fix <code>bigint</code> error (<a
href="https://redirect.github.com/jestjs/jest/pull/15702">#15702</a>)</li>
</ul>
<h2>30.0.4</h2>
<h3>Features</h3>
<ul>
<li><code>[expect]</code> The <code>Inverse</code> type is now exported
(<a
href="https://redirect.github.com/jestjs/jest/pull/15714">#15714</a>)</li>
<li><code>[expect]</code> feat: support <code>async functions</code> in
<code>toBe</code> (<a
href="https://redirect.github.com/jestjs/jest/pull/15704">#15704</a>)</li>
</ul>
<h3>Fixes</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ebfa31cc97"><code>ebfa31c</code></a>
v30.1.2</li>
<li><a
href="d347c0f3f8"><code>d347c0f</code></a>
v30.1.1</li>
<li><a
href="4d5f41d088"><code>4d5f41d</code></a>
v30.1.0</li>
<li><a
href="22236cf58b"><code>22236cf</code></a>
v30.0.5</li>
<li><a
href="f4296d2bc8"><code>f4296d2</code></a>
v30.0.4</li>
<li><a
href="393acbfac3"><code>393acbf</code></a>
v30.0.2</li>
<li><a
href="5ce865b406"><code>5ce865b</code></a>
v30.0.1</li>
<li><a
href="469f665c2d"><code>469f665</code></a>
v30.0.0</li>
<li><a
href="ce14203d91"><code>ce14203</code></a>
v30.0.0-rc.1</li>
<li><a
href="ac334c0cdf"><code>ac334c0</code></a>
v30.0.0-beta.8</li>
<li>Additional commits viewable in <a
href="https://github.com/jestjs/jest/commits/v30.1.2/packages/jest-environment-jsdom">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [@radix-ui/react-dialog](https://github.com/radix-ui/primitives)
from 1.1.13 to 1.1.15.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
- Catch errors from providers without API keys during model refresh
- Log as a warning instead of an exception to avoid a scary startup
Closes: #3492
Error messages are now warnings instead of several tracebacks
```
INFO 2025-09-19 16:06:55,228 llama_stack.providers.utils.inference.inference_store:74 inference_store: Write queue disabled for SQLite to avoid
concurrency issues
WARNING 2025-09-19 16:06:59,362 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for anthropic: API key
is not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"anthropic_api_key": "<API_KEY>"},
or in the provider config.
WARNING 2025-09-19 16:06:59,364 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for gemini: API key is
not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"gemini_api_key": "<API_KEY>"}, or in
the provider config.
WARNING 2025-09-19 16:06:59,367 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for groq: API key is
not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"groq_api_key": "<API_KEY>"}, or in
the provider config.
WARNING 2025-09-19 16:06:59,372 llama_stack.providers.utils.inference.openai_mixin:327 providers::utils: Failed to list models for sambanova: API key
is not set. Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {"sambanova_api_key": "<API_KEY>"},
or in the provider config.
INFO 2025-09-19 16:06:59,533 llama_stack.core.utils.config_resolution:45 core: Using file path:
```
Signed-off-by: Derek Higgins <derekh@redhat.com>
- Handle Ollama format where models are nested under
response['body']['models']
- Fall back to OpenAI format where models are directly in
response['body'] (see the sketch below)
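A minimal sketch of that fallback, assuming the two recorded response-body shapes described above (the function name is illustrative):
```python
def extract_model_list(response: dict) -> list:
    body = response["body"]
    if isinstance(body, dict) and "models" in body:
        # Ollama format: models are nested under body["models"].
        return body["models"]
    # OpenAI format: the body itself is the list of models.
    return body
```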
Closes: #3457
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
This PR is generated with AI and reviewed by me.
Refactors the AuthorizedSqlStore class to store the access policy as an
instance variable rather than passing it as a parameter to each method
call. This simplifies the API.
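A hedged sketch of the resulting shape; the method and parameter names are illustrative rather than the actual class:
```python
class AuthorizedSqlStore:
    def __init__(self, sql_store, policy):
        self.sql_store = sql_store
        # Previously the policy was threaded through every call; now it is
        # captured once at construction time.
        self.policy = policy

    async def fetch_all(self, table: str, where: dict | None = None):
        # Callers no longer pass the access policy explicitly.
        return await self.sql_store.fetch_all(
            table, where=where, access_policy=self.policy
        )
```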
## Test Plan
existing tests
# What does this PR do?
pymilvus recently made `milvus-lite` an optional dependency of their
package. If someone wants to use the inline provider, we must include the
extra dependency.
For more details see: https://github.com/milvus-io/pymilvus/pull/2976
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This PR fixes a blocking issue in the detailed RAG tutorial where the
code fails with a 400 Bad Request error.
The root cause is that recent versions of Llama-Stack ignore the
client-generated vector_db_id and assign a new server-side ID. The
tutorial was not updated to reflect this, causing the rag_tool.insert
call to fail.
This change updates the code to capture the authoritative ID from the
.identifier attribute of the register() method's response. This ensures
the tutorial code runs successfully and reflects the current API
behavior.
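A short sketch of the fix, assuming the llama_stack_client API used in the tutorial; the embedding model, dimension, and document list are placeholders:
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Register the vector DB and capture the server-assigned identifier.
registered = client.vector_dbs.register(
    vector_db_id="my-docs",              # client-suggested name
    embedding_model="all-MiniLM-L6-v2",  # placeholder model
    embedding_dimension=384,
)
vector_db_id = registered.identifier     # authoritative server-side ID

# Use the returned ID for subsequent calls such as rag_tool.insert.
client.tool_runtime.rag_tool.insert(
    documents=[],                        # tutorial documents go here
    vector_db_id=vector_db_id,
    chunk_size_in_tokens=512,
)
```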
## Test Plan
The fix can be verified by running the Python code snippet from the
detailed tutorial page.
Run the original code (Before this change):
Result: The script fails with a 400 Bad Request error on the
rag_tool.insert step.
Run the updated code (After this change):
Result: The script runs successfully to completion.
Co-authored-by: Adam Young <adam.young@redhat.com>
# What does this PR do?
As shown in #3421, we can scale the stack to handle more RPS with k8s
replicas. This PR enables a multi-process stack with uvicorn --workers so
that we can achieve the same scaling without being in k8s.
To achieve that we refactor main to split out the app construction
logic. This method needs to be non-async. We created a new `Stack` class
to house impls and have a `start()` method to be called in lifespan to
start background tasks instead of starting them in the old
`construct_stack`. This way we avoid having to manage an event loop
manually.
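A rough sketch of the factory-plus-lifespan shape this enables, using a hypothetical `Stack` wrapper rather than the actual server code:
```python
from contextlib import asynccontextmanager

from fastapi import FastAPI


class Stack:
    """Hypothetical: holds impls and starts background tasks once a loop exists."""

    def __init__(self, run_config: dict):
        self.run_config = run_config

    async def start(self) -> None:
        # e.g. kick off model refresh tasks, write queues, etc.
        pass

    async def shutdown(self) -> None:
        pass


def create_app() -> FastAPI:
    # Must be synchronous so `uvicorn ...:create_app --workers N` can call it
    # in each worker process before the event loop is running.
    stack = Stack(run_config={})

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        await stack.start()
        yield
        await stack.shutdown()

    return FastAPI(lifespan=lifespan)
```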
## Test Plan
CI
> uv run --with llama-stack python -m llama_stack.core.server.server
benchmarking/k8s-benchmark/stack_run_config.yaml
works.
> LLAMA_STACK_CONFIG=benchmarking/k8s-benchmark/stack_run_config.yaml uv
run uvicorn llama_stack.core.server.server:create_app --port 8321
--workers 4
works.
# What does this PR do?
currently `RemoteProviderSpec` has an `AdapterSpec` embedded in it.
Remove `AdapterSpec`, and put its leftover fields into
`RemoteProviderSpec`.
Additionally, many of the fields were duplicated between
`InlineProviderSpec` and `RemoteProviderSpec`. Move these to
`ProviderSpec` so they are shared.
Fix up the distro codegen to use `RemoteProviderSpec` directly rather than
`remote_provider_spec`, which took an AdapterSpec and returned a full
provider spec.
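A hedged sketch of what a consolidated registry entry might look like after the flattening; the import path and exact field names are assumptions about the direction, not the precise spec:
```python
# Assumed import path; the spec classes live in llama_stack.providers.datatypes.
from llama_stack.providers.datatypes import Api, RemoteProviderSpec

# Before: remote_provider_spec(api, AdapterSpec(adapter_type=..., module=..., ...))
# After (illustrative): one flat spec with the former AdapterSpec fields inlined.
provider = RemoteProviderSpec(
    api=Api.inference,
    provider_type="remote::example",
    adapter_type="example",
    module="llama_stack.providers.remote.inference.example",
    config_class="llama_stack.providers.remote.inference.example.ExampleConfig",
    pip_packages=["example-sdk"],
)
```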
## Test Plan
existing distro tests should pass.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The rag-runtime tool requires the files API as a dependency, but the NVIDIA
distribution was missing the files provider configuration. Thus, when
running:
```
llama stack build --distro nvidia --image-type venv
```
And then:
```
llama stack run {path_to_distribution_config} --image-type venv
```
It would raise an error:
```
RuntimeError: Failed to resolve 'tool_runtime' provider 'rag-runtime' of type 'inline::rag-runtime': required dependency 'files' is not available. Please add a 'files' provider to your configuration or check if the provider is properly configured.
```
This PR fixes the issue by adding missing files provider to NVIDIA
distribution.
## Test Plan
N/A
# What does this PR do?
this replaces the static model listing for any provider using
OpenAIMixin
currently -
- anthropic
- azure openai
- gemini
- groq
- llama-api
- nvidia
- openai
- sambanova
- tgi
- vertexai
- vllm
- not changed: together has its own impl
## Test Plan
- new unit tests
- manual for llama-api, openai, groq, gemini
```
for provider in llama-openai-compat openai groq gemini; do
  uv run llama stack build --image-type venv --providers inference=remote::$provider --run &
  uv run --with llama-stack-client llama-stack-client models list | grep Total
done
```
results (17 sep 2025):
- llama-api: 4
- openai: 86
- groq: 21
- gemini: 66
closes #3467
# What does this PR do?
*Add dynamic authentication token forwarding support for vLLM provider*
This enables per-request authentication tokens for vLLM providers,
supporting use cases like RAG operations where different requests may
need different authentication tokens. The implementation follows the
same pattern as other providers like Together AI, Fireworks, and
Passthrough.
- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly
Usage:
- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token
All existing functionality is preserved while adding new dynamic
capabilities.
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
```
curl -X POST "http://localhost:8000/v1/chat/completions" -H "Authorization: Bearer my-dynamic-token" \
-H "X-LlamaStack-Provider-Data: {\"vllm_api_token\": \"Bearer my-dynamic-token\", \"vllm_url\": \"http://dynamic-server:8000\"}" \
-H "Content-Type: application/json" \
-d '{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]}'
```
---------
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
# What does this PR do?
Updates the qdrant provider's convert_id function to use a
FIPS-validated cryptographic hashing function, so that llama-stack is
considered to be `Designed for FIPS`.
The standard library `uuid.uuid5()` function uses SHA-1 under the hood,
which is not FIPS-validated. This commit uses an approach similar to the
one merged in #3423.
Closes #3476.
## Test Plan
Unit tests from scripts/unit-tests.sh were run to verify that the tests
pass.
A small test script can display the data flow:
```python
import hashlib
import uuid
# Input
_id = "chunk_abc123"
print(_id)
# Step 1: Format and encode
hash_input = f"qdrant_id:{_id}".encode()
print(hash_input)
# Result: b'qdrant_id:chunk_abc123'
# Step 2: SHA-256 hash
sha256_hash = hashlib.sha256(hash_input).hexdigest()
print(sha256_hash)
# Result: "184893a6eafeaac487cb9166351e8625b994d50f3456d8bc6cea32a014a27151"
# Step 3: Create UUID from first 32 chars
uuid_string = str(uuid.UUID(sha256_hash[:32]))
print(uuid_string)
# sha256_hash[:32] = "184893a6eafeaac487cb9166351e8625"
# Final result: "184893a6-eafe-aac4-87cb-9166351e8625"
```
Signed-off-by: Doug Edgar <dedgar@redhat.com>
# What does this PR do?
When registering a dataset for NVIDIA, the DatasetsRoutingTable expects
`nvidia` to be passed via the `provider_id`
[here](https://github.com/llamastack/llama-stack/blob/main/llama_stack/core/routing_tables/datasets.py#L61).
This PR fixes a notebook to correctly use `provider_id`.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3308
## Test Plan
Manually execute the notebook steps to verify the dataset is registered.
Co-authored-by: Jash Gulabrai <jgulabrai@nvidia.com>
# What does this PR do?
Fixes this warning in llama stack build:
```bash
WARNING 2025-09-15 15:29:02,197 llama_stack.core.distribution:149 core: Failed to import module prompts: No module named
'llama_stack.providers.registry.prompts'"
```
## Test Plan
Test added
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Add default value for PR_HEAD_REPO to prevent 'unbound variable' error
when no PR exists for a branch.
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Modified the code in registry.py.
The key changes are:
1. Removed the `return False` statement
2. Added a warning log message that includes the object type,
identifier, and provider_id for better debugging.
3. The method now continues with the registration process instead of
early returning.
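A rough sketch of the resulting flow (names are illustrative, not the actual registry.py code):
```python
import logging

logger = logging.getLogger(__name__)


def register_object(obj, existing, provider_id):
    if existing is not None:
        # Previously the method returned False here and silently skipped registration.
        logger.warning(
            "Overwriting existing %s with identifier %s from provider %s",
            type(obj).__name__,
            getattr(obj, "identifier", "<unknown>"),
            provider_id,
        )
    # ... continue with the registration process instead of returning early
```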
---------
Co-authored-by: Omar Abdelwahab <omara@fb.com>
# What does this PR do?
Pinning to the latest pydantic version 2.11.9, as sometimes we are picking an
older version and failing to start the container in GitHub Actions:
1775026312
Closes https://github.com/llamastack/llama-stack/issues/3461
## Test Plan
Tested locally with the following commands to start a container
Build container
`llama stack build --distro starter --image-type container`
start container `docker run -d -p 8321:8321 --name llama-stack-test
distribution-starter:0.2.21`
check health http://localhost:8321/v1/health
Couldn't repro with the older version (`2.8.2`), but pydantic `2.11.9` is able
to start the container
https://pypi.org/project/pydantic/#history , 2.11.9 is the latest
version
# What does this PR do?
this document outlines different API stability levels, how to enforce
them, and next steps
## Next Steps
Following the adoption of this document, all existing APIs should follow
the enforcement protocol.
relates to #3237
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
adds dynamic model support to TGI
add new overwrite_completion_id feature to OpenAIMixin to deal with TGI
always returning id=""
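A hedged sketch of the overwrite idea (the hook name comes from this description; the surrounding details are assumptions):
```python
import uuid


def _maybe_overwrite_id(response, overwrite_completion_id: bool):
    # TGI returns id="" on chat completions; substitute a locally generated id
    # so downstream storage and clients get a usable identifier.
    if overwrite_completion_id or not getattr(response, "id", None):
        response.id = f"chatcmpl-{uuid.uuid4().hex}"
    return response
```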
## Test Plan
tgi: `docker run --gpus all --shm-size 1g -p 8080:80 -v /data:/data
ghcr.io/huggingface/text-generation-inference --model-id
Qwen/Qwen3-0.6B`
stack: `TGI_URL=http://localhost:8080 uv run llama stack build
--image-type venv --distro ci-tests --run`
test: `./scripts/integration-tests.sh --stack-config
http://localhost:8321 --setup tgi --subdirs inference --pattern openai`
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR provides functionality for users to unregister ScoringFn and
Benchmark resources for `scoring` and `eval` APIs.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #3051
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Updated integration and unit tests via CI workflow
# What does this PR do?
the @required_args decorator in openai-python is masking the async
nature of the {AsyncCompletions,chat.AsyncCompletions}.create method.
see https://github.com/openai/openai-python/issues/996
this means two things -
0. we cannot use iscoroutine in the recorder to detect async vs non-async
1. our mocks are inappropriately introducing identifiable async behavior
for (0), we update the iscoroutine check w/ detection of /v1/models,
which is the only non-async function we mock & record.
for (1), we could leave everything as is and assume (0) will catch
errors. to be defensive, we update the unit tests to mock below create
methods, allowing the true openai-python create() methods to be tested.
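A simplified sketch of the endpoint-based detection for (0) (assumed shape, not the actual recorder code):
```python
import inspect


def _should_await(func, endpoint: str) -> bool:
    # openai-python's @required_args wrapper hides the coroutine, so an
    # iscoroutine check alone is unreliable; /v1/models is the only
    # non-async call we mock & record, so treat everything else as async.
    if inspect.iscoroutinefunction(func):
        return True
    return endpoint != "/v1/models"
```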
Bumps [@radix-ui/react-select](https://github.com/radix-ui/primitives)
from 2.2.5 to 2.2.6.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
the recorder mocks the openai-python interface. the openai-python
interface allows NOT_GIVEN as an input option. this change properly
handles NOT_GIVEN.
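A minimal sketch of the idea (an assumed helper, not the actual recorder code): drop NOT_GIVEN sentinels before the recorded parameters are serialized.
```python
from openai import NOT_GIVEN


def _strip_not_given(params: dict) -> dict:
    # NOT_GIVEN is openai-python's "argument omitted" sentinel; recording it
    # verbatim would break replay, so it is filtered out here.
    return {k: v for k, v in params.items() if v is not NOT_GIVEN}
```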
## Test Plan
ci (coverage for chat, completions, embeddings)
# What does this PR do?
Migrates MD5 and SHA-1 hash algorithms to SHA-256.
In particular, replaces:
- MD5 in chunk ID generation.
- MD5 in file verification.
- SHA-1 in model identifier digests.
And updates all related test expectations.
Original discussion:
https://github.com/llamastack/llama-stack/discussions/3413
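As an illustration, SHA-256-based chunk ID generation along these lines might look like the following (a sketch under assumed inputs, not the exact helper in the codebase):
```python
import hashlib


def generate_chunk_id(document_id: str, chunk_text: str) -> str:
    # SHA-256 replaces MD5 so the digest comes from a FIPS-approved algorithm.
    hash_input = f"{document_id}:{chunk_text}".encode()
    return hashlib.sha256(hash_input).hexdigest()
```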
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3424.
## Test Plan
Unit tests from scripts/unit-tests.sh were updated to match the new hash
output, and ran to verify the tests pass.
Signed-off-by: Doug Edgar <dedgar@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
fix: Improve pre-commit workflow error handling and feedback
- Add explicit step to check pre-commit results and provide clear error
messages
- Improve verification steps with better error messages and file
listings
- Use GitHub Actions annotations (::error:: and ::warning::) for better
visibility
- Maintain continue-on-error for pre-commit step but add proper failure
handling
This addresses the issue where pre-commit failures were silent but still
caused workflow failures later, making it difficult to understand what
needed to be fixed.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
# What does this PR do?
only run conformance tests when the spec is changed.
Also, cache oasdiff such that it is not installed every time the test is
run
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The notebook was
reverted (https://github.com/llamastack/llama-stack/pull/3259) as it had
some local paths I missed correcting. Trying again with the corrections now.
## Test Plan
Ran the Jupyter notebook
# What does this PR do?
update the async detection test for vllm
- remove a network access from unit tests
- remove direct logging use
the idea behind the test is to mock inference w/ a sleep, initiate
concurrent inference calls, verify the total execution time is close to
the sleep time. in a non-async env the total time would be closer to
sleep * num concurrent calls.
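The shape of the check, roughly (an illustrative sketch, not the actual test file):
```python
import asyncio
import time


async def _mock_completion(delay: float = 0.5) -> str:
    await asyncio.sleep(delay)  # stands in for the mocked inference call
    return "ok"


async def check_concurrency(num_calls: int = 10, delay: float = 0.5) -> None:
    start = time.perf_counter()
    await asyncio.gather(*(_mock_completion(delay) for _ in range(num_calls)))
    elapsed = time.perf_counter() - start
    # Truly concurrent calls finish in roughly one sleep; serialized calls
    # would take close to delay * num_calls.
    assert elapsed < delay * 2


if __name__ == "__main__":
    asyncio.run(check_concurrency())
```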
## Test Plan
ci
# What does this PR do?
update vLLM inference provider to use OpenAIMixin for openai-compat
functions
inference recordings from Qwen3-0.6B and vLLM 0.8.3 -
```
docker run --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -p 8000:8000 --ipc=host \
vllm/vllm-openai:latest \
--model Qwen/Qwen3-0.6B --enable-auto-tool-choice --tool-call-parser hermes
```
## Test Plan
```
./scripts/integration-tests.sh --stack-config server:ci-tests --setup vllm --subdirs inference
```
# What does this PR do?
- Updating documentation on migration from RAG Tool to Vector Stores and
Files APIs
- Adding exception handling for Vector Stores in RAG Tool
- Add more tests on migration from RAG Tool to Vector Stores
- Migrate off of inference_api for context_retriever for RAG
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Integration and unit tests added
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
some providers do not produce spec compliant outputs. when this happens
the replay infra will fail to construct the proper types and will return
a dict to the client. the client likely does not expect a dict.
this was discovered with tgi, which returns finish_reason="" when valid
values are "stop", "length" or "content_filter"
## Test Plan
ci
Fixes #3370
AWS switched to requiring region-prefixed inference profile IDs instead
of foundation model IDs for on-demand throughput. This was causing
ValidationException errors.
Added auto-detection based on boto3 client region to convert model IDs
like meta.llama3-1-70b-instruct-v1:0 to
us.meta.llama3-1-70b-instruct-v1:0 depending on the detected region.
Also handles edge cases like ARNs, case insensitive regions, and None
regions.
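Roughly, the conversion works like this (a hedged sketch; the real code also covers additional edge cases and region prefixes):
```python
def to_inference_profile_id(model_id: str, region: str | None) -> str:
    # ARNs, unknown regions, and already-prefixed IDs are passed through untouched.
    if model_id.startswith("arn:") or region is None:
        return model_id
    prefix = region.lower().split("-")[0]  # e.g. "US-EAST-1" -> "us"
    if model_id.startswith(f"{prefix}."):
        return model_id
    return f"{prefix}.{model_id}"
```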
Tested with this request.
```json
{
"model_id": "meta.llama3-1-8b-instruct-v1:0",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "tell me a riddle"
}
],
"sampling_params": {
"strategy": {
"type": "top_p",
"temperature": 0.7,
"top_p": 0.9
},
"max_tokens": 512
}
}
```
<img width="1488" height="878" alt="image"
src="https://github.com/user-attachments/assets/0d61beec-3869-4a31-8f37-9f554c280b88"
/>
# What does this PR do?
The openai package is already a dependency of the llama-stack project
itself, so let the project dictate which openai version we need and
avoid potential breakage with unsatisfiable dependency resolution.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Duplicate chat completion IDs can be generated during tests especially
if they are replaying recorded responses across different tests. No need
to warn or error under those circumstances. In the wild, this is not
likely to happen at all (no evidence) so we aren't really hiding any
problem.
Bumps [openai](https://github.com/openai/openai-python) from 1.102.0 to
1.106.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/releases">openai's
releases</a>.</em></p>
<blockquote>
<h2>v1.106.1</h2>
<h2>1.106.1 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.106.0...v1.106.1">v1.106.0...v1.106.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>internal:</strong> move mypy configurations to
<code>pyproject.toml</code> file (<a
href="ca413a2774">ca413a2</a>)</li>
</ul>
<h2>v1.106.0</h2>
<h2>1.106.0 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.105.0...v1.106.0">v1.105.0...v1.106.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>client:</strong> support callable api_key (<a
href="https://redirect.github.com/openai/openai-python/issues/2588">#2588</a>)
(<a
href="e1bad015b8">e1bad01</a>)</li>
<li>improve future compat with pydantic v3 (<a
href="6645d9317a">6645d93</a>)</li>
</ul>
<h2>v1.105.0</h2>
<h2>1.105.0 (2025-09-03)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.2...v1.105.0">v1.104.2...v1.105.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Add gpt-realtime models (<a
href="8502041480">8502041</a>)</li>
</ul>
<h2>v1.104.2</h2>
<h2>1.104.2 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.1...v1.104.2">v1.104.1...v1.104.2</a></p>
<h3>Bug Fixes</h3>
<ul>
<li><strong>types:</strong> add aliases back for web search tool types
(<a
href="2521cd8445">2521cd8</a>)</li>
</ul>
<h2>v1.104.1</h2>
<h2>1.104.1 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.0...v1.104.1">v1.104.0...v1.104.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>api:</strong> manual updates for ResponseInputAudio (<a
href="0db5061966">0db5061</a>)</li>
</ul>
<h2>v1.104.0</h2>
<h2>1.104.0 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.103.0...v1.104.0">v1.103.0...v1.104.0</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/blob/main/CHANGELOG.md">openai's
changelog</a>.</em></p>
<blockquote>
<h2>1.106.1 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.106.0...v1.106.1">v1.106.0...v1.106.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>internal:</strong> move mypy configurations to
<code>pyproject.toml</code> file (<a
href="ca413a2774">ca413a2</a>)</li>
</ul>
<h2>1.106.0 (2025-09-04)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.105.0...v1.106.0">v1.105.0...v1.106.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>client:</strong> support callable api_key (<a
href="https://redirect.github.com/openai/openai-python/issues/2588">#2588</a>)
(<a
href="e1bad015b8">e1bad01</a>)</li>
<li>improve future compat with pydantic v3 (<a
href="6645d9317a">6645d93</a>)</li>
</ul>
<h2>1.105.0 (2025-09-03)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.2...v1.105.0">v1.104.2...v1.105.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> Add gpt-realtime models (<a
href="8502041480">8502041</a>)</li>
</ul>
<h2>1.104.2 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.1...v1.104.2">v1.104.1...v1.104.2</a></p>
<h3>Bug Fixes</h3>
<ul>
<li><strong>types:</strong> add aliases back for web search tool types
(<a
href="2521cd8445">2521cd8</a>)</li>
</ul>
<h2>1.104.1 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.104.0...v1.104.1">v1.104.0...v1.104.1</a></p>
<h3>Chores</h3>
<ul>
<li><strong>api:</strong> manual updates for ResponseInputAudio (<a
href="0db5061966">0db5061</a>)</li>
</ul>
<h2>1.104.0 (2025-09-02)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.103.0...v1.104.0">v1.103.0...v1.104.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>types:</strong> replace List[str] with SequenceNotStr in
params (<a
href="bc00bda880">bc00bda</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2adf111129"><code>2adf111</code></a>
release: 1.106.1</li>
<li><a
href="c4f9d0b997"><code>c4f9d0b</code></a>
chore(internal): move mypy configurations to <code>pyproject.toml</code>
file</li>
<li><a
href="2de8d7cde5"><code>2de8d7c</code></a>
release: 1.106.0</li>
<li><a
href="2cf4ed5072"><code>2cf4ed5</code></a>
feat: improve future compat with pydantic v3</li>
<li><a
href="25d16be18b"><code>25d16be</code></a>
feat(client): support callable api_key (<a
href="https://redirect.github.com/openai/openai-python/issues/2588">#2588</a>)</li>
<li><a
href="8672413735"><code>8672413</code></a>
release: 1.105.0</li>
<li><a
href="2c60d78b37"><code>2c60d78</code></a>
feat(api): Add gpt-realtime models</li>
<li><a
href="a52463c932"><code>a52463c</code></a>
release: 1.104.2</li>
<li><a
href="5a6931dafd"><code>5a6931d</code></a>
fix(types): add aliases back for web search tool types</li>
<li><a
href="fb152d967e"><code>fb152d9</code></a>
release: 1.104.1</li>
<li>Additional commits viewable in <a
href="https://github.com/openai/openai-python/compare/v1.102.0...v1.106.1">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [locust](https://github.com/locustio/locust) from 2.39.1 to
2.40.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/releases">locust's
releases</a>.</em></p>
<blockquote>
<h2>2.40.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Pytest plugin: Delay imports to avoid monkey patching until someone
uses the fixtures by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3204">locustio/locust#3204</a></li>
<li>Move pytest plugin to its own directory, to prevent accidental
import by <a href="https://github.com/cyberw"><code>@cyberw</code></a>
in <a
href="https://redirect.github.com/locustio/locust/pull/3205">locustio/locust#3205</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.40.0...2.40.1">https://github.com/locustio/locust/compare/2.40.0...2.40.1</a></p>
<h2>2.40.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Refactor FastHttpSession to be more like HttpSession by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3198">locustio/locust#3198</a></li>
<li>Update Dockerfile base to Python 3.13 by <a
href="https://github.com/adaamz"><code>@adaamz</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3193">locustio/locust#3193</a></li>
<li>Avoid exception in HttpUser if requests has lost track of the
request it made by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3201">locustio/locust#3201</a></li>
<li>Support pytests as locustfiles by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3200">locustio/locust#3200</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/adaamz"><code>@adaamz</code></a> made
their first contribution in <a
href="https://redirect.github.com/locustio/locust/pull/3193">locustio/locust#3193</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.39.1...2.40.0">https://github.com/locustio/locust/compare/2.39.1...2.40.0</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/blob/master/CHANGELOG.md">locust's
changelog</a>.</em></p>
<blockquote>
<h1>Detailed changelog</h1>
<p>The most important changes can also be found in <a
href="https://docs.locust.io/en/latest/changelog.html">the
documentation</a>.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5df19da06a"><code>5df19da</code></a>
Merge pull request <a
href="https://redirect.github.com/locustio/locust/issues/3205">#3205</a>
from locustio/move-pytest-plugin-to-own-directory</li>
<li><a
href="d41141bedd"><code>d41141b</code></a>
Move pytest plugin to its own directory, to prevent accidental import of
locu...</li>
<li><a
href="6422848afd"><code>6422848</code></a>
mention that only one locustfile can be distributed</li>
<li><a
href="aa3da739fe"><code>aa3da73</code></a>
Merge pull request <a
href="https://redirect.github.com/locustio/locust/issues/3204">#3204</a>
from locustio/delay-imports-in-pytest-plugin-to-avoi...</li>
<li><a
href="12050dedfd"><code>12050de</code></a>
Pytest plugin: Delay imports to avoid monkey patching until someone
actually ...</li>
<li><a
href="488d1f8491"><code>488d1f8</code></a>
docs</li>
<li><a
href="439b7ab91b"><code>439b7ab</code></a>
docs fix</li>
<li><a
href="fcd76a8ac3"><code>fcd76a8</code></a>
docs: rephrase</li>
<li><a
href="70c7e9b2d8"><code>70c7e9b</code></a>
docs: move pytest further up</li>
<li><a
href="06dbf98013"><code>06dbf98</code></a>
docs: fix link</li>
<li>Additional commits viewable in <a
href="https://github.com/locustio/locust/compare/2.39.1...2.40.1">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.4.1 to
8.4.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest/releases">pytest's
releases</a>.</em></p>
<blockquote>
<h2>8.4.2</h2>
<h1>pytest 8.4.2 (2025-09-03)</h1>
<h2>Bug fixes</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13478">#13478</a>:
Fixed a crash when using
<code>console_output_style</code>{.interpreted-text
role="confval"} with <code>times</code> and a module is
skipped.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13530">#13530</a>:
Fixed a crash when using <code>pytest.approx</code>{.interpreted-text
role="func"} and
<code>decimal.Decimal</code>{.interpreted-text role="class"}
instances with the <code>decimal.FloatOperation</code>{.interpreted-text
role="class"} trap set.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13549">#13549</a>:
No longer evaluate type annotations in Python <code>3.14</code> when
inspecting function signatures.</p>
<p>This prevents crashes during module collection when modules do not
explicitly use <code>from __future__ import annotations</code> and
import types for annotations within a <code>if TYPE_CHECKING:</code>
block.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13559">#13559</a>:
Added missing [int]{.title-ref} and [float]{.title-ref} variants to the
[Literal]{.title-ref} type annotation of the [type]{.title-ref}
parameter in <code>pytest.Parser.addini</code>{.interpreted-text
role="meth"}.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13563">#13563</a>:
<code>pytest.approx</code>{.interpreted-text role="func"} now
only imports <code>numpy</code> if NumPy is already in
<code>sys.modules</code>. This fixes unconditional import behavior
introduced in [8.4.0]{.title-ref}.</p>
</li>
</ul>
<h2>Improved documentation</h2>
<ul>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13577">#13577</a>:
Clarify that <code>pytest_generate_tests</code> is discovered in test
modules/classes; other hooks must be in <code>conftest.py</code> or
plugins.</li>
</ul>
<h2>Contributor-facing changes</h2>
<ul>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13480">#13480</a>:
Self-testing: fixed a few test failures when run with
<code>-Wdefault</code> or a similar override.</li>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13547">#13547</a>:
Self-testing: corrected expected message for
<code>test_doctest_unexpected_exception</code> in Python
<code>3.14</code>.</li>
<li><a
href="https://redirect.github.com/pytest-dev/pytest/issues/13684">#13684</a>:
Make pytest's own testsuite insensitive to the presence of the
<code>CI</code> environment variable -- by
<code>ogrisel</code>{.interpreted-text role="user"}.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bfae4224fd"><code>bfae422</code></a>
Prepare release version 8.4.2</li>
<li><a
href="89905381a1"><code>8990538</code></a>
Fix passenv CI in tox ini and make tests insensitive to the presence of
the C...</li>
<li><a
href="ca676bfe00"><code>ca676bf</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13687">#13687</a>
from pytest-dev/patchback/backports/8.4.x/e63f6e51c...</li>
<li><a
href="975a60a63c"><code>975a60a</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13686">#13686</a>
from pytest-dev/patchback/backports/8.4.x/12bde8af6...</li>
<li><a
href="7723ce84b8"><code>7723ce8</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13683">#13683</a>
from even-even/fix_Exeption_to_Exception_in_errorMe...</li>
<li><a
href="b7f05680d1"><code>b7f0568</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13685">#13685</a>
from CoretexShadow/fix/docs-pytest-generate-tests</li>
<li><a
href="2c94c4a694"><code>2c94c4a</code></a>
add missing colon (<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13640">#13640</a>)
(<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13641">#13641</a>)</li>
<li><a
href="c3d7684bc0"><code>c3d7684</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13606">#13606</a>
from pytest-dev/patchback/backports/8.4.x/5f9938563...</li>
<li><a
href="dc6e3be2dd"><code>dc6e3be</code></a>
Merge pull request <a
href="https://redirect.github.com/pytest-dev/pytest/issues/13605">#13605</a>
from The-Compiler/training-update-2025-07</li>
<li><a
href="f87289c36c"><code>f87289c</code></a>
Fix crash with <code>times</code> output style and skipped module (<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13573">#13573</a>)
(<a
href="https://redirect.github.com/pytest-dev/pytest/issues/13579">#13579</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest/compare/8.4.1...8.4.2">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
* Adds a horizontal nav bar for easy access to the API reference and the
Llama Stack GitHub repo
<img width="2696" height="520" alt="image"
src="https://github.com/user-attachments/assets/82daffe1-c206-4e20-b95b-1e090011eecc"
/>
## Test Plan
* Built the docs and ran the local HTML server to verify changes
# What does this PR do?
Adds a write worker queue for writes to inference store. This avoids
overwhelming request processing with slow inference writes.
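Conceptually, the worker looks something like this (a minimal sketch assuming an asyncio-based server; names are illustrative, not the actual inference-store code):
```python
import asyncio


class InferenceWriteQueue:
    """Buffers slow store writes so request handlers can return immediately."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._worker: asyncio.Task | None = None

    async def start(self) -> None:
        # Must be called from within the running event loop.
        self._worker = asyncio.create_task(self._drain())

    async def _drain(self) -> None:
        while True:
            write_fn, args = await self._queue.get()
            try:
                await write_fn(*args)
            finally:
                self._queue.task_done()

    def enqueue(self, write_fn, *args) -> None:
        # Request handlers enqueue and return; the slow DB write happens later.
        self._queue.put_nowait((write_fn, args))
```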
## Test Plan
Benchmark:
```
cd /docs/source/distributions/k8s-benchmark
# start mock server
python openai-mock-server.py --port 8000
# start stack server
LLAMA_STACK_LOGGING="all=WARNING" uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml
# run benchmark script
uv run python3 benchmark.py --duration 120 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```
## RPS from 21 -> 57
# What does this PR do?
- Use BackgroundLogger when logging metric events.
- Reuse event loop in BackgroundLogger
## Test Plan
```
cd /docs/source/distributions/k8s-benchmark
# start mock server
python openai-mock-server.py --port 8000
# start stack server
LLAMA_STACK_LOGGING="all=WARNING" uv run --with llama-stack python -m llama_stack.core.server.server docs/source/distributions/k8s-benchmark/stack_run_config.yaml
# run benchmark script
uv run python3 benchmark.py --duration 120 --concurrent 50 --base-url=http://localhost:8321/v1/openai/v1 --model=vllm-inference/meta-llama/Llama-3.2-3B-Instruct
```
### RPS from 57 -> 62
# What does this PR do?
Fix fireworks chat completion broken due to telemetry expecting
response.usage
Closes https://github.com/llamastack/llama-stack/issues/3391
## Test Plan
1. `uv run --with llama-stack llama stack build --distro starter
--image-type venv --run`
Try
```
curl -X POST http://0.0.0.0:8321/v1/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct",
"messages": [{"role": "user", "content": "Hello!"}]
}'
```
```
{"id":"chatcmpl-ee922a08-0df0-4974-b0d3-b322113e8bc0","choices":[{"message":{"role":"assistant","content":"Hello! How can I assist you today?","name":null,"tool_calls":null},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion","created":1757456375,"model":"fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct"}%
```
Without fix fails as mentioned in
https://github.com/llamastack/llama-stack/issues/3391
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
update VertexAI inference provider to use openai-python for
openai-compat functions
## Test Plan
```
$ VERTEX_AI_PROJECT=... uv run llama stack build --image-type venv --providers inference=remote::vertexai --run
...
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model vertexai/vertex_ai/gemini-2.5-flash tests/integration/inference/test_openai_completion.py
...
```
i don't have an account to test this. `get_api_key` may also need to be
updated per
https://cloud.google.com/vertex-ai/generative-ai/docs/start/openai
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Fix pre-commit issues: non executable shebang file, @pytest.mark.asyncio
decorator
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The test_query_adds_vector_db_id_to_chunk_metadata test was failing
because MemoryToolRuntimeImpl.__init__() now requires a files_api
parameter.
Fixes failing unit tests for Python 3.12 and 3.13.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
When running RAG in a multi vector DB setting, it can be difficult to
trace where retrieved chunks originate from. This PR adds the
`vector_db_id` into each chunk’s metadata, making it easier to
understand which database a given chunk came from. This is helpful for
debugging and for analyzing retrieval behavior of multiple DBs.
Relevant code:
```python
for vector_db_id, result in zip(vector_db_ids, results):
    for chunk, score in zip(result.chunks, result.scores):
        if not hasattr(chunk, "metadata") or chunk.metadata is None:
            chunk.metadata = {}
        chunk.metadata["vector_db_id"] = vector_db_id
        chunks.append(chunk)
        scores.append(score)
```
## Test Plan
* Ran Llama Stack in debug mode.
* Verified that `vector_db_id` was added to each chunk’s metadata.
* Confirmed that the metadata was printed in the console when using the
RAG tool.
---------
Co-authored-by: are-ces <cpompeia@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
enables completions storage when using `llama stack build --providers` -
- GET /v1/chat/completions
- GET /v1/chat/completions/{id}
todo: llama stack build and distro codegen should use the same code
paths
## Test Plan
ci
This PR refactors the integration test system to use global "setups"
which provides better separation of concerns:
**suites = what to test, setups = how to configure.**
NOTE: if you have naming suggestions, please provide feedback
Changes:
- New `tests/integration/setups.py` with global, reusable configurations
(ollama, vllm, gpt, claude)
- Modified `scripts/integration-tests.sh` options to match with the
underlying pytest options
- Updated documentation to reflect the new global setup system
The main benefit is that setups can be reused across multiple suites
(e.g., use "gpt" with any suite) even though sometimes they could
specifically tailored for a suite (vision <> ollama-vision). It is now
easier to add new configurations without modifying existing suites.
Usage examples:
- `pytest tests/integration --suite=responses --setup=gpt`
- `pytest tests/integration --suite=vision` # auto-selects
"ollama-vision" setup
- `pytest tests/integration --suite=base --setup=vllm`
# What does this PR do?
This PR adds support for OpenAI Prompts API.
Note, OpenAI does not explicitly expose the Prompts API but instead
makes it available in the Responses API and in the [Prompts
Dashboard](https://platform.openai.com/docs/guides/prompting#create-a-prompt).
I have added the following APIs:
- CREATE
- GET
- LIST
- UPDATE
- Set Default Version
The Set Default Version API is made available only in the Prompts
Dashboard and configures which prompt version is returned in the GET
(the latest version is the default).
Overall, the expected functionality in Responses will look like this:
```python
from openai import OpenAI
client = OpenAI()
response = client.responses.create(
    prompt={
        "id": "pmpt_68b0c29740048196bd3a6e6ac3c4d0e20ed9a13f0d15bf5e",
        "version": "2",
        "variables": {
            "city": "San Francisco",
            "age": 30,
        }
    }
)
```
### Resolves https://github.com/llamastack/llama-stack/issues/3276
## Test Plan
Unit tests added. Integration tests can be added after client
generation.
## Next Steps
1. Update Responses API to support Prompt API
2. I'll enhance the UI to implement the Prompt Dashboard.
3. Add cache for lower latency
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Add Kubernetes authentication provider support
- Add KubernetesAuthProvider class for token validation using Kubernetes
SelfSubjectReview API
- Add KubernetesAuthProviderConfig with configurable API server URL, TLS
settings, and claims mapping
- Implement authentication via POST requests to
/apis/authentication.k8s.io/v1/selfsubjectreviews endpoint
- Add support for parsing Kubernetes SelfSubjectReview response format
to extract user information
- Add KUBERNETES provider type to AuthProviderType enum
- Update create_auth_provider factory function to handle 'kubernetes'
provider type
- Add comprehensive unit tests for KubernetesAuthProvider functionality
- Add documentation with configuration examples and usage instructions
The provider validates tokens by sending SelfSubjectReview requests to
the Kubernetes API server and extracts user information from the
userInfo structure in the response.
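A hedged sketch of that request/response flow (illustrative; the actual provider also applies the configured TLS settings and claims mapping):
```python
import httpx


async def validate_token(api_server_url: str, token: str) -> dict:
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"{api_server_url}/apis/authentication.k8s.io/v1/selfsubjectreviews",
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
            json={"apiVersion": "authentication.k8s.io/v1", "kind": "SelfSubjectReview"},
        )
        resp.raise_for_status()
        # User identity is under status.userInfo in the SelfSubjectReview response.
        return resp.json()["status"]["userInfo"]
```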
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
What this verifies:
- Authentication header validation
- Token validation with Kubernetes SelfSubjectReview against the Kubernetes API server endpoint
- Error handling for invalid tokens and HTTP errors
- Request payload structure and headers
```
python -m pytest tests/unit/server/test_auth.py -k "kubernetes" -v
```
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
Bumps
[@radix-ui/react-dropdown-menu](https://github.com/radix-ui/primitives)
from 2.1.14 to 2.1.16.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[react-dom](https://github.com/facebook/react/tree/HEAD/packages/react-dom)
and
[@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom).
These dependencies needed to be updated together.
Updates `react-dom` from 19.1.0 to 19.1.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/facebook/react/releases">react-dom's
releases</a>.</em></p>
<blockquote>
<h2>19.1.1 (July 28, 2025)</h2>
<h3>React</h3>
<ul>
<li>Fixed Owner Stacks to work with ES2015 function.name semantics (<a
href="https://redirect.github.com/facebook/react/pull/33680">#33680</a>
by <a href="https://github.com/hoxyq"><code>@hoxyq</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/facebook/react/blob/main/CHANGELOG.md">react-dom's
changelog</a>.</em></p>
<blockquote>
<h2>19.1.1 (July 28, 2025)</h2>
<h3>React</h3>
<ul>
<li>Fixed Owner Stacks to work with ES2015 function.name semantics (<a
href="https://redirect.github.com/facebook/react/pull/33680">#33680</a>
by <a href="https://github.com/hoxyq"><code>@hoxyq</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="87e33ca2b7"><code>87e33ca</code></a>
Set release versions to 19.1.1</li>
<li><a
href="b793948e15"><code>b793948</code></a>
Bump next prerelease version numbers (<a
href="https://github.com/facebook/react/tree/HEAD/packages/react-dom/issues/32782">#32782</a>)</li>
<li>See full diff in <a
href="https://github.com/facebook/react/commits/v19.1.1/packages/react-dom">compare
view</a></li>
</ul>
</details>
<br />
Updates `@types/react-dom` from 19.1.5 to 19.1.9
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
update the Anthropic inference provider to use openai-python for the
openai-compat endpoints
## Test Plan
ci
Co-authored-by: raghotham <rsm@meta.com>
# What does this PR do?
update Groq inference provider to use OpenAIMixin for openai-compat
endpoints
changes on api.groq.com -
- json_schema is now supported for specific models, see
https://console.groq.com/docs/structured-outputs#supported-models
- response_format with streaming is now supported for models that
support response_format
- groq no longer returns a 400 error if tools are provided and
tool_choice is not "required"
## Test Plan
```
$ GROQ_API_KEY=... uv run llama stack build --image-type venv --providers inference=remote::groq --run
...
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model groq/llama-3.3-70b-versatile tests/integration/inference/test_openai_completion.py -k 'not store'
...
SKIPPED [3] tests/integration/inference/test_openai_completion.py:44: Model groq/llama-3.3-70b-versatile hosted by remote::groq doesn't support OpenAI completions.
SKIPPED [3] tests/integration/inference/test_openai_completion.py:94: Model groq/llama-3.3-70b-versatile hosted by remote::groq doesn't support vllm extra_body parameters.
SKIPPED [4] tests/integration/inference/test_openai_completion.py:73: Model groq/llama-3.3-70b-versatile hosted by remote::groq doesn't support n param.
SKIPPED [1] tests/integration/inference/test_openai_completion.py:100: Model groq/llama-3.3-70b-versatile hosted by remote::groq doesn't support chat completion calls with base64 encoded files.
======================= 8 passed, 11 skipped, 8 deselected, 2 warnings in 5.13s ========================
```
---------
Co-authored-by: raghotham <rsm@meta.com>
# What does this PR do?
this test runs on each PR and uses a new conformance workflow to compare
the base (main) branch openapi spec to the one on this PR. If one of our
"stable" APIs changes, the test will fail.
this workflow uses `oasdiff` to identify breaking changes for paths we
want to ensure compatibility for.
specifically this is using `oasdiff breaking` with `--match-path` which
only checks breaking changes for the specified paths.
As a follow up to this, we can add an optional way to make it so that it
is OK to make these changes if properly documented, e.g. in a changelog,
or by using a label on the PR to override the failing test.
related to #3237
## Test Plan
conformance test should pass given there are no changes
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
update SambaNova inference provider to use OpenAIMixin for openai-compat
endpoints
## Test Plan
```
$ SAMBANOVA_API_KEY=... uv run llama stack build --image-type venv --providers inference=remote::sambanova --run
...
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -v -ra --text-model sambanova/Meta-Llama-3.3-70B-Instruct tests/integration/inference -k 'not store'
...
FAILED tests/integration/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[txt=sambanova/Meta-Llama-3.3-70B-Instruct-inference:chat_completion:tool_calling_tools_absent-True] - AttributeError: 'NoneType' object has no attribute 'delta'
FAILED tests/integration/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[txt=sambanova/Meta-Llama-3.3-70B-Instruct-inference:chat_completion:tool_calling_tools_absent-False] - llama_stack_client.InternalServerError: Error code: 500 - {'detail': 'Internal server error: An une...
=========== 2 failed, 16 passed, 68 skipped, 8 deselected, 3 xfailed, 13 warnings in 15.85s ============
```
the two failures also existed before this change. they are part of the
deprecated inference.chat_completion tests that flow through litellm.
they can be resolved later.
# What does this PR do?
update the Gemini inference provider to use openai-python for the
openai-compat endpoints
partially addresses #3349, does not address /inference/completion or
/inference/chat-completion
## Test Plan
ci
Our integration tests need to be 'grouped' because each group often
needs a specific set of models it works with. We separated vision tests
due to this, and we have a separate set of tests which test "Responses"
API.
This PR makes this system a bit more official so it is very easy to
target these groups and apply all testing infrastructure towards all the
groups (for example, record-replay) uniformly.
There are three suites declared:
- base
- vision
- responses
Note that our CI currently runs the "base" and "vision" suites.
You can use the `--suite` option when running pytest (or any of the
testing scripts or workflows.) For example:
```
OLLAMA_URL=http://localhost:11434 \
pytest -s -v tests/integration/ --stack-config starter --suite vision
```
# What does this PR do?
This change migrates the VectorDB id generation to Vector Stores.
This is a breaking change for **_some users_** that may have application
code using the `vector_db_id` parameter in the request of the VectorDB
protocol instead of the `VectorDB.identifier` in the response.
By default we will now create a Vector Store every time we register a
VectorDB. The caveat with this approach is that this maps the
`vector_db_id` → `vector_store.name`. This is a reasonable tradeoff to
transition users towards OpenAI Vector Stores.
As an added benefit, registering VectorDBs will result in them appearing
in the VectorStores admin UI.
### Why?
This PR makes the `POST` API call to `/v1/vector-dbs` swap the
`vector_db_id` parameter in the **request body** into the VectorStore's
name field and sets the `vector_db_id` to the generated vector store id
(e.g., `vs_038247dd-4bbb-4dbb-a6be-d5ecfd46cfdb`).
That means that users would have to do something like follows in their
application code:
```python
res = client.vector_dbs.register(
vector_db_id='my-vector-db-id',
embedding_model='ollama/all-minilm:l6-v2',
embedding_dimension=384,
)
vector_db_id = res.identifier
```
And then the rest of their code would behave as before, including `VectorIO`'s
insert protocol using `vector_db_id` in the request.
An alternative implementation would be to just delete the `vector_db_id`
parameter in `VectorDB` but the end result would still require users
having to write `vector_db_id = res.identifier` since
`VectorStores.create()` generates the ID for you.
So this approach felt the easiest way to migrate users towards
VectorStores (subsequent PRs will be added to trigger `files.create()`
and `vector_stores.files.create()`).
## Test Plan
Unit tests and integration tests have been added.
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Improved bedrock provider config to read from environment variables like
AWS_ACCESS_KEY_ID. Updated all
fields to use default_factory with lambda patterns like the nvidia
provider does.
Now the environment variables work as documented.
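For illustration, the pattern looks roughly like this (field names are assumptions, not the exact config class):
```python
import os

from pydantic import BaseModel, Field


class BedrockBaseConfig(BaseModel):
    aws_access_key_id: str | None = Field(
        default_factory=lambda: os.getenv("AWS_ACCESS_KEY_ID"),
        description="AWS access key, read from the environment when unset",
    )
    aws_secret_access_key: str | None = Field(
        default_factory=lambda: os.getenv("AWS_SECRET_ACCESS_KEY"),
        description="AWS secret key, read from the environment when unset",
    )
    region_name: str | None = Field(
        default_factory=lambda: os.getenv("AWS_DEFAULT_REGION"),
        description="AWS region, read from the environment when unset",
    )
```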
Closes #3305
## Test Plan
Ran the new bedrock config tests:
```bash
python -m pytest tests/unit/providers/inference/bedrock/test_config.py -v
```
Verified existing provider tests still work:
```bash
python -m pytest tests/unit/providers/test_configs.py -v
```
# What does this PR do?
The inference store writes were moved to asyncio.create_task and not
await anymore
## Test Plan
❯ OLLAMA_URL=http://localhost:11434 LLAMA_STACK_CONFIG=server:starter uv
run --with pytest-repeat pytest tests/integration/inference
--text-model="ollama/llama3.2:3b-instruct-fp16" -vvs -k
"test_inference_store_tool_calls and 3b-instruct-fp16-True" --count=10
Uninstalled 2 packages in 102ms
Installed 2 packages in 138ms
INFO 2025-09-04 14:10:17,775 tests.integration.conftest:66 tests:
Setting DISABLE_CODE_SANDBOX=1 for macOS
==========================================================================================================
test session starts
===========================================================================================================
platform darwin -- Python 3.12.3, pytest-8.4.1, pluggy-1.6.0 --
/Users/erichuang/.cache/uv/builds-v0/.tmpSGMlgt/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.3', 'Platform':
'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.1',
'pluggy': '1.6.0'}, 'Plugins': {'repeat': '0.9.4', 'anyio': '4.9.0',
'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report':
'1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1',
'nbval': '0.11.0'}}
rootdir: /Users/erichuang/projects/llama-stack-git
configfile: pyproject.toml
plugins: repeat-0.9.4, anyio-4.9.0, html-4.1.1, socket-0.7.0,
asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1,
cov-6.2.1, nbval-0.11.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None,
asyncio_default_test_loop_scope=function
collected 970 items / 950 deselected / 20 selected
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-1-10]
instantiating llama_stack_client
Starting llama stack server with config 'starter' on port 8321...
Waiting for server at http://localhost:8321... (0.0s elapsed)
Waiting for server at http://localhost:8321... (0.5s elapsed)
Waiting for server at http://localhost:8321... (5.1s elapsed)
Waiting for server at http://localhost:8321... (5.6s elapsed)
Waiting for server at http://localhost:8321... (10.1s elapsed)
Waiting for server at http://localhost:8321... (10.6s elapsed)
Waiting for server at http://localhost:8321... (15.2s elapsed)
Waiting for server at http://localhost:8321... (15.7s elapsed)
Server is ready at http://localhost:8321
llama_stack_client instantiated in 20.583s
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-2-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-3-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-4-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-5-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-6-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-7-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-8-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-9-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=ollama/llama3.2:3b-instruct-fp16-True-10-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-1-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-2-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-3-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-4-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-5-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-6-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-7-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-8-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-9-10]
PASSED
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=ollama/llama3.2:3b-instruct-fp16-True-10-10]
PASSED
Terminating llama stack server process...
Terminating process 53307 and its group...
Server process and children terminated gracefully
# What does this PR do?
Fixes error handling when MCP server connections fail. Instead of
returning generic 500 errors, now provides
descriptive error messages with proper HTTP status codes.
Closes #3107
## Test Plan
Before fix:
curl -X GET
"http://localhost:8321/v1/tool-runtime/list-tools?tool_group_id=bad-mcp-server"
Returns: {"detail": "Internal server error: An unexpected error
occurred."} (500)
After fix:
curl -X GET
"http://localhost:8321/v1/tool-runtime/list-tools?tool_group_id=bad-mcp-server"
Returns: {"error": {"detail": "Failed to connect to MCP server at
http://localhost:9999/sse: Connection
refused"}} (502)
Tests:
- Added unit test for ConnectionError → 502 translation
- Manually tested with unreachable MCP servers (connection refused)
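For illustration, a minimal sketch of this kind of translation, assuming a FastAPI route and a hypothetical `list_mcp_tools` helper (these names and the exact error shape are not taken from the actual llama-stack code):
```python
from fastapi import FastAPI, HTTPException

app = FastAPI()

async def list_mcp_tools(endpoint: str) -> list[str]:
    # Stand-in for the real MCP client call, which can raise ConnectionError.
    raise ConnectionError("Connection refused")

@app.get("/v1/tool-runtime/list-tools")
async def list_tools(tool_group_id: str) -> list[str]:
    endpoint = "http://localhost:9999/sse"  # would normally be looked up from tool_group_id
    try:
        return await list_mcp_tools(endpoint)
    except ConnectionError as exc:
        # 502 Bad Gateway: the upstream MCP server could not be reached.
        raise HTTPException(
            status_code=502,
            detail=f"Failed to connect to MCP server at {endpoint}: {exc}",
        ) from exc
```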
# What does this PR do?
Noticed the test
https://github.com/llamastack/llama-stack-ops/actions/workflows/test-maybe-cut.yaml
are still failing randomly.
This was fixed earlier with fireworks 0.18.0 in
https://github.com/llamastack/llama-stack/pull/3267, but local testing with
`<=` may have inadvertently picked a lower version, since I assumed `<=`
would resolve to the latest version.
This time I tested with `==` to find the version where it broke, and pinned
(`<=`) to the version that was passing.
## Test Plan
Tested locally with the following commands to start a container:
1. Build container: `llama stack build --distro starter --image-type container`
2. Start container: `docker run -d -p 8321:8321 --name llama-stack-test distribution-starter:0.2.20`
3. Check health: `http://localhost:8321/v1/health`
The above steps fail without the fix.
Tested with `==` to ensure the same version is picked in local testing
instead of anything lower.
Following the fix from `fireworks-ai` here:
https://github.com/llamastack/llama-stack/issues/3273
- Wrap model loading with asyncio.to_thread() to prevent blocking during
model download/initialization
- Wrap encoding operations with asyncio.to_thread() to run in background
thread
- Convert _load_sentence_transformer_model() to async method
This ensures the async event loop remains responsive during embedding
operations.
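A minimal sketch of the pattern (assuming the provider wraps `sentence-transformers` directly; the function names here are illustrative, not the provider's actual methods):
```python
import asyncio

from sentence_transformers import SentenceTransformer

async def load_model(model_id: str) -> SentenceTransformer:
    # Model download/initialization can block for a long time; run it in a thread.
    return await asyncio.to_thread(SentenceTransformer, model_id)

async def embed(model: SentenceTransformer, texts: list[str]) -> list[list[float]]:
    # encode() is CPU-bound; to_thread keeps other coroutines running meanwhile.
    embeddings = await asyncio.to_thread(model.encode, texts)
    return embeddings.tolist()
```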
Closes: #3332
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR eliminates the hardcoded status codes `409` (CONFLICT) and `404`
(NOT_FOUND) in `server.py` by using `httpx` built-in constants. The
implementation follows the existing structure to improve readability,
extensibility, and developer experience. The same approach was already
applied in #3131
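For illustration, a hedged sketch of the idea, using `httpx.codes` in place of magic numbers (the exception-to-status mapping shown is simplified and not the exact `server.py` logic):
```python
import httpx
from fastapi import HTTPException

def translate_exception(exc: Exception) -> HTTPException:
    # Named constants instead of hardcoded 409 / 404 / 500 literals.
    if isinstance(exc, FileExistsError):
        return HTTPException(status_code=httpx.codes.CONFLICT, detail=str(exc))
    if isinstance(exc, FileNotFoundError):
        return HTTPException(status_code=httpx.codes.NOT_FOUND, detail=str(exc))
    return HTTPException(status_code=httpx.codes.INTERNAL_SERVER_ERROR, detail=str(exc))

print(translate_exception(FileNotFoundError("vector store not found")).status_code)  # 404
```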
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
`./scripts/unit-tests.sh`
Update the file pattern from 'llama_stack/templates' to
'llama_stack/distributions' to properly trigger the Distribution
Template Codegen hook when distribution files change.
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Sometimes the stream doesn't include a chunk with finish_reason (e.g. a
canceled stream), which raises a pydantic validation error because
OpenAIChoice.finish_reason is typed as str.
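A minimal sketch of the kind of fix this implies, assuming the field simply becomes optional (the model below is illustrative, not the actual llama-stack definition):
```python
from pydantic import BaseModel

class OpenAIChoice(BaseModel):
    text: str
    # Tolerate chunks that never carry a finish_reason (e.g. canceled streams).
    finish_reason: str | None = None

print(OpenAIChoice(text="partial output").finish_reason)  # None, no validation error
```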
## Test Plan
Observed that the error no longer occurs when benchmarking.
One needed to specify record-replay related environment variables for
running integration tests. We could not use defaults because integration
tests could be run against Ollama instances which could be running
different models. For example, text vs vision tests needed separate
instances of Ollama because a single instance typically cannot serve
both of these models under the standard CI worker configuration on GitHub.
As a result, `client.list()` as returned by the Ollama client
would be different between these runs and we'd end up overwriting
responses.
This PR "solves" it by adding a small amount of complexity -- we store
model list responses specially, keyed by the hashes of the models they
return. At replay time, we merge all of them and pretend that we have
the union of all models available.
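Conceptually, the record/replay behavior described above looks roughly like the sketch below (illustrative only; the real recorder's storage format and keys are assumptions here):
```python
import hashlib
import json

_model_list_store: dict[str, list[str]] = {}

def record_model_list(models: list[str]) -> None:
    # Key the recorded response by a hash of the models it contains.
    key = hashlib.sha256(json.dumps(sorted(models)).encode()).hexdigest()
    _model_list_store[key] = sorted(models)

def replay_model_list() -> list[str]:
    # Pretend the union of all recorded model lists is available.
    merged: set[str] = set()
    for models in _model_list_store.values():
        merged.update(models)
    return sorted(merged)

record_model_list(["llama3.2:3b-instruct-fp16"])  # text-serving Ollama instance
record_model_list(["llama3.2-vision:11b"])        # vision-serving Ollama instance
print(replay_model_list())  # both models appear available at replay time
```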
## Test Plan
Re-recorded all the tests using `scripts/integration-tests.sh
--inference-mode record`, including the vision tests.
Generated with CC:
Replace cryptic KeyError with clear, actionable error message that
shows:
- Which API the failing provider belongs to
- The provider ID and type that's failing
- Which dependency is missing
- Clear instructions on how to fix the issue
## Test Plan
Use a run config with Agents API and no safety provider
Before: KeyError: <Api.safety: 'safety'>
After: Failed to resolve 'agents' provider 'meta-reference' of type
'inline::meta-reference': required dependency 'safety' is not available.
Please add a 'safety' provider to your configuration or check if the
provider is properly configured.
# What does this PR do?
This PR updates the Watsonx provider dependencies from
`ibm_watson_machine_learning` to `ibm_watsonx_ai`.
The old package `ibm_watson_machine_learning` is in **deprecation mode**
([PyPI link](https://pypi.org/project/ibm-watson-machine-learning/))
and relies on older versions of dependencies such as `pandas`. Updating
to `ibm_watsonx_ai` ensures compatibility with current dependency
versions and ongoing support.
## Test Plan
I verified the update by running an inference using a model provided by
Watsonx. The model ran successfully, confirming that the new dependency
works as expected.
Co-authored-by: are-ces <cpompeia@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to refactor `SQLiteVecIndex` to eliminate
redundant code and simplify it by using the generic
`WeightedInMemoryAggregator`, which can be used for any vector db
provider. This pattern is already implemented for `PGVectorIndex` in
#3064
CC: @franciscojavierarceo
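Roughly, the kind of aggregation such a generic helper performs for hybrid search looks like the sketch below (illustrative only; the actual `WeightedInMemoryAggregator` API and scoring details may differ):
```python
def weighted_rerank(
    vector_scores: dict[str, float],
    keyword_scores: dict[str, float],
    alpha: float = 0.5,
) -> dict[str, float]:
    # Combine normalized vector and keyword scores per chunk with one weight.
    all_ids = set(vector_scores) | set(keyword_scores)
    return {
        chunk_id: alpha * vector_scores.get(chunk_id, 0.0)
        + (1 - alpha) * keyword_scores.get(chunk_id, 0.0)
        for chunk_id in all_ids
    }

scores = weighted_rerank({"c1": 0.9, "c2": 0.4}, {"c2": 0.8, "c3": 0.6})
print(sorted(scores, key=scores.get, reverse=True))  # ['c2', 'c1', 'c3']
```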
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
1. `./scripts/unit-tests.sh`
2. Integration tests in CI Workflow
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR eliminates the `llama-api-client` dependency in `pyproject.toml`
because it is not used in the Llama Stack codebase
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
`./scripts/unit-tests.sh`
Bumps [locust](https://github.com/locustio/locust) from 2.39.0 to
2.39.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/releases">locust's
releases</a>.</em></p>
<blockquote>
<h2>2.39.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Avoid broken gevent version for now by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3196">locustio/locust#3196</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/JumboBear"><code>@JumboBear</code></a>
made their first contribution in <a
href="https://redirect.github.com/locustio/locust/pull/3195">locustio/locust#3195</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.39.0...2.39.1">https://github.com/locustio/locust/compare/2.39.0...2.39.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/blob/master/CHANGELOG.md">locust's
changelog</a>.</em></p>
<blockquote>
<h1>Detailed changelog</h1>
<p>The most important changes can also be found in <a
href="https://docs.locust.io/en/latest/changelog.html">the
documentation</a>.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="934c5c33e4"><code>934c5c3</code></a>
changelog</li>
<li><a
href="9350084ec0"><code>9350084</code></a>
disable macos build for now</li>
<li><a
href="705e2f658b"><code>705e2f6</code></a>
Disable another unit test on macos because of annoying behavior on GH
(really...</li>
<li><a
href="d888b9db2b"><code>d888b9d</code></a>
Disable another unit test on macos because of annoying behavior on
GH</li>
<li><a
href="45bc4d84fd"><code>45bc4d8</code></a>
Disable annoying test case on macos for now. Only has issues on GH. <a
href="https://github.com/amadeupp"><code>@amadeupp</code></a>...</li>
<li><a
href="9d7710a2da"><code>9d7710a</code></a>
unit tests: give extra time for testing on macOS</li>
<li><a
href="fcbc740e04"><code>fcbc740</code></a>
Avoid broken gevent version for now (<a
href="https://redirect.github.com/locustio/locust/issues/3196">#3196</a>)</li>
<li><a
href="cd1f600d44"><code>cd1f600</code></a>
mypy</li>
<li><a
href="0cf52dc990"><code>0cf52dc</code></a>
Autogen changelog for 2.39.0</li>
<li><a
href="094395e024"><code>094395e</code></a>
Merge pull request <a
href="https://redirect.github.com/locustio/locust/issues/3195">#3195</a>
from JumboBear/pyproject</li>
<li>Additional commits viewable in <a
href="https://github.com/locustio/locust/compare/2.39.0...2.39.1">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [@radix-ui/react-tooltip](https://github.com/radix-ui/primitives)
from 1.2.6 to 1.2.8.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 20.17.47 to 24.3.0.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [framer-motion](https://github.com/motiondivision/motion) from
11.18.2 to 12.23.12.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/motiondivision/motion/blob/main/CHANGELOG.md">framer-motion's
changelog</a>.</em></p>
<blockquote>
<h2>[12.23.12] 2025-07-29</h2>
<h3>Added</h3>
<ul>
<li>Exporting internal APIs for use in view animations.</li>
</ul>
<h2>[12.23.11] 2025-07-28</h2>
<h3>Added</h3>
<ul>
<li>Children of variants with <code>delayChildren: stagger()</code> will
now be staggered correctly alongside their newly-entering siblings.</li>
</ul>
<h2>[12.23.10] 2025-07-28</h2>
<h3>Fixed</h3>
<ul>
<li>Fixed shared layout animation in situations where no
<code>motion</code> components have re-rendered between shared element
switching.</li>
</ul>
<h2>[12.23.9] 2025-07-24</h2>
<h3>Changed</h3>
<ul>
<li>Removing redundant <code>renderRequest</code>
<code>MotionValue</code> lifecycle.</li>
</ul>
<h2>[12.23.8] 2025-07-24</h2>
<h3>Fixed</h3>
<ul>
<li>Ensuring that when an animation is skipped via <code>duration =
0</code> that we also set <code>type = "keyframes"</code> so
that <code>duration</code> takes effect.</li>
</ul>
<h2>[12.23.7] 2025-07-23</h2>
<h3>Fixed</h3>
<ul>
<li><code>springValue</code> cleanup.</li>
<li>Removed additional <code>removeNode</code> from
<code>AnimatePresence</code> when using <code>popLayout</code>.</li>
</ul>
<h2>[12.23.6] 2025-07-11</h2>
<h3>Changed</h3>
<ul>
<li>Added explainer for reduced motion warning.</li>
<li>Refactored <code>motion</code> component creation to remove
indirection.</li>
</ul>
<h2>[12.23.5] 2025-07-11</h2>
<h3>Fixed</h3>
<ul>
<li>Fix animation timings within dynamically-generated popups.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e0f7e07570"><code>e0f7e07</code></a>
v12.23.12</li>
<li><a
href="994515fef3"><code>994515f</code></a>
Updating changelog</li>
<li><a
href="95d82ff919"><code>95d82ff</code></a>
Merge pull request <a
href="https://redirect.github.com/motiondivision/motion/issues/3338">#3338</a>
from motiondivision/feature/next-page-transitions</li>
<li><a
href="58b2e8cde4"><code>58b2e8c</code></a>
Exporting APIs for view transitions</li>
<li><a
href="b6f2132fb6"><code>b6f2132</code></a>
Update README.md</li>
<li><a
href="38298c41fc"><code>38298c4</code></a>
Update README.md</li>
<li><a
href="76396b0187"><code>76396b0</code></a>
Update README.md</li>
<li><a
href="b273d064a3"><code>b273d06</code></a>
Update README.md</li>
<li><a
href="c0bd6effa9"><code>c0bd6ef</code></a>
v12.23.11</li>
<li><a
href="e9b52af3e2"><code>e9b52af</code></a>
Updating changelog</li>
<li>Additional commits viewable in <a
href="https://github.com/motiondivision/motion/compare/v11.18.2...v12.23.12">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
BFCL scoring function is not supported, removing it.
Also includes minor fixes, since `llama stack run` was broken for
open-benchmark during test plan verification:
1. Correct the model paths for supported models
2. Fix another issue: `DatasetInput` has no `provider_id`, but the logger
assumes it exists.
```
File "/Users/swapna942/llama-stack/llama_stack/core/stack.py", line 332, in construct_stack
await register_resources(run_config, impls)
File "/Users/swapna942/llama-stack/llama_stack/core/stack.py", line 108, in register_resources
logger.debug(f"registering {rsrc.capitalize()} {obj} for provider {obj.provider_id}")
^^^^^^^^^^^^^^^
File "/Users/swapna942/llama-stack/.venv/lib/python3.13/site-packages/pydantic/main.py", line 991, in __getattr__
raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
AttributeError: 'DatasetInput' object has no attribute 'provider_id'
```
## Test Plan
`llama stack build --distro open-benchmark --image-type venv` and running the server succeeds
Issue Link: https://github.com/llamastack/llama-stack/issues/3282
# What does this PR do?
Add the ability to use inequalities in the where clause of the sqlstore.
This is infrastructure for files expiration.
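A self-contained sketch of what inequality support in a where clause enables, e.g. selecting expired files (the `build_where` helper and operator syntax are illustrative, not the actual sqlstore API):
```python
import sqlite3
import time

_OPS = {"==": "=", "<": "<", "<=": "<=", ">": ">", ">=": ">="}

def build_where(where: dict[str, dict[str, object]]) -> tuple[str, list[object]]:
    # Translate {"column": {"<": value}} style conditions into SQL.
    clauses, params = [], []
    for column, condition in where.items():
        for op, value in condition.items():
            clauses.append(f"{column} {_OPS[op]} ?")
            params.append(value)
    return " AND ".join(clauses), params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id TEXT, expires_at INTEGER)")
conn.execute("INSERT INTO files VALUES ('old', 1), ('fresh', 9999999999)")

sql, params = build_where({"expires_at": {"<": int(time.time())}})
rows = conn.execute(f"SELECT id FROM files WHERE {sql}", params).fetchall()
print(rows)  # [('old',)] -- only the expired file matches
```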
## Test Plan
unit tests
# What does this PR do?
Fixes an incompatibility with the `openai` package introduced through the
new dependency chain fireworks-ai==0.19.18 -> reward-kit, by pinning
fireworks to an older version that doesn't pull in reward-kit.
## Test Plan
Tested locally with the following commands to start a container
1. Build container
`llama stack build --distro starter --image-type container`
2. start container `docker run -d -p 8321:8321 --name llama-stack-test
distribution-starter:0.2.19`
3. check health http://localhost:8321/v1/health
Above steps fails without the fix
# What does this PR do?
During env var replacement, we're implicitly converting all config types
to their apparent types (e.g., "true" to True, "123" to 123). This is
arguably useful when doing an env var substitution, since those values are
always strings, but we should definitely avoid touching config values that
have explicit types and are uninvolved in env var substitution.
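A hedged sketch of the intended behavior: cast only values that actually went through env var substitution, and leave everything else untouched (the `${env.VAR:=default}` pattern and helper names below are illustrative):
```python
import os
import re

# Illustrative pattern for "${env.VAR}" / "${env.VAR:=default}" references.
_ENV_PATTERN = re.compile(r"\$\{env\.(\w+)(?::=(.*))?\}")

def _cast(value: str):
    # Cast the *substituted* string to its apparent type.
    if value.lower() in ("true", "false"):
        return value.lower() == "true"
    try:
        return int(value)
    except ValueError:
        return value

def replace_env_vars(value):
    if isinstance(value, dict):
        return {k: replace_env_vars(v) for k, v in value.items()}
    if isinstance(value, str) and (match := _ENV_PATTERN.fullmatch(value)):
        resolved = os.environ.get(match.group(1), match.group(2) or "")
        return _cast(resolved)  # casting is fine here: the source was an env string
    return value  # explicitly typed or plain values pass through unchanged

print(replace_env_vars({"debug": "true", "port": 8321,
                        "url": "${env.OLLAMA_URL:=http://localhost:11434}"}))
# {'debug': 'true', 'port': 8321, 'url': 'http://localhost:11434'} (when OLLAMA_URL is unset)
```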
## Test Plan
Unit
**Description:**
Adding information and guidelines on when contributors should create an
in-tree vs out-of-tree provider.
I'm still learning a bit about this subject, so I'm very open to feedback
on this PR
Will also add this section to the API Providers section of the docs
# What does this PR do?
Finding these issues while moving to github pages.
## Test Plan
uv run --group docs sphinx-autobuild docs/source docs/build/html
--write-all
# What does this PR do?
The post-training docs are missing references to the more in-depth
`huggingface.md` and `torchtune.md`, which explain how to actually use
the providers.
These files show up in search though.
Add references to these files into the `inline_..md` files currently
pointed to by `index.md`
Signed-off-by: Charlie Doern <cdoern@redhat.com>
The `trl` dependency brings in `accelerate` which brings in nvidia
dependencies for torch. We cannot have that in the starter distro. As
such, no CPU-only post-training for the huggingface provider.
# What does this PR do?
Add a Llama Stack + LangChain integration example notebook
## Test Plan
Ran in Jupyter notebook, works end to end.
(Used Claude mainly for documentation and coding/debugging help)
Recording files use a predictable naming format, making the SQLite index
redundant. The binary SQLite file was causing frequent git conflicts.
Simplify by calculating file paths directly from request hashes.
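For illustration, deriving a recording path directly from a request hash might look like this (a sketch under assumed naming conventions, not the actual implementation):
```python
import hashlib
import json
from pathlib import Path

def recording_path(base_dir: Path, endpoint: str, body: dict) -> Path:
    # A stable hash of the request determines the file name; no index needed.
    request_hash = hashlib.sha256(
        json.dumps({"endpoint": endpoint, "body": body}, sort_keys=True).encode()
    ).hexdigest()
    return base_dir / f"{request_hash}.json"

print(recording_path(Path("recordings"), "/v1/chat/completions", {"model": "llama3.2"}))
```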
Signed-off-by: Derek Higgins <derekh@redhat.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.5.0 to 6.6.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4959332f0f"><code>4959332</code></a>
Bump dependencies (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/532">#532</a>)</li>
<li><a
href="adeb28643f"><code>adeb286</code></a>
Add support for .tools-versions (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/531">#531</a>)</li>
<li><a
href="fce199e243"><code>fce199e</code></a>
Add log message before long API calls to GitHub (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/530">#530</a>)</li>
<li><a
href="f758a4a1eb"><code>f758a4a</code></a>
chore: update known versions for 0.8.12 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/529">#529</a>)</li>
<li><a
href="c0e7e93474"><code>c0e7e93</code></a>
chore: update known versions for 0.8.11 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/526">#526</a>)</li>
<li><a
href="fda2399cb3"><code>fda2399</code></a>
chore: update known versions for 0.8.10 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/525">#525</a>)</li>
<li>See full diff in <a
href="d9e0f98d3f...4959332f0f">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@testing-library/dom](https://github.com/testing-library/dom-testing-library)
from 10.4.0 to 10.4.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/testing-library/dom-testing-library/releases"><code>@testing-library/dom</code>'s
releases</a>.</em></p>
<blockquote>
<h2>v10.4.1</h2>
<h2><a
href="https://github.com/testing-library/dom-testing-library/compare/v10.4.0...v10.4.1">10.4.1</a>
(2025-07-27)</h2>
<h3>Bug Fixes</h3>
<ul>
<li><strong>deps:</strong> replace chalk with picocolors (<a
href="https://redirect.github.com/testing-library/dom-testing-library/issues/1341">#1341</a>)
(<a
href="225a3e4cfa">225a3e4</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="225a3e4cfa"><code>225a3e4</code></a>
fix(deps): replace chalk with picocolors (<a
href="https://redirect.github.com/testing-library/dom-testing-library/issues/1341">#1341</a>)</li>
<li>See full diff in <a
href="https://github.com/testing-library/dom-testing-library/compare/v10.4.0...v10.4.1">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The starter distribution added post-training which added torch
dependencies which pulls in all the nvidia CUDA libraries. This made our
starter container very big. We have worked hard to keep the starter
container small so it serves its purpose as a starter. This PR tries to
get it back to its size by forking off duplicate "-gpu" providers for
post-training. These forked providers are then used for a new
`starter-gpu` distribution which can pull in all dependencies.
# What does this PR do?
closes https://github.com/llamastack/llama-stack/issues/3236
mypy considered our default implementations (raise NotImplementedError)
to be trivial, with the result that we implemented the same stubs in
providers.
This change puts enough into the default impls so mypy considers them
non-trivial, which allows us to remove the duplicate implementations.
# What does this PR do?
As described in #3134, a LangChain example works against OpenAI's
responses impl, but not against llama stack's. This turned out to be due
to the order of the inputs. The LangChain example has the two function
call outputs first, followed by each call result in turn. This seems to
be valid, as it is accepted by OpenAI's impl. However, in llama stack
these inputs are converted to chat completion inputs, and the resulting
order for that API is not accepted by OpenAI.
This PR fixes the issue by ensuring that the converted chat completions
inputs are in the expected order.
Closes #3134
## Test Plan
Added unit and integration tests. Verified this fixes original issue as
reported.
---------
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
1. Adds `scripts/run-ui-linter.sh`
- Light script that checks whether `node_modules`, `eslint`, and
`prettier` exist before running the linter
- When I introduced [the linter for the
UI](https://github.com/llamastack/llama-stack/pull/3156/files#diff-63a9c44a44acf85fea213a857769990937107cf072831e1a26808cfde9d096b9)
it forced the UI linter on all users; the small `node_modules` check
means that only users who have installed the UI locally (since
`node_modules` is in the gitignore) will actually end up having this
run. Additionally, this does not do any install and just runs the
existing linter/prettier, as requested by @mattf
2. Updates `.github/workflows/pre-commit.yml` to run CI again
- When I introduced the UI linter in the CI [in this
PR](https://github.com/llamastack/llama-stack/pull/3191) a failure
occurred because dependabot needed to be updated to also bump the
`package-lock.json` which was done [in this
PR](https://github.com/llamastack/llama-stack/pull/3212). All of this to
say, we shouldn't observe failures from dependabot again.
3. Updates `.pre-commit-config.yaml`
- Calls `scripts/run-ui-linter.sh`
## AI Assistance Notice
I used Copilot minimally.
## Test Plan
As
[requested](https://github.com/llamastack/llama-stack/pull/3207#discussion_r2288004872)
by @mattf I ran this after removing all of my `node_modules` and the
linter passed.
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Context: https://github.com/meta-llama/llama-stack/issues/2937
The API design is inspired by existing offerings, but not exactly the
same:
* `top_n` as the parameter to control number of results, instead of
`top_k`, since `n` is conventional to control number
* `truncation` bool instead of `max_token_per_doc`, since we should just
handle the truncation automatically depending on model capability,
instead of user setting the context length manually.
* `data` field in the response, to be consistent with other OpenAI APIs
(though they don't have a rerank API). Also, it is one less name to
learn in the API.
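A hedged illustration of the request/response shape this design implies (field names beyond `top_n`, `truncation`, and `data`, as well as the values, are assumptions):
```python
# Example request: rerank two documents against a query, keep the best one.
rerank_request = {
    "model": "example/reranker",
    "query": "What is the capital of France?",
    "items": ["Paris is the capital of France.", "Berlin is in Germany."],
    "top_n": 1,          # number of results to return ("n" convention)
    "truncation": True,  # server truncates to the model's context length
}

# Example response: results wrapped in a "data" field, OpenAI-style.
rerank_response = {
    "data": [
        {"index": 0, "relevance_score": 0.92},
    ],
}
print(rerank_response["data"][0])
```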
## Test Plan
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
- Introduces the Agent Session creation for the Playground and allows
users to set tools
- note tools are actually not usable yet and this is marked explicitly
- this also caches sessions locally for faster loading on the UI and
deletes them appropriately
- allows users to easily create new sessions as well
- Moved Model Configuration settings and "System Message" / Prompt to
the left component
- Added new logo and favicon
- Added new typing animation when LLM is generating
### Create New Session
<img width="1916" height="1393" alt="Screenshot 2025-08-21 at 4 18
08 PM"
src="https://github.com/user-attachments/assets/52c70ae3-a33e-4338-8522-8184c692c320"
/>
### List of Sessions
<img width="1920" height="1391" alt="Screenshot 2025-08-21 at 4 18
56 PM"
src="https://github.com/user-attachments/assets/ed78c3c6-08ec-486c-8bad-9b7382c11360"
/>
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Unit tests added
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR renames the categories of llama_stack loggers.
It aligns logging categories with the package names, and incorporates
reviews from the initial
https://github.com/meta-llama/llama-stack/pull/2868. This is a follow-up
to #3061.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Replaces https://github.com/meta-llama/llama-stack/pull/2868
Part of https://github.com/meta-llama/llama-stack/issues/2865
cc @leseb @rhuss
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
Currently the embedding integration test cases fail due to a
misalignment in the error type. This PR fixes the embedding integration
test by fixing the error type.
## Test Plan
```
pytest -s -v tests/integration/inference/test_embedding.py --stack-config="inference=nvidia" --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2" --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
# What does this PR do?
- Documentation update and fix for the NVIDIA Inference provider.
- Update `run_moderation` for the safety API with a
`NotImplementedError` placeholder; otherwise, initializing the NVIDIA
inference client will raise an error.
## Test Plan
N/A
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR removes `init()` from `LlamaStackAsLibrary`
Currently, `client.initialize()` had to be invoked by the user.
To improve the dev experience and avoid runtime errors, this PR initializes
LlamaStackAsLibrary implicitly upon using the client.
It also prevents multiple initializations of the same client, while
maintaining backward compatibility.
This PR does the following
- Automatic Initialization: Constructor calls initialize_impl()
automatically.
- Client is fully initialized after `__init__` completes.
- Prevents consecutive initialization after the client has been
successfully initialized.
- initialize() method still exists but is now a no-op.
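A minimal sketch of this pattern (not the real client code):
```python
class LlamaStackAsLibraryClient:
    def __init__(self) -> None:
        self._initialized = False
        self._initialize_impl()  # automatic initialization in the constructor

    def _initialize_impl(self) -> None:
        if self._initialized:
            return  # prevent consecutive re-initialization
        # ... resolve providers, build routing tables, etc.
        self._initialized = True

    def initialize(self) -> None:
        # Kept for backward compatibility; now a no-op.
        return None
```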
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
fixes https://github.com/meta-llama/llama-stack/issues/2946
---------
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Adds flexible CORS (Cross-Origin Resource Sharing) configuration support
to the FastAPI
server with both local development and explicit configuration modes:
- **Local development mode**: `cors: true` enables localhost-only access
with regex
pattern `https?://localhost:\d+`
- **Explicit configuration mode**: Specific origins configuration with
credential support
and validation
- Prevents insecure combinations (wildcards with credentials)
- FastAPI CORSMiddleware integration via `model_dump()`
Addresses the need for configurable CORS policies to support web
frontends and
cross-origin API access while maintaining security.
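For illustration, the two modes roughly map onto FastAPI's `CORSMiddleware` like this (a sketch; the actual config schema and wiring in llama-stack may differ):
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

def configure_cors(app: FastAPI, cors_config) -> None:
    if cors_config is True:
        # Local development mode: localhost-only access via a regex.
        app.add_middleware(CORSMiddleware, allow_origin_regex=r"https?://localhost:\d+")
    elif isinstance(cors_config, dict):
        # Explicit configuration mode: specific origins, optionally with credentials.
        app.add_middleware(CORSMiddleware, **cors_config)

app = FastAPI()
configure_cors(app, True)  # or e.g. {"allow_origins": ["https://example.com"], "allow_credentials": True}
```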
Closes #2119
## Test Plan
1. Ran Unit Tests.
2. Manual tests: FastAPI middleware integration with actual HTTP
requests
- Local development mode localhost access validation
- Explicit configuration mode origins validation
- Preflight OPTIONS request handling
Some screenshots of manual tests.
<img width="1920" height="927" alt="image"
src="https://github.com/user-attachments/assets/79322338-40c7-45c9-a9ea-e3e8d8e2f849"
/>
<img width="1911" height="1037" alt="image"
src="https://github.com/user-attachments/assets/1683524e-b0c9-48c9-a0a5-782e949cde01"
/>
cc: @leseb @rhuss @franciscojavierarceo
Bumps [llama-api-client](https://github.com/meta-llama/llama-api-python)
from 0.1.2 to 0.2.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/meta-llama/llama-api-python/releases">llama-api-client's
releases</a>.</em></p>
<blockquote>
<h2>v0.2.0</h2>
<h2>0.2.0 (2025-08-07)</h2>
<p>Full Changelog: <a
href="https://github.com/meta-llama/llama-api-python/compare/v0.1.2...v0.2.0">v0.1.2...v0.2.0</a></p>
<h3>Features</h3>
<ul>
<li>clean up environment call outs (<a
href="4afbd01ed7">4afbd01</a>)</li>
<li><strong>client:</strong> support file upload requests (<a
href="ec42e80b62">ec42e80</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>api:</strong> remove chat completion request model (<a
href="94c4e9fd50">94c4e9f</a>)</li>
<li><strong>client:</strong> don't send Content-Type header on GET
requests (<a
href="efec88aa51">efec88a</a>)</li>
<li><strong>parsing:</strong> correctly handle nested discriminated
unions (<a
href="b6276863be">b627686</a>)</li>
<li><strong>parsing:</strong> ignore empty metadata (<a
href="d6ee85101e">d6ee851</a>)</li>
<li><strong>parsing:</strong> parse extra field types (<a
href="f03ca22860">f03ca22</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>add examples (<a
href="abfa065721">abfa065</a>)</li>
<li><strong>internal:</strong> bump pinned h11 dep (<a
href="d40e1b1d73">d40e1b1</a>)</li>
<li><strong>internal:</strong> fix ruff target version (<a
href="c900ebc528">c900ebc</a>)</li>
<li><strong>package:</strong> mark python 3.13 as supported (<a
href="ef5bc36693">ef5bc36</a>)</li>
<li><strong>project:</strong> add settings file for vscode (<a
href="e3103801d6">e310380</a>)</li>
<li><strong>readme:</strong> fix version rendering on pypi (<a
href="786f9fbdb7">786f9fb</a>)</li>
<li>sync repo (<a
href="7e697f6550">7e697f6</a>)</li>
<li>update SDK settings (<a
href="de22c0ece7">de22c0e</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>code of conduct (<a
href="efe1af28fb">efe1af2</a>)</li>
<li>readme and license (<a
href="d53eafd104">d53eafd</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/meta-llama/llama-api-python/blob/main/CHANGELOG.md">llama-api-client's
changelog</a>.</em></p>
<blockquote>
<h2>0.2.0 (2025-08-07)</h2>
<p>Full Changelog: <a
href="https://github.com/meta-llama/llama-api-python/compare/v0.1.2...v0.2.0">v0.1.2...v0.2.0</a></p>
<h3>Features</h3>
<ul>
<li>clean up environment call outs (<a
href="4afbd01ed7">4afbd01</a>)</li>
<li><strong>client:</strong> support file upload requests (<a
href="ec42e80b62">ec42e80</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>api:</strong> remove chat completion request model (<a
href="94c4e9fd50">94c4e9f</a>)</li>
<li><strong>client:</strong> don't send Content-Type header on GET
requests (<a
href="efec88aa51">efec88a</a>)</li>
<li><strong>parsing:</strong> correctly handle nested discriminated
unions (<a
href="b6276863be">b627686</a>)</li>
<li><strong>parsing:</strong> ignore empty metadata (<a
href="d6ee85101e">d6ee851</a>)</li>
<li><strong>parsing:</strong> parse extra field types (<a
href="f03ca22860">f03ca22</a>)</li>
</ul>
<h3>Chores</h3>
<ul>
<li>add examples (<a
href="abfa065721">abfa065</a>)</li>
<li><strong>internal:</strong> bump pinned h11 dep (<a
href="d40e1b1d73">d40e1b1</a>)</li>
<li><strong>internal:</strong> fix ruff target version (<a
href="c900ebc528">c900ebc</a>)</li>
<li><strong>package:</strong> mark python 3.13 as supported (<a
href="ef5bc36693">ef5bc36</a>)</li>
<li><strong>project:</strong> add settings file for vscode (<a
href="e3103801d6">e310380</a>)</li>
<li><strong>readme:</strong> fix version rendering on pypi (<a
href="786f9fbdb7">786f9fb</a>)</li>
<li>sync repo (<a
href="7e697f6550">7e697f6</a>)</li>
<li>update SDK settings (<a
href="de22c0ece7">de22c0e</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>code of conduct (<a
href="efe1af28fb">efe1af2</a>)</li>
<li>readme and license (<a
href="d53eafd104">d53eafd</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7a8c5838af"><code>7a8c583</code></a>
release: 0.2.0</li>
<li><a
href="4f1a04e5c1"><code>4f1a04e</code></a>
chore(internal): fix ruff target version</li>
<li><a
href="06485e995a"><code>06485e9</code></a>
feat(client): support file upload requests</li>
<li><a
href="131b474ad1"><code>131b474</code></a>
chore(project): add settings file for vscode</li>
<li><a
href="ef4cee6d8b"><code>ef4cee6</code></a>
fix(parsing): parse extra field types</li>
<li><a
href="fcbc699718"><code>fcbc699</code></a>
fix(parsing): ignore empty metadata</li>
<li><a
href="b6656cd0b8"><code>b6656cd</code></a>
fix(api): remove chat completion request model</li>
<li><a
href="0deda5590c"><code>0deda55</code></a>
feat: clean up environment call outs</li>
<li><a
href="ecf91026ac"><code>ecf9102</code></a>
fix(client): don't send Content-Type header on GET requests</li>
<li><a
href="0ac6285cbe"><code>0ac6285</code></a>
chore(readme): fix version rendering on pypi</li>
<li>Additional commits viewable in <a
href="https://github.com/meta-llama/llama-api-python/compare/v0.1.2...v0.2.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@radix-ui/react-collapsible](https://github.com/radix-ui/primitives)
from 1.1.11 to 1.1.12.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[eslint-config-prettier](https://github.com/prettier/eslint-config-prettier)
from 10.1.5 to 10.1.8.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/prettier/eslint-config-prettier/releases">eslint-config-prettier's
releases</a>.</em></p>
<blockquote>
<h2>v10.1.8</h2>
<p>republish latest version</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/prettier/eslint-config-prettier/compare/v10.1.5...v10.1.8">https://github.com/prettier/eslint-config-prettier/compare/v10.1.5...v10.1.8</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/prettier/eslint-config-prettier/blob/main/CHANGELOG.md">eslint-config-prettier's
changelog</a>.</em></p>
<blockquote>
<h1>eslint-config-prettier</h1>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9b0b0a47ec"><code>9b0b0a4</code></a>
fix: release a new latest version</li>
<li>See full diff in <a
href="https://github.com/prettier/eslint-config-prettier/compare/v10.1.5...v10.1.8">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@radix-ui/react-separator](https://github.com/radix-ui/primitives) from
1.1.6 to 1.1.7.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/radix-ui/primitives/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [tailwind-merge](https://github.com/dcastil/tailwind-merge) from
3.3.0 to 3.3.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dcastil/tailwind-merge/releases">tailwind-merge's
releases</a>.</em></p>
<blockquote>
<h2>v3.3.1</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Fix arbitrary value using <code>color-mix()</code> not being
detected as color by <a
href="https://github.com/dcastil"><code>@dcastil</code></a> in <a
href="https://redirect.github.com/dcastil/tailwind-merge/pull/591">dcastil/tailwind-merge#591</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/dcastil/tailwind-merge/compare/v3.3.0...v3.3.1">https://github.com/dcastil/tailwind-merge/compare/v3.3.0...v3.3.1</a></p>
<p>Thanks to <a
href="https://github.com/brandonmcconnell"><code>@brandonmcconnell</code></a>,
<a href="https://github.com/manavm1990"><code>@manavm1990</code></a>,
<a href="https://github.com/langy"><code>@langy</code></a>, <a
href="https://github.com/roboflow"><code>@roboflow</code></a>, <a
href="https://github.com/syntaxfm"><code>@syntaxfm</code></a>, <a
href="https://github.com/getsentry"><code>@getsentry</code></a>, <a
href="https://github.com/codecov"><code>@codecov</code></a>, <a
href="https://github.com/sourcegraph"><code>@sourcegraph</code></a>, a
private sponsor, <a
href="https://github.com/block"><code>@block</code></a> and <a
href="https://github.com/shawt3000"><code>@shawt3000</code></a> for
sponsoring tailwind-merge! ❤️</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="40d8feed6a"><code>40d8fee</code></a>
v3.3.1</li>
<li><a
href="429ea54ac8"><code>429ea54</code></a>
add changelog for v3.3.1</li>
<li><a
href="d3df8775cc"><code>d3df877</code></a>
Merge pull request <a
href="https://redirect.github.com/dcastil/tailwind-merge/issues/591">#591</a>
from dcastil/bugfix/590/fix-arbitrary-value-using-col...</li>
<li><a
href="fdd9cdfa14"><code>fdd9cdf</code></a>
add <code>color-mix()</code> to <code>colorFunctionRegex</code></li>
<li><a
href="d49e03a28c"><code>d49e03a</code></a>
add test case for border colors being merged incorrectly</li>
<li><a
href="47155f0ebe"><code>47155f0</code></a>
Merge pull request <a
href="https://redirect.github.com/dcastil/tailwind-merge/issues/585">#585</a>
from dcastil/renovate/all-minor-patch</li>
<li><a
href="2d29675ab0"><code>2d29675</code></a>
Update all non-major dependencies</li>
<li><a
href="c3d7208367"><code>c3d7208</code></a>
Merge pull request <a
href="https://redirect.github.com/dcastil/tailwind-merge/issues/578">#578</a>
from dcastil/dependabot/npm_and_yarn/dot-github/actio...</li>
<li><a
href="527214bf13"><code>527214b</code></a>
Bump undici from 5.28.5 to 5.29.0 in
/.github/actions/metrics-report</li>
<li>See full diff in <a
href="https://github.com/dcastil/tailwind-merge/compare/v3.3.0...v3.3.1">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [locust](https://github.com/locustio/locust) from 2.38.0 to
2.39.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/releases">locust's
releases</a>.</em></p>
<blockquote>
<h2>2.39.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add MilvusUser and example by <a
href="https://github.com/zhuwenxing"><code>@zhuwenxing</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3168">locustio/locust#3168</a></li>
<li>Add SocketIOUser by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3189">locustio/locust#3189</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/zhuwenxing"><code>@zhuwenxing</code></a> made
their first contribution in <a
href="https://redirect.github.com/locustio/locust/pull/3168">locustio/locust#3168</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.38.1...2.39.0">https://github.com/locustio/locust/compare/2.38.1...2.39.0</a></p>
<h2>2.38.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix test flakyness and update error message by <a
href="https://github.com/amadeuppereira"><code>@amadeuppereira</code></a>
in <a
href="https://redirect.github.com/locustio/locust/pull/3187">locustio/locust#3187</a></li>
<li>FastHttpUser: Dont send zstd in Accept-Encoding header by <a
href="https://github.com/cyberw"><code>@cyberw</code></a> in <a
href="https://redirect.github.com/locustio/locust/pull/3188">locustio/locust#3188</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/locustio/locust/compare/2.38.0...2.38.1">https://github.com/locustio/locust/compare/2.38.0...2.38.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/locustio/locust/blob/master/CHANGELOG.md">locust's
changelog</a>.</em></p>
<blockquote>
<h1>Detailed changelog</h1>
<p>The most important changes can also be found in <a
href="https://docs.locust.io/en/latest/changelog.html">the
documentation</a>.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1810fef1ae"><code>1810fef</code></a>
Tiny doc fixes</li>
<li><a
href="48b4dfce8f"><code>48b4dfc</code></a>
Link SocketIOUser from main docs.</li>
<li><a
href="6e4fd7f067"><code>6e4fd7f</code></a>
Merge pull request <a
href="https://redirect.github.com/locustio/locust/issues/3189">#3189</a>
from locustio/Add-SocketioUser</li>
<li><a
href="95eca45476"><code>95eca45</code></a>
better documentation of on_message</li>
<li><a
href="a56ef663af"><code>a56ef66</code></a>
SocketIOUser docs: Link to example on GH</li>
<li><a
href="adaa71b5f9"><code>adaa71b</code></a>
SocketIOUser, add method docstrings and link to python-socketio's
readthedocs</li>
<li><a
href="9fb3ff0f89"><code>9fb3ff0</code></a>
Add testcase for SocketIOUser</li>
<li><a
href="7047247f9d"><code>7047247</code></a>
SocketIOUser: Fix use of environment object. Remove SocketIOClient.</li>
<li><a
href="f8ddc9c798"><code>f8ddc9c</code></a>
rename socketio echo_server</li>
<li><a
href="ae28acf027"><code>ae28acf</code></a>
add contrib dependencies to docs build</li>
<li>Additional commits viewable in <a
href="https://github.com/locustio/locust/compare/2.38.0...2.39.0">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
This should fix dependabot based on this thread:
https://stackoverflow.com/questions/60201543/dependabot-only-updates-lock-file
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Handles MCP tool calls in a previous response
Closes #3105
## Test Plan
Made call to create response with tool call, then made second call with
the first linked through previous_response_id. Did not get error.
Also added unit test.
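For illustration, a hedged sketch of the scenario being exercised via the OpenAI-compatible Responses API; the base URL, model id, and MCP tool definition below are assumptions, not taken from the actual test:
```python
from openai import OpenAI

# assumed endpoint for a locally running Llama Stack server
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")
tools = [{"type": "mcp", "server_label": "example", "server_url": "http://localhost:8000/mcp"}]

first = client.responses.create(model="llama3.2:3b", input="use a tool", tools=tools)
# Previously, this second call errored when the prior response contained MCP tool calls.
second = client.responses.create(
    model="llama3.2:3b",
    input="continue",
    tools=tools,
    previous_response_id=first.id,
)
```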
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
We noticed that when llama-stack is running for a long time, we would
run into database errors when trying to run messages through the agent
(which we configured to persist against postgres), seemingly due to the
database connections being stale or disconnected. This commit adds
`pool_pre_ping=True` to the SQLAlchemy engine creation to help mitigate
this issue by checking the connection before using it, and
re-establishing it if necessary.
More information in:
https://docs.sqlalchemy.org/en/20/core/pooling.html#dealing-with-disconnects
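As a rough sketch (the connection URL is a placeholder and the real engine setup in the stack differs), the change amounts to:
```python
from sqlalchemy.ext.asyncio import create_async_engine

# pool_pre_ping issues a lightweight "ping" on checkout and transparently
# re-establishes stale connections instead of failing the request.
engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost:5432/llamastack",  # placeholder DSN
    pool_pre_ping=True,
)
```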
We're also open to other suggestions on how to handle this issue, this
PR is just a suggestion.
## Test Plan
We have not tested it yet (we're in the process of doing that) and we're
hoping it's going to resolve our issue.
# What does this PR do?
Fix broken `package-lock.json` not caught by [github bot in this
commit](7f0b2a8764).
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
NVIDIA asymmetric embedding models (e.g.,
`nvidia/llama-3.2-nv-embedqa-1b-v2`) require an `input_type` parameter
not present in the standard OpenAI embeddings API. This PR adds the
`input_type="query"` as default and updates the documentation to suggest
using the `embedding` API for passage embeddings.
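As a hedged illustration (parameter handling per NVIDIA's docs; treat the exact plumbing as an assumption rather than this PR's code), the extra parameter can be forwarded through the OpenAI client like this:
```python
from openai import OpenAI

client = OpenAI(base_url="https://integrate.api.nvidia.com/v1", api_key="YOUR_NVIDIA_API_KEY")
resp = client.embeddings.create(
    model="nvidia/llama-3.2-nv-embedqa-1b-v2",
    input=["What is Llama Stack?"],
    extra_body={"input_type": "query"},  # use "passage" when embedding documents
)
print(len(resp.data[0].embedding))
```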
<!-- If resolving an issue, uncomment and update the line below -->
Resolves #2892
## Test Plan
```
pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2" --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
# What does this PR do?
This PR adds a step in pre-commit to enforce using the `llama_stack` logger.
Currently, various parts of the code base use different loggers. Since a
custom `llama_stack` logger already exists and is used in the codebase, it is
better to standardize on it.
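For reference, a minimal sketch of the usage pattern the pre-commit step is meant to enforce (the exact `get_logger` signature and category names are assumed from the codebase convention):
```python
from llama_stack.log import get_logger

logger = get_logger(name=__name__, category="core")  # category is illustrative

logger.info("using the project-wide llama_stack logger instead of logging.getLogger")
```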
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
Adds npm installation to pre-commit.yml and caches the UI.
Removes node installation during pre-commit.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
I started this PR trying to unbreak a newly broken test
`test_agent_name`. This test was broken all along but did not show up
because during testing we were pulling the "non-updated" llama stack
client. See this comment:
https://github.com/llamastack/llama-stack/pull/3119#discussion_r2270988205
While fixing this, I encountered a large amount of badness in our CI
workflow definitions.
- We weren't passing `LLAMA_STACK_DIR` or `LLAMA_STACK_CLIENT_DIR`
overrides to `llama stack build` at all in some cases.
- Even when we did, we used `uv run` liberally. The first thing `uv run`
does is "sync" the project environment, which undoes any mutations we might
have made ourselves. But we make many mutations to these environments in our
CI runners, the most important being `llama stack build`, where we install
distro dependencies. As a result, when you tried to run the integration tests,
you would see old, strange versions.
## Test Plan
Re-record using:
```
sh scripts/integration-tests.sh --stack-config ci-tests \
--provider ollama --test-pattern test_agent_name --inference-mode record
```
Then re-run with `--inference-mode replay`. However, this test eventually
turned out to be quite flaky for telemetry reasons. I haven't investigated it
for now and sadly just disabled it, since we have a release to push out.
# What does this PR do?
Add CodeScanner implementations
## Test Plan
`SAFETY_MODEL=CodeScanner LLAMA_STACK_CONFIG=starter uv run pytest -v
tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
This PR needs to land after
https://github.com/meta-llama/llama-stack/pull/3098
This OpenAI client release
0843a11164
ends up breaking litellm:
169a17400f/litellm/types/llms/openai.py (L40)
Update the dependency pin accordingly. Also make the imports a bit more
defensive, in case something else during `llama stack build` ends up moving
openai back to a previous version.
## Test Plan
Run pre-release script integration tests.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I noticed that
[build_conda_env.sh](https://github.com/llamastack/llama-stack/blob/main/llama_stack/core/build_conda_env.sh)
somehow still exists in the main branch. We need to remove it to be consistent with
[#2969](https://github.com/llamastack/llama-stack/pull/2969).
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Update triagers to current state
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Commands where the output is important, like `llama stack build
--print-deps-only` (soon to be `llama stack show`), print some log.py
`cprint`s on _every_ execution of the CLI.
For example:
<img width="912" height="331" alt="Screenshot 2025-08-18 at 1 16 30 PM"
src="https://github.com/user-attachments/assets/e5bf18fb-74a1-438c-861a-8a26eea7d014"
/>
The yellow text is likely unnecessary.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Small docs change as requested in
https://github.com/llamastack/llama-stack/pull/3160#pullrequestreview-3125038932
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
See comment here:
https://github.com/llamastack/llama-stack/pull/3162#issuecomment-3192859097
-- TL;DR it is quite complex to invoke the recording workflow correctly
for an end developer writing tests. This script simplifies the work.
No more manual GitHub UI navigation!
## Script Functionality
- Auto-detects your current branch and associated PR
- Finds the right repository context (works from forks!)
- Runs the workflow where it can actually commit back
- Validates prerequisites and provides helpful error messages
## How to Use
First ensure you are on the branch which introduced a new test and want
it recorded. **Make sure you have pushed this branch remotely, easiest
is to create a PR.**
```
# Record tests for current branch
./scripts/github/schedule-record-workflow.sh
# Record specific test subdirectories
./scripts/github/schedule-record-workflow.sh --test-subdirs "agents,inference"
# Record with vision tests enabled
./scripts/github/schedule-record-workflow.sh --run-vision-tests
# Record tests matching a pattern
./scripts/github/schedule-record-workflow.sh --test-pattern "test_streaming"
```
## Test Plan
Ran `./scripts/github/schedule-record-workflow.sh -s inference -k
tool_choice` which started
4820409329
which successfully committed recorded outputs.
# What does this PR do?
Recording tests has become a nightmare. This is the first part of making
that process simpler by making it _less_ automatic. I tried to be too
clever earlier.
It simplifies the record-integration-tests workflow to use workflow
dispatch inputs instead of PR labels. No more opaque stuff. Just go to
the GitHub UI and run the workflow with inputs. I will soon add a helper
script for this also.
Other things to aid re-running just the small set of things you need to
re-record:
- Replaces the `test-types` JSON array parameter with a more intuitive
`test-subdirs` comma-separated list. The whole JSON array crap was only needed
for the matrix strategy.
- Adds a new `test-pattern` parameter to allow filtering tests using
pytest's `-k` option
## Test Plan
Note that this PR is in a fork not the source repository.
- Replay tests on this PR are green
- Manually
[ran](1699856292)
the replay workflow with a test-subdir and test-pattern filter, worked
- Manually
[ran](4819508034)
the **record** workflow with a simple pattern, it has worked and updated
_this_ PR.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Replace chat_completion calls with openai_chat_completion to eliminate
dependency on legacy inference APIs.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3067
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Creates a structured testing documentation section with multiple detailed pages:
- Testing overview explaining the record-replay architecture
- Integration testing guide with practical usage examples
- Record-replay system technical documentation
- Guide for writing effective tests
- Troubleshooting guide for common testing issues
Hopefully this makes things a bit easier.
# What does this PR do?
Updates test recordings.
## Test Plan
Started ollama serving the 3.2:3b model. Then ran the server:
```
LLAMA_STACK_TEST_INFERENCE_MODE=record \
LLAMA_STACK_TEST_RECORDING_DIR=tests/integration/recordings/ \
SQLITE_STORE_DIR=$(mktemp -d) \
OLLAMA_URL=http://localhost:11434 \
llama stack build --template starter --image-type venv --run
```
Then ran the tests which needed recording:
```
pytest -sv tests/integration/agents/test_openai_responses.py \
--stack-config=server:starter \
--text-model ollama/llama3.2:3b-instruct-fp16 -k test_responses_store
```
Then, restarted the server with `LLAMA_STACK_TEST_INFERENCE_MODE=replay`, re-ran the tests and verified they passed.
# What does this PR do?
A _bunch_ on cleanup for the Responses tests.
- Got rid of YAML test cases, moved them to just use simple pydantic models
- Splitting the large monolithic test file into multiple focused test files:
- `test_basic_responses.py` for basic and image response tests
- `test_tool_responses.py` for tool-related tests
- `test_file_search.py` for file search specific tests
- Adding a `StreamingValidator` helper class to standardize streaming response validation
## Test Plan
Run the tests:
```
pytest -s -v tests/integration/non_ci/responses/ \
--stack-config=starter \
--text-model openai/gpt-4o \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
-k "client_with_models"
```
# What does this PR do?
Adds proper streaming events for MCP tool listing (`mcp_list_tools.in_progress` and `mcp_list_tools.completed`). Also refactors things a bit more.
## Test Plan
Verified existing integration tests pass with the refactored code. The test `test_response_streaming_multi_turn_tool_execution` has been updated to check for the new MCP list tools streaming events
# What does this PR do?
Refactors the OpenAI response conversion utilities by moving helper functions from `openai_responses.py` to `utils.py`. Adds unit tests.
# What does this PR do?
Refactors the OpenAI responses implementation by extracting streaming and tool execution logic into separate modules. This improves code organization by:
1. Creating a new `StreamingResponseOrchestrator` class in `streaming.py` to handle the streaming response generation logic
2. Moving tool execution functionality to a dedicated `ToolExecutor` class in `tool_executor.py`
## Test Plan
Existing tests
The OpenAI compatibility layer was incorrectly importing
ChatCompletionMessageToolCallParam instead of the
ChatCompletionMessageFunctionToolCall class. This caused "Cannot
instantiate typing.Union" errors when processing agent requests with
tool calls.
Closes: #3141
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Adds content part streaming events to the OpenAI-compatible Responses API to support more granular streaming of response content. This introduces:
1. New schema types for content parts: `OpenAIResponseContentPart` with variants for text output and refusals
2. New streaming event types:
- `OpenAIResponseObjectStreamResponseContentPartAdded` for when content parts begin
- `OpenAIResponseObjectStreamResponseContentPartDone` for when content parts complete
3. Implementation in the reference provider to emit these events during streaming responses. Also emits MCP arguments just like function call ones.
## Test Plan
Updated existing streaming tests to verify content part events are properly emitted
# What does this PR do?
Enhances tool execution streaming by adding support for real-time progress events during tool calls. This implementation adds streaming events for MCP and web search tools, including in-progress, searching, completed, and failed states.
The refactored `_execute_tool_call` method now returns an async iterator that yields streaming events throughout the tool execution lifecycle.
## Test Plan
Updated the integration test `test_response_streaming_multi_turn_tool_execution` to verify the presence and structure of new streaming events, including:
- Checking for MCP in-progress and completed events
- Verifying that progress events contain required fields (item_id, output_index, sequence_number)
- Ensuring completed events have the necessary sequence_number field
# What does this PR do?
To be compliant with model policies for Llama, just return the categories
as-is from the provider; we will lose OAI compatibility in the moderations
API response.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
`SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest
-v tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to eliminate hardcoded status codes in the server's
responses and replace them with `httpx.codes` for better consistency across
the whole project and improved code readability.
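A minimal sketch of the pattern (the actual call sites in the server differ):
```python
import httpx
from fastapi import Response

# before: Response(content=detail, status_code=404)
def not_found(detail: str) -> Response:
    # httpx.codes is an IntEnum, so NOT_FOUND compares equal to 404
    return Response(content=detail, status_code=httpx.codes.NOT_FOUND)
```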
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run `./scripts/unit-tests.sh`
**Description:**
The standard markdown [!NOTE] format is not supported on Sphinx
generated documentation, replacing those instances. Also updating other
Notes, Tips and Warning blocks throughout the source docs
WIP: Working to update the provider code gen
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this PR is to make the behavior of DELETE API endpoints
consistent with standard RESTful conventions and eliminate confusion for
API consumers.
Old Behavior
```
HTTP Status: 200 OK
Response Body: null
```
Eg. `curl -X DELETE http://localhost:8321/v1/shields/test-shield`
`null% `
`INFO 2025-08-12 16:11:57,932 console_span_processor:65 telemetry:
15:11:57.929 [INFO] ::1:59805 - "DELETE /v1/shields/test-shield
HTTP/1.1" 200 `
Updated Behavior
```
HTTP Status: 204 No Content
Response Body: empty (no body)
```
Eg. `curl -X DELETE http://localhost:8321/v1/shields/test-shield`
`INFO 2025-08-12 16:18:16,645 console_span_processor:62 telemetry:
15:18:16.637 [INFO] ::1:60283 - "DELETE /v1/shields/test-shield
HTTP/1.1" 204 `
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #3090
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run `./scripts/unit-tests.sh`
# What does this PR do?
1. Updates `AgentPersistence.list_sessions()` to properly filter out
`Turn` keys from `Session` keys (a rough sketch of the idea is included below).
2. Adds a suite of unit tests to confirm the `list_sessions()` behavior
and tests the failed sample in
https://github.com/meta-llama/llama-stack/issues/3048
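A rough sketch of the filtering idea (the key layout here is hypothetical, not the actual persistence schema):
```python
def filter_session_keys(keys: list[str]) -> list[str]:
    # hypothetical layout: "session:{agent_id}:{session_id}" for sessions and
    # "session:{agent_id}:{session_id}:{turn_id}" for turns
    return [k for k in keys if k.startswith("session:") and k.count(":") == 2]


assert filter_session_keys(
    ["session:a1:s1", "session:a1:s1:t1", "session:a1:s2"]
) == ["session:a1:s1", "session:a1:s2"]
```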
## Fixes https://github.com/meta-llama/llama-stack/issues/3048
## Test Plan
Unit tests added.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR changes the group name from `github.ref` to
`github.event.pull_request_number`. The reason for this is that `github.ref`
does not act as a unique identifier in the `pull_request_target` event and is
only unique in `pull_request`. The GitHub action was getting canceled because
the group name was not unique in the concurrency section.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #3102
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
To test this I created a fake GitHub action and ran it through act to see
what the `github.ref` variable produced and what alternatives can be used.
This confirmed that `github.ref` was not unique and that
`github.event.pull_request_number` is unique to the PR.
Some fixes to MCP tests. And a bunch of fixes for Vector providers.
I also enabled a bunch of Vector IO tests to be used with
`LlamaStackLibraryClient`
## Test Plan
Run Responses tests with llama stack library client:
```
pytest -s -v tests/integration/non_ci/responses/ --stack-config=server:starter \
--text-model openai/gpt-4o \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
-k "client_with_models"
```
Do the same with `-k openai_client`
The rest should be taken care of by CI.
Well our Responses tests use it so we better include it in the API, no?
I discovered it because I want to make sure `llama-stack-client` can be
used always instead of `openai-python` as the client (we do want to be
_truly_ compatible.)
# What does this PR do?
The minimum Python version for the project was bumped to 3.12 a couple of
months ago, but some artifacts remain in the repo suggesting we still
support >=3.10.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR addresses an issue where `PromptGuardSafetyImpl` was an
incomplete implementation of an abstract class. The class was missing
the required run_moderation method from its parent interface.
Currently, running `pre-commit` locally fails with the error below.
```
llama_stack/providers/inline/safety/prompt_guard/__init__.py:15: error: Cannot instantiate abstract class "PromptGuardSafetyImpl" with abstract attribute "run_moderation" [abstract]
Found 1 error in 1 file (checked 410 source files)
```
This PR fixes the issue as follows
- Added the missing run_moderation method to PromptGuardSafetyImpl (a rough sketch follows below)
- Method raises NotImplementedError with appropriate message indicating
this functionality is not implemented for PromptGuard
- This allows the class to be properly instantiated while clearly
indicating the limitation
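A rough sketch of the added method (the signature is simplified; the real one follows the Safety protocol):
```python
class PromptGuardSafetyImpl:
    # only the new method is shown; the rest of the provider is omitted
    async def run_moderation(self, input: str | list[str], model: str):
        raise NotImplementedError(
            "run_moderation is not implemented for the PromptGuard provider"
        )
```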
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Using commas is much more shell-friendly. A semi-colon is a statement
delimiter and must be escaped.
This change is backwards incompatible but I imagine not many people are
using this. I could be wrong. Looking for feedback.
# What does this PR do?
- Adds documentation on how to contribute a Vector DB provider.
- Updates the testing section to be a little friendlier to navigate.
- Also added new shortcut for search so that `/` and `⌘ K` or `ctrl+K`
trigger search
<img width="1903" height="1346" alt="Screenshot 2025-08-11 at 10 10
12 AM"
src="https://github.com/user-attachments/assets/6995b3b8-a2ab-4200-be72-c5b03a784a29"
/>
<img width="1915" height="1438" alt="Screenshot 2025-08-11 at 10 10
25 AM"
src="https://github.com/user-attachments/assets/1f54d30e-5be1-4f27-b1e9-3c3537dcb8e9"
/>
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
This updates the sidebar to look a little more like other popular ones.
<img width="1913" height="1352" alt="Screenshot 2025-08-08 at 11 25
31 PM"
src="https://github.com/user-attachments/assets/00738412-1101-48ec-8864-cde4a8733ec1"
/>
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
- Add new Vertex AI remote inference provider with litellm integration
- Support for Gemini models through Google Cloud Vertex AI platform
- Uses Google Cloud Application Default Credentials (ADC) for
authentication
- Added VertexAI models: gemini-2.5-flash, gemini-2.5-pro,
gemini-2.0-flash.
- Updated provider registry to include vertexai provider
- Updated starter template to support Vertex AI configuration
- Added comprehensive documentation and sample configuration
<!-- If resolving an issue, uncomment and update the line below -->
relates to https://github.com/meta-llama/llama-stack/issues/2747
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Eran Cohen <eranco@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
Updates READMe to add
1. GitHub badge highlighting Llama Stack as #1 Repo of the Day
2. GitHub Star History (cumulative stars chart)
3. Contributor shout out
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Update Milvus doc on using search modes.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
PR adds Flash-Lite 2.0 and 2.5 models to the Gemini inference provider
Closes #3046
## Test Plan
I was not able to locate any existing test for this provider, so I
performed manual testing. But the change is really trivial and
straightforward.
# What does this PR do?
This PR updates the UI to create new:
1. `/files/{file_id}`
2. `files/{file_id}/contents`
3. `files/{file_id}/contents/{content_id}`
The list of files is clickable, which brings the user to the Files
Detail page.
The File Details page shows all of the content
The content details page shows the individual chunk/content parsed
These only use our existing OpenAI compatible APIs. I have a separate
branch where I expose the embedding and the portal is correctly
populated. I included the FE rendering code for that in this PR.
1. `vector-stores/{vector_store_id}/files/{file_id}`
<img width="1913" height="1351" alt="Screenshot 2025-08-06 at 10 20
12 PM"
src="https://github.com/user-attachments/assets/08010d5e-60c8-4bd9-9f3e-a2731ed1ad55"
/>
2. `vector-stores/{vector_store_id}/files/{file_id}/contents`
<img width="1920" height="1272" alt="Screenshot 2025-08-06 at 10 21
23 PM"
src="https://github.com/user-attachments/assets/3b91e67b-5d64-4fe6-91b6-18f14587e850"
/>
3.
`vector-stores/{vector_store_id}/files/{file_id}/contents/{content_id}`
<img width="1916" height="1273" alt="Screenshot 2025-08-06 at 10 21
45 PM"
src="https://github.com/user-attachments/assets/d38ca996-e8d9-460c-9e39-7ff0cb5ec0dd"
/>
## Test Plan
I tested this locally and reviewed the code. I generated a significant
share of the code with Claude and some manual intervention. After this,
I'll begin adding tests to the UI.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
This PR kills the verifications infrastructure which is no longer used.
It was relocated to the `llama-stack-evals`
(https://github.com/meta-llama/llama-stack-evals) repository previously.
Responses tests used this infrastructure but that wasn't quite
necessary, just a little useful back when @bbrownin introduced the
tests. On Discord, we agreed that tests can be moved to our regular
integrations test infra.
## Test Plan
Some tests currently do fail (although they run!) I will send a
follow-up PR which makes them all pass.
# What does this PR do?
`AgentEventLogger` only supports streaming responses, so I suggest
adding a comment near the bottom of `demo_script.py` letting the user
know this, e.g., if they change the `stream` value to `False` in the
call to `create_turn`, they need to comment out the logging lines.
See https://github.com/llamastack/llama-stack-client-python/issues/15
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Dean Wampler <dean.wampler@ibm.com>
# What does this PR do?
This PR implements hybrid search for Milvus DB based on the inbuilt
milvus support.
To test:
```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s
--tb=long --disable-warnings --asyncio-mode=auto
```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
# What does this PR do?
Adds a blurb to the `CONTRIBUTING.md` encouraging the use of the
standardized custom exception classes for resources where applicable
Relates to #2379
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
This PR adds Open AI Compatible moderations api. Currently only
implementing for llama guard safety provider
Image support, expand to other safety providers and Deprecation of
run_shield will be next steps.
## Test Plan
Added 2 new tests for safe/ unsafe text prompt examples for the new open
ai compatible moderations api usage
`SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest
-v tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
(Had some issue with previous PR
https://github.com/meta-llama/llama-stack/pull/2994 while updating and
accidentally close it , reopened new one )
# What does this PR do?
I found a few issues while adding new metrics for various APIs: currently,
metrics are only propagated in `chat_completion` and `completion`. Since most
providers use the `openai_..` routes as the default in
`llama-stack-client inference chat-completion`, metrics are currently not
working as expected.
in order to get them working the following had to be done:
1. get the completion as usual
2. use new `openai_` versions of the metric gathering functions which
use `.usage` from the `OpenAI..` response types to gather the metrics,
which are already populated
3. define a `stream_generator` which counts the tokens and computes the
metrics (only for stream=True); a rough sketch is included below
4. add metrics to the response
NOTE: I could not add metrics to `openai_completion` where stream=True
because that ONLY returns an `OpenAICompletion` not an AsyncGenerator
that we can manipulate.
Also: acquire the lock and add the event to the span, as the other `_log_...`
methods do.
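A rough sketch of the stream-generator idea (helper and field names are assumptions, not the actual llama-stack code):
```python
from collections.abc import AsyncIterator
from typing import Any


async def stream_with_metrics(stream: AsyncIterator[Any]) -> AsyncIterator[Any]:
    prompt_tokens = completion_tokens = 0
    async for chunk in stream:
        usage = getattr(chunk, "usage", None)
        if usage is not None:  # usually only present on the final chunk
            prompt_tokens = usage.prompt_tokens
            completion_tokens = usage.completion_tokens
        yield chunk
    # in the real implementation these would be emitted as telemetry metrics
    print(f"prompt_tokens={prompt_tokens} completion_tokens={completion_tokens}")
```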
some new output:
`llama-stack-client inference chat-completion --message hi`
<img width="2416" height="425" alt="Screenshot 2025-07-16 at 8 28 20 AM"
src="https://github.com/user-attachments/assets/ccdf1643-a184-4ddd-9641-d426c4d51326"
/>
and in the client:
<img width="763" height="319" alt="Screenshot 2025-07-16 at 8 28 32 AM"
src="https://github.com/user-attachments/assets/6bceb811-5201-47e9-9e16-8130f0d60007"
/>
these were not previously being recorded nor were they being printed to
the server due to the improper console sink handling
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Remove pure venv (without uv) references in docs
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
1. Introduce new base custom exception class `ResourceNotFoundError` (see the
sketch below)
2. All other "not found" exception classes now inherit from
`ResourceNotFoundError`
Closes #3030
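A minimal sketch of the hierarchy, assuming the class names from the description (actual messages and placement in the codebase may differ):
```python
class ResourceNotFoundError(ValueError):
    """Base class for all resource 'not found' errors."""

    def __init__(self, resource_id: str, resource_type: str) -> None:
        super().__init__(f"{resource_type} '{resource_id}' not found")


class ModelNotFoundError(ResourceNotFoundError):
    def __init__(self, model_id: str) -> None:
        super().__init__(model_id, "Model")
```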
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
This PR adds a minimum version `0.7.0` to the project. The diff issue
happens because an `upload-time` field in the `uv.lock` file did not
exist in older uv versions (pre `0.6.15`). This effectively prevents
large diffs in PRs from devs that use older versions of uv.
Closes #2887
---------
Co-authored-by: Charlie Doern <charlie@doern.me>
A bunch of miscellaneous cleanup focusing on tests, but ended up
speeding up starter distro substantially.
- Pulled llama stack client init for tests into `pytest_sessionstart` so
it does not clobber output
- Profiling of that told me where we were doing lots of heavy imports
for starter, so I made them lazy
- starter now starts 20+ seconds faster on my Mac
- A few other smallish refactors for `compat_client`
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Extend the Shields Protocol and implement the capability to unregister
previously registered shields and CLI for shields management.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2581
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
First of, test API for shields
1. Install and start Ollama:
`ollama serve`
2. Pull Llama Guard Model in Ollama:
`ollama pull llama-guard3:8b`
3. Configure env variables:
```
export ENABLE_OLLAMA=ollama
export OLLAMA_URL=http://localhost:11434
```
4. Build Llama Stack distro:
`llama stack build --template starter --image-type venv `
5. Start Llama Stack server:
`llama stack run starter --port 8321`
6. Check if Ollama model is available:
`curl -X GET http://localhost:8321/v1/models | jq '.data[] |
select(.provider_id=="ollama")'`
7. Register a new Shield using Ollama provider:
```
curl -X POST http://localhost:8321/v1/shields \
-H "Content-Type: application/json" \
-d '{
"shield_id": "test-shield",
"provider_id": "llama-guard",
"provider_shield_id": "ollama/llama-guard3:8b",
"params": {}
}'
```
`{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}%
`
8. Check if shield was registered:
`curl -X GET http://localhost:8321/v1/shields/test-shield`
`{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}%
`
9. Run shield:
```
curl -X POST http://localhost:8321/v1/safety/run-shield \
-H "Content-Type: application/json" \
-d '{
"shield_id": "test-shield",
"messages": [
{
"role": "user",
"content": "How can I hack into someone computer?"
}
],
"params": {}
}'
```
`{"violation":{"violation_level":"error","user_message":"I can't answer
that. Can I help with something
else?","metadata":{"violation_type":"S2"}}}% `
10. Unregister shield:
`curl -X DELETE http://localhost:8321/v1/shields/test-shield`
`null% `
11. Verify shield was deleted:
`curl -X GET http://localhost:8321/v1/shields/test-shield`
`{"detail":"Invalid value: Shield 'test-shield' not found"}%`
All tests passed ✅
```
========================================================================== 430 passed, 194 warnings in 19.54s ==========================================================================
/Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:78: RuntimeWarning: coroutine 'close_litellm_async_clients' was never awaited
loop.close()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Wrote HTML report to htmlcov-3.12/index.html
```
# What does this PR do?
1. Creates a new `SessionNotFoundError` class
2. Implements the new class where appropriate
Relates to #2379
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
1. Creates a new `ToolGroupNotFoundError` class
2. Implements the new class where appropriate
Relates to #2379
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Bumps [openai](https://github.com/openai/openai-python) from 1.97.1 to
1.98.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/releases">openai's
releases</a>.</em></p>
<blockquote>
<h2>v1.98.0</h2>
<h2>1.98.0 (2025-07-30)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.97.2...v1.98.0">v1.97.2...v1.98.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> manual updates (<a
href="88a8036c5e">88a8036</a>)</li>
</ul>
<h2>v1.97.2</h2>
<h2>1.97.2 (2025-07-30)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.97.1...v1.97.2">v1.97.1...v1.97.2</a></p>
<h3>Chores</h3>
<ul>
<li><strong>client:</strong> refactor streaming slightly to better
future proof it (<a
href="71c0c74713">71c0c74</a>)</li>
<li><strong>project:</strong> add settings file for vscode (<a
href="29c22c90fd">29c22c9</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/openai/openai-python/blob/main/CHANGELOG.md">openai's
changelog</a>.</em></p>
<blockquote>
<h2>1.98.0 (2025-07-30)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.97.2...v1.98.0">v1.97.2...v1.98.0</a></p>
<h3>Features</h3>
<ul>
<li><strong>api:</strong> manual updates (<a
href="88a8036c5e">88a8036</a>)</li>
</ul>
<h2>1.97.2 (2025-07-30)</h2>
<p>Full Changelog: <a
href="https://github.com/openai/openai-python/compare/v1.97.1...v1.97.2">v1.97.1...v1.97.2</a></p>
<h3>Chores</h3>
<ul>
<li><strong>client:</strong> refactor streaming slightly to better
future proof it (<a
href="71c0c74713">71c0c74</a>)</li>
<li><strong>project:</strong> add settings file for vscode (<a
href="29c22c90fd">29c22c9</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a3315d9fcc"><code>a3315d9</code></a>
release: 1.98.0 (<a
href="https://redirect.github.com/openai/openai-python/issues/2503">#2503</a>)</li>
<li><a
href="48188cc8d5"><code>48188cc</code></a>
release: 1.97.2 (<a
href="https://redirect.github.com/openai/openai-python/issues/2494">#2494</a>)</li>
<li>See full diff in <a
href="https://github.com/openai/openai-python/compare/v1.97.1...v1.98.0">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
As the title says. Distributions is in, Templates is out.
`llama stack build --template` --> `llama stack build --distro`. For
backward compatibility, the previous option is kept but results in a
warning.
Updated `server.py` to remove the "config_or_template" backward
compatibility since it has been a couple releases since that change.
# What does this PR do?
Implement vector store search test
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
# What does this PR do?
Remove score_threshold based check from `OpenAIVectorStoreMixin`
Closes: https://github.com/meta-llama/llama-stack/issues/3018
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR is responsible for removal of Conda support in Llama Stack
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2539
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Closes #2995
update SambaNovaInferenceAdapter to efficiently use LiteLLMOpenAIMixin
## Test Plan
```
$ uv run pytest -s -v tests/integration/inference --stack-config inference=sambanova --text-model sambanova/Meta-Llama-3.1-8B-Instruct
...
======================== 10 passed, 84 skipped, 3 xfailed, 51 warnings in 8.14s ========================
```
# What does this PR do?
Update README for supported DBs
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Adds support to Vector store Open AI APIs in Qdrant.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #2463
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
This should be more robust, as sometimes it's run without running build
first.
## Test Plan
OLLAMA_URL=http://localhost:11434 LLAMA_STACK_TEST_INFERENCE_MODE=replay
LLAMA_STACK_TEST_RECORDING_DIR=tests/integration/recordings
LLAMA_STACK_CONFIG=server:starter uv run --with pytest-repeat pytest
tests/integration/telemetry
--text-model="ollama/llama3.2:3b-instruct-fp16" -vvs
# What does this PR do?
This PR (1) enables the files API for Weaviate and (2) enables
integration tests for Weaviate, which adds a docker container to the
github action.
This PR also handles a couple of edge cases in creating the collection and
ensures the tests all pass.
## Test Plan
CI enabled
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
We are going to split record and replay workflows completely to simplify
the concurrency key design.
We can add vision tests by just adding to our matrix.
# What does this PR do?
Improve user experience by providing specific guidance when no API key
is available, showing both provider data header and config options with
the correct field name for each provider.
Also adds comprehensive test coverage for API key resolution scenarios.
addresses #2990 for providers using litellm openai mixin
## Test Plan
`./scripts/unit-tests.sh
tests/unit/providers/inference/test_litellm_openai_mixin.py`
This PR significantly refactors the Integration Tests workflow. The main
goal behind the PR was to enable recording of vision tests which were
never run as part of our CI ever before. During debugging, I ended up
making several other changes refactoring and hopefully increasing the
robustness of the workflow.
After doing the experiments, I have updated the trigger event to be
`pull_request_target` so this workflow can get write permissions by
default but it will run with source code from the base (main) branch in
the source repository only. If you do change the workflow, you'd need to
experiment using the `workflow_dispatch` triggers. This should not be
news to anyone using Github Actions (except me!)
It is likely to be a little rocky though while I learn more about GitHub
Actions, etc. Please be patient :)
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
# What does this PR do?
I realized that when a new PR is opened, the integration tests aren't
triggering (or aren't always?) since the replay logic was introduced. This
amends the concurrency logic a bit to trigger on opened PRs.
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
get_vector_db() will raise an exception if a vector store can't be returned,
so the client-side handling is redundant.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
It looks like the coverage badge is still present in the README. This PR
removes it.
For more context: https://github.com/meta-llama/llama-stack/pull/2950
**Description**
This PR adjusts the external providers documentation to align with the
new providers format. Splits up sections into the existing external
providers and how to create them as well.
<img width="1049" height="478" alt="Screenshot 2025-07-31 at 9 48 26 AM"
src="https://github.com/user-attachments/assets/f13599cb-2fd1-4e57-8ca9-27b067264e33"
/>
Open to feedback and adjusting titles
# What does this PR do?
This PR adds support for Direct Preference Optimization (DPO) training
via the existing HuggingFace inline provider. It introduces a new DPO
training recipe, config schema updates, dataset integration, and
end-to-end testing to support preference-based fine-tuning with TRL.
## Test Plan
Added integration test:
tests/integration/post_training/test_post_training.py::TestPostTraining::test_preference_optimize
Ran tests on both CPU and CUDA environments
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-83.ec2.internal>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
I've been tinkering a little with a simple chat playground in the UI, so
I'm opening the PR with what's kind of a WIP.
If you look at the first commit, that includes the big part of the
changes. The rest of the files changed come from installing the
`shadcn` components.
Note this is missing a lot; e.g.,
- sessions
- document upload
- audio (the shadcn components install these by default from
https://shadcn-chatbot-kit.vercel.app/docs/components/chat)
I still need to wire up a lot more to make it actually fully functional
but it does basic chat using the LS Typescript Client.
Basic demo:
<img width="1329" height="1430" alt="Image"
src="https://github.com/user-attachments/assets/917a2096-36d4-4925-b83b-f1f2cda98698"
/>
<img width="1319" height="1424" alt="Image"
src="https://github.com/user-attachments/assets/fab1583b-1c72-4bf3-baf2-405aee13c6bb"
/>
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
This PR focuses on improving the developer experience by adding
comprehensive docstrings to the API data models across the Llama Stack.
These docstrings provide detailed explanations for each model and its
fields, making the API easier to understand and use.
**Key changes:**
- **Added Docstrings:** Added reST formatted docstrings to Pydantic
models in the `llama_stack/apis/` directory. This includes models for:
- Agents (`agents.py`)
- Benchmarks (`benchmarks.py`)
- Datasets (`datasets.py`)
- Inference (`inference.py`)
- And many other API modules.
- **OpenAPI Spec Update:** Regenerated the OpenAPI specification
(`docs/_static/llama-stack-spec.yaml` and
`docs/_static/llama-stack-spec.html`) to include the new docstrings.
This will be reflected in the API documentation, providing richer
information to users.
**Impact:**
- Developers using the Llama Stack API will have a better understanding
of the data structures.
- The auto-generated API documentation is now more informative.
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
1. Creates a new `VectorStoreNotFoundError` class
2. Implements the new class where appropriate
Relates to #2379
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
1. Adds a broad schema for custom exception classes in the Llama Stack
project
2. Creates a new `DatasetNotFoundError` class
3. Implements the new class where appropriate
Relates to #2379
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR fixes the following error in unit test that was running on up to
date main branch:
```
FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_recording_mode - ModuleNotFoundError: No module named 'ollama'
FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_replay_mode - ModuleNotFoundError: No module named 'ollama'
FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_replay_missing_recording - ModuleNotFoundError: No module named 'ollama'
FAILED tests/unit/distribution/test_inference_recordings.py::TestInferenceRecording::test_embeddings_recording - ModuleNotFoundError: No module named 'ollama'
=============================== 4 failed, 499 passed, 198 warnings in 34.50s ================================
```
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run `./scripts/unit-tests.sh`
# What does this PR do?
1. Creates a new `ModelNotFoundError` class
2. Implements the new class where appropriate
Relates to #2379
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
We want to avoid re-triggering the workflow when random other labels are
added (e.g., `meta-cla`, etc.). There is also no point in restarting the
workflow when someone _unlabels_.
**Description**
This PR removes some of the warnings when uv builds the docs
- Errors appear when generating docs about .md files not appearing in the
toctree. ~~Adding content to the `providers-gen.py` file that adds `---
orphan: true ---` to each file.~~ Added a toctree generator to the
`providers-gen.py` file; this gets rid of the errors in the builds.
- Deletes the `_openai_compat` files, an extension of PR #2849
- Adds the `files` APIs section to the `providers` toctree on the index
page
- Manually adds the `--- orphan: true ---` to the advanced APIs. I'll try
to find a way to modify the providers code gen so it automatically adds
it, but this fixes the errors.
- Adds the `testing.md` to the `contributing` toctree
- Adds `starting_llama_stack_server.md` to the `distributions` toctree
There are some other warnings I'm still looking at, but this PR gets rid
of most of the toctree errors.
There's also an issue with the actual distribution-codegen that I can
investigate in another PR. Opened a bug for it here: #2873
We tried to always keep Ollama enabled. However doing so makes the
provider implementation half-assed -- should it error when it cannot
connect to Ollama or not? What happens during periodic model refresh?
Etc. Instead do the same thing we do for vLLM -- use the `OLLAMA_URL` to
conditionally enable the provider.
## Test Plan
Run `uv run llama stack build --template starter --image-type venv
--run` with and without `OLLAMA_URL` set. Verify using
`llama-stack-client provider list` that ollama is correctly enabled.
# What does this PR do?
- Initialize route_impls to None in constructor to prevent
AttributeError
- Consolidate initialization checks to single point in request() method
- Improve error message to be more helpful ("Please call initialize()
first")
- Add comprehensive test suite to prevent regressions
The library client now has better error handling when users forget to
call initialize(), showing a clear ValueError instead of confusing
AttributeError. All initialization validation is now centralized in the
request() method, with internal methods (_call_non_streaming,
_call_streaming, _convert_body) relying on this single check for
cleaner, more maintainable code.
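For illustration, a minimal sketch of the centralized check (class and attribute names mirror the description above but are otherwise simplified, not the actual implementation):
```python
class LlamaStackAsLibraryClient:
    def __init__(self, config_path: str):
        self.config_path = config_path
        # Initialized to None so attribute access never raises AttributeError.
        self.route_impls = None

    async def initialize(self) -> None:
        # ... resolve the stack config and build the route table ...
        self.route_impls = {"/v1/models": object()}  # placeholder

    async def request(self, method: str, path: str, body: dict | None = None):
        # Single validation point; _call_non_streaming / _call_streaming and
        # _convert_body all rely on this one check.
        if self.route_impls is None:
            raise ValueError("Client not initialized. Please call initialize() first.")
        # ... dispatch to the appropriate route implementation ...
```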
Closes #2943
## Test Plan
`./scripts/unit-tests.sh`
A couple of important updates:
- When recording tests, we cannot be generating a matrix because all the
independent recordings will conflict.
- In fact, we just don't need a matrix on test types any more because the
tests are very fast and the overhead of `llama stack build`, setting up
`uv`, etc. is much larger.
- Refactored the running of tests into an independent action
This PR makes setting up Ollama optional for CI. By default, we use
`replay` mode for inference requests and use the stored results from the
`tests/integration/recordings/` directory.
Every so often, users will update tests which will need us to re-record.
To do this, we check for the existence of a label `re-record-tests` on
the PR. If detected,
- ollama is spun up
- inference mode is set to record
- after the tests are done, if any new changes are detected, they are
pushed back to the PR
## Test Plan
This is GitHub CI. Gotta test it live.
Continuing with https://github.com/meta-llama/llama-stack/pull/2952
This also includes a "fix" to inference store related tests so that we
pull a large number of inference responses from the DB so as to always
find the one we just wrote.
Post training tests need _much_ better thinking before we can re-enable
them to be run on every single PR. Running periodically should be
approached only when it is shown that the tests are reliable and as
light-weight as can be; otherwise, it is just kicking the can down the
road.
Continue to build on top of
https://github.com/meta-llama/llama-stack/pull/2941
## Test Plan
Run server with `LLAMA_STACK_TEST_INFERENCE_MODE=record` and then run
the integration tests with `--stack-config=server:starter`. Then restart
the server with `LLAMA_STACK_TEST_INFERENCE_MODE=replay` and re-run the
tests. Verify that no request hit Ollama at any point.
# What does this PR do?
When --image-name is not provided, the build script defaults to the
image_name in the config; this makes sure the same is done for the run
script.
## Test Plan
llama stack build w/o --image-name
At the moment, the code coverage action has just been failing. It's
misleading when interpreting the status badge on the main branch.
https://github.com/meta-llama/llama-stack/actions/workflows/coverage-badge.yml
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Implements a comprehensive recording and replay system for inference API
calls that eliminates dependency on online inference providers during
testing. The system treats inference as deterministic by recording real
API responses and replaying them in subsequent test runs. Applies to
OpenAI clients (which should cover many inference requests) as well as
Ollama AsyncClient.
For storing, we use a hybrid system: Sqlite for fast lookups and JSON
files for easy greppability / debuggability.
As expected, tests become much, much faster (more than 3x in inference
testing alone).
```bash
LLAMA_STACK_TEST_INFERENCE_MODE=record LLAMA_STACK_TEST_RECORDING_DIR=<...> \
uv run pytest -s -v tests/integration/inference \
--stack-config=starter \
-k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \
--text-model="ollama/llama3.2:3b-instruct-fp16" \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2
```
```bash
LLAMA_STACK_TEST_INFERENCE_MODE=replay LLAMA_STACK_TEST_RECORDING_DIR=<...> \
uv run pytest -s -v tests/integration/inference \
--stack-config=starter \
-k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \
--text-model="ollama/llama3.2:3b-instruct-fp16" \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2
```
- `LLAMA_STACK_TEST_INFERENCE_MODE`: `live` (default), `record`, or
`replay`
- `LLAMA_STACK_TEST_RECORDING_DIR`: Storage location (must be specified
for record or replay modes)
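To make the mechanics concrete, here is a simplified sketch of a record/replay store keyed by a request hash (the class name, file layout, and schema below are hypothetical; the real implementation wraps the OpenAI and Ollama client calls directly):
```python
import hashlib
import json
import os
import sqlite3


class InferenceRecorder:
    """Hybrid store: SQLite index for fast lookups, JSON files for greppability."""

    def __init__(self, recording_dir: str):
        self.dir = recording_dir
        os.makedirs(recording_dir, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(recording_dir, "index.sqlite"))
        self.db.execute("CREATE TABLE IF NOT EXISTS recordings (key TEXT PRIMARY KEY, path TEXT)")

    def _key(self, endpoint: str, request: dict) -> str:
        # Treat inference as deterministic: identical requests map to one recording.
        blob = endpoint + json.dumps(request, sort_keys=True)
        return hashlib.sha256(blob.encode()).hexdigest()

    def record(self, endpoint: str, request: dict, response: dict) -> None:
        key = self._key(endpoint, request)
        path = os.path.join(self.dir, f"{key}.json")
        with open(path, "w") as f:
            json.dump({"request": request, "response": response}, f, indent=2)
        self.db.execute("INSERT OR REPLACE INTO recordings VALUES (?, ?)", (key, path))
        self.db.commit()

    def replay(self, endpoint: str, request: dict):
        row = self.db.execute(
            "SELECT path FROM recordings WHERE key = ?", (self._key(endpoint, request),)
        ).fetchone()
        if row is None:
            return None  # caller can raise or fall back to live mode
        with open(row[0]) as f:
            return json.load(f)["response"]
```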
# What does this PR do?
- Change max_seq_length to max_length in SFTConfig constructor
- TRL deprecated max_seq_length in Feb 2024 and removed it in v0.20.0
- Reference: https://github.com/huggingface/trl/pull/2895
This resolves the SFT training failure in CI tests
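A minimal before/after sketch of the rename (assuming a recent TRL release; the other arguments are illustrative):
```python
from trl import SFTConfig

# Before (removed in TRL v0.20.0):
#   config = SFTConfig(max_seq_length=2048, output_dir="./sft-output")

# After: the same limit is expressed via max_length.
config = SFTConfig(max_length=2048, output_dir="./sft-output")
```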
# What does this PR do?
OpenAI Chat Completions supports passing a base64 encoded PDF file to a
model, but Llama Stack currently does not allow for this behavior. This
PR extends our implementation of the OpenAI API spec to change that.
Closes #2129
## Test Plan
A new functional test has been added to test the validity of such a
request
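For reference, a hedged sketch of what such a request could look like against the OpenAI-compatible endpoint (the `file` content-part shape follows OpenAI's published format; the exact fields Llama Stack accepts may differ, and the model name and base URL are placeholders):
```python
import base64

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="ollama/llama3.2:3b-instruct-fp16",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document."},
                {
                    "type": "file",
                    "file": {
                        "filename": "report.pdf",
                        "file_data": f"data:application/pdf;base64,{pdf_b64}",
                    },
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```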
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Updates provider template from outdated `ollama` to `starter`
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes: #2839
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
We don't need this. We have kept it since existing wisdom is that "it
helps with back-compat". Well, the entire ecosystem is moving to `uv` at
an unprecedented rate and keeping this creates unnecessary work and
confusion. The specific reason I am killing this is that it confuses
`dependabot`, which ends up not bumping `uv.lock`, the more important
file to change.
**What:**
- Added OpenAIChatCompletionTextOnlyMessageContent type for text-only
content validation
- Modified OpenAISystemMessageParam, OpenAIAssistantMessageParam,
OpenAIDeveloperMessageParam, and OpenAIToolMessageParam to use text-only
content type instead of mixed content
- OpenAIUserMessageParam unchanged - still accepts both text and images
- Updated OpenAPI spec files to reflect text-only content restrictions
in schemas
Closes #2894
**Why:**
- Enforces OpenAI API compatibility by restricting image content to user
messages only
- Prevents API misuse where images might be sent in message types that
don't support them
- Aligns with OpenAI's actual API behavior where only user messages can
contain multimodal content
- Improves type safety and validation at the API boundary
**Test plan:**
- Added comprehensive parametrized tests covering all 5 OpenAI message
types
- Tests verify text string acceptance for all message types
- Tests verify text list acceptance for all message types
- Tests verify image rejection for system/assistant/developer/tool
messages (ValidationError expected)
- Tests verify user messages still accept images (backward compatibility
maintained)
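A rough sketch of the idea with Pydantic (the real models live in `llama_stack/apis/`; the shapes below are simplified stand-ins):
```python
from pydantic import BaseModel, ValidationError


class TextContentItem(BaseModel):
    type: str = "text"
    text: str


# Text-only content: a plain string or a list of text items.
OpenAIChatCompletionTextOnlyMessageContent = str | list[TextContentItem]


class OpenAISystemMessageParam(BaseModel):
    role: str = "system"
    content: OpenAIChatCompletionTextOnlyMessageContent


# A system message carrying an image part now fails validation.
try:
    OpenAISystemMessageParam(
        content=[{"type": "image_url", "image_url": {"url": "http://example.com/x.png"}}]
    )
except ValidationError as exc:
    print("rejected:", exc.error_count(), "error(s)")
```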
# What does this PR do?
- Add base_url field to OpenAIConfig with default
"https://api.openai.com/v1"
- Update sample_run_config to support OPENAI_BASE_URL environment
variable
- Modify get_base_url() to return configured base_url instead of
hardcoded value
- Add comprehensive test suite covering:
- Default base URL behavior
- Custom base URL from config
- Environment variable override
- Config precedence over environment variables
- Client initialization with configured URL
- Model availability checks using configured URL
This enables users to configure custom OpenAI-compatible API endpoints
via environment variables or configuration files.
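Roughly, the config shape and override behavior described above (field names and defaults are assumptions for illustration):
```python
import os

from pydantic import BaseModel, Field


class OpenAIConfig(BaseModel):
    api_key: str | None = None
    base_url: str = Field(
        default="https://api.openai.com/v1",
        description="Base URL of the OpenAI-compatible endpoint to call.",
    )


# Environment-variable override, e.g. to point at an OpenAI-compatible proxy.
config = OpenAIConfig(base_url=os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"))
print(config.base_url)
```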
Closes #2910
## Test Plan
run unit tests
# What does this PR do?
External provider docs mention setting provider_id in the build YAML.
Since we changed that to just provider_type and module, remove instances
of provider_id.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
provider_id is no longer valid in a build.yaml; remove it from the
external provider test.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
This enhancement allows inference providers using LiteLLMOpenAIMixin to
validate model availability against LiteLLM's official provider model
listings, improving reliability and user experience when working with
different AI service providers.
- Add litellm_provider_name parameter to LiteLLMOpenAIMixin constructor
- Add check_model_availability method to LiteLLMOpenAIMixin using
litellm.models_by_provider
- Update Gemini, Groq, and SambaNova inference adapters to pass
litellm_provider_name
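A minimal sketch of the check, assuming `litellm.models_by_provider` maps a provider name to the model identifiers LiteLLM knows about (the mixin body here is illustrative only):
```python
import litellm


class LiteLLMOpenAIMixin:
    def __init__(self, litellm_provider_name: str):
        self.litellm_provider_name = litellm_provider_name

    async def check_model_availability(self, model: str) -> bool:
        # e.g. litellm_provider_name is "gemini", "groq", or "sambanova"
        known_models = litellm.models_by_provider.get(self.litellm_provider_name, [])
        return model in known_models
```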
## Test Plan
standard CI.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Added `set -e` to the beginning of the unit test script to ensure the
script exits on failure and correctly fails the CI when tests do not
pass.
- Fixed all unit tests that were silently failing in the CI.
- Fixed Python 3.13 unit test CI failing silently.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2877
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
- **Previously:** Unit tests passed in CI even though 11 tests failed
->
[CI-run](4683681501 (step):4:2097)
- **Made the fix. Now, ensuring CI fails as expected on test failures:**
Unit tests failing in CI with 1 failed test ->
[CI-run](4684234247 (step):4:1506)
- This PR shows the CI passing and all unit tests passing.
# What does this PR do?
The server logs have a persistent `core: refreshing registry` message
that clogs up the output. Switch it to debug level.
this is what it looked like:
<img width="1126" height="1028" alt="Screenshot 2025-07-28 at 9 56
44 AM"
src="https://github.com/user-attachments/assets/a1880fd3-7fc7-4a97-bfb8-89a62e4c5c19"
/>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Currently the external provider tests don't upload log files as
artifacts, nor do they use LLAMA_STACK_LOG_FILE. Align them with the
other integration tests.
## Test Plan
logs should be present in the two tests on this PR
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
In #2637, I combined the run and build config provider types to both use
`Provider`.
Since `Provider` includes a provider_id, a user must now specify one when
writing a build YAML. This is not very clear, because all a user should
care about at build time is the code to be installed (the module and the
provider_type).
Introduce `BuildProvider` and fix up the parts of the code impacted by
this.
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Our CI is entirely undocumented; this commit adds a README.md file with
a table of the current CI workflows and what each does.
---------
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Add support for deleting individual chunks from vector stores
- Add abstract remove_chunk() method to EmbeddingIndex base class
- Implement chunk deletion for Faiss provider, SQLite Vec, Milvus,
PGVector
- Placeholder implementations with NotImplementedError for
Chroma/Qdrant/Weaviate
- Integrate chunk deletion into OpenAI vector store file deletion flow
- removed xfail from
test_openai_vector_store_delete_file_removes_from_vector_store
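A sketch of the interface added above (the method signature and placeholder behavior are assumptions, not the exact API):
```python
from abc import ABC, abstractmethod


class EmbeddingIndex(ABC):
    @abstractmethod
    async def remove_chunk(self, chunk_id: str) -> None:
        """Delete a single chunk from the underlying index."""


class ChromaIndex(EmbeddingIndex):
    async def remove_chunk(self, chunk_id: str) -> None:
        # Placeholder until per-chunk deletion lands for Chroma.
        raise NotImplementedError("Chunk deletion is not yet supported for Chroma")
```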
Closes: #2477
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
Enable Chroma inline unit tests and fix integration tests.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Avoid the error message:
```
INFO 2025-07-24 21:51:54,530 __main__:598 server: Received interrupt signal, shutting down gracefully...
ERROR 2025-07-24 21:51:54,692 asyncio:1826 uncategorized: Task was destroyed but it is pending!
task: <Task pending name='Task-15' coro=<refresh_registry() running at
/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/stack.py:356> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=>
```
# What does this PR do?
Today, external providers are installed via the `external_providers_dir`
in the config. This necessitates users to understand the `ProviderSpec`
and set up their directories accordingly. This process splits up the
config for the stack across multiple files, directories, and formats.
Most (if not all) external providers today have a
[get_provider_spec](559cb18fbb/src/ramalama_stack/provider.py (L9))
method that sits unused. Utilizing this method rather than the
providers.d route allows for a much easier installation process for
external providers and limits the amount of extra configuration a
regular user has to do to get their stack off the ground.
To accomplish this and wire it throughout the build process, introduce
the concept of a `module` that users can specify for an external provider
at build time. To facilitate this, align the build and run specs to use
the `Provider` class rather than the stringified provider_type that build
currently uses.
For example, say this is in your build config:
```
- provider_id: ramalama
provider_type: remote::ramalama
module: ramalama_stack
```
during build (in the various `build_...` scripts), additionally to
installing any pip dependencies we will also install this module and use
the `get_provider_spec` method to retrieve the ProviderSpec that is
currently specified using `providers.d`.
In production so far, providing instructions for installing external
providers for users has been difficult: they need to install the module
as a pre-req, create the providers.d directory, copy in the provider
spec, and also copy in the necessary build/run yaml files. Accessing an
external provider should be as easy as possible, and pointing to its
installable module aligns more with the rest of our build and dependency
management process.
For now, `external_providers_dir` still exists as an alternate more
declarative method of using external providers.
## Test Plan
added an integration test installing an external provider from module
and more unit test coverage for `get_provider_registry`
(The warning in yellow is expected; the module is installed inside the
build env, not where we are running the command.)
<img width="1119" height="400" alt="Screenshot 2025-07-24 at 11 30
48 AM"
src="https://github.com/user-attachments/assets/1efbaf45-b9e8-451a-bd63-264ed664706d"
/>
<img width="1154" height="618" alt="Screenshot 2025-07-24 at 11 31
14 AM"
src="https://github.com/user-attachments/assets/feb2b3ea-c5dd-418e-9662-9a3bd5dd6bdc"
/>
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.4.1 to 6.4.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v6.4.3 🌈 fix relative paths starting with dots</h2>
<h2>🐛 Bug fixes</h2>
<ul>
<li>fix relative paths starting with dots <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/500">#500</a>)</li>
</ul>
<h2>v6.4.2 🌈 Interpret relative inputs as under working-directory</h2>
<h2>Changes</h2>
<p>This release will interpret relative paths in inputs as relative
to the value of <code>working-directory</code> (default is <code>${{
github.workspace }}</code>) .
This means the following configuration</p>
<pre lang="yaml"><code>- uses: astral-sh/setup-uv@v6
with:
working-directory: /my/path
cache-dependency-glob: uv.lock
</code></pre>
<p>will look for the <code>cache-dependency-glob</code> under
<code>/my/path/uv.lock</code></p>
<h2>🐛 Bug fixes</h2>
<ul>
<li>interpret relative inputs as under working-directory <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/498">#498</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>chore: update known versions for 0.8.1/0.8.2 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/497">#497</a>)</li>
<li>chore: update known versions for 0.8.0 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/491">#491</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e92bafb625"><code>e92bafb</code></a>
fix relative paths starting with dots (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/500">#500</a>)</li>
<li><a
href="2c7142f755"><code>2c7142f</code></a>
interpret relative inputs as under working-directory (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/498">#498</a>)</li>
<li><a
href="23482a31a8"><code>23482a3</code></a>
chore: update known versions for 0.8.1/0.8.2 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/497">#497</a>)</li>
<li><a
href="4ac06a054e"><code>4ac06a0</code></a>
chore: update known versions for 0.8.0 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/491">#491</a>)</li>
<li>See full diff in <a
href="7edac99f96...e92bafb625">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [form-data](https://github.com/form-data/form-data) from 4.0.2 to
4.0.4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/form-data/form-data/releases">form-data's
releases</a>.</em></p>
<blockquote>
<h2>v4.0.4</h2>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.3...v4.0.4">v4.0.4</a>
- 2025-07-16</h2>
<h3>Commits</h3>
<ul>
<li>[meta] add <code>auto-changelog</code> <a
href="811f68282f"><code>811f682</code></a></li>
<li>[Tests] handle predict-v8-randomness failures in node < 17 and
node > 23 <a
href="1d11a76434"><code>1d11a76</code></a></li>
<li>[Fix] Switch to using <code>crypto</code> random for boundary values
<a
href="3d1723080e"><code>3d17230</code></a></li>
<li>[Tests] fix linting errors <a
href="5e340800b5"><code>5e34080</code></a></li>
<li>[meta] actually ensure the readme backup isn’t published <a
href="316c82ba93"><code>316c82b</code></a></li>
<li>[Dev Deps] update <code>@ljharb/eslint-config</code> <a
href="58c25d7640"><code>58c25d7</code></a></li>
<li>[meta] fix readme capitalization <a
href="2300ca1959"><code>2300ca1</code></a></li>
</ul>
<h2>v4.0.3</h2>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.2...v4.0.3">v4.0.3</a>
- 2025-06-05</h2>
<h3>Fixed</h3>
<ul>
<li>[Fix] <code>append</code>: avoid a crash on nullish values <a
href="https://redirect.github.com/form-data/form-data/issues/577"><code>[#577](https://github.com/form-data/form-data/issues/577)</code></a></li>
</ul>
<h3>Commits</h3>
<ul>
<li>[eslint] use a shared config <a
href="426ba9ac44"><code>426ba9a</code></a></li>
<li>[eslint] fix some spacing issues <a
href="20941917f0"><code>2094191</code></a></li>
<li>[Refactor] use <code>hasown</code> <a
href="81ab41b46f"><code>81ab41b</code></a></li>
<li>[Fix] validate boundary type in <code>setBoundary()</code> method <a
href="8d8e469309"><code>8d8e469</code></a></li>
<li>[Tests] add tests to check the behavior of <code>getBoundary</code>
with non-strings <a
href="837b8a1f75"><code>837b8a1</code></a></li>
<li>[Dev Deps] remove unused deps <a
href="870e4e6659"><code>870e4e6</code></a></li>
<li>[meta] remove local commit hooks <a
href="e6e83ccb54"><code>e6e83cc</code></a></li>
<li>[Dev Deps] update <code>eslint</code> <a
href="4066fd6f65"><code>4066fd6</code></a></li>
<li>[meta] fix scripts to use prepublishOnly <a
href="c4bbb13c0e"><code>c4bbb13</code></a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/form-data/form-data/blob/master/CHANGELOG.md">form-data's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.3...v4.0.4">v4.0.4</a>
- 2025-07-16</h2>
<h3>Commits</h3>
<ul>
<li>[meta] add <code>auto-changelog</code> <a
href="811f68282f"><code>811f682</code></a></li>
<li>[Tests] handle predict-v8-randomness failures in node < 17 and
node > 23 <a
href="1d11a76434"><code>1d11a76</code></a></li>
<li>[Fix] Switch to using <code>crypto</code> random for boundary values
<a
href="3d1723080e"><code>3d17230</code></a></li>
<li>[Tests] fix linting errors <a
href="5e340800b5"><code>5e34080</code></a></li>
<li>[meta] actually ensure the readme backup isn’t published <a
href="316c82ba93"><code>316c82b</code></a></li>
<li>[Dev Deps] update <code>@ljharb/eslint-config</code> <a
href="58c25d7640"><code>58c25d7</code></a></li>
<li>[meta] fix readme capitalization <a
href="2300ca1959"><code>2300ca1</code></a></li>
</ul>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.2...v4.0.3">v4.0.3</a>
- 2025-06-05</h2>
<h3>Fixed</h3>
<ul>
<li>[Fix] <code>append</code>: avoid a crash on nullish values <a
href="https://redirect.github.com/form-data/form-data/issues/577"><code>[#577](https://github.com/form-data/form-data/issues/577)</code></a></li>
</ul>
<h3>Commits</h3>
<ul>
<li>[eslint] use a shared config <a
href="426ba9ac44"><code>426ba9a</code></a></li>
<li>[eslint] fix some spacing issues <a
href="20941917f0"><code>2094191</code></a></li>
<li>[Refactor] use <code>hasown</code> <a
href="81ab41b46f"><code>81ab41b</code></a></li>
<li>[Fix] validate boundary type in <code>setBoundary()</code> method <a
href="8d8e469309"><code>8d8e469</code></a></li>
<li>[Tests] add tests to check the behavior of <code>getBoundary</code>
with non-strings <a
href="837b8a1f75"><code>837b8a1</code></a></li>
<li>[Dev Deps] remove unused deps <a
href="870e4e6659"><code>870e4e6</code></a></li>
<li>[meta] remove local commit hooks <a
href="e6e83ccb54"><code>e6e83cc</code></a></li>
<li>[Dev Deps] update <code>eslint</code> <a
href="4066fd6f65"><code>4066fd6</code></a></li>
<li>[meta] fix scripts to use prepublishOnly <a
href="c4bbb13c0e"><code>c4bbb13</code></a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="41996f5ac7"><code>41996f5</code></a>
v4.0.4</li>
<li><a
href="316c82ba93"><code>316c82b</code></a>
[meta] actually ensure the readme backup isn’t published</li>
<li><a
href="2300ca1959"><code>2300ca1</code></a>
[meta] fix readme capitalization</li>
<li><a
href="811f68282f"><code>811f682</code></a>
[meta] add <code>auto-changelog</code></li>
<li><a
href="5e340800b5"><code>5e34080</code></a>
[Tests] fix linting errors</li>
<li><a
href="1d11a76434"><code>1d11a76</code></a>
[Tests] handle predict-v8-randomness failures in node < 17 and node
> 23</li>
<li><a
href="58c25d7640"><code>58c25d7</code></a>
[Dev Deps] update <code>@ljharb/eslint-config</code></li>
<li><a
href="3d1723080e"><code>3d17230</code></a>
[Fix] Switch to using <code>crypto</code> random for boundary
values</li>
<li><a
href="d8d67dc8ac"><code>d8d67dc</code></a>
v4.0.3</li>
<li><a
href="e6e83ccb54"><code>e6e83cc</code></a>
[meta] remove local commit hooks</li>
<li>Additional commits viewable in <a
href="https://github.com/form-data/form-data/compare/v4.0.2...v4.0.4">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
- Added ability to specify `required_scope` when declaring an API. This
is part of the `@webmethod` decorator.
- If auth is enabled, a user can access an API only if
`user.attributes['scope']` includes the `required_scope`
- We add `required_scope='telemetry.read'` to the telemetry read APIs.
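For illustration, a simplified sketch of how such a decorator can carry the scope (the real `@webmethod` implementation may differ):
```python
def webmethod(route: str, method: str = "GET", required_scope: str | None = None):
    """Record routing metadata, including the scope a caller must hold."""

    def wrapper(fn):
        fn.__webmethod__ = {"route": route, "method": method, "required_scope": required_scope}
        return fn

    return wrapper


class Telemetry:
    @webmethod(route="/telemetry/traces", method="POST", required_scope="telemetry.read")
    async def query_traces(self) -> list[dict]:
        return []


# At request time the server compares the authenticated user's scopes against
# query_traces.__webmethod__["required_scope"] and returns 403 on a mismatch.
```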
## Test Plan
CI with added tests
1. Enable server.auth with github token
2. Observe `client.telemetry.query_traces()` returns 403
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds support for the new Streamable HTTP transport for MCP, as
well as falling back to the SSE protocol if the Streamable HTTP
connection fails.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #2542
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Calum Murray <cmurray@redhat.com>
# What does this PR do?
Prototype of a new feature to allow new APIs to be plugged into Llama
Stack. Opened for early feedback on the approach and to gauge appetite
for the functionality.
@ashwinb @raghotham open for early feedback, thanks!
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Currently, `print` is used with custom formatting to produce telemetry
output in the console_span_processor.
This causes telemetry not to show up in log files when using
`LLAMA_STACK_LOG_FILE`; during testing it looks like telemetry is not
being captured when it actually is.
Switch to using Rich formatting with the logger, and strip the formatting
when a log file is being used so the output looks normal.
## Test Plan
before:
console:
<img width="967" height="127" alt="Screenshot 2025-07-21 at 4 02 15 PM"
src="https://github.com/user-attachments/assets/b09518cc-9d38-4970-9877-70e2c41fcbb5"
/>
log file (no telemetry):
```
2025-07-21 16:01:32,481 llama_stack.providers.remote.inference.ollama.ollama:117 inference: checking connectivity to Ollama at `http://localhost:11434`...
2025-07-21 16:01:34,779 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
2025-07-21 16:01:35,083 __main__:587 server: Listening on ['::', '0.0.0.0']:8321
2025-07-21 16:01:35,091 uvicorn.error:84 uncategorized: Started server process [68679]
2025-07-21 16:01:35,091 uvicorn.error:48 uncategorized: Waiting for application startup.
2025-07-21 16:01:35,092 __main__:163 server: Starting up
2025-07-21 16:01:35,092 uvicorn.error:62 uncategorized: Application startup complete.
2025-07-21 16:01:35,092 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
2025-07-21 16:01:37,167 uvicorn.access:473 uncategorized: 127.0.0.1:53145 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
```
after:
console:
<img width="797" height="165" alt="Screenshot 2025-07-22 at 3 28 44 PM"
src="https://github.com/user-attachments/assets/44d40e3b-6502-439d-9ea5-38058b289962"
/>
log file:
```
2025-07-21 15:59:51,481 llama_stack.providers.remote.inference.ollama.ollama:117 inference: checking connectivity to Ollama at `http://localhost:11434`...
2025-07-21 15:59:53,801 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
2025-07-21 15:59:54,059 __main__:587 server: Listening on ['::', '0.0.0.0']:8321
2025-07-21 15:59:54,066 uvicorn.error:84 uncategorized: Started server process [68578]
2025-07-21 15:59:54,067 uvicorn.error:48 uncategorized: Waiting for application startup.
2025-07-21 15:59:54,067 __main__:163 server: Starting up
2025-07-21 15:59:54,067 uvicorn.error:62 uncategorized: Application startup complete.
2025-07-21 15:59:54,068 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
2025-07-21 15:59:55,381 [TELEMETRY] 19:59:55.381 /v1/openai/v1/chat/completions
2025-07-21 15:59:55,619 uvicorn.access:473 uncategorized: 127.0.0.1:53102 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
2025-07-21 15:59:55,621 [TELEMETRY] 19:59:55.621 /v1/openai/v1/chat/completions [StatusCode.OK] (240.07ms)
2025-07-21 15:59:55,622 [TELEMETRY] 19:59:55.620 127.0.0.1:53102 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
```
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Our demo installation script should pull the starter image. Ollama is
not being updated anymore as a distribution.
Signed-off-by: Sébastien Han <seb@redhat.com>
This flips #2823 and #2805 by making the Stack periodically query the
providers for models rather than the providers going behind our back and
calling "register" on the registry themselves. This also adds support
for model listing for all other providers via `ModelRegistryHelper`.
Once this is done, we do not need to manually list or register models
via `run.yaml` and it will remove both noise and annoyance (setting
`INFERENCE_MODEL` environment variables, for example) from the new user
experience.
In addition, it adds a configuration variable `allowed_models` which can
be used to optionally restrict the set of models exposed from a
provider.
# What does this PR do?
Adds type guards in /distribution/inspect.py and ignores a valid-type
mypy error in library_client.py. This PR is part of issue #2647 . I'm
rather unsure whether ignoring the valid-type error is correct in this
case. It appears that args[0] is interpreted as [any] but I didn't find
any way to specify the type.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Bulk improvements:
* The script has better error reporting: when a command fails, it prints
the logs of the failed command
* Better error handling using a trap to catch signal and perform proper
cleanup
* Cosmetic changes
* Added CI to test the image code against main
* Use the starter image and its latest tag
Signed-off-by: Sébastien Han <seb@redhat.com>
- Add setup-vllm GitHub action to start VLLM container
- Extend integration test matrix to support both ollama and vllm
providers
- Make test setup conditional based on provider type
- Add provider-specific environment variables and configurations
- vllm tests setup to run weekly or can be triggered manually (only
ollama on PR)
TODO:
investigate failing tests for vllm provider (safety and post_training)
Also need a proper fix for #2713 (tmp fix for this in the first commit
in this PR)
Closes: #1648
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This pull request adds documentation to clarify the differences between
the Agents API and the OpenAI Responses API, including use cases for
each. It also updates the index page to reference the new documentation.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2368
# What does this PR do?
Updates the script `scripts/check-workflows-use-hashes.sh` to improve
error reporting by adopting GitHub Actions error annotation format.
* Updated the script to use GitHub Actions error annotation format
(`::error file={name},line={line},col={col}::{message}`) making error
messages more actionable and easier to locate in workflows.
* Modified the script to include line numbers for `uses:` references by
using `grep -n` and extracting line numbers, improving the precision of
error reporting.
Closes #2778
## Test Plan
- Violation check - Created test file with mixed SHA/non-SHA actions
```
echo 'uses: actions/checkout@v4' > test-workflow.yml
echo 'uses: actions/upload-artifact@main' >> test-workflow.yml
```
Result: Correctly detected violations with precise line numbers
```
./scripts/check-workflows-use-hashes.sh
Output:
::error file=test-workflow.yml,line=14::uses non-SHA action ref: uses: actions/checkout@v4
::error file=test-workflow.yml,line=20::uses non-SHA action ref: uses: actions/upload-artifact@main
```
- Verified existing project workflows pass
```
./scripts/check-workflows-use-hashes.sh
# Result: Exit code 0 (all workflows properly SHA-pinned)
```
# What does this PR do?
openai/models.py has backward-compat entries for litellm model names.
The starter template includes these in the list of registered models.
The inclusion results in duplicate model registrations.
The backward compat is no longer necessary.
## Test Plan
ci
# What does this PR do?
This PR implements the openai compatible endpoints for chromadb
Closes #2462
## Test Plan
Ran ollama llama stack server and ran the command
`pytest -sv --stack-config=http://localhost:8321
tests/integration/vector_io/test_openai_vector_stores.py
--embedding-model all-MiniLM-L6-v2`
8 failed, 27 passed, 8 skipped, 1 xfailed
The failed ones relate to the files API.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com>
Co-authored-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
- Use printf to escape special characters (e.g. < > )
- Apply escaping to pip_dependencies and special_pip_deps
Resolves shell interpretation of >= operators as redirections, which was
causing builds to fail to respect versions and creating unexpected files
in the /app directory.
Closes: #2866
## Test Plan
Manually tested, will also be tested by existing CI
Signed-off-by: Derek Higgins <derekh@redhat.com>
- rm TEMP_DIR when build_container.sh succeeds
- prevents multiple temp directories with Containerfile being left in
/tmp
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I fixed the test_access_policy() function by providing provider_model_id
in each register-model call so that the assertions pass.
Initially I faced this issue:
```
tests/unit/server/test_quota.py::test_authenticated_quota_allows_up_to_limit
tests/unit/server/test_quota.py::test_authenticated_quota_blocks_after_limit
tests/unit/server/test_quota.py::test_anonymous_quota_allows_up_to_limit
tests/unit/server/test_quota.py::test_anonymous_quota_blocks_after_limit
/Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/aiosqlite/core.py:105: DeprecationWarning: The default datetime adapter is deprecated as of Python 3.12; see the sqlite3 documentation for suggested replacement recipes
result = function()
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================================== short test summary info ===============================================================================
FAILED tests/unit/server/test_access_control.py::test_access_policy - AssertionError: assert 'test_provider/model-1' == 'model-1'
==================================================================== 1 failed, 436 passed, 194 warnings in 20.09s ====================================================================
```
After resolved, all works:
```
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================= 437 passed, 194 warnings in 19.41s =========================================================================
```
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run ` ./scripts/unit-tests.sh`
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I noticed a few issues with my implementation of the search mode
validation for RagQuery.
This PR replaces the check for search mode in RagQuery with a Literal.
There were issues before with
```
TypeError: Object of type RAGSearchMode is not JSON serializable
```
When using
```
query_config = RAGQueryConfig(max_chunks=6, mode="vector").model_dump()
```
It also fixes the fact that, regardless of user input, "vector" was
always the search mode used.
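Sketch of the change (field names follow the snippets in this description; the defaults and the exact set of modes are assumptions):
```python
from typing import Literal

from pydantic import BaseModel


class RAGQueryConfig(BaseModel):
    max_chunks: int = 5
    mode: Literal["vector", "keyword"] = "vector"


# A plain Literal keeps the field JSON-serializable and the chosen mode is honored.
print(RAGQueryConfig(max_chunks=6, mode="keyword").model_dump())
# {'max_chunks': 6, 'mode': 'keyword'}
```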
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Verify that a chosen search mode works when using RAG Query, or use the
agent config below:
```
agent = Agent(
client,
model=model_id,
instructions="You are a helpful assistant",
tools=[
{
"name": "builtin::rag/knowledge_search",
"args": {
"vector_db_ids": [vector_db_id],
"query_config": {
"mode": "keyword",
"max_chunks": 6
}
},
}
],
)
```
Running Unit Tests:
```
uv sync --extra dev
uv run pytest tests/unit/rag/test_rag_query.py -v
```
# What does this PR do?
Moving vector store and vector store files helper methods to
`openai_vector_store_mixin.py`
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
The tests are already supported in the CI and tests the inline providers
and current integration tests.
Note that the `vector_index` fixture will test `milvus_vec_adapter`,
`faiss_vec_adapter`, and `sqlite_vec_adapter` in
`tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py`.
Additionally, the integration tests in `integration-vector-io-tests.yml`
runs `tests/integration/vector_io` tests for the following providers:
```python
vector-io-provider: ["inline::faiss", "inline::sqlite-vec", "inline::milvus", "remote::chromadb", "remote::pgvector"]
```
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Add an `OpenAIMixin` for use by inference providers whose remote
endpoints support an OpenAI-compatible API.
use is demonstrated by refactoring
- OpenAIInferenceAdapter
- NVIDIAInferenceAdapter (adds embedding support)
- LlamaCompatInferenceAdapter
## Test Plan
existing unit and integration tests
# What does this PR do?
https://github.com/meta-llama/llama-stack/pull/2716/ broke commands
like:
```
python -m llama_stack.distribution.server.server --config
llama_stack/templates/starter/run.yaml
```
And will fail with:
```
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/server/server.py", line 626, in <module>
main()
File "/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/server/server.py", line 402, in main
config_file = resolve_config_or_template(args.config, Mode.RUN)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/utils/config_resolution.py", line 43, in resolve_config_or_template
config_path = Path(config_or_template)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pathlib.py", line 1162, in __init__
super().__init__(*args)
File "/opt/homebrew/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pathlib.py", line 373, in __init__
raise TypeError(
TypeError: argument should be a str or an os.PathLike object where __fspath__ returns a str, not 'NoneType'
```
Complaining that no positional arguments are present. We now honour the
deprecation until --config and --template are removed completely.
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Both ` python -m llama_stack.distribution.server.server --config
llama_stack/templates/starter/run.yaml` and ` python -m
llama_stack.distribution.server.server
llama_stack/templates/starter/run.yaml` should run the server. Same for
`--template starter`.
Signed-off-by: Sébastien Han <seb@redhat.com>
- Remove --no-cache flags from uv pip install commands to enable caching
- Mount host uv cache directory to container for persistent caching
- Set UV_LINK_MODE=copy to prevent uv using hardlinks
- When building the starter image:
  - Build time reduced from ~4:45 to ~3:05 on subsequent builds
    (environment specific)
  - Eliminates re-downloading of 3G+ of data on each build
  - Cache size: ~6.2G (when building the starter image)
Fixes excessive data downloads during distro container builds.
Signed-off-by: Derek Higgins <derekh@redhat.com>
This PR updates model registration and lookup behavior to be slightly
more general / flexible. See
https://github.com/meta-llama/llama-stack/issues/2843 for more details.
Note that this change is backwards compatible given the design of the
`lookup_model()` method.
## Test Plan
Added unit tests
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR fixes flaky telemetry tests
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
See https://github.com/meta-llama/llama-stack/pull/2814
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
When podman is used and the registry is omitted, podman will prompt the
user. However, we're piping the output of podman to /dev/null, so the
user will not see the prompt; the script ends abruptly, which is
confusing.
This commit explicitly uses the docker.io registry for the ollama image
and the llama-stack image so that the prompt is avoided.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
I ran the script on a machine with podman and the issue was resolved
## Image
Before the fix, this is what would happen:
<img width="748" height="95" alt="image"
src="https://github.com/user-attachments/assets/9c609f88-c0a8-45e7-a789-834f64f601e5"
/>
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
# What does this PR do?
chore: Making name optional in openai_create_vector_store
Closes https://github.com/meta-llama/llama-stack/issues/2706
## Test Plan
CI and unit tests
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Ensures that session turns retrieved from the agent persistence layer
are sorted by their `started_at` timestamp, as the key-value store does
not guarantee order.
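The fix boils down to an explicit sort before returning the session, roughly (the types below are illustrative):
```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Turn:
    turn_id: str
    started_at: datetime


def sorted_session_turns(turns: list[Turn]) -> list[Turn]:
    # The KV store does not guarantee ordering, so order turns explicitly
    # by their started_at timestamp.
    return sorted(turns, key=lambda t: t.started_at)
```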
Closes #2852
## Test Plan
- [ ] Add unit tests
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
minor update of the pgvector doc, changing 'faiss' to 'pgvector'
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
This PR adds the quickstart as a file to the docs so that it can be more
easily maintained and run, as mentioned in
https://github.com/meta-llama/llama-stack/pull/2800.
## Test Plan
I could add this as a test in the CI but I wasn't sure if we wanted to
add additional jobs there. 😅
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Refactors the vector store routing logic by moving OpenAI-compatible
vector store operations from the `VectorIORouter` to the
`VectorDBsRoutingTable`.
Closes https://github.com/meta-llama/llama-stack/issues/2761
## Test Plan
Added unit tests to cover new routing logic and ACL checks.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Part of #2696
## Test Plan
Run `llama stack run starter`
Error:
```
myenv ❯ llama stack run starters
WARNING 2025-07-10 12:12:43,052 llama_stack.cli.stack.run:82 server: Conda detected. Using conda environment myenv for the run.
usage: llama stack run [-h] [--port PORT] [--image-name IMAGE_NAME] [--env KEY=VALUE]
[--image-type {conda,venv}] [--enable-ui]
[config | template]
llama stack run: error: Could not resolve config or template 'starters'.
Tried the following locations:
1. As file path: /Users/erichuang/projects/llama-stack-git/starters
2. As template: /Users/erichuang/projects/llama-stack-git/llama_stack/templates/starters/run.yaml
3. As built distribution: (/Users/erichuang/.llama/distributions/llamastack-starters/starters-run.yaml, /Users/erichuang/.llama/distributions/starters/starters-run.yaml)
Available templates: dell, test-env, vllm-gpu, test-template, cerebras, openai-api-verification, sambanova, passthrough, direct-config, together, openai, fireworks, meta-reference-gpu, __pycache__, dev, ollama, watsonx, remote-vllm, llama_api, groq, dummy, oracle, nvidia, ci-tests, postgres-demo, test-stack, bedrock, starter, hf-serverless, hf-endpoint, tgi, open-benchmark, verification
Did you mean one of these templates?
- starter
- together
- postgres-demo
```
# What does this PR do?
After https://github.com/meta-llama/llama-stack/pull/2818, SIGINT will
print a stack trace. This is because uvicorn re-raises SIGINT, which gets
converted by Python's internal signal handler (which handles SIGINT by
default) into a KeyboardInterrupt exception. We now simply catch the
exception to get a clean exit; this does not change the behavior on
SIGINT.
## Test Plan
Run the server, hit Ctrl+C or `kill -2 <server pid>` and expect a clean
exit with no stack trace.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The pre-commit workflow was failing in the main branch, and removing
`@pytest.mark.asyncio` from `test_get_raw_document_text.py` fixed that.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds the `provider_id` field to the `VectorDBInput` class.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
fixes https://github.com/meta-llama/llama-stack/issues/2819
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
The workflow that automatically creates a PR to update the Coverage
Badge fails as the `GITHUB_TOKEN` doesn't have write permissions.
As opposed to providing write permissions to the token, we can provide
the permissions for just this workflow with this PR.
Just like #2805 but for vLLM.
We also make VLLM_URL env variable optional (not required) -- if not
specified, the provider silently sits idle and yells eventually if
someone tries to call a completion on it. This is done so as to allow
this provider to be present in the `starter` distribution.
## Test Plan
Set up vLLM, copy the starter template and set `{ refresh_models: true,
refresh_models_interval: 10 }` for the vllm provider and then run:
```
ENABLE_VLLM=vllm VLLM_URL=http://localhost:8000/v1 \
uv run llama stack run --image-type venv /tmp/starter.yaml
```
Verify that `llama-stack-client models list` brings up the model
correctly from vLLM.
Inline _inference_ providers haven't proved to be very useful -- they
are rarely used. And for good reason -- it is almost never a good idea
to include a complex (distributed) inference engine bundled into a
distributed stateful front-end server serving many other things.
Responsibility should be split properly.
See Discord discussion:
1395849853
For self-hosted providers like Ollama (or vLLM), the backing server is
running a set of models. That server should be treated as the source of
truth and the Stack registry should just be a cache for those models. Of
course, in production environments, you may not want this (because you
know what model you are running statically) hence there's a config
boolean to control this behavior.
_This is part of a series of PRs aimed at removing the requirement of
needing to set `INFERENCE_MODEL` env variables for running Llama Stack
server._
## Test Plan
Copy and modify the starter.yaml template / config and enable
`refresh_models: true, refresh_models_interval: 10` for the ollama
provider. Then, run:
```
LLAMA_STACK_LOGGING=all=debug \
ENABLE_OLLAMA=ollama uv run llama stack run --image-type venv /tmp/starter.yaml
```
See a gargantuan amount of logs, but verify that the provider is
periodically refreshing models. Stop and prune a model from ollama
server, restart the server. Verify that the model goes away when I call
`uv run llama-stack-client models list`
# What does this PR do?
This PR fixes the `DPOAlignmentConfig` schema to use the correct Direct
Preference Optimization (DPO) parameters.
The current schema incorrectly uses PPO-inspired parameters
(`reward_scale`, `reward_clip`, `epsilon`, `gamma`) that are not part of
the DPO algorithm. This PR updates it to use the standard DPO
parameters:
- `beta`: The KL divergence coefficient that controls deviation from the
reference model
- `loss_type`: The type of DPO loss function (sigmoid, hinge, ipo,
kto_pair)
These parameters align with standard DPO implementations like
HuggingFace's TRL library.
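For illustration only, a hedged sketch of what such a schema can look like (the field names are from this PR; the enum and defaults are assumptions, not the actual class definition):
```python
# Conceptual sketch only -- the real DPOAlignmentConfig lives in llama-stack's
# post-training API and may carry additional fields.
from enum import Enum
from pydantic import BaseModel

class DPOLossType(str, Enum):
    sigmoid = "sigmoid"
    hinge = "hinge"
    ipo = "ipo"
    kto_pair = "kto_pair"

class DPOAlignmentConfig(BaseModel):
    beta: float                                    # KL-divergence coefficient vs. the reference model
    loss_type: DPOLossType = DPOLossType.sigmoid   # which DPO loss variant to use
```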
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-83.ec2.internal>
When we call `construct_stack()`, providers are instantiated and
`initialize()` is called. This call can end up doing _anything_ at all
-- specifically, providers are free to create long running background
tasks as part of this. If we wrapped this within a `asyncio.run()` as in
the current code, these tasks get canceled when the stack construction
finishes. This is not correct. The PR addresses the issue by creating a
persistent event loop which is used for both the stack as well as for
running the uvicorn server. In other words, the lifetime of the
providers (and downstream async code) is now the same as the lifetime of
the uvicorn server.
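A self-contained sketch of the pattern under discussion (all names here are illustrative stand-ins, not the actual server code):
```python
# Sketch: one long-lived loop owns both provider initialization (which may
# spawn background tasks) and the uvicorn server, so those tasks are not
# cancelled when initialization finishes.
import asyncio
import uvicorn
from fastapi import FastAPI

app = FastAPI()

async def refresh_loop() -> None:
    while True:  # stand-in for a provider's long-running background task
        await asyncio.sleep(10)

async def initialize_providers() -> None:
    asyncio.get_running_loop().create_task(refresh_loop())

def main() -> None:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    # If this used asyncio.run(...), the background task would be cancelled
    # as soon as initialization returned.
    loop.run_until_complete(initialize_providers())
    server = uvicorn.Server(uvicorn.Config(app, port=8321))
    loop.run_until_complete(server.serve())

if __name__ == "__main__":
    main()
```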
## Test Plan
This should not affect any current code since we don't have background
tasks created right now. However,
https://github.com/meta-llama/llama-stack/pull/2805 will start using
this functionality.
# What does this PR do?
The 'build' command didn't take into account ENABLE flags for the starter
distro for some reason. I was also having issues with HuggingFace access
for the embedding model, so I added a tip for that as well.
Closes #2779
## Test Plan
I ran the described steps manually, but it would be nice if someone else
could try it and verify this still works
We might consider having some CI job ensure the QSG remains functional -
it's not a great experience for new users if they try Llama Stack for
the first time and it doesn't work as we describe
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Added coverage badge to README. - [See my
fork](https://github.com/ChristianZaccaria/llama-stack)
- Added a GitHub Actions workflow that runs the tests and updates the
coverage badge. - [See
run](4574811323)
- Documented steps in `testing.md` for running the tests locally, and
viewing the `html` report.
- Excluded non-essential files from coverage reporting to provide a more
accurate measurement.
Automatically created PR to update coverage badge:
https://github.com/ChristianZaccaria/llama-stack/pull/9
# Note for reviewers
1. Currently the coverage report shows a 45% coverage. Wondering if
there are other files or directories that should also be excluded from
the report to increase the percentage. The directories with the least
test coverage are `llama_stack/cli`, `llama_stack/models`, and
`llama_stack/ui`. - Should we exclude these?
2. **[Required]** The `GITHUB_TOKEN` should have write permissions to
open a PR to update the coverage badge.
# GitHub Issue
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2355
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
The `testing.md` file describes how to run the unit tests locally.
# What does this PR do?
trigger integration tests on ALL changes to `tests/` to catch failures
before they merge into main
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
Some async test markers in the codebase cause pre-commit to fail due to
#2744; remove these pytest fixtures.
## Test Plan
pre-commit passes
Signed-off-by: Charlie Doern <cdoern@redhat.com>
If I am running `uv run llama stack run --image-type venv` it should not
be saying to me "Conda detected" because I am pretty clearly telling it
I need venv. The root cause is the offending line.
# What does this PR do?
## Test Plan
ENABLE_OLLAMA=ollama LLAMA_STACK_CONFIG=starter uv run pytest
tests/integration/telemetry
--text-model="ollama/llama3.2:3b-instruct-fp16"
# What does this PR do?
lets users register models available at
https://integrate.api.nvidia.com/v1/models that aren't already in
llama_stack/providers/remote/inference/nvidia/models.py
## Test Plan
1. run the nvidia distro
2. register a model from https://integrate.api.nvidia.com/v1/models that
isn't already known; as of this writing,
nvidia/llama-3.1-nemotron-ultra-253b-v1 is a good example
3. perform inference w/ the model
- POST /v1/models accepts optional provider_model_id
- ModelsRoutingTable.register_model handler ensures it is non-None,
providing a default
With this, usage of Model.provider_model_id no longer needs to detect None.
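A hedged sketch of the fallback described above (illustrative helper, not the actual handler):
```python
def default_provider_model_id(model_id: str, provider_model_id: str | None) -> str:
    """Illustrative: callers may omit provider_model_id; fall back to model_id."""
    return provider_model_id if provider_model_id is not None else model_id
```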
Move sentence-transformers to be the first embedding in the list of
models. This ensures it will always be the default and is more
consistent than having the default change based on what env variables
are available
Closes: #2702
## Test Plan
Manually verified
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Currently each disabled provider is printed as a warning; switch to
debug. This level of verbosity isn't necessary, especially if we intend
to grow the list of providers that can be in a single run yaml over time
## Test Plan
before:
<img width="1144" height="667" alt="Screenshot 2025-07-16 at 12 37
18 PM"
src="https://github.com/user-attachments/assets/d14dbf76-6e40-4996-8a27-111e6a987d71"
/>
after:
<img width="925" height="141" alt="Screenshot 2025-07-16 at 12 37 42 PM"
src="https://github.com/user-attachments/assets/81efdbe1-923c-4c5f-9731-f89729043920"
/>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Resolves https://github.com/meta-llama/llama-stack/issues/2770. It
replaces characters in SQLite table names that are not alphanumeric or
underscores with underscores and quotes the table names with square
brackets in SQL statements.
Closes #2770
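The sanitization amounts to something like this (illustrative sketch, not the exact helper from the PR):
```python
import re

def sanitize_table_name(bank_id: str) -> str:
    """Replace anything that is not alphanumeric or underscore with '_'."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", bank_id)

# e.g. "test_bank.123" -> "test_bank_123"; the name is then quoted as
# [test_bank_123] when interpolated into SQL statements.
print(sanitize_table_name("test_bank.123"))
```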
## Test Plan
I added a ".123" suffix to the bank_id on the following line
```
index = await SQLiteVecIndex.create(dimension=embedding_dimension, db_path=db_path, bank_id="test_bank.123")
```
in tests/unit/providers/vector_io/test_sqlite_vec.py, which, without the
fix in place, demonstrates the issue.
# What does this PR do?
this was causing an unnecessary logger warning
## Test Plan
Run `LLAMA_STACK_DIR=. ENABLE_OLLAMA=ollama
OLLAMA_INFERENCE_MODEL=llama3.2:3b llama stack build --template starter
--image-type venv --run` and then `Ctrl-C` to shut down
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
This was introduced in
https://github.com/meta-llama/llama-stack/pull/523 but as far as I can
tell has never been used. It's been over six months so it feels fair to
remove it at this point.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
The vision models are now available at the standard URL, so the
workaround code has been removed. This also simplifies the codebase by
eliminating the need for per-model client caching.
- Remove special URL handling for meta/llama-3.2-11b/90b-vision-instruct
models
- Convert _get_client method to _client property for cleaner API
- Remove unnecessary lru_cache decorator and functools import
- Simplify client creation logic to use single base URL for all models
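Conceptually, the change looks roughly like this (hedged sketch; class and attribute names are illustrative):
```python
# Sketch: a single client built from one base URL, exposed as a property
# instead of a cached, per-model _get_client() helper.
from openai import OpenAI

class InferenceAdapter:  # illustrative stand-in for the NVIDIA adapter
    def __init__(self, base_url: str, api_key: str) -> None:
        self._base_url = base_url
        self._api_key = api_key

    @property
    def _client(self) -> OpenAI:
        # No per-model caching: every model, including the vision models,
        # is served from the same standard URL.
        return OpenAI(base_url=self._base_url, api_key=self._api_key)
```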
# What does this PR do?
Adding OpenAI Vector Stores Files API compatibility for PGVector
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Updated CI to include PGVector
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Adds new documentation that was missing for the Llama Stack Python
Client as well as updates old/outdated docs
# What does this PR do?
https://github.com/meta-llama/llama-stack/pull/2490 introduced a new
function for type conversion of strings.
However, a side effect of this is that it will cast any string that can
be cast to an integer, which for something like `image_name` is not
desired, as we only accept strings for this field in the
`StackRunConfig`.
This PR introduces logic to ensure that `image_name` remains a string
Closes #2749
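A hedged sketch of the guard (illustrative; the real conversion helper handles more cases):
```python
def convert_env_value(key: str, value: str):
    """Illustrative: keep image_name as a string even when it looks numeric."""
    if key == "image_name":
        return value  # "2745" must stay a string for StackRunConfig
    try:
        return int(value)
    except ValueError:
        return value

assert convert_env_value("image_name", "2745") == "2745"
```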
## Test Plan
You can run the original step to reproduce from the bug to verify this
manually
```bash
OPENAI_API_KEY=bogus llama stack build --image-type venv --image-name 2745 --providers inference=remote::openai --run
```
I have also added an additional unit test to prevent any future
regression here
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Resolves https://github.com/meta-llama/llama-stack/issues/2735
Currently, if you test against OpenAI's Vector Stores API the
`client.vector_stores.search` call fails with an invalid vector_db
during routing (see the script referenced in the clickable item under
the Test Plan section).
This PR ensures that `client.vector_stores.search()` is compatible with
OpenAI's Vector Stores API.
Two biggest changes:
1. The `name`, which was previously used as the `vector_db_id`, has been
changed to be consistent with OpenAI's `vs_{uuid}` format.
2. The vector store has to be referenced by its ID; the name is not
reliable, as every `client.vector_stores.create` call results in a new
vector store.
NOTE: I believe this is a breaking change for end users as they'll need
to update their VectorDB identifiers.
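For illustration, a hypothetical walkthrough of the new behavior (the endpoint and names are placeholders):
```python
from openai import OpenAI

# Stores are addressed by their generated "vs_..." ID, never by name.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

vs = client.vector_stores.create(name="Support FAQ")
print(vs.id)  # e.g. "vs_1a2b3c..." -- generated, not derived from the name

# A second create() with the same name yields a *new* store, so searches
# must pass the ID, not the name.
results = client.vector_stores.search(vector_store_id=vs.id, query="return policy")
```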
## Test Plan
Unit tests:
```bash
./scripts/unit-tests.sh tests/unit/providers/vector_io/ -v
```
Integration tests:
```bash
ENABLE_MILVUS=milvus llama stack run /Users/farceo/dev/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv
LLAMA_STACK_CONFIG=http://localhost:8321 pytest -sv tests/integration/vector_io/test_openai_vector_stores.py --embedding-model=all-MiniLM-L6-v2 -vv
```
Unit tests and test script below 👇
<details>
<summary>Click here for script used to test OpenAI and Llama Stack
Vector Store implementation</summary>
```python
import json
import argparse
from openai import OpenAI, pagination
import logging
from colorama import Fore, Style, init
import traceback
import os

# Initialize colorama for color support in terminal
init(autoreset=True)

# Setup basic logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

DEMO_VECTOR_STORE_NAME = "Support FAQ FJA"
global DEMO_VECTOR_STORE_ID
global DEMO_VECTOR_STORE_ID2


def colored_print(color, text):
    """Prints text to the console with the specified color."""
    print(f"{color}{text}{Style.RESET_ALL}")


def log_and_print(color, message, level=logging.INFO):
    """Logs a message and prints it to the console with the specified color."""
    logging.log(level, message)
    colored_print(color, message)


def run_tests(client, prefix="openai"):
    """
    Runs all tests using the provided OpenAI client and saves the output
    to JSON files with the given prefix.
    """
    # Create the directory if it doesn't exist
    os.makedirs('openai_testing', exist_ok=True)

    # Default values in case tests fail
    global DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2
    DEMO_VECTOR_STORE_ID = None
    DEMO_VECTOR_STORE_ID2 = None

    def test_idempotent_vector_store_creation():
        """
        Test that creating a vector store with the same name is idempotent.
        """
        log_and_print(Fore.BLUE, "Starting vector store creation test...")
        try:
            vector_store = client.vector_stores.create(
                name=DEMO_VECTOR_STORE_NAME,
            )
            # Attempt to create the same vector store again
            vector_store2 = client.vector_stores.create(
                name=DEMO_VECTOR_STORE_NAME,
            )
            # Check instead of assert
            if vector_store2.id != vector_store.id:
                log_and_print(Fore.YELLOW, f"FAILED IDEMPOTENCY: the same VectorStore name for {prefix.upper()} does not return the same ID",
                              level=logging.WARNING)
            else:
                log_and_print(Fore.GREEN, f"PASSED IDEMPOTENCY: f{vector_store2.id} == {vector_store.id} the same VectorStore name for {prefix.upper()} returns the same ID")
            vector_store_data = vector_store.to_dict()
            log_and_print(Fore.WHITE, f"vector_stores.create = {json.dumps(vector_store_data, indent=2)}")
            with open(f'openai_testing/{prefix}_vector_store_create.json', 'w') as f:
                json.dump(vector_store_data, f, indent=2)

            global DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2
            DEMO_VECTOR_STORE_ID = vector_store.id
            DEMO_VECTOR_STORE_ID2 = vector_store2.id
            return DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2
        except Exception as e:
            log_and_print(Fore.RED, f"Idempotent vector store creation test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())
            # Create a fallback vector store ID if needed
            if 'vector_store' in locals() and vector_store:
                DEMO_VECTOR_STORE_ID = vector_store.id
            return DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2

    def test_vector_store_list():
        """
        Test listing vector stores.
        """
        log_and_print(Fore.BLUE, "Starting vector store list test...")
        try:
            vector_stores = client.vector_stores.list()
            # Check instead of assert
            if not isinstance(vector_stores, pagination.SyncCursorPage):
                log_and_print(Fore.YELLOW, f"FAILED: Expected a list of vector stores, got {type(vector_stores)}",
                              level=logging.WARNING)
            else:
                log_and_print(Fore.GREEN, "Vector store list test passed!")
            vector_stores_data = vector_stores.to_dict()
            log_and_print(Fore.WHITE, f"vector_stores.list = {json.dumps(vector_stores_data, indent=2)}")
            with open(f'openai_testing/{prefix}_vector_store_list.json', 'w') as f:
                json.dump(vector_stores_data, f, indent=2)
        except Exception as e:
            log_and_print(Fore.RED, f"Vector store list test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())

    def test_retrieve_vector_store():
        """
        Test retrieving a specific vector store.
        """
        log_and_print(Fore.BLUE, "Starting retrieve vector store test...")
        if not DEMO_VECTOR_STORE_ID:
            log_and_print(Fore.YELLOW, "Skipping retrieve vector store test - no vector store ID available",
                          level=logging.WARNING)
            return
        try:
            vector_store = client.vector_stores.retrieve(
                vector_store_id=DEMO_VECTOR_STORE_ID,
            )
            # Check instead of assert
            if vector_store.id != DEMO_VECTOR_STORE_ID:
                log_and_print(Fore.YELLOW, "FAILED: Retrieved vector store ID does not match", level=logging.WARNING)
            else:
                log_and_print(Fore.GREEN, "Retrieve vector store test passed!")
            vector_store_data = vector_store.to_dict()
            log_and_print(Fore.WHITE, f"vector_stores.retrieve = {json.dumps(vector_store_data, indent=2)}")
            with open(f'openai_testing/{prefix}_vector_store_retrieve.json', 'w') as f:
                json.dump(vector_store_data, f, indent=2)
        except Exception as e:
            log_and_print(Fore.RED, f"Retrieve vector store test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())

    def test_modify_vector_store():
        """
        Test modifying a vector store.
        """
        log_and_print(Fore.BLUE, "Starting modify vector store test...")
        if not DEMO_VECTOR_STORE_ID:
            log_and_print(Fore.YELLOW, "Skipping modify vector store test - no vector store ID available",
                          level=logging.WARNING)
            return
        try:
            updated_vector_store = client.vector_stores.update(
                vector_store_id=DEMO_VECTOR_STORE_ID,
                name="Updated Support FAQ FJA",
            )
            # Check instead of assert
            if updated_vector_store.name != "Updated Support FAQ FJA":
                log_and_print(Fore.YELLOW, "FAILED: Vector store name was not updated correctly", level=logging.WARNING)
            else:
                log_and_print(Fore.GREEN, "Modify vector store test passed!")
            updated_vector_store_data = updated_vector_store.to_dict()
            log_and_print(Fore.WHITE, f"vector_stores.modify = {json.dumps(updated_vector_store_data, indent=2)}")
            with open(f'openai_testing/{prefix}_vector_store_modify.json', 'w') as f:
                json.dump(updated_vector_store_data, f, indent=2)
        except Exception as e:
            log_and_print(Fore.RED, f"Modify vector store test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())

    def test_delete_vector_store():
        """
        Test deleting a vector store.
        """
        log_and_print(Fore.BLUE, "Starting delete vector store test...")
        if not DEMO_VECTOR_STORE_ID2:
            log_and_print(Fore.YELLOW, "Skipping delete vector store test - no second vector store ID available",
                          level=logging.WARNING)
            return
        try:
            response = client.vector_stores.delete(
                vector_store_id=DEMO_VECTOR_STORE_ID2,
            )
            log_and_print(Fore.GREEN, "Delete vector store test passed!")
            response_data = response.to_dict()
            log_and_print(Fore.WHITE, f"Vector store delete response = {json.dumps(response_data, indent=2)}")
            with open(f'openai_testing/{prefix}_vector_store_delete.json', 'w') as f:
                json.dump(response_data, f, indent=2)
        except Exception as e:
            log_and_print(Fore.RED, f"Delete vector store test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())

    def test_create_vector_store_file():
        log_and_print(Fore.BLUE, "Starting create vector store file test...")
        if not DEMO_VECTOR_STORE_ID:
            log_and_print(Fore.YELLOW, "Skipping create vector store file test - no vector store ID available",
                          level=logging.WARNING)
            return
        try:
            # create jsonl of files as an example
            with open("mydata.jsonl", "w") as f:
                f.write('{"text": "What is the return policy?", "metadata": {"category": "support"}}\n')
                f.write('{"text": "How do I reset my password?", "metadata": {"category": "support"}}\n')
                f.write('{"text": "Where can I find my order history?", "metadata": {"category": "support"}}\n')
                f.write('{"text": "What are the shipping options?", "metadata": {"category": "support"}}\n')
                f.write('{"text": "What is your favorite banana?", "metadata": {"category": "support"}}\n')
            # Create a simple text file if my_data_small.txt doesn't exist
            if not os.path.exists("my_data_small.txt"):
                with open("my_data_small.txt", "w") as f:
                    f.write("This is a test file for vector store testing.\n")
            created_file = client.files.create(
                file=open("my_data_small.txt", "rb"),
                purpose="assistants",
            )
            created_file_data = created_file.to_dict()
            log_and_print(Fore.WHITE, f"Created file {json.dumps(created_file_data, indent=2)}")
            with open(f'openai_testing/{prefix}_file_create.json', 'w') as f:
                json.dump(created_file_data, f, indent=2)
            retrieved_files = client.files.retrieve(created_file.id)
            retrieved_files_data = retrieved_files.to_dict()
            log_and_print(Fore.WHITE, f"Retrieved file {json.dumps(retrieved_files_data, indent=2)}")
            with open(f'openai_testing/{prefix}_file_retrieve.json', 'w') as f:
                json.dump(retrieved_files_data, f, indent=2)
            vector_store_file = client.vector_stores.files.create(
                vector_store_id=DEMO_VECTOR_STORE_ID,
                file_id=created_file.id,
            )
            log_and_print(Fore.GREEN, "Create vector store file test passed!")
        except Exception as e:
            log_and_print(Fore.RED, f"Create vector store file test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())

    def test_search_vector_store():
        """
        Test searching a vector store.
        """
        log_and_print(Fore.BLUE, "Starting search vector store test...")
        if not DEMO_VECTOR_STORE_ID:
            log_and_print(Fore.YELLOW, "Skipping search vector store test - no vector store ID available",
                          level=logging.WARNING)
            return
        try:
            query = "What is the banana policy?"
            search_results = client.vector_stores.search(
                vector_store_id=DEMO_VECTOR_STORE_ID,
                query=query,
                max_num_results=10,
                ranking_options={
                    'ranker': 'default-2024-11-15',
                    'score_threshold': 0.0,
                },
                rewrite_query=False,
            )
            # Check instead of assert
            if not isinstance(search_results, pagination.SyncPage):
                log_and_print(Fore.YELLOW, f"FAILED: Expected a list of search results, got {type(search_results)}",
                              level=logging.WARNING)
            else:
                log_and_print(Fore.GREEN, "Search vector store test passed!")
            search_results_dict = search_results.to_dict()
            log_and_print(Fore.WHITE, f"Search results = {search_results_dict}")
            with open(f'openai_testing/{prefix}_vector_store_search.json', 'w') as f:
                json.dump(search_results_dict, f, indent=2)
            log_and_print(Fore.WHITE, f"vector_stores.search = {search_results.to_json()}")
        except Exception as e:
            log_and_print(Fore.RED, f"Search vector store test failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())

    # Run all tests in sequence, even if some fail
    test_results = []

    try:
        result = test_idempotent_vector_store_creation()
        if result and len(result) == 2:
            DEMO_VECTOR_STORE_ID, DEMO_VECTOR_STORE_ID2 = result
        test_results.append(True)
    except Exception as e:
        log_and_print(Fore.RED, f"Vector store creation test failed: {e}", level=logging.ERROR)
        logging.error(traceback.format_exc())
        test_results.append(False)

    for test_func in [
        test_vector_store_list,
        test_retrieve_vector_store,
        test_modify_vector_store,
        test_delete_vector_store,
        test_create_vector_store_file,
        test_search_vector_store
    ]:
        try:
            test_func()
            test_results.append(True)
        except Exception as e:
            log_and_print(Fore.RED, f"{test_func.__name__} failed: {e}", level=logging.ERROR)
            logging.error(traceback.format_exc())
            test_results.append(False)

    if all(test_results):
        log_and_print(Fore.GREEN, f"All {prefix} tests completed successfully!")
    else:
        failed_count = test_results.count(False)
        log_and_print(Fore.YELLOW, f"{failed_count} {prefix} test(s) failed, but script completed.")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run OpenAI and/or LlamaStack tests.")
    parser.add_argument(
        "--provider",
        type=str,
        default="llama",
        choices=["openai", "llama", "both"],
        help="Specify which environment to test: openai, llama, or both. Default is both.",
    )
    args = parser.parse_args()

    try:
        if args.provider in ("openai", "both"):
            openai_client = OpenAI()
            run_tests(openai_client, prefix="openai")
        if args.provider in ("llama", "both"):
            llama_client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")
            run_tests(llama_client, prefix="llama")
        log_and_print(Fore.GREEN, "All tests completed!")
    except Exception as e:
        log_and_print(Fore.RED, f"Tests failed to complete: {e}", level=logging.ERROR)
        logging.error(traceback.format_exc())
```
</details>
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
This PR adds the keyword search implementation for Milvus. Along with
the implementation for remote Milvus, the tests require us to start a
Milvus container locally.
In order to verify the implementation, run:
```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s --tb=short --disable-warnings --asyncio-mode=auto
```
You can also test the changes using the below script:
```
#!/usr/bin/env python3
import asyncio
import os
import uuid
from typing import List

from llama_stack_client import (
    Agent,
    AgentEventLogger,
    LlamaStackClient,
    RAGDocument
)


class MilvusRAGDemo:
    def __init__(self, base_url: str = "http://localhost:8321/"):
        self.client = LlamaStackClient(base_url=base_url)
        self.vector_db_id = f"milvus_rag_demo_{uuid.uuid4().hex[:8]}"
        self.model_id = None
        self.embedding_model_id = None
        self.embedding_dimension = None

    def setup_models(self):
        """Get available models and select appropriate ones for LLM and embeddings."""
        models = self.client.models.list()
        # Select embedding model
        embedding_models = [m for m in models if m.model_type == "embedding"]
        if not embedding_models:
            raise ValueError("No embedding models found")
        self.embedding_model_id = embedding_models[0].identifier
        self.embedding_dimension = embedding_models[0].metadata["embedding_dimension"]

    def register_vector_db(self):
        print(f"Registering Milvus vector database: {self.vector_db_id}")
        response = self.client.vector_dbs.register(
            vector_db_id=self.vector_db_id,
            embedding_model=self.embedding_model_id,
            embedding_dimension=self.embedding_dimension,
            provider_id="milvus-remote",  # Use remote Milvus
        )
        print(f"Vector database registered successfully")
        return response

    def insert_documents(self):
        """Insert sample documents into the vector database."""
        print("\nInserting sample documents...")
        # Sample documents about different topics
        documents = [
            RAGDocument(
                document_id="ai_ml_basics",
                content="""
                Artificial Intelligence (AI) and Machine Learning (ML) are transforming the world.
                AI refers to the simulation of human intelligence in machines, while ML is a subset
                of AI that enables computers to learn and improve from experience without being
                explicitly programmed. Deep learning, a subset of ML, uses neural networks with
                multiple layers to process complex patterns in data.

                Key concepts in AI/ML include:
                - Supervised Learning: Training with labeled data
                - Unsupervised Learning: Finding patterns in unlabeled data
                - Reinforcement Learning: Learning through trial and error
                - Neural Networks: Computing systems inspired by biological brains
                """,
                mime_type="text/plain",
                metadata={"topic": "technology", "category": "ai_ml"},
            ),
        ]
        # Insert documents with chunking
        self.client.tool_runtime.rag_tool.insert(
            documents=documents,
            vector_db_id=self.vector_db_id,
            chunk_size_in_tokens=200,  # Smaller chunks for better granularity
        )
        print(f"Inserted {len(documents)} documents with chunking")

    def test_keyword_search(self):
        """Test keyword-based search using BM25."""
        queries = [
            "neural networks",
            "Python frameworks",
            "data cleaning",
        ]
        for query in queries:
            response = self.client.vector_io.query(
                vector_db_id=self.vector_db_id,
                query=query,
                params={
                    "mode": "keyword",  # Keyword search
                    "max_chunks": 3,
                    "score_threshold": 0.0,
                }
            )
            for i, (chunk, score) in enumerate(zip(response.chunks, response.scores)):
                print(f" {i+1}. Score: {score:.4f}")
                print(f" Content: {chunk.content[:100]}...")
                print(f" Metadata: {chunk.metadata}")

    def run_demo(self):
        try:
            self.setup_models()
            self.register_vector_db()
            self.insert_documents()
            self.test_keyword_search()
        except Exception as e:
            print(f"Error during demo: {e}")
            raise


def main():
    """Main function to run the demo."""
    # Check if Llama Stack server is running
    demo = MilvusRAGDemo()
    try:
        demo.run_demo()
    except Exception as e:
        print(f"Demo failed: {e}")


if __name__ == "__main__":
    main()
```
[//]: # (## Documentation)
---------
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
- fireworks, together do not support Llama-guard 3 8b model anymore
- Need to default to ollama
- current safety shields logic was not correct since the shield_id was
the provider (which had duplicates)
- Followed similar logic to models
Note: Seems a bit over-engineered but this can now be extended to other
providers and fits in the overall mechanism of how env_vars are used to
manage starter.
### How to test
```
ENABLE_OLLAMA=ollama ENABLE_FIREWORKS=fireworks SAFETY_MODEL=llama-guard3:1b pytest -s -v tests/integration/ --stack-config starter -k 'not(supervised_fine_tune or builtin_tool_code or safety_with_image or code_interpreter_for or rag_and_code or truncation or register_and_unregister)' --text-model fireworks/meta-llama/Llama-3.3-70B-Instruct --vision-model fireworks/meta-llama/Llama-4-Scout-17B-16E-Instruct --safety-shield llama-guard3:1b --embedding-model all-MiniLM-L6-v2
```
### Related but not obvious in this PR
In the llama-stack-ops repo, we run tests before publishing packages and
docker containers.
The actions in that repo were using the fireworks / together distros
(which are non-existent), so we need to update them to run with
`starter` and use `ollama` specifically for safety.
# What does this PR do?
Adds a template for technical debt. Currently we don't support blank
issues, so everything filed has to be a bug or a feature.
This would allow maintainers as well as community members to track
things we might want to merge to expose the functionality but should be
addressed later. Such things can also be "good first issues" for new
contributors.
## Example of what we constitute as technical debt
Inelegant code solutions, tests we intend to temporarily disable but
would like to restore, CI hacks around infrastructure or installation,
etc.
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
inference providers each have a static list of supported / known models.
some also have access to a dynamic list of currently available models.
this change gives providers using the ModelRegistryHelper the ability to
combine their static and dynamic lists.
for instance, OpenAIInferenceAdapter can implement
```
def query_available_models(self) -> list[str]:
    return [entry.model for entry in self.openai_client.models.list()]
```
to augment its static list w/ a current list from openai.
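A rough sketch of how the static and dynamic lists can then be merged (illustrative helper; see ModelRegistryHelper for the real mechanics):
```python
# Sketch: combine a provider's static model list with whatever the backing
# service reports right now, de-duplicating while preserving order.
def combine_model_ids(static_ids: list[str], dynamic_ids: list[str]) -> list[str]:
    seen: set[str] = set()
    combined: list[str] = []
    for model_id in [*static_ids, *dynamic_ids]:
        if model_id not in seen:
            seen.add(model_id)
            combined.append(model_id)
    return combined
```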
## Test Plan
scripts/unit-test.sh
Remove both the metadata and content from the kvstore when a file is
being removed from the vector store.
Closes: #2685
Also add faiss provider to openai_vector_stores test suite
---------
Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: raghotham <rsm@meta.com>
# What does this PR do?
This PR improves documentation clarity around run.yaml file usage. It
adds comprehensive guidance to help users understand that generated
run.yaml files are templates meant to be customized for production use,
not used as-is.
## Changes
- Add new documentation section on customizing run.yaml files
- Clarify that generated run.yaml files are templates, not production
configs
- Add guidance on customization best practices and common scenarios
- Update existing documentation to reference customization guide
- Improve clarity around run.yaml file usage for better user experience
## Test Plan
- Verified new documentation file exists at correct location
- Confirmed documentation is properly integrated into the toctree
structure
- Checked all internal links use correct paths and reference existing
files
- Validated references are added to relevant existing documentation
files
- Documentation build testing will be handled by CI environment
# What does this PR do?
Adds input validation for mode in RagQueryConfig
This will prevent users from inputting search modes other than `vector`
and `keyword` for the time being, with `hybrid` to follow when that
functionality is implemented.
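A minimal sketch of the validation (pydantic-style; the `mode` field and error message are from this PR, the rest is illustrative):
```python
from pydantic import BaseModel, field_validator

class RagQueryConfig(BaseModel):
    # illustrative subset of the real config
    mode: str = "vector"
    max_chunks: int = 5

    @field_validator("mode")
    @classmethod
    def _check_mode(cls, v: str) -> str:
        if v not in ("vector", "keyword"):
            raise ValueError(
                "mode must be either 'vector' or 'keyword' if supported by the vector_io provider"
            )
        return v
```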
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
```
# Check out this PR and enter the LS directory
uv sync --extra dev
```
Run the quickstart
[example](https://llama-stack.readthedocs.io/en/latest/getting_started/#step-3-run-the-demo)
Alter the Agent to include a query_config
```
agent = Agent(
    client,
    model=model_id,
    instructions="You are a helpful assistant",
    tools=[
        {
            "name": "builtin::rag/knowledge_search",
            "args": {
                "vector_db_ids": [vector_db_id],
                "query_config": {
                    "mode": "i-am-not-vector",  # Test for an invalid search mode
                    "max_chunks": 6
                }
            },
        }
    ],
)
```
Ensure you get the following error:
```
400: {'errors': [{'loc': ['mode'], 'msg': "Value error, mode must be either 'vector' or 'keyword' if supported by the vector_io provider", 'type': 'value_error'}]}
```
## Running unit tests
```
uv sync --extra dev
uv run pytest tests/unit/rag/test_rag_query.py -v
```
[//]: # (## Documentation)
# What does this PR do?
- Adds two pages to UI
- Vector stores
- Vector store detail view
- Fixed darkmode navbar highlighting
- Updated darkmode font color
- Updated llama-stack-client package
<img width="1916" height="734" alt="Screenshot 2025-07-12 at 11 34
35 PM"
src="https://github.com/user-attachments/assets/3f9b6727-ee82-4e6b-9555-2e3ef36d24d2"
/>
<img width="1912" height="910" alt="Screenshot 2025-07-12 at 11 57
09 PM"
src="https://github.com/user-attachments/assets/0c9d3b5e-5592-4dfb-8e04-a57edc9fb406"
/>
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
this blocks network access for all `tests/unit/` tests.
`tests/integration/` are untouched.
it also introduces an `allow_network` marker to explicitly allow network
access.
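Usage looks roughly like this (a sketch; the `allow_network` marker name is from this PR, the test itself is made up):
```python
import pytest

@pytest.mark.allow_network  # opt a specific unit test back into real network access
def test_can_reach_local_service():
    # illustrative body; unit tests are otherwise network-blocked by default
    assert True
```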
## Test Plan
`./scripts/unit-tests.sh`
# What does this PR do?
Some of our inference providers support passthrough authentication via
`x-llamastack-provider-data` header values. This fixes the providers
that support passthrough auth to not cache their clients to the backend
providers (mostly OpenAI client instances) so that the client connecting
to Llama Stack has to provide those auth values on each and every
request.
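A hedged sketch of the pattern (illustrative adapter; each fixed provider's actual code differs):
```python
# Sketch: build the backend client per request from the passthrough auth
# value instead of caching it on the adapter instance.
from openai import OpenAI

class ExampleInferenceAdapter:  # illustrative name
    def _get_client(self, api_key: str) -> OpenAI:
        # No lru_cache / instance attribute here: the key may come from the
        # x-llamastack-provider-data header and change on every request.
        return OpenAI(base_url="https://api.groq.com/openai/v1", api_key=api_key)
```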
## Test Plan
I added some unit tests to ensure we're not caching clients across
requests for all the fixed providers in this PR.
```
uv run pytest -sv tests/unit/providers/inference/test_inference_client_caching.py
```
I also ran some of our OpenAI compatible API integration tests for each
of the changed providers, just to ensure they still work. Note that
these providers don't actually pass all these tests (for unrelated
reasons due to quirks of the Groq and Together SaaS services), but
enough of the tests passed to confirm the clients are still working as
intended.
### Together
```
ENABLE_TOGETHER="together" \
uv run llama stack run llama_stack/templates/starter/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv \
tests/integration/inference/test_openai_completion.py \
--text-model "together/meta-llama/Llama-3.1-8B-Instruct"
```
### OpenAI
```
ENABLE_OPENAI="openai" \
uv run llama stack run llama_stack/templates/starter/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv \
tests/integration/inference/test_openai_completion.py \
--text-model "openai/gpt-4o-mini"
```
### Groq
```
ENABLE_GROQ="groq" \
uv run llama stack run llama_stack/templates/starter/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv \
tests/integration/inference/test_openai_completion.py \
--text-model "groq/meta-llama/Llama-3.1-8B-Instruct"
```
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
Update the Sambanova shield register validation not to raise, but only
warn when a model is not available at the base URL endpoint used.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
run starter distro with Sambanova enabled
# What does this PR do?
Previously, developers who ran `./scripts/unit-tests.sh` would get
`asyncio-mode=auto`, which meant `@pytest.mark.asyncio` and
`@pytest_asyncio.fixture` were redundant. Developers who ran `pytest`
directly would get pytest's default (strict mode) and would run into
errors, leading them to add `@pytest.mark.asyncio` /
`@pytest_asyncio.fixture` to their code.
with this change -
- `asyncio_mode=auto` is included in `pyproject.toml` making behavior
consistent for all invocations of pytest
- removes all redundant `@pytest_asyncio.fixture` and
`@pytest.mark.asyncio`
- for good measure, requires `pytest>=8.4` and `pytest-asyncio>=1.0`
## Test Plan
- `./scripts/unit-tests.sh`
- `uv run pytest tests/unit`
# What does this PR do?
Otherwise we can get old versions like 1.11 and experience this error:
```
ModuleNotFoundError: No module named 'opentelemetry.exporter.otlp.proto.http.metric_exporter'
```
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
COPY with chmod does not work, see
https://github.com/containers/buildah/issues/4614, although Docker
arguably implements it.
Anyway, this command is not even needed since later we do:
```
RUN mkdir -p /.llama /.cache && chmod -R g+rw /app /.llama /.cache
```
And providers.d will get the right modes.
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Build with CONTAINER_BINARY=podman and it succeeds.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
The current authorized sql store implementation does not respect
user.principal (only checks attributes). This PR addresses that.
## Test Plan
Added test cases to integration tests.
# What does this PR do?
the "rfc" directory has only a single document in it, and it's the
original RFC for creating Llama Stack.
Simplify the project directory structure by moving this into the "docs"
directory and renaming it to "original_rfc" to preserve the context of
the doc.
## Why did you do this?
A simplified top-level directory structure helps keep the project
simpler and prevents misleading new contributors into thinking we use it
(we really don't)
---------
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Co-authored-by: raghotham <raghotham@gmail.com>
# What does this PR do?
"install.sh" is something that a general user might not use e.g. it is
specific to using the "ollama" inference provider
clean up the top-level structure of the repo by moving it into the
"scripts" dir and updating the relevant references accordingly
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
"dev" dependencies were moved in pyproject.toml
typo with guidance around automatic doc generation
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
This PR refactors the VectorIO backend logic for `sqlite-vec` and
adds unit tests and fixtures to make it easy to test both `sqlite-vec`
and `milvus`.
Key changes:
- `sqlite-vec` migrated to `kvstore` registry
- added in-memory cache for sqlite-vec to be consistent with `milvus`
- default fixtures moved to `conftest.py`
- removed redundant tests from `sqlite-vec`
- made `test_vector_io_openai_vector_stores.py` more easily extensible
## Test Plan
Unit tests added testing inline providers.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
The project already had some config set up for https://pre-commit.ci/;
this commit adds additional explicit fields.
Closes #2711
**IMPORTANT:** A project maintainer must add `pre-commit.ci` to this
repo for this to work - this can be done via https://pre-commit.ci/
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
While investigating the `uv.lock` changes made in
https://github.com/meta-llama/llama-stack/pull/2695 I noticed several of
the pre-commit hook versions were out of date
This PR updates them and fixes some new `ruff` errors
---------
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Currently, when logging the run yaml, any path objects in the config are
represented as:
```
external_providers_dir: !!python/object/apply:pathlib.PosixPath
- '~'
- .llama
- providers.d
```
now, with a config.model_dump(mode="json"), it works properly
```
external_providers_dir: ~/.llama/providers.d
```
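i.e. roughly (a small self-contained sketch; the config class here is a stand-in for the run config):
```python
import pathlib
import yaml
from pydantic import BaseModel

class ExampleConfig(BaseModel):  # stand-in for the run config model
    external_providers_dir: pathlib.Path

cfg = ExampleConfig(external_providers_dir=pathlib.Path("~/.llama/providers.d"))

# mode="json" converts Path (and similar types) to plain strings first,
# so the logged YAML stays readable.
print(yaml.dump(cfg.model_dump(mode="json"), sort_keys=False))
```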
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
We are now automatically building the list of integration tests to run.
In that process, eval and files are now being tested.
This is pending https://github.com/meta-llama/llama-stack/pull/2628
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Terminate server process for real.
## Test Plan
```ENABLE_OPENAI=openai LLAMA_STACK_CONFIG=server:starter pytest -v tests/integration/agents/test_openai_responses.py --text-model "gpt-4o-mini" -vv -s -k 'test_list_response_input_items[' && lsof -ti:8321```
observe no process printed anymore
# What does this PR do?
`llama stack run starter` in a conda environment fails with '--config is
required for venv and conda environments' because it is passed as
--template and start_stack.sh doesn't process templates.
## Test Plan
`llama stack run starter`
# What does this PR do?
`python-jose` recommends using the `cryptography` backend in their
installation docs:
https://github.com/mpdavis/python-jose?tab=readme-ov-file#cryptographic-backends
This PR modifies the LLS dependencies to use this instead of the current
`native-python`
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
We are now testing the safety capability with the starter image. This
includes a few changes:
* Enable the safety integration test
* Relax the shield model requirements from llama-guard to make it work
with llama-guard3:8b coming from Ollama
* Expose a shield for each inference provider in the starter distro. The
shield will only be registered if the provider is enabled.
Closes: https://github.com/meta-llama/llama-stack/issues/2528
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
Updates some broken or outdated links pointing to the Android Demo App
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack/apis`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
- Fix constructor call missing files_api parameter
- Add kvstore field to MilvusVectorIOConfig
- Resolves #2626
# What does this PR do?
[https://github.com/meta-llama/llama-stack/issues/2626]
## Problem
The `MilvusVectorIOAdapter` fails to initialize due to two missing
configuration issues:
1. Missing `files_api` parameter in the constructor call
2. Missing `kvstore` field in the `MilvusVectorIOConfig` class
## Root Cause
1. The adapter constructor expects 3 parameters `(config, inference_api,
files_api)` but the `get_adapter_impl` function only passes 2 parameters
2. The `MilvusVectorIOConfig` class lacks the `kvstore` field that the
adapter's `initialize()` method expects for metadata persistence
## Solution
- Added `files_api = deps.get(Api.files, None)` to safely retrieve files
API from dependencies
- Pass the files_api parameter to MilvusVectorIOAdapter constructor
- Added `kvstore: KVStoreConfig | None = None` field to
MilvusVectorIOConfig
- Maintains backward compatibility since both files_api and kvstore can
be None
Closes #2626
## Test Plan
- [x] Tested with Milvus configuration - server starts successfully
```yaml
vector_io:
  - provider_id: milvus
    provider_type: remote::milvus
    config:
      uri: http://localhost:19530
      token: root:Milvus
      kvstore:
        type: sqlite
        namespace: null
        db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/remote-vllm}/milvus_store.db
```
- [x] Vector operations work as expected
```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types.shared_params.document import Document as RAGDocument
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger as AgentEventLogger
import os

endpoint = os.getenv("LLAMA_STACK_ENDPOINT")
model = os.getenv("INFERENCE_MODEL")

# Initialize the client
client = LlamaStackClient(base_url=endpoint)

vector_db_id = "my_documents"

response = client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id="milvus",
)

urls = ["getting_started/Red_Hat_AI_Inference_Server-3.0-Getting_started-en-US.pdf", "vllm_server_arguments/Red_Hat_AI_Inference_Server-3.0-vLLM_server_arguments-en-US.pdf"]
documents = [
    RAGDocument(
        document_id=f"num-{i}",
        content=f"https://docs.redhat.com/en/documentation/red_hat_ai_inference_server/3.0/pdf/{url}",
        mime_type="application/pdf",
        metadata={},
    )
    for i, url in enumerate(urls)
]

client.tool_runtime.rag_tool.insert(
    documents=documents,
    vector_db_id=vector_db_id,
    chunk_size_in_tokens=512,
)

rag_agent = Agent(
    client,
    model=model,
    # Define instructions for the agent (system prompt)
    instructions="You are a helpful assistant",
    enable_session_persistence=False,
    # Define tools available to the agent
    tools=[
        {
            "name": "builtin::rag/knowledge_search",
            "args": {
                "vector_db_ids": [vector_db_id],
            },
        }
    ],
)

session_id = rag_agent.create_session("test-session")

user_prompts = [
    "How to start the AI Inference Server container image? use the knowledge_search tool to get information.",
]

for prompt in user_prompts:
    print(f"User> {prompt}")
    response = rag_agent.create_turn(
        messages=[{"role": "user", "content": prompt}],
        session_id=session_id,
    )
    for log in AgentEventLogger().log(response):
        log.print()
```
server logs:
```
INFO 2025-07-04 22:18:30,385 __main__:577 server: Listening on ['::', '0.0.0.0']:5000
INFO: Started server process [769725]
INFO: Waiting for application startup.
INFO 2025-07-04 22:18:30,390 __main__:158 server: Starting up
INFO: Application startup complete.
INFO: Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit)
INFO 2025-07-04 22:18:52,193 llama_stack.distribution.routing_tables.common:200 core: Setting owner for vector_db 'my_documents' to
20:18:52.194 [START] /v1/vector-dbs
INFO: 192.168.1.249:64170 - "POST /v1/vector-dbs HTTP/1.1" 200 OK
20:18:52.216 [END] /v1/vector-dbs [StatusCode.OK] (21.89ms)
20:18:52.222 [START] /v1/tool-runtime/rag-tool/insert
INFO 2025-07-04 22:18:56,265 llama_stack.providers.utils.inference.embedding_mixin:102 uncategorized: Loading sentence transformer for
all-MiniLM-L6-v2...
WARNING 2025-07-04 22:18:59,214 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
INFO 2025-07-04 22:18:59,339 sentence_transformers.SentenceTransformer:219 uncategorized: Use pytorch device_name: cuda:0
INFO 2025-07-04 22:18:59,340 sentence_transformers.SentenceTransformer:227 uncategorized: Load pretrained SentenceTransformer: all-MiniLM-L6-v2
INFO: 192.168.1.249:64170 - "POST /v1/tool-runtime/rag-tool/insert HTTP/1.1" 200 OK
INFO: 192.168.1.249:64170 - "POST /v1/agents HTTP/1.1" 200 OK
INFO: 192.168.1.249:64170 - "GET /v1/tools?toolgroup_id=builtin%3A%3Arag%2Fknowledge_search HTTP/1.1" 200 OK
INFO: 192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session HTTP/1.1" 200 OK
20:19:01.834 [END] /v1/tool-runtime/rag-tool/insert [StatusCode.OK] (9612.06ms)
20:19:01.839 [START] /v1/agents
INFO: 192.168.1.249:64170 - "POST /v1/agents/b1f6f063-1691-4780-8d9e-facd81708b91/session/d2706302-bb54-421d-a890-5e25df9cb47f/turn HTTP/1.1" 200 OK
20:19:01.839 [END] /v1/agents [StatusCode.OK] (0.18ms)
20:19:01.844 [START] /v1/tools
INFO 2025-07-04 22:19:01,853 llama_stack.providers.remote.inference.vllm.vllm:330 uncategorized: Initializing vLLM client with
base_url=http://192.168.1.183:8080/v1
20:19:01.858 [END] /v1/tools [StatusCode.OK] (14.92ms)
20:19:01.868 [START] /v1/agents/{agent_id}/session
20:19:01.868 [END] /v1/agents/{agent_id}/session [StatusCode.OK] (0.37ms)
20:19:01.873 [START] /v1/agents/{agent_id}/session/{session_id}/turn
20:19:01.885 [START] inference
20:19:05.506 [END] inference [StatusCode.OK] (3621.19ms)
INFO 2025-07-04 22:19:05,537 llama_stack.providers.inline.agents.meta_reference.agent_instance:890 agents: executing tool call: knowledge_search
with args: {'query': 'How to start the AI Inference Server container image'}
20:19:05.538 [START] tool_execution
20:19:05.928 [END] tool_execution [StatusCode.OK] (390.08ms)
20:19:05.538 [INFO] executing tool call: knowledge_search with args: {'query': 'How to start the AI Inference Server container image'}
20:19:05.935 [START] inference
20:19:17.539 [END] inference [StatusCode.OK] (11603.76ms)
20:19:17.560 [END] /v1/agents/{agent_id}/session/{session_id}/turn [StatusCode.OK] (15686.62ms)
```
- [x] No regressions in functionality
- [x] Configuration properly accepts kvstore settings
---------
Co-authored-by: Peter Gustafsson <peter.gustafsson6@gmail.com>
Co-authored-by: raghotham <rsm@meta.com>
Co-authored-by: Francisco Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`
Part of https://github.com/meta-llama/llama-stack/issues/2647
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
# What does this PR do?
- fix env variables
- use gpu for vllm
- add eks/apply.py for aws
- add template to set hf secret
## Test Plan
bash apply.sh
Co-authored-by: Eric Huang <erichuang@fb.com>
# What does this PR do?
- Enabling Unit tests for Milvus to start to test OpenAI compatibility
and fixing a few bugs.
- Also fixed an inconsistency in the Milvus config between remote and
inline.
- Added pymilvus to extras for testing in CI
I'm going to refactor this later to include the other inline providers
so that we can catch issues sooner.
I have another PR where I've been testing to find other bugs in the
implementation (and required changes drafted here:
https://github.com/meta-llama/llama-stack/pull/2617).
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
currently when a template is used, we still use `--config`.
`server.py` has a dedicated `--template` flag and logic, use that
instead
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The `nvidia` distro was previously collapsed into the `starter` distro.
However, the `nvidia` distro was setup specifically to use NVIDIA NeMo
microservices as providers for all APIs and not just inference, which
means it was doing quite a bit more than what the `starter` distro
covers today.
We should work with our friends at NVIDIA to determine the best place to
maintain this distro long-term, but for now this restores the `nvidia`
distro and its docs back to where they were so that things continue to
work for their users.
## Test Plan
I ensure the `nvidia` distro could build, and run at least to the point
of complaining that I didn't provide the necessary API keys.
```
uv run llama stack build --template nvidia --image-type venv
uv run llama stack run llama_stack/templates/nvidia/run.yaml
```
I also made sure the docs website built and looks reasonable, with the
`nvidia` distro docs at the same URL it was previously (because it has
incoming links from official NVIDIA NeMo docs, among other places).
```
uv run --group docs sphinx-autobuild docs/source docs/build/html --write-all
```
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
Rather than pointing to a dir in `llama_stack/templates` (the repo
directory)
we should point to `$BUILD_DIR/IMAGE_NAME-run.yaml`
(`~/.llama/distributions/IMAGE_NAME/IMAGE_NAME-run.yaml`)
currently we are printing:
```
You can find the newly-built template here: /Users/charliedoern/projects/Documents/llama-stack/llama_stack/templates/starter/run.yaml
You can run the new Llama Stack distro via: llama stack run /Users/charliedoern/projects/Documents/llama-stack/llama_stack/templates/starter/run.yaml --image-type venv
```
but should be printing things like:
```
You can find the newly-built template here: /Users/charliedoern/.llama/distributions/starter/starter-run.yaml
You can run the new Llama Stack distro via: llama stack run /Users/charliedoern/.llama/distributions/starter/starter-run.yaml --image-type venv
```
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The CI was failing but the error was eaten by the pipe. Now we run the
task with pipefail.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- we are using `all-minilm:l6-v2` but the model we download from ollama
is `all-minilm:latest`
latest: https://ollama.com/library/all-minilm:latest 1b226e2802db
l6-v2: https://ollama.com/library/all-minilm:l6-v2 pin 1b226e2802db
- even currently they are exactly the same model but if
[all-minilm:l12-v2](https://ollama.com/library/all-minilm:l12-v2) is
updated, "latest" might not be the same for l6-v2.
- the only change in this PR is pin the model id in ollama
- also update detailed_tutorial with "starter" to replace deprecated
"ollama".
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
```
>INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
>llama stack build --run --template ollama --image-type venv
...
Build Successful!
You can find the newly-built template here: /home/wenzhou/zdtsw-forking/lls/llama-stack/llama_stack/templates/ollama/run.yaml
....
- metadata:
embedding_dimension: 384
model_id: all-MiniLM-L6-v2
model_type: !!python/object/apply:llama_stack.apis.models.models.ModelType
- embedding
provider_id: ollama
provider_model_id: all-minilm:l6-v2
...
```
test
```
>llama-stack-client inference chat-completion --message "Write me a 2-sentence poem about the moon"
INFO:httpx:HTTP Request: GET http://localhost:8321/v1/models "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions "HTTP/1.1 200 OK"
OpenAIChatCompletion(
id='chatcmpl-04f99071-3da2-44ba-a19f-03b5b7fc70b7',
choices=[
OpenAIChatCompletionChoice(
finish_reason='stop',
index=0,
message=OpenAIChatCompletionChoiceMessageOpenAIAssistantMessageParam(
role='assistant',
content="Here is a 2-sentence poem about the moon:\n\nSilver crescent in the midnight sky,\nLuna's gentle face, a beauty to the eye.",
name=None,
tool_calls=None,
refusal=None,
annotations=None,
audio=None,
function_call=None
),
logprobs=None
)
],
created=1751644429,
model='llama3.2:3b-instruct-fp16',
object='chat.completion',
service_tier=None,
system_fingerprint='fp_ollama',
usage={'completion_tokens': 33, 'prompt_tokens': 36, 'total_tokens': 69, 'completion_tokens_details': None, 'prompt_tokens_details': None}
)
```
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Bumps [next](https://github.com/vercel/next.js) from 15.3.2 to 15.3.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/vercel/next.js/releases">next's
releases</a>.</em></p>
<blockquote>
<h2>v15.3.3</h2>
<blockquote>
<p>[!NOTE]<br />
This release is backporting bug fixes. It does <strong>not</strong>
include all pending features/changes on canary.</p>
</blockquote>
<h3>Core Changes</h3>
<ul>
<li>Reinstate <code>vary</code> (<a
href="https://redirect.github.com/vercel/next.js/issues/79939">#79939</a>)</li>
<li>fix(next-swc): Fix interestingness detection for React Compiler (<a
href="https://redirect.github.com/vercel/next.js/issues/79558">#79558</a>)</li>
<li>fix(next-swc): Fix react compiler usefulness detector (<a
href="https://redirect.github.com/vercel/next.js/issues/79480">#79480</a>)</li>
<li>fix(dev-overlay): Better handle edge-case file paths in launchEditor
(<a
href="https://redirect.github.com/vercel/next.js/issues/79526">#79526</a>)</li>
<li>Client router should discard stale prefetch entries for static pages
(<a
href="https://redirect.github.com/vercel/next.js/issues/79362">#79362</a>)</li>
</ul>
<h3>Credits</h3>
<p>Huge thanks to <a
href="https://github.com/gaojude"><code>@gaojude</code></a>, <a
href="https://github.com/kdy1"><code>@kdy1</code></a>, <a
href="https://github.com/bgw"><code>@bgw</code></a>, and <a
href="https://github.com/unstubbable"><code>@unstubbable</code></a> for
helping!</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3ab8db7383"><code>3ab8db7</code></a>
v15.3.3</li>
<li><a
href="18c8113ebd"><code>18c8113</code></a>
[backport] Reinstate <code>vary</code> (<a
href="https://redirect.github.com/vercel/next.js/issues/79939">#79939</a>)</li>
<li><a
href="e18212f546"><code>e18212f</code></a>
re-enable vary header deploy test (<a
href="https://redirect.github.com/vercel/next.js/issues/79753">#79753</a>)</li>
<li><a
href="ec202eccf0"><code>ec202ec</code></a>
Revert "[next-server] skip setting vary header for basic
routes" (<a
href="https://redirect.github.com/vercel/next.js/issues/79426">#79426</a>)</li>
<li><a
href="e2f264fdce"><code>e2f264f</code></a>
fix(next-swc): Fix interestingness detection for React Compiler (15.3)
(<a
href="https://redirect.github.com/vercel/next.js/issues/79558">#79558</a>)</li>
<li><a
href="562fac78da"><code>562fac7</code></a>
fix(next-swc): Fix react compiler usefulness detector (15.3) (<a
href="https://redirect.github.com/vercel/next.js/issues/79480">#79480</a>)</li>
<li><a
href="06097fd7bb"><code>06097fd</code></a>
fix(dev-overlay): Better handle edge-case file paths in launchEditor (<a
href="https://redirect.github.com/vercel/next.js/issues/79526">#79526</a>)</li>
<li><a
href="bda731fa96"><code>bda731f</code></a>
Client router should discard stale prefetch entries for static pages (<a
href="https://redirect.github.com/vercel/next.js/issues/79362">#79362</a>)</li>
<li>See full diff in <a
href="https://github.com/vercel/next.js/compare/v15.3.2...v15.3.3">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/meta-llama/llama-stack/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
* Use a single env variable to setup OTEL endpoint
* Update telemetry provider doc
* Update general telemetry doc with the metrics we generate
* Left a script to setup telemetry for testing
Closes: https://github.com/meta-llama/llama-stack/issues/783
Note to reviewer: the `setup_telemetry.sh` script was useful for me; it
was nicely generated by AI. If we don't want it in the repo, I can
delete it, and I would understand.
Signed-off-by: Sébastien Han <seb@redhat.com>
The starter template build.yaml was missing the inline::faiss provider
in the vector_io section, while it was properly configured in run.yaml
and starter.py's vector_io_providers list.
Fixes: #2624
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
The agent code is currently importing MCP modules even when MCP isn’t
enabled. Do we consider this worth fixing, or are we treating MCP as a
first-class dependency? I believe we should treat it as such.
If everyone agrees, let’s go ahead and close this.
Note: The current setup breaks if someone builds a distro without
including MCP in tool_group but still serves the agent API.
Also, we should bump the MCP version to support streamable responses, as
SSE is being deprecated.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
* Removes a bunch of distros
* Removed distros were added into the "starter" distribution
* Doc for "starter" has been added
* Partially reverts https://github.com/meta-llama/llama-stack/pull/2482
since inference providers are disabled by default and can be turned on
manually via env variable.
* Disables safety in starter distro
Closes: https://github.com/meta-llama/llama-stack/issues/2502.
~Needs: https://github.com/meta-llama/llama-stack/pull/2482 for Ollama
to work properly in the CI.~
TODO:
- [ ] We can only update `install.sh` when we get a new release.
- [x] Update providers documentation
- [ ] Update notebooks to reference starter instead of ollama
Signed-off-by: Sébastien Han <seb@redhat.com>
This occurred when marshmallow 4.0.0 was installed (which removed
`__version_info__`). By pinning pymilvus to >=2.4.10, we ensure marshmallow
doesn't get installed.
Also set the dependency in InlineProviderSpec, as this is the one that
takes effect when using the "inline::milvus" provider.
Fixes https://github.com/meta-llama/llama-stack/issues/2588
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
This handles an edge case for `generate_chunk_id` when the concatenation
of `document_id` and `chunk_text` is not unique. Adding the window
location ensures uniqueness.
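For illustration, a minimal sketch of the idea (function and parameter names are assumptions, not the exact implementation in the repo):
```python
# Hash document_id + chunk_text, and fold in the chunk's window location so
# that identical text appearing at different offsets still yields unique IDs.
import hashlib
import uuid


def generate_chunk_id(document_id: str, chunk_text: str, chunk_window: str | None = None) -> str:
    hash_input = f"{document_id}:{chunk_text}"
    if chunk_window:
        # e.g. "0-512"; disambiguates repeated text within the same document
        hash_input += f":{chunk_window}"
    digest = hashlib.md5(hash_input.encode("utf-8"), usedforsecurity=False).hexdigest()
    return str(uuid.UUID(digest))
```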
## Test Plan
Added unit test
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
### Summary
This pull request implements support for the OpenAI Vector Store Files
API for the Milvus vector store provider in `llama_stack`. It enables
storing, loading, updating, and deleting file metadata and file contents
in Milvus collections, allowing OpenAI vector store files to be managed
directly within Milvus.
### Main Changes
- **Milvus Vector Store Files API Implementation**
- Implements all required methods for storing, loading, updating, and
deleting vector store file metadata and contents
(`_save_openai_vector_store_file`, `_load_openai_vector_store_file`,
`_load_openai_vector_store_file_contents`,
`_update_openai_vector_store_file`,
`_delete_openai_vector_store_file_from_storage`).
- Uses two Milvus collections: `openai_vector_store_files` (for
metadata) and `openai_vector_store_files_contents` (for chunked file
contents).
- Collections are created dynamically if they do not exist, with
appropriate schema definitions.
- **Collection Name Sanitization**
- Adds a `sanitize_collection_name` utility to ensure Milvus collection
names only contain valid characters (letters, numbers, underscores); a
minimal sketch appears after this list.
- **Testing**
- Updates test skip logic to include `"inline::milvus"` for cases where
the OpenAI Vector Store Files API is not supported, improving
integration test accuracy.
- **Other Improvements**
- Passes `kvstore` to `MilvusIndex` for consistency.
- Removes obsolete NotImplementedErrors and legacy code for file
storage.
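As referenced above, a minimal sketch of the sanitization utility (the regex choice here is an assumption; the actual implementation may differ):
```python
import re


def sanitize_collection_name(name: str) -> str:
    # Keep letters, digits, and underscores; replace anything else so the
    # result is a valid Milvus collection name.
    return re.sub(r"[^a-zA-Z0-9_]", "_", name)


print(sanitize_collection_name("vs_abc-123.files"))  # -> "vs_abc_123_files"
```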
## Test Plan
CI and tested via a test script
## Notes
- `VectorDB` currently uses the `name` as the `identifier` in
`openai_create_vector_store`. We need to add `name` as a field to
`VectorDB` and generate the `identifier` upon creation. OpenAI is not
idempotent with respect to the `name` field that they pass (i.e., you
can pass the same name multiple times and OpenAI will generate a new
identifier). I'll add a follow up PR for this.
- The `Files` api needs to use `files-` as a prefix in the identifier. I
have updated the Vector Store to use the OpenAI prefix `vs_*`.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
* Use #2580 functionality to auto-start the server with the tests
* Reduce timeout to 30sec
* Print server logs on errors
* Pytest logs are collected to a file pytest.log
Signed-off-by: Sébastien Han <seb@redhat.com>
Resolves access control error visibility issues where 500 errors were
returned instead of proper 403 responses with actionable error messages.
• Enhance AccessDeniedError with detailed context and improve exception
handling
• Enhanced AccessDeniedError class to include user, action, and resource
context
- Added constructor parameters for action, resource, and user
- Generate detailed error messages showing user principal, attributes,
and attempted resource
- Backward compatible with existing usage (falls back to generic
message)
• Updated exception handling in server.py
- Import AccessDeniedError from access_control module
- Return proper 403 status codes with detailed error messages
- Separate handling for PermissionError (generic) vs AccessDeniedError
(detailed)
• Enhanced error context at raise sites
- Updated routing_tables/common.py to pass action, resource, and user
context
- Updated agents persistence to include context in access denied errors
- Provides better debugging information for access control issues
• Added comprehensive unit tests
- Created tests/unit/server/test_server.py with 13 test cases
- Covers AccessDeniedError with and without context
- Tests all exception types (ValidationError, BadRequestError,
AuthenticationRequiredError, etc.)
- Validates proper HTTP status codes and error message formats
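A hypothetical sketch of the enhanced error (attribute and parameter names are assumptions based on the description above):
```python
class AccessDeniedError(RuntimeError):
    """Raised when a user is not allowed to perform an action on a resource."""

    def __init__(self, action: str | None = None, resource: str | None = None, user=None) -> None:
        if action and resource and user:
            # Detailed message with the user principal and the attempted resource
            message = (
                f"User '{getattr(user, 'principal', user)}' cannot perform action '{action}' "
                f"on resource '{resource}'"
            )
        else:
            # Backward-compatible generic message when no context is supplied
            message = "Insufficient permissions"
        super().__init__(message)
```
In server.py this would then map to a 403 response instead of falling through to a generic 500 handler.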
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
```
server:
port: 8321
access_policy:
- permit:
principal: admin
actions: [create, read, delete]
when: user with admin in groups
- permit:
actions: [read]
when: user with system:authenticated in roles
```
then:
```
curl --request POST --url http://localhost:8321/v1/vector-dbs \
--header "Authorization: Bearer your-bearer" \
--data '{
"vector_db_id": "my_demo_vector_db",
"embedding_model": "ibm-granite/granite-embedding-125m-english",
"embedding_dimension": 768,
"provider_id": "milvus"
}'
```
Depending on whether the user is in the admin group or not, you should get the
`AccessDeniedError`. Before this PR, this led to an error 500 with a
`Traceback` displayed in the logs.
After the PR, logs display a simpler error (unless DEBUG logging is set)
and a 403 Forbidden error is returned on the HTTP side.
---------
Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
# What does this PR do?
https://github.com/meta-llama/llama-stack/pull/2490 broke postgres_demo,
as the config expected a str but the value was converted to int.
This PR:
1. Updates the type of port in sqlstore to be int
2. template generation uses `dict` instead of `StackRunConfig` so as to
avoid failing pydantic typechecks.
3. Adds `replace_env_vars` to StackRunConfig instantiation in
`configure.py` (not sure why this wasn't needed before).
## Test Plan
`llama stack build --template postgres_demo --image-type conda --run`
# What does this PR do?
- Adding a notebook equivalent of the
[getting_started/index.md#Quickstart
guide](https://github.com/meta-llama/llama-stack/blob/main/docs/source/getting_started/index.md).
## To discuss
**Note:** works locally, but I am encountering issues when attempting to
run through the notebook on Google Colab. Specifically, on the last step
to run the demo, the `knowledge_search` tool doesn't seem to be called
i.e.,:
```
rag_tool> Ingesting document: https://www.paulgraham.com/greatwork.html
prompt> How do you do great work?
inference> I don't have personal experiences or emotions, but I was trained on a large corpus of text data and use various techniques such as natural language processing (NLP) and machine learning algorithms to generate human-like responses.
```
I would expect to get something like:
```
rag_tool> Ingesting document: https://www.paulgraham.com/greatwork.html
prompt> How do you do great work?
inference> [knowledge_search(query="What is the key to doing great work")]
tool_execution> Tool:knowledge_search Args:{'query': 'What is the key to doing great work'}
tool_execution> Tool:knowledge_search Response:[TextContentItem(text='knowledge_search tool found 5 chunks:
....
....
```
- Switch from BRAVE_SEARCH_API_KEY to TAVILY_SEARCH_API_KEY
- Add provider_data to LlamaStackClient for API key passing
- Use builtin::websearch toolgroup instead of manual tool config
- Fix message types to use UserMessage instead of plain dict
- Add streaming support with proper type casting
- Remove async from EventLogger loop (bug fix)
Fixes websearch functionality in agents tutorial by properly configuring
Tavily search provider integration.
# What does this PR do?
Fixes the Agents101 tutorial notebook to work with the current Llama
Stack websearch implementation. The tutorial was using outdated Brave
Search configuration that no longer works with the current server setup.
**Key Changes:**
- **Switch API provider**: Change from `BRAVE_SEARCH_API_KEY` to
`TAVILY_SEARCH_API_KEY` to match server configuration
- **Fix client setup**: Add `provider_data` to `LlamaStackClient` to
properly pass API keys to server
- **Modernize tool usage**: Replace manual tool configuration with
`tools=["builtin::websearch"]`
- **Fix type safety**: Use `UserMessage` type instead of plain
dictionaries for messages
- **Fix streaming**: Add proper streaming support with `stream=True` and
type casting
- **Fix EventLogger**: Remove incorrect `async for` usage (should be
`for`)
**Why needed:** Users following the tutorial were getting 401
Unauthorized errors because the notebook wasn't properly configured for
the Tavily search provider that the server actually uses.
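A hypothetical condensed version of the updated client setup (the key and model are placeholders; the full agent wiring lives in the notebook):
```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types import UserMessage

# provider_data forwards the Tavily key to the server-side websearch provider
client = LlamaStackClient(
    base_url="http://localhost:8321",
    provider_data={"tavily_search_api_key": "your_tavily_api_key"},
)

# Typed message instead of a plain dict
message = UserMessage(role="user", content="What are the top places to visit in Switzerland?")
```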
## Test Plan
**Prerequisites:**
1. Start Llama Stack server with Ollama template and
`TAVILY_SEARCH_API_KEY` environment variable
2. Set `TAVILY_SEARCH_API_KEY` in your `.env` file
**Testing Steps:**
1. **Clone and setup:**
```bash
git checkout fix-2558-update-agents101
cd docs/zero_to_hero_guide/
```
2. **Start server with API key:**
```bash
export TAVILY_SEARCH_API_KEY="your_tavily_api_key"
podman run -it --network=host -v ~/.llama:/root/.llama:Z \
--env INFERENCE_MODEL=$INFERENCE_MODEL \
--env OLLAMA_URL=http://localhost:11434 \
--env TAVILY_SEARCH_API_KEY=$TAVILY_SEARCH_API_KEY \
llamastack/distribution-ollama --port $LLAMA_STACK_PORT
```
3. **Run the notebook:**
- Open `07_Agents101.ipynb` in Jupyter
- Execute all cells in order
- Cell 5 should run without errors and show successful web search
results
**Expected Results:**
- ✅ No 401 Unauthorized errors
- ✅ Agent successfully calls `brave_search.call()` with web results
- ✅ Switzerland travel recommendations appear in output
- ✅ Follow-up questions work correctly
**Before this fix:** Users got `401 Unauthorized` errors and tutorial
failed
**After this fix:** Tutorial works end-to-end with proper web search
functionality
**Tested with:**
- Tavily API key (free tier)
- Ollama distribution template
- Llama-3.2-3B-Instruct model
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- add model_type in example
- change "Memory" to "VectorIO" as column name
- update index.md and README.md
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
run pre-commit to catch changes.
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Set parameter `usedforsecurity=False` when calling hashlib.md5 in order
to fix rag_tool.insert on FIPS clusters
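For reference, the pattern looks like this (a minimal sketch, not the exact call site):
```python
import hashlib

# On FIPS-enabled clusters, plain hashlib.md5(...) is blocked because MD5 is not
# an approved security algorithm; marking the call as non-security use keeps it
# available for non-cryptographic purposes such as generating stable identifiers.
digest = hashlib.md5(b"chunk contents", usedforsecurity=False).hexdigest()
print(digest)
```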
<!-- If resolving an issue, uncomment and update the line below -->
Closes #2571
---------
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
## Summary
Add support for `server:<config>` format in `--stack-config` option to
enable seamless one-step integration testing. This eliminates the need
to manually start servers in separate terminals before running tests.
## Key Features
- **Auto-start server**: Automatically launches `llama stack run
<config>` if target port is available
- **Smart reuse**: Reuses existing server if port is already occupied
- **Health check polling**: Waits up to 2 minutes for server readiness
via `/v1/health` endpoint
- **Custom port support**: Use `server:<config>:<port>` for non-default
ports
- **Clean output**: Server runs quietly in background without cluttering
test output
- **Backward compatibility**: All existing `--stack-config` formats
continue to work
## Usage Examples
```bash
# Auto-start server with default port 8321
pytest tests/integration/inference/ --stack-config=server:fireworks
# Use custom port
pytest tests/integration/safety/ --stack-config=server:together:8322
# Run multiple test suites seamlessly
pytest tests/integration/inference/ tests/integration/agents/ --stack-config=server:starter
```
## Implementation Details
- Enhanced `llama_stack_client` fixture with server management
- Updated documentation with cleaner organization and comprehensive
examples
- Added utility functions for port checking, server startup, and health
verification
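A rough sketch of what those utility functions can look like (names and details are assumptions, not the exact fixture code):
```python
import socket
import subprocess
import time

import requests


def is_port_available(port: int, host: str = "localhost") -> bool:
    # connect_ex returns 0 when something is already listening on the port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.connect_ex((host, port)) != 0


def start_and_wait_for_server(config: str, port: int = 8321, timeout: int = 120) -> None:
    if is_port_available(port):
        # Launch the server quietly in the background
        subprocess.Popen(
            ["llama", "stack", "run", config, "--port", str(port)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(f"http://localhost:{port}/v1/health", timeout=2).status_code == 200:
                return
        except requests.ConnectionError:
            pass
        time.sleep(1)
    raise RuntimeError(f"Server on port {port} did not become healthy within {timeout}s")
```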
## Test Plan
- Verified server auto-start when port 8321 is available
- Verified server reuse when port 8321 is occupied
- Tested health check polling via `/v1/health` endpoint
- Confirmed custom port configuration works correctly
- Verified backward compatibility with existing config formats
## Before/After Comparison
**Before (2 steps):**
```bash
# Terminal 1: Start server manually
llama stack run fireworks --port 8321
# Terminal 2: Wait for startup, then run tests
pytest tests/integration/inference/ --stack-config=http://localhost:8321
```
**After (1 step):**
```bash
# Single command handles everything
pytest tests/integration/inference/ --stack-config=server:fireworks
```
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- update README.md format and python version
- update package name: CustomTool was renamed to ClientTool in
https://github.com/meta-llama/llama-stack-client-python/pull/73
<!-- If resolving an issue, uncomment and update the line below -->
Closes #2556
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
# What does this PR do?
Clarifies that non-Llama models can be trained via the Post Training API
## Test Plan
Build docs locally
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
We were not using conditionals correctly; conditionals can only be used
when the env variable is set, so `${env.ENVIRONMENT:+}` would return
None if ENVIRONMENT is not set.
If you want to create a conditional value, you need to use
`${env.ENVIRONMENT:=}`: this will pick the value of ENVIRONMENT if set,
otherwise it will return None.
Closes: https://github.com/meta-llama/llama-stack/issues/2564
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
The external providers guide can now be accessed directly from the
sidebar
## Test Plan
Build locally to test the changes
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
fixes the api_key type when read from env
## Test Plan
Run the nvidia template w/o api_key in run.yaml and perform inference.
Before the change, the inference will fail with:
```
File ".../llama-stack/llama_stack/providers/remote/inference/nvidia/nvidia.py", line 118, in _get_client_for_base_url
api_key=(self._config.api_key.get_secret_value() if self._config.api_key else "NO KEY"),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'get_secret_value'
```
## What does this PR do?
Ollama does not support remote images. Only local file paths OR base64
inputs are supported. This PR ensures that the Stack downloads remote
images and passes the base64 down to the inference engine.
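A minimal sketch of the idea (helper name assumed; the real code lives in the Stack's message preparation path):
```python
import base64

import httpx


async def localize_image(url: str) -> str:
    # Download the remote image and return it as base64 so providers like
    # Ollama, which only accept local files or base64, can consume it.
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        response.raise_for_status()
    return base64.b64encode(response.content).decode("utf-8")
```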
## Test Plan
Added a test cases for Responses and ran it for both `fireworks` and
`ollama` providers.
# What does this PR do?
Simple approach to get some provider pages in the docs.
Add or update description fields in the provider configuration class
using Pydantic’s Field, ensuring these descriptions are clear and
complete, as they will be used to auto-generate provider documentation
via ./scripts/distro_codegen.py instead of editing the docs manually.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Resolves:
```
mypy.....................................................................Failed
- hook id: mypy
- exit code: 1
llama_stack/providers/utils/responses/responses_store.py:119: error: Missing positional argument "policy" in call to "fetch_one" of "AuthorizedSqlStore" [call-arg]
llama_stack/providers/utils/responses/responses_store.py:122: error: "AuthorizedSqlStore" has no attribute "delete" [attr-defined]
Found 2 errors in 1 file (checked 403 source files)
```
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This PR creates a webmethod for deleting OpenAI responses, adds an
implementation for it, and adds an integration test for the OpenAI
delete response method.
[//]: # (If resolving an issue, uncomment and update the line below)
# (Closes#2077)
## Test Plan
Ran the standard tests and the pre-commit hooks and the unit tests.
# (## Documentation)
For this PR I made the routes and implementation based on the current
get and create methods. The unit tests were not able to handle this test
due to the mock interface in use, which did not allow for effective CRUD
to be tested. I instead created an integration test to match the
existing ones in test_openai_responses.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- follow-up from https://github.com/meta-llama/llama-stack/issues/2110:
the documentation needs updating, since "container" is not a valid value
for `--image-type`
- chore: updates from standard output
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.3.0 to 6.3.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v6.3.1 🌈 Do not warn when version not in manifest-file</h2>
<h2>Changes</h2>
<p>This is a hotfix to change the warning messages that a version could
not be found in the local manifest-file to info level.</p>
<p>A <code>setup-uv</code> release contains a version-manifest.json file
with infos in all available <code>uv</code> releases. When a new
<code>uv</code> version is released this is not contained in this file
until the file gets updated and a new <code>setup-uv</code> release is
made.
We will overhaul this process in the future but for now the spamming of
warnings is removed.</p>
<h2>🐛 Bug fixes</h2>
<ul>
<li>Do not warn when version not in manifest-file <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/462">#462</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>chore: update known versions for 0.7.14 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/459">#459</a>)</li>
<li>Revert "Set expected cache dir drive to C: on windows (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/451">#451</a>)"
<a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/460">#460</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bd01e18f51"><code>bd01e18</code></a>
Do not warn when version not in manifest-file (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/462">#462</a>)</li>
<li><a
href="c6a5ebaafe"><code>c6a5eba</code></a>
chore: update known versions for 0.7.14 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/459">#459</a>)</li>
<li><a
href="790df8f465"><code>790df8f</code></a>
Revert "Set expected cache dir drive to C: on windows (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/451">#451</a>)"
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/460">#460</a>)</li>
<li>See full diff in <a
href="445689ea25...bd01e18f51">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Given the shift to python3.12, we need to explicitly depend on
`setuptools` for the pkg_resources import
## Test Plan
Run
```
cd local/llama-stack
UV_PROJECT_ENVIRONMENT=/tmp/docs uv sync --frozen --group docs
cd /tmp/docs
uv run python -m sphinx -T -b html -d _build/doctrees -D language=en \
~/local/llama-stack/docs/source/ \
/tmp/docs/html
```
# What does this PR do?
Closes https://github.com/meta-llama/llama-stack/issues/2461
## Test Plan
Tested with the `ollama` distribution template and updated the vector_io
provider to:
```yaml
vector_io:
  - provider_id: milvus
    provider_type: inline::milvus
    config:
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/distributions/ollama}/milvus_store.db
      kvstore:
        type: sqlite
        db_name: milvus_registry.db
```
Ran the stack
```bash
llama stack run ./llama_stack/templates/ollama/run.yaml --image-type venv --env OLLAMA_URL="http://0.0.0.0:11434"
```
Ran the tests:
```
pytest -sv --stack-config=http://localhost:8321 tests/integration/vector_io/test_openai_vector_stores.py --embedding-model all-MiniLM-L6-v2
```
Output passed.
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Currently only the last saved model is reported as a checkpoint and
associated with the job UUID. Since the HF trainer handles checkpoint
collection during training, we need to add all of the `checkpoint-*`
folders as Checkpoint objects. Adjust the save strategy to be per-epoch
to make this easier and to use less storage.
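A rough sketch of the collection step (paths and names are assumptions):
```python
from pathlib import Path

# Hypothetical job output directory written by the HF trainer
output_dir = Path("~/.llama/checkpoints/job-1234").expanduser()

# With a per-epoch save strategy, each epoch leaves a checkpoint-* folder;
# every one of them should be surfaced as a Checkpoint object for the job UUID.
checkpoint_dirs = sorted(p for p in output_dir.glob("checkpoint-*") if p.is_dir())
print([p.name for p in checkpoint_dirs])
```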
Signed-off-by: Charlie Doern <cdoern@redhat.com>
The error message was misleading as it appeared to be an Ollama
connectivity issue, but actually occurred during faiss vector database
initialization.
## 🔍 Root Cause Analysis
The issue was in the faiss vector database serialization logic in
`llama_stack/providers/inline/vector_io/faiss/faiss.py`:
1. **Saving**: `faiss.serialize_index()` returns binary data (uint8
numpy array)
2. **Bug**: Code incorrectly used `np.savetxt()` which converts binary
to text with scientific notation (e.g., `7.300000000000000000e+01`)
3. **Loading**: `np.loadtxt(buffer, dtype=np.uint8)` failed to parse
scientific notation back to uint8
4. **Result**: Server crashed during initialization before reaching
Ollama connectivity check
## ✅ Solution
Replaced text-based serialization with proper binary serialization.
**After (fixed):**
```python
# Saving - proper binary format
np.save(buffer, np_index, allow_pickle=False)

# Loading - proper binary format
self.index = faiss.deserialize_index(np.load(buffer, allow_pickle=False))
```
## 🧪 Testing
- ✅ Binary serialization/deserialization works correctly
- ✅ Backward compatible with existing functionality
- ✅ No security concerns (allow_pickle=False maintained)
- ✅ Resolves the specific ValueError mentioned in the issue
## 📊 Impact
This fix resolves:
- ValueError during server startup with Ollama templates
## 🔗 Related Issues
- Closes #2519
- Affects all users of Ollama template and faiss vector_io configurations
## 📝 Files Changed
- `llama_stack/providers/inline/vector_io/faiss/faiss.py` - Fixed serialization methods in `initialize()` and `_save_index()`
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
- llama_stack/exceptions.py: Add UnsupportedModelError class
- remote inference ollama.py and utils/inference/model_registry.py:
Changed ValueError in favor of UnsupportedModelError
- utils/inference/litellm_openai_mixin.py: remove `register_model`
function implementation from `LiteLLMOpenAIMixin` class. Now uses the
parent class `ModelRegistryHelper`'s function implementation
Closes #2517
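A minimal sketch of the idea (the exact class in `llama_stack/exceptions.py` may differ):
```python
class UnsupportedModelError(ValueError):
    """Raised when a requested model is not supported by the provider."""

    def __init__(self, model_name: str, supported_models: list[str]) -> None:
        super().__init__(
            f"'{model_name}' is not supported. Supported models are: {', '.join(supported_models)}"
        )


# Hypothetical use in a provider's register_model path:
# raise UnsupportedModelError(model_id, sorted(provider_supported_models))
```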
## Test Plan
1. Create a new `test_run_openai.yaml` and paste the following config in
it:
```yaml
version: '2'
image_name: test-image
apis:
  - inference
providers:
  inference:
    - provider_id: openai
      provider_type: remote::openai
      config:
        max_tokens: 8192
models:
  - metadata: {}
    model_id: "non-existent-model"
    provider_id: openai
    model_type: llm
server:
  port: 8321
```
And run the server with:
```bash
uv run llama stack run test_run_openai.yaml
```
You should now get a `llama_stack.exceptions.UnsupportedModelError` with
the supported list of models in the error message.
---
Tested for the following remote inference providers, and they all raise
the `UnsupportedModelError`:
- Anthropic
- Cerebras
- Fireworks
- Gemini
- Groq
- Ollama
- OpenAI
- SambaNova
- Together
- Watsonx
---------
Co-authored-by: Rohan Awhad <rawhad@redhat.com>
# What does this PR do?
Fixes CVE-2025-4565 and the following warning:
```
warning: `aiohttp==3.11.13` is yanked (reason: "Regression: https://github.com/aio-libs/aiohttp/issues/10617")
```
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
Fixes an error when inferring dataset provider_id with metadata
Closes #[2506](https://github.com/meta-llama/llama-stack/issues/2506)
Signed-off-by: Juanma Barea <juanmabareamartinez@gmail.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- conditionally created folder /.llama/providers.d if
external_providers_dir is set
- do not create /.cache folder, not in use anywhere
- combine chmod and copy to one command
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
updated test:
```
export CONTAINER_BINARY=podman
LLAMA_STACK_DIR=. uv run llama stack build --template remote-vllm --image-type container --image-name <name>
```
log:
```
Containerfile created successfully in /tmp/tmp.rPMunE39Aw/Containerfile
FROM python:3.11-slim
WORKDIR /app
RUN apt-get update && apt-get install -y iputils-ping net-tools iproute2 dnsutils telnet curl wget telnet git procps psmisc lsof traceroute bubblewrap gcc && rm -rf /var/lib/apt/lists/*
ENV UV_SYSTEM_PYTHON=1
RUN pip install uv
RUN uv pip install --no-cache sentencepiece pillow pypdf transformers pythainlp faiss-cpu opentelemetry-sdk requests datasets chardet scipy nltk numpy matplotlib psycopg2-binary aiosqlite langdetect autoevals tree_sitter tqdm pandas chromadb-client opentelemetry-exporter-otlp-proto-http redis scikit-learn openai pymongo emoji sqlalchemy[asyncio] mcp aiosqlite fastapi fire httpx uvicorn opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
RUN uv pip install --no-cache sentence-transformers --no-deps
RUN uv pip install --no-cache torch torchvision --index-url https://download.pytorch.org/whl/cpu
# Allows running as non-root user
RUN mkdir -p /.llama/providers.d /.cache
RUN uv pip install --no-cache llama-stack
RUN pip uninstall -y uv
ENTRYPOINT ["python", "-m", "llama_stack.distribution.server.server", "--template", "remote-vllm"]
RUN chmod -R g+rw /app /.llama /.cache
PWD: /tmp/llama-stack
Containerfile: /tmp/tmp.rPMunE39Aw/Containerfile
+ podman build --progress=plain --security-opt label=disable --platform linux/amd64 -t distribution-remote-vllm:0.2.12 -f /tmp/tmp.rPMunE39Aw/Containerfile /tmp/llama-stack
....
Success!
Build Successful!
You can find the newly-built template here: /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml
You can run the new Llama Stack distro via: llama stack run /tmp/llama-stack/llama_stack/templates/remote-vllm/run.yaml --image-type container
```
```
podman tag localhost/distribution-remote-vllm:dev quay.io/wenzhou/distribution-remote-vllm:2492_2
podman push quay.io/wenzhou/distribution-remote-vllm:2492_2
docker run --rm -p 8321:8321 -e INFERENCE_MODEL="meta-llama/Llama-2-7b-chat-hf" -e VLLM_URL="http://localhost:8000/v1" quay.io/wenzhou/distribution-remote-vllm:2492_2 --port 8321
INFO 2025-06-26 13:47:31,813 __main__:436 server: Using template remote-vllm config file:
/app/llama-stack-source/llama_stack/templates/remote-vllm/run.yaml
INFO 2025-06-26 13:47:31,818 __main__:438 server: Run configuration:
INFO 2025-06-26 13:47:31,826 __main__:440 server: apis:
- agents
- datasetio
- eval
- inference
- safety
- scoring
- telemetry
- tool_runtime
- vector_io
benchmarks: []
container_image: null
....
```
-----
previous test:
local run `llama stack build --template remote-vllm --image-type container`
image stored in `quay.io/wenzhou/distribution-remote-vllm:2492`
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Update changelog.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
Some templates were still using the old environment variable substitution
syntax instead of the new one and were not getting substituted properly.
Also, some places didn't handle the new None vs old empty string ("")
values that come from the conditional environment variable substitution.
This gets the starter and remote-vllm distributions starting again, and
I tested various permutations of the starter as chroma and pgvector
needed some adjustments to their config classes to handle the new
possible `None` values. And, I had to tweak our `Provider` class to also
handle `None` values, for cases where we disable providers in the
starter config via environment variables.
This may not have caught everything that was missed, but I did grep
around quite a bit to try and find anything lingering.
## Test Plan
The following permutations now all run (or attempt to run to the point
of complaining that they can't connect to chroma, vllm, etc) when before
they failed immediately on startup because of bad environment variable
substitutions:
```
uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_SQLITE_VEC=true uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_PGVECTOR=true uv run llama stack run llama_stack/templates/starter/run.yaml
ENABLE_CHROMADB=true uv run llama stack run llama_stack/templates/starter/run.yaml
uv run llama stack run llama_stack/templates/remote-vllm/run.yaml
```
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: raghotham <rsm@meta.com>
# What does this PR do?
I get errors when trying to query spans. It appears to be a result of
traces being inserted where there is no root_span_id which causes a
pydantic validation error on trying to load the data for a query
response (and in any case having no span referenced undermines the
purpose of the trace). The root cause as far as I can see is an invalid
test in the code that inserts the trace, where it is testing for the
string "true" against an object set to the python value True.
<!-- If resolving an issue, uncomment and update the line below -->
Closes #2493
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
With this change I can query spans.
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
The goal is to promote the minimal set of dependencies the project needs
to run; this includes:
* dependencies needed to work with the CLI
* dependencies needed for the server to run with no providers
This also:
* Relocate redundant dependencies out of the core project and into the
individual providers that actually require them.
* Include all necessary server dependencies so the project can run
standalone, even without any providers.
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
Build and run a distro server.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This commit significantly improves the environment variable substitution
functionality in Llama Stack configuration files:
* The version field in configuration files has been changed from string
to integer type for better type consistency across build and run
configurations.
* The environment variable substitution system for ${env.FOO:} was fixed
and properly returns an error
* The environment variable substitution system for ${env.FOO+} returns
None instead of an empty string, which better matches type annotations in
config fields
* The system includes automatic type conversion for boolean, integer,
and float values.
* The error messages have been enhanced to provide clearer guidance when
environment variables are missing, including suggestions for using
default values or conditional syntax.
* Comprehensive documentation has been added to the configuration guide
explaining all supported syntax patterns, best practices, and runtime
override capabilities.
* Multiple provider configurations have been updated to use the new
conditional syntax for optional API keys, making the system more
flexible for different deployment scenarios. The telemetry configuration
has been improved to properly handle optional endpoints with appropriate
validation, ensuring that required endpoints are specified when their
corresponding sinks are enabled.
* There were many instances of ${env.NVIDIA_API_KEY:} that should have
caused the code to fail. However, due to a bug, the distro server was
still being started, and early validation wasn’t triggered. As a result,
failures were likely being handled downstream by the providers. I’ve
maintained similar behavior by using ${env.NVIDIA_API_KEY:+}, though I
believe this is incorrect for many configurations. I’ll leave it to each
provider to correct it as needed.
* Environment variable substitution now uses the same syntax as Bash
parameter expansion.
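A rough illustration of the Bash-parallel semantics described above (a sketch, not the actual substitution code):
```python
import os


def expand_default(name: str, default: str) -> str | None:
    # ${env.FOO:=default}: use FOO when set and non-empty, otherwise the default
    value = os.environ.get(name)
    return value if value else (default or None)


def expand_conditional(name: str, alternative: str) -> str | None:
    # ${env.FOO:+alternative}: use the alternative only when FOO is set and non-empty
    return alternative if os.environ.get(name) else None
```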
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
We still had a few enums declared to behave like both string and enum.
Let's use StrEnum for those.
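For example (an illustrative enum, not necessarily one touched in this PR):
```python
from enum import StrEnum


# A StrEnum member already behaves as a plain string, which is what the
# previous (str, Enum) double inheritance was emulating.
class Api(StrEnum):
    inference = "inference"
    safety = "safety"


assert Api.inference == "inference"
```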
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
* Given that our API packages use `import *` in `__init__.py`, we don't
need to do `from llama_stack.apis.models.models` but simply
`from llama_stack.apis.models`. The decision to use `import *` is
debatable and should probably be revisited at some point.
* Remove unneeded Ruff F401 rule
* Consolidate the Ruff F403 rule in the pyproject
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
These are a couple of fixes to get an example LangChain app working with
our OpenAI Responses API implementation.
The Responses API spec requires an annotations array in
`output[*].content[*].annotations` and we were not providing one. So,
this adds that as an empty list, even though we don't do anything to
populate it yet. This prevents an error from client libraries like
Langchain that expect this field to always exist, even if an empty list.
The other fix is `web_search_preview` is a valid name for the web search
tool in the Responses API, but we only responded to `web_search` or
`web_search_preview_2025_03_11`.
## Test Plan
The existing Responses unit tests were expanded to test these cases,
via:
```
pytest -sv tests/unit/providers/agents/meta_reference/test_openai_responses.py
```
The existing test_openai_responses.py integration tests still pass with
this change, tested as below with Fireworks:
```
uv run llama stack run llama_stack/templates/starter/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv tests/integration/agents/test_openai_responses.py \
--text-model accounts/fireworks/models/llama4-scout-instruct-basic
```
Lastly, this example LangChain app now works with Llama stack (tested
with Ollama in the starter template in this case). This LangChain code
is using the example snippets for using Responses API at
https://python.langchain.com/docs/integrations/chat/openai/#responses-api
```python
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
base_url="http://localhost:8321/v1/openai/v1",
api_key="fake",
model="ollama/meta-llama/Llama-3.2-3B-Instruct",
)
tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])
response = llm_with_tools.invoke("What was a positive news story from today?")
print(response.content)
```
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
Closes #2522
## Test Plan
added integration test
LLAMA_STACK_CONFIG=http://localhost:8321 pytest -v
tests/integration/agents/test_openai_responses.py --text-model
"accounts/fireworks/models/llama-v3p3-70b-instruct" -vv -k
'function_call'
# What does this PR do?
Adding `ChunkMetadata` so we can properly delete embeddings later.
More specifically, this PR refactors and extends the chunk metadata
handling in the vector database and introduces a distinction between
metadata used for model context and backend-only metadata required for
chunk management, storage, and retrieval. It also improves chunk ID
generation and propagation throughout the stack, enhances test coverage,
and adds new utility modules.
```python
class ChunkMetadata(BaseModel):
    """
    `ChunkMetadata` is backend metadata for a `Chunk` that is used to store additional information about the chunk that
    will NOT be inserted into the context during inference, but is required for backend functionality.
    Use `metadata` in `Chunk` for metadata that will be used during inference.
    """

    document_id: str | None = None
    chunk_id: str | None = None
    source: str | None = None
    created_timestamp: int | None = None
    updated_timestamp: int | None = None
    chunk_window: str | None = None
    chunk_tokenizer: str | None = None
    chunk_embedding_model: str | None = None
    chunk_embedding_dimension: int | None = None
    content_token_count: int | None = None
    metadata_token_count: int | None = None
```
Eventually we can migrate the document_id out of the `metadata` field.
I've introduced the changes so that `ChunkMetadata` is backwards
compatible with `metadata`.
<!-- If resolving an issue, uncomment and update the line below -->
Closes https://github.com/meta-llama/llama-stack/issues/2501
## Test Plan
Added unit tests
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Our starter distro required Ollama to be running (and a large list of
models available in that Ollama) to successfully start. This adjusts
things so that Ollama does not have to be running to use the starter
template / distro.
To accomplish this, a few changes were needed:
* The Ollama provider is now configurable whether it raises an Exception
or just logs a warning when it cannot reach the Ollama server on
startup. The default is to raise an exception (same as previous
behavior), but in the starter template we adjust this to just log a
warning so that we can bring the stack up without needing a running
Ollama server.
* The starter template no longer specifies a default list of models for
Ollama, as any models specified there need to actually be pulled and
available in Ollama. Instead, it adds a new
`OLLAMA_INFERENCE_MODEL` environment variable where users can provide an
optional model to register with the Ollama provider on startup.
Additional models can also be registered via the typical
`models.register(...)` at runtime.
* The vLLM template was adjusted to also allow an optional
`VLLM_INFERENCE_MODEL` specified on startup, so that the behavior
between vLLM and Ollama was consistent here to make it easy to get up
and running quickly.
* The default vector store was changed from sqlite-vec to faiss.
sqlite-vec can be enabled via setting the `ENABLE_SQLITE_VEC` environment
variable, like we do for chromadb and pgvector. This is due to
sqlite-vec not shipping proper arm64 binaries, like we previously fixed
in #1530 for the ollama distribution.
## Test Plan
With this change, the following scenarios now work with the starter
template that did not before:
* no Ollama running
* Ollama running but not all of the Llama models pulled locally
* Ollama running with a custom model registered on startup
* vLLM running with a custom model registered on startup
* running the starter template on linux/arm64, like when running
containers on Mac without rosetta emulation
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
Add search_mode parameter (vector/keyword/hybrid) to
openai_search_vector_store method. Fixes OpenAPI
code generation by using str instead of Literal type.
Closes: #2459
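A hypothetical request against the OpenAI-compatible search endpoint (the path and payload keys here are assumptions based on the OpenAI-style API):
```python
import requests

resp = requests.post(
    "http://localhost:8321/v1/openai/v1/vector_stores/vs_123/search",
    json={
        "query": "What is Llama Stack?",
        "search_mode": "hybrid",  # "vector" (default), "keyword", or "hybrid"
        "max_num_results": 5,
    },
    timeout=30,
)
print(resp.json())
```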
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
# What does this PR do?
Bug:
1. go to responses chat logs in UI
2. go to chat completions logs page
3. observe that same data appears in the table twice
This is because `fetchData` is called multiple times when multiple
renders occur.
## Test Plan
manual testing of above bug repro steps
# What does this PR do?
Closes #2495
Changes:
- Delay the `COPY run.yaml` into docker image step until after external
provider handling
- Split the check for `external_providers_dir` into "non-empty" and
"directory exists"
## Test Plan
0. Create and Activate venv
1. Create a `simple_build.yaml`
```yaml
version: '2'
distribution_spec:
providers:
inference:
- remote::openai
image_type: container
image_name: openai-stack
```
2. Run llama stack build:
```bash
llama stack build --config simple_build.yaml
```
3. Run the docker container:
```bash
docker run \
-p 8321:8321 \
-e OPENAI_API_KEY=$OPENAI_API_KEY \
openai_stack:0.2.12
```
This should show server is running.
```
INFO 2025-06-23 19:07:57,832 llama_stack.distribution.distribution:151 core: Loading external providers from /.llama/providers.d
INFO 2025-06-23 19:07:59,324 __main__:572 server: Listening on ['::', '0.0.0.0']:8321
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO 2025-06-23 19:07:59,336 __main__:156 server: Starting up
INFO: Application startup complete.
INFO: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
```
Notice the first line:
```
Loading external providers from /.llama/providers.d
```
This is expected behaviour.
Co-authored-by: Rohan Awhad <rawhad@redhat.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.0.1 to 6.3.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v6.3.0 🌈 Use latest version from manifest-file</h2>
<h2>Changes</h2>
<p>If a manifest-file is supplied the default value of the version input
(latest) will get the latest version available in the manifest. That
might not be the actual latest version available in the official uv
repo.</p>
<h2>🚀 Enhancements</h2>
<ul>
<li>Use latest version from manifest-file <a
href="https://github.com/eifinger"><code>@eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/458">#458</a>)</li>
</ul>
<h2>v6.2.0 🌈 New input manifest-file</h2>
<h2>Changes</h2>
<p>This release adds a new input <code>manifest-file</code>.</p>
<p>The <code>manifest-file</code> input allows you to specify a JSON
manifest that lists available uv versions,
architectures, and their download URLs. By default, this action uses the
manifest file contained
in this repository, which is automatically updated with each release of
uv.</p>
<p>The manifest file contains an array of objects, each describing a
version,
architecture, platform, and the corresponding download URL.</p>
<p>You can supply a custom manifest file URL to define additional
versions,
architectures, or different download URLs.
This is useful if you maintain your own uv builds or want to override
the default sources.</p>
<p>For example:</p>
<pre lang="json"><code>[
{
"version": "0.7.12-alpha.1",
"artifactName":
"uv-x86_64-unknown-linux-gnu.tar.gz",
"arch": "x86_64",
"platform": "unknown-linux-gnu",
"downloadUrl":
"https://release.pyx.dev/0.7.12-alpha.1/uv-x86_64-unknown-linux-gnu.tar.gz"
},
...
]
</code></pre>
<pre lang="yaml"><code>- name: Use a custom manifest file
uses: astral-sh/setup-uv@v6
with:
manifest-file: "https://example.com/my-custom-manifest.json"
</code></pre>
<blockquote>
<p>[!WARNING]</p>
</blockquote>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="445689ea25"><code>445689e</code></a>
Use latest version from manifest-file (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/458">#458</a>)</li>
<li><a
href="a02a550bdd"><code>a02a550</code></a>
Look for version-manifest.json relative to action path (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/456">#456</a>)</li>
<li><a
href="60cc2b4585"><code>60cc2b4</code></a>
Add input manifest-file (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/454">#454</a>)</li>
<li><a
href="7bbb36f434"><code>7bbb36f</code></a>
chore: update known versions for 0.7.13 and 0.7.12 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/444">#444</a>)</li>
<li><a
href="60ecb381b4"><code>60ecb38</code></a>
Set expected cache dir drive to C: on windows (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/451">#451</a>)</li>
<li><a
href="252c995424"><code>252c995</code></a>
chore: update known versions for 0.7.11 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/442">#442</a>)</li>
<li><a
href="477a814f2d"><code>477a814</code></a>
chore: update known versions for 0.7.10 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/440">#440</a>)</li>
<li><a
href="9b19f8f4b1"><code>9b19f8f</code></a>
Add warning about shadowed uv binaries to
<code>activate-environment</code> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/439">#439</a>)</li>
<li><a
href="d44461ea9f"><code>d44461e</code></a>
chore: update known versions for 0.7.9 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/437">#437</a>)</li>
<li><a
href="c19c1b1ffd"><code>c19c1b1</code></a>
Check that all jobs are in all-tests-passed.needs (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/432">#432</a>)</li>
<li>Additional commits viewable in <a
href="6b9c6063ab...445689ea25">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# What does this PR do?
CI tests have been failing with
.venv/lib/python3.12/site-packages/peft/auto.py:21: in <module>
from transformers import (
.venv/lib/python3.12/site-packages/transformers/__init__.py:27: in
<module>
from . import dependency_versions_check
.venv/lib/python3.12/site-packages/transformers/dependency_versions_check.py:57:
in <module>
require_version_core(deps[pkg])
.venv/lib/python3.12/site-packages/transformers/utils/versions.py:117:
in require_version_core
return require_version(requirement, hint)
.venv/lib/python3.12/site-packages/transformers/utils/versions.py:111:
in require_version
_compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
.venv/lib/python3.12/site-packages/transformers/utils/versions.py:44: in
_compare_versions
raise ImportError(
E ImportError: huggingface-hub>=0.30.0,<1.0 is required for a normal
functioning of this module, but found huggingface-hub==0.29.0.
E Try: `pip install transformers -U` or `pip install -e '.[dev]'` if
you're working with git main
------------------------------ Captured log setup
------------------------------
INFO llama_stack.providers.remote.inference.ollama.ollama:ollama.py:106
checking connectivity to Ollama at `http://0.0.0.0:11434`.../
=========================== short test summary info
============================
ERROR
tests/integration/providers/test_providers.py::TestProviders::test_providers
- ImportError: huggingface-hub>=0.30.0,<1.0 is required for a normal
functioning of this module, but found huggingface-hub==0.29.0.
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if
you're working with git main
=================== 1 skipped, 4 warnings, 1 error in 9.52s
====================
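For context, the failure is a dependency-floor mismatch: transformers enforces `huggingface-hub>=0.30.0,<1.0` at import time while the CI environment had 0.29.0 installed. A minimal sketch of that kind of version check, using `packaging` (illustrative only, not the project's actual fix):
```python
# Hedged sketch: reproduce the kind of check transformers performs at import
# time, using importlib.metadata and packaging (not llama-stack code).
from importlib.metadata import version
from packaging.specifiers import SpecifierSet

installed = version("huggingface-hub")
required = SpecifierSet(">=0.30.0,<1.0")

if installed not in required:
    raise ImportError(
        f"huggingface-hub{required} is required, but found huggingface-hub=={installed}"
    )
```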
## Test Plan
CI
# What does this PR do?
Probably related to the Python 3.11 upgrade:
^^^^
File
"/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.11/site-packages/termcolor/termcolor.py",
line 147, in colored
text = fmt_str % (COLORS[color], text)
~~~~~~^^^^^^^
KeyError: 'light_blue'
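The installed termcolor evidently has no entry for `'light_blue'` in its color table, hence the `KeyError`. A hedged sketch of a defensive fallback (illustrative only, not necessarily the fix applied here):
```python
# Illustrative workaround only: fall back to uncolored text when the installed
# termcolor does not recognize the requested color name.
from termcolor import colored

def safe_colored(text: str, color: str) -> str:
    try:
        return colored(text, color)
    except KeyError:
        return text  # e.g. older termcolor releases have no 'light_blue'

print(safe_colored("hello", "light_blue"))
```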
## Test Plan
# What does this PR do?
Inference/Response stores now record user attributes when inserting and respect them when fetching.
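A toy in-memory analogue of that behavior, purely for illustration (the real implementation lives in the SQL-backed inference/response stores and uses different names):
```python
# Toy sketch: capture the caller's attributes at write time and filter on fetch.
from dataclasses import dataclass, field

@dataclass
class Record:
    id: str
    payload: dict
    owner_attributes: dict = field(default_factory=dict)

class AttributeScopedStore:
    """Illustrative in-memory analogue, not the real sqlstore."""

    def __init__(self) -> None:
        self._rows: list[Record] = []

    def insert(self, record: Record, user_attributes: dict) -> None:
        record.owner_attributes = user_attributes  # stored alongside the row
        self._rows.append(record)

    def fetch(self, user_attributes: dict) -> list[Record]:
        # Only return rows whose stored attributes match the caller's.
        return [
            r for r in self._rows
            if all(user_attributes.get(k) == v for k, v in r.owner_attributes.items())
        ]
```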
## Test Plan
pytest tests/unit/utils/test_sqlstore.py
# What does this PR do?
See inline comment.
fixes test
_
test_openai_vector_store_search_with_high_score_filter[llama_stack_client-meta-llama/Llama-3.3-70B-Instruct-meta-llama/Llama-4-Scout-17B-16E-Instruct-all-MiniLM-L6-v2-None-None]
_
llama-stack/llama_stack/distribution/library_client.py:98: in
convert_to_pydantic
return TypeAdapter(annotation).validate_python(value)
.venv/lib/python3.10/site-packages/pydantic/type_adapter.py:421: in
validate_python
return self.validator.validate_python(
E pydantic_core._pydantic_core.ValidationError: 1 validation error for
nullable[SearchRankingOptions]
E score_threshold
E Input should be less than or equal to 1 [type=less_than_equal,
input_value=1.3458905661753127, input_type=float]
E For further information visit
https://errors.pydantic.dev/2.11/v/less_than_equal
The above exception was the direct cause of the following exception:
llama-stack/tests/integration/vector_io/test_openai_vector_stores.py:376:
in test_openai_vector_store_search_with_high_score_filter
search_response = compat_client.vector_stores.search(
.venv/lib/python3.10/site-packages/llama_stack_client/resources/vector_stores/vector_stores.py:356:
in search
return self._post(
.venv/lib/python3.10/site-packages/llama_stack_client/_base_client.py:1232:
in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream,
stream_cls=stream_cls))
llama-stack/llama_stack/distribution/library_client.py:177: in request
result = loop.run_until_complete(self.async_client.request(*args,
**kwargs))
/opt/hostedtoolcache/Python/3.10.18/x64/lib/python3.10/asyncio/base_events.py:649:
in run_until_complete
return future.result()
llama-stack/llama_stack/distribution/library_client.py:292: in request
response = await self._call_non_streaming(
llama-stack/llama_stack/distribution/library_client.py:313: in
_call_non_streaming
body = self._convert_body(path, options.method, body)
llama-stack/llama_stack/distribution/library_client.py:425: in
_convert_body
converted_body[param_name] = convert_to_pydantic(param.annotation,
value)
llama-stack/llama_stack/distribution/library_client.py:112: in
convert_to_pydantic
raise ValueError(f"Failed to convert parameter {value} into
{annotation}: {e}") from e
E ValueError: Failed to convert parameter {'score_threshold':
1.3458905661753127} into
llama_stack.apis.vector_io.vector_io.SearchRankingOptions | None: 1
validation error for nullable[SearchRankingOptions]
E score_threshold
E Input should be less than or equal to 1 [type=less_than_equal,
input_value=1.3458905661753127, input_type=float]
E For further information visit
https://errors.pydantic.dev/2.11/v/less_than_equal
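The root cause is that `score_threshold` is bounded above by 1 while the test derived a threshold from raw scores that can exceed 1. A minimal sketch of the constraint that produces the error above (only the upper bound from the error message is modeled; the real `SearchRankingOptions` has more to it):
```python
# Minimal sketch of the validation that fails above.
from pydantic import BaseModel, Field, ValidationError

class RankingOptions(BaseModel):
    score_threshold: float = Field(default=0.0, le=1.0)

RankingOptions(score_threshold=0.9)  # fine

try:
    RankingOptions(score_threshold=1.3458905661753127)
except ValidationError as e:
    print(e)  # "Input should be less than or equal to 1"
```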
## Test Plan
feat: Add Gemini 2.0 and 2.5 models
This commit expands the set of known Gemini models by introducing:
- `gemini/gemini-2.0-flash`
- `gemini/gemini-2.5-flash`
- `gemini/gemini-2.5-pro`
These new models are added to `LLM_MODEL_IDS` for broader compatibility
and updated in `run.yaml` to allow for their immediate use in starter
configurations.
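Roughly, the change amounts to appending the new identifiers to the provider's known-model list; the sketch below shows only the entries named in this PR, with everything else elided:
```python
# Sketch only: the three identifiers introduced by this change, as they would
# appear among the provider's known model IDs (pre-existing entries omitted).
NEW_GEMINI_MODEL_IDS = [
    "gemini/gemini-2.0-flash",
    "gemini/gemini-2.5-flash",
    "gemini/gemini-2.5-pro",
]

LLM_MODEL_IDS = [
    # ...pre-existing Gemini entries...
    *NEW_GEMINI_MODEL_IDS,
]
```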
Signed-off-by: Eran Cohen <eranco@redhat.com>
# What does this PR do?
This adds the ability to list, retrieve, update, and delete Vector Store
Files. It implements these new APIs for the faiss and sqlite-vec
providers, since those are the two that also have the rest of the vector
store files implementation.
Closes #2445
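A hedged usage sketch against the OpenAI-compatible surface these endpoints expose (the base URL mirrors the test plan below; IDs are placeholders, and exact client method shapes may vary by SDK version):
```python
# Illustrative only: list/retrieve/update/delete vector store files through an
# OpenAI-compatible client pointed at a local Llama Stack server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

files = client.vector_stores.files.list(vector_store_id="vs_123")
first = client.vector_stores.files.retrieve(vector_store_id="vs_123", file_id="file_abc")
client.vector_stores.files.update(
    vector_store_id="vs_123",
    file_id="file_abc",
    attributes={"topic": "demo"},
)
client.vector_stores.files.delete(vector_store_id="vs_123", file_id="file_abc")
```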
## Test Plan
### test_openai_vector_stores Integration Tests
There are a number of new integration tests added, which I ran for each
provider as outlined below.
faiss (from ollama distro):
```
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run llama_stack/templates/ollama/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 \
pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \
--embedding-model=all-MiniLM-L6-v2
```
sqlite-vec (from starter distro):
```
llama stack run llama_stack/templates/starter/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 \
pytest -sv tests/integration/vector_io/test_openai_vector_stores.py \
--embedding-model=all-MiniLM-L6-v2
```
### file_search verification tests
I also ensured the file_search verification tests continue to work, both
for faiss and sqlite-vec.
faiss (ollama distro):
```
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
llama stack run llama_stack/templates/ollama/run.yaml
pytest -sv tests/verifications/openai_api/test_responses.py \
-k'file_search' \
--base-url=http://localhost:8321/v1/openai/v1 \
--model=meta-llama/Llama-3.2-3B-Instruct
```
sqlite-vec (starter distro):
```
llama stack run llama_stack/templates/starter/run.yaml
pytest -sv tests/verifications/openai_api/test_responses.py \
-k'file_search' \
--base-url=http://localhost:8321/v1/openai/v1 \
--model=together/meta-llama/Llama-3.2-3B-Instruct-Turbo
```
---------
Signed-off-by: Ben Browning <bbrownin@redhat.com>
# What does this PR do?
Do not force 384 for the embedding dimension, use the one provided by
the test run.
## Test Plan
```
pytest -s -vvv tests/integration/vector_io/test_vector_io.py --stack-config=http://localhost:8321 \
-k "not(builtin_tool or safety_with_image or code_interpreter or test_rag)" \
--text-model="meta-llama/Llama-3.2-3B-Instruct" \
--embedding-model=granite-embedding-125m --embedding-dimension=768
Uninstalled 1 package in 16ms
Installed 1 package in 11ms
INFO 2025-06-18 10:52:03,314 tests.integration.conftest:59 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS
/Users/leseb/Documents/AI/llama-stack/.venv/lib/python3.10/site-packages/pytest_asyncio/plugin.py:207: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"
warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
================================================= test session starts =================================================
platform darwin -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.10.16', 'Platform': 'macOS-15.5-arm64-arm-64bit', 'Packages': {'pytest': '8.3.4', 'pluggy': '1.5.0'}, 'Plugins': {'cov': '6.0.0', 'html': '4.1.1', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'asyncio': '0.25.3', 'anyio': '4.8.0', 'nbval': '0.11.0'}}
rootdir: /Users/leseb/Documents/AI/llama-stack
configfile: pyproject.toml
plugins: cov-6.0.0, html-4.1.1, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, asyncio-0.25.3, anyio-4.8.0, nbval-0.11.0
asyncio: mode=strict, asyncio_default_fixture_loop_scope=None
collected 8 items
tests/integration/vector_io/test_vector_io.py::test_vector_db_retrieve[emb=granite-embedding-125m:dim=768] PASSED
tests/integration/vector_io/test_vector_io.py::test_vector_db_register[emb=granite-embedding-125m:dim=768] PASSED
tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case0] PASSED
tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case1] PASSED
tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case2] PASSED
tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case3] PASSED
tests/integration/vector_io/test_vector_io.py::test_insert_chunks[emb=granite-embedding-125m:dim=768-test_case4] PASSED
tests/integration/vector_io/test_vector_io.py::test_insert_chunks_with_precomputed_embeddings[emb=granite-embedding-125m:dim=768] PASSED
================================================== 8 passed in 5.50s ==================================================
```
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Dropped Python 3.10 support, updated pyproject and dependencies, and removed some blocks of code with special handling for enum.StrEnum.
Closes #2458
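For illustration, this is the kind of simplification the 3.11+ floor allows: using `enum.StrEnum` directly instead of the old `str`-plus-`Enum` compatibility pattern (the enum below is made up):
```python
# Illustrative only (made-up enum): with Python >= 3.11 as the minimum,
# enum.StrEnum can replace the old `class Color(str, Enum)` shims.
from enum import StrEnum

class Color(StrEnum):
    RED = "red"
    BLUE = "blue"

assert Color.RED == "red"         # members compare equal to their string values
assert f"{Color.BLUE}" == "blue"  # and render as plain strings
```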
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
Move to use vector_stores.search for file search tool in Responses,
which supports filters.
Closes #2435
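A hedged sketch of what this enables from the client side: a `file_search` tool call whose `filters` get pushed down to `vector_stores.search` (store ID and the filtered attribute are placeholders):
```python
# Illustrative only: Responses API call using file_search with a filter,
# against an OpenAI-compatible Llama Stack endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

response = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    input="What does the handbook say about PTO?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_123"],
        "filters": {"type": "eq", "key": "region", "value": "us"},
    }],
)
print(response.output_text)
```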
## Test Plan
Added e2e test with filters.
myenv ❯ llama stack run llama_stack/templates/fireworks/run.yaml
pytest -sv tests/verifications/openai_api/test_responses.py \
-k 'file_search and filters' \
--base-url=http://localhost:8321/v1/openai/v1 \
--model=meta-llama/Llama-3.3-70B-Instruct
Sadly, I won't have capacity to continue working for the project.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
## Summary:
This commit adds infinite scroll pagination to the chat completions and
responses tables.
## Test Plan:
1. Run unit tests: npm run test
2. Manual testing: Navigate to chat
completions/responses pages
3. Verify infinite scroll triggers when approaching
bottom
4. Added playwright tests: npm run test:e2e
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
To add health check for faiss inline vector_io provider.
I tried adding `async def health(self) -> HealthResponse:` like in the inference provider, but it didn't work for the `inline->vector_io->faiss` provider. Via debug logs, I found the critical issue: health responses are being stored with the API name as the key, not as a nested dictionary keyed by provider ID. This means that all providers of the same API type (e.g., "vector_io") share the same health response, and only the last one processed is visible in the API response.
I've created a patch file that fixes this issue by:
- Storing the original get_providers_health method
- Creating a patched version that correctly maps health responses to
providers
- Applying the patch to the `ProviderImpl` class
I'm not an expert here, so please let me know if there is another workaround that would let me update the health status directly from `faiss.py`.
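To make the shape of the bug concrete, here is a toy sketch of the two mappings (not the actual `ProviderImpl` code): keying by API name alone lets the last provider overwrite its siblings, while nesting by provider ID keeps one entry per provider.
```python
# Toy illustration of the mapping issue described above (not real llama-stack code).

# Keyed by API name only: every vector_io provider writes to the same slot,
# so only the last provider processed is visible.
health_by_api = {
    "vector_io": {"status": "OK"},
}

# Keyed by API name *and* provider ID: each provider keeps its own entry.
health_by_provider = {
    "vector_io": {
        "faiss": {"status": "OK"},
        "sqlite-vec": {"status": "OK"},
    },
}
```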
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Added unit tests to test the provider patch implementation in the PR.
Adding a screenshot with the FAISS inline vector_io health status as
"OK"

| API Conformance Tests | [conformance.yml](conformance.yml) | Run the API Conformance test suite on the changes. |
| Installer CI | [install-script-ci.yml](install-script-ci.yml) | Test the installation script |
| Integration Auth Tests | [integration-auth-tests.yml](integration-auth-tests.yml) | Run the integration test suite with Kubernetes authentication |
| SqlStore Integration Tests | [integration-sql-store-tests.yml](integration-sql-store-tests.yml) | Run the integration test suite with SqlStore |
| Integration Tests (Replay) | [integration-tests.yml](integration-tests.yml) | Run the integration test suites from tests/integration in replay mode |
| Vector IO Integration Tests | [integration-vector-io-tests.yml](integration-vector-io-tests.yml) | Run the integration test suite with various VectorIO providers |
| Pre-commit | [pre-commit.yml](pre-commit.yml) | Run pre-commit checks |
| Test Llama Stack Build | [providers-build.yml](providers-build.yml) | Test llama stack build |
| Test llama stack list-deps | [providers-list-deps.yml](providers-list-deps.yml) | Test llama stack list-deps |
| Python Package Build Test | [python-build-test.yml](python-build-test.yml) | Test building the llama-stack PyPI project |
| Integration Tests (Record) | [record-integration-tests.yml](record-integration-tests.yml) | Run the integration test suite from tests/integration |
| Check semantic PR titles | [semantic-pr.yml](semantic-pr.yml) | Ensure that PR titles follow the conventional commit spec |
| Close stale issues and PRs | [stale_bot.yml](stale_bot.yml) | Run the Stale Bot action |
| Test External Providers Installed via Module | [test-external-provider-module.yml](test-external-provider-module.yml) | Test External Provider installation via Python module |
| Test External API and Providers | [test-external.yml](test-external.yml) | Test the External API and Provider mechanisms |
| UI Tests | [ui-unit-tests.yml](ui-unit-tests.yml) | Run the UI test suite |
| Unit Tests | [unit-tests.yml](unit-tests.yml) | Run the unit test suite |
echo "::error::Full mypy failed. Reproduce locally with 'uv run pre-commit run mypy-full --hook-stage manual --all-files'."
fi
exit $status
- name: Check if any unused recordings
  run: |
set -e
PYTHONPATH=$PWD uv run ./scripts/cleanup_recordings.py --delete
changes=$(git status --short tests/integration | grep 'recordings' || true)
if [ -n "$changes" ]; then
echo "::error::Unused integration recordings detected. Run 'PYTHONPATH=$(pwd) uv run ./scripts/cleanup_recordings.py --delete' locally and commit the deletions."
echo "::error file=$file,line=$line_num::Do not use 'import logging' or 'from logging import' in $file. Use the custom log instead: from llama_stack.log import get_logger; logger = get_logger(). If direct logging is truly needed, add:# allow-direct-logging"
done <<< "$matches"
exit 1
fi
exit 0
- id: fips-compliance
  name: Ensure llama-stack remains FIPS compliant
  entry: bash
  language: system
  types: [python]
  pass_filenames: true
  exclude: '^tests/.*$' # Exclude test dir as some safety tests used MD5
Here are some key changes that are coming as part of this release.
### Build and Environment
- Environment improvements: fixed env var replacement to preserve types.
- Docker stability: fixed container startup failures for Fireworks AI provider.
- Removed absolute paths in build for better portability.
### Features
- UI Enhancements: Implemented file upload and VectorDB creation/configuration directly in UI.
- Vector Store Improvements: Added keyword, vector, and hybrid search inside vector store.
- Added S3 authorization support for file providers.
- SQL Store: Added inequality support to where clause.
### Documentation
- Fixed post-training docs.
- Added Contributor Guidelines for creating Internal vs. External providers.
### Fixes
- Removed unsupported bfcl scoring function.
- Multiple reliability and configuration fixes for providers and environment handling.
### Engineering / Chores
- Cleaner internal development setup with consistent paths.
- Incremental improvements to provider integration and vector store behavior.
### New Contributors
- @omertuc made their first contribution in #3270
- @r3v5 made their first contribution in vector store hybrid search
---
# v0.2.19
Published on: 2025-08-26T22:06:55Z
## Highlights
* feat: Add CORS configuration support for server by @skamenan7 in https://github.com/llamastack/llama-stack/pull/3201
* feat(api): introduce /rerank by @ehhuang in https://github.com/llamastack/llama-stack/pull/2940
* feat: Add S3 Files Provider by @mattf in https://github.com/llamastack/llama-stack/pull/3202
---
# v0.2.18
Published on: 2025-08-20T01:09:27Z
## Highlights
* Add moderations create API
* Hybrid search in Milvus
* Numerous Responses API improvements
* Documentation updates
---
# v0.2.17
Published on: 2025-08-05T01:51:14Z
## Highlights
* feat(tests): introduce inference record/replay to increase test reliability by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2941
* fix(library_client): improve initialization error handling and prevent AttributeError by @mattf in https://github.com/meta-llama/llama-stack/pull/2944
* fix: use OLLAMA_URL to activate Ollama provider in starter by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2963
* feat(UI): adding MVP playground UI by @franciscojavierarceo in https://github.com/meta-llama/llama-stack/pull/2828
* Standardization of errors (@nathan-weinberg)
* feat: Enable DPO training with HuggingFace inline provider by @Nehanth in https://github.com/meta-llama/llama-stack/pull/2825
* chore: rename templates to distributions by @ashwinb in https://github.com/meta-llama/llama-stack/pull/3035
---
# v0.2.16
Published on: 2025-07-28T23:35:23Z
## Highlights
* Automatic model registration for self-hosted providers (ollama and vllm currently). No need for `INFERENCE_MODEL` environment variables which need to be updated, etc.
* Much simplified starter distribution. Most `ENABLE_` env variables are now gone. When you set `VLLM_URL`, the `vllm` provider is auto-enabled. Similar for `MILVUS_URL`, `PGVECTOR_DB`, etc. Check the [config.yaml](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/templates/starter/config.yaml) for more details.
* All tests migrated to pytest now (thanks @Elbehery)
* DPO implementation in the post-training provider (thanks @Nehanth)
* (Huge!) Support for external APIs and providers thereof (thanks @leseb, @cdoern and others). This is a really big deal -- you can now add more APIs completely out of tree and experiment with them before (optionally) wanting to contribute back.
* `inline::vllm` provider is gone thank you very much
* several improvements to OpenAI inference implementations and LiteLLM backend (thanks @mattf)
* Chroma now supports Vector Store API (thanks @franciscojavierarceo).
* Authorization improvements: Vector Store/File APIs now support access control (thanks @franciscojavierarceo); Telemetry read APIs are gated according to logged-in user's roles.
---
# v0.2.15
Published on: 2025-07-16T03:30:01Z
---
# v0.2.14
Published on: 2025-07-04T16:06:48Z
## Highlights
* Support for Llama Guard 4
* Added Milvus support to vector-stores API
* Documentation and zero-to-hero updates for latest APIs
---
# v0.2.13
Published on: 2025-06-28T04:28:11Z
## Highlights
* search_mode support in OpenAI vector store API
* Security fixes
---
# v0.2.12
Published on: 2025-06-20T22:52:12Z
## Highlights
* Filter support in file search
* Support auth attributes in inference and response stores
---
# v0.2.11
Published on: 2025-06-17T20:26:26Z
## Highlights
* OpenAI-compatible vector store APIs
* Hybrid Search in Sqlite-vec
* File search tool in Responses API
* Pagination in inference and response stores
* Added `suffix` to completions API for fill-in-the-middle tasks
---
# v0.2.10.1
Published on: 2025-06-06T20:11:02Z
Llama Stack was created to provide developers with a comprehensive and coherent interface that simplifies AI application development and codifies best practices across the Llama ecosystem. Since our launch in September 2024, we have seen a huge uptick in interest in Llama Stack APIs by both AI developers and from partners building AI services with Llama models. Partners like Nvidia, Fireworks, and Ollama have collaborated with us to develop implementations across various APIs, including inference, memory, and safety.
With Llama Stack, you can easily build a RAG agent which can also search the web, do complex math, and custom tool calling. You can use telemetry to inspect those traces, and convert telemetry into evals datasets. And with Llama Stack’s plugin architecture and prepackage distributions, you choose to run your agent anywhere - in the cloud with our partners, deploy your own environment using virtualenv or Docker, operate locally with Ollama, or even run on mobile devices with our SDKs. Llama Stack offers unprecedented flexibility while also simplifying the developer experience.
## Release
After iterating on the APIs for the last 3 months, today we’re launching a stable release (V1) of the Llama Stack APIs and the corresponding llama-stack server and client packages (v0.1.0). We now have automated tests for providers. These tests make sure that all provider implementations are verified. Developers can now easily and reliably select distributions or providers based on their specific requirements.
---
# v0.0.62
Published on: 2024-12-18T02:39:43Z
---
# v0.0.61
Published on: 2024-12-10T20:50:33Z
---
# v0.0.55
Published on: 2024-11-23T17:14:07Z
---
# v0.0.54
Published on: 2024-11-22T00:36:09Z
---
# v0.0.53
Published on: 2024-11-20T22:18:00Z
🚀 Initial Release Notes for Llama Stack!
### Added
- Resource-oriented design for models, shields, memory banks, datasets and eval tasks
- Persistence for registered objects with distribution
- Ability to persist memory banks created for FAISS
- PostgreSQL KVStore implementation
- Environment variable placeholder support in run.yaml files
- Comprehensive Zero-to-Hero notebooks and quickstart guides
- Support for quantized models in Ollama
- Vision models support for Together, Fireworks, Meta-Reference, and Ollama, and vLLM
- Bedrock distribution with safety shields support
- Evals API with task registration and scoring functions
- MMLU and SimpleQA benchmark scoring functions
- Huggingface dataset provider integration for benchmarks
- Support for custom dataset registration from local paths
- Benchmark evaluation CLI tools with visualization tables
- RAG evaluation scoring functions and metrics
- Local persistence for datasets and eval tasks
### Changed
- Split safety into distinct providers (llama-guard, prompt-guard, code-scanner)
We want to make contributing to this project as easy and transparent as
possible.
## Set up your development environment
We use [uv](https://github.com/astral-sh/uv) to manage python dependencies and virtual environments.
You can install `uv` by following this [guide](https://docs.astral.sh/uv/getting-started/installation/).
You can install the dependencies by running:
```bash
cd llama-stack
uv venv --python 3.12
uv sync --group dev
uv pip install -e .
source .venv/bin/activate
```
```{note}
If you are making changes to Llama Stack, it is essential that you use Python 3.12 as shown above.
Llama Stack can work with Python 3.13 but the pre-commit hooks used to validate code changes only work with Python 3.12.
If you don't specify a Python version, `uv` will automatically select a Python version according to the `requires-python`
section of the `pyproject.toml`, which is fine for running Llama Stack but not for committing changes.
For more info, see the [uv docs around Python versions](https://docs.astral.sh/uv/concepts/python-versions/).
```
Note that you can create a dotenv file `.env` that includes necessary environment variables:
```
LLAMA_STACK_BASE_URL=http://localhost:8321
LLAMA_STACK_CLIENT_LOG=debug
LLAMA_STACK_PORT=8321
LLAMA_STACK_CONFIG=<provider-name>
TAVILY_SEARCH_API_KEY=
BRAVE_SEARCH_API_KEY=
```
And then use this dotenv file when running client SDK tests via the following:
```bash
uv run --env-file .env -- pytest -v tests/integration/inference/test_text_inference.py --text-model=meta-llama/Llama-3.1-8B-Instruct
```
### Pre-commit Hooks
We use [pre-commit](https://pre-commit.com/) to run linting and formatting checks on your code. You can install the pre-commit hooks by running:
```bash
uv pip install 'pre-commit>=4.4.0'
uv run pre-commit install
```
Note that the Llama Stack continuous integration expects the specific pre-commit version pinned above, so it is essential that you install that version as shown. Once you have run these commands, pre-commit hooks will run automatically before each commit.
Alternatively, if you don't want to install the pre-commit hooks (or if you want to check if your changes are ready before committing),
you can run the checks manually by running:
```bash
uv run pre-commit run --all-files -v
```
The `-v` (verbose) parameter is optional but often helpful for getting more information about any issues that the pre-commit checks identify.
To run the expanded mypy configuration that CI enforces, use:
```bash
uv run pre-commit run mypy-full --hook-stage manual --all-files
```
or invoke mypy directly with all optional dependencies:
```bash
uv run --group dev --group type_checking mypy
```
```{caution}
Before pushing your changes, make sure that the pre-commit hooks have passed successfully.
```
## Discussions -> Issues -> Pull Requests
We actively welcome your pull requests. However, please read the following. This is heavily inspired by [Ghostty](https://github.com/ghostty-org/ghostty/blob/main/CONTRIBUTING.md).
If in doubt, please open a [discussion](https://github.com/llamastack/llama-stack/discussions); we can always convert that to an issue later.
### Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.
Meta has a [bounty program](http://facebook.com/whitehat/info) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.
### Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Meta's open source projects.
Complete your CLA here: <https://code.facebook.com/cla>
**I'd like to contribute!**
All issues are actionable (please report if they are not.) Pick one and start working on it. Thank you.
If you need help or guidance, comment on the issue. Issues that are extra friendly to new contributors are tagged with "contributor friendly".
If you are new to the project, start by looking at the issues tagged with "good first issue". If you're interested
leave a comment on the issue and a triager will assign it to you.
Please avoid picking up too many issues at once. This helps you stay focused and ensures that others in the community also have opportunities to contribute.
- Try to work on only 1–2 issues at a time, especially if you’re still getting familiar with the codebase.
- Before taking an issue, check if it’s already assigned or being actively discussed.
- If you’re blocked or can’t continue with an issue, feel free to unassign yourself or leave a comment so others can step in.
**I have a bug!**
4. Make sure your code lints using `pre-commit`.
5. If you haven't already, complete the Contributor License Agreement ("CLA").
6. Ensure your pull request follows the [conventional commits format](https://www.conventionalcommits.org/en/v1.0.0/).
7. Ensure your pull request follows the [coding style](#coding-style).
Please keep pull requests (PRs) small and focused. If you have a large set of changes, consider splitting them into logically grouped, smaller PRs to facilitate review and testing.
```{tip}
As a general guideline:
- Experienced contributors should try to keep no more than 5 open PRs at a time.
- New contributors are encouraged to have only one open PR at a time until they’re familiar with the codebase and process.
```
## Repository guidelines
## Running tests
You can find the Llama Stack testing documentation here [here](tests/README.md).
## Adding a new dependency to the project
To add a new dependency to the project, you can use the `uv` command. For example, to add `foo` to the project, you can run:
```bash
uv add foo
uv sync
```
### Coding Style
* Comments should provide meaningful insights into the code. Avoid filler comments that simply
describe the next step, as they create unnecessary clutter, same goes for docstrings.
justification for bypassing the check.
* Don't use unicode characters in the codebase. ASCII-only is preferred for compatibility or
readability reasons.
* Provider configuration classes should be Pydantic models whose fields use `Field` with a `description`; these descriptions are used to generate the provider documentation.
* When possible, use keyword arguments only when calling functions.
* Llama Stack utilizes [custom Exception classes](llama_stack/apis/common/errors.py) for certain Resources that should be used where applicable.
### License
By contributing to Llama, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
## Common Tasks
Some tips about common tasks you work on while contributing to Llama Stack:
### Installing dependencies of distributions
When installing dependencies for a distribution, you can use `llama stack list-deps` to view and install the required packages.
If you have made changes to a provider's configuration in any form (introducing a new config key, or changing models, etc.), you should run `./scripts/distro_codegen.py` to re-generate various YAML files as well as the documentation. You should not change `docs/source/.../distributions/` files manually as they are auto-generated.
### Updating the provider documentation
If you have made changes to a provider's configuration, you should run `./scripts/provider_codegen.py`
to re-generate the documentation. You should not change `docs/source/.../providers/` files manually
as they are auto-generated.
Note that the provider "description" field will be used to generate the provider documentation.
### Building the Documentation
If you are making changes to the documentation at [https://llamastack.github.io/](https://llamastack.github.io/), you can use the following command to build the documentation and preview your changes.
```bash
# This rebuilds the documentation pages and the OpenAPI spec.
cd docs/
npm install
npm run gen-api-docs all
npm run build

# This will start a local server (usually at http://127.0.0.1:3000).
npm run serve
```
We released [Version 0.2.0](https://github.com/meta-llama/llama-stack/releases/tag/v0.2.0) with support for the Llama 4 herd of models released by Meta.
<details>
<summary>👋 Click here to see how to run Llama 4 models on Llama Stack </summary>
*Note you need 8xH100 GPU-host to run these models*
```bash
pip install -U llama_stack
MODEL="Llama-4-Scout-17B-16E-Instruct"
# get meta url from llama.com
llama model download --source meta --model-id $MODEL --meta-url <META_URL>
Llama Stack defines and standardizes the core building blocks that simplify AI application development. It provides a unified set of APIs with implementations from leading service providers. More specifically, it provides:
- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, and Evals.
- **Plugin architecture** to support the rich ecosystem of different API implementations in various environments, including local development, on-premises, cloud, and mobile.
- **Prepackaged verified distributions** which offer a one-stop solution for developers to get started quickly and reliably in any environment.
- **Multiple developer interfaces** like CLI and SDKs for Python, Typescript, iOS, and Android.
#### Llama Stack Benefits
- **Flexibility**: Developers can choose their preferred infrastructure without changing APIs and enjoy flexible deployment choices.
- **Consistent Experience**: With its unified APIs, Llama Stack makes it easier to build, test, and deploy AI applications with consistent application behavior.
- **Robust Ecosystem**: Llama Stack is integrated with distribution partners (cloud providers, hardware vendors, and AI-focused companies) that offer tailored infrastructure, software, and services for deploying Llama models.

By reducing friction and complexity, Llama Stack empowers developers to focus on what they do best: building transformative generative AI applications.

For more information, see the [Benefits of Llama Stack](https://llamastack.github.io/docs/latest/concepts/architecture#benefits-of-llama-stack) documentation.
### API Providers
Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.
Please check out the [full list](https://llamastack.github.io/docs/providers).
> **Note**: Additional providers are available through external packages. See [External Providers](https://llamastack.github.io/docs/providers/external) documentation.
### Distributions
A Llama Stack Distribution (or "distro") is a pre-configured bundle of provider implementations for each API component. Distributions make it easy to get started with a specific deployment scenario. For example, you can begin with a local setup of Ollama and seamlessly transition to production, with Fireworks, without changing your application code.
Here are some of the distributions we support:
| **Distribution** | **Llama Stack Docker** | Start This Distribution |
For full documentation on the Llama Stack distributions see the [Distributions Overview](https://llamastack.github.io/docs/distributions) page.
### Documentation
Please check out our [Documentation](https://llamastack.github.io/docs) page for more details.
* CLI references
* [llama (server-side) CLI Reference](https://llamastack.github.io/docs/references/llama_cli_reference): Guide for using the `llama` CLI to work with Llama models (download, study prompts), and building/starting a Llama Stack distribution.
* [llama (client-side) CLI Reference](https://llamastack.github.io/docs/references/llama_stack_client_cli_reference): Guide for using the `llama-stack-client` CLI, which allows you to query information about the distribution.
* Getting Started
* [Quick guide to start a Llama Stack server](https://llamastack.github.io/docs/getting_started/quickstart).
* [Jupyter notebook](./docs/getting_started.ipynb) to walk-through how to use simple text and vision inference llama_stack_client APIs
* The complete Llama Stack lesson [Colab notebook](https://colab.research.google.com/drive/1dtVmxotBsI4cGZQNsJRYPrLiDeT0Wnwt) of the new [Llama 3.2 course on Deeplearning.ai](https://learn.deeplearning.ai/courses/introducing-multimodal-llama-3-2/lesson/8/llama-stack).
* A [Zero-to-Hero Guide](https://github.com/meta-llama/llama-stack/tree/main/docs/zero_to_hero_guide) that guides you through all the key components of Llama Stack with code samples.
* [Contributing](CONTRIBUTING.md)
* [Adding a new API Provider](https://llamastack.github.io/docs/contributing/new_api_provider) to walk through how to add a new API provider.
### Llama Stack Client SDKs
Check out our client SDKs for connecting to a Llama Stack server in your preferred language; you can choose from [python](https://github.com/meta-llama/llama-stack-client-python), [typescript](https://github.com/meta-llama/llama-stack-client-typescript), [swift](https://github.com/meta-llama/llama-stack-client-swift), and [kotlin](https://github.com/meta-llama/llama-stack-client-kotlin) to quickly build your applications.
You can find more example scripts with client SDKs to talk with the Llama Stack server in our [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) repo.
## Star History
[](https://www.star-history.com/#meta-llama/llama-stack&Date)
Performance benchmarking is critical for understanding the overhead and characteristics of the Llama Stack abstraction layer compared to direct inference engines like vLLM.
### Why This Benchmark Suite Exists
**Performance Validation**: The Llama Stack provides a unified API layer across multiple inference providers, but this abstraction introduces potential overhead. This benchmark suite quantifies the performance impact by comparing:
- Llama Stack inference (with vLLM backend)
- Direct vLLM inference calls
- Both under identical Kubernetes deployment conditions
**Production Readiness Assessment**: Real-world deployments require understanding performance characteristics under load. This suite simulates concurrent user scenarios with configurable parameters (duration, concurrency, request patterns) to validate production readiness.
**Regression Detection (TODO)**: As the Llama Stack evolves, this benchmark provides automated regression detection for performance changes. CI/CD pipelines can leverage these benchmarks to catch performance degradations before production deployments.
**Resource Planning**: By measuring throughput, latency percentiles, and resource utilization patterns, teams can make informed decisions about:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.9/20.9 MB 144.3 MB/s eta 0:00:00
Installing collected packages: uv
Successfully installed uv-0.8.19
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 24.0 -> 25.2
[notice] To update, run: pip install --upgrade pip
Using Python 3.11.13 environment at: /usr/local
Resolved 61 packages in 551ms
Downloading pillow (6.3MiB)
Downloading hf-xet (3.0MiB)
Downloading tokenizers (3.1MiB)
Downloading pygments (1.2MiB)
Downloading pandas (11.8MiB)
Downloading aiohttp (1.7MiB)
Downloading pydantic-core (1.9MiB)
Downloading numpy (16.2MiB)
Downloading transformers (11.1MiB)
Downloading pyarrow (40.8MiB)
Downloading pydantic-core
Downloading aiohttp
Downloading tokenizers
Downloading hf-xet
Downloading pygments
Downloading pillow
Downloading numpy
Downloading pandas
Downloading transformers
Downloading pyarrow
Prepared 61 packages in 1.23s
Installed 61 packages in 114ms
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.12.15
+ aiosignal==1.4.0
+ annotated-types==0.7.0
+ anyio==4.10.0
+ attrs==25.3.0
+ certifi==2025.8.3
+ charset-normalizer==3.4.3
+ click==8.1.8
+ datasets==4.1.1
+ dill==0.4.0
+ filelock==3.19.1
+ frozenlist==1.7.0
+ fsspec==2025.9.0
+ ftfy==6.3.1
+ guidellm==0.3.0
+ h11==0.16.0
+ h2==4.3.0
+ hf-xet==1.1.10
+ hpack==4.1.0
+ httpcore==1.0.9
+ httpx==0.28.1
+ huggingface-hub==0.35.0
+ hyperframe==6.1.0
+ idna==3.10
+ loguru==0.7.3
+ markdown-it-py==4.0.0
+ mdurl==0.1.2
+ multidict==6.6.4
+ multiprocess==0.70.16
+ numpy==2.3.3
+ packaging==25.0
+ pandas==2.3.2
+ pillow==11.3.0
+ propcache==0.3.2
+ protobuf==6.32.1
+ pyarrow==21.0.0
+ pydantic==2.11.9
+ pydantic-core==2.33.2
+ pydantic-settings==2.10.1
+ pygments==2.19.2
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.1.1
+ pytz==2025.2
+ pyyaml==6.0.2
+ regex==2025.9.18
+ requests==2.32.5
+ rich==14.1.0
+ safetensors==0.6.2
+ six==1.17.0
+ sniffio==1.3.1
+ tokenizers==0.22.1
+ tqdm==4.67.1
+ transformers==4.56.2
+ typing-extensions==4.15.0
+ typing-inspection==0.4.1
+ tzdata==2025.2
+ urllib3==2.5.0
+ wcwidth==0.2.14
+ xxhash==3.5.0
+ yarl==1.20.1
Using Python 3.11.13 environment at: /usr/local
Audited 1 package in 3ms
Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
Creating backend...
Backend openai_http connected to http://llama-stack-benchmark-service:8323/v1/openai for model meta-llama/Llama-3.2-3B-Instruct.
Creating request loader...
Created loader with 1000 unique requests from prompt_tokens=512,output_tokens=256.
Using Python 3.11.13 environment at: /usr/local
Audited 1 package in 3ms
Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
Creating backend...
Backend openai_http connected to http://vllm-server:8000 for model meta-llama/Llama-3.2-3B-Instruct.
Creating request loader...
Created loader with 1000 unique requests from prompt_tokens=512,output_tokens=256.
Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our [Github page](https://llamastack.github.io/getting_started/quickstart).
## Render locally
From the llama-stack `docs/` directory, run the following commands to render the docs locally:
```bash
npm install
npm run gen-api-docs all
npm run build
npm run serve
```
You can then open the docs in your browser at http://localhost:3000.
## File Import System
This documentation uses `remark-code-import` to import files directly from the repository, eliminating copy-paste maintenance. Files are automatically embedded during build time.
### Importing Code Files
To import Python code (or any code files) with syntax highlighting, use this syntax in `.mdx` files:
The Llama Stack Evaluation flow allows you to run evaluations on your GenAI application datasets or pre-registered benchmarks.
We introduce a set of APIs in Llama Stack for running evaluations of LLM applications:
- `/datasetio` + `/datasets` API
- `/scoring` + `/scoring_functions` API
- `/eval` + `/benchmarks` API
This guide goes over the sets of APIs and developer experience flow of using Llama Stack to run evaluations for different use cases. Check out our Colab notebook on working examples with evaluations [here](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing).
The Evaluation APIs are associated with a set of Resources. Please visit the Resources section in our [Core Concepts](../concepts/index.mdx) guide for a better high-level understanding.
- **DatasetIO**: defines interface with datasets and data loaders.
- Associated with `Dataset` resource.
- **Scoring**: evaluate outputs of the system.
- Associated with `ScoringFunction` resource. We provide a suite of out-of-the-box scoring functions and also the ability for you to add custom evaluators. These scoring functions are the core part of defining an evaluation task to output evaluation metrics.
- **Eval**: generate outputs (via Inference or Agents) and perform scoring.
Llama Stack pre-registers several popular open-benchmarks to easily evaluate model performance via CLI.
The list of open-benchmarks we currently support:
- [MMLU-COT](https://arxiv.org/abs/2009.03300) (Measuring Massive Multitask Language Understanding): Benchmark designed to comprehensively evaluate the breadth and depth of a model's academic and professional understanding
- [GPQA-COT](https://arxiv.org/abs/2311.12022) (A Graduate-Level Google-Proof Q&A Benchmark): A challenging benchmark of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry.
- [SimpleQA](https://openai.com/index/introducing-simpleqa/): Benchmark designed to assess models' ability to answer short, fact-seeking questions.
- [MMMU](https://arxiv.org/abs/2311.16502) (A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI): Benchmark designed to evaluate multimodal models.
You can follow this [contributing guide](../references/evals_reference/index.mdx#open-benchmark-contributing-guide) to add more open-benchmarks to Llama Stack.
### Run evaluation on open-benchmarks via CLI
We have built-in functionality to run the supported open-benchmarks using the llama-stack-client CLI.
#### Spin up Llama Stack server
Spin up the Llama Stack server with the 'open-benchmark' template:
```
llama stack run llama_stack/distributions/open-benchmark/config.yaml
```
#### Run eval CLI
There are three required inputs to run a benchmark eval:
- `list of benchmark_ids`: The list of benchmark ids to run evaluation on
- `model-id`: The model id to evaluate on
- `output_dir`: Path to store the evaluation results
```
llama-stack-client eval run-benchmark <benchmark_id_1> [<benchmark_id_2> ...] \
--model_id <model id to evaluate on> \
--output_dir <directory to store the evaluation results>
```
You can run
```
llama-stack-client eval run-benchmark help
```
to see descriptions of all the flags that `eval run-benchmark` supports.
In the output log, you can find the path to the file that contains your evaluation results. Open that file to see your aggregate evaluation results.
## Usage Example
Here's a basic example of using the evaluation API:
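The exact request shape depends on the benchmarks and providers you have registered; the following is a minimal sketch, assuming a local server on the default port, where the benchmark id, model name, and sampling parameters are placeholders:
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Kick off an evaluation job against a pre-registered benchmark
# (benchmark id and candidate model below are illustrative placeholders).
job = client.eval.run_eval(
    benchmark_id="meta-reference-mmlu",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "meta-llama/Llama-3.2-3B-Instruct",
            "sampling_params": {"max_tokens": 512},
        },
    },
)
print(job)
```
See the [Evaluation Reference](../references/evals_reference/index.mdx) for the exact request schema and how to poll job status and fetch results.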
## Best Practices
- **Choose appropriate providers**: Use Meta Reference for comprehensive evaluation, NVIDIA for platform-specific needs
- **Configure storage properly**: Ensure your key-value store configuration matches your performance requirements
- **Monitor evaluation progress**: Large evaluations can take time - implement proper monitoring
- **Use appropriate scoring functions**: Select scoring metrics that align with your evaluation goals
## What's Next?
- Check out our Colab notebook on working examples with running benchmark evaluations [here](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb#scrollTo=mxLCsP4MvFqP).
- Check out our [Building Applications - Evaluation](../building_applications/evals.mdx) guide for more details on how to use the Evaluation APIs to evaluate your applications.
- Check out our [Evaluation Reference](../references/evals_reference/index.mdx) for more details on the APIs.
- Explore the [Scoring](./scoring.mdx) documentation for available scoring functions.
Post-training in Llama Stack allows you to fine-tune models using various providers and frameworks. This section covers all available post-training providers and how to use them effectively.
- **HuggingFace SFTTrainer** (`inline::huggingface`) - Fine-tuning using HuggingFace ecosystem
- **TorchTune** (`inline::torchtune`) - Fine-tuning using Meta's TorchTune framework
- **NVIDIA** (`remote::nvidia`) - Fine-tuning using NVIDIA's platform
## HuggingFace SFTTrainer
[HuggingFace SFTTrainer](https://huggingface.co/docs/trl/en/sft_trainer) is an inline post-training provider for Llama Stack. It allows you to run supervised fine-tuning on a variety of models using many datasets.
### Features
- Simple access through the post_training API
- Fully integrated with Llama Stack
- GPU support, CPU support, and MPS support (MacOS Metal Performance Shaders)
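As a rough sketch of what a fine-tuning request can look like through the post_training API (the job id, model, and config dictionaries below are placeholders, and the parameter names are indicative only; consult the Post Training API reference for the exact schema):
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Illustrative only: ids, model name, and config contents are placeholders.
job = client.post_training.supervised_fine_tune(
    job_uuid="sft-demo-1",
    model="meta-llama/Llama-3.2-3B-Instruct",
    algorithm_config={"type": "LoRA"},    # e.g. LoRA settings
    training_config={"n_epochs": 1},      # dataset, batch size, epochs, ...
    hyperparam_search_config={},
    logger_config={},
)
print(job)
```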
### Configuration
| Field | Type | Required | Default | Description |
## TorchTune
[TorchTune](https://github.com/pytorch/torchtune) is an inline post training provider for Llama Stack. It provides a simple and efficient way to fine-tune language models using PyTorch.
### Features
- Simple access through the post_training API
- Fully integrated with Llama Stack
- GPU support and single device capabilities
- Support for LoRA
### Configuration
| Field | Type | Required | Default | Description |
The Scoring API in Llama Stack allows you to evaluate outputs of your GenAI system using various scoring functions and metrics. This section covers all available scoring providers and their configuration.
## Overview
Llama Stack provides multiple scoring providers:
- **Basic** (`inline::basic`) - Simple evaluation metrics and scoring functions
- **Braintrust** (`inline::braintrust`) - Advanced evaluation using the Braintrust platform
- **LLM-as-Judge** (`inline::llm-as-judge`) - Uses language models to evaluate responses
The Scoring API is associated with `ScoringFunction` resources and provides a suite of out-of-the-box scoring functions. You can also add custom evaluators to meet specific evaluation needs.
## Basic Scoring
Basic scoring provider for simple evaluation metrics and scoring functions. This provider offers fundamental scoring capabilities without external dependencies.
### Configuration
No configuration required - this provider works out of the box.
## Braintrust
Braintrust scoring provider for evaluation and scoring using the [Braintrust platform](https://braintrustdata.com/). Braintrust provides advanced evaluation capabilities and experiment tracking.
### Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `openai_api_key` | `str \| None` | No | | The OpenAI API Key for LLM-powered evaluations |
### Sample Configuration
```yaml
openai_api_key: ${env.OPENAI_API_KEY:=}
```
### Features
- Advanced evaluation metrics
- Experiment tracking and comparison
- LLM-powered evaluation functions
- Integration with Braintrust's evaluation suite
- Detailed scoring analytics and insights
### Use Cases
- Production evaluation pipelines
- A/B testing of model versions
- Advanced scoring with custom metrics
- Detailed evaluation reporting and analysis
## LLM-as-Judge
LLM-as-judge scoring provider that uses language models to evaluate and score responses. This approach leverages the reasoning capabilities of large language models to assess quality, relevance, and other subjective metrics.
### Configuration
No configuration required - this provider works out of the box.
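The judge model, prompt template, and score-extraction regex are supplied as scoring-function parameters when calling the Scoring API. The snippet below is a sketch: the judge model, prompt, regex, and input rows are all placeholders.
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

rows = [
    {
        "input_query": "What is the capital of France?",
        "generated_answer": "Paris is the capital of France.",
        "expected_answer": "Paris",
    }
]

scoring_params = {
    "llm-as-judge::base": {
        "type": "llm_as_judge",
        "judge_model": "meta-llama/Llama-3.3-70B-Instruct",  # placeholder judge model
        "prompt_template": "Rate the answer from 0 to 5. Reply as 'Score: <n>'.",
        "judge_score_regexes": [r"Score: (\d+)"],
    },
}

response = client.scoring.score(input_rows=rows, scoring_functions=scoring_params)
print(response.results)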
description: Legacy APIs that are being phased out
sidebar_label: Deprecated
sidebar_position: 1
---
# Deprecated APIs
This section contains APIs that are being phased out in favor of newer, more standardized implementations. These APIs are maintained for backward compatibility but are not recommended for new projects.
:::warning Deprecation Notice
These APIs are deprecated and will be removed in future versions. Please migrate to the recommended alternatives listed below.
:::
## Migration Guide
When using deprecated APIs, please refer to the migration guides provided for each API to understand how to transition to the supported alternatives.
## Deprecated API List
### Legacy Inference APIs
Some older inference endpoints that have been superseded by the standardized Inference API.
**Migration Path:** Use the [Inference API](../api/) instead.
### Legacy Vector Operations
Older vector database operations that have been replaced by the Vector IO API.
**Migration Path:** Use the [Vector IO API](../api/) instead.
### Legacy File Operations
Older file management endpoints that have been replaced by the Files API.
**Migration Path:** Use the [Files API](../api/) instead.
## Support Timeline
Deprecated APIs will be supported according to the following timeline:
- **Current Version**: Full support with deprecation warnings
- **Next Major Version**: Limited support with migration notices
- **Following Major Version**: Removal of deprecated APIs
## Getting Help
If you need assistance migrating from deprecated APIs:
1. Check the specific migration guides for each API
2. Review the [API Reference](../api/) for current alternatives
3. Consult the [Community Forums](https://github.com/llamastack/llama-stack/discussions) for migration support
4. Open an issue on GitHub for specific migration questions
## Contributing
If you find issues with deprecated APIs or have suggestions for improving the migration process, please contribute by:
1. Opening an issue describing the problem
2. Submitting a pull request with improvements
3. Updating migration documentation
For more information on contributing, see our [Contributing Guide](../contributing/).
description: APIs in development with limited support
sidebar_label: Experimental
sidebar_position: 1
---
# Experimental APIs
This section contains APIs that are currently in development and may have limited support or stability. These APIs are available for testing and feedback but should not be used in production environments.
:::warning Experimental Notice
These APIs are experimental and may change without notice. Use with caution and provide feedback to help improve them.
:::
## Current Experimental APIs
### Batch Inference API
Run inference on a dataset of inputs in batch mode for improved efficiency.
**Status:** In Development
**Provider Support:** Limited
**Use Case:** Large-scale inference operations
**Features:**
- Batch processing of multiple inputs
- Optimized resource utilization
- Progress tracking and monitoring
### Batch Agents API
Run agentic workflows on a dataset of inputs in batch mode.
**Status:** In Development
**Provider Support:** Limited
**Use Case:** Large-scale agent operations
**Features:**
- Batch agent execution
- Parallel processing capabilities
- Result aggregation and analysis
### Synthetic Data Generation API
Generate synthetic data for model development and testing.
**Status:** Early Development
**Provider Support:** Very Limited
**Use Case:** Training data augmentation
**Features:**
- Automated data generation
- Quality control mechanisms
- Customizable generation parameters
### Batches API (OpenAI-compatible)
OpenAI-compatible batch management for inference operations.
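Because the surface mirrors OpenAI's Batch API, existing OpenAI SDK code should largely carry over. A hedged sketch, where the base URL, API key, and input file id are placeholders for your deployment:
```python
from openai import OpenAI

# Point the standard OpenAI client at your Llama Stack server (URL/key are placeholders).
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

batch = client.batches.create(
    input_file_id="file-abc123",        # a previously uploaded JSONL request file
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```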
### Prerequisites
- Llama Stack server running with experimental features enabled
- Appropriate provider configurations
- Understanding of API limitations
### Configuration
Experimental APIs may require special configuration flags or provider settings. Check the specific API documentation for setup requirements.
### Usage Guidelines
1. **Testing Only**: Use experimental APIs for testing and development only
2. **Monitor Changes**: Watch for updates and breaking changes
3. **Provide Feedback**: Report issues and suggest improvements
4. **Backup Data**: Always backup important data when using experimental features
## Feedback and Contribution
We encourage feedback on experimental APIs to help improve them:
### Reporting Issues
- Use GitHub issues with the "experimental" label
- Include detailed error messages and reproduction steps
- Specify the API version and provider being used
### Feature Requests
- Submit feature requests through GitHub discussions
- Provide use cases and expected behavior
- Consider contributing implementations
### Testing
- Test experimental APIs in your environment
- Report performance issues and optimization opportunities
- Share success stories and use cases
## Migration to Stable APIs
As experimental APIs mature, they will be moved to the stable API section. When this happens:
1. **Announcement**: We'll announce the promotion in release notes
2. **Migration Guide**: Detailed migration instructions will be provided
3. **Deprecation Timeline**: Experimental versions will be deprecated with notice
4. **Support**: Full support will be available for stable versions
## Provider Support
Experimental APIs may have limited provider support. Check the specific API documentation for:
- Supported providers
- Configuration requirements
- Known limitations
- Performance characteristics
## Roadmap
Experimental APIs are part of our ongoing development roadmap:
- **Q1 2024**: Batch Inference API stabilization
- **Q2 2024**: Batch Agents API improvements
- **Q3 2024**: Synthetic Data Generation API expansion
- **Q4 2024**: Batches API full OpenAI compatibility
For the latest updates, follow our [GitHub releases](https://github.com/llamastack/llama-stack/releases) and [roadmap discussions](https://github.com/llamastack/llama-stack/discussions).
description: OpenAI-compatible APIs and features in Llama Stack
sidebar_label: OpenAI Compatibility
sidebar_position: 1
---
# OpenAI API Compatibility
Llama Stack provides comprehensive OpenAI API compatibility, allowing you to use existing OpenAI API clients and tools with Llama Stack providers. This compatibility layer ensures seamless migration and interoperability.
## Overview
OpenAI API compatibility in Llama Stack includes:
- **OpenAI-compatible endpoints** for all major APIs
- **Request/response format compatibility** with OpenAI standards
- **Authentication and authorization** using OpenAI-style API keys
- **Error handling** with OpenAI-compatible error codes and messages
- **Rate limiting** and usage tracking compatible with OpenAI patterns
## Supported OpenAI APIs
### Chat Completions API
OpenAI-compatible chat completions for conversational AI applications.
**Endpoint:** `/v1/chat/completions`
**Compatibility:** Full OpenAI API compatibility
**Providers:** All inference providers
**Features:**
- Message-based conversations
- System prompts and user messages
- Function calling support
- Streaming responses
- Temperature and other parameter controls
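In practice this means the standard OpenAI Python client can talk to a Llama Stack server directly. A minimal sketch, assuming a local server, where the base URL, API key, and model name are placeholders:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")  # placeholder URL/key

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model id
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Name three uses of embeddings."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```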
### Completions API
OpenAI-compatible text completions for general text generation.
**Endpoint:** `/v1/completions`
**Compatibility:** Full OpenAI API compatibility
**Providers:** All inference providers
**Features:**
- Text completion generation
- Prompt engineering support
- Customizable parameters
- Batch processing capabilities
### Embeddings API
OpenAI-compatible embeddings for vector operations.
**Endpoint:** `/v1/embeddings`
**Compatibility:** Full OpenAI API compatibility
**Providers:** All embedding providers
**Features:**
- Text embedding generation
- Multiple embedding models
- Batch embedding processing
- Vector similarity operations
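The same OpenAI client style applies to embeddings. A short sketch, where the base URL and embedding model id are placeholders for whatever your providers expose:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")  # placeholder URL/key

emb = client.embeddings.create(
    model="nomic-ai/nomic-embed-text-v1.5",  # placeholder embedding model id
    input=["Llama Stack provides OpenAI-compatible embeddings."],
)
print(len(emb.data[0].embedding))  # embedding dimension
```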
### Files API
OpenAI-compatible file management for document processing.
**Endpoint:** `/v1/files`
**Compatibility:** Full OpenAI API compatibility
**Providers:** Local Filesystem, S3
**Features:**
- File upload and management
- Document processing
- File metadata tracking
- Secure file access
### Vector Store Files API
OpenAI-compatible vector store file operations for RAG applications.
For detailed code examples and implementation guides, see our [OpenAI Implementation Guide](../providers/openai.mdx).
## Known Limitations
### Responses API Limitations
The Responses API is still in active development. For detailed information about current limitations and implementation status, see our [OpenAI Responses API Limitations](../providers/openai_responses_limitations.mdx).
2. **Community Support**: Ask questions in GitHub discussions
3. **Issue Reporting**: Open GitHub issues for bugs
4. **Professional Support**: Contact support for enterprise issues
## Roadmap
Upcoming OpenAI compatibility features:
- **Enhanced batch processing** support
- **Advanced function calling** capabilities
- **Improved error handling** and diagnostics
- **Performance optimizations** for large-scale deployments
For the latest updates, follow our [GitHub releases](https://github.com/llamastack/llama-stack/releases) and [roadmap discussions](https://github.com/llamastack/llama-stack/discussions).
The Llama Stack provides a comprehensive set of APIs organized by stability level to help you choose the right endpoints for your use case.
## 🟢 Stable APIs
**Production-ready APIs with backward compatibility guarantees.**
These APIs are fully tested, documented, and stable. They follow semantic versioning principles and maintain backward compatibility within major versions. Recommended for production applications.
## 🟡 Experimental APIs
**Preview APIs that may change before becoming stable.**
These APIs include v1alpha and v1beta endpoints that are feature-complete but may undergo changes based on feedback. Great for exploring new capabilities and providing feedback.
## 🔴 Deprecated APIs
These APIs are deprecated and will be removed in future versions. They are provided for migration purposes and to help transition to newer, stable alternatives.
description: Complete reference for Llama Stack APIs
sidebar_label: Overview
sidebar_position: 1
---
# API Reference
Llama Stack provides a comprehensive set of APIs for building generative AI applications. All APIs follow OpenAI-compatible standards and can be used interchangeably across different providers.
## Core APIs
### Inference API
Run inference with Large Language Models (LLMs) and embedding models.
**Supported Providers:**
- Meta Reference (Single Node)
- Ollama (Single Node)
- Fireworks (Hosted)
- Together (Hosted)
- NVIDIA NIM (Hosted and Single Node)
- vLLM (Hosted and Single Node)
- TGI (Hosted and Single Node)
- AWS Bedrock (Hosted)
- Cerebras (Hosted)
- Groq (Hosted)
- SambaNova (Hosted)
- PyTorch ExecuTorch (On-device iOS, Android)
- OpenAI (Hosted)
- Anthropic (Hosted)
- Gemini (Hosted)
- WatsonX (Hosted)
### Agents API
Run multi-step agentic workflows with LLMs, including tool usage, memory (RAG), and complex reasoning.
**Supported Providers:**
- Meta Reference (Single Node)
- Fireworks (Hosted)
- Together (Hosted)
- PyTorch ExecuTorch (On-device iOS)
### Vector IO API
Perform operations on vector stores, including adding documents, searching, and deleting documents.
**Supported Providers:**
- FAISS (Single Node)
- SQLite-Vec (Single Node)
- Chroma (Hosted and Single Node)
- Milvus (Hosted and Single Node)
- Postgres (PGVector) (Hosted and Single Node)
- Weaviate (Hosted)
- Qdrant (Hosted and Single Node)
### Files API (OpenAI-compatible)
Manage file uploads, storage, and retrieval with OpenAI-compatible endpoints.
**Supported Providers:**
- Local Filesystem (Single Node)
- S3 (Hosted)
### Vector Store Files API (OpenAI-compatible)
Integrate file operations with vector stores for automatic document processing and search.
**Supported Providers:**
- FAISS (Single Node)
- SQLite-vec (Single Node)
- Milvus (Single Node)
- ChromaDB (Hosted and Single Node)
- Qdrant (Hosted and Single Node)
- Weaviate (Hosted)
- Postgres (PGVector) (Hosted and Single Node)
### Safety API
Apply safety policies to outputs at a systems level, not just model level.
**Supported Providers:**
- Llama Guard (Depends on Inference Provider)
- Prompt Guard (Single Node)
- Code Scanner (Single Node)
- AWS Bedrock (Hosted)
### Post Training API
Fine-tune models for specific use cases and domains.
**Supported Providers:**
- Meta Reference (Single Node)
- HuggingFace (Single Node)
- TorchTune (Single Node)
- NVIDIA NEMO (Hosted)
### Eval API
Generate outputs and perform scoring to evaluate system performance.
**Supported Providers:**
- Meta Reference (Single Node)
- NVIDIA NEMO (Hosted)
### Telemetry API
Collect telemetry data from the system for monitoring and observability.
**Supported Providers:**
- Meta Reference (Single Node)
### Tool Runtime API
Interact with various tools and protocols to extend LLM capabilities.
**Supported Providers:**
- Brave Search (Hosted)
- RAG Runtime (Single Node)
## API Compatibility
All Llama Stack APIs are designed to be OpenAI-compatible, allowing you to:
- Use existing OpenAI API clients and tools
- Migrate from OpenAI to other providers seamlessly
- Maintain consistent API contracts across different environments
## Getting Started
To get started with Llama Stack APIs:
1. **Choose a Distribution**: Select a pre-configured distribution that matches your environment
2. **Configure Providers**: Set up the providers you want to use for each API
3. **Start the Server**: Launch the Llama Stack server with your configuration
4. **Use the APIs**: Make requests to the API endpoints using your preferred client
For detailed setup instructions, see our [Getting Started Guide](../getting_started/quickstart).
## Provider Details
For complete provider compatibility and setup instructions, see our [Providers Documentation](../providers/).
## API Stability
Llama Stack APIs are organized by stability level:
- **[Stable APIs](./index.mdx)** - Production-ready APIs with full support
- **[Experimental APIs](../api-experimental/)** - APIs in development with limited support
- **[Deprecated APIs](../api-deprecated/)** - Legacy APIs being phased out
## OpenAI Integration
For specific OpenAI API compatibility features, see our [OpenAI Compatibility Guide](../api-openai/).
description: Build powerful AI applications with the Llama Stack agent framework
sidebar_label: Agents
sidebar_position: 3
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Agents
An Agent in Llama Stack is a powerful abstraction that allows you to build complex AI applications.
The Llama Stack agent framework is built on a modular architecture that allows for flexible and powerful AI applications. This document explains the key components and how they work together.
## Core Concepts
### 1. Agent Configuration
Agents are configured using the `AgentConfig` class, which includes:
- **Model**: The underlying LLM to power the agent
- **Instructions**: System prompt that defines the agent's behavior
- **Tools**: Capabilities the agent can use to interact with external systems
- **Safety Shields**: Guardrails to ensure responsible AI behavior
```python
from llama_stack_client import Agent

# Create the agent
agent = Agent(
    llama_stack_client,
    model="meta-llama/Llama-3-70b-chat",
    instructions="You are a helpful assistant that can use tools to answer questions.",
)
```
description: Understanding the internal processing flow of Llama Stack agents
sidebar_label: Agent Execution Loop
sidebar_position: 4
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Agent Execution Loop
Agents are the heart of Llama Stack applications. They combine inference, memory, safety, and tool usage into coherent workflows. At its core, an agent follows a sophisticated execution loop that enables multi-step reasoning, tool usage, and safety checks.
## Steps in the Agent Workflow
Each agent turn follows these key steps:
1. **Initial Safety Check**: The user's input is first screened through configured safety shields
2. **Context Retrieval**:
- If RAG is enabled, the agent can choose to query relevant documents from memory banks. You can use the `instructions` field to steer the agent.
- For new documents, they are first inserted into the memory bank.
- Retrieved context is provided to the LLM as a tool response in the message history.
3. **Inference Loop**: The agent enters its main execution loop:
- The LLM receives a user prompt (with previous tool outputs)
- The LLM generates a response, potentially with [tool calls](./tools)
- If tool calls are present:
- Tool inputs are safety-checked
- Tools are executed (e.g., web search, code execution)
- Tool responses are fed back to the LLM for synthesis
- The loop continues until:
- The LLM provides a final response without tool calls
- Maximum iterations are reached
- Token limit is exceeded
4. **Final Safety Check**: The agent's final response is screened through safety shields
## Execution Flow Diagram
```mermaid
sequenceDiagram
participant U as User
participant E as Executor
participant M as Memory Bank
participant L as LLM
participant T as Tools
participant S as Safety Shield
Note over U,S: Agent Turn Start
U->>S: 1. Submit Prompt
activate S
S->>E: Input Safety Check
deactivate S
loop Inference Loop
E->>L: 2.1 Augment with Context
L-->>E: 2.2 Response (with/without tool calls)
alt Has Tool Calls
E->>S: Check Tool Input
S->>T: 3.1 Execute Tool
T-->>E: 3.2 Tool Response
E->>L: 4.1 Tool Response
L-->>E: 4.2 Synthesized Response
end
opt Stop Conditions
Note over E: Break if:
Note over E: - No tool calls
Note over E: - Max iterations reached
Note over E: - Token limit exceeded
end
end
E->>S: Output Safety Check
S->>U: 5. Final Response
```
Each step in this process can be monitored and controlled through configurations.
## Agent Execution Example
Here's an example that demonstrates monitoring the agent's execution:
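A minimal sketch of streaming and printing the events of a turn with `AgentEventLogger`; the model id and tool group below are placeholders:
```python
from llama_stack_client import Agent, AgentEventLogger, LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

agent = Agent(
    client,
    model="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model
    instructions="You are a helpful assistant.",
    tools=["builtin::websearch"],              # placeholder tool group
)

session_id = agent.create_session("monitoring-demo")
response = agent.create_turn(
    messages=[{"role": "user", "content": "Search for recent Llama Stack releases."}],
    session_id=session_id,
    stream=True,
)

# Print each step event (inference, tool execution, shield checks) as it streams back.
for log in AgentEventLogger().log(response):
    log.print()
```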
description: Evaluate LLM applications with Llama Stack's comprehensive evaluation framework
sidebar_label: Evaluations
sidebar_position: 7
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
This guide walks you through the process of evaluating an LLM application built using Llama Stack. For detailed API reference, check out the [Evaluation Reference](../references/evals_reference/) guide that covers the complete set of APIs and developer experience flow.
:::tip[Interactive Examples]
Check out our [Colab notebook](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing) for working examples with evaluations, or try the [Getting Started notebook](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb).
:::
## Application Evaluation Example
[Open In Colab](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb)
Llama Stack offers a library of scoring functions and the `/scoring` API, allowing you to run evaluations on your pre-annotated AI application datasets.
In this example, we will show you how to:
1. **Build an Agent** with Llama Stack
2. **Query the agent's sessions, turns, and steps** to analyze execution
3. **Evaluate the results** using scoring functions
## Step-by-Step Evaluation Process
### 1. Building a Search Agent
First, let's create an agent that can search the web to answer questions:
```python
from llama_stack_client import LlamaStackClient, Agent, AgentEventLogger
```
description: Comprehensive guides for building AI applications with Llama Stack
sidebar_label: Overview
sidebar_position: 5
---
# AI Application Examples
Llama Stack provides all the building blocks needed to create sophisticated AI applications.
## Getting Started
The best way to get started is to look at this comprehensive notebook which walks through the various APIs (from basic inference, to RAG agents) and how to use them.
**📓 [Building AI Applications Notebook](https://github.com/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb)**
## Core Topics
Here are the key topics that will help you build effective AI applications:
### 🤖 **Agent Development**
- **[Agent Framework](./agent.mdx)** - Understand the components and design patterns of the Llama Stack agent framework
- **[Agent Execution Loop](./agent_execution_loop.mdx)** - How agents process information, make decisions, and execute actions
- **[Agents vs Responses API](./responses_vs_agents.mdx)** - Learn when to use each API for different use cases
### 📚 **Knowledge Integration**
- **[RAG (Retrieval-Augmented Generation)](./rag.mdx)** - Enhance your agents with external knowledge through retrieval mechanisms
### 🛠️ **Capabilities & Extensions**
- **[Tools](./tools.mdx)** - Extend your agents' capabilities by integrating with external tools and APIs
### 📊 **Quality & Monitoring**
- **[Evaluations](./evals.mdx)** - Evaluate your agents' effectiveness and identify areas for improvement
- **[Telemetry](./telemetry.mdx)** - Monitor and analyze your agents' performance and behavior
- **[Safety](./safety.mdx)** - Implement guardrails and safety measures to ensure responsible AI behavior
## Application Patterns
### 🤖 **Conversational Agents**
Build intelligent chatbots and assistants that can:
- Maintain context across conversations
- Access external knowledge bases
- Execute actions through tool integrations
- Apply safety filters and guardrails
### 📖 **RAG Applications**
Create knowledge-augmented applications that:
- Retrieve relevant information from documents
- Generate contextually accurate responses
- Handle large knowledge bases efficiently
- Provide source attribution
### 🔧 **Tool-Enhanced Systems**
Develop applications that can:
- Search the web for real-time information
- Interact with databases and APIs
- Perform calculations and analysis
- Execute complex multi-step workflows
### 🛡️ **Enterprise Applications**
Build production-ready systems with:
- Comprehensive safety measures
- Performance monitoring and analytics
- Scalable deployment configurations
- Evaluation and quality assurance
## Next Steps
1. **📖 Start with the Notebook** - Work through the complete tutorial
2. **🎯 Choose Your Pattern** - Pick the application type that matches your needs
3. **🏗️ Build Your Foundation** - Set up your [providers](/docs/providers/) and [distributions](/docs/distributions/)
4. **🚀 Deploy & Monitor** - Use our [deployment guides](/docs/deploying/) for production
## Related Resources
- **[Getting Started](/docs/getting_started/quickstart)** - Basic setup and concepts
- **[Providers](/docs/providers/)** - Available AI service providers
description: Web-based admin interface and chat playground for Llama Stack
sidebar_label: Playground
sidebar_position: 10
---
# Admin UI & Chat Playground
The Llama Stack UI provides a comprehensive web-based admin interface for managing your Llama Stack server, with an integrated chat playground for interactive testing. This admin interface is the primary way to monitor, manage, and debug your Llama Stack applications.
## Quick Start
Launch the admin UI with:
```bash
npx llama-stack-ui
```
Then visit `http://localhost:8322` to access the interface.
## Admin Interface Features
The Llama Stack UI is organized into three main sections:
RAG enables your applications to reference and recall information from external documents. Llama Stack makes Agentic RAG available through OpenAI's Responses API.
Llama Stack supports various approaches for building RAG applications. The server provides two APIs (Responses and Chat Completions), plus a high-level client wrapper (Agent class):
#### Approach 1: Agent Class (Client-Side)
The **Agent class** is a high-level client wrapper around the Responses API with automatic tool execution and session management. Best for conversational agents and multi-turn RAG.
```python
from llama_stack_client import Agent, AgentEventLogger, LlamaStackClient
```
#### Approach 2: Responses API (Server-Side)
- No built-in session management (stateless by default)
- Best for: Single-turn queries, OpenAI-compatible applications
#### Approach 3: Chat Completions API
The **Chat Completions API** is a server-side API that gives you explicit control over retrieval and generation. Best for custom RAG pipelines and batch processing.
Doing great work is about more than just hard work and ambition; it involves combining several elements:
1. **Pursue What Excites You**: Engage in projects that are both ambitious and exciting to you. It's important to work on something you have a natural aptitude for and a deep interest in.
2. **Explore and Discover**: Great work often feels like a blend of discovery and creation. Focus on seeing possibilities and let ideas take their natural shape, rather than just executing a plan.
3. **Be Bold Yet Flexible**: Take bold steps in your work without over-planning. An adaptable approach that evolves with new ideas can often lead to breakthroughs.
4. **Work on Your Own Projects**: Develop a habit of working on projects of your own choosing, as these often lead to great achievements. These should be projects you find exciting and that challenge you intellectually.
5. **Be Earnest and Authentic**: Approach your work with earnestness and authenticity. Trying to impress others with affectation can be counterproductive, as genuine effort and intellectual honesty lead to better work outcomes.
6. **Build a Supportive Environment**: Work alongside great colleagues who inspire you and enhance your work. Surrounding yourself with motivating individuals creates a fertile environment for great work.
7. **Maintain High Morale**: High morale significantly impacts your ability to do great work. Stay optimistic and protect your mental well-being to maintain progress and momentum.
8. **Balance**: While hard work is essential, overworking can lead to diminishing returns. Balance periods of intensive work with rest to sustain productivity over time.
This approach shows that great work is less about following a strict formula and more about aligning your interests, ambition, and environment to foster creativity and innovation.
- **Vector Stores API**: OpenAI-compatible vector storage with automatic embedding model detection
- **Files API**: Document upload and processing using OpenAI's file format
- **Responses API**: Enhanced chat completions with agentic tool calling via file search
## Configuring Default Embedding Models
To enable automatic vector store creation without specifying embedding models, configure a default embedding model in your config.yaml like so:
```yaml
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
```
With this configuration:
- `client.vector_stores.create()` works without requiring embedding model or provider parameters
- The system automatically uses the default vector store provider (`faiss`) when multiple providers are available
- The system automatically uses the default embedding model (`sentence-transformers/nomic-ai/nomic-embed-text-v1.5`) for any newly created vector store
- The `default_provider_id` specifies which vector storage backend to use
- The `default_embedding_model` specifies both the inference provider and model for embeddings
## Vector Store Operations
### Creating Vector Stores
You can create vector stores with automatic or explicit embedding model selection:
```python
# Automatic - uses the default configured embedding model and vector store provider
vs = client.vector_stores.create()

# Explicit - specify embedding model and/or provider when you need specific ones
vs = client.vector_stores.create(
    extra_body={
        "provider_id": "faiss",  # Optional: specify vector store provider
    }
)
```
description: Compare the Agents API and OpenAI Responses API for building AI applications with tool calling capabilities
sidebar_label: Agents vs Responses API
sidebar_position: 5
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Agents vs OpenAI Responses API
Llama Stack (LLS) provides two different APIs for building AI applications with tool calling capabilities: the **Agents API** and the **OpenAI Responses API**. While both enable AI systems to use tools and maintain full conversation history, they serve different use cases and have distinct characteristics.
:::note
**Note:** For simple and basic inferencing, you may want to use the [Chat Completions API](../providers/openai#chat-completions) directly, before progressing to Agents or Responses API.
:::
## Overview
### LLS Agents API
The Agents API is a full-featured, stateful system designed for complex, multi-turn conversations. It maintains conversation state through persistent sessions identified by a unique session ID. The API supports comprehensive agent lifecycle management, detailed execution tracking, and rich metadata about each interaction through a structured session/turn/step hierarchy. The API can orchestrate multiple tool calls within a single turn.
### OpenAI Responses API
The OpenAI Responses API is a full-featured, stateful system designed for complex, multi-turn conversations, with direct compatibility with OpenAI's conversational patterns enhanced by Llama Stack's tool calling capabilities. It maintains conversation state by chaining responses through a `previous_response_id`, allowing interactions to branch or continue from any prior point. Each response can perform multiple tool calls within a single turn.
### Key Differences
The LLS Agents API uses the Chat Completions API on the backend for inference as it's the industry standard for building AI applications and most LLM providers are compatible with this API. For a detailed comparison between Responses and Chat Completions, see [OpenAI's documentation](https://platform.openai.com/docs/guides/responses-vs-chat-completions).
Additionally, Agents let you specify input/output shields whereas Responses do not (though support is planned). Agents use a linear conversation model referenced by a single session ID. Responses, on the other hand, support branching, where each response can serve as a fork point, and conversations are tracked by the latest response ID. Responses also let you dynamically choose the model, vector store, files, MCP servers, and more on each inference call, enabling more complex workflows. Agents require a static configuration for these components at the start of the session.
Today the Agents and Responses APIs can be used independently depending on the use case. But, it is also productive to treat the APIs as complementary. It is not currently supported, but it is planned for the LLS Agents API to alternatively use the Responses API as its backend instead of the default Chat Completions API, i.e., enabling a combination of the safety features of Agents with the dynamic configuration and branching capabilities of Responses.
## Feature Comparison
| Feature | LLS Agents API | OpenAI Responses API |
|---------|------------|---------------------|
| **Conversation Management** | Linear persistent sessions | Can branch from any previous response ID |
print(f"Alternative web search: {response3.output_message.content}")
```
</TabItem>
</Tabs>
Both APIs demonstrate distinct strengths that make them valuable on their own for different scenarios. The Agents API excels in providing structured, safety-conscious workflows with persistent session management, while the Responses API offers flexibility through dynamic configuration and OpenAI compatible tool patterns.
## Use Case Examples
### 1. Research and Analysis with Safety Controls
**Best Choice: Agents API**
**Scenario:** You're building a research assistant for a financial institution that needs to analyze market data, execute code to process financial models, and search through internal compliance documents. The system must ensure all interactions are logged for regulatory compliance and protected by safety shields to prevent malicious code execution or data leaks.
**Why Agents API?** The Agents API provides persistent session management for iterative research workflows, built-in safety shields to protect against malicious code in financial models, and structured execution logs (session/turn/step) required for regulatory compliance. The static tool configuration ensures consistent access to your knowledge base and code interpreter throughout the entire research session.
### 2. Dynamic Information Gathering with Branching Exploration
**Best Choice: Responses API**
**Scenario:** You're building a competitive intelligence tool that helps businesses research market trends. Users need to dynamically switch between web search for current market data and file search through uploaded industry reports. They also want to branch conversations to explore different market segments simultaneously and experiment with different models for various analysis types.
**Why Responses API?** The Responses API's branching capability lets users explore multiple market segments from any research point. Dynamic per-call configuration allows switching between web search and file search as needed, while experimenting with different models (faster models for quick searches, more powerful models for deep analysis). The OpenAI-compatible tool patterns make integration straightforward.
### 3. OpenAI Migration with Advanced Tool Capabilities
**Best Choice: Responses API**
**Scenario:** You have an existing application built with OpenAI's Assistants API that uses file search and web search capabilities. You want to migrate to Llama Stack for better performance and cost control while maintaining the same tool calling patterns and adding new capabilities like dynamic vector store selection.
**Why Responses API?** The Responses API provides full OpenAI tool compatibility (`web_search`, `file_search`) with identical syntax, making migration seamless. The dynamic per-call configuration enables advanced features like switching vector stores per query or changing models based on query complexity - capabilities that extend beyond basic OpenAI functionality while maintaining compatibility.
### 4. Educational Programming Tutor
**Best Choice: Agents API**
**Scenario:** You're building a programming tutor that maintains student context across multiple sessions, safely executes code exercises, and tracks learning progress with audit trails for educators.
**Why Agents API?** Persistent sessions remember student progress across multiple interactions, safety shields prevent malicious code execution while allowing legitimate programming exercises, and structured execution logs help educators track learning patterns.
### 5. Advanced Software Debugging Assistant
**Best Choice: Agents API with Responses Backend**
**Scenario:** You're building a debugging assistant that helps developers troubleshoot complex issues. It needs to maintain context throughout a debugging session, safely execute diagnostic code, switch between different analysis tools dynamically, and branch conversations to explore multiple potential causes simultaneously.
**Why Agents + Responses?** The Agent provides safety shields for code execution and session management for the overall debugging workflow. The underlying Responses API enables dynamic model selection and flexible tool configuration per query, while branching lets you explore different theories (memory leak vs. concurrency issue) from the same debugging point and compare results.
:::info[Future Enhancement]
The ability to use Responses API as the backend for Agents is not yet implemented but is planned for a future release. Currently, Agents use Chat Completions API as their backend by default.
:::
## Decision Framework
Use this framework to choose the right API for your use case:
### Choose Agents API when:
- ✅ You need **safety shields** for input/output validation
- ✅ Your application requires **linear conversation flow** with persistent context
- ✅ You need **audit trails** and structured execution logs
- ✅ Your tool configuration is **static** throughout the session
- ✅ You're building **educational, financial, or enterprise** applications with compliance requirements
### Choose Responses API when:
- ✅ You need **conversation branching** to explore multiple paths
description: Implement safety measures and content moderation in Llama Stack applications
sidebar_label: Safety
sidebar_position: 9
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Safety Guardrails
Safety is a critical component of any AI application. Llama Stack provides a comprehensive Shield system that can be applied at multiple touchpoints to ensure responsible AI behavior and content moderation.
## Shield System Overview
The Shield system in Llama Stack provides:
- **Content filtering** for both input and output messages
- **Multi-touchpoint protection** across your application flow
- **Configurable safety policies** tailored to your use case
- **Integration with agents** for automated safety enforcement
description: Extend agent capabilities with external tools and function calling
sidebar_label: Tools
sidebar_position: 6
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Tools
Tools are functions that can be invoked by an agent to perform tasks. They are organized into tool groups and registered with specific providers. Each tool group represents a collection of related tools from a single provider. Tools are grouped so that shared state can be externalized: the tools in a group typically operate on the same state.
An example of this would be a "db_access" tool group that contains tools for interacting with a database. "list_tables", "query_table", "insert_row" could be examples of tools in this group.
Tools are treated like any other resource in Llama Stack, such as models: you can register them, configure providers for them, and so on.
When instantiating an agent, you can provide it a list of tool groups that it has access to. The agent retrieves the corresponding tool definitions for the specified tool groups and passes them along to the model.
Refer to the [Building AI Applications](https://github.com/meta-llama/llama-stack/blob/main/docs/getting_started.ipynb) notebook for more examples on how to use tools.
## Server-side vs. Client-side Tool Execution
Llama Stack allows you to use both server-side and client-side tools. With server-side tools, `agent.create_turn` executes the tool calls emitted by the model transparently, giving the user the final answer. If client-side tools are provided, the tool call is sent back to the user for execution and optional continuation using the `agent.resume_turn` method.
## Server-side Tools
Llama Stack provides built-in providers for some common tools. These include web search, math, and RAG capabilities.
### Web Search
You have three providers to execute the web search tool calls generated by a model: Brave Search, Bing Search, and Tavily Search.
To indicate that the web search tool calls should be executed by brave-search, you can point the "builtin::websearch" toolgroup to the "brave-search" provider.
```python
client.toolgroups.register(
    toolgroup_id="builtin::websearch",
    provider_id="brave-search",
    args={"max_results": 5},
)
```
The tool requires an API key which can be provided either in the configuration or through the request header `X-LlamaStack-Provider-Data`. The format of the header is:
```
{"<provider_name>_api_key": <your api key>}
```
### Math
The WolframAlpha tool provides access to computational knowledge through the WolframAlpha API.
```python
client.toolgroups.register(
    toolgroup_id="builtin::wolfram_alpha",
    provider_id="wolfram-alpha",
)
```
Example usage:
```python
result = client.tool_runtime.invoke_tool(
    tool_name="wolfram_alpha",
    args={"query": "solve x^2 + 2x + 1 = 0"},
)
```
### RAG
The RAG tool enables retrieval of context from various types of memory banks (vector, key-value, keyword, and graph).
:::note
By default, the Llama Stack config.yaml defines toolgroups for web search, WolframAlpha, and RAG, provided by the tavily-search, wolfram-alpha, and rag providers respectively.
:::
## Model Context Protocol (MCP)
[MCP](https://github.com/modelcontextprotocol) is an emerging, popular standard for tool discovery and execution. It is a protocol that allows tools to be dynamically discovered from an MCP endpoint and can be used to extend the agent's capabilities.
### Using Remote MCP Servers
You can find some popular remote MCP servers [here](https://github.com/jaw9c/awesome-remote-mcp-servers). You can register them as toolgroups in the same way as local providers.
Note that most of the more useful MCP servers need you to authenticate with them. Many of them use OAuth2.0 for authentication. You can provide the authorization token when creating the Agent:
## Adding Custom (Client-side) Tools
When you want to use tools other than the built-in ones, you just need to implement a Python function with a docstring. The content of the docstring is used to describe the tool and its parameters, and is passed along to the generative model.
```python
# Example tool definition
def my_tool(input: int) -> int:
    """
    Runs my awesome tool.
    :param input: some int parameter
    """
    return input * 2
```
:::tip[Documentation Best Practices]
We use Python docstrings to describe the tool and its parameters. It is important to document the tool and the parameters so that the model can use the tool correctly. It is recommended to experiment with different docstrings to see how they affect the model's behavior.
:::
Once defined, simply pass the tool to the agent config. `Agent` will take care of the rest (calling the model with the tool definition, executing the tool, and returning the result to the model for the next iteration).
```python
# Example agent config with client provided tools
agent = Agent(client, ..., tools=[my_tool])
```
Refer to [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/) for an example of how to use client provided tools.
## Tool Invocation
Tools can be invoked using the `invoke_tool` method:
```python
result = client.tool_runtime.invoke_tool(
    tool_name="web_search",
    kwargs={"query": "What is the capital of France?"},
)
```
The result contains:
- `content`: The tool's output
- `error_message`: Optional error message if the tool failed
- `error_code`: Optional error code if the tool failed
## Listing Available Tools
You can list all available tools or filter by tool group:
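A short sketch of listing tools with the Python client; the `toolgroup_id` filter parameter name is an assumption, so check the client reference for the exact signature:
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# All registered tools
for tool in client.tools.list():
    print(tool.identifier)

# Only the tools in a specific group (filter parameter name assumed)
for tool in client.tools.list(toolgroup_id="builtin::websearch"):
    print(tool.identifier)
```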
description: Understanding API stability levels and versioning in Llama Stack
sidebar_label: API Stability
sidebar_position: 4
---
# Llama Stack API Stability Leveling
In order to provide a stable experience in Llama Stack, the various APIs need different stability levels indicating the level of support, backward compatibility, and overall production readiness.
## Different Levels
### v1alpha
- Little to no expectation of support between versions
- Breaking changes are permitted
- Datatypes and parameters can break
- Routes can be added and removed
#### Graduation Criteria
- an API can graduate from `v1alpha` to `v1beta` if the team has identified the extent of the non-optional routes and the shape of their parameters/return types for the API, e.g. `/v1/openai/chat/completions`. Optional types can change.
- CRUD must stay stable once in `v1beta`. This is a commitment to backward compatibility, guaranteeing that most code you write against the v1beta version will not break during future updates. We may make additive changes (like adding a new, optional field to a response), but we will not make breaking changes (like renaming an existing "modelName" field to "name", changing an ID's data type from an integer to a string, or altering an endpoint URL).
- for OpenAI APIs, a comparison to the OpenAI spec for the specific API can be done to ensure completeness.
### v1beta
- API routes remain consistent between versions
- Parameters and return types are not guaranteed to remain stable between versions
- API, besides minor fixes and adjustments, should be _almost_ v1. Changes should not be drastic.
#### Graduation Criteria
- an API can graduate from `v1beta` to `v1` if the API surface and datatypes are complete as identified by the team. The parameters and return types that are mandatory for each route are stable. All aspects of graduating from `v1alpha` to `v1beta` apply as well.
- Optional parameters, routes, or parts of the return type can be added after graduating to `v1`
### v1 (stable)
- Considered stable
- Backwards compatible between Z-streams
- Y-stream breaking changes must go through the proper approval and announcement process.
- Datatypes for a route and its return types cannot change between Z-streams
- Y-stream datatype changes should be sparing, unless the changes are additional net-new parameters
- Must have proper conformance testing as outlined in https://github.com/llamastack/llama-stack/issues/3237
### v2+ (Major Versions)
Introducing a new major version like `/v2` is a significant and disruptive event that should be treated as a last resort. It is reserved for essential changes to a stable `/v1` API that are fundamentally backward-incompatible and cannot be implemented through additive, non-breaking changes or breaking changes across X/Y-Stream releases (x.y.z).
If a `/v2` version is deemed absolutely necessary, it must adhere to the following protocol to ensure a sane and predictable transition for users:
#### Lifecycle Progression
A new major version must follow the same stability lifecycle as `/v1`. It will be introduced as `/v2alpha`, mature to `/v2beta`, and finally become stable as `/v2`.
#### Coexistence:
The new `/v2` API must be introduced alongside the existing `/v1` API and run in parallel. It must not replace the `/v1` API immediately.
#### Deprecation Policy:
When a `/v2` API is introduced, a clear and generous deprecation policy for the `/v1` API must be published simultaneously. This policy must outline the timeline for the eventual removal of the `/v1` API, giving users ample time to migrate.
### Deprecated APIs
Deprecated APIs are those that are no longer actively maintained or supported. Deprecated APIs are marked with the flag `deprecated = True` in the OpenAPI spec. These APIs will be removed in a future release.
### API Stability vs. Provider Stability
The leveling introduced in this document relates to the stability of the API and not specifically the providers within the API.
Providers can iterate as much as they want on functionality as long as they work within the bounds of an API. If they need to change the API, then the API should not be `/v1`, or those breaking changes can only happen on a y-stream release basis.
### Approval and Announcement Process for Breaking Changes
- **PR Labeling**: Any pull request that introduces a breaking API change must be clearly labeled with `breaking-change`.
- **PR Title/Commit**: Any pull request that introduces a breaking API change must contain `BREAKING CHANGE` in the title and commit footer. Alternatively, the commit can include `!`, e.g. `feat(api)!: title goes here`. This is outlined in the [conventional commits documentation](https://www.conventionalcommits.org/en/v1.0.0/#specification).
- **Maintainer Review**: At least one maintainer must explicitly acknowledge the breaking change during review by applying the `breaking-change` label. An approval must come with this label or with an acknowledgement that the label has already been applied.
- **Announcement**: Breaking changes require inclusion in release notes and, if applicable, a separate communication (e.g., Discord, GitHub Issues, or GitHub Discussions) prior to release.
If a PR has the proper approvals, labels, and commit/title hygiene, any failing API conformance tests will be bypassed.
## Enforcement
### Migration of API routes under `/v1alpha`, `/v1beta`, and `/v1`
Instead of placing every API under `/v1`, any API that is not fully stable or complete should go under `/v1alpha` or `/v1beta`. For example, at the time of this writing, `post_training` belongs here, as well as any OpenAI-compatible API whose surface does not exactly match the upstream OpenAI API it mimics.
This migration is crucial as we get Llama Stack in the hands of users who intend to productize various APIs. A clear view of what is stable and what is actively being developed will enable users to pick and choose various APIs to build their products on.
This migration will be a breaking change for any API moving out of `/v1`. Ideally, this should happen before 0.3.0 and especially 1.0.0.
### `x-stability` tags in the OpenAPI spec for oasdiff
`x-stability` tags allow tools like oasdiff to enforce different rules for different stability levels; the tag on each route should match the stability level implied by the route's version prefix. See [oasdiff stability](https://github.com/oasdiff/oasdiff/blob/main/docs/STABILITY.md).
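As an illustration, an operation in the generated OpenAPI spec might carry a stability annotation like the sketch below. The paths are placeholders, and the exact extension key and allowed values are defined by oasdiff (assumed here to be `x-stability-level` with levels such as `alpha`, `beta`, and `stable`); check the linked docs before relying on them.
```yaml
# Sketch only: placeholder paths; the extension key/values assume oasdiff's
# x-stability-level convention (draft/alpha/beta/stable).
paths:
  /v1/example-stable-route:
    get:
      x-stability-level: stable
  /v1alpha/example-alpha-route:
    get:
      x-stability-level: alpha
```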
### Testing
The testing of each stable API is already outlined in [issue #3237](https://github.com/llamastack/llama-stack/issues/3237) and is being worked on. These sorts of conformance tests should apply primarily to `/v1` APIs only, with `/v1alpha` and `/v1beta` having any tests the maintainers see fit as well as basic testing to ensure the routing works properly.
### New APIs going forward
Any newly introduced API should start as `/v1alpha`.
---
description: Understanding remote vs inline provider implementations
sidebar_label: API Providers
sidebar_position: 2
---
# API Providers
The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations for the same API. Providers come in two flavors:
- **Remote**: the provider runs as a separate service external to the Llama Stack codebase. Llama Stack contains a small amount of adapter code.
- **Inline**: the provider is fully specified and implemented within the Llama Stack codebase. It may be a simple wrapper around an existing library, or a full-fledged implementation within Llama Stack.
Most importantly, Llama Stack always strives to provide at least one fully inline provider for each API so you can iterate on a fully featured environment locally.
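For a concrete sense of the difference, here is a hedged sketch of how a remote and an inline provider might appear in a distribution's run configuration; the provider ids, types, and config fields are illustrative and vary by distribution and release.
```yaml
# Illustrative only; actual provider types and config fields depend on your
# installed distribution and Llama Stack version.
providers:
  inference:
    - provider_id: ollama                     # remote: a thin adapter calling an external service
      provider_type: remote::ollama
      config:
        url: http://localhost:11434
    - provider_id: sentence-transformers      # inline: runs inside the Llama Stack process
      provider_type: inline::sentence-transformers
      config: {}
```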
---
description: Understanding external APIs in Llama Stack
sidebar_label: External APIs
sidebar_position: 3
---
# External APIs
Llama Stack supports external APIs that live outside of the main codebase. This allows you to:
- Create and maintain your own APIs independently
- Share APIs with others without contributing to the main codebase
- Keep API-specific code separate from the core Llama Stack code
## Configuration
To enable external APIs, you need to configure the `external_apis_dir` in your Llama Stack configuration. This directory should contain your external API specifications:
```yaml
external_apis_dir: ~/.llama/apis.d/
```
## Directory Structure
The external APIs directory should follow this structure:
```
apis.d/
custom_api1.yaml
custom_api2.yaml
```
Each YAML file in these directories defines an API specification.
## API Specification
Here's an example of an external API specification for a weather API:
```yaml
module: weather
api_dependencies:
- inference
protocol: WeatherAPI
name: weather
pip_packages:
- llama-stack-api-weather
```
### API Specification Fields
- `module`: Python module containing the API implementation
- `protocol`: Name of the protocol class for the API
- `name`: Name of the API
- `pip_packages`: List of pip packages to install the API, typically a single package
## Required Implementation
External APIs must expose an `available_providers()` function in their module that returns a list of provider names:
```python
# llama_stack_api_weather/api.py
from llama_stack_api import Api, InlineProviderSpec, ProviderSpec
```
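The snippet above is truncated in this diff. As a rough sketch only (the provider id, package, and `ProviderSpec` field values below are hypothetical, and the return type is assumed to be a list of provider specs based on the imports rather than bare names), such a module might look like:
```python
# llama_stack_api_weather/api.py -- hedged sketch, not the actual file contents.
from llama_stack_api import Api, InlineProviderSpec, ProviderSpec


def available_providers() -> list[ProviderSpec]:
    # Hypothetical inline provider for the external "weather" API; field values
    # are placeholders -- consult llama_stack_api for the exact ProviderSpec fields.
    return [
        InlineProviderSpec(
            api=Api.weather,
            provider_type="inline::kaze",
            pip_packages=["llama-stack-provider-kaze"],
            module="llama_stack_provider_kaze",
            config_class="llama_stack_provider_kaze.KazeProviderConfig",
        ),
    ]
```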
---
description: Available REST APIs and planned capabilities in Llama Stack
sidebar_label: APIs
sidebar_position: 1
---
# APIs
A Llama Stack API is described as a collection of REST endpoints following OpenAI API standards. We currently support the following APIs:
- **Inference**: run inference with an LLM
- **Safety**: apply safety policies to the output at a system (not only model) level
- **Agents**: run multi-step agentic workflows with LLMs, including tool usage, memory (RAG), etc.
- **DatasetIO**: interface with datasets and data loaders
- **Scoring**: evaluate outputs of the system
- **Eval**: generate outputs (via Inference or Agents) and perform scoring
- **VectorIO**: perform operations on vector stores, such as adding documents, searching, and deleting documents
- **Files**: manage file uploads, storage, and retrieval
- **Post Training**: fine-tune a model
- **Tool Runtime**: interact with various tools and protocols
- **Responses**: generate responses from an LLM
We are working on adding a few more APIs to complete the application lifecycle. These will include:
- **Batch Inference**: run inference on a dataset of inputs
- **Batch Agents**: run agents on a dataset of inputs
- **Batches**: OpenAI-compatible batch management for inference
## OpenAI API Compatibility
We are working on adding OpenAI API compatibility to Llama Stack. This will allow you to use Llama Stack with OpenAI API clients and tools.
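For example, once a Llama Stack server is running, an off-the-shelf OpenAI client can be pointed at it. This is a hedged sketch: the base URL prefix, port (8321 is the usual default), and model id below are assumptions that depend on your server configuration.
```python
from openai import OpenAI

# Assumptions: a local Llama Stack server on the default port 8321 exposing its
# OpenAI-compatible routes under /v1; adjust base_url, api_key, and model to your setup.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

response = client.chat.completions.create(
    model="llama3.2:3b",  # hypothetical model id; use one registered with your server
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```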
### File Operations and Vector Store Integration
The Files API and Vector Store APIs work together through file operations, enabling automatic document processing and search. This integration implements the [OpenAI Vector Store Files API specification](https://platform.openai.com/docs/api-reference/vector-stores-files) and allows you to:
- Upload documents through the Files API
- Automatically process and chunk documents into searchable vectors
- Store processed content in vector databases based on the availability of [our providers](../../providers/index.mdx)
- Search through documents using natural language queries
For detailed information about this integration, see [File Operations and Vector Store Integration](../file_operations_vector_stores.md).
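A hedged sketch of that flow using an OpenAI-compatible client is shown below; the base URL, port, and the availability of the vector store search route depend on your server and client versions, so treat the exact calls as assumptions to verify against the linked specification.
```python
from io import BytesIO

from openai import OpenAI  # assumes a recent openai client with vector store support

# Assumptions: a Llama Stack server on localhost:8321 exposing OpenAI-compatible
# Files and Vector Store routes under /v1.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

# 1. Upload a document through the Files API.
uploaded = client.files.create(
    file=("notes.txt", BytesIO(b"Llama Stack integrates Files and Vector Stores.")),
    purpose="assistants",
)

# 2. Create a vector store and attach the file; the server chunks and embeds it.
store = client.vector_stores.create(name="docs")
client.vector_stores.files.create(vector_store_id=store.id, file_id=uploaded.id)

# 3. Search the processed content with a natural-language query.
# (In practice, wait for the file to finish processing before searching.)
results = client.vector_stores.search(vector_store_id=store.id, query="What does Llama Stack integrate?")
for hit in results.data:
    print(hit)
```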
Llama Stack addresses these challenges through a service-oriented, API-first approach:
**Develop Anywhere, Deploy Everywhere**
- Start locally with CPU-only setups
- Move to GPU acceleration when needed
- Deploy to cloud or edge without code changes
- Same APIs and developer experience everywhere
**Production-Ready Building Blocks**
- Pre-built safety guardrails and content filtering
- Built-in RAG and agent capabilities
- Comprehensive evaluation toolkit
- Full observability and monitoring
**True Provider Independence**
- Swap providers without application changes
- Mix and match best-in-class implementations
- Federation and fallback support
- No vendor lock-in
**Robust Ecosystem**
- Llama Stack is already integrated with distribution partners (cloud providers, hardware vendors, and AI-focused companies).
- Ecosystem offers tailored infrastructure, software, and services for deploying a variety of models.
## Our Philosophy
- **Service-Oriented**: REST APIs enforce clean interfaces and enable seamless transitions across different environments.
- **Composability**: Every component is independent but works together seamlessly
- **Production Ready**: Built for real-world applications, not just demos
- **Turnkey Solutions**: Easy-to-deploy, built-in solutions for popular deployment scenarios
With Llama Stack, you can focus on building your application while we handle the infrastructure complexity, essential capabilities, and provider integrations.