Mirror of https://github.com/meta-llama/llama-stack.git
Synced 2025-10-12 13:57:57 +00:00 | 2886 commits
82cbcada39 | chore(ui-deps): bump lucide-react from 0.542.0 to 0.545.0 in /llama_stack/ui (#3788)
Bumps [lucide-react](https://github.com/lucide-icons/lucide/tree/HEAD/packages/lucide-react) from 0.542.0 to 0.545.0.

Release notes (sourced from [lucide-react's releases](https://github.com/lucide-icons/lucide/releases)):

**Version 0.545.0**

- fix(icons): changed `flame` icon by @jamiemlaw in lucide-icons/lucide#3600
- fix(icons): arcified `square-m` icon by @jguddas in lucide-icons/lucide#3549
- chore(deps-dev): bump vite from 6.3.5 to 6.3.6 by @dependabot[bot] in lucide-icons/lucide#3611
- fix(icons): changed `combine` icon by @jguddas in lucide-icons/lucide#3200
- fix(icons): changed `building-2` icon by @karsa-mistmere in lucide-icons/lucide#3509
- chore(deps): bump devalue from 5.1.1 to 5.3.2 by @dependabot[bot] in lucide-icons/lucide#3638
- feat(icons): add `motorbike` icon by @jamiemlaw in lucide-icons/lucide#3371

Full Changelog: https://github.com/lucide-icons/lucide/compare/0.544.0...0.545.0

**Version 0.544.0**

- docs: update lucide-static documentation about raw string imports by @pascalduez in lucide-icons/lucide#3524
- feat(icons): added `ev-charger` icon by @UsamaKhan in lucide-icons/lucide#2781

New contributors: @pascalduez made their first contribution in lucide-icons/lucide#3524

Full Changelog: https://github.com/lucide-icons/lucide/compare/0.543.0...0.544.0

**Version 0.543.0**

- feat(preview-comment): put x-ray at top if there are more than 7 changed icons, to prevent them from being cut off, by @jguddas in lucide-icons/lucide#3589
- fix(icons): changed `church` icon by @karsa-mistmere in lucide-icons/lucide#2971
- chore(metadata): added tags to `messages-square` by @jamiemlaw in lucide-icons/lucide#3529
- fix(icons): optimise `bug` icons by @jamiemlaw in lucide-icons/lucide#3574
- fix(icons): changed list/text & derived icons by @karsa-mistmere in lucide-icons/lucide#3568
- fix(icons): changed `panel-top-bottom-dashed` icon by @jguddas in lucide-icons/lucide#3584
- fix(icons): changed `message-square-quote` icon by @jguddas in lucide-icons/lucide#3550
- fix(meta): added tag to `ship` metadata by @jguddas in lucide-icons/lucide#3559
- fix(meta): add tags to `id-card-lanyard` metadata by @jguddas in lucide-icons/lucide#3534
- fix(icons): changed `calendar-cog` icon by @jguddas in lucide-icons/lucide#3583
- chore(deps): bump astro from 5.5.2 to 5.13.2 by @dependabot[bot] in lucide-icons/lucide#3564
- feat(packages): add new package for flutter by @vqh2602 in lucide-icons/lucide#3536
- feat(icons): added `house-heart` icon by @danielbayley in lucide-icons/lucide#3239

Full Changelog: https://github.com/lucide-icons/lucide/compare/0.542.0...0.543.0

... (truncated)
e94840d298 | chore(ui-deps): bump framer-motion from 12.23.12 to 12.23.24 in /llama_stack/ui (#3792)
Bumps [framer-motion](https://github.com/motiondivision/motion) from 12.23.12 to 12.23.24.

Changelog (sourced from [framer-motion's changelog](https://github.com/motiondivision/motion/blob/main/CHANGELOG.md)):

- **[12.23.24] 2025-10-10** Fixed: ensure that when a component remounts, it continues to fire animations even when `initial={false}`.
- **[12.23.23] 2025-10-10** Added: exporting `PresenceChild` and `PopChild` types for internal use.
- **[12.23.22] 2025-09-25** Added: exporting `HTMLElements` and `useComposedRefs` types for internal use.
- **[12.23.21] 2025-09-24** Fixed: main-thread `scroll` with animations that contain `delay`.
- **[12.23.20] 2025-09-24** Fixed: suppress non-animatable value warning for instant animations.
- **[12.23.19] 2025-09-23** Fixed: remove support for changing `ref` prop.
- **[12.23.18] 2025-09-19** Fixed: `<motion />` components now support changing `ref` prop.
- **[12.23.17] 2025-09-19** Fixed: ensure `animate()` `onComplete` only fires once, when all values are complete.
- **[12.23.16] 2025-09-19** ... (truncated)
25ea94fcf7 | chore(ui-deps): bump eslint from 9.26.0 to 9.37.0 in /llama_stack/ui (#3791)
Bumps [eslint](https://github.com/eslint/eslint) from 9.26.0 to 9.37.0.

Release notes (sourced from [eslint's releases](https://github.com/eslint/eslint/releases)): v9.37.0, Features ... (truncated)
190b96ea62 | chore(ui-deps): bump @types/react-dom from 19.2.0 to 19.2.1 in /llama_stack/ui (#3789)
Bumps [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom) from 19.2.0 to 19.2.1. See the full diff in the [compare view](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom).
4fb39f0a6a | chore(ui-deps): bump @types/react from 19.2.0 to 19.2.2 in /llama_stack/ui (#3790)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 19.2.0 to 19.2.2. See the full diff in the [compare view](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react).
cfd2e303db | chore(python-deps): bump black from 25.1.0 to 25.9.0 (#3783)
Bumps [black](https://github.com/psf/black) from 25.1.0 to 25.9.0.

Release notes for 25.9.0 (sourced from [black's releases](https://github.com/psf/black/releases); the PR quotes the same items again from CHANGES.md):

Highlights
- Remove support for pre-Python 3.7 `await`/`async` as soft keywords/variable names (#4676)

Stable style
- Fix crash while formatting a long `del` statement containing tuples (#4628)
- Fix crash while formatting expressions using the walrus operator in complex `with` statements (#4630)
- Handle `# fmt: skip` followed by a comment at the end of file (#4635)
- Fix crash when a tuple appears in the `as` clause of a `with` statement (#4634)
- Fix crash when a tuple is used as a context manager inside a `with` statement (#4646)
- Fix crash when formatting a `\` followed by a `\r` followed by a comment (#4663)
- Fix crash on a `\\r\n` (#4673)
- Fix crash on `await ...` (where `...` is a literal `Ellipsis`) (#4676)
- Fix crash on parenthesized expression inside a type parameter bound (#4684)
- Fix crash when using line ranges excluding indented single-line decorated items (#4670)

Preview style
- Fix a bug where one-liner functions/conditionals marked with `# fmt: skip` would still be formatted (#4552)
- Improve `multiline_string_handling` with ternaries and dictionaries (#4657)
- Fix a bug where `string_processing` would not split f-strings directly after expressions (#4680)
- Wrap the `in` clause of comprehensions across lines if necessary (#4699)
- Remove parentheses around multiple exception types in `except` and `except*` without `as` (#4720)
- Add `\r` style newlines to the potential newlines to normalize file newlines both from and to (#4710)

Parser
- Rewrite tokenizer to improve performance and compliance (#4536)
- Fix bug where certain unusual expressions (e.g., lambdas) were not accepted in type parameter bounds and defaults (#4602)

Performance
- Avoid using an extra process when running with only one worker (#4734)

Integrations
- Fix the version check in the vim file to reject Python 3.8 (#4567)
- Enhance GitHub Action `psf/black` to read Black version from an additional section in pyproject.toml: `[project.dependency-groups]` (#4606)
- Build gallery docker image with python3-slim and reduce image size (#4686)

... (truncated)
055a7664f0 | chore(python-deps): bump blobfile from 3.0.0 to 3.1.0 (#3784)
Bumps [blobfile](https://github.com/christopher-hesse/blobfile) from 3.0.0 to 3.1.0.

Changelog for 3.1.0 (sourced from [blobfile's CHANGES.md](https://github.com/blobfile/blobfile/blob/master/CHANGES.md)):
- Improve `bf.join`
- Add option to support blind writes
- Treat `EAI_NODATA` similarly to `EAI_NONAME` in DNS retry logic
13518e7562 | chore(python-deps): bump ollama from 0.5.1 to 0.6.0 (#3786)
Bumps [ollama](https://github.com/ollama/ollama-python) from 0.5.1 to 0.6.0.

Release notes (sourced from [ollama's releases](https://github.com/ollama/ollama-python/releases)):

**v0.6.0**

- client: add web search and web crawl capabilities by @ParthSareen in ollama/ollama-python#578
- client: load OLLAMA_API_KEY on init by @ParthSareen in ollama/ollama-python#583
- client/types: update web search and fetch API by @npardal in ollama/ollama-python#584
- examples: add mcp server for web_search web_crawl by @ParthSareen in ollama/ollama-python#585
- examples: gpt oss browser tool by @ParthSareen in ollama/ollama-python#588

New contributors: @npardal made their first contribution in ollama/ollama-python#584

Full Changelog: https://github.com/ollama/ollama-python/compare/v0.5.4...v0.6.0

**v0.5.4**

- examples: add gpt-oss browser example by @ParthSareen in ollama/ollama-python#558
- build(deps): bump actions/checkout from 4 to 5 by @dependabot[bot] in ollama/ollama-python#559
- examples/gpt-oss: fix examples by @ParthSareen in ollama/ollama-python#566
- fix link for thinking-levels.py in documentation by @btjanaka in ollama/ollama-python#567
- examples: fix gpt-oss-tools-stream for adding tool calls by @ParthSareen in ollama/ollama-python#568
- examples: resolve invalid tool usage (status code 400) when the LLM makes a mistake, gpt-oss, by @MarkWard0110 in ollama/ollama-python#569
- build(deps): bump actions/setup-python from 5 to 6 by @dependabot[bot] in ollama/ollama-python#571
- feat: add dimensions to embed request by @mxyng in ollama/ollama-python#574

New contributors: @btjanaka (ollama/ollama-python#567), @MarkWard0110 (ollama/ollama-python#569)

Full Changelog: https://github.com/ollama/ollama-python/compare/v0.5.3...v0.5.4

**v0.5.3**

- add support for 'high'/'medium'/'low' think values by @drifkin in ollama/ollama-python#553

Full Changelog: https://github.com/ollama/ollama-python/compare/v0.5.2...v0.5.3

**v0.5.2**

- types/examples: add tool_name to message and examples by @ParthSareen in ollama/ollama-python#537
- types: add `context_length` to ProcessResponse by @ParthSareen in ollama/ollama-python#538
- types: relax type for tools by @ParthSareen in ollama/ollama-python#550
- add license metadata to package by @ViViDboarder in ollama/ollama-python#526

New contributors: @hwittenborn (ollama/ollama-python#525), @ViViDboarder (ollama/ollama-python#526)

... (truncated)
e6378872c7 | fix(misc): pre-commit fix for server.py
7c63aebd64 | feat(responses)!: add reasoning and annotation added events (#3793)
Implements missing streaming events from the OpenAI Responses API spec:

- reasoning text/summary events for o1/o3 models
- refusal events for safety moderation
- annotation events for citations
- file search streaming events

Added an optional reasoning_content field to chat completion chunks to support non-standard provider extensions.

**NOTE:** OpenAI does _not_ fill reasoning_content when users use the chat_completion APIs. This means there is no way for us to implement Responses (with reasoning) by using OpenAI chat completions! We'd need to transparently punt to OpenAI's Responses endpoints if we wish to do that. For others, though (vLLM, etc.), we can use it.

## Test Plan
File search streaming test passes:
```
./scripts/integration-tests.sh --stack-config server:ci-tests \
   --suite responses --setup gpt --inference-mode replay --pattern test_response_file_search_streaming_events
```
The reasoning tests need a more complex setup and validation (a vLLM-powered OSS model, maybe gpt-oss, that can return reasoning_content). I will do that in a follow-up PR.
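For illustration, a minimal sketch of reading the new field off streamed chat-completion chunks; the base_url and model name are assumptions for a local Llama Stack plus vLLM setup, and `reasoning_content` only appears for providers that fill this non-standard extension:

```python
# Sketch: consume a chat-completion stream and surface the non-standard
# reasoning_content extension alongside regular content deltas.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")

stream = client.chat.completions.create(
    model="vllm/Qwen/Qwen3-0.6B",  # hypothetical vLLM-served model
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # reasoning_content is a provider extension; OpenAI itself never fills it.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(f"[reasoning] {reasoning}", end="")
    if delta.content:
        print(delta.content, end="")
```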
f365961731 | fix(tests): handle TEST_CONTEXT not being set
dac1d7be1c | chore(python-deps): bump fire from 0.7.0 to 0.7.1 (#3787)
Bumps [fire](https://github.com/google/python-fire) from 0.7.0 to 0.7.1.

Release notes for Python Fire v0.7.1 (sourced from [fire's releases](https://github.com/google/python-fire/releases)):

- Use Neutral theme for IPython Inspector, supporting newer IPython versions (google/python-fire#588)
- Call inspectutils.GetClassAttrsDict on component, not None (google/python-fire#606)
- Move to pyproject.toml, adding wheel support in PyPI
- Use ty in place of pytype
- Update requirements (@dependabot[bot])

Full Changelog: https://github.com/google/python-fire/compare/v0.7.0...v0.7.1
2cb1b19efe | chore(python-deps): bump psycopg2-binary from 2.9.10 to 2.9.11 (#3785)
Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.10 to 2.9.11.

Changelog (sourced from [psycopg2-binary's NEWS file](https://github.com/psycopg/psycopg2/blob/master/NEWS)):

What's new in psycopg 2.9.11
- Add support for Python 3.14.
- Avoid a segfault passing more arguments than placeholders if Python is built with assertions enabled (#1791).
- `~psycopg2.errorcodes` map and `~psycopg2.errors` classes updated to PostgreSQL 18.
- Drop support for Python 3.8.

What's new in psycopg 2.9.10
- Add support for Python 3.13.
- Receive notifications on commit (#1728).
- `~psycopg2.errorcodes` map and `~psycopg2.errors` classes updated to PostgreSQL 17.
- Drop support for Python 3.7.

What's new in psycopg 2.9.9
- Add support for Python 3.12.
- Drop support for Python 3.6.

What's new in psycopg 2.9.8
- Wheel package bundled with PostgreSQL 16 libpq in order to add support for recent features, such as `sslcertmode`.

What's new in psycopg 2.9.7
- Fix propagation of exceptions raised during module initialization (#1598).
- Fix building when pg_config returns an empty string (#1599).
- Wheel package bundled with OpenSSL 1.1.1v.

... (truncated)
f15d865a3e | chore(github-deps): bump astral-sh/setup-uv from 6.8.0 to 7.0.0 (#3782)
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6.8.0 to 7.0.0.

Release notes for v7.0.0, "node24 and a lot of bugfixes" (sourced from [astral-sh/setup-uv's releases](https://github.com/astral-sh/setup-uv/releases)):

This release comes with a load of bug fixes and a speed-up. Because of switching from node20 to node24, it is also a breaking change. If you are running on GitHub-hosted runners this will just work; if you are using self-hosted runners, make sure they are up to date. If you followed the normal installation instructions, your self-hosted runner will keep itself updated.

This release also removes the deprecated input `server-url`, which was used to download uv releases from a different server. The [manifest-file](https://github.com/astral-sh/setup-uv?tab=readme-ov-file#manifest-file) input supersedes that functionality by adding a flexible way to define available versions and where they should be downloaded from.

Fixes
- The action now respects when the environment variable `UV_CACHE_DIR` is already set and does not overwrite it. It now also finds [cache-dir](https://docs.astral.sh/uv/reference/settings/#cache-dir) settings in config files if you set them.
- Some users encountered problems where [cache pruning](https://github.com/astral-sh/setup-uv?tab=readme-ov-file#disable-cache-pruning) took forever because they had `uv` processes running in the background. Starting with uv version `0.8.24`, this action uses `uv cache prune --ci --force` to ignore the running processes.
- If you just want to install uv but not have it available in PATH, this action now respects `UV_NO_MODIFY_PATH`.
- Some other actions also set the env var `UV_CACHE_DIR`. This action can now deal with that, but as this could lead to unwanted behavior in some edge cases, a warning is now displayed.

Improvements

If you are using a minimum version specifier for the version of uv to install, for example:

```toml
[tool.uv]
required-version = ">=0.8.17"
```

this action now detects that and directly uses the latest version. Previously it would download all available releases from the uv repo to determine the highest matching candidate for the version specifier, which took much more time. If you are using other specifiers like `0.8.x`, this action still needs to download all available releases, because the specifier defines an upper bound (not 0.9.0 or later) and "latest" would possibly not satisfy it.

🚨 Breaking changes
- Use node24 instead of node20 (@eifinger, #608)
- Remove deprecated input server-url (@eifinger, #607)

🐛 Bug fixes
- Respect UV_CACHE_DIR and cache-dir (@eifinger, #612)
- Use --force when pruning cache (@eifinger, #611)
- Respect UV_NO_MODIFY_PATH (@eifinger, #603)
- Warn when `UV_CACHE_DIR` has changed (@jamesbraza, #601)

🚀 Enhancements
- Shortcut to latest version for minimum version specifier (@eifinger, #598)

🧰 Maintenance
- Bump dependencies (@eifinger, #613)
- Fix test-uv-no-modify-path (@eifinger, #604)

... (truncated)
a165b8b5bb | chore!: BREAKING CHANGE removing VectorDB APIs (#3774)
# What does this PR do?
Removes VectorDBs from the API surface and our tests; moves the tests to Vector Stores.

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
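For orientation, a minimal sketch of the replacement surface: the OpenAI-compatible Vector Stores API rather than the removed VectorDB routes. The base_url, file name, and store name are assumptions, and it presumes a recent `openai` client where `vector_stores` is a top-level resource:

```python
# Sketch: create a vector store, attach a file, and search it via the
# OpenAI-compatible Vector Stores API (the surface that replaces VectorDBs).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")

store = client.vector_stores.create(name="docs")  # hypothetical store name

# Upload a file and attach it so it gets chunked and embedded.
uploaded = client.files.create(file=open("notes.txt", "rb"), purpose="assistants")
client.vector_stores.files.create(vector_store_id=store.id, file_id=uploaded.id)

# Query the store.
results = client.vector_stores.search(vector_store_id=store.id, query="llama")
for hit in results:
    print(hit)
```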
06e4cd8e02 | feat(api)!: BREAKING CHANGE: support passing extra_body through to providers (#3777)
# What does this PR do?
Allows passing `extra_body` parameters through to inference providers. With this, we moved the two vLLM-specific parameters from the completions API into `extra_body`.

Before/After screenshot: https://github.com/user-attachments/assets/acb27c08-c748-46c9-b1da-0de64e9908a1

Closes #2720

## Test Plan
CI, plus a new test:

```
❯ uv run pytest -s -v tests/integration/ --stack-config=server:starter --inference-mode=record \
    -k 'not( builtin_tool or safety_with_image or code_interpreter or test_rag ) and test_openai_completion_guided_choice' \
    --setup=vllm --suite=base --color=yes
INFO 2025-10-10 14:29:54,317 tests.integration.conftest:118 tests: Applying setup 'vllm' for suite base
INFO 2025-10-10 14:29:54,331 tests.integration.conftest:47 tests: Test stack config type: server (stack_config=server:starter)
platform darwin -- Python 3.12.11, pytest-8.4.2, pluggy-1.6.0
collected 285 items / 284 deselected / 1 selected

tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
Starting llama stack server with config 'starter' on port 8321...
Waiting for server at http://localhost:8321... (0.0s elapsed)
...
Server is ready at http://localhost:8321
llama_stack_client instantiated in 11.773s
PASSED
Terminating llama stack server process...
Server process and children terminated gracefully

slowest 10 durations:
11.88s setup    test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
 3.02s call     test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]
 0.01s teardown test_openai_completion_guided_choice[txt=vllm/Qwen/Qwen3-0.6B]

1 passed, 284 deselected, 3 warnings in 16.21s
```
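From the client side, the pass-through looks roughly like this; `guided_choice` is a vLLM extension, and the base_url and model name are illustrative:

```python
# Sketch: provider-specific parameters now travel in extra_body instead of
# being first-class completions arguments. guided_choice is a vLLM extension.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")

resp = client.completions.create(
    model="vllm/Qwen/Qwen3-0.6B",  # illustrative vLLM-served model
    prompt="Roses are red, violets are",
    extra_body={"guided_choice": ["blue", "purple"]},  # forwarded to the provider
)
print(resp.choices[0].text)
```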
80d58ab519 | chore: refactor (chat)completions endpoints to use shared params struct (#3761)
# What does this PR do?
Converts openai(_chat)_completions params to a pydantic BaseModel to reduce code duplication across all providers.

## Test Plan
CI

Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3761).
* #3777
* __->__ #3761
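The shape of the change, as a rough sketch with hypothetical names (the actual class and field set in the PR may differ): one pydantic model carries the parameter list that every provider method previously repeated.

```python
# Sketch: a single params model shared across providers. Field names follow
# the OpenAI chat-completions spec; the class name here is hypothetical.
from pydantic import BaseModel

class ChatCompletionRequestParams(BaseModel):
    model: str
    messages: list[dict]
    frequency_penalty: float | None = None
    max_tokens: int | None = None
    stream: bool | None = None
    temperature: float | None = None
    top_p: float | None = None

# Each provider method then takes one argument instead of a long kwarg list:
#   async def openai_chat_completion(self, params: ChatCompletionRequestParams): ...
params = ChatCompletionRequestParams(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "hi"}],
    temperature=0.2,
)
print(params.model_dump(exclude_none=True))
```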
6954fe2274 | fix(auth): allow unauthenticated access to health and version endpoints (#3736)
The AuthenticationMiddleware was blocking all requests without an Authorization header, including the health and version endpoints needed by monitoring tools, load balancers, and Kubernetes probes.

This commit allows endpoints ending in /health or /version to bypass authentication, enabling operational tooling to function properly without requiring credentials.

Closes: #3735

Signed-off-by: Derek Higgins <derekh@redhat.com>
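A minimal sketch of the bypass as ASGI middleware, illustrative rather than the exact Llama Stack code:

```python
# Sketch: requests whose path ends in /health or /version skip the
# Authorization check; everything else without credentials gets a 401.
class AuthenticationMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope.get("type") == "http" and not scope["path"].endswith(("/health", "/version")):
            header_names = {name.decode().lower() for name, _ in scope.get("headers", [])}
            if "authorization" not in header_names:
                await send({"type": "http.response.start", "status": 401, "headers": []})
                await send({"type": "http.response.body", "body": b"Authentication required"})
                return
        await self.app(scope, receive, send)
```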
32fde8d9a8 | feat: Add /v1/embeddings endpoint to batches API (#3384)
# What does this PR do?
This PR extends the Llama Stack Batches API to support the /v1/embeddings endpoint, enabling efficient batch processing of embedding requests alongside the existing /v1/chat/completions and /v1/completions support.

Closes: https://github.com/llamastack/llama-stack/issues/3145

## Test Plan

```
(stack-client) ➜ llama-stack git:(support/embeddings-api) conda activate stack-client && python -m pytest tests/unit/providers/batches/test_reference.py -v
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0
collected 46 items

tests/unit/providers/batches/test_reference.py::TestReferenceBatchesImpl::test_create_and_retrieve_batch_success PASSED [  2%]
tests/unit/providers/batches/test_reference.py::TestReferenceBatchesImpl::test_create_batch_without_metadata PASSED [  4%]
tests/unit/providers/batches/test_reference.py::TestReferenceBatchesImpl::test_create_batch_completion_window PASSED [  6%]
...
tests/unit/providers/batches/test_reference.py::TestReferenceBatchesImpl::test_max_concurrent_batches PASSED [ 97%]
tests/unit/providers/batches/test_reference.py::TestReferenceBatchesImpl::test_create_batch_embeddings_endpoint PASSED [100%]
```

(46 tests, all passing: batch creation, retrieval, cancellation, listing, kvstore persistence, and input validation for the chat completions, completions, and embeddings endpoints.)

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
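End to end, submitting an embeddings batch looks roughly like this; the JSONL line shape matches the request format the tests above validate (custom_id, method, url, body), while the model name, file name, and base_url are assumptions:

```python
# Sketch: write a one-request JSONL file, upload it, and create a batch
# against the newly supported /v1/embeddings endpoint.
import json

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")

with open("embeddings_batch.jsonl", "w") as f:
    f.write(json.dumps({
        "custom_id": "req-1",
        "method": "POST",
        "url": "/v1/embeddings",
        "body": {"model": "text-embedding-3-small", "input": "hello world"},
    }) + "\n")

batch_file = client.files.create(file=open("embeddings_batch.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)
print(batch.id, batch.status)
```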
||
|
1394403360
|
feat(responses): implement usage tracking in streaming responses (#3771)
Implements usage accumulation in StreamingResponseOrchestrator. The most important part was to pass `stream_options = { "include_usage": true }` to the chat_completion call. This means all responses tests will have to be re-recorded because the request hash will change :) Test changes: - Add usage assertions to streaming and non-streaming tests - Update test recordings with actual usage data from OpenAI |
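For context, a minimal client-side sketch of what `include_usage` does (the base URL and model id here are assumptions, not taken from the PR): when the option is set, the final streamed chunk carries a `usage` object while its `choices` list is empty.

```python
# Hedged sketch: reading accumulated usage from a streamed chat completion.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")  # assumed local stack

stream = client.chat.completions.create(
    model="openai/gpt-4o",  # hypothetical model id
    messages=[{"role": "user", "content": "Say hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

usage = None
for chunk in stream:
    if chunk.choices:  # content-bearing chunks
        print(chunk.choices[0].delta.content or "", end="", flush=True)
    if chunk.usage is not None:  # the final chunk carries the usage totals
        usage = chunk.usage

print()
if usage is not None:
    print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```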
||
|
e7d21e1ee3
|
feat: Add support for Conversations in Responses API (#3743)
# What does this PR do? This PR adds support for Conversations in Responses. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan Unit tests Integration tests <Details> <Summary>Manual testing with this script: (click to expand)</Summary> ```python from openai import OpenAI client = OpenAI() client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none") def test_conversation_create(): print("Testing conversation create...") conversation = client.conversations.create( metadata={"topic": "demo"}, items=[ {"type": "message", "role": "user", "content": "Hello!"} ] ) print(f"Created: {conversation}") return conversation def test_conversation_retrieve(conv_id): print(f"Testing conversation retrieve for {conv_id}...") retrieved = client.conversations.retrieve(conv_id) print(f"Retrieved: {retrieved}") return retrieved def test_conversation_update(conv_id): print(f"Testing conversation update for {conv_id}...") updated = client.conversations.update( conv_id, metadata={"topic": "project-x"} ) print(f"Updated: {updated}") return updated def test_conversation_delete(conv_id): print(f"Testing conversation delete for {conv_id}...") deleted = client.conversations.delete(conv_id) print(f"Deleted: {deleted}") return deleted def test_conversation_items_create(conv_id): print(f"Testing conversation items create for {conv_id}...") items = client.conversations.items.create( conv_id, items=[ { "type": "message", "role": "user", "content": [{"type": "input_text", "text": "Hello!"}] }, { "type": "message", "role": "user", "content": [{"type": "input_text", "text": "How are you?"}] } ] ) print(f"Items created: {items}") return items def test_conversation_items_list(conv_id): print(f"Testing conversation items list for {conv_id}...") items = client.conversations.items.list(conv_id, limit=10) print(f"Items list: {items}") return items def test_conversation_item_retrieve(conv_id, item_id): print(f"Testing conversation item retrieve for {conv_id}/{item_id}...") item = client.conversations.items.retrieve(conversation_id=conv_id, item_id=item_id) print(f"Item retrieved: {item}") return item def test_conversation_item_delete(conv_id, item_id): print(f"Testing conversation item delete for {conv_id}/{item_id}...") deleted = client.conversations.items.delete(conversation_id=conv_id, item_id=item_id) print(f"Item deleted: {deleted}") return deleted def test_conversation_responses_create(): print("\nTesting conversation create for a responses example...") conversation = client.conversations.create() print(f"Created: {conversation}") response = client.responses.create( model="gpt-4.1", input=[{"role": "user", "content": "What are the 5 Ds of dodgeball?"}], conversation=conversation.id, ) print(f"Created response: {response} for conversation {conversation.id}") return response, conversation def test_conversations_responses_create_followup( conversation, content="Repeat what you just said but add 'this is my second time saying this'", ): print(f"Using: {conversation.id}") response = client.responses.create( model="gpt-4.1", input=[{"role": "user", "content": content}], conversation=conversation.id, ) print(f"Created response: {response} for conversation {conversation.id}") conv_items = client.conversations.items.list(conversation.id) print(f"\nRetrieving list of items for conversation {conversation.id}:") print(conv_items.model_dump_json(indent=2)) def test_response_with_fake_conv_id(): fake_conv_id = "conv_zzzzzzzzz5dc81908289d62779d2ac510a2b0b602ef00a44" 
print(f"Using {fake_conv_id}") try: response = client.responses.create( model="gpt-4.1", input=[{"role": "user", "content": "say hello"}], conversation=fake_conv_id, ) print(f"Created response: {response} for conversation {fake_conv_id}") except Exception as e: print(f"failed to create response for conversation {fake_conv_id} with error {e}") def main(): print("Testing OpenAI Conversations API...") # Create conversation conversation = test_conversation_create() conv_id = conversation.id # Retrieve conversation test_conversation_retrieve(conv_id) # Update conversation test_conversation_update(conv_id) # Create items items = test_conversation_items_create(conv_id) # List items items_list = test_conversation_items_list(conv_id) # Retrieve specific item if items_list.data: item_id = items_list.data[0].id test_conversation_item_retrieve(conv_id, item_id) # Delete item test_conversation_item_delete(conv_id, item_id) # Delete conversation test_conversation_delete(conv_id) response, conversation2 = test_conversation_responses_create() print('\ntesting reseponse retrieval') test_conversation_retrieve(conversation2.id) print('\ntesting responses follow up') test_conversations_responses_create_followup(conversation2) print('\ntesting responses follow up x2!') test_conversations_responses_create_followup( conversation2, content="Repeat what you just said but add 'this is my third time saying this'", ) test_response_with_fake_conv_id() print("All tests completed!") if __name__ == "__main__": main() ``` </Details> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> |
||
|
932fea813a
|
fix(ci): remove responses from CI for now (#3773)
There are many changes to responses which are landing. They are introducing fundamental new types. This means re-recordings even from the inference calls. Let's avoid that for now. Once everything lands I will re-record everything, make things pass and re-enable. |
||
|
548ccff368
|
fix(mypy): fix wrong attribute access (#3770) | ||
|
8bf07f91cb
|
feat: reuse previous mcp tool listings where possible (#3710)
# What does this PR do? This PR checks whether, if a previous response is linked, there are mcp_list_tools objects that can be reused instead of listing the tools explicitly every time. Closes #3106 ## Test Plan Tested manually. Added unit tests to cover new behaviour. --------- Signed-off-by: Gordon Sim <gsim@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com> |
||
|
0066d986c5
|
feat: use SecretStr for inference provider auth credentials (#3724)
# What does this PR do? use SecretStr for OpenAIMixin providers - RemoteInferenceProviderConfig now has auth_credential: SecretStr - the default alias is api_key (most common name) - some providers override to use api_token (RunPod, vLLM, Databricks) - some providers exclude it (Ollama, TGI, Vertex AI) addresses #3517 ## Test Plan ci w/ new tests |
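A minimal sketch of the config pattern described, assuming pydantic (the field name and `api_key` alias follow the PR text; the class body is illustrative, not the actual provider config):

```python
from pydantic import BaseModel, Field, SecretStr

class RemoteInferenceProviderConfig(BaseModel):
    # SecretStr keeps the key out of reprs and logs ('**********' instead).
    auth_credential: SecretStr | None = Field(default=None, alias="api_key")

cfg = RemoteInferenceProviderConfig(api_key="sk-test")
print(cfg.auth_credential)                     # **********
print(cfg.auth_credential.get_secret_value())  # the real key, only where needed
```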
||
|
6d8f61206e
|
fix: update normalize to search all recordings dirs (#3767)
Updated scripts/normalize_recordings.py to dynamically find and process all 'recordings' directories under tests/ using pathlib.rglob() instead of hardcoding a single path. Signed-off-by: Derek Higgins <derekh@redhat.com> |
||
|
e039b61d26
|
feat(responses)!: add in_progress, failed, content part events (#3765)
## Summary - add schema + runtime support for response.in_progress / response.failed / response.incomplete - stream content parts with proper indexes and reasoning slots - align tests + docs with the richer event payloads ## Testing - uv run pytest tests/unit/providers/agents/meta_reference/test_openai_responses.py::test_create_openai_response_with_string_input - uv run pytest tests/unit/providers/agents/meta_reference/test_response_conversion_utils.py |
||
|
a548169b99
|
fix: allow skipping model availability check for vLLM (#3739)
# What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> Allows model check to fail gracefully instead of crashing on startup. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> set VLLM_URL to your VLLM server ``` (base) akram@Mac llama-stack % LAMA_STACK_LOGGING="all=debug" VLLM_ENABLE_MODEL_DISCOVERY=false MILVUS_DB_PATH=./milvus.db INFERENCE_MODEL=vllm uv run --with llama-stack llama stack build --distro starter --image-type venv --run ``` ``` INFO 2025-10-08 20:11:24,637 llama_stack.providers.utils.inference.inference_store:74 inference: Write queue disabled for SQLite to avoid concurrency issues INFO 2025-10-08 20:11:24,866 llama_stack.providers.utils.responses.responses_store:96 openai_responses: Write queue disabled for SQLite to avoid concurrency issues ERROR 2025-10-08 20:11:26,160 llama_stack.providers.utils.inference.openai_mixin:439 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: <a href="https://oauth.akram.a1ey.p3.openshiftapps.com:443/oauth/authorize?approval_prompt=force&client_id=system%3Aserviceaccount%3Arhoai-30-genai%3Adefault&redirect_uri=ht tps%3A%2F%2Fvllm-rhoai-30-genai.apps.rosa.akram.a1ey.p3.openshiftapps.com%2Foauth%2Fcallback&response_type=code&scope=user%3Ainfo+user%3Acheck-access&state=9fba207425 5851c718aca717a5887d76%3A%2Fmodels">Found</a>. [...] INFO 2025-10-08 20:11:26,295 uvicorn.error:84 uncategorized: Started server process [83144] INFO 2025-10-08 20:11:26,296 uvicorn.error:48 uncategorized: Waiting for application startup. INFO 2025-10-08 20:11:26,297 llama_stack.core.server.server:170 core::server: Starting up INFO 2025-10-08 20:11:26,297 llama_stack.core.stack:399 core: starting registry refresh task INFO 2025-10-08 20:11:26,311 uvicorn.error:62 uncategorized: Application startup complete. INFO 2025-10-08 20:11:26,312 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit) ERROR 2025-10-08 20:11:26,791 llama_stack.providers.utils.inference.openai_mixin:439 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: <a href="https://oauth.akram.a1ey.p3.openshiftapps.com:443/oauth/authorize?approval_prompt=force&client_id=system%3Aserviceaccount%3Arhoai-30-genai%3Adefault&redirect_uri=ht tps%3A%2F%2Fvllm-rhoai-30-genai.apps.rosa.akram.a1ey.p3.openshiftapps.com%2Foauth%2Fcallback&response_type=code&scope=user%3Ainfo+user%3Acheck-access&state=8ef0cba3e1 71a4f8b04cb445cfb91a4c%3A%2Fmodels">Found</a>. ``` |
||
|
aaf5036235
|
feat(responses): add usage types to inference and responses APIs (#3764)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 2s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (push) Failing after 6s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 6s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Python Package Build Test / build (3.13) (push) Failing after 23s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 27s
API Conformance Tests / check-schema-compatibility (push) Successful in 36s
UI Tests / ui-tests (22) (push) Successful in 55s
Pre-commit / pre-commit (push) Successful in 2m7s
## Summary Adds OpenAI-compatible usage tracking types to enable reporting token consumption for both streaming and non-streaming responses. ## Type Definitions **Chat Completion Usage** (inference API): ```python class OpenAIChatCompletionUsage(BaseModel): prompt_tokens: int completion_tokens: int total_tokens: int prompt_tokens_details: OpenAIChatCompletionUsagePromptTokensDetails | None completion_tokens_details: OpenAIChatCompletionUsageCompletionTokensDetails | None ``` **Response Usage** (responses API): ```python class OpenAIResponseUsage(BaseModel): input_tokens: int output_tokens: int total_tokens: int input_tokens_details: OpenAIResponseUsageInputTokensDetails | None output_tokens_details: OpenAIResponseUsageOutputTokensDetails | None ``` This matches OpenAI's usage reporting format and enables PR #3766 to implement usage tracking in streaming responses. Co-authored-by: Claude <noreply@anthropic.com> |
||
|
ebae0385bb
|
fix: update dangling references to llama download command (#3763)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Test Llama Stack Build / build-single-provider (push) Failing after 3s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 1s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Vector IO Integration Tests / test-matrix (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 3s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Test Llama Stack Build / build (push) Failing after 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Unit Tests / unit-tests (3.13) (push) Failing after 3s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
UI Tests / ui-tests (22) (push) Successful in 40s
Pre-commit / pre-commit (push) Successful in 2m14s
## Summary After removing model management CLI in #3700, this PR updates remaining references to the old `llama download` command to use `huggingface-cli download` instead. ## Changes - Updated error messages in `meta_reference/common.py` to recommend `huggingface-cli download` - Updated error messages in `torchtune/recipes/lora_finetuning_single_device.py` to use `huggingface-cli download` - Updated post-training notebook to use `huggingface-cli download` instead of `llama download` - Fixed typo: "you model" -> "your model" ## Test Plan - Verified error messages provide correct guidance for users - Checked that notebook instructions are up-to-date with current tooling |
||
|
8fe4a216b5
|
fix(inference): propagate 401/403 errors from remote providers (#3762)
## Summary Fixes #2990 Remote provider authentication errors (401/403) were being converted to 500 Internal Server Error, preventing users from understanding why their requests failed. ## The Problem When a request with an invalid API key was sent to a remote provider: - Provider correctly returns 401 with error details - Llama Stack's `translate_exception()` didn't recognize provider SDK exceptions - Fell through to generic 500 error handler - User received: "Internal server error: An unexpected error occurred." ## The Fix Added handler in `translate_exception()` that checks for exceptions with a `status_code` attribute and preserves the original HTTP status code and error message. **Before:** ```json HTTP 500 {"detail": "Internal server error: An unexpected error occurred."} ``` **After:** ```json HTTP 401 {"detail": "Error code: 401 - {'error': {'message': 'Invalid API Key', 'type': 'invalid_request_error', 'code': 'invalid_api_key'}}"} ``` ## Tested With - ✅ groq: 401 "Invalid API Key" - ✅ openai: 401 "Incorrect API key provided" - ✅ together: 401 "Invalid API key provided" - ✅ fireworks: 403 "unauthorized" ## Test Plan **Automated test script:** https://gist.github.com/ashwinb/1199dd7585ffa3f4be67b111cc65f2f3 The test script: 1. Builds separate stacks for each provider 2. Registers models (with validation temporarily disabled for testing) 3. Sends requests with invalid API keys via `x-llamastack-provider-data` header 4. Verifies HTTP status codes are 401/403 (not 500) **Results before fix:** All providers returned 500 **Results after fix:** All providers correctly return 401/403 **Manual verification:** ```bash # 1. Build stack llama stack build --image-type venv --providers inference=remote::groq # 2. Start stack llama stack run # 3. Send request with invalid API key curl http://localhost:8321/v1/chat/completions \ -H "Content-Type: application/json" \ -H 'x-llamastack-provider-data: {"groq_api_key": "invalid-key"}' \ -d '{"model": "groq/llama3-70b-8192", "messages": [{"role": "user", "content": "test"}]}' # Expected: HTTP 401 with provider error message (not 500) ``` ## Impact - Works with all remote providers using OpenAI SDK (groq, openai, together, fireworks, etc.) - Works with any provider SDK that follows the pattern of exceptions with `status_code` attribute - No breaking changes - only affects error responses |
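A minimal sketch of the kind of handler described, assuming a FastAPI-style server (names are illustrative; the actual `translate_exception` lives in the llama-stack server code):

```python
from fastapi import HTTPException

def translate_exception(exc: Exception) -> HTTPException:
    # Provider SDK errors (e.g. openai.AuthenticationError) expose a
    # status_code attribute; preserve it instead of collapsing to 500.
    status_code = getattr(exc, "status_code", None)
    if isinstance(status_code, int) and 400 <= status_code < 600:
        return HTTPException(status_code=status_code, detail=str(exc))
    return HTTPException(
        status_code=500,
        detail="Internal server error: An unexpected error occurred.",
    )
```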
||
|
145b2bcf25
|
feat: make object registration idempotent (#3752)
# What does this PR do? objects (vector dbs, models, scoring functions, etc) have an identifier and associated object values. we allow exact duplicate registrations. we reject registrations when the identifier exists and the associated object values differ. note: models are namespaced, i.e. {provider_id}/{identifier}, while other object types are not ## Test Plan ci w/ new tests |
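The semantics in plain form (a sketch, not the registry's actual code):

```python
def register_object(registry: dict, identifier: str, obj: dict) -> None:
    existing = registry.get(identifier)
    if existing is None:
        registry[identifier] = obj  # first registration
    elif existing == obj:
        return  # exact duplicate: idempotent no-op
    else:
        raise ValueError(f"Object '{identifier}' already registered with different values")
```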
||
|
7ee0ee7843
|
chore!: remove model mgmt from CLI for Hugging Face CLI (#3700)
This change removes the `llama model` and `llama download` subcommands from the CLI, replacing them with recommendations to use the Hugging Face CLI instead. Rationale for this change: - The model management functionality was largely duplicating what Hugging Face CLI already provides, leading to unnecessary maintenance overhead (except the download source from Meta?) - Maintaining our own implementation required fixing bugs and keeping up with changes in model repositories and download mechanisms - The Hugging Face CLI is more mature, widely adopted, and better maintained - This allows us to focus on the core Llama Stack functionality rather than reimplementing model management tools Changes made: - Removed all model-related CLI commands and their implementations - Updated documentation to recommend using `huggingface-cli` for model downloads - Removed Meta-specific download logic and statements - Simplified the CLI to focus solely on stack management operations Users should now use: - `huggingface-cli download` for downloading models - `huggingface-cli scan-cache` for listing downloaded models This is a breaking change as it removes previously available CLI commands. Signed-off-by: Sébastien Han <seb@redhat.com> |
||
|
841d0c3583
|
fix(testing): improve api_recorder error messages for missing recordings (#3760)
Replaces opaque error messages when recordings are not found with somewhat better guidance Before: ``` No recorded response found for request hash: abc123... To record this response, run with LLAMA_STACK_TEST_INFERENCE_MODE=record ``` After: ``` Recording not found for request hash: abc123 Model: gpt-4 | Request: POST https://api.openai.com/v1/chat/completions Run './scripts/integration-tests.sh --inference-mode record-if-missing' with required API keys to generate. ``` |
||
|
a055a32ee4
|
fix(tests): remove chroma and qdrant from vector io unit tests (#3759)
These vector databases are already thoroughly tested in integration tests. Unit tests now focus on sqlite_vec, faiss, and pgvector with mocked dependencies, removing the need for external service dependencies. ## Changes: - Deleted test_qdrant.py unit test file - Removed chroma/qdrant fixtures and parametrization from conftest.py - Fixed SqliteKVStoreConfig import to use correct location - Removed chromadb, qdrant-client, pymilvus, milvus-lite, and weaviate-client from unit test dependencies in pyproject.toml |
||
|
f50ce11a3b
|
feat(tests): make inference_recorder into api_recorder (include tool_invoke) (#3403)
Renames `inference_recorder.py` to `api_recorder.py` and extends it to support recording/replaying tool invocations in addition to inference calls. This allows us to record web-search, etc. tool calls and thereafter apply recordings for `tests/integration/responses` ## Test Plan ``` export OPENAI_API_KEY=... export TAVILY_SEARCH_API_KEY=... ./scripts/integration-tests.sh --stack-config ci-tests \ --suite responses --inference-mode record-if-missing ``` |
||
|
26fd5dbd34
|
fix: add traces for tool calls and mcp tool listing (#3722)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 0s
Python Package Build Test / build (3.13) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (push) Failing after 5s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 7s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.13) (push) Failing after 5s
API Conformance Tests / check-schema-compatibility (push) Successful in 15s
UI Tests / ui-tests (22) (push) Successful in 42s
Pre-commit / pre-commit (push) Successful in 1m24s
# What does this PR do? Adds traces around tool execution and MCP tool listing for better observability. Closes #3108 ## Test Plan Manually examined traces in Jaeger to verify the added information was available. Signed-off-by: Gordon Sim <gsim@redhat.com> |
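A hedged sketch of what wrapping tool execution in a trace span can look like with OpenTelemetry (the tracer name and attribute keys are assumptions, not the PR's code):

```python
from opentelemetry import trace

tracer = trace.get_tracer("llama_stack.tool_runtime")  # assumed tracer name

async def invoke_tool_traced(tool_name: str, invoke, **kwargs):
    # The span makes the tool call, its name, and any raised
    # exception visible in Jaeger alongside the surrounding request.
    with tracer.start_as_current_span(f"invoke_tool {tool_name}") as span:
        span.set_attribute("tool.name", tool_name)
        return await invoke(**kwargs)
```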
||
|
4b9ebbf6a2
|
chore: revert "fix: Raising an error message to the user when registering an existing provider." (#3750)
Reverts llamastack/llama-stack#3624, which was causing https://github.com/llamastack/llama-stack/issues/3749 |
||
|
05a62a6ffb
|
chore: print integration tests command (#3747)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 1s
Python Package Build Test / build (3.12) (push) Failing after 1s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 9s
UI Tests / ui-tests (22) (push) Successful in 41s
Pre-commit / pre-commit (push) Successful in 1m23s
# What does this PR do? ## Test Plan <img width="1104" height="60" alt="image" src="https://github.com/user-attachments/assets/d4691a2e-c5ec-4df5-a15a-f86e667fdf8c" /> |
||
|
16db42e7e5
|
feat(tests): add --collect-only option to integration test script (#3745)
Adds --collect-only flag to scripts/integration-tests.sh that skips server startup and passes the flag to pytest for test collection only. When specified, minimal flags are required (no --stack-config or --setup needed). ## Changes - Added `--collect-only` flag that skips server startup - Made `--stack-config` and `--setup` optional when using `--collect-only` - Skip `llama` command check when collecting tests only ## Usage ```bash # Collect tests without starting server ./scripts/integration-tests.sh --subdirs inference --collect-only ``` |
||
|
b96640eca3
|
chore: Removing Weaviate, PGVector, and Milvus from unit tests (#3742)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 1s
Unit Tests / unit-tests (3.13) (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 4s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Test External API and Providers / test-external (venv) (push) Failing after 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
UI Tests / ui-tests (22) (push) Successful in 48s
Pre-commit / pre-commit (push) Successful in 1m27s
# What does this PR do? Removing Weaviate, PGVector, and Milvus unit tests <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.* --> Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> |
||
|
79bed44b04
|
fix(tests): ensure test isolation in server mode (#3737)
Propagate test IDs from client to server via HTTP headers to maintain proper test isolation when running with server-based stack configs. Without this, recorded/replayed inference requests in server mode would leak across tests. Changes: - Patch client _prepare_request to inject test ID into provider data header - Sync test context from provider data on server side before storage operations - Set LLAMA_STACK_TEST_STACK_CONFIG_TYPE env var based on stack config - Configure console width for cleaner log output in CI - Add SQLITE_STORE_DIR temp directory for test data isolation |
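A minimal sketch of the client-side injection described, assuming the stock `X-LlamaStack-Provider-Data` header (the key name is hypothetical):

```python
import json

TEST_ID_HEADER = "X-LlamaStack-Provider-Data"

def add_test_id(headers: dict[str, str], test_id: str) -> dict[str, str]:
    # Merge the current test id into the provider-data header so the
    # server can key recorded/replayed inference per test.
    provider_data = json.loads(headers.get(TEST_ID_HEADER, "{}"))
    provider_data["__test_id"] = test_id  # hypothetical key
    headers[TEST_ID_HEADER] = json.dumps(provider_data)
    return headers
```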
||
|
96886afaca
|
fix(responses): fix regression in support for mcp tool require_approval argument (#3731)
# What does this PR do? It prevents a tool call message from being added to the chat completion messages without a corresponding tool call result, which is needed when an approval is required first or when the approval request is denied. In both cases the tool call message is popped off the next turn's messages. Closes #3728 ## Test Plan Ran the integration tests Manual check of both approval and denial against gpt-4o Signed-off-by: Gordon Sim <gsim@redhat.com> |
||
|
5d711d4bcb
|
fix: Update watsonx.ai provider to use LiteLLM mixin and list all models (#3674)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
Python Package Build Test / build (3.12) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (push) Failing after 7s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Test Llama Stack Build / build-single-provider (push) Failing after 4s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 6s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 12s
Test Llama Stack Build / build (push) Failing after 3s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
UI Tests / ui-tests (22) (push) Successful in 32s
Pre-commit / pre-commit (push) Successful in 1m29s
# What does this PR do? - The watsonx.ai provider now uses the LiteLLM mixin instead of using IBM's library, which does not seem to be working (see #3165 for context). - The watsonx.ai provider now lists all the models available by calling the watsonx.ai server instead of having a hard coded list of known models. (That list gets out of date quickly) - An edge case in [llama_stack/core/routers/inference.py](https://github.com/llamastack/llama-stack/pull/3674/files#diff-a34bc966ed9befd9f13d4883c23705dff49be0ad6211c850438cdda6113f3455) is addressed that was causing my manual tests to fail. - Fixes `b64_encode_openai_embeddings_response` which was trying to enumerate over a dictionary and then reference elements of the dictionary using .field instead of ["field"]. That method is called by the LiteLLM mixin for embedding models, so it is needed to get the watsonx.ai embedding models to work. - A unit test along the lines of the one in #3348 is added. A more comprehensive plan for automatically testing the end-to-end functionality for inference providers would be a good idea, but is out of scope for this PR. - Updates to the watsonx distribution. Some were in response to the switch to LiteLLM (e.g., updating the Python packages needed). Others seem to be things that were already broken that I found along the way (e.g., a reference to a watsonx specific doc template that doesn't seem to exist). Closes #3165 Also it is related to a line-item in #3387 but doesn't really address that goal (because it uses the LiteLLM mixin, not the OpenAI one). I tried the OpenAI one and it doesn't work with watsonx.ai, presumably because the watsonx.ai service is not OpenAI compatible. It works with LiteLLM because LiteLLM has a provider implementation for watsonx.ai. ## Test Plan The test script below goes back and forth between the OpenAI and watsonx providers. The idea is that the OpenAI provider shows how it should work and then the watsonx provider output shows that it is also working with watsonx. Note that the result from the MCP test is not as good (the Llama 3.3 70b model does not choose tools as wisely as gpt-4o), but it is still working and providing a valid response. For more details on setup and the MCP server being used for testing, see [the AI Alliance sample notebook](https://github.com/The-AI-Alliance/llama-stack-examples/blob/main/notebooks/01-responses/) that these examples are drawn from. 
```python #!/usr/bin/env python3 import json from llama_stack_client import LlamaStackClient from litellm import completion import http.client def print_response(response): """Print response in a nicely formatted way""" print(f"ID: {response.id}") print(f"Status: {response.status}") print(f"Model: {response.model}") print(f"Created at: {response.created_at}") print(f"Output items: {len(response.output)}") for i, output_item in enumerate(response.output): if len(response.output) > 1: print(f"\n--- Output Item {i+1} ---") print(f"Output type: {output_item.type}") if output_item.type in ("text", "message"): print(f"Response content: {output_item.content[0].text}") elif output_item.type == "file_search_call": print(f" Tool Call ID: {output_item.id}") print(f" Tool Status: {output_item.status}") # 'queries' is a list, so we join it for clean printing print(f" Queries: {', '.join(output_item.queries)}") # Display results if they exist, otherwise note they are empty print(f" Results: {output_item.results if output_item.results else 'None'}") elif output_item.type == "mcp_list_tools": print_mcp_list_tools(output_item) elif output_item.type == "mcp_call": print_mcp_call(output_item) else: print(f"Response content: {output_item.content}") def print_mcp_call(mcp_call): """Print MCP call in a nicely formatted way""" print(f"\n🛠️ MCP Tool Call: {mcp_call.name}") print(f" Server: {mcp_call.server_label}") print(f" ID: {mcp_call.id}") print(f" Arguments: {mcp_call.arguments}") if mcp_call.error: print("Error: {mcp_call.error}") elif mcp_call.output: print("Output:") # Try to format JSON output nicely try: parsed_output = json.loads(mcp_call.output) print(json.dumps(parsed_output, indent=4)) except: # If not valid JSON, print as-is print(f" {mcp_call.output}") else: print(" ⏳ No output yet") def print_mcp_list_tools(mcp_list_tools): """Print MCP list tools in a nicely formatted way""" print(f"\n🔧 MCP Server: {mcp_list_tools.server_label}") print(f" ID: {mcp_list_tools.id}") print(f" Available Tools: {len(mcp_list_tools.tools)}") print("=" * 80) for i, tool in enumerate(mcp_list_tools.tools, 1): print(f"\n{i}. {tool.name}") print(f" Description: {tool.description}") # Parse and display input schema schema = tool.input_schema if schema and 'properties' in schema: properties = schema['properties'] required = schema.get('required', []) print(" Parameters:") for param_name, param_info in properties.items(): param_type = param_info.get('type', 'unknown') param_desc = param_info.get('description', 'No description') required_marker = " (required)" if param_name in required else " (optional)" print(f" • {param_name} ({param_type}){required_marker}") if param_desc: print(f" {param_desc}") if i < len(mcp_list_tools.tools): print("-" * 40) def main(): """Main function to run all the tests""" # Configuration LLAMA_STACK_URL = "http://localhost:8321/" LLAMA_STACK_MODEL_IDS = [ "openai/gpt-3.5-turbo", "openai/gpt-4o", "llama-openai-compat/Llama-3.3-70B-Instruct", "watsonx/meta-llama/llama-3-3-70b-instruct" ] # Using gpt-4o for this demo, but feel free to try one of the others or add more to run.yaml. 
OPENAI_MODEL_ID = LLAMA_STACK_MODEL_IDS[1] WATSONX_MODEL_ID = LLAMA_STACK_MODEL_IDS[-1] NPS_MCP_URL = "http://localhost:3005/sse/" print("=== Llama Stack Testing Script ===") print(f"Using OpenAI model: {OPENAI_MODEL_ID}") print(f"Using WatsonX model: {WATSONX_MODEL_ID}") print(f"MCP URL: {NPS_MCP_URL}") print() # Initialize client print("Initializing LlamaStackClient...") client = LlamaStackClient(base_url="http://localhost:8321") # Test 1: List models print("\n=== Test 1: List Models ===") try: models = client.models.list() print(f"Found {len(models)} models") except Exception as e: print(f"Error listing models: {e}") raise e # Test 2: Basic chat completion with OpenAI print("\n=== Test 2: Basic Chat Completion (OpenAI) ===") try: chat_completion_response = client.chat.completions.create( model=OPENAI_MODEL_ID, messages=[{"role": "user", "content": "What is the capital of France?"}] ) print("OpenAI Response:") for chunk in chat_completion_response.choices[0].message.content: print(chunk, end="", flush=True) print() except Exception as e: print(f"Error with OpenAI chat completion: {e}") raise e # Test 3: Basic chat completion with WatsonX print("\n=== Test 3: Basic Chat Completion (WatsonX) ===") try: chat_completion_response_wxai = client.chat.completions.create( model=WATSONX_MODEL_ID, messages=[{"role": "user", "content": "What is the capital of France?"}], ) print("WatsonX Response:") for chunk in chat_completion_response_wxai.choices[0].message.content: print(chunk, end="", flush=True) print() except Exception as e: print(f"Error with WatsonX chat completion: {e}") raise e # Test 4: Tool calling with OpenAI print("\n=== Test 4: Tool Calling (OpenAI) ===") tools = [ { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather for a specific location", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g., San Francisco, CA", }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }, }, "required": ["location"], }, }, } ] messages = [ {"role": "user", "content": "What's the weather like in Boston, MA?"} ] try: print("--- Initial API Call ---") response = client.chat.completions.create( model=OPENAI_MODEL_ID, messages=messages, tools=tools, tool_choice="auto", # "auto" is the default ) print("OpenAI tool calling response received") except Exception as e: print(f"Error with OpenAI tool calling: {e}") raise e # Test 5: Tool calling with WatsonX print("\n=== Test 5: Tool Calling (WatsonX) ===") try: wxai_response = client.chat.completions.create( model=WATSONX_MODEL_ID, messages=messages, tools=tools, tool_choice="auto", # "auto" is the default ) print("WatsonX tool calling response received") except Exception as e: print(f"Error with WatsonX tool calling: {e}") raise e # Test 6: Streaming with WatsonX print("\n=== Test 6: Streaming Response (WatsonX) ===") try: chat_completion_response_wxai_stream = client.chat.completions.create( model=WATSONX_MODEL_ID, messages=[{"role": "user", "content": "What is the capital of France?"}], stream=True ) print("Model response: ", end="") for chunk in chat_completion_response_wxai_stream: # Each 'chunk' is a ChatCompletionChunk object. # We want the content from the 'delta' attribute. if hasattr(chunk, 'choices') and chunk.choices is not None: content = chunk.choices[0].delta.content # The first few chunks might have None content, so we check for it. 
if content is not None: print(content, end="", flush=True) print() except Exception as e: print(f"Error with streaming: {e}") raise e # Test 7: MCP with OpenAI print("\n=== Test 7: MCP Integration (OpenAI) ===") try: mcp_llama_stack_client_response = client.responses.create( model=OPENAI_MODEL_ID, input="Tell me about some parks in Rhode Island, and let me know if there are any upcoming events at them.", tools=[ { "type": "mcp", "server_url": NPS_MCP_URL, "server_label": "National Parks Service tools", "allowed_tools": ["search_parks", "get_park_events"], } ] ) print_response(mcp_llama_stack_client_response) except Exception as e: print(f"Error with MCP (OpenAI): {e}") raise e # Test 8: MCP with WatsonX print("\n=== Test 8: MCP Integration (WatsonX) ===") try: mcp_llama_stack_client_response = client.responses.create( model=WATSONX_MODEL_ID, input="What is the capital of France?" ) print_response(mcp_llama_stack_client_response) except Exception as e: print(f"Error with MCP (WatsonX): {e}") raise e # Test 9: MCP with Llama 3.3 print("\n=== Test 9: MCP Integration (Llama 3.3) ===") try: mcp_llama_stack_client_response = client.responses.create( model=WATSONX_MODEL_ID, input="Tell me about some parks in Rhode Island, and let me know if there are any upcoming events at them.", tools=[ { "type": "mcp", "server_url": NPS_MCP_URL, "server_label": "National Parks Service tools", "allowed_tools": ["search_parks", "get_park_events"], } ] ) print_response(mcp_llama_stack_client_response) except Exception as e: print(f"Error with MCP (Llama 3.3): {e}") raise e # Test 10: Embeddings print("\n=== Test 10: Embeddings ===") try: conn = http.client.HTTPConnection("localhost:8321") payload = json.dumps({ "model": "watsonx/ibm/granite-embedding-278m-multilingual", "input": "Hello, world!", }) headers = { 'Content-Type': 'application/json', 'Accept': 'application/json' } conn.request("POST", "/v1/openai/v1/embeddings", payload, headers) res = conn.getresponse() data = res.read() print(data.decode("utf-8")) except Exception as e: print(f"Error with Embeddings: {e}") raise e print("\n=== Testing Complete ===") if __name__ == "__main__": main() ``` --------- Signed-off-by: Bill Murdock <bmurdock@redhat.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> |
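The `b64_encode_openai_embeddings_response` fix mentioned above comes down to dict-vs-attribute access; an illustration under assumed data shapes (not the provider's actual code):

```python
import base64
import struct

def b64_encode_embeddings(data: list[dict]) -> list[dict]:
    out = []
    for item in data:  # each item is a dict: use item["field"], not item.field
        floats = item["embedding"]
        packed = struct.pack(f"{len(floats)}f", *floats)
        out.append({"index": item["index"], "embedding": base64.b64encode(packed).decode()})
    return out
```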
||
|
62bac0aad4
|
chore(github-deps): bump actions/stale from 10.0.0 to 10.1.0 (#3684)
Bumps [actions/stale](https://github.com/actions/stale) from 10.0.0 to 10.1.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/stale/releases">actions/stale's releases</a>.</em></p> <blockquote> <h2>v10.1.0</h2> <h2>What's Changed</h2> <ul> <li>Add <code>only-issue-types</code> option to filter issues by type by <a href="https://github.com/Bibo-Joshi"><code>@Bibo-Joshi</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1255">actions/stale#1255</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/Bibo-Joshi"><code>@Bibo-Joshi</code></a> made their first contribution in <a href="https://redirect.github.com/actions/stale/pull/1255">actions/stale#1255</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/stale/compare/v10...v10.1.0">https://github.com/actions/stale/compare/v10...v10.1.0</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
702fcd1abf
|
fix: Raising an error message to the user when registering an existing provider. (#3624)
When the user wants to change the attributes (which could include model name, dimensions, etc.) of an already registered provider, they will get an error message asking that they first unregister the provider before registering a new one. # What does this PR do? This PR updates the register function to raise an error when the user attempts to register a provider that was already registered, asking them to unregister the existing provider first. <!-- If resolving an issue, uncomment and update the line below --> #2313 ## Test Plan Tested the change with /tests/unit/registry/test_registry.py --------- Co-authored-by: Omar Abdelwahab <omara@fb.com> |
||
|
0cde3d956d
|
chore: require valid logging category (#3712)
# What does this PR do? grep'd and audited all usage of 'get_logger' with the help of Claude. ## Test Plan CI |
||
|
a3f5072776
|
chore!: remove --env from llama stack run (#3711)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Installer CI / lint (push) Failing after 2s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Installer CI / smoke-test-on-dev (push) Failing after 2s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 2s
Test Llama Stack Build / build-single-provider (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 2s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 1s
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Test Llama Stack Build / build (push) Failing after 3s
Test External API and Providers / test-external (venv) (push) Failing after 3s
Unit Tests / unit-tests (3.13) (push) Failing after 3s
UI Tests / ui-tests (22) (push) Successful in 40s
Pre-commit / pre-commit (push) Successful in 1m18s
# What does this PR do? Users can simply set env vars at the beginning of the command: `FOO=BAR llama stack run ...` ## Test Plan Run TELEMETRY_SINKS=console uv run --with llama-stack llama stack build --distro=starter --image-type=venv --run --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/llamastack/llama-stack/pull/3711). * #3714 * __->__ #3711 |
||
|
1ac320b7e6
|
chore: remove dead code (#3729)
# What does this PR do? Removes some dead code found by vulture; verified (with Claude's help) that nothing references or imports it. ## Test Plan CI |
||
|
b6e9f41041
|
chore: Revert "fix: fix nvidia provider (#3716)" (#3730)
This reverts commit
|