Commit graph

2342 commits

Author SHA1 Message Date
github-actions[bot]
62ed7b3531 build: Bump version to 0.2.16 2025-07-28 23:13:06 +00:00
github-actions[bot]
7d4e082bdb Release candidate 0.2.16rc1 2025-07-28 22:35:14 +00:00
ehhuang
4019027070
chore: revert #2855 (#2939)
# What does this PR do?
revert https://github.com/meta-llama/llama-stack/pull/2855 to unblock
release (running out of disk space)

Error here:
4689354931

## Test Plan
2025-07-28 15:30:25 -07:00
dependabot[bot]
e189f65548
chore(python-deps): bump pydantic from 2.10.6 to 2.11.7 (#2925)
Bumps [pydantic](https://github.com/pydantic/pydantic) from 2.10.6 to
2.11.7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pydantic/pydantic/releases">pydantic's
releases</a>.</em></p>
<blockquote>
<h2>v2.11.7 2025-06-14</h2>
<!-- raw HTML omitted -->
<h2>What's Changed</h2>
<h3>Fixes</h3>
<ul>
<li>Copy <code>FieldInfo</code> instance if necessary during
<code>FieldInfo</code> build by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11980">pydantic/pydantic#11980</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic/compare/v2.11.6...v2.11.7">https://github.com/pydantic/pydantic/compare/v2.11.6...v2.11.7</a></p>
<h2>v2.11.6 2025-06-13</h2>
<h2>v2.11.6 (2025-06-13)</h2>
<h3>What's Changed</h3>
<h4>Fixes</h4>
<ul>
<li>Rebuild dataclass fields before schema generation by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11949">#11949</a></li>
<li>Always store the original field assignment on <code>FieldInfo</code>
by <a href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11946">#11946</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic/compare/v2.11.5...v2.11.6">https://github.com/pydantic/pydantic/compare/v2.11.5...v2.11.6</a></p>
<h2>v2.11.5 2025-05-22</h2>
<!-- raw HTML omitted -->
<h2>What's Changed</h2>
<h3>Fixes</h3>
<ul>
<li>Check if <code>FieldInfo</code> is complete after applying type
variable map by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11855">#11855</a></li>
<li>Do not delete mock validator/serializer in
<code>model_rebuild()</code> by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11890">#11890</a></li>
<li>Do not duplicate metadata on model rebuild by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11902">#11902</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/pydantic/pydantic/compare/v2.11.4...v2.11.5">https://github.com/pydantic/pydantic/compare/v2.11.4...v2.11.5</a></p>
<h2>v2.11.4 2025-04-29</h2>
<h3>What's Changed</h3>
<h4>Packaging</h4>
<ul>
<li>Bump <code>mkdocs-llmstxt</code> to v0.2.0 by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11725">#11725</a></li>
</ul>
<h4>Changes</h4>
<ul>
<li>Allow config and bases to be specified together in
<code>create_model()</code> by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11714">#11714</a>.
This change was backported as it was previously possible (although not
meant to be supported)
to provide <code>model_config</code> as a field, which would make it
possible to provide both configuration
and bases.</li>
</ul>
<h4>Fixes</h4>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pydantic/pydantic/blob/main/HISTORY.md">pydantic's
changelog</a>.</em></p>
<blockquote>
<h2>v2.11.7 (2025-06-14)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.11.7">GitHub
release</a></p>
<h3>What's Changed</h3>
<h4>Fixes</h4>
<ul>
<li>Copy <code>FieldInfo</code> instance if necessary during
<code>FieldInfo</code> build by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11898">#11898</a></li>
</ul>
<h2>v2.11.6 (2025-06-13)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.11.6">GitHub
release</a></p>
<h3>What's Changed</h3>
<h4>Fixes</h4>
<ul>
<li>Rebuild dataclass fields before schema generation by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11949">#11949</a></li>
<li>Always store the original field assignment on <code>FieldInfo</code>
by <a href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11946">#11946</a></li>
</ul>
<h2>v2.11.5 (2025-05-22)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.11.5">GitHub
release</a></p>
<h3>What's Changed</h3>
<h4>Fixes</h4>
<ul>
<li>Check if <code>FieldInfo</code> is complete after applying type
variable map by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11855">#11855</a></li>
<li>Do not delete mock validator/serializer in
<code>model_rebuild()</code> by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11890">#11890</a></li>
<li>Do not duplicate metadata on model rebuild by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11902">#11902</a></li>
</ul>
<h2>v2.11.4 (2025-04-29)</h2>
<p><a
href="https://github.com/pydantic/pydantic/releases/tag/v2.11.4">GitHub
release</a></p>
<h3>What's Changed</h3>
<h4>Packaging</h4>
<ul>
<li>Bump <code>mkdocs-llmstxt</code> to v0.2.0 by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11725">#11725</a></li>
</ul>
<h4>Changes</h4>
<ul>
<li>Allow config and bases to be specified together in
<code>create_model()</code> by <a
href="https://github.com/Viicos"><code>@​Viicos</code></a> in <a
href="https://redirect.github.com/pydantic/pydantic/pull/11714">#11714</a>.
This change was backported as it was previously possible (although not
meant to be supported)
to provide <code>model_config</code> as a field, which would make it
possible to provide both configuration
and bases.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5f033e46c5"><code>5f033e4</code></a>
Prepare release v2.11.7</li>
<li><a
href="c3368b83c4"><code>c3368b8</code></a>
Copy <code>FieldInfo</code> instance if necessary during
<code>FieldInfo</code> build (<a
href="https://redirect.github.com/pydantic/pydantic/issues/11980">#11980</a>)</li>
<li><a
href="3987b23db4"><code>3987b23</code></a>
Prepare release v2.11.6</li>
<li><a
href="dc7a9d20be"><code>dc7a9d2</code></a>
Always store the original field assignment on
<code>FieldInfo</code></li>
<li><a
href="c284c279a5"><code>c284c27</code></a>
Rebuild dataclass fields before schema generation</li>
<li><a
href="5e6d1dc71f"><code>5e6d1dc</code></a>
Prepare release v2.11.5</li>
<li><a
href="1b63218c42"><code>1b63218</code></a>
Do not duplicate metadata on model rebuild (<a
href="https://redirect.github.com/pydantic/pydantic/issues/11902">#11902</a>)</li>
<li><a
href="5aefad873b"><code>5aefad8</code></a>
Do not delete mock validator/serializer in
<code>model_rebuild()</code></li>
<li><a
href="8fbe6585f4"><code>8fbe658</code></a>
Check if <code>FieldInfo</code> is complete after applying type variable
map</li>
<li><a
href="12b371a0f7"><code>12b371a</code></a>
Update documentation about <code>@dataclass_transform</code>
support</li>
<li>Additional commits viewable in <a
href="https://github.com/pydantic/pydantic/compare/v2.10.6...v2.11.7">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pydantic&package-manager=uv&previous-version=2.10.6&new-version=2.11.7)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-28 15:11:54 -07:00
Ashwin Bharambe
70469c84e9
chore(packaging): remove requirements.txt (#2938)
We don't need this. We have kept it since existing wisdom is that "it
helps with back-compat". Well, the entire ecosystem is moving to `uv` at
an unprecedented rate and keeping this creates unnecessary work and
confusion. The specific reason I am killing this is that it confuses
`dependabot` which ends up not bumping `uv.lock` which is the more
important file to change.
2025-07-28 14:52:24 -07:00
Ashwin Bharambe
cd24aaf3aa fix(pre-commit): push properly version 4 2025-07-28 13:11:56 -07:00
Ashwin Bharambe
8fa77bc93e fix(pre-commit): push properly version 3 2025-07-28 13:02:04 -07:00
Ashwin Bharambe
3058060e2b fix(pre-commit): push properly version 2 2025-07-28 12:50:50 -07:00
Ashwin Bharambe
607574c26a fix(pre-commit): push properly 2025-07-28 12:43:49 -07:00
Ashwin Bharambe
8961706dea fix(pre-commit): dont error if pre-commit itself errors 2025-07-28 12:35:34 -07:00
Ashwin Bharambe
dd4ea28b49
fix(dependabot): run pre-commit on dependabot PRs (#2935)
See PR screenshot below -- we need to run pre-commit on the dependabot
PRs obviously

<img width="837" height="277" alt="image"
src="https://github.com/user-attachments/assets/c17802d7-e252-4719-acc7-e335b24120f8"
/>
2025-07-28 15:25:06 -04:00
Matthew Farrellee
968fc132d3
fix(openai-compat): restrict developer/assistant/system/tool messages to text-only content (#2932)
**What:**
- Added OpenAIChatCompletionTextOnlyMessageContent type for text-only
content validation
- Modified OpenAISystemMessageParam, OpenAIAssistantMessageParam,
OpenAIDeveloperMessageParam, and OpenAIToolMessageParam to use text-only
content type instead of mixed content
- OpenAIUserMessageParam unchanged - still accepts both text and images
- Updated OpenAPI spec files to reflect text-only content restrictions
in schemas

closes #2894 

**Why:**
- Enforces OpenAI API compatibility by restricting image content to user
messages only
- Prevents API misuse where images might be sent in message types that
don't support them
- Aligns with OpenAI's actual API behavior where only user messages can
contain multimodal content
- Improves type safety and validation at the API boundary

**Test plan:**
- Added comprehensive parametrized tests covering all 5 OpenAI message
types
- Tests verify text string acceptance for all message types
- Tests verify text list acceptance for all message types
- Tests verify image rejection for system/assistant/developer/tool
messages (ValidationError expected)
- Tests verify user messages still accept images (backward compatibility
maintained)
2025-07-28 10:36:34 -07:00
Matthew Farrellee
60bb5e307e
feat(openai): add configurable base_url support with OPENAI_BASE_URL env var (#2919)
# What does this PR do?

- Add base_url field to OpenAIConfig with default
"https://api.openai.com/v1"
- Update sample_run_config to support OPENAI_BASE_URL environment
variable
- Modify get_base_url() to return configured base_url instead of
hardcoded value
- Add comprehensive test suite covering:
  - Default base URL behavior
  - Custom base URL from config
  - Environment variable override
  - Config precedence over environment variables
  - Client initialization with configured URL
  - Model availability checks using configured URL

This enables users to configure custom OpenAI-compatible API endpoints
via environment variables or configuration files.

Closes #2910 

## Test Plan

run unit tests
2025-07-28 10:16:02 -07:00
Charlie Doern
b1c21a25ec
docs: remove provider_id from external docs (#2922)
# What does this PR do?

external provider docs mention setting provider_id in the build yaml.
Since we changed that to just be provider_type and module, remove
instances of provider_id

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-28 10:14:39 -07:00
Charlie Doern
86fe2b8475
fix: adjust provider type used in external provider test (#2921)
# What does this PR do?

provider_id is no longer valid in a build.yaml, remove it in the
external provider test

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-28 10:14:16 -07:00
Matthew Farrellee
47c078fcef
feat: implement dynamic model detection support for inference providers using litellm (#2886)
# What does this PR do?

This enhancement allows inference providers using LiteLLMOpenAIMixin to
validate model availability against LiteLLM's official provider model
listings, improving reliability and user experience when working with
different AI service providers.

- Add litellm_provider_name parameter to LiteLLMOpenAIMixin constructor
- Add check_model_availability method to LiteLLMOpenAIMixin using
litellm.models_by_provider
- Update Gemini, Groq, and SambaNova inference adapters to pass
litellm_provider_name

## Test Plan

standard CI.
2025-07-28 10:13:54 -07:00
Christian Zaccaria
c48dcafc77
fix: Fix unit tests CI and failing tests (#2928)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Added `set -e` to the beginning of the unit test script to ensure the
script exits on failure and correctly fails the CI when tests do not
pass.
- Fixed all unit tests that were silently failing in the CI.
- Fixed Python 3.13 unit test CI failing silently.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2877 

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
- **Previously:** Unit tests passing in CI eventhough it failed 11 tests
->
[CI-run](4683681501 (step):4:2097)
- **Made the fix. Now, ensuring CI fails as expected on test failures:**
Unit tests failing in CI with 1 failed test ->
[CI-run](4684234247 (step):4:1506)
- This PR shows the CI passing and all unit tests passing.
2025-07-28 10:07:26 -07:00
Charlie Doern
46e2989312
fix: switch refresh to debug log (#2933)
# What does this PR do?
the server logs have a persistent `core: refreshing registry` log that
clogs up the output. Switch it to debug

this is what it looked like:

<img width="1126" height="1028" alt="Screenshot 2025-07-28 at 9 56
44 AM"
src="https://github.com/user-attachments/assets/a1880fd3-7fc7-4a97-bfb8-89a62e4c5c19"
/>

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-28 10:02:54 -07:00
Matthew Farrellee
3c40c8e583
fix: litellm_provider_name for llama-api (#2934)
litellm uses "meta_llama" for the provider name, see
https://docs.litellm.ai/docs/providers/meta_llama ad
https://github.com/BerriAI/litellm/blob/main/litellm/__init__.py#L833
2025-07-28 10:02:16 -07:00
Charlie Doern
09abdb0a37
test: upload logs for external provider tests (#2914)
Some checks failed
Integration Tests / discover-tests (push) Successful in 2s
Installer CI / lint (push) Failing after 5s
Installer CI / smoke-test-on-dev (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 7s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 6s
Test Llama Stack Build / generate-matrix (push) Successful in 4s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 6s
Test Llama Stack Build / build-single-provider (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 8s
Python Package Build Test / build (3.13) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 12s
Test External API and Providers / test-external (venv) (push) Failing after 6s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 9s
Test Llama Stack Build / build (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 7s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Integration Tests / test-matrix (push) Failing after 7s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 22s
Unit Tests / unit-tests (3.12) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 1m5s
# What does this PR do?

currently the external provider tests don't upload log files as
artifacts nor do they use LLAMA_STACK_LOG_FILE. align with the other
integration tests

## Test Plan

logs should be present in the two tests on this PR

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-25 15:03:15 -07:00
Ashwin Bharambe
9583f468f8
feat(starter)!: simplify starter distro; litellm model registry changes (#2916) 2025-07-25 15:02:04 -07:00
Charlie Doern
3344d8a9e5
fix: separate build and run provider types (#2917)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s
Test Llama Stack Build / generate-matrix (push) Successful in 4s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 5s
Python Package Build Test / build (3.13) (push) Failing after 2s
Test Llama Stack Build / build-single-provider (push) Failing after 3s
Python Package Build Test / build (3.12) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 5s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 9s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 6s
Test External API and Providers / test-external (venv) (push) Failing after 5s
Update ReadTheDocs / update-readthedocs (push) Failing after 4s
Unit Tests / unit-tests (3.13) (push) Failing after 5s
Test Llama Stack Build / build (push) Failing after 3s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Integration Tests / test-matrix (push) Failing after 7s
Pre-commit / pre-commit (push) Successful in 1m13s
# What does this PR do?

in #2637, I combined the run and build config provider types to both use
`Provider`

since this includes a provider_id, a user must now specify this when
writing a build yaml. This is not very clear because all a user should
care about upon build is the code to be installed (the module and the
provider_type)

introduce `BuildProvider` and fixup the parts of the code impacted by
this

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-25 12:39:26 -07:00
Nathan Weinberg
025163d8e6
feat: add auto-generated CI documentation pre-commit hook (#2890)
# What does this PR do?
Our CI is entirely undocumented, this commit adds a README.md file with
a table of the current CI and what is does

---------

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-25 17:57:01 +02:00
Derek Higgins
52201612de
feat: implement chunk deletion for vector stores (#2701)
Add support for deleting individual chunks from vector stores

- Add abstract remove_chunk() method to EmbeddingIndex base class
- Implement chunk deletion for Faiss provider, SQLite Vec, Milvus,
PGVector
- Placeholder implementations with NotImplementedError for
Chroma/Qdrant/Weaviate
- Integrate chunk deletion into OpenAI vector store file deletion flow
- removed xfail from
test_openai_vector_store_delete_file_removes_from_vector_store

Closes: #2477

---------

Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-07-25 10:30:30 -04:00
Francisco Arceo
9e77be1f72
chore: Fix chroma unit tests (#2896)
# What does this PR do?
Enable Chroma inline unit tests and fix integration tests.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-25 10:12:14 -04:00
Ashwin Bharambe
ed07a58b50
fix(registry): ensure clean shutdown (#2901)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 4s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 1s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 5s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Test Llama Stack Build / build-single-provider (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s
Python Package Build Test / build (3.13) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 9s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 5s
Test External API and Providers / test-external (venv) (push) Failing after 5s
Update ReadTheDocs / update-readthedocs (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 6s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Integration Tests / test-matrix (push) Failing after 5s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s
Test Llama Stack Build / build (push) Failing after 4s
Pre-commit / pre-commit (push) Successful in 57s
Avoid the error message:

```
INFO     2025-07-24 21:51:54,530 __main__:598 server: Received interrupt signal, shutting down gracefully...                                          
ERROR    2025-07-24 21:51:54,692 asyncio:1826 uncategorized: Task was destroyed but it is pending!                                                    
         task: <Task pending name='Task-15' coro=<refresh_registry() running at                                                                       
         /Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/stack.py:356> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=>  
```
2025-07-25 09:44:31 -04:00
Charlie Doern
de6919ecdd
refactor: install external providers from module (#2637)
# What does this PR do?

Today, external providers are installed via the `external_providers_dir`
in the config. This necessitates users to understand the `ProviderSpec`
and set up their directories accordingly. This process splits up the
config for the stack across multiple files, directories, and formats.

Most (if not all) external providers today have a
[get_provider_spec](559cb18fbb/src/ramalama_stack/provider.py (L9))
method that sits unused. Utilizing this method rather than the
providers.d route allows for a much easier installation process for
external providers and limits the amount of extra configuration a
regular user has to do to get their stack off the ground.

To accomplish this and wire it throughout the build process, Introduce
the concept of a `module` for users to specify for an external provider
upon build time. In order to facilitate this, align the build and run
spec to use `Provider` class rather than the stringified provider_type
that build currently uses.

For example, say this is in your build config:

```
- provider_id: ramalama
  provider_type: remote::ramalama
  module: ramalama_stack
```

during build (in the various `build_...` scripts), additionally to
installing any pip dependencies we will also install this module and use
the `get_provider_spec` method to retrieve the ProviderSpec that is
currently specified using `providers.d`.

In production so far, providing instructions for installing external
providers for users has been difficult: they need to install the module
as a pre-req, create the providers.d directory, copy in the provider
spec, and also copy in the necessary build/run yaml files. Accessing an
external provider should be as easy as possible, and pointing to its
installable module aligns more with the rest of our build and dependency
management process.

For now, `external_providers_dir` still exists as an alternate more
declarative method of using external providers.

## Test Plan

added an integration test installing an external provider from module
and more unit test coverage for `get_provider_registry`


( the warning in yellow is expected, the module is installed inside of
the build env, not where we are running the command)
<img width="1119" height="400" alt="Screenshot 2025-07-24 at 11 30
48 AM"
src="https://github.com/user-attachments/assets/1efbaf45-b9e8-451a-bd63-264ed664706d"
/>

<img width="1154" height="618" alt="Screenshot 2025-07-24 at 11 31
14 AM"
src="https://github.com/user-attachments/assets/feb2b3ea-c5dd-418e-9662-9a3bd5dd6bdc"
/>

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-25 15:41:26 +02:00
dependabot[bot]
85223ccc4d
chore(github-deps): bump astral-sh/setup-uv from 6.4.1 to 6.4.3 (#2902)
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.4.1 to 6.4.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v6.4.3 🌈 fix relative paths starting with dots</h2>
<h2>🐛 Bug fixes</h2>
<ul>
<li>fix relative paths starting with dots <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/500">#500</a>)</li>
</ul>
<h2>v6.4.2 🌈 Interpret relative inputs as under working-directory</h2>
<h2>Changes</h2>
<p>This release will interpret relative paths in inputs as relative
to the value of <code>working-directory</code> (default is <code>${{
github.workspace }}</code>) .
This means the following configuration</p>
<pre lang="yaml"><code>- uses: astral-sh/setup-uv@v6
   with:
     working-directory: /my/path
     cache-dependency-glob: uv.lock
</code></pre>
<p>will look for the <code>cache-dependency-glob</code> under
<code>/my/path/uv.lock</code></p>
<h2>🐛 Bug fixes</h2>
<ul>
<li>interpret relative inputs as under working-directory <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/498">#498</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>chore: update known versions for 0.8.1/0.8.2 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/497">#497</a>)</li>
<li>chore: update known versions for 0.8.0 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/491">#491</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e92bafb625"><code>e92bafb</code></a>
fix relative paths starting with dots (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/500">#500</a>)</li>
<li><a
href="2c7142f755"><code>2c7142f</code></a>
interpret relative inputs as under working-directory (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/498">#498</a>)</li>
<li><a
href="23482a31a8"><code>23482a3</code></a>
chore: update known versions for 0.8.1/0.8.2 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/497">#497</a>)</li>
<li><a
href="4ac06a054e"><code>4ac06a0</code></a>
chore: update known versions for 0.8.0 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/491">#491</a>)</li>
<li>See full diff in <a
href="7edac99f96...e92bafb625">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=astral-sh/setup-uv&package-manager=github_actions&previous-version=6.4.1&new-version=6.4.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-25 10:08:24 +02:00
Yuan Tang
34093fecd1
ci: Remove open-pull-requests-limit: 0 from dependabot.yml (#2900)
This might fix issues in
https://github.com/meta-llama/llama-stack/pull/2899 and
https://github.com/meta-llama/llama-stack/pull/2897 where uv
dependencies are not being upgraded correctly (`uv.lock` is not being
updated).
2025-07-25 09:49:18 +02:00
dependabot[bot]
3216765c26
chore(deps): bump form-data from 4.0.2 to 4.0.4 in /llama_stack/ui (#2898)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Integration Tests / discover-tests (push) Successful in 3s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 5s
Python Package Build Test / build (3.13) (push) Failing after 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 4s
Unit Tests / unit-tests (3.13) (push) Failing after 7s
Integration Tests / test-matrix (push) Failing after 6s
Pre-commit / pre-commit (push) Successful in 47s
Bumps [form-data](https://github.com/form-data/form-data) from 4.0.2 to
4.0.4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/form-data/form-data/releases">form-data's
releases</a>.</em></p>
<blockquote>
<h2>v4.0.4</h2>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.3...v4.0.4">v4.0.4</a>
- 2025-07-16</h2>
<h3>Commits</h3>
<ul>
<li>[meta] add <code>auto-changelog</code> <a
href="811f68282f"><code>811f682</code></a></li>
<li>[Tests] handle predict-v8-randomness failures in node &lt; 17 and
node &gt; 23 <a
href="1d11a76434"><code>1d11a76</code></a></li>
<li>[Fix] Switch to using <code>crypto</code> random for boundary values
<a
href="3d1723080e"><code>3d17230</code></a></li>
<li>[Tests] fix linting errors <a
href="5e340800b5"><code>5e34080</code></a></li>
<li>[meta] actually ensure the readme backup isn’t published <a
href="316c82ba93"><code>316c82b</code></a></li>
<li>[Dev Deps] update <code>@ljharb/eslint-config</code> <a
href="58c25d7640"><code>58c25d7</code></a></li>
<li>[meta] fix readme capitalization <a
href="2300ca1959"><code>2300ca1</code></a></li>
</ul>
<h2>v4.0.3</h2>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.2...v4.0.3">v4.0.3</a>
- 2025-06-05</h2>
<h3>Fixed</h3>
<ul>
<li>[Fix] <code>append</code>: avoid a crash on nullish values <a
href="https://redirect.github.com/form-data/form-data/issues/577"><code>[#577](https://github.com/form-data/form-data/issues/577)</code></a></li>
</ul>
<h3>Commits</h3>
<ul>
<li>[eslint] use a shared config <a
href="426ba9ac44"><code>426ba9a</code></a></li>
<li>[eslint] fix some spacing issues <a
href="20941917f0"><code>2094191</code></a></li>
<li>[Refactor] use <code>hasown</code> <a
href="81ab41b46f"><code>81ab41b</code></a></li>
<li>[Fix] validate boundary type in <code>setBoundary()</code> method <a
href="8d8e469309"><code>8d8e469</code></a></li>
<li>[Tests] add tests to check the behavior of <code>getBoundary</code>
with non-strings <a
href="837b8a1f75"><code>837b8a1</code></a></li>
<li>[Dev Deps] remove unused deps <a
href="870e4e6659"><code>870e4e6</code></a></li>
<li>[meta] remove local commit hooks <a
href="e6e83ccb54"><code>e6e83cc</code></a></li>
<li>[Dev Deps] update <code>eslint</code> <a
href="4066fd6f65"><code>4066fd6</code></a></li>
<li>[meta] fix scripts to use prepublishOnly <a
href="c4bbb13c0e"><code>c4bbb13</code></a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/form-data/form-data/blob/master/CHANGELOG.md">form-data's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.3...v4.0.4">v4.0.4</a>
- 2025-07-16</h2>
<h3>Commits</h3>
<ul>
<li>[meta] add <code>auto-changelog</code> <a
href="811f68282f"><code>811f682</code></a></li>
<li>[Tests] handle predict-v8-randomness failures in node &lt; 17 and
node &gt; 23 <a
href="1d11a76434"><code>1d11a76</code></a></li>
<li>[Fix] Switch to using <code>crypto</code> random for boundary values
<a
href="3d1723080e"><code>3d17230</code></a></li>
<li>[Tests] fix linting errors <a
href="5e340800b5"><code>5e34080</code></a></li>
<li>[meta] actually ensure the readme backup isn’t published <a
href="316c82ba93"><code>316c82b</code></a></li>
<li>[Dev Deps] update <code>@ljharb/eslint-config</code> <a
href="58c25d7640"><code>58c25d7</code></a></li>
<li>[meta] fix readme capitalization <a
href="2300ca1959"><code>2300ca1</code></a></li>
</ul>
<h2><a
href="https://github.com/form-data/form-data/compare/v4.0.2...v4.0.3">v4.0.3</a>
- 2025-06-05</h2>
<h3>Fixed</h3>
<ul>
<li>[Fix] <code>append</code>: avoid a crash on nullish values <a
href="https://redirect.github.com/form-data/form-data/issues/577"><code>[#577](https://github.com/form-data/form-data/issues/577)</code></a></li>
</ul>
<h3>Commits</h3>
<ul>
<li>[eslint] use a shared config <a
href="426ba9ac44"><code>426ba9a</code></a></li>
<li>[eslint] fix some spacing issues <a
href="20941917f0"><code>2094191</code></a></li>
<li>[Refactor] use <code>hasown</code> <a
href="81ab41b46f"><code>81ab41b</code></a></li>
<li>[Fix] validate boundary type in <code>setBoundary()</code> method <a
href="8d8e469309"><code>8d8e469</code></a></li>
<li>[Tests] add tests to check the behavior of <code>getBoundary</code>
with non-strings <a
href="837b8a1f75"><code>837b8a1</code></a></li>
<li>[Dev Deps] remove unused deps <a
href="870e4e6659"><code>870e4e6</code></a></li>
<li>[meta] remove local commit hooks <a
href="e6e83ccb54"><code>e6e83cc</code></a></li>
<li>[Dev Deps] update <code>eslint</code> <a
href="4066fd6f65"><code>4066fd6</code></a></li>
<li>[meta] fix scripts to use prepublishOnly <a
href="c4bbb13c0e"><code>c4bbb13</code></a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="41996f5ac7"><code>41996f5</code></a>
v4.0.4</li>
<li><a
href="316c82ba93"><code>316c82b</code></a>
[meta] actually ensure the readme backup isn’t published</li>
<li><a
href="2300ca1959"><code>2300ca1</code></a>
[meta] fix readme capitalization</li>
<li><a
href="811f68282f"><code>811f682</code></a>
[meta] add <code>auto-changelog</code></li>
<li><a
href="5e340800b5"><code>5e34080</code></a>
[Tests] fix linting errors</li>
<li><a
href="1d11a76434"><code>1d11a76</code></a>
[Tests] handle predict-v8-randomness failures in node &lt; 17 and node
&gt; 23</li>
<li><a
href="58c25d7640"><code>58c25d7</code></a>
[Dev Deps] update <code>@ljharb/eslint-config</code></li>
<li><a
href="3d1723080e"><code>3d17230</code></a>
[Fix] Switch to using <code>crypto</code> random for boundary
values</li>
<li><a
href="d8d67dc8ac"><code>d8d67dc</code></a>
v4.0.3</li>
<li><a
href="e6e83ccb54"><code>e6e83cc</code></a>
[meta] remove local commit hooks</li>
<li>Additional commits viewable in <a
href="https://github.com/form-data/form-data/compare/v4.0.2...v4.0.4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=form-data&package-manager=npm_and_yarn&previous-version=4.0.2&new-version=4.0.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/meta-llama/llama-stack/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-24 21:24:56 -04:00
ehhuang
21bae296f2
feat(auth): API access control (#2822)
# What does this PR do?
- Added ability to specify `required_scope` when declaring an API. This
is part of the `@webmethod` decorator.
- If auth is enabled, a user can access an API only if
`user.attributes['scope']` includes the `required_scope`
- We add `required_scope='telemetry.read'` to the telemetry read APIs.

## Test Plan
CI with added tests

1. Enable server.auth with github token
2. Observe `client.telemetry.query_traces()` returns 403
2025-07-24 15:30:48 -07:00
Calum Murray
7cc4819e90
feat: add MCP Streamable HTTP support (#2554)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds support for the new Streamable HTTP transport for MCP, as
well as falling back to the SSE protocol if the Streamable HTTP
connection fails.

<!-- If resolving an issue, uncomment and update the line below -->
Closes #2542 

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

---------

Signed-off-by: Calum Murray <cmurray@redhat.com>
2025-07-24 15:04:27 -07:00
Sébastien Han
632cf9eb72
feat: Bring Your Own API (BYOA) (#2228)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Installer CI / lint (push) Failing after 3s
Integration Tests / discover-tests (push) Successful in 3s
Installer CI / smoke-test-on-dev (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Test External API and Providers / test-external (venv) (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 8s
Integration Tests / test-matrix (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 5s
Test Llama Stack Build / build (push) Failing after 6s
Pre-commit / pre-commit (push) Successful in 57s
# What does this PR do?

Prototype on a new feature to allow new APIs to be plugged in Llama
Stack. Opened for early feedback on the approach and test appetite on
the functionality.

@ashwinb @raghotham open for early feedback, thanks!

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-07-24 13:41:14 -07:00
Charlie Doern
341504869e
fix: use logger for console telemetry (#2844)
# What does this PR do?

currently `print` is being used with custom formatting to achieve
telemetry output in the console_span_processor

This causes telemetry not to show up in log files when using
`LLAMA_STACK_LOG_FILE`. During testing it looks like telemetry is not
being captured when it is

switch to using Rich formatting with the logger and then strip the
formatting off when a log file is being used so the formatting looks
normal

## Test Plan

before:

console:

<img width="967" height="127" alt="Screenshot 2025-07-21 at 4 02 15 PM"
src="https://github.com/user-attachments/assets/b09518cc-9d38-4970-9877-70e2c41fcbb5"
/>


log file (no telemetry):

```
2025-07-21 16:01:32,481 llama_stack.providers.remote.inference.ollama.ollama:117 inference: checking connectivity to Ollama at `http://localhost:11434`...
2025-07-21 16:01:34,779 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
2025-07-21 16:01:35,083 __main__:587 server: Listening on ['::', '0.0.0.0']:8321
2025-07-21 16:01:35,091 uvicorn.error:84 uncategorized: Started server process [68679]
2025-07-21 16:01:35,091 uvicorn.error:48 uncategorized: Waiting for application startup.
2025-07-21 16:01:35,092 __main__:163 server: Starting up
2025-07-21 16:01:35,092 uvicorn.error:62 uncategorized: Application startup complete.
2025-07-21 16:01:35,092 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
2025-07-21 16:01:37,167 uvicorn.access:473 uncategorized: 127.0.0.1:53145 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
```

after:

console:

<img width="797" height="165" alt="Screenshot 2025-07-22 at 3 28 44 PM"
src="https://github.com/user-attachments/assets/44d40e3b-6502-439d-9ea5-38058b289962"
/>


log file:

```
2025-07-21 15:59:51,481 llama_stack.providers.remote.inference.ollama.ollama:117 inference: checking connectivity to Ollama at `http://localhost:11434`...
2025-07-21 15:59:53,801 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
2025-07-21 15:59:54,059 __main__:587 server: Listening on ['::', '0.0.0.0']:8321
2025-07-21 15:59:54,066 uvicorn.error:84 uncategorized: Started server process [68578]
2025-07-21 15:59:54,067 uvicorn.error:48 uncategorized: Waiting for application startup.
2025-07-21 15:59:54,067 __main__:163 server: Starting up
2025-07-21 15:59:54,067 uvicorn.error:62 uncategorized: Application startup complete.
2025-07-21 15:59:54,068 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
2025-07-21 15:59:55,381 [TELEMETRY] 19:59:55.381  /v1/openai/v1/chat/completions
2025-07-21 15:59:55,619 uvicorn.access:473 uncategorized: 127.0.0.1:53102 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
2025-07-21 15:59:55,621 [TELEMETRY] 19:59:55.621  /v1/openai/v1/chat/completions [StatusCode.OK] (240.07ms)
2025-07-21 15:59:55,622 [TELEMETRY]  19:59:55.620  127.0.0.1:53102 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
```

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-24 16:26:59 -04:00
Kelly Brown
abade761e0
docs: Update nvidia docs template (#2893)
**Description**

Fixes generation issue in nvidia code gen file.

Closes #2873
2025-07-24 22:11:34 +02:00
Sébastien Han
226b877ca6
chore: install script should use starter (#2891)
Our demo installation script should pull the starter image. Ollama is
not being updated anymore as a distribution.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-24 12:18:02 -07:00
ehhuang
cbe89d2bdd
chore: return webmethod from find_matching_route (#2883)
This will be used to support API access control, i.e. Webmethod would
have a `required_scope` attribute, and we need access to that in the
middleware.
2025-07-24 11:37:21 -07:00
Ashwin Bharambe
1463b79218
feat(registry): make the Stack query providers for model listing (#2862)
This flips #2823 and #2805 by making the Stack periodically query the
providers for models rather than the providers going behind the back and
calling "register" on to the registry themselves. This also adds support
for model listing for all other providers via `ModelRegistryHelper`.
Once this is done, we do not need to manually list or register models
via `run.yaml` and it will remove both noise and annoyance (setting
`INFERENCE_MODEL` environment variables, for example) from the new user
experience.

In addition, it adds a configuration variable `allowed_models` which can
be used to optionally restrict the set of models exposed from a
provider.
2025-07-24 10:39:53 -07:00
Stefan Thaler
537dc693ee
chore: add mypy coverage to inspect.py and library_client.py in /distribution (#2707)
# What does this PR do?
Adds type guards in /distribution/inspect.py and ignores a valid-type
mypy error in library_client.py. This PR is part of issue #2647 . I'm
rather unsure whether ignoring the valid-type error is correct in this
case. It appears that args[0] is interpreted as [any] but I didn't find
any way to specify the type.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->


<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-07-24 09:51:46 -07:00
Charlie Doern
d4f0b430e2
docs: update list of apis (#2697)
# What does this PR do?

apis.md had a few APIs missing and incorrectly described APIs

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-24 09:50:14 -07:00
Sébastien Han
af9c707eaf
fix: various improvements on install.sh (#2724)
# What does this PR do?

Bulk improvements:

* The script has a better error reporting, when a command fails it will
print the logs of the failed command
* Better error handling using a trap to catch signal and perform proper
cleanup
* Cosmetic changes
* Added CI to test the image code against main
* Use the starter image and its latest tag

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-24 09:43:51 -07:00
Derek Higgins
4ea1f2aa9f
test: Add VLLM provider support to integration tests (#2757)
- Add setup-vllm GitHub action to start VLLM container
- Extend integration test matrix to support both ollama and vllm
providers
- Make test setup conditional based on provider type
- Add provider-specific environment variables and configurations
- vllm tests setup to run weekly or can be triggered manually (only
ollama on PR)

TODO:
investigate failing tests for vllm provider (safety and post_training)
Also need a proper fix for #2713 (tmp fix for this in the first commit
in this PR)
Closes: #1648

---------

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-24 09:42:26 -07:00
Mustafa Elbehery
6ab5760a1b
chore(test): migrate unit tests from unittest to pytest nvidia test safety (#2793)
This PR replaces unittest with pytest.

Part of https://github.com/meta-llama/llama-stack/issues/2680

cc @leseb

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-07-24 09:41:07 -07:00
Yuan Tang
9069d878ef
docs: Update CHANGELOG.md (#2874)
This updates the changelog to include recent releases.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-07-24 09:36:28 -07:00
Christian Zaccaria
7f7b990b80
docs: Document use cases for Responses and Agents APIs (#2756)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This pull request adds documentation to clarify the differences between
the Agents API and the OpenAI Responses API, including use cases for
each. It also updates the index page to reference the new documentation.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2368
2025-07-24 12:20:04 -04:00
Mohit Gaur
5ef2baacdc
fix: update check-workflows-use-hashes to use github error format (#2875)
# What does this PR do?
Updates the script `scripts/check-workflows-use-hashes.sh` to improve
error reporting by adopting GitHub Actions error annotation format.

* Updated the script to use GitHub Actions error annotation format
(`::error file={name},line={line},col={col}::{message}`) making error
messages more actionable and easier to locate in workflows.
* Modified the script to include line numbers for `uses:` references by
using `grep -n` and extracting line numbers, improving the precision of
error reporting.

Closes #2778

## Test Plan

- Violation check - Created test file with mixed SHA/non-SHA actions

```
echo 'uses: actions/checkout@v4' > test-workflow.yml
echo 'uses: actions/upload-artifact@main' >> test-workflow.yml
```
Result: Correctly detected violations with precise line numbers
```
./scripts/check-workflows-use-hashes.sh
Output:
::error file=test-workflow.yml,line=14::uses non-SHA action ref: uses: actions/checkout@v4
::error file=test-workflow.yml,line=20::uses non-SHA action ref: uses: actions/upload-artifact@main
```

- Verified existing project workflows pass
```
./scripts/check-workflows-use-hashes.sh
# Result: Exit code 0 (all workflows properly SHA-pinned)
```
2025-07-24 17:41:17 +02:00
Matthew Farrellee
e33a50480d
fix: starter template and litellm backward compat conflict for openai (#2885)
# What does this PR do?

openai/models.py has backward compat entries for litellm model names.
the starter template includes these in the list of registered models.
the inclusion results in duplicate model registrations.

the backward compat is no longer necessary.

## Test Plan

ci
2025-07-24 17:28:37 +02:00
Sarthak Deshpande
cd8715d327
chore: Added openai compatible vector io endpoints for chromadb (#2489)
Some checks failed
Integration Tests / discover-tests (push) Successful in 3s
Coverage Badge / unit-tests (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 12s
Test External Providers / test-external-providers (venv) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 51s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 49s
Integration Tests / test-matrix (push) Failing after 53s
Pre-commit / pre-commit (push) Successful in 1m42s
# What does this PR do?
This PR implements the openai compatible endpoints for chromadb

Closes #2462 

## Test Plan
Ran ollama llama stack server and ran the command
`pytest -sv --stack-config=http://localhost:8321
tests/integration/vector_io/test_openai_vector_stores.py
--embedding-model all-MiniLM-L6-v2`
8 failed, 27 passed, 8 skipped, 1 xfailed
The failed ones are regarding files api

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com>
Co-authored-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-07-23 13:51:58 -07:00
Derek Higgins
fd2aab8582
fix: prevent shell redirection issues with pip dependencies (#2867)
- Use printf to to escape special characters (e.g. < > )
- Apply escaping to pip_dependencies and special_pip_deps

Resolves shell interpretation of >= operators as redirections that were
causing build failing to respect versions and unexpected file creation
in /app directory.

Closes: #2866

## Test Plan
Manually tested, will also be tested by existing CI

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-23 21:43:33 +02:00
Derek Higgins
427136bb63
fix: cleanup after build_container.sh (#2869)
- rm TEMP_DIR when build_container.sh succeeds
- prevents multiple temp directories with Containerfile being left in
/tmp

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-23 11:54:54 -07:00