Commit graph

115 commits

Author SHA1 Message Date
Sébastien Han
4f3f28f718
chore: use dependency-groups for dev (#2287)
# What does this PR do?

The previous `[project.optional-dependencies]` was misrepresenting what
the packages were. They were NOT optional dependencies to the project
but development dependencies. Unlike optional dependencies, development
dependencies are local-only and will not be included in the project
requirements when published to PyPI or other indexes. As such,
development dependencies are not included in the [project] table.
Additionally, the dev group is synced by default.

Source:

https://docs.astral.sh/uv/concepts/projects/dependencies/#development-dependencies

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-27 23:00:17 +02:00
Mark Campbell
e7e9ec0379
chore: fix visible comments in pr template (#2279)
# What does this PR do?
This PR adds updated comments for the PR template as comments were
showing up in PRs when they were not meant to
2025-05-27 15:42:33 +02:00
Ashwin Bharambe
3faf1e4a79
feat: enable MCP execution in Responses impl (#2240)
## Test Plan

```
pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
  --provider=stack:together --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```
2025-05-24 14:20:42 -07:00
Sébastien Han
37f1e8a7f7
fix: use proper service account for kube auth (#2227)
# What does this PR do?

Not sure why it passed CI earlier...

Strange only 24 workflows run here
https://github.com/meta-llama/llama-stack/pull/2216 so the test never
ran...

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-21 15:28:21 -07:00
Sébastien Han
6a62e783b9
chore: refactor workflow writting (#2225)
# What does this PR do?

Use a composite action to avoid similar steps repetitions and
centralization of the defaults.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-21 17:31:14 +02:00
Sébastien Han
c25acedbcd
chore: remove k8s auth in favor of k8s jwks endpoint (#2216)
# What does this PR do?

Kubernetes since 1.20 exposes a JWKS endpoint that we can use with our
recent oauth2 recent implementation.
The CI test has been kept intact for validation.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-21 16:23:54 +02:00
Sébastien Han
3f6368d56c
ci: enable ruff output format for github (#2214)
# What does this PR do?

Update output format to enable automatic inline annotations.

![Screenshot 2025-05-20 at 10 55
38](https://github.com/user-attachments/assets/f943aa00-9b60-4cdb-b434-67b2de8b79f2)

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-20 09:04:03 -07:00
dependabot[bot]
1341916caf
chore(github-deps): bump astral-sh/setup-uv from 5.4.1 to 6.0.1 (#2197) 2025-05-18 02:09:56 -04:00
Charlie Doern
f02f7b28c1
feat: add huggingface post_training impl (#2132)
# What does this PR do?


adds an inline HF SFTTrainer provider. Alongside touchtune -- this is a
super popular option for running training jobs. The config allows a user
to specify some key fields such as a model, chat_template, device, etc

the provider comes with one recipe `finetune_single_device` which works
both with and without LoRA.

any model that is a valid HF identifier can be given and the model will
be pulled.

this has been tested so far with CPU and MPS device types, but should be
compatible with CUDA out of the box

The provider processes the given dataset into the proper format,
establishes the various steps per epoch, steps per save, steps per eval,
sets a sane SFTConfig, and runs n_epochs of training

if checkpoint_dir is none, no model is saved. If there is a checkpoint
dir, a model is saved every `save_steps` and at the end of training.


## Test Plan

re-enabled post_training integration test suite with a singular test
that loads the simpleqa dataset:
https://huggingface.co/datasets/llamastack/simpleqa and a tiny granite
model: https://huggingface.co/ibm-granite/granite-3.3-2b-instruct. The
test now uses the llama stack client and the proper post_training API

runs one step with a batch_size of 1. This test runs on CPU on the
Ubuntu runner so it needs to be a small batch and a single step.

[//]: # (## Documentation)

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-05-16 14:41:28 -07:00
Matthew Farrellee
7aae8fadbf
fix: dev -> starter rename in ci (#2183)
continuation of https://github.com/meta-llama/llama-stack/pull/2181
2025-05-16 09:41:53 +02:00
Ashwin Bharambe
87e284f1a0 chore: update CODEOWNERS 2025-05-15 12:31:12 -07:00
Charlie Doern
e46de23be6
feat: refactor external providers dir (#2049)
# What does this PR do?

currently the "default" dir for external providers is
`/etc/llama-stack/providers.d`

This dir is not used anywhere nor created.

Switch to a more friendly `~/.llama/providers.d/`

This allows external providers to actually create this dir and/or
populate it upon installation, `pip` cannot create directories in `etc`.

If a user does not specify a dir, default to this one

see https://github.com/containers/ramalama-stack/issues/36

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-05-15 20:17:03 +02:00
Yuan Tang
7e25c8df28
fix: ReadTheDocs should display all versions (#2172)
# What does this PR do?

Currently the website only displays the "latest" version. This is
because our config and workflow do not include version information. This
PR adds missing version info.

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-05-15 11:41:15 -04:00
Ihar Hrachyshka
c3f27de3ea
chore: Update triagers list with new additions (#2180)
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-05-15 11:39:25 -04:00
Ihar Hrachyshka
268725868e
chore: enforce no git tags or branches in external github actions (#2159)
# What does this PR do?

Don't allow git tags and branches for external actions.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-05-14 20:40:06 +02:00
Nathan Weinberg
a1fbfb51e2
ci(chore): use hashes for all version pinning (#2157)
# What does this PR do?
most third-party actions use hashes for pinning but not all

do proper hash pinning on all remaining actions using tags

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-14 14:59:58 +02:00
Sébastien Han
6371bb1b33
chore(refact)!: simplify config management (#1105)
# What does this PR do?

We are dropping configuration via CLI flag almost entirely. If any
server configuration has to be tweak it must be done through the server
section in the run.yaml.

This is unfortunately a breaking change for whover was using:

* `--tls-*`
* `--disable_ipv6`

`--port` stays around and get a special treatment since we believe, it's
common for user dev to change port for quick experimentations.

Closes: https://github.com/meta-llama/llama-stack/issues/1076

## Test Plan

Simply do `llama stack run <config>` nothing should break :)

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-07 09:18:12 -07:00
Sébastien Han
b9b13a3670
chore: factor kube auth test distro (#2105)
# What does this PR do?

We just need to validate the auth so we don't need any API / Providers.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-05-06 09:49:49 -07:00
Ignas Baranauskas
2413447467
ci: add new action to install ollama, cache the model (#2054)
# What does this PR do?
This PR introduces a reusable GitHub Actions workflow for pulling and
running an Ollama model, with caching to avoid repeated downloads.

[//]: # (If resolving an issue, uncomment and update the line below)
Closes: #1949 

## Test Plan

1. Trigger a workflow that uses the Ollama setup. Confirm that:
- The model is pulled successfully.
- It is placed in the correct directory, official at the moment (not
~ollama/.ollama/models as per comment so need to confirm this).
2. Re-run the same workflow to validate that:
- The model is restored from the cache.
- Execution succeeds with the cached model.

[//]: # (## Documentation)
2025-05-06 14:56:20 +02:00
dependabot[bot]
1fbda6bfaa
chore(github-deps): bump actions/setup-python from 5.5.0 to 5.6.0 (#2099)
Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5.5.0 to 5.6.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-python/releases">actions/setup-python's
releases</a>.</em></p>
<blockquote>
<h2>v5.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Workflow updates related to Ubuntu 20.04 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1065">actions/setup-python#1065</a></li>
<li>Fix for Candidate Not Iterable Error by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1082">actions/setup-python#1082</a></li>
<li>Upgrade semver and <code>@​types/semver</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1091">actions/setup-python#1091</a></li>
<li>Upgrade prettier from 2.8.8 to 3.5.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1046">actions/setup-python#1046</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1081">actions/setup-python#1081</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v5.6.0">https://github.com/actions/setup-python/compare/v5...v5.6.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a26af69be9"><code>a26af69</code></a>
Bump ts-jest from 29.1.2 to 29.3.2 (<a
href="https://redirect.github.com/actions/setup-python/issues/1081">#1081</a>)</li>
<li><a
href="30eafe9548"><code>30eafe9</code></a>
Bump prettier from 2.8.8 to 3.5.3 (<a
href="https://redirect.github.com/actions/setup-python/issues/1046">#1046</a>)</li>
<li><a
href="5d95bc16d4"><code>5d95bc1</code></a>
Bump semver and <code>@​types/semver</code> (<a
href="https://redirect.github.com/actions/setup-python/issues/1091">#1091</a>)</li>
<li><a
href="6ed2c67c8a"><code>6ed2c67</code></a>
Fix for Candidate Not Iterable Error (<a
href="https://redirect.github.com/actions/setup-python/issues/1082">#1082</a>)</li>
<li><a
href="e348410e00"><code>e348410</code></a>
Remove Ubuntu 20.04 from workflows due to deprecation from 2025-04-15
(<a
href="https://redirect.github.com/actions/setup-python/issues/1065">#1065</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-python/compare/v5.5.0...a26af69be951a213d495a4c3e4e4022e16d87065">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-python&package-manager=github_actions&previous-version=5.5.0&new-version=5.6.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-05 10:25:45 +02:00
Ihar Hrachyshka
f36f68c590
ci: Disable no-commit-to-branch (#2084)
All merges produced by github are pushes to main, which makes the check
fail. The check is local by design, not meant for CI.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-05-01 11:43:43 -07:00
Sébastien Han
653e8526ec
chore(ci): misc Ollama improvements (#2052)
# What does this PR do?

* pull the embedding model so that it's not pulled during the distro
server startup sequence
* cache the models
* collect logs at the end of the workflow

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-30 07:05:28 -07:00
Alexey Rybak
afd7e750d9
ci: add UBI 9 container-build gate (#2039)
# What does this PR do?
* new workflow job **build-ubi9-container-distribution**
  * runs on the default `ubuntu-latest` runner
  * uses the existing `dev` template
* invokes `uv run llama stack build` with `.container_base =
"registry.access.redhat.com/ubi9/ubi-minimal:latest"`
  * inspects the resulting image to verify its entrypoint

# (Closes #1994)

## Test Plan
- CI now includes the `build-ubi9-container-distribution` job and will
turn green when that job passes on changes to build files
2025-04-30 09:52:57 +02:00
Yuan Tang
7532f4cdb2
chore(github-deps): bump astral-sh/setup-uv from 5 to 6 (#2051)
# What does this PR do?

This builds on top of
https://github.com/meta-llama/llama-stack/pull/2037 to include some
additional changes to fix integration tests builds.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-29 20:41:41 +02:00
Sébastien Han
7807a86358
ci: simplify external provider integration test (#2050)
Do not run Ollama, but only validate that the provider was loaded by the
server.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-28 23:10:27 +02:00
Sébastien Han
79851d93aa
feat: Add Kubernetes authentication (#1778)
# What does this PR do?

This commit adds a new authentication system to the Llama Stack server
with support for Kubernetes and custom authentication providers. Key
changes include:

- Implemented KubernetesAuthProvider for validating Kubernetes service
account tokens
- Implemented CustomAuthProvider for validating tokens against external
endpoints - this is the same code that was already present.
- Added test for Kubernetes
- Updated server configuration to support authentication settings
- Added documentation for authentication configuration and usage

The authentication system supports:
- Bearer token validation
- Kubernetes service account token validation
- Custom authentication endpoints

## Test Plan

Setup a Kube cluster using Kind or Minikube.

Run a server with:

```
server:
  port: 8321
  auth:
    provider_type: kubernetes
    config:
      api_server_url: http://url
      ca_cert_path: path/to/cert (optional)
```

Run:

```
curl -s -L -H "Authorization: Bearer $(kubectl create token my-user)" http://127.0.0.1:8321/v1/providers
```

Or replace "my-user" with your service account.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-28 22:24:58 +02:00
dependabot[bot]
c149cf2e0f
chore(github-deps): bump actions/setup-python from 5.5.0 to 5.6.0 (#2038)
[//]: # (dependabot-start)
⚠️  **Dependabot is rebasing this PR** ⚠️ 

Rebasing might not happen immediately, so don't worry if this takes some
time.

Note: if you make any changes to this PR yourself, they will take
precedence over the rebase.

---

[//]: # (dependabot-end)

Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5.5.0 to 5.6.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-python/releases">actions/setup-python's
releases</a>.</em></p>
<blockquote>
<h2>v5.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Workflow updates related to Ubuntu 20.04 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1065">actions/setup-python#1065</a></li>
<li>Fix for Candidate Not Iterable Error by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1082">actions/setup-python#1082</a></li>
<li>Upgrade semver and <code>@​types/semver</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1091">actions/setup-python#1091</a></li>
<li>Upgrade prettier from 2.8.8 to 3.5.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1046">actions/setup-python#1046</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1081">actions/setup-python#1081</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v5.6.0">https://github.com/actions/setup-python/compare/v5...v5.6.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a26af69be9"><code>a26af69</code></a>
Bump ts-jest from 29.1.2 to 29.3.2 (<a
href="https://redirect.github.com/actions/setup-python/issues/1081">#1081</a>)</li>
<li><a
href="30eafe9548"><code>30eafe9</code></a>
Bump prettier from 2.8.8 to 3.5.3 (<a
href="https://redirect.github.com/actions/setup-python/issues/1046">#1046</a>)</li>
<li><a
href="5d95bc16d4"><code>5d95bc1</code></a>
Bump semver and <code>@​types/semver</code> (<a
href="https://redirect.github.com/actions/setup-python/issues/1091">#1091</a>)</li>
<li><a
href="6ed2c67c8a"><code>6ed2c67</code></a>
Fix for Candidate Not Iterable Error (<a
href="https://redirect.github.com/actions/setup-python/issues/1082">#1082</a>)</li>
<li><a
href="e348410e00"><code>e348410</code></a>
Remove Ubuntu 20.04 from workflows due to deprecation from 2025-04-15
(<a
href="https://redirect.github.com/actions/setup-python/issues/1065">#1065</a>)</li>
<li>See full diff in <a
href="8d9ed9ac5c...a26af69be9">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-python&package-manager=github_actions&previous-version=5.5.0&new-version=5.6.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-28 11:46:29 +02:00
Alexey Rybak
1050837622
feat: Llama Stack Meta Reference installation script (#1383)
# What does this PR do?
Add installation script for Llama Stack Meta Reference distro (Docker
only).

# Closes #1374 

## Test Plan
./instal.sh

---------

Co-authored-by: Sébastien Han <seb@redhat.com>
2025-04-28 11:25:59 +02:00
Francisco Arceo
70488abe9c
chore: Remove distributions/** from integration, external provider, and unit tests (#2018)
# What does this PR do?
Remove `distributions/**` from integration, external provider, and unit
tests

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
N/A

[//]: # (## Documentation)

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-04-24 11:39:31 -04:00
Francisco Arceo
dc0d4763a0
chore: Update External Providers CI to not run on changes to docs, rfcs, and scripts (#2009)
# What does this PR do?
Update External Providers CI to not run on changes to docs, rfcs, and
scripts

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]

[//]: # (## Documentation)

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-04-24 11:24:07 -04:00
Sébastien Han
14e60e3c02
feat: include run.yaml in the container image (#2005)
As part of the build process, we now include the generated run.yaml
(based of the provided build configuration file) into the container. We
updated the entrypoint to use this run configuration as well.

Given this simple distribution configuration:

```
# build.yaml
version: '2'
distribution_spec:
  description: Use (an external) Ollama server for running LLM inference
  providers:
    inference:
    - remote::ollama
    vector_io:
    - inline::faiss
    safety:
    - inline::llama-guard
    agents:
    - inline::meta-reference
    telemetry:
    - inline::meta-reference
    eval:
    - inline::meta-reference
    datasetio:
    - remote::huggingface
    - inline::localfs
    scoring:
    - inline::basic
    - inline::llm-as-judge
    - inline::braintrust
    tool_runtime:
    - remote::brave-search
    - remote::tavily-search
    - inline::code-interpreter
    - inline::rag-runtime
    - remote::model-context-protocol
    - remote::wolfram-alpha
  container_image: "registry.access.redhat.com/ubi9"
image_type: container
image_name: test
```

Build it:
```
llama stack build --config build.yaml
```

Run it:

```
podman run --rm \
         -p 8321:8321 \
         -e OLLAMA_URL=http://host.containers.internal:11434 \
         --name llama-stack-server \
         localhost/leseb-test:0.2.2
```

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-24 11:29:53 +02:00
Sébastien Han
94f83382eb
feat: allow building distro with external providers (#1967)
# What does this PR do?

We can now build a distribution that includes external providers.
Closes: https://github.com/meta-llama/llama-stack/issues/1948

## Test Plan

Build a distro with an external provider following the doc instructions.

[//]: # (## Documentation)

Added.

Rendered:


![Screenshot 2025-04-18 at 11 26
39](https://github.com/user-attachments/assets/afcf3d50-8d30-48c3-8d24-06a4b3662881)

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-18 17:18:28 +02:00
Ihar Hrachyshka
6f97f9a593
chore: Use hashes to pull actions for build-single-provider job (#1977)
Other jobs already use hashes.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-04-17 10:26:08 +02:00
Charlie Doern
83b5523e2d
feat: add --providers to llama stack build (#1718)
# What does this PR do?

allow users to specify only the providers they want in the llama stack
build command. If a user wants a non-interactive build, but doesn't want
to use a template, `--providers` allows someone to specify something
like `--providers inference=remote::ollama` for a distro with JUST
ollama

## Test Plan

`llama stack build --providers inference=remote::ollama --image-type
venv`
<img width="1084" alt="Screenshot 2025-03-20 at 9 34 14 AM"
src="https://github.com/user-attachments/assets/502b5fa2-edab-4267-a595-4f987204a6a9"
/>

`llama stack run --image-type venv
/Users/charliedoern/projects/Documents/llama-stack/venv-run.yaml`
<img width="1149" alt="Screenshot 2025-03-20 at 9 35 19 AM"
src="https://github.com/user-attachments/assets/433765f3-6b7f-4383-9241-dad085b69228"
/>

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
2025-04-15 14:17:03 +02:00
Sébastien Han
68eeacec0e
docs: resync missing nvidia doc (#1947)
# What does this PR do?

Resync doc.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-14 15:09:16 +02:00
dependabot[bot]
2ec5879f14
chore(github-deps): bump astral-sh/setup-uv from 5.4.0 to 5.4.1 (#1881)
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
5.4.0 to 5.4.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v5.4.1 🌈 Add support for pep440 version specifiers</h2>
<h2>Changes</h2>
<p>With this release you can also use <a
href="https://peps.python.org/pep-0440/#version-specifiers">pep440
version specifiers</a> as <code>required-version</code> in
files<code>uv.toml</code>, <code>pyroject.toml</code> and in the
<code>version</code> input:</p>
<pre lang="yaml"><code>- name: Install a pep440-specifier-satisfying
version of uv
  uses: astral-sh/setup-uv@v5
  with:
    version: &quot;&gt;=0.4.25,&lt;0.5&quot;
</code></pre>
<h2>🐛 Bug fixes</h2>
<ul>
<li>Add support for pep440 version identifiers <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/353">#353</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>chore: update known checksums for 0.6.10 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/345">#345</a>)</li>
</ul>
<h2>📚 Documentation</h2>
<ul>
<li>Add pep440 to docs header <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/355">#355</a>)</li>
<li>Fix glob syntax link <a
href="https://github.com/flying-sheep"><code>@​flying-sheep</code></a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/349">#349</a>)</li>
<li>Add link to supported glob patterns <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/348">#348</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0c5e2b8115"><code>0c5e2b8</code></a>
Add pep440 to docs header (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/355">#355</a>)</li>
<li><a
href="794ea9455c"><code>794ea94</code></a>
Add support for pep440 version identifiers (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/353">#353</a>)</li>
<li><a
href="2d49baf2b6"><code>2d49baf</code></a>
chore: update known checksums for 0.6.10 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/345">#345</a>)</li>
<li><a
href="4fa25599ce"><code>4fa2559</code></a>
Fix glob syntax link (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/349">#349</a>)</li>
<li><a
href="224dce1d79"><code>224dce1</code></a>
Add link to supported glob patterns (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/348">#348</a>)</li>
<li>See full diff in <a
href="22695119d7...0c5e2b8115">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=astral-sh/setup-uv&package-manager=github_actions&previous-version=5.4.0&new-version=5.4.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-14 14:33:43 +02:00
Matthew Farrellee
6d6b40983e
refactor: update integration test workflow (#1856)
workflow -
 0. Checkout
 1. Install uv
 2. Install Ollama
 3. Pull Ollama image
 4. Start Ollama in background
 5. Set Up Environment and Install Dependencies
 6. Wait for Ollama to start
 7. Start Llama Stack server in background
 8. Wait for Llama Stack server to be ready
 9. Run Integration Tests

changes -
(4) starts the loading of the ollama model, it does not start ollama.
the model will be loaded when used. this step is removed.
 (6) is handled in (2). this step is removed.
 (2) is renamed to reflect it's dual purpose.
2025-04-14 12:17:51 +02:00
Sébastien Han
69554158fa
feat: add health to all providers through providers endpoint (#1418)
The `/v1/providers` now reports the health status of each
provider when implemented.

```
curl -L http://127.0.0.1:8321/v1/providers|jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4072  100  4072    0     0   246k      0 --:--:-- --:--:-- --:--:--  248k
{
  "data": [
    {
      "api": "inference",
      "provider_id": "ollama",
      "provider_type": "remote::ollama",
      "config": {
        "url": "http://localhost:11434"
      },
      "health": {
        "status": "OK"
      }
    },
    {
      "api": "vector_io",
      "provider_id": "faiss",
      "provider_type": "inline::faiss",
      "config": {
        "kvstore": {
          "type": "sqlite",
          "namespace": null,
          "db_path": "/Users/leseb/.llama/distributions/ollama/faiss_store.db"
        }
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "safety",
      "provider_id": "llama-guard",
      "provider_type": "inline::llama-guard",
      "config": {
        "excluded_categories": []
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "agents",
      "provider_id": "meta-reference",
      "provider_type": "inline::meta-reference",
      "config": {
        "persistence_store": {
          "type": "sqlite",
          "namespace": null,
          "db_path": "/Users/leseb/.llama/distributions/ollama/agents_store.db"
        }
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "telemetry",
      "provider_id": "meta-reference",
      "provider_type": "inline::meta-reference",
      "config": {
        "service_name": "llama-stack",
        "sinks": "console,sqlite",
        "sqlite_db_path": "/Users/leseb/.llama/distributions/ollama/trace_store.db"
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "eval",
      "provider_id": "meta-reference",
      "provider_type": "inline::meta-reference",
      "config": {
        "kvstore": {
          "type": "sqlite",
          "namespace": null,
          "db_path": "/Users/leseb/.llama/distributions/ollama/meta_reference_eval.db"
        }
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "datasetio",
      "provider_id": "huggingface",
      "provider_type": "remote::huggingface",
      "config": {
        "kvstore": {
          "type": "sqlite",
          "namespace": null,
          "db_path": "/Users/leseb/.llama/distributions/ollama/huggingface_datasetio.db"
        }
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "datasetio",
      "provider_id": "localfs",
      "provider_type": "inline::localfs",
      "config": {
        "kvstore": {
          "type": "sqlite",
          "namespace": null,
          "db_path": "/Users/leseb/.llama/distributions/ollama/localfs_datasetio.db"
        }
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "scoring",
      "provider_id": "basic",
      "provider_type": "inline::basic",
      "config": {},
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "scoring",
      "provider_id": "llm-as-judge",
      "provider_type": "inline::llm-as-judge",
      "config": {},
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "scoring",
      "provider_id": "braintrust",
      "provider_type": "inline::braintrust",
      "config": {
        "openai_api_key": "********"
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "tool_runtime",
      "provider_id": "brave-search",
      "provider_type": "remote::brave-search",
      "config": {
        "api_key": "********",
        "max_results": 3
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "tool_runtime",
      "provider_id": "tavily-search",
      "provider_type": "remote::tavily-search",
      "config": {
        "api_key": "********",
        "max_results": 3
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "tool_runtime",
      "provider_id": "code-interpreter",
      "provider_type": "inline::code-interpreter",
      "config": {},
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "tool_runtime",
      "provider_id": "rag-runtime",
      "provider_type": "inline::rag-runtime",
      "config": {},
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "tool_runtime",
      "provider_id": "model-context-protocol",
      "provider_type": "remote::model-context-protocol",
      "config": {},
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    },
    {
      "api": "tool_runtime",
      "provider_id": "wolfram-alpha",
      "provider_type": "remote::wolfram-alpha",
      "config": {
        "api_key": "********"
      },
      "health": {
        "status": "Not Implemented",
        "message": "Provider does not implement health check"
      }
    }
  ]
}
```

Per providers too:

```
curl -L http://127.0.0.1:8321/v1/providers/ollama
{"api":"inference","provider_id":"ollama","provider_type":"remote::ollama","config":{"url":"http://localhost:11434"},"health":{"status":"OK"}}
```

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-14 11:59:36 +02:00
Sébastien Han
389767010b
feat: ability to execute external providers (#1672)
# What does this PR do?

Providers that live outside of the llama-stack codebase are now
supported.
A new property `external_providers_dir` has been added to the main
config and can be configured as follow:

```
external_providers_dir: /etc/llama-stack/providers.d/
```

Where the expected structure is:

```
providers.d/
  inference/
    custom_ollama.yaml
    vllm.yaml
  vector_io/
    qdrant.yaml
```

Where `custom_ollama.yaml` is:

```
adapter:
  adapter_type: custom_ollama
  pip_packages: ["ollama", "aiohttp"]
  config_class: llama_stack_ollama_provider.config.OllamaImplConfig
  module: llama_stack_ollama_provider
api_dependencies: []
optional_api_dependencies: []
```

Obviously the package must be installed on the system, here is the
`llama_stack_ollama_provider` example:

```
$ uv pip show llama-stack-ollama-provider
Using Python 3.10.16 environment at: /Users/leseb/Documents/AI/llama-stack/.venv
Name: llama-stack-ollama-provider
Version: 0.1.0
Location: /Users/leseb/Documents/AI/llama-stack/.venv/lib/python3.10/site-packages
Editable project location: /private/var/folders/mq/rnm5w_7s2d3fxmtkx02knvhm0000gn/T/tmp.ZBHU5Ezxg4/ollama/llama-stack-ollama-provider
Requires:
Required-by:
```

Closes: https://github.com/meta-llama/llama-stack/issues/658

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-04-09 10:30:41 +02:00
Ihar Hrachyshka
e3d22d8de7
chore: fix hash for thollander/actions-comment-pull-request (#1900)
# What does this PR do?

Fix hash for v3.0.1 tag for a github action.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-04-09 10:10:07 +02:00
Ihar Hrachyshka
d5e0f32485
ci: pin github actions to hashes (#1776)
# What does this PR do?

Let dependabot move them with PRs (and human oversight).

Fixes #1775

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-04-01 17:09:39 +02:00
Ashwin Bharambe
b440a1dc42
test: make sure integration tests runs against the server (#1743)
Previously, the integration tests started the server, but never really
used it because `--stack-config=ollama` uses the ollama template and the
inline "llama stack as library" client, not the HTTP client.

This PR makes sure we test it both ways.

We also add agents tests to the mix.

## Test Plan 

GitHub

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
2025-03-31 22:38:47 +02:00
Sébastien Han
a4f458e1c1
ci: add myself to CODEOWNERS (#1823)
Signed-off-by: Sébastien Han <seb@redhat.com>
2025-03-28 09:37:42 -07:00
Yuan Tang
934de0a281
ci: Enforce concurrency to reduce CI loads (#1738)
# What does this PR do?

When multiple commits are pushed to a PR, multiple CI builds will be
triggered. This PR ensures that we only run one concurrent build for
each PR to reduce CI loads.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-20 22:28:47 -04:00
Botao Chen
f369871083
feat: [New Eval Benchamark] IfEval (#1708)
# What does this PR do?
In this PR, we added a new eval open benchmark IfEval based on paper
https://arxiv.org/abs/2311.07911 to measure the model capability of
instruction following.


## Test Plan
spin up a llama stack server with open-benchmark template

run `llama-stack-client --endpoint xxx eval run-benchmark
"meta-reference-ifeval" --model-id "meta-llama/Llama-3.3-70B-Instruct"
--output-dir "/home/markchen1015/" --num-examples 20` on client side and
get the eval aggregate results
2025-03-19 16:39:59 -07:00
Francisco Arceo
5418e63919
chore: Add triagers list #1561 (#1701)
# What does this PR do?
Adds triagers list

## Closes #1561

## Documentation
Was provided here: https://github.com/meta-llama/llama-stack/pull/1621

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-03-19 09:59:17 -07:00
Yuan Tang
22e560351e
ci: Add scheduled workflow to update changelog (#1503)
# What does this PR do?

This is a follow up from
https://github.com/meta-llama/llama-stack/pull/1463. cc @yanxi0830

---------

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
2025-03-18 14:39:22 -07:00
Yuan Tang
d609ffce2a
chore: Add links and badges to both unit and integration tests (#1632)
# What does this PR do?

This makes it easier to know the statuses of both and identifying failed
builds.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-18 14:12:17 -07:00
Sébastien Han
ffe9b3b278
ci(ollama): run more integration tests (#1636)
# What does this PR do?
Run additional tests in a matrix to accelerate the process and clearly
identify failing providers.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-03-18 08:54:42 -07:00
Yuan Tang
2d2bb701fa
ci: Add dependabot scans for Python deps (#1618)
# What does this PR do?

This PR adds dependabot updates for Python dependencies. In addition:
* Consistent weekly schedule on a specific day
* Specific commit messages
* `open-pull-requests-limit` is intentional to avoid upgrading
dependencies that will likely cause regressions. We want to keep the
focus here on security updates only

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-17 20:20:31 -07:00