Commit graph

2312 commits

Author SHA1 Message Date
Matthew Farrellee
0ff0658484
Merge bf63470c22 into 632cf9eb72 2025-07-24 16:41:17 -04:00
Sébastien Han
632cf9eb72
feat: Bring Your Own API (BYOA) (#2228)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Installer CI / lint (push) Failing after 3s
Integration Tests / discover-tests (push) Successful in 3s
Installer CI / smoke-test-on-dev (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 5s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Test External API and Providers / test-external (venv) (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 5s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 6s
Update ReadTheDocs / update-readthedocs (push) Failing after 8s
Integration Tests / test-matrix (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 5s
Test Llama Stack Build / build (push) Failing after 6s
Pre-commit / pre-commit (push) Successful in 57s
# What does this PR do?

Prototype on a new feature to allow new APIs to be plugged in Llama
Stack. Opened for early feedback on the approach and test appetite on
the functionality.

@ashwinb @raghotham open for early feedback, thanks!

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-07-24 13:41:14 -07:00
Charlie Doern
341504869e
fix: use logger for console telemetry (#2844)
# What does this PR do?

currently `print` is being used with custom formatting to achieve
telemetry output in the console_span_processor

This causes telemetry not to show up in log files when using
`LLAMA_STACK_LOG_FILE`. During testing it looks like telemetry is not
being captured when it is

switch to using Rich formatting with the logger and then strip the
formatting off when a log file is being used so the formatting looks
normal

## Test Plan

before:

console:

<img width="967" height="127" alt="Screenshot 2025-07-21 at 4 02 15 PM"
src="https://github.com/user-attachments/assets/b09518cc-9d38-4970-9877-70e2c41fcbb5"
/>


log file (no telemetry):

```
2025-07-21 16:01:32,481 llama_stack.providers.remote.inference.ollama.ollama:117 inference: checking connectivity to Ollama at `http://localhost:11434`...
2025-07-21 16:01:34,779 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
2025-07-21 16:01:35,083 __main__:587 server: Listening on ['::', '0.0.0.0']:8321
2025-07-21 16:01:35,091 uvicorn.error:84 uncategorized: Started server process [68679]
2025-07-21 16:01:35,091 uvicorn.error:48 uncategorized: Waiting for application startup.
2025-07-21 16:01:35,092 __main__:163 server: Starting up
2025-07-21 16:01:35,092 uvicorn.error:62 uncategorized: Application startup complete.
2025-07-21 16:01:35,092 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
2025-07-21 16:01:37,167 uvicorn.access:473 uncategorized: 127.0.0.1:53145 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
```

after:

console:

<img width="797" height="165" alt="Screenshot 2025-07-22 at 3 28 44 PM"
src="https://github.com/user-attachments/assets/44d40e3b-6502-439d-9ea5-38058b289962"
/>


log file:

```
2025-07-21 15:59:51,481 llama_stack.providers.remote.inference.ollama.ollama:117 inference: checking connectivity to Ollama at `http://localhost:11434`...
2025-07-21 15:59:53,801 opentelemetry.trace:537 uncategorized: Overriding of current TracerProvider is not allowed
2025-07-21 15:59:54,059 __main__:587 server: Listening on ['::', '0.0.0.0']:8321
2025-07-21 15:59:54,066 uvicorn.error:84 uncategorized: Started server process [68578]
2025-07-21 15:59:54,067 uvicorn.error:48 uncategorized: Waiting for application startup.
2025-07-21 15:59:54,067 __main__:163 server: Starting up
2025-07-21 15:59:54,067 uvicorn.error:62 uncategorized: Application startup complete.
2025-07-21 15:59:54,068 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
2025-07-21 15:59:55,381 [TELEMETRY] 19:59:55.381  /v1/openai/v1/chat/completions
2025-07-21 15:59:55,619 uvicorn.access:473 uncategorized: 127.0.0.1:53102 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
2025-07-21 15:59:55,621 [TELEMETRY] 19:59:55.621  /v1/openai/v1/chat/completions [StatusCode.OK] (240.07ms)
2025-07-21 15:59:55,622 [TELEMETRY]  19:59:55.620  127.0.0.1:53102 - "POST /v1/openai/v1/chat/completions HTTP/1.1" 200
```

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-24 16:26:59 -04:00
Kelly Brown
abade761e0
docs: Update nvidia docs template (#2893)
**Description**

Fixes generation issue in nvidia code gen file.

Closes #2873
2025-07-24 22:11:34 +02:00
Sébastien Han
226b877ca6
chore: install script should use starter (#2891)
Our demo installation script should pull the starter image. Ollama is
not being updated anymore as a distribution.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-24 12:18:02 -07:00
ehhuang
cbe89d2bdd
chore: return webmethod from find_matching_route (#2883)
This will be used to support API access control, i.e. Webmethod would
have a `required_scope` attribute, and we need access to that in the
middleware.
2025-07-24 11:37:21 -07:00
Ashwin Bharambe
1463b79218
feat(registry): make the Stack query providers for model listing (#2862)
This flips #2823 and #2805 by making the Stack periodically query the
providers for models rather than the providers going behind the back and
calling "register" on to the registry themselves. This also adds support
for model listing for all other providers via `ModelRegistryHelper`.
Once this is done, we do not need to manually list or register models
via `run.yaml` and it will remove both noise and annoyance (setting
`INFERENCE_MODEL` environment variables, for example) from the new user
experience.

In addition, it adds a configuration variable `allowed_models` which can
be used to optionally restrict the set of models exposed from a
provider.
2025-07-24 10:39:53 -07:00
Stefan Thaler
537dc693ee
chore: add mypy coverage to inspect.py and library_client.py in /distribution (#2707)
# What does this PR do?
Adds type guards in /distribution/inspect.py and ignores a valid-type
mypy error in library_client.py. This PR is part of issue #2647 . I'm
rather unsure whether ignoring the valid-type error is correct in this
case. It appears that args[0] is interpreted as [any] but I didn't find
any way to specify the type.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->


<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-07-24 09:51:46 -07:00
Charlie Doern
d4f0b430e2
docs: update list of apis (#2697)
# What does this PR do?

apis.md had a few APIs missing and incorrectly described APIs

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-24 09:50:14 -07:00
Sébastien Han
af9c707eaf
fix: various improvements on install.sh (#2724)
# What does this PR do?

Bulk improvements:

* The script has a better error reporting, when a command fails it will
print the logs of the failed command
* Better error handling using a trap to catch signal and perform proper
cleanup
* Cosmetic changes
* Added CI to test the image code against main
* Use the starter image and its latest tag

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-24 09:43:51 -07:00
Derek Higgins
4ea1f2aa9f
test: Add VLLM provider support to integration tests (#2757)
- Add setup-vllm GitHub action to start VLLM container
- Extend integration test matrix to support both ollama and vllm
providers
- Make test setup conditional based on provider type
- Add provider-specific environment variables and configurations
- vllm tests setup to run weekly or can be triggered manually (only
ollama on PR)

TODO:
investigate failing tests for vllm provider (safety and post_training)
Also need a proper fix for #2713 (tmp fix for this in the first commit
in this PR)
Closes: #1648

---------

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-24 09:42:26 -07:00
Mustafa Elbehery
6ab5760a1b
chore(test): migrate unit tests from unittest to pytest nvidia test safety (#2793)
This PR replaces unittest with pytest.

Part of https://github.com/meta-llama/llama-stack/issues/2680

cc @leseb

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-07-24 09:41:07 -07:00
Yuan Tang
9069d878ef
docs: Update CHANGELOG.md (#2874)
This updates the changelog to include recent releases.

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-07-24 09:36:28 -07:00
Christian Zaccaria
7f7b990b80
docs: Document use cases for Responses and Agents APIs (#2756)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This pull request adds documentation to clarify the differences between
the Agents API and the OpenAI Responses API, including use cases for
each. It also updates the index page to reference the new documentation.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2368
2025-07-24 12:20:04 -04:00
Mohit Gaur
5ef2baacdc
fix: update check-workflows-use-hashes to use github error format (#2875)
# What does this PR do?
Updates the script `scripts/check-workflows-use-hashes.sh` to improve
error reporting by adopting GitHub Actions error annotation format.

* Updated the script to use GitHub Actions error annotation format
(`::error file={name},line={line},col={col}::{message}`) making error
messages more actionable and easier to locate in workflows.
* Modified the script to include line numbers for `uses:` references by
using `grep -n` and extracting line numbers, improving the precision of
error reporting.

Closes #2778

## Test Plan

- Violation check - Created test file with mixed SHA/non-SHA actions

```
echo 'uses: actions/checkout@v4' > test-workflow.yml
echo 'uses: actions/upload-artifact@main' >> test-workflow.yml
```
Result: Correctly detected violations with precise line numbers
```
./scripts/check-workflows-use-hashes.sh
Output:
::error file=test-workflow.yml,line=14::uses non-SHA action ref: uses: actions/checkout@v4
::error file=test-workflow.yml,line=20::uses non-SHA action ref: uses: actions/upload-artifact@main
```

- Verified existing project workflows pass
```
./scripts/check-workflows-use-hashes.sh
# Result: Exit code 0 (all workflows properly SHA-pinned)
```
2025-07-24 17:41:17 +02:00
Matthew Farrellee
e33a50480d
fix: starter template and litellm backward compat conflict for openai (#2885)
# What does this PR do?

openai/models.py has backward compat entries for litellm model names.
the starter template includes these in the list of registered models.
the inclusion results in duplicate model registrations.

the backward compat is no longer necessary.

## Test Plan

ci
2025-07-24 17:28:37 +02:00
Matthew Farrellee
bf63470c22 feat: implement dynamic model detection support for inference providers using litellm
This enhancement allows inference providers using LiteLLMOpenAIMixin to validate
model availability against LiteLLM's official provider model listings, improving
reliability and user experience when working with different AI service providers.

- Add litellm_provider_name parameter to LiteLLMOpenAIMixin constructor
- Add check_model_availability method to LiteLLMOpenAIMixin using litellm.models_by_provider
- Update Gemini, Groq, and SambaNova inference adapters to pass litellm_provider_name
2025-07-24 09:49:32 -04:00
Sarthak Deshpande
cd8715d327
chore: Added openai compatible vector io endpoints for chromadb (#2489)
Some checks failed
Integration Tests / discover-tests (push) Successful in 3s
Coverage Badge / unit-tests (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Python Package Build Test / build (3.13) (push) Failing after 2s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 12s
Test External Providers / test-external-providers (venv) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 51s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 49s
Integration Tests / test-matrix (push) Failing after 53s
Pre-commit / pre-commit (push) Successful in 1m42s
# What does this PR do?
This PR implements the openai compatible endpoints for chromadb

Closes #2462 

## Test Plan
Ran ollama llama stack server and ran the command
`pytest -sv --stack-config=http://localhost:8321
tests/integration/vector_io/test_openai_vector_stores.py
--embedding-model all-MiniLM-L6-v2`
8 failed, 27 passed, 8 skipped, 1 xfailed
The failed ones are regarding files api

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com>
Co-authored-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-07-23 13:51:58 -07:00
Derek Higgins
fd2aab8582
fix: prevent shell redirection issues with pip dependencies (#2867)
- Use printf to to escape special characters (e.g. < > )
- Apply escaping to pip_dependencies and special_pip_deps

Resolves shell interpretation of >= operators as redirections that were
causing build failing to respect versions and unexpected file creation
in /app directory.

Closes: #2866

## Test Plan
Manually tested, will also be tested by existing CI

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-23 21:43:33 +02:00
Derek Higgins
427136bb63
fix: cleanup after build_container.sh (#2869)
- rm TEMP_DIR when build_container.sh succeeds
- prevents multiple temp directories with Containerfile being left in
/tmp

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-23 11:54:54 -07:00
IAN MILLER
51affe5783
fix: fixed test_access_control.py unit test (#2876)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I fixed test_access_policy() function providing provider_model_id in
each register model endpoint to pass assertions.

Initially I faced this issue:
```
tests/unit/server/test_quota.py::test_authenticated_quota_allows_up_to_limit
tests/unit/server/test_quota.py::test_authenticated_quota_blocks_after_limit
tests/unit/server/test_quota.py::test_anonymous_quota_allows_up_to_limit
tests/unit/server/test_quota.py::test_anonymous_quota_blocks_after_limit
  /Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/aiosqlite/core.py:105: DeprecationWarning: The default datetime adapter is deprecated as of Python 3.12; see the sqlite3 documentation for suggested replacement recipes
    result = function()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================================== short test summary info ===============================================================================
FAILED tests/unit/server/test_access_control.py::test_access_policy - AssertionError: assert 'test_provider/model-1' == 'model-1'
==================================================================== 1 failed, 436 passed, 194 warnings in 20.09s ====================================================================
```

After resolved, all works:
```
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================= 437 passed, 194 warnings in 19.41s =========================================================================
```

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run ` ./scripts/unit-tests.sh`
2025-07-23 11:50:20 -07:00
Ashwin Bharambe
2fcfb0f0b5
fix: bring back dell template (#2880)
This template is definitely needed since it (and related docker, which
will push soon) is used by folks at Dell.
2025-07-23 11:40:59 -07:00
Mark Campbell
8353ad4981
fix: search mode validation for rag query (#2857)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I noticed a few issues with my implementation of the search mode
validation for RagQuery.
This PR replaces the check for search mode in RagQuery with a Literal. 
There were issues before with
```
TypeError: Object of type RAGSearchMode is not JSON serializable
```
When using 
```
query_config = RAGQueryConfig(max_chunks=6, mode="vector").model_dump()
```

It also fixes the fact that despite user input "vector" was always the
used search mode.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Verify that a chosen search mode works when using Rag Query or use below
agent config:
```
agent = Agent(
    client,
    model=model_id,
    instructions="You are a helpful assistant",
    tools=[
        {
            "name": "builtin::rag/knowledge_search",
            "args": {
                "vector_db_ids": [vector_db_id],
                "query_config": {
                    "mode": "keyword",
                    "max_chunks": 6
                }
            },
        }
    ],
)
```

Running Unit Tests:
```
uv sync --extra dev
uv run pytest tests/unit/rag/test_rag_query.py -v
```
2025-07-23 11:25:12 -07:00
Francisco Arceo
2aba2c1236
chore: Moving vector store and vector store files helper methods to openai_vector_store_mixin (#2863)
# What does this PR do?
Moving vector store and vector store files helper methods to
`openai_vector_store_mixin.py`

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
The tests are already supported in the CI and tests the inline providers
and current integration tests.

Note that the `vector_index` fixture will be test `milvus_vec_adapter`,
`faiss_vec_adapter`, and `sqlite_vec_adapter` in
`tests/unit/providers/vector_io/test_vector_io_openai_vector_stores.py`.

Additionally, the integration tests in `integration-vector-io-tests.yml`
runs `tests/integration/vector_io` tests for the following providers:
```python
vector-io-provider: ["inline::faiss", "inline::sqlite-vec", "inline::milvus", "remote::chromadb", "remote::pgvector"]
```

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-23 13:35:48 -04:00
Matthew Farrellee
e1ed152779
chore: create OpenAIMixin for inference providers with an OpenAI-compat API that need to implement openai_* methods (#2835)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 3s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 6s
Integration Tests / discover-tests (push) Successful in 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 6s
Python Package Build Test / build (3.13) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 11s
Test External Providers / test-external-providers (venv) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 9s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 16s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 18s
Integration Tests / test-matrix (push) Failing after 18s
Pre-commit / pre-commit (push) Successful in 1m14s
# What does this PR do?

add an `OpenAIMixin` for use by inference providers who remote endpoints
support an OpenAI compatible API.

use is demonstrated by refactoring
- OpenAIInferenceAdapter
- NVIDIAInferenceAdapter (adds embedding support)
- LlamaCompatInferenceAdapter

## Test Plan

existing unit and integration tests
2025-07-23 06:49:40 -04:00
grs
fc67ad408a
chore: add some documentation for access policy rules (#2785)
# What does this PR do?

Adds some documentation on setting explicit access_policy rules in
config.
2025-07-23 10:27:27 +02:00
Sébastien Han
c0563c0560
fix: honour deprecation of --config and --template (#2856)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 3s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 12s
Test Llama Stack Build / build-single-provider (push) Failing after 6s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 13s
Unit Tests / unit-tests (3.12) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 13s
Unit Tests / unit-tests (3.13) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 12s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Python Package Build Test / build (3.13) (push) Failing after 6s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 11s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 10s
Test External Providers / test-external-providers (venv) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Integration Tests / test-matrix (push) Failing after 12s
Test Llama Stack Build / build (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 25s
Pre-commit / pre-commit (push) Successful in 1m33s
# What does this PR do?

https://github.com/meta-llama/llama-stack/pull/2716/ broke commands
like:

```
 python -m llama_stack.distribution.server.server --config
 llama_stack/templates/starter/run.yaml
 ```

 And will fail with:

 ```
 Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/server/server.py", line 626, in <module>
    main()
  File "/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/server/server.py", line 402, in main
    config_file = resolve_config_or_template(args.config, Mode.RUN)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/utils/config_resolution.py", line 43, in resolve_config_or_template
    config_path = Path(config_or_template)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pathlib.py", line 1162, in __init__
    super().__init__(*args)
  File "/opt/homebrew/Cellar/python@3.12/3.12.8/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pathlib.py", line 373, in __init__
    raise TypeError(
TypeError: argument should be a str or an os.PathLike object where __fspath__ returns a str, not 'NoneType'
```

Complaining that no positional arguments are present. We now honour the
deprecation until --config and --template are removed completely.

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Both ` python -m llama_stack.distribution.server.server --config
llama_stack/templates/starter/run.yaml` and ` python -m
llama_stack.distribution.server.server
llama_stack/templates/starter/run.yaml` should run the server. Same for
`--template starter`.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-22 20:48:23 -07:00
Derek Higgins
340448e0aa
fix: optimize container build by enabling uv cache (#2855)
- Remove --no-cache flags from uv pip install commands to enable caching
- Mount host uv cache directory to container for persistent caching
- Set UV_LINK_MODE=copy to prevent uv using hardlinks
- When building the starter image
o Build time reduced from ~4:45 to ~3:05 on subsequent builds
(environment specific)
  o Eliminates re-downloading of 3G+ of data on each build 
  o Cache size: ~6.2G (when building starter image)

Fixes excessive data downloads during distro container builds.

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-07-22 16:51:52 -07:00
Ashwin Bharambe
3b83032555
feat(registry): more flexible model lookup (#2859)
This PR updates model registration and lookup behavior to be slightly
more general / flexible. See
https://github.com/meta-llama/llama-stack/issues/2843 for more details.

Note that this change is backwards compatible given the design of the
`lookup_model()` method.

## Test Plan

Added unit tests
2025-07-22 15:22:48 -07:00
Mustafa Elbehery
9736f096f6
chore(test): fix flaky telemetry tests (#2815)
Some checks failed
Installer CI / lint (push) Failing after 2s
Installer CI / smoke-test (push) Has been skipped
Integration Tests / discover-tests (push) Successful in 3s
Coverage Badge / unit-tests (push) Failing after 6s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 3s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 11s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 6s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 15s
Test Llama Stack Build / generate-matrix (push) Successful in 11s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 16s
Test Llama Stack Build / build-single-provider (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Integration Tests / test-matrix (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s
Test External Providers / test-external-providers (venv) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test Llama Stack Build / build (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 48s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 55s
Unit Tests / unit-tests (3.13) (push) Failing after 52s
Pre-commit / pre-commit (push) Successful in 1m42s
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR fixes flaky telemetry tests

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
See https://github.com/meta-llama/llama-stack/pull/2814
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-07-22 12:30:14 -07:00
Omer Tuchfeld
c1a63fcd87
fix(install): explicit docker.io usage (#2850)
# What does this PR do?

When podman is used and the registry is omitted, podman will prompt the
user. However, we're piping the output of podman to /dev/null and the
user will not see the prompt, the script will end abruptly and this is
confusing.

This commit explicitly uses the docker.io registry for the ollama image
and the llama-stack image so that the prompt is avoided.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

I ran the script on a machine with podman and the issue was resolved

## Image

Before the fix, this is what would happen:

<img width="748" height="95" alt="image"
src="https://github.com/user-attachments/assets/9c609f88-c0a8-45e7-a789-834f64f601e5"
/>

Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
2025-07-22 20:36:48 +02:00
Francisco Arceo
20c3197952
chore: Making name optional in openai_create_vector_store (#2858)
# What does this PR do?
chore: Making name optional in openai_create_vector_store


# Closes https://github.com/meta-llama/llama-stack/issues/2706

## Test Plan
CI and unit tests

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-22 13:31:31 -04:00
ehhuang
8e1a2b4703
chore: remove *_openai_compat providers (#2849)
# What does this PR do?
These are no longer needed as llama-stack-evals can run against OAI
endpoints directly.

## Test Plan
2025-07-22 10:25:36 -07:00
Omer Tuchfeld
5e18d4d097
fix(agent): ensure turns are sorted (#2854)
# What does this PR do?

Ensures that session turns retrieved from the agent persistence layer
are sorted by their `started_at` timestamp, as the key-value store does
not guarantee order.

Closes #2852

## Test Plan

- [ ] Add unit tests
2025-07-22 10:24:51 -07:00
Jeremy Bonghwan Choi
b5a6ecc331
docs: minor fix of the pgvector provider spec description (#2847)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 3s
Coverage Badge / unit-tests (push) Failing after 6s
Python Package Build Test / build (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 10s
Python Package Build Test / build (3.12) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 10s
Integration Tests / test-matrix (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 25s
Unit Tests / unit-tests (3.13) (push) Failing after 24s
Pre-commit / pre-commit (push) Successful in 1m17s
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->

minor update of the pgvector doc, changing 'faiss' to 'pgvector'

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-07-21 22:10:35 -07:00
Francisco Arceo
2bc96613f9
chore: Adding demo script and importing it into the docs (#2848)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Coverage Badge / unit-tests (push) Failing after 6s
Integration Tests / discover-tests (push) Successful in 7s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Test Llama Stack Build / build-single-provider (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 14s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 14s
Test Llama Stack Build / generate-matrix (push) Successful in 10s
Test External Providers / test-external-providers (venv) (push) Failing after 9s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 19s
Python Package Build Test / build (3.13) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s
Integration Tests / test-matrix (push) Failing after 13s
Python Package Build Test / build (3.12) (push) Failing after 1m1s
Update ReadTheDocs / update-readthedocs (push) Failing after 1m0s
Test Llama Stack Build / build (push) Failing after 52s
Pre-commit / pre-commit (push) Successful in 2m39s
# What does this PR do?
This PR adds the quickstart as a file to the docs so that it can be more
easily maintained and run, as mentioned in
https://github.com/meta-llama/llama-stack/pull/2800.

## Test Plan
I could add this as a test in the CI but I wasn't sure if we wanted to
add additional jobs there. 😅

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-21 22:53:32 -04:00
Francisco Arceo
c8f274347d
chore: Adding Access Control for OpenAI Vector Stores methods (#2772)
# What does this PR do?

Refactors the vector store routing logic by moving OpenAI-compatible
vector store operations from the `VectorIORouter` to the
`VectorDBsRoutingTable`.

Closes https://github.com/meta-llama/llama-stack/issues/2761

## Test Plan

Added unit tests to cover new routing logic and ACL checks.

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-21 16:22:44 -04:00
ehhuang
0d7a90b8bc
chore: merge --config and --template in server.py (#2716)
# What does this PR do?
Part of #2696 

## Test Plan
Run `llama stack run starter`

Error:
```

myenv ❯ llama stack run starters
WARNING  2025-07-10 12:12:43,052 llama_stack.cli.stack.run:82 server: Conda detected. Using conda environment myenv for the run.
usage: llama stack run [-h] [--port PORT] [--image-name IMAGE_NAME] [--env KEY=VALUE]
                       [--image-type {conda,venv}] [--enable-ui]
                       [config | template]
llama stack run: error: Could not resolve config or template 'starters'.

Tried the following locations:
  1. As file path: /Users/erichuang/projects/llama-stack-git/starters
  2. As template: /Users/erichuang/projects/llama-stack-git/llama_stack/templates/starters/run.yaml
  3. As built distribution: (/Users/erichuang/.llama/distributions/llamastack-starters/starters-run.yaml, /Users/erichuang/.llama/distributions/starters/starters-run.yaml)

Available templates: dell, test-env, vllm-gpu, test-template, cerebras, openai-api-verification, sambanova, passthrough, direct-config, together, openai, fireworks, meta-reference-gpu, __pycache__, dev, ollama, watsonx, remote-vllm, llama_api, groq, dummy, oracle, nvidia, ci-tests, postgres-demo, test-stack, bedrock, starter, hf-serverless, hf-endpoint, tgi, open-benchmark, verification

Did you mean one of these templates?
  - starter
  - together
  - postgres-demo
```
2025-07-21 13:19:27 -07:00
Charlie Doern
9a03526672
fix: uvicorn respect log_config (#2842)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Integration Tests / discover-tests (push) Successful in 9s
Coverage Badge / unit-tests (push) Failing after 13s
Python Package Build Test / build (3.12) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 16s
Python Package Build Test / build (3.13) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 17s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 18s
Test External Providers / test-external-providers (venv) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
Integration Tests / test-matrix (push) Failing after 12s
Pre-commit / pre-commit (push) Successful in 1m7s
2025-07-21 12:50:39 -07:00
Sébastien Han
019ddda138
fix: graceful SIGINT on server (#2831)
# What does this PR do?

After https://github.com/meta-llama/llama-stack/pull/2818, SIGINT will
print a stack trace. This is because uvicorn re-raises SIGINT and it
gets converted by Python internal signal handler (default handles
SIGINT) to KeyboardInterrupt exception. We know simply catch the
exception to get a clean exit, this is not changing the behavior on
SIGINT.

## Test Plan

Run the server, hit Ctrl+C or `kill -2 <server pid>` and expect a clean
exit with no stack trace.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-21 11:35:15 -07:00
ehhuang
d0208df286
test: skip flaky telemetry tests (#2814)
# What does this PR do?
example error:
4625086977

## Test Plan
2025-07-21 10:01:40 -07:00
IAN MILLER
9e6860b9cf
fix: remove @pytest.mark.asyncio from test_get_raw_document_text.py (#2840)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The pre-commit workflow was failing in the main branch and removing
`@pytest.mark.asyncio `from `test_get_raw_document_text.py` fixed that.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-07-21 09:14:34 -07:00
Ondrej Metelka
89c49eb003
feat: Allow application/yaml as mime_type (#2575)
# What does this PR do?
Allow application/yaml as mime_type for documents.

## Test Plan
Added unit tests.
2025-07-21 15:43:32 +02:00
Mustafa Elbehery
b2c7543af7
fix(vectordb): VectorDBInput has no provider_id (#2830)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 3s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 13s
Test External Providers / test-external-providers (venv) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Python Package Build Test / build (3.13) (push) Failing after 11s
Python Package Build Test / build (3.12) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 13s
Integration Tests / discover-tests (push) Successful in 21s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Unit Tests / unit-tests (3.13) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 24s
Integration Tests / test-matrix (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 53s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 51s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 59s
Pre-commit / pre-commit (push) Successful in 1m35s
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR add `provider_id` field to `VectorDBInput` class.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

fixes https://github.com/meta-llama/llama-stack/issues/2819

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-07-21 14:03:40 +02:00
Sébastien Han
ecd28f0085
chore: add contribution guideline around PRs (#2811)
More contributing guidelines.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-07-21 05:47:17 -04:00
Christian Zaccaria
56269245c2
fix: Add permissions for pull request creation in coverage-badge workflow (#2832)
# What does this PR do?
The workflow that automatically creates a PR to update the Coverage
Badge fails as the `GITHUB_TOKEN` doesn't have write permissions.

As opposed to providing write permissions to the token, we can provide
the permissions for just this workflow with this PR.
2025-07-21 11:40:00 +02:00
dependabot[bot]
28956f9447
chore(github-deps): bump astral-sh/setup-uv from 6.3.1 to 6.4.1 (#2827)
Some checks failed
Integration Tests / discover-tests (push) Successful in 2s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 23s
Test External Providers / test-external-providers (venv) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 22s
Python Package Build Test / build (3.12) (push) Failing after 19s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 24s
Python Package Build Test / build (3.13) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 24s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 24s
Integration Tests / test-matrix (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 3m13s
Unit Tests / unit-tests (3.13) (push) Failing after 3m15s
Pre-commit / pre-commit (push) Successful in 4m55s
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.3.1 to 6.4.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v6.4.1 🌈 Hotfix: Ignore deps starting with uv when finding uv
version</h2>
<h2>Changes</h2>
<p>Thank you <a
href="https://github.com/phpmypython"><code>@​phpmypython</code></a> for
raising a PR to fix this issue!</p>
<h2>🐛 Bug fixes</h2>
<ul>
<li>Ignore deps starting with uv when finding uv version <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/492">#492</a>)</li>
</ul>
<h2>v6.4.0 🌈 Add input <code>version-file</code></h2>
<h2>Changes</h2>
<p>You can now use the <code>version-file</code> input to specify a file
that contains the version of uv to install.
This can either be a <code>pyproject.toml</code> or <code>uv.toml</code>
file which defines a <code>required-version</code> or
uv defined as a dependency in <code>pyproject.toml</code> or
<code>requirements.txt</code>.</p>
<pre lang="yaml"><code>- name: Install uv based on the version defined
in requirements.txt
  uses: astral-sh/setup-uv@v6
  with:
    version-file: &quot;requirements.txt&quot;
</code></pre>
<h2>🚀 Enhancements</h2>
<ul>
<li>Add input version-file <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/486">#486</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>chore: update known versions for 0.7.22 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/488">#488</a>)</li>
<li>Bump dependencies <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/487">#487</a>)</li>
<li>chore: update known versions for 0.7.21 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/483">#483</a>)</li>
<li>chore: update known versions for 0.7.20 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/480">#480</a>)</li>
<li>chore: update known versions for 0.7.19 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/475">#475</a>)</li>
<li>chore: update known versions for 0.7.18 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/473">#473</a>)</li>
<li>chore: update known versions for 0.7.17 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/468">#468</a>)</li>
<li>chore: update known versions for 0.7.16 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/466">#466</a>)</li>
<li>chore: update known versions for 0.7.15 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/463">#463</a>)</li>
</ul>
<h2>📚 Documentation</h2>
<ul>
<li>Add FAQ on changed cache and cache upload behavior <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/477">#477</a>)</li>
</ul>
<h2>⬆️ Dependency updates</h2>
<ul>
<li>Bump dependencies <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/487">#487</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7edac99f96"><code>7edac99</code></a>
Ignore deps starting with uv when finding uv version (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/492">#492</a>)</li>
<li><a
href="05273c154d"><code>05273c1</code></a>
chore: update known versions for 0.7.22 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/488">#488</a>)</li>
<li><a
href="de545d4421"><code>de545d4</code></a>
Bump dependencies (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/487">#487</a>)</li>
<li><a
href="b75ff7d7b8"><code>b75ff7d</code></a>
Add input version-file (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/486">#486</a>)</li>
<li><a
href="c893ac1cb2"><code>c893ac1</code></a>
chore: update known versions for 0.7.21 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/483">#483</a>)</li>
<li><a
href="a905f0040b"><code>a905f00</code></a>
chore: update known versions for 0.7.20 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/480">#480</a>)</li>
<li><a
href="d4219d1620"><code>d4219d1</code></a>
Add FAQ on changed cache and cache upload behavior (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/477">#477</a>)</li>
<li><a
href="aaefb91b77"><code>aaefb91</code></a>
chore: update known versions for 0.7.19 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/475">#475</a>)</li>
<li><a
href="c05b3e180b"><code>c05b3e1</code></a>
chore: update known versions for 0.7.18 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/473">#473</a>)</li>
<li><a
href="1bf1493664"><code>1bf1493</code></a>
chore: update known versions for 0.7.17 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/468">#468</a>)</li>
<li>Additional commits viewable in <a
href="bd01e18f51...7edac99f96">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=astral-sh/setup-uv&package-manager=github_actions&previous-version=6.3.1&new-version=6.4.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-19 21:10:35 -05:00
ehhuang
0a6e588f68
feat: enable auth for LocalFS Files Provider (#2773)
Some checks failed
Integration Tests / discover-tests (push) Successful in 4s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 7s
Test Llama Stack Build / generate-matrix (push) Successful in 7s
Coverage Badge / unit-tests (push) Failing after 16s
Test Llama Stack Build / build-single-provider (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s
Unit Tests / unit-tests (3.12) (push) Failing after 13s
Test External Providers / test-external-providers (venv) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 13s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 17s
Update ReadTheDocs / update-readthedocs (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 23s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Unit Tests / unit-tests (3.13) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 23s
Test Llama Stack Build / build (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 25s
Python Package Build Test / build (3.13) (push) Failing after 2m19s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 2m25s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 2m32s
Integration Tests / test-matrix (push) Failing after 2m24s
Pre-commit / pre-commit (push) Successful in 3m57s
# What does this PR do?
Supports authentication for LocalFS Files provider.

closes https://github.com/meta-llama/llama-stack/issues/2760

## Test Plan
CI. added tests.
2025-07-18 19:11:01 -07:00
Ashwin Bharambe
dd303327f3
feat(ci): add a ci-tests distro (#2826) 2025-07-18 17:11:06 -07:00
Ashwin Bharambe
199f859eec
feat(vllm): periodically refresh models (#2823)
Just like #2805 but for vLLM.

We also make VLLM_URL env variable optional (not required) -- if not
specified, the provider silently sits idle and yells eventually if
someone tries to call a completion on it. This is done so as to allow
this provider to be present in the `starter` distribution.

## Test Plan

Set up vLLM, copy the starter template and set `{ refresh_models: true,
refresh_models_interval: 10 }` for the vllm provider and then run:

```
ENABLE_VLLM=vllm VLLM_URL=http://localhost:8000/v1 \
  uv run llama stack run --image-type venv /tmp/starter.yaml
```

Verify that `llama-stack-client models list` brings up the model
correctly from vLLM.
2025-07-18 15:53:09 -07:00