Commit graph

1454 commits

Author SHA1 Message Date
Ashwin Bharambe
803114180b
chore(logging)!: use comma as a delimiter (#3095)
Some checks failed
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 14s
Test Llama Stack Build / generate-matrix (push) Successful in 11s
Test Llama Stack Build / build-single-provider (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 11s
Unit Tests / unit-tests (3.13) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 18s
Update ReadTheDocs / update-readthedocs (push) Failing after 12s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 29s
Test External API and Providers / test-external (venv) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 34s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 26s
Integration Tests (Replay) / discover-tests (push) Successful in 31s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 30s
Python Package Build Test / build (3.13) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 32s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 33s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 40s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 40s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 42s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 44s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 32s
Pre-commit / pre-commit (push) Successful in 1m24s
Test Llama Stack Build / build (push) Failing after 54s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 13s
Using commas is much more shell-friendly. A semi-colon is a statement
delimiter and must be escaped.

This change is backwards incompatible but I imagine not many people are
using this. I could be wrong. Looking for feedback.
2025-08-11 11:51:43 -07:00
Mustafa Elbehery
b5b5f5b9ae
chore: add mypy prompt guard (#2678)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR adds static type coverage to `llama-stack`

Part of https://github.com/meta-llama/llama-stack/issues/2647

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
2025-08-11 08:40:40 -07:00
Francisco Arceo
7448a4a88c
chore: Updating UI Sidebar (#3081)
# What does this PR do?
This updates the sidebar to look a little more like other popular ones.

<img width="1913" height="1352" alt="Screenshot 2025-08-08 at 11 25
31 PM"
src="https://github.com/user-attachments/assets/00738412-1101-48ec-8864-cde4a8733ec1"
/>

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-11 07:39:52 -07:00
Matthew Farrellee
8faff92591
chore: remove redundant code in unregister_toolgroup (#3092)
# What does this PR do?

removes redundant code

## Test Plan

ci
2025-08-11 07:38:54 -07:00
Eran Cohen
a4bad6c0b4
feat: Add Google Vertex AI inference provider support (#2841)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 10s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 12s
Python Package Build Test / build (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 10s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 15s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 18s
Test Llama Stack Build / build (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 47s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 49s
Unit Tests / unit-tests (3.13) (push) Failing after 39s
Pre-commit / pre-commit (push) Successful in 1m37s
# What does this PR do?
- Add new Vertex AI remote inference provider with litellm integration
- Support for Gemini models through Google Cloud Vertex AI platform
- Uses Google Cloud Application Default Credentials (ADC) for
authentication
- Added VertexAI models: gemini-2.5-flash, gemini-2.5-pro,
gemini-2.0-flash.
- Updated provider registry to include vertexai provider
- Updated starter template to support Vertex AI configuration
- Added comprehensive documentation and sample configuration

<!-- If resolving an issue, uncomment and update the line below -->
relates to https://github.com/meta-llama/llama-stack/issues/2747

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Eran Cohen <eranco@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-08-11 08:22:04 -04:00
Varsha
69dc789e15
docs: Add unsupported search mode info about FAISS (#3089) 2025-08-10 17:34:34 -06:00
Varsha
ce72a28525
docs: Update doc on search modes for Milvus (#3078)
# What does this PR do?
Update Milvus doc on using search modes. 

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
2025-08-10 18:48:36 -04:00
Vlastimil Eliáš
1677d6bffd
feat: Flash-Lite 2.0 and 2.5 models added to Gemini inference provider (#3058)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 4s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 11s
Python Package Build Test / build (3.12) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 15s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Python Package Build Test / build (3.13) (push) Failing after 10s
Unit Tests / unit-tests (3.12) (push) Failing after 9s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 19s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 59s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 1m1s
Unit Tests / unit-tests (3.13) (push) Failing after 59s
Pre-commit / pre-commit (push) Successful in 1m41s
PR adds Flash-Lite 2.0 and 2.5 models to the Gemini inference provider

Closes #3046 

## Test Plan
I was not able to locate any existing test for this provider, so I
performed manual testing. But the change is really trivial and
straightforward.
2025-08-08 13:48:15 -07:00
ehhuang
0b5a794c27
fix: telemetry logger spams when queue is full (#3070)
# What does this PR do?


## Test Plan
Ran a stress test on chat completion endpoint locally:

For 10 concurrent users over 3 minutes:
Before:
<img width="1440" height="201" alt="image"
src="https://github.com/user-attachments/assets/24e0d580-186e-4e24-931e-2b936c5859b6"
/>

After:
<img width="1434" height="204" alt="image"
src="https://github.com/user-attachments/assets/4b806d88-f822-41e9-b25a-018cc4bec866"
/>

(Will send scripts in a future PR.)
2025-08-08 13:47:36 -07:00
Francisco Arceo
9b70bb9d4b
feat(ui): Adding Vector Store Files to Admin UI (#3041)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 4s
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 16s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Test External API and Providers / test-external (venv) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 20s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 20s
Python Package Build Test / build (3.13) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 57s
Unit Tests / unit-tests (3.12) (push) Failing after 55s
Pre-commit / pre-commit (push) Successful in 2m10s
# What does this PR do?
This PR updates the UI to create new:
1. `/files/{file_id}` 
2. `files/{file_id}/contents`
3. `files/{file_id}/contents/{content_id}` 

The list of files are clickable which brings the user to the FIles
Detail page
The File Details page shows all of the content
The content details page shows the individual chunk/content parsed 

These only use our existing OpenAI compatible APIs. I have a separate
branch where I expose the embedding and the portal is correctly
populated. I included the FE rendering code for that in this PR.

1. `vector-stores/{vector_store_id}/files/{file_id}` 
<img width="1913" height="1351" alt="Screenshot 2025-08-06 at 10 20
12 PM"
src="https://github.com/user-attachments/assets/08010d5e-60c8-4bd9-9f3e-a2731ed1ad55"
/>

2. `vector-stores/{vector_store_id}/files/{file_id}/contents`
<img width="1920" height="1272" alt="Screenshot 2025-08-06 at 10 21
23 PM"
src="https://github.com/user-attachments/assets/3b91e67b-5d64-4fe6-91b6-18f14587e850"
/>

3.
`vector-stores/{vector_store_id}/files/{file_id}/contents/{content_id}`
<img width="1916" height="1273" alt="Screenshot 2025-08-06 at 10 21
45 PM"
src="https://github.com/user-attachments/assets/d38ca996-e8d9-460c-9e39-7ff0cb5ec0dd"
/>

## Test Plan
I tested this locally and reviewed the code. I generated a significant
share of the code with Claude and some manual intervention. After this,
I'll begin adding tests to the UI.

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-08-08 07:44:06 -07:00
Jiayi Ni
9e78f2da96
docs: fix the docs for NVIDIA Inference Provider (#3055)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 20s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s
Test Llama Stack Build / build-single-provider (push) Failing after 11s
Test Llama Stack Build / generate-matrix (push) Successful in 14s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 20s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 16s
Test External API and Providers / test-external (venv) (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 20s
Python Package Build Test / build (3.12) (push) Failing after 23s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 18s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Python Package Build Test / build (3.13) (push) Failing after 21s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 17s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 51s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 58s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 56s
Pre-commit / pre-commit (push) Successful in 1m40s
Test Llama Stack Build / build (push) Failing after 14s
# What does this PR do?
Fix the NVIDIA inference docs by updating API methods, model IDs, and
embedding example.

## Test Plan
N/A
2025-08-08 11:27:55 +02:00
Varsha
e3928e6a29
feat: Implement hybrid search in Milvus (#2644)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 5s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 11s
Test External API and Providers / test-external (venv) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s
Pre-commit / pre-commit (push) Successful in 57s
# What does this PR do?
This PR implements hybrid search for Milvus DB based on the inbuilt
milvus support.
   
    To test:
    ```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s
--tb=long --disable-warnings --asyncio-mode=auto
    ```

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
2025-08-07 09:42:03 +02:00
slekkala1
26d3d25c87
feat: Add moderations create api (#3020)
# What does this PR do?
This PR adds Open AI Compatible moderations api. Currently only
implementing for llama guard safety provider
Image support, expand to other safety providers and Deprecation of
run_shield will be next steps.


## Test Plan
Added 2 new tests for safe/ unsafe text prompt examples for the new open
ai compatible moderations api usage
`SAFETY_MODEL=llama-guard3:8b LLAMA_STACK_CONFIG=starter uv run pytest
-v tests/integration/safety/test_safety.py
--text-model=llama3.2:3b-instruct-fp16
--embedding-model=all-MiniLM-L6-v2 --safety-shield=ollama`
(Had some issue with previous PR
https://github.com/meta-llama/llama-stack/pull/2994 while updating and
accidentally close it , reopened new one )
2025-08-06 13:51:23 -07:00
Charlie Doern
0caef40e0d
fix: telemetry fixes (inference and core telemetry) (#2733)
# What does this PR do?

I found a few issues while adding new metrics for various APIs:

currently metrics are only propagated in `chat_completion` and
`completion`

since most providers use the `openai_..` routes as the default in
`llama-stack-client inference chat-completion`, metrics are currently
not working as expected.

in order to get them working the following had to be done:

1. get the completion as usual
2. use new `openai_` versions of the metric gathering functions which
use `.usage` from the `OpenAI..` response types to gather the metrics
which are already populated.
3. define a `stream_generator` which counts the tokens and computes the
metrics (only for stream=True)
5. add metrics to response


NOTE: I could not add metrics to `openai_completion` where stream=True
because that ONLY returns an `OpenAICompletion` not an AsyncGenerator
that we can manipulate.


acquire the lock, and add event to the span as the other `_log_...`
methods do

some new output:

`llama-stack-client inference chat-completion --message hi`

<img width="2416" height="425" alt="Screenshot 2025-07-16 at 8 28 20 AM"
src="https://github.com/user-attachments/assets/ccdf1643-a184-4ddd-9641-d426c4d51326"
/>


and in the client:

<img width="763" height="319" alt="Screenshot 2025-07-16 at 8 28 32 AM"
src="https://github.com/user-attachments/assets/6bceb811-5201-47e9-9e16-8130f0d60007"
/>

these were not previously being recorded nor were they being printed to
the server due to the improper console sink handling

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-08-06 13:37:40 -07:00
Ashwin Bharambe
c252dfa3ef
fix(ci): allow tests to skip llama stack client instantiation (#3052)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 9s
Unit Tests / unit-tests (3.12) (push) Failing after 6s
Test Llama Stack Build / build-single-provider (push) Failing after 11s
Python Package Build Test / build (3.13) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 20s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 18s
Update ReadTheDocs / update-readthedocs (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 23s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 16s
Test External API and Providers / test-external (venv) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 20s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 15s
Pre-commit / pre-commit (push) Successful in 1m16s
Test Llama Stack Build / build (push) Failing after 8s
2025-08-06 11:15:41 -07:00
Nathan Weinberg
e9fced773a
refactor: introduce common 'ResourceNotFoundError' exception (#3032)
# What does this PR do?
1. Introduce new base custom exception class `ResourceNotFoundError`
2. All other "not found" exception classes now inherit from
`ResourceNotFoundError`

Closes #3030

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-08-06 10:22:55 -07:00
ehhuang
3e695cf320
chore: update postgres_demo with new config (#3045)
# What does this PR do?

closes https://github.com/meta-llama/llama-stack/issues/3044

## Test Plan
matches starter's template
2025-08-06 07:48:40 -07:00
Ashwin Bharambe
7f834339ba
chore(misc): make tests and starter faster (#3042)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 9s
Python Package Build Test / build (3.12) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s
Test Llama Stack Build / generate-matrix (push) Successful in 11s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 22s
Test External API and Providers / test-external (venv) (push) Failing after 14s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 15s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 22s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Unit Tests / unit-tests (3.13) (push) Failing after 14s
Test Llama Stack Build / build-single-provider (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Unit Tests / unit-tests (3.12) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 18s
Test Llama Stack Build / build (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 16s
Python Package Build Test / build (3.13) (push) Failing after 53s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 59s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 1m1s
Update ReadTheDocs / update-readthedocs (push) Failing after 1m6s
Pre-commit / pre-commit (push) Successful in 1m53s
A bunch of miscellaneous cleanup focusing on tests, but ended up
speeding up starter distro substantially.

- Pulled llama stack client init for tests into `pytest_sessionstart` so
it does not clobber output
- Profiling of that told me where we were doing lots of heavy imports
for starter, so lazied them
- starter now starts 20seconds+ faster on my Mac
- A few other smallish refactors for `compat_client`
2025-08-05 14:55:05 -07:00
IAN MILLER
e12524af85
feat: create unregister shield API endpoint in Llama Stack (#2853)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s
Integration Tests (Replay) / discover-tests (push) Successful in 13s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 24s
Test External API and Providers / test-external (venv) (push) Failing after 12s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Python Package Build Test / build (3.13) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 27s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 29s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 21s
Unit Tests / unit-tests (3.12) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 35s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 39s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 35s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 35s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 1m2s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 1m4s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 1m2s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 7s
Pre-commit / pre-commit (push) Successful in 2m21s
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->

Extend the Shields Protocol and implement the capability to unregister
previously registered shields and CLI for shields management.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2581 

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

First of, test API for shields
1. Install and start Ollama:

`ollama serve`


2. Pull Llama Guard Model in Ollama:

`ollama pull llama-guard3:8b`

3. Configure env variables:

```
export ENABLE_OLLAMA=ollama
export OLLAMA_URL=http://localhost:11434
```

4. Build Llama Stack distro:

`llama stack build --template starter --image-type venv  `

5. Start Llama Stack server:

`llama stack run starter --port 8321`

6. Check if Ollama model is available:

`curl -X GET http://localhost:8321/v1/models | jq '.data[] |
select(.provider_id=="ollama")'`

7. Register a new Shield using Ollama provider:

```
curl -X POST http://localhost:8321/v1/shields \
 -H "Content-Type: application/json" \
 -d '{
   "shield_id": "test-shield",
   "provider_id": "llama-guard",
   "provider_shield_id": "ollama/llama-guard3:8b",
   "params": {}
 }'
```

`{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}%
`

8. Check if shield was registered:

`curl -X GET http://localhost:8321/v1/shields/test-shield`


`{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}%
`

9. Run shield:

```
curl -X POST http://localhost:8321/v1/safety/run-shield \
  -H "Content-Type: application/json" \
  -d '{
    "shield_id": "test-shield",
    "messages": [
      {
        "role": "user",
        "content": "How can I hack into someone computer?"
      }
    ],
    "params": {}
  }'
```

`{"violation":{"violation_level":"error","user_message":"I can't answer
that. Can I help with something
else?","metadata":{"violation_type":"S2"}}}% `

10. Unregister shield:

`curl -X DELETE http://localhost:8321/v1/shields/test-shield`

`null% `

11. Verify shield was deleted:

`curl -X GET http://localhost:8321/v1/shields/test-shield`

`{"detail":"Invalid value: Shield 'test-shield' not found"}%`

All tests passed 

```
========================================================================== 430 passed, 194 warnings in 19.54s ==========================================================================
/Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:78: RuntimeWarning: coroutine 'close_litellm_async_clients' was never awaited
  loop.close()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Wrote HTML report to htmlcov-3.12/index.html

```
2025-08-05 07:33:46 -07:00
github-actions[bot]
e565b91182 build: Bump version to 0.2.17
Some checks failed
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 7s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 13s
Test Llama Stack Build / generate-matrix (push) Successful in 8s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 9s
Python Package Build Test / build (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s
Test Llama Stack Build / build-single-provider (push) Failing after 5s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Test External API and Providers / test-external (venv) (push) Failing after 7s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Unit Tests / unit-tests (3.12) (push) Failing after 7s
Python Package Build Test / build (3.13) (push) Failing after 9s
Update ReadTheDocs / update-readthedocs (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
Unit Tests / unit-tests (3.13) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 12s
Pre-commit / pre-commit (push) Successful in 1m38s
2025-08-05 01:43:30 +00:00
Francisco Arceo
eac1e0c7d4
chore: Fixing Markdown renderer (#3038) 2025-08-04 14:16:09 -07:00
Nathan Weinberg
68b0071861
chore: standardize session not found error (#3031)
# What does this PR do?
1. Creates a new `SessionNotFoundError` class
2. Implements the new class where appropriate 

Relates to #2379

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-08-04 13:12:02 -07:00
Nathan Weinberg
05cfa213b6
chore: standardize tool group not found error (#2986)
# What does this PR do?
1. Creates a new `ToolGroupNotFoundError` class
2. Implements the new class where appropriate 

Relates to #2379

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-08-04 11:41:33 -07:00
Ashwin Bharambe
cc87995e2b
chore: rename templates to distributions (#3035)
As the title says. Distributions is in, Templates is out.

`llama stack build --template` --> `llama stack build --distro`. For
backward compatibility, the previous option is kept but results in a
warning.

Updated `server.py` to remove the "config_or_template" backward
compatibility since it has been a couple releases since that change.
2025-08-04 11:34:17 -07:00
Eran Cohen
e5b542dd8e
feat: switch to async completion in LiteLLM OpenAI mixin (#3029)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 13s
Unit Tests / unit-tests (3.12) (push) Failing after 11s
Python Package Build Test / build (3.13) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 17s
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 21s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 29s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 27s
Test External API and Providers / test-external (venv) (push) Failing after 23s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 25s
Unit Tests / unit-tests (3.13) (push) Failing after 25s
Pre-commit / pre-commit (push) Successful in 1m10s
2025-08-03 12:08:56 -07:00
Varsha
3c2aee610d
refactor: Remove double filtering based on score threshold (#3019)
# What does this PR do?
Remove score_threshold based check from `OpenAIVectorStoreMixin`

Closes: https://github.com/meta-llama/llama-stack/issues/3018

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-08-02 15:57:03 -07:00
IAN MILLER
a749d5f4a4
refactor: remove Conda support from Llama Stack (#2969)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR is responsible for removal of Conda support in Llama Stack

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #2539

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-08-02 15:52:59 -07:00
ehhuang
6ac710f3b0
fix(recording): endpoint resolution (#3013)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 5s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 15s
Integration Tests (Replay) / run-replay-mode-tests (push) Failing after 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s
Python Package Build Test / build (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 15s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Python Package Build Test / build (3.13) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 18s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s
Unit Tests / unit-tests (3.12) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 56s
Unit Tests / unit-tests (3.13) (push) Failing after 52s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 55s
Pre-commit / pre-commit (push) Successful in 1m49s
# What does this PR do?


## Test Plan
2025-08-01 16:23:54 -07:00
Matthew Farrellee
140ee7d337
fix: sambanova inference provider (#2996)
Some checks failed
Integration Tests (Replay) / discover-tests (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 10s
Integration Tests (Replay) / run-replay-mode-tests (push) Failing after 5s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 14s
Python Package Build Test / build (3.13) (push) Failing after 8s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 15s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 12s
Python Package Build Test / build (3.12) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 17s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 10s
Test External API and Providers / test-external (venv) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 10s
Unit Tests / unit-tests (3.13) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 46s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 49s
Pre-commit / pre-commit (push) Successful in 1m29s
# What does this PR do?

closes #2995 

update SambaNovaInferenceAdapter to efficiently use LiteLLMOpenAIMixin

## Test Plan

```
$ uv run pytest -s -v tests/integration/inference --stack-config inference=sambanova --text-model sambanova/Meta-Llama-3.1-8B-Instruct
...
======================== 10 passed, 84 skipped, 3 xfailed, 51 warnings in 8.14s ========================
```
2025-08-01 09:09:14 -07:00
Varsha
1f0766308d
feat: Add openAI compatible APIs to Qdrant (#2465)
Some checks failed
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 15s
Test Llama Stack Build / generate-matrix (push) Successful in 9s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 19s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test Llama Stack Build / build-single-provider (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 15s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s
Integration Tests (Replay) / discover-tests (push) Successful in 24s
Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 18s
Update ReadTheDocs / update-readthedocs (push) Failing after 12s
Unit Tests / unit-tests (3.12) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 16s
Python Package Build Test / build (3.12) (push) Failing after 20s
Python Package Build Test / build (3.13) (push) Failing after 18s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 18s
Test External API and Providers / test-external (venv) (push) Failing after 18s
Unit Tests / unit-tests (3.13) (push) Failing after 19s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 42s
Integration Tests (Replay) / run-replay-mode-tests (push) Failing after 22s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 1m12s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 1m15s
Test Llama Stack Build / build (push) Failing after 32s
Pre-commit / pre-commit (push) Successful in 2m39s
# What does this PR do?
Adds support to Vector store Open AI APIs in Qdrant.

<!-- If resolving an issue, uncomment and update the line below -->
 Closes #2463 


## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-08-01 00:41:34 -04:00
Francisco Arceo
33cca26154
chore: Enabling Integration tests for Weaviate (#2882)
# What does this PR do?

This PR (1) enables the files API for Weaviate and (2) enables
integration tests for Weaviate, which adds a docker container to the
github action.

This PR also handles a couple of edge cases for in creating the
collection and ensuring the tests all pass.

## Test Plan
CI enabled

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-31 20:29:50 -04:00
Matthew Farrellee
218c89fff1
feat: Add clear error message when API key is missing (#2992)
# What does this PR do?

Improve user experience by providing specific guidance when no API key
is available, showing both provider data header and config options with
the correct field name for each provider.

Also adds comprehensive test coverage for API key resolution scenarios.

addresses #2990 for providers using litellm openai mixin

## Test Plan

`./scripts/unit-tests.sh
tests/unit/providers/inference/test_litellm_openai_mixin.py`
2025-07-31 16:33:16 -04:00
Nathan Weinberg
ffb6306fbd
fix: remove redundant code from unregister_vector_db (#2983)
get_vector_db() will raise an exception if a vector store won't be
returned

client handling is redundant

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-31 09:22:04 -07:00
Nehanth Narendrula
3a574ef23c
fix: remove unused DPO parameters from schema and tests (#2988)
# What does this PR do?

I removed these DPO parameters from the schema in [this
PR](https://github.com/meta-llama/llama-stack/pull/2804), but I may not
have done it correctly, since they were reintroduced in [this
commit](cb7354a9ce (diff-4e9a8cb358213d6118c4b6ec2a76d0367af06441bf0717e13a775ade75e2061dR15081))—likely
due to a pre-commit hook.

I've made the changes again, and the pre-commit hook automatically
updated the spec sheet.
2025-07-31 09:11:08 -07:00
Nehanth Narendrula
cf73146132
feat: Enable DPO training with HuggingFace inline provider (#2825)
Some checks failed
Integration Tests / discover-tests (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 7s
Integration Tests / record-tests (push) Has been skipped
Integration Tests / run-tests (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 22s
Python Package Build Test / build (3.13) (push) Failing after 16s
Test Llama Stack Build / generate-matrix (push) Successful in 19s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 31s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 32s
Test External API and Providers / test-external (venv) (push) Failing after 32s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 36s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 39s
Update ReadTheDocs / update-readthedocs (push) Failing after 31s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 42s
Test Llama Stack Build / build-single-provider (push) Failing after 37s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 35s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 37s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 40s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 42s
Unit Tests / unit-tests (3.12) (push) Failing after 36s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 40s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 45s
Test Llama Stack Build / build (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 1m1s
Unit Tests / unit-tests (3.13) (push) Failing after 1m0s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 1m6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 1m8s
Pre-commit / pre-commit (push) Successful in 1m50s
What does this PR do?

This PR adds support for Direct Preference Optimization (DPO) training
via the existing HuggingFace inline provider. It introduces a new DPO
training recipe, config schema updates, dataset integration, and
end-to-end testing to support preference-based fine-tuning with TRL.

Test Plan

Added integration test:

tests/integration/post_training/test_post_training.py::TestPostTraining::test_preference_optimize

Ran tests on both CPU and CUDA environments

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-83.ec2.internal>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-07-30 23:33:36 -07:00
Ashwin Bharambe
2665f00102
chore(rename): move llama_stack.distribution to llama_stack.core (#2975)
We would like to rename the term `template` to `distribution`. To
prepare for that, this is a precursor.

cc @leseb
2025-07-30 23:30:53 -07:00
Francisco Arceo
f3d5459647
feat(UI): adding MVP playground UI (#2828)
# What does this PR do?
I've been tinkering a little with a simple chat playground in the UI, so
I'm opening the PR with what's kind of a WIP.

If you look at the first commit, that includes the big part of the
changes. The rest of the files changed come from adding installing the
`shadcn` components.

Note this is missing a lot; e.g.,
- sessions
- document upload
- audio (the shadcn components install these by default from
https://shadcn-chatbot-kit.vercel.app/docs/components/chat)

I still need to wire up a lot more to make it actually fully functional
but it does basic chat using the LS Typescript Client.

Basic demo: 

<img width="1329" height="1430" alt="Image"
src="https://github.com/user-attachments/assets/917a2096-36d4-4925-b83b-f1f2cda98698"
/>

<img width="1319" height="1424" alt="Image"
src="https://github.com/user-attachments/assets/fab1583b-1c72-4bf3-baf2-405aee13c6bb"
/>


<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-30 19:44:16 -07:00
Nathan Weinberg
406ca72957
fix: remove redundant code from unregister_dataset (#2971)
Some checks failed
Integration Tests / discover-tests (push) Has been skipped
Integration Tests / record-tests (push) Has been skipped
Integration Tests / run-tests (push) Has been skipped
Python Package Build Test / build (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 12s
Test Llama Stack Build / generate-matrix (push) Successful in 10s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 14s
Test Llama Stack Build / build-single-provider (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 10s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 12s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 20s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 13s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 19s
Unit Tests / unit-tests (3.12) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 17s
Test Llama Stack Build / build (push) Failing after 7s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 19s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 24s
Python Package Build Test / build (3.13) (push) Failing after 53s
Update ReadTheDocs / update-readthedocs (push) Failing after 52s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1m0s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 58s
Pre-commit / pre-commit (push) Successful in 1m44s
get_dataset() will raise an exception if a dataset won't be returned

client handling is redundant

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-30 16:40:01 -07:00
Sai Prashanth S
cb7354a9ce
docs: Add detailed docstrings to API models and update OpenAPI spec (#2889)
This PR focuses on improving the developer experience by adding
comprehensive docstrings to the API data models across the Llama Stack.
These docstrings provide detailed explanations for each model and its
fields, making the API easier to understand and use.

**Key changes:**
- **Added Docstrings:** Added reST formatted docstrings to Pydantic
models in the `llama_stack/apis/` directory. This includes models for:
  - Agents (`agents.py`)
  - Benchmarks (`benchmarks.py`)
  - Datasets (`datasets.py`)
  - Inference (`inference.py`)
  - And many other API modules.
- **OpenAPI Spec Update:** Regenerated the OpenAPI specification
(`docs/_static/llama-stack-spec.yaml` and
`docs/_static/llama-stack-spec.html`) to include the new docstrings.
This will be reflected in the API documentation, providing richer
information to users.

**Impact:**
- Developers using the Llama Stack API will have a better understanding
of the data structures.
- The auto-generated API documentation is now more informative.

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-07-30 16:32:59 -07:00
Nathan Weinberg
cd5c6a2fcd
chore: standardize vector store not found error (#2968)
# What does this PR do?
1. Creates a new `VectorStoreNotFoundError` class
2. Implements the new class where appropriate 

Relates to #2379

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-30 15:19:16 -07:00
Nathan Weinberg
272a3e9937
chore: standardize dataset not found error (#2962)
# What does this PR do?
1. Adds a broad schema for custom exception classes in the Llama Stack
project
2. Creates a new `DatasetNotFoundError` class
3. Implements the new class where appropriate 

Relates to #2379

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-30 14:52:46 -07:00
Nathan Weinberg
c5622c79de
chore: standardize model not found error (#2964)
# What does this PR do?
1. Creates a new `ModelNotFoundError` class
2. Implements the new class where appropriate 

Relates to #2379

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-30 12:19:53 -07:00
Ashwin Bharambe
fd2aaf4978
fix: use OLLAMA_URL to activate Ollama provider in starter (#2963)
We tried to always keep Ollama enabled. However doing so makes the
provider implementation half-assed -- should it error when it cannot
connect to Ollama or not? What happens during periodic model refresh?
Etc. Instead do the same thing we do for vLLM -- use the `OLLAMA_URL` to
conditionally enable the provider.

## Test Plan

Run `uv run llama stack build --template starter --image-type venv
--run` with and without `OLLAMA_URL` set. Verify using
`llama-stack-client provider list` that ollama is correctly enabled.
2025-07-30 10:11:17 -07:00
Matthew Farrellee
b69bafba30
fix(library_client): improve initialization error handling and prevent AttributeError (#2944)
# What does this PR do?

- Initialize route_impls to None in constructor to prevent
AttributeError
- Consolidate initialization checks to single point in request() method
- Improve error message to be more helpful ("Please call initialize()
first")
- Add comprehensive test suite to prevent regressions

The library client now has better error handling when users forget to
call initialize(), showing a clear ValueError instead of confusing
AttributeError. All initialization validation is now centralized in the
request() method, with internal methods (_call_non_streaming,
_call_streaming, _convert_body) relying on this single check for
cleaner, more maintainable code.

closes #2943 

## Test Plan

`./scripts/unit-tests.sh`
2025-07-30 11:58:47 -04:00
Matthew Farrellee
2d1ab3ca55
fix: use same image_name logic for build & run config (#2949)
# What does this PR do?

when --image-name is not provided the build script default to the
image_name in the config, this makes sure the same is done for the run
script

## Test Plan

llama stack build w/o --image-name
2025-07-29 12:54:21 -07:00
Ashwin Bharambe
08b4a1deb3
feat(tests): introduce inference record/replay to increase test reliability (#2941)
Implements a comprehensive recording and replay system for inference API
calls that eliminates dependency on online inference providers during
testing. The system treats inference as deterministic by recording real
API responses and replaying them in subsequent test runs. Applies to
OpenAI clients (which should cover many inference requests) as well as
Ollama AsyncClient.

For storing, we use a hybrid system: Sqlite for fast lookups and JSON
files for easy greppability / debuggability.

As expected, tests become much much faster (more than 3x in just
inference testing.)

```bash
LLAMA_STACK_TEST_INFERENCE_MODE=record LLAMA_STACK_TEST_RECORDING_DIR=<...> \
  uv run pytest -s -v tests/integration/inference \
  --stack-config=starter \
  -k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \
  --text-model="ollama/llama3.2:3b-instruct-fp16" \
  --embedding-model=sentence-transformers/all-MiniLM-L6-v2
```

```bash
LLAMA_STACK_TEST_INFERENCE_MODE=replay LLAMA_STACK_TEST_RECORDING_DIR=<...> \
  uv run pytest -s -v tests/integration/inference \
  --stack-config=starter \
  -k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \
  --text-model="ollama/llama3.2:3b-instruct-fp16" \
  --embedding-model=sentence-transformers/all-MiniLM-L6-v2
```

- `LLAMA_STACK_TEST_INFERENCE_MODE`: `live` (default), `record`, or
`replay`
- `LLAMA_STACK_TEST_RECORDING_DIR`: Storage location (must be specified
for record or replay modes)
2025-07-29 12:41:31 -07:00
Ashwin Bharambe
abf1d6a703 fix: random breakage in llama_stack/ui/package.json 2025-07-29 12:31:29 -07:00
Nehanth Narendrula
58ffd82853
fix: Update SFTConfig parameter to fix CI and Post Training Workflow (#2948)
# What does this PR do?

- Change max_seq_length to max_length in SFTConfig constructor
- TRL deprecated max_seq_length in Feb 2024 and removed it in v0.20.0
- Reference: https://github.com/huggingface/trl/pull/2895

This resolves the SFT training failure in CI tests
2025-07-29 11:14:04 -07:00
Nathan Weinberg
870a37ff4b
feat: add base64 encoded PDF support for OpenAI Chat Completions (#2881)
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests / discover-tests (push) Successful in 3s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 12s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 13s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 14s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 10s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 13s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Test Llama Stack Build / build-single-provider (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 17s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 19s
Test External API and Providers / test-external (venv) (push) Failing after 16s
Test Llama Stack Build / build (push) Failing after 9s
Python Package Build Test / build (3.12) (push) Failing after 23s
Update ReadTheDocs / update-readthedocs (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 29s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 31s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 58s
Python Package Build Test / build (3.13) (push) Failing after 54s
Integration Tests / test-matrix (push) Failing after 56s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 1m4s
Pre-commit / pre-commit (push) Successful in 2m15s
# What does this PR do?
OpenAI Chat Completions supports passing a base64 encoded PDF file to a
model, but Llama Stack currently does not allow for this behavior. This
PR extends our implementation of the OpenAI API spec to change that.

Closes #2129

## Test Plan
A new functional test has been added to test the validity of such a
request

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-29 06:23:41 -04:00
github-actions[bot]
cf8722079c build: Bump version to 0.2.16
Some checks failed
Coverage Badge / unit-tests (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 3s
Integration Tests / discover-tests (push) Successful in 8s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 8s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 12s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test Llama Stack Build / generate-matrix (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 15s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 16s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 20s
Python Package Build Test / build (3.13) (push) Failing after 8s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 13s
Test External API and Providers / test-external (venv) (push) Failing after 8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 14s
Test Llama Stack Build / build (push) Failing after 7s
Update ReadTheDocs / update-readthedocs (push) Failing after 9s
Unit Tests / unit-tests (3.13) (push) Failing after 9s
Integration Tests / test-matrix (push) Failing after 8s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 12s
Test Llama Stack Build / build-single-provider (push) Failing after 35s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 42s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 44s
Pre-commit / pre-commit (push) Successful in 1m23s
2025-07-28 23:13:50 +00:00