Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
chore: Enable keyword search for Milvus inline (#3073)
With https://github.com/milvus-io/milvus-lite/pull/294 - Milvus Lite
supports keyword search using BM25. While introducing keyword search we
had explicitly disabled it for inline milvus. This PR removes the need
for the check, and enables `inline::milvus` for tests.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Run llama stack with `inline::milvus` enabled:
```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```
```
INFO 2025-08-07 17:06:20,932 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS
=========================================================================================== test session starts ============================================================================================
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0
asyncio: mode=Mode.AUTO
collected 3 items
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED [100%]
============================================================================================ 3 passed in 4.75s =============================================================================================
```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
chore: Fixup main pre commit (#3204)
build: Bump version to 0.2.18
chore: Faster npm pre-commit (#3206)
Adds npm to pre-commit.yml installation and caches ui
Removes node installation during pre-commit.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
chiecking in for tonight, wip moving to agents api
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
remove log
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
updated
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
fix: disable ui-prettier & ui-eslint (#3207)
chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061)
This PR adds a step in pre-commit to enforce using `llama_stack` logger.
Currently, various parts of the code base uses different loggers. As a
custom `llama_stack` logger exist and used in the codebase, it is better
to standardize its utilization.
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
fix: fix ```openai_embeddings``` for asymmetric embedding NIMs (#3205)
NVIDIA asymmetric embedding models (e.g.,
`nvidia/llama-3.2-nv-embedqa-1b-v2`) require an `input_type` parameter
not present in the standard OpenAI embeddings API. This PR adds the
`input_type="query"` as default and updates the documentation to suggest
using the `embedding` API for passage embeddings.
<!-- If resolving an issue, uncomment and update the line below -->
Resolves#2892
```
pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2" --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```
cleaning up
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
updating session manager to cache messages locally
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
fix linter
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
more cleanup
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
This updates the sidebar to look a little more like other popular ones.
<img width="1913" height="1352" alt="Screenshot 2025-08-08 at 11 25
31 PM"
src="https://github.com/user-attachments/assets/00738412-1101-48ec-8864-cde4a8733ec1"
/>
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
This PR updates the UI to create new:
1. `/files/{file_id}`
2. `files/{file_id}/contents`
3. `files/{file_id}/contents/{content_id}`
The list of files are clickable which brings the user to the FIles
Detail page
The File Details page shows all of the content
The content details page shows the individual chunk/content parsed
These only use our existing OpenAI compatible APIs. I have a separate
branch where I expose the embedding and the portal is correctly
populated. I included the FE rendering code for that in this PR.
1. `vector-stores/{vector_store_id}/files/{file_id}`
<img width="1913" height="1351" alt="Screenshot 2025-08-06 at 10 20
12 PM"
src="https://github.com/user-attachments/assets/08010d5e-60c8-4bd9-9f3e-a2731ed1ad55"
/>
2. `vector-stores/{vector_store_id}/files/{file_id}/contents`
<img width="1920" height="1272" alt="Screenshot 2025-08-06 at 10 21
23 PM"
src="https://github.com/user-attachments/assets/3b91e67b-5d64-4fe6-91b6-18f14587e850"
/>
3.
`vector-stores/{vector_store_id}/files/{file_id}/contents/{content_id}`
<img width="1916" height="1273" alt="Screenshot 2025-08-06 at 10 21
45 PM"
src="https://github.com/user-attachments/assets/d38ca996-e8d9-460c-9e39-7ff0cb5ec0dd"
/>
## Test Plan
I tested this locally and reviewed the code. I generated a significant
share of the code with Claude and some manual intervention. After this,
I'll begin adding tests to the UI.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
I've been tinkering a little with a simple chat playground in the UI, so
I'm opening the PR with what's kind of a WIP.
If you look at the first commit, that includes the big part of the
changes. The rest of the files changed come from adding installing the
`shadcn` components.
Note this is missing a lot; e.g.,
- sessions
- document upload
- audio (the shadcn components install these by default from
https://shadcn-chatbot-kit.vercel.app/docs/components/chat)
I still need to wire up a lot more to make it actually fully functional
but it does basic chat using the LS Typescript Client.
Basic demo:
<img width="1329" height="1430" alt="Image"
src="https://github.com/user-attachments/assets/917a2096-36d4-4925-b83b-f1f2cda98698"
/>
<img width="1319" height="1424" alt="Image"
src="https://github.com/user-attachments/assets/fab1583b-1c72-4bf3-baf2-405aee13c6bb"
/>
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
- Adds two pages to UI
- Vector stores
- Vector store detail view
- Fixed darkmode navbar highlighting
- Updated darkmode font color
- Updated llama-stack-client package
<img width="1916" height="734" alt="Screenshot 2025-07-12 at 11 34
35 PM"
src="https://github.com/user-attachments/assets/3f9b6727-ee82-4e6b-9555-2e3ef36d24d2"
/>
<img width="1912" height="910" alt="Screenshot 2025-07-12 at 11 57
09 PM"
src="https://github.com/user-attachments/assets/0c9d3b5e-5592-4dfb-8e04-a57edc9fb406"
/>
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
## Summary:
This commit adds infinite scroll pagination to the chat completions and
responses tables.
## Test Plan:
1. Run unit tests: npm run test
2. Manual testing: Navigate to chat
completions/responses pages
3. Verify infinite scroll triggers when approaching
bottom
4. Added playwright tests: npm run test:e2e
# What does this PR do?
This PR adds a few enhancements:
- Dark mode
- A dark mode icon
- Adds a link to the API documentation
- Adds prettier and a linter to the code
- Aligning the default text
- Linted the code
## Before:

## After (dark mode):

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Related to https://github.com/meta-llama/llama-stack/issues/2085
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Introduces scaffolding for Llama Stack's UI. Created with next.js and
https://ui.shadcn.com/.
1. Initialized directory with `npx shadcn@latest init`
2. Added sidebar component `npx shadcn@latest add sidebar` and added
menu items for chat completions and responses.
3. Placeholder pages for each.
## Test Plan
`npm run dev`
<img width="1058" alt="image"
src="https://github.com/user-attachments/assets/5695a53f-e22e-418e-80d1-5bf0ae9b6fe8"
/>