adding logo and favicon

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

chore: Enable keyword search for Milvus inline (#3073)

With https://github.com/milvus-io/milvus-lite/pull/294, Milvus Lite
supports keyword search using BM25. When keyword search was introduced,
we explicitly disabled it for inline Milvus. This PR removes that check
and enables `inline::milvus` for tests.
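
For reviewers unfamiliar with BM25, the ranking behind keyword search can be sketched in a few lines. This is a textbook Okapi BM25 scorer, not Milvus Lite's implementation; the function name `bm25_scores`, the whitespace tokenizer, the sample documents, and the defaults `k1=1.5` and `b=0.75` are assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with textbook Okapi BM25."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF; the "1 +" keeps it positive for common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample corpus, for illustration only.
docs = [
    "milvus lite supports keyword search",
    "vector search uses embeddings",
    "hybrid search combines keyword and vector search",
]
scores = bm25_scores("keyword search", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Documents that contain the rarer query term (`keyword`) score higher than the one that only matches the common term (`search`). Shorter documents get a small boost from length normalization.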

Run Llama Stack with `inline::milvus` enabled, then run the search-mode tests against it:

```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```

```
INFO     2025-08-07 17:06:20,932 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS
=========================================================================================== test session starts ============================================================================================
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0
asyncio: mode=Mode.AUTO
collected 3 items

tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED                                                   [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED                                                  [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED                                                   [100%]

============================================================================================ 3 passed in 4.75s =============================================================================================
```

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>

chore: Fixup main pre commit (#3204)

build: Bump version to 0.2.18

chore: Faster npm pre-commit (#3206)

Adds npm to the installation step in pre-commit.yml and caches the UI.
Removes the node installation during pre-commit.

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

checking in for tonight, WIP moving to agents API

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

remove log

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

updated

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

fix: disable ui-prettier & ui-eslint (#3207)

chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061)

This PR adds a step in pre-commit to enforce using `llama_stack` logger.

Currently, various parts of the codebase use different loggers. Since a
custom `llama_stack` logger exists and is already used in the codebase,
it is better to standardize on it.
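
As an illustration of what such a check can look like, here is a minimal sketch that flags direct imports of the stdlib `logging` module. It is not the hook added by this PR: the function name `find_stdlib_logging` and the AST-based approach are assumptions, and the `llama_stack.log.get_logger` call in the sample string is assumed to be the project's logger entry point.

```python
import ast


def find_stdlib_logging(source: str, filename: str = "<string>") -> list[tuple[int, str]]:
    """Return (line, message) pairs for direct imports of the stdlib logging module."""
    violations = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            # Catches "import logging" and "import logging.handlers".
            if any(alias.name.split(".")[0] == "logging" for alias in node.names):
                violations.append((node.lineno, "import logging"))
        elif isinstance(node, ast.ImportFrom) and node.module is not None:
            # Catches "from logging import getLogger" and similar.
            if node.module.split(".")[0] == "logging":
                violations.append((node.lineno, "from logging import ..."))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(violations)


# Hypothetical sample sources; they are parsed, never executed.
bad_source = "import logging\nlogger = logging.getLogger(__name__)\n"
good_source = (
    "from llama_stack.log import get_logger\n"
    "logger = get_logger(name=__name__, category='core')\n"
)

bad_result = find_stdlib_logging(bad_source)
good_result = find_stdlib_logging(good_source)
```

A pre-commit hook built on this idea would run the function over each staged Python file and exit non-zero when the result is not empty.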

Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>

fix: fix `openai_embeddings` for asymmetric embedding NIMs (#3205)

NVIDIA asymmetric embedding models (e.g.,
`nvidia/llama-3.2-nv-embedqa-1b-v2`) require an `input_type` parameter
that is not part of the standard OpenAI embeddings API. This PR sets
`input_type="query"` by default and updates the documentation to suggest
using the `embedding` API for passage embeddings.
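
The defaulting behavior can be sketched as follows. This is an illustration, not the provider code from this PR: the helper name `build_embeddings_payload` and the payload shape are hypothetical, and the accepted values `"query"` and `"passage"` are assumed from the description above.

```python
from typing import Any

# Assumed set of values accepted by asymmetric embedding models.
ALLOWED_INPUT_TYPES = {"query", "passage"}


def build_embeddings_payload(
    model: str,
    texts: list[str],
    input_type: str = "query",
) -> dict[str, Any]:
    """Build an OpenAI-style embeddings request body with an extra input_type field."""
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(ALLOWED_INPUT_TYPES)}")
    return {
        "model": model,
        "input": texts,
        # Not part of the standard OpenAI embeddings API; defaults to "query".
        "input_type": input_type,
    }


payload = build_embeddings_payload(
    "nvidia/llama-3.2-nv-embedqa-1b-v2",
    ["What is keyword search?"],
)
```

Callers that embed documents rather than queries would pass `input_type="passage"` explicitly, which is why the description points passage embeddings at a different API.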

Resolves #2892

```
pytest -s -v tests/integration/inference/test_openai_embeddings.py   --stack-config="inference=nvidia"   --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2"   --env NVIDIA_API_KEY={nvidia_api_key}   --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```

cleaning up

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

updating session manager to cache messages locally

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

fix linter

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

more cleanup

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Francisco Javier Arceo 2025-08-19 16:44:20 -04:00
parent e7be568d7e
commit 6620b625f1
76 changed files with 2343 additions and 1187 deletions

uv.lock (generated, 16 changes)

@@ -1719,7 +1719,7 @@ wheels = [
 [[package]]
 name = "llama-stack"
-version = "0.2.17"
+version = "0.2.18"
 source = { editable = "." }
 dependencies = [
 { name = "aiohttp" },
@@ -1809,6 +1809,7 @@ test = [
 { name = "chardet" },
 { name = "datasets" },
 { name = "mcp" },
+{ name = "milvus-lite" },
 { name = "openai" },
 { name = "pymilvus" },
 { name = "pypdf" },
@@ -1831,6 +1832,7 @@ unit = [
 { name = "faiss-cpu" },
 { name = "litellm" },
 { name = "mcp" },
+{ name = "milvus-lite" },
 { name = "ollama" },
 { name = "openai" },
 { name = "pymilvus" },
@@ -1854,8 +1856,8 @@ requires-dist = [
 { name = "jinja2", specifier = ">=3.1.6" },
 { name = "jsonschema" },
 { name = "llama-api-client", specifier = ">=0.1.2" },
-{ name = "llama-stack-client", specifier = ">=0.2.17" },
-{ name = "llama-stack-client", marker = "extra == 'ui'", specifier = ">=0.2.17" },
+{ name = "llama-stack-client", specifier = ">=0.2.18" },
+{ name = "llama-stack-client", marker = "extra == 'ui'", specifier = ">=0.2.18" },
 { name = "openai", specifier = ">=1.99.6,<1.100.0" },
 { name = "opentelemetry-exporter-otlp-proto-http", specifier = ">=1.30.0" },
 { name = "opentelemetry-sdk", specifier = ">=1.30.0" },
@@ -1925,6 +1927,7 @@ test = [
 { name = "chardet" },
 { name = "datasets" },
 { name = "mcp" },
+{ name = "milvus-lite", specifier = ">=2.5.0" },
 { name = "openai" },
 { name = "pymilvus", specifier = ">=2.5.12" },
 { name = "pypdf" },
@@ -1946,6 +1949,7 @@ unit = [
 { name = "faiss-cpu" },
 { name = "litellm" },
 { name = "mcp" },
+{ name = "milvus-lite", specifier = ">=2.5.0" },
 { name = "ollama" },
 { name = "openai" },
 { name = "pymilvus", specifier = ">=2.5.12" },
@@ -1959,7 +1963,7 @@ unit = [
 [[package]]
 name = "llama-stack-client"
-version = "0.2.17"
+version = "0.2.18"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
 { name = "anyio" },
@@ -1978,9 +1982,9 @@ dependencies = [
 { name = "tqdm" },
 { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/c5/2a/bb2949d6a5c494d21da0c185d426e25eaa8016f8287b689249afc6c96fb5/llama_stack_client-0.2.17.tar.gz", hash = "sha256:1fe2070133c6356761e394fa346045e9b6b567d4c63157b9bc6be89b9a6e7a41", size = 257636, upload-time = "2025-08-05T01:42:55.911Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/69/da/5e5a745495f8a2b8ef24fc4d01fe9031aa2277c36447cb22192ec8c8cc1e/llama_stack_client-0.2.18.tar.gz", hash = "sha256:860c885c9e549445178ac55cc9422e6e2a91215ac7aff5aaccfb42f3ce07e79e", size = 277284, upload-time = "2025-08-19T22:12:09.106Z" }
 wheels = [
-{ url = "https://files.pythonhosted.org/packages/81/fc/5eccc86b83c5ced3a3bca071d250a86ccafa4ff17546cf781deb7758ab74/llama_stack_client-0.2.17-py3-none-any.whl", hash = "sha256:336c32f8688700ff64717b8109f405dc87a990fbe310c2027ac9ed6d39d67d16", size = 350329, upload-time = "2025-08-05T01:42:54.381Z" },
+{ url = "https://files.pythonhosted.org/packages/0a/e4/e97f8fdd8a07aa1efc7f7e37b5657d84357b664bf70dd1885a437edc0699/llama_stack_client-0.2.18-py3-none-any.whl", hash = "sha256:90f827d5476f7fc15fd993f1863af6a6e72bd064646bf6a99435eb43a1327f70", size = 367586, upload-time = "2025-08-19T22:12:07.899Z" },
 ]
 [[package]]