mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-17 00:29:26 +00:00

cli	feat: remove usage of build yaml (#4192 )	2025-12-10 10:12:12 +01:00
conversations	feat: remove usage of build yaml (#4192 )	2025-12-10 10:12:12 +01:00
core	feat: Add support for query rewrite in vector_store.search (#4171 )	2025-12-10 10:06:19 -05:00
distribution	feat: convert Benchmarks API to use FastAPI router (#4309 )	2025-12-10 15:04:27 +01:00
files	refactor(storage): make { kvstore, sqlstore } as llama stack "internal" APIs (#4181 )	2025-11-18 13:15:16 -08:00
models	refactor: remove dead inference API code and clean up imports (#4093 )	2025-11-10 15:29:24 -08:00
prompts/prompts	feat: remove usage of build yaml (#4192 )	2025-12-10 10:12:12 +01:00
providers	feat: Add support for query rewrite in vector_store.search (#4171 )	2025-12-10 10:06:19 -05:00
rag	fix: rename llama_stack_api dir (#4155 )	2025-11-13 15:04:36 -08:00
registry	refactor(storage): make { kvstore, sqlstore } as llama stack "internal" APIs (#4181 )	2025-11-18 13:15:16 -08:00
server	feat: remove usage of build yaml (#4192 )	2025-12-10 10:12:12 +01:00
tools	fix: rename llama_stack_api dir (#4155 )	2025-11-13 15:04:36 -08:00
utils	fix: set SqlRecord owner to None when owner_principal is empty (#4284 )	2025-12-03 10:28:33 -08:00
__init__.py	chore: Add fixtures to conftest.py (#2067 )	2025-05-06 13:57:48 +02:00
conftest.py	test: suppress expected error logs in SSE test (#3886 )	2025-10-22 14:34:32 -07:00
fixtures.py	refactor(storage): make { kvstore, sqlstore } as llama stack "internal" APIs (#4181 )	2025-11-18 13:15:16 -08:00
README.md	test: Measure and track code coverage (#2636 )	2025-07-18 18:08:36 +02:00

README.md

Llama Stack Unit Tests

Unit Tests

Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.

Prerequisites

Python Environment: Ensure you have Python 3.12+ installed
uv Package Manager: Install uv if not already installed

Run the unit tests with:

./scripts/unit-tests.sh [PYTEST_ARGS]

Any additional arguments are passed to pytest. You can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, it defaults to "tests/unit". For example, to run a single test file with verbose output:

./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv

To run the tests with a non-default version of Python (the default is currently 3.12), set the PYTHON_VERSION environment variable as follows:

source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh

Test Configuration

Test Discovery: Tests are automatically discovered in the tests/unit/ directory
Async Support: Tests use --asyncio-mode=auto for automatic async test handling
Coverage: Tests generate coverage reports in the htmlcov/ directory
Python Version: Defaults to Python 3.12, but can be overridden with the PYTHON_VERSION environment variable
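
To illustrate the async support noted above, here is a minimal sketch of an async test. The names fetch_value and test_fetch_value_returns_expected are hypothetical and do not come from the Llama Stack code base. It assumes pytest-asyncio is what provides --asyncio-mode=auto; in that mode a coroutine test like this one is collected and awaited without an explicit marker.

```python
import asyncio


async def fetch_value(delay: float = 0.0) -> int:
    """Hypothetical async helper standing in for real code under test."""
    await asyncio.sleep(delay)
    return 42


async def test_fetch_value_returns_expected():
    # With --asyncio-mode=auto, pytest-asyncio awaits this coroutine itself,
    # so no @pytest.mark.asyncio decorator is needed here.
    assert await fetch_value() == 42
```

A file containing a test like this would be picked up under tests/unit/ as long as its name matches pytest's default patterns (test_*.py or *_test.py).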

Coverage Reports

After running tests, you can view coverage reports:

# Open HTML coverage report in browser
open htmlcov/index.html  # macOS
xdg-open htmlcov/index.html  # Linux
start htmlcov/index.html  # Windows
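
If you would rather not remember a separate command for each platform, the same thing can be done from Python's standard library. This is a sketch, not part of the repository: coverage_report_uri and open_coverage_report are hypothetical helper names, and the only assumption taken from the text above is that the report lives at htmlcov/index.html.

```python
import webbrowser
from pathlib import Path


def coverage_report_uri(report_dir: str = "htmlcov") -> str:
    """Return a file:// URI for the HTML coverage report's index page."""
    return (Path(report_dir) / "index.html").resolve().as_uri()


def open_coverage_report(report_dir: str = "htmlcov") -> bool:
    """Open the report in the default browser; return False if it is missing."""
    index = Path(report_dir) / "index.html"
    if not index.is_file():
        print(f"No coverage report found at {index}; run the unit tests first.")
        return False
    return webbrowser.open(coverage_report_uri(report_dir))


if __name__ == "__main__":
    open_coverage_report()
```

Run it from the repository root after a test run so that the relative htmlcov/ path resolves correctly.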