mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-16 19:22:37 +00:00
# What does this PR do?
Implements query rewriting in the vector store search API and adds
`default_query_expansion_model` and `query_expansion_prompt` to
`VectorStoresConfig`.

Makes the `rewrite_query` parameter functional in vector store search:
- `rewrite_query=false` (default): use the original query
- `rewrite_query=true`: expand the query via an LLM, or fail gracefully if no LLM is available
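
The two `rewrite_query` branches above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `resolve_search_query` and the `expand` callable are hypothetical names. It also assumes that "fail gracefully" means raising a descriptive error, and that an empty expansion falls back to the original query.

```python
from typing import Callable, Optional


def resolve_search_query(
    query: str,
    rewrite_query: bool = False,
    expand: Optional[Callable[[str], str]] = None,
) -> str:
    """Return the query string that would be passed to the vector search.

    `expand` is a hypothetical stand-in for the LLM call that expands a query.
    """
    if not rewrite_query:
        # Default path: the original query is used unchanged.
        return query
    if expand is None:
        # Assumption: with no LLM configured, fail with a clear message
        # rather than silently ignoring the flag.
        raise ValueError(
            "rewrite_query=True requires a configured query expansion model"
        )
    expanded = expand(query).strip()
    # Assumption: an empty expansion falls back to the original query.
    return expanded or query
```

Passing the LLM call in as a callable keeps the sketch independent of any particular inference provider.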

Adds 4 parameters to `VectorStoresConfig`:
- `default_query_expansion_model`: LLM model for query expansion (optional)
- `query_expansion_prompt`: Custom prompt template (optional, uses built-in default)
- `query_expansion_max_tokens`: Configurable token limit (default: 100)
- `query_expansion_temperature`: Configurable temperature (default: 0.3)

Minimal `run.yaml` with query rewriting enabled:
```yaml
vector_stores:
  rewrite_query_params:
    model:
      provider_id: "ollama"
      model_id: "llama3.2:3b-instruct-fp16"
    # prompt defaults to built-in
    # max_tokens defaults to 100
    # temperature defaults to 0.3
```
Fully customized `run.yaml`:
```yaml
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
  rewrite_query_params:
    model:
      provider_id: ollama
      model_id: llama3.2:3b-instruct-fp16
    prompt: "Rewrite this search query to improve retrieval results by expanding it with relevant synonyms and related terms: {query}"
    max_tokens: 100
    temperature: 0.3
```
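
To illustrate how the `rewrite_query_params` block above might be consumed, here is a minimal sketch. It is not the implementation in this PR: `RewriteQueryParams`, `DEFAULT_PROMPT`, and `render` are hypothetical names, and the default prompt text is a placeholder, since the built-in default is not shown here. Only the documented defaults (`max_tokens: 100`, `temperature: 0.3`) and the `{query}` placeholder come from the description.

```python
from dataclasses import dataclass
from typing import Dict

# Placeholder only: the actual built-in default prompt is not shown in this PR text.
DEFAULT_PROMPT = "Rewrite this search query to improve retrieval results: {query}"


@dataclass
class RewriteQueryParams:
    """Hypothetical container mirroring the `rewrite_query_params` YAML block."""

    model: Dict[str, str]
    prompt: str = DEFAULT_PROMPT
    max_tokens: int = 100      # documented default
    temperature: float = 0.3   # documented default

    def __post_init__(self) -> None:
        # Assumption: a template without the placeholder is a configuration error.
        if "{query}" not in self.prompt:
            raise ValueError("prompt must contain the {query} placeholder")

    def render(self, query: str) -> str:
        # str.replace avoids str.format errors if the prompt has other braces.
        return self.prompt.replace("{query}", query)


# Equivalent of the minimal run.yaml: only the model is set.
params = RewriteQueryParams(
    model={"provider_id": "ollama", "model_id": "llama3.2:3b-instruct-fp16"}
)
print(params.render("kiwi"))
```

Validating the placeholder when the config is loaded would surface a broken custom prompt at startup rather than on the first search request.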
## Test Plan
Added a test and recording.

An example script is included as well:
```python
import asyncio
from io import BytesIO

from llama_stack_client import LlamaStackClient


def gen_file(client, text: str = ""):
    """Upload `text` as an in-memory file and return the created file object."""
    file_buffer = BytesIO(text.encode("utf-8"))
    file_buffer.name = "my_file.txt"

    uploaded_file = client.files.create(
        file=file_buffer,
        purpose="assistants",
    )
    return uploaded_file


async def test_query_rewriting():
    client = LlamaStackClient(base_url="http://0.0.0.0:8321/")

    # Upload two small documents and attach them to a new vector store.
    uploaded_file = gen_file(client, "banana banana apple")
    uploaded_file2 = gen_file(client, "orange orange kiwi")

    vs = client.vector_stores.create()
    xf_vs = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file.id)
    xf_vs1 = client.vector_stores.files.create(vector_store_id=vs.id, file_id=uploaded_file2.id)

    # Search with the original query (no rewriting).
    response1 = client.vector_stores.search(
        vector_store_id=vs.id,
        query="apple",
        max_num_results=3,
        rewrite_query=False,
    )
    # Search with LLM-based query rewriting enabled.
    response2 = client.vector_stores.search(
        vector_store_id=vs.id,
        query="kiwi",
        max_num_results=3,
        rewrite_query=True,
    )

    print(f"\n🔵 Response 1 (rewrite_query=False):\n\033[94m{response1}\033[0m")
    print(f"\n🟢 Response 2 (rewrite_query=True):\n\033[92m{response2}\033[0m")

    # Clean up the uploaded files and the vector store.
    for f in [uploaded_file.id, uploaded_file2.id]:
        client.files.delete(file_id=f)
    client.vector_stores.delete(vector_store_id=vs.id)


if __name__ == "__main__":
    asyncio.run(test_query_rewriting())
```
The screenshot of the server logs below shows that it worked.
<img width="1111" height="826" alt="Screenshot 2025-11-19 at 1 16 03 PM"
src="https://github.com/user-attachments/assets/2d188b44-1fef-4df5-b465-2d6728ca49ce"
/>
Notice the log:
```bash
Query rewritten:
'kiwi' → 'kiwi, a small brown or green fruit native to New Zealand, or a person having a fuzzy brown outer skin similar in appearance.'
```
So `kiwi` was expanded.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>