feat: remove special handling of builtin::rag tool (#1015)

Summary:

Lets the model decide which tool it needs to call to respond to a query.

Test Plan:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
```

Also evaluated on a small benchmark with 20 questions from HotpotQA.
With this PR and some prompting, the performance is 77% recall compared
to 50% currently.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/meta-llama/llama-stack/pull/1015).
* #1268
* #1239
* __->__ #1015
This commit is contained in:
ehhuang 2025-02-26 13:04:52 -08:00 committed by GitHub
parent c64f0d5888
commit bb2690f176
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 94 additions and 133 deletions

View file

@ -441,7 +441,7 @@ class ToolRuntimeRouter(ToolRuntime):
vector_db_ids: List[str],
query_config: Optional[RAGQueryConfig] = None,
) -> RAGQueryResult:
return await self.routing_table.get_provider_impl("query_from_memory").query(
return await self.routing_table.get_provider_impl("knowledge_search").query(
content, vector_db_ids, query_config
)