llama-stack/llama_stack/providers/inline/tool_runtime/rag
ehhuang bb2690f176
feat: remove special handling of builtin::rag tool (#1015)
Summary:

Lets the model decide which tool it needs to call to respond to a query.

Test Plan:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/ --safety-shield meta-llama/Llama-Guard-3-8B
```

Also evaluated on a small benchmark of 20 questions from HotpotQA.
With this PR and some prompting, recall is 77%, compared to 50%
currently.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with
[ReviewStack](https://reviewstack.dev/meta-llama/llama-stack/pull/1015).
* #1268
* #1239
* __->__ #1015
2025-02-26 13:04:52 -08:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | fix: Get distro_codegen.py working with default deps and enabled in pre-commit hooks (#1123) | 2025-02-19 18:39:20 -08:00 |
| `config.py` | Move tool_runtime.memory -> tool_runtime.rag | 2025-01-22 20:25:02 -08:00 |
| `context_retriever.py` | build: format codebase imports using ruff linter (#1028) | 2025-02-13 10:06:21 -08:00 |
| `memory.py` | feat: remove special handling of builtin::rag tool (#1015) | 2025-02-26 13:04:52 -08:00 |