llama-stack/llama_stack
Xi Yan 094eb6a5ae
feat(rag): entire document context with attachments (#1763)
# What does this PR do?
**What**
Instead of adhoc creating a vectordb and chunking when documents ae sent
as an attachment to agent turn, we directly pass raw text from document
into messages to model for user context, and let model perform
summarization directly.

This removes the magic behaviour, and yields better performance than
existing approach.

**Improved Performance**
- RAG lifecycle notebook
  - Model: 0.3 factuality score
  - (+ websearch) Agent: 0.44 factuality score
  - (+ vector db) Agent: 0.3 factuality score
  - (+ raw context) Agent: 0.6 factuality score

Closes https://github.com/meta-llama/llama-stack/issues/1478

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
- [NEW] added section in RAG lifecycle notebook shows better performance

<img width="840" alt="image"
src="https://github.com/user-attachments/assets/a0c4e816-809a-41c0-9124-89825983e3f5"
/>


[//]: # (## Documentation)
2025-03-23 16:57:48 -07:00
..
apis feat(telemetry): clean up spans (#1760) 2025-03-21 20:05:11 -07:00
cli fix: compare timezones correctly in download script 2025-03-21 11:46:57 -07:00
distribution feat(telemetry): clean up spans (#1760) 2025-03-21 20:05:11 -07:00
models/llama fix: update default tool call system prompt (#1712) 2025-03-19 22:49:24 -07:00
providers feat(rag): entire document context with attachments (#1763) 2025-03-23 16:57:48 -07:00
strong_typing fix: Support types.UnionType in schemas (#1721) 2025-03-20 09:54:02 -07:00
templates chore(telemetry): remove service_name entirely (#1755) 2025-03-21 15:11:56 -07:00
__init__.py export LibraryClient 2024-12-13 12:08:00 -08:00
env.py refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) 2025-03-04 14:53:47 -08:00
log.py feat: add support for logging config in the run.yaml (#1408) 2025-03-14 12:36:25 -07:00
schema_utils.py chore: make mypy happy with webmethod (#1758) 2025-03-22 08:17:23 -07:00