mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-08-11 12:38:02 +00:00
docs: [RFC] uploaded rerank RFC document
parent f8872b3184
commit 99068dfde4
2 changed files with 1 addition and 2 deletions
BIN
docs/resources/rerank_api_flowchart.png
Normal file
Binary file not shown.
After: Size: 67 KiB
@@ -35,8 +35,7 @@ Current RAG implementations use embedding-based similarity search to retrieve do
Existing RAG systems efficiently index and retrieve document chunks from vector stores, but they often lack a mechanism to refine initial results. This can lead to suboptimal context for LLMs and hinder overall performance. The case for re-ranking is especially strong for enterprise users relying on legacy keyword search systems, where significant investments have already been made in content synchronization and indexing. In these environments, re-ranking can substantially improve accuracy by refining outputs from established search infrastructure. While new vector stores using state-of-the-art dense models also benefit from re-ranking, the improvements tend to be less pronounced and may not justify the additional complexity and latency. Moreover, different operational needs mean that some users prefer a managed API solution, while others require inline control for low latency or data privacy.
## Proposed Reranking Solution


## 4.1. Extended API Endpoints
### 4.1.1. Query Endpoint