mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 12:07:34 +00:00
chore: Updating documentation, adding exception handling for Vector Stores in RAG Tool, more tests on migration, and migrate off of inference_api for context_retriever for RAG (#3367)
# What does this PR do?

- Updating documentation on migration from RAG Tool to Vector Stores and Files APIs
- Adding exception handling for Vector Stores in RAG Tool
- Add more tests on migration from RAG Tool to Vector Stores
- Migrate off of inference_api for context_retriever for RAG

## Test Plan

Integration and unit tests added

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
parent f31bcc11bc
commit d15368a302
5 changed files with 355 additions and 45 deletions
@@ -93,10 +93,31 @@ chunks_response = client.vector_io.query(

### Using the RAG Tool

> **⚠️ DEPRECATION NOTICE**: The RAG Tool is being deprecated in favor of directly using the OpenAI-compatible Search
> API. We recommend migrating to the OpenAI APIs for better compatibility and future support.

A better way to ingest documents is to use the RAG Tool. This tool lets you ingest documents from URLs, files, etc.
and automatically chunks them into smaller pieces. More examples of how to format a `RAGDocument` can be found in the
[appendix](#more-ragdocument-examples).

#### OpenAI API Integration & Migration

The RAG Tool has been updated to use OpenAI-compatible APIs. This provides several benefits:

- **Files API Integration**: Documents are now uploaded using OpenAI-compatible file upload endpoints
- **Vector Stores API**: Vector storage operations use OpenAI's vector store format with configurable chunking strategies
- **Error Resilience**: When processing multiple documents, individual failures are logged but don't crash the operation. Failed documents are skipped while successful ones continue processing.

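The error-resilience behavior described above can be illustrated with a minimal sketch. This is not the RAG Tool's actual implementation: `ingest_all` and the `upload` callable are hypothetical names introduced here, and the `document_id` key is assumed only for the sake of the example.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def ingest_all(
    documents: Iterable[dict], upload: Callable[[dict], str]
) -> tuple[list[str], list[str]]:
    """Upload each document; log and skip the ones that fail.

    `upload` is a hypothetical callable that returns a file ID on success
    and raises on failure. Returns (uploaded file IDs, failed document IDs).
    """
    succeeded: list[str] = []
    failed: list[str] = []
    for doc in documents:
        doc_id = doc.get("document_id", "<unknown>")
        try:
            succeeded.append(upload(doc))
        except Exception as exc:
            # One bad document must not abort the whole batch.
            logger.warning("Skipping document %s: %s", doc_id, exc)
            failed.append(doc_id)
    return succeeded, failed
```

Returning the failed IDs alongside the successes lets the caller report or retry the skipped documents instead of relying on log output alone.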
**Migration Path:**
We recommend migrating to the OpenAI-compatible Search API for:
1. **Better OpenAI Ecosystem Integration**: Direct compatibility with OpenAI tools and workflows including the Responses API
2. **Future-Proof**: Continued support and feature development
3. **Full OpenAI Compatibility**: Vector Stores, Files, and Search APIs are fully compatible with OpenAI's Responses API

The OpenAI APIs are used under the hood, so you can continue to use your existing RAG Tool code with minimal changes.
However, we recommend updating your code to use the new OpenAI-compatible APIs for better long-term support. If any
documents fail to process, they will be logged in the response but will not cause the entire operation to fail.

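As a rough sketch of what the migrated flow might look like, the helper below chains the Files, Vector Stores, and Search calls on a client passed in by the caller. The helper name `ingest_and_search` is hypothetical, and the method names are an assumption based on the shape of the OpenAI Python SDK (`files.create`, `vector_stores.create`, `vector_stores.files.create`, `vector_stores.search`); check them against the client version you actually use.

```python
from typing import Any, BinaryIO


def ingest_and_search(client: Any, file_obj: BinaryIO, store_name: str, query: str) -> Any:
    """Upload one file, attach it to a new vector store, and search the store.

    `client` is assumed to expose OpenAI-SDK-style `files` and `vector_stores`
    resources; this is an illustration, not the RAG Tool's internal code.
    """
    # Files API: upload the raw document.
    uploaded = client.files.create(file=file_obj, purpose="assistants")
    # Vector Stores API: create a store and attach the uploaded file to it.
    store = client.vector_stores.create(name=store_name)
    # Depending on the provider, attachment may finish asynchronously,
    # so a real caller may need to wait before searching.
    client.vector_stores.files.create(vector_store_id=store.id, file_id=uploaded.id)
    # Search API: retrieve chunks relevant to the query.
    return client.vector_stores.search(vector_store_id=store.id, query=query)
```

Because the client is injected, the same function can be exercised against a stub in tests or pointed at an OpenAI-compatible endpoint in a real deployment.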
```python
from llama_stack_client import RAGDocument