diff --git a/docs/docs/building_applications/rag.mdx b/docs/docs/building_applications/rag.mdx
index cb7f941e9..2ea459890 100644
--- a/docs/docs/building_applications/rag.mdx
+++ b/docs/docs/building_applications/rag.mdx
@@ -19,7 +19,7 @@ Llama Stack now uses a modern, OpenAI-compatible API pattern for RAG:
 
 This new approach provides better compatibility with OpenAI's ecosystem and is the recommended way to implement RAG in Llama Stack.
 
-RAG System
+RAG System
 
 ## Prerequisites
 
@@ -32,7 +32,7 @@ Before you begin, make sure you have the following:
 2. **Llama Stack**: Follow the [installation guide](/docs/installation) to set up Llama Stack on your machine.
 3. **Documents**: Prepare a set of documents that you want to search. These can be plain text, PDFs, or other file types.
-4. Set the `LLAMA_STACK_PORT` environment variable to the port where Llama Stack is running. For example, if you are using the default port of 8321, set `export LLAMA_STACK_PORT=8321`. Also set 'OLLAMA_URL' environment variable to be 'http://localhost:11434'
+4. **Environment variables**: Set the `LLAMA_STACK_PORT` environment variable to the port where Llama Stack is running. For example, if you are using the default port of 8321, set `export LLAMA_STACK_PORT=8321`. Also set the `OLLAMA_URL` environment variable to `http://localhost:11434`.
 
 ## Step 0: Initialize Client
 
@@ -287,59 +287,6 @@ for query in queries:
     print(f"Response: {response}")
 ```
 
-## Advanced Usage: Dynamic Document Management
-
-You can dynamically add or remove files from existing vector stores:
-
-```python
-# Add new files to an existing vector store
-new_file_ids = []
-new_docs = [
-    "Deep learning requires large amounts of training data.",
-    "Transformers revolutionized natural language processing."
-]
-
-for doc in new_docs:
-    with BytesIO(doc.encode()) as f:
-        f.name = f"doc_{len(new_file_ids)}.txt"
-        response = client.files.create(file=f, purpose="assistants")
-        new_file_ids.append(response.id)
-
-# Update vector store with new files
-# Note: Implementation may vary depending on your Llama Stack version
-# Check documentation for vector_stores.update() or recreate the store
-```
-
-## Best Practices
-
-### 🎯 **Descriptive Filenames**
-Use meaningful filenames that describe the content when uploading documents.
-
-### 📊 **Metadata Organization**
-Structure metadata consistently across documents for better organization and retrieval.
-
-### 🔍 **Vector Store Naming**
-Use clear, descriptive names for vector stores to make management easier.
-
-### 🧹 **Resource Cleanup**
-Regularly delete unused vector stores to free up resources and maintain system performance.
-
-### ⚡ **Batch Processing**
-Upload multiple files before creating the vector store for better efficiency.
-
-### 🛡️ **Error Handling**
-Always wrap API calls in try-except blocks for production code:
-
-```python
-# Example with error handling
-try:
-    with BytesIO(content.encode()) as f:
-        f.name = "document.txt"
-        file_response = client.files.create(file=f, purpose="assistants")
-except Exception as e:
-    print(f"Error uploading file: {e}")
-```
-
 ## Migration from Legacy API
 
 :::danger[Deprecation Notice]
diff --git a/docs/static/img/rag_llama_stack.png b/docs/static/img/rag_llama_stack.png
new file mode 100644
index 000000000..bc0e499e9
Binary files /dev/null and b/docs/static/img/rag_llama_stack.png differ