mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-07-16 09:58:10 +00:00
More Updates to Read the Docs (#856)
This commit is contained in:
parent
8a686270e9
commit
74e933cbfd
8 changed files with 405 additions and 730 deletions
92
docs/source/building_applications/rag.md
Normal file
92
docs/source/building_applications/rag.md
Normal file
|
@ -0,0 +1,92 @@
|
|||
## Memory & RAG
|
||||
|
||||
Memory enables your applications to reference and recall information from previous interactions or external documents. Llama Stack's memory system is built around the concept of Memory Banks:
|
||||
|
||||
1. **Vector Memory Banks**: For semantic search and retrieval
|
||||
2. **Key-Value Memory Banks**: For structured data storage
|
||||
3. **Keyword Memory Banks**: For basic text search
|
||||
4. **Graph Memory Banks**: For relationship-based retrieval
|
||||
|
||||
Here's how to set up a vector memory bank for RAG:
|
||||
|
||||
```python
|
||||
# Register a memory bank
|
||||
bank_id = "my_documents"
|
||||
response = client.memory_banks.register(
|
||||
memory_bank_id=bank_id,
|
||||
params={
|
||||
"memory_bank_type": "vector",
|
||||
"embedding_model": "all-MiniLM-L6-v2",
|
||||
"chunk_size_in_tokens": 512
|
||||
}
|
||||
)
|
||||
|
||||
# Insert documents
|
||||
documents = [
|
||||
{
|
||||
"document_id": "doc1",
|
||||
"content": "Your document text here",
|
||||
"mime_type": "text/plain"
|
||||
}
|
||||
]
|
||||
client.memory.insert(bank_id, documents)
|
||||
|
||||
# Query documents
|
||||
results = client.memory.query(
|
||||
bank_id=bank_id,
|
||||
query="What do you know about...",
|
||||
)
|
||||
```
|
||||
|
||||
|
||||
### Building RAG-Enhanced Agents
|
||||
|
||||
One of the most powerful patterns is combining agents with RAG capabilities. Here's a complete example:
|
||||
|
||||
```python
|
||||
from llama_stack_client.types import Attachment
|
||||
|
||||
# Create attachments from documents
|
||||
attachments = [
|
||||
Attachment(
|
||||
content="https://raw.githubusercontent.com/example/doc.rst",
|
||||
mime_type="text/plain"
|
||||
)
|
||||
]
|
||||
|
||||
# Configure agent with memory
|
||||
agent_config = AgentConfig(
|
||||
model="Llama3.2-3B-Instruct",
|
||||
instructions="You are a helpful assistant",
|
||||
tools=[{
|
||||
"type": "memory",
|
||||
"memory_bank_configs": [],
|
||||
"query_generator_config": {"type": "default", "sep": " "},
|
||||
"max_tokens_in_context": 4096,
|
||||
"max_chunks": 10
|
||||
}],
|
||||
enable_session_persistence=True
|
||||
)
|
||||
|
||||
agent = Agent(client, agent_config)
|
||||
session_id = agent.create_session("rag_session")
|
||||
|
||||
# Initial document ingestion
|
||||
response = agent.create_turn(
|
||||
messages=[{
|
||||
"role": "user",
|
||||
"content": "I am providing some documents for reference."
|
||||
}],
|
||||
attachments=attachments,
|
||||
session_id=session_id
|
||||
)
|
||||
|
||||
# Query with RAG
|
||||
response = agent.create_turn(
|
||||
messages=[{
|
||||
"role": "user",
|
||||
"content": "What are the key topics in the documents?"
|
||||
}],
|
||||
session_id=session_id
|
||||
)
|
||||
```
|
Loading…
Add table
Add a link
Reference in a new issue