mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-08-06 02:32:40 +00:00
docs: Some aesthetic changes to the Building AI Applications docs to make them read a little easier
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
This commit is contained in:
parent 66d6c2580e
commit db9eded18a
4 changed files with 41 additions and 14 deletions
@ -1,6 +1,10 @@
# Agents

An Agent in Llama Stack is a powerful abstraction that allows you to build complex AI applications.

The Llama Stack agent framework is built on a modular architecture that allows for flexible and powerful AI applications. This document explains the key components and how they work together.

## Core Concepts
@ -1,6 +1,11 @@
## Agent Execution Loop

Agents are the heart of Llama Stack applications. They combine inference, memory, safety, and tool usage into coherent workflows. At its core, an agent follows a sophisticated execution loop that enables multi-step reasoning, tool usage, and safety checks.

### Steps in the Agent Workflow

Each agent turn follows these key steps:

@ -64,7 +69,10 @@ sequenceDiagram
```
S->>U: 5. Final Response
```

Each step in this process can be monitored and controlled through configurations.

### Agent Execution Loop Example

Here's an example that demonstrates monitoring the agent's execution:

```python
from llama_stack_client import LlamaStackClient, Agent, AgentEventLogger
```
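The shape of the execution loop itself can be sketched as a self-contained toy in plain Python. This is illustrative only — the shield check, model stub, and tool table below are invented for the example and are not Llama Stack APIs:

```python
# Toy agent execution loop: safety shield -> inference -> tool dispatch -> answer.
# Everything here is a stand-in, NOT the Llama Stack implementation.

def input_shield(message: str) -> str:
    # A real shield would call a safety model; here we just block one word.
    if "forbidden" in message:
        raise ValueError("input violates safety policy")
    return message

def fake_inference(messages: list) -> dict:
    # Stand-in for the model: request a tool on the first step,
    # then produce a final answer from the tool result.
    tool_msgs = [m for m in messages if m["role"] == "tool"]
    if not tool_msgs:
        return {"tool_call": {"name": "get_weather", "args": {"city": "Paris"}}}
    return {"content": f"The answer is: {tool_msgs[-1]['content']}"}

TOOLS = {"get_weather": lambda city: f"sunny in {city}"}

def run_turn(user_message: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": input_shield(user_message)}]
    for _ in range(max_steps):
        step = fake_inference(messages)
        if "tool_call" in step:  # model wants a tool -> execute it and loop again
            call = step["tool_call"]
            messages.append({"role": "tool", "content": TOOLS[call["name"]](**call["args"])})
        else:  # model produced a final answer -> the turn ends
            return step["content"]
    raise RuntimeError("exceeded max inference steps")

print(run_turn("What's the weather in Paris?"))  # The answer is: sunny in Paris
```

The key property this mirrors is that a single turn may involve several inference calls, with tool results fed back into the conversation until the model stops requesting tools.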
@ -8,9 +8,9 @@ The best way to get started is to look at this notebook which walks through the
Here are some key topics that will help you build effective agents:

- **[Agent](agent)**: Understand the components and design patterns of the Llama Stack agent framework.
- **[Agent Execution Loop](agent_execution_loop)**: Understand how agents process information, make decisions, and execute actions in a continuous loop.
- **[RAG (Retrieval-Augmented Generation)](rag)**: Learn how to enhance your agents with external knowledge through retrieval mechanisms.
- **[Tools](tools)**: Extend your agents' capabilities by integrating with external tools and APIs.
- **[Evals](evals)**: Evaluate your agents' effectiveness and identify areas for improvement.
- **[Telemetry](telemetry)**: Monitor and analyze your agents' performance and behavior.

@ -20,12 +20,11 @@ Here are some key topics that will help you build effective agents:
:hidden:
:maxdepth: 1

agent
agent_execution_loop
rag
tools
evals
telemetry
safety
```
@ -3,9 +3,9 @@
RAG enables your applications to reference and recall information from previous interactions or external documents.

Llama Stack organizes the APIs that enable RAG into three layers:

1. The lowermost APIs deal with raw storage and retrieval. These include Vector IO, KeyValue IO (coming soon), and Relational IO (also coming soon).
2. Next is the "RAG Tool", a first-class tool in the [Tools API](tools.md) that allows you to ingest documents (from URLs, files, etc.) with various chunking strategies and query them smartly.
3. Finally, it all comes together with the top-level ["Agents" API](agent.md) that allows you to create agents that can use the tools to answer questions, perform tasks, and more.

<img src="rag.png" alt="RAG System" width="50%">

@ -17,14 +17,19 @@ We may add more storage types like Graph IO in the future.

### Setting up Vector DBs

For this guide, we will use [Ollama](https://ollama.com/) as the inference provider. Ollama is an LLM runtime that allows you to run Llama models locally.

Here's how to set up a vector database for RAG:
```python
# Create http client
import os

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url=f"http://localhost:{os.environ['LLAMA_STACK_PORT']}")

# Register a vector db
vector_db_id = "my_documents"
response = client.vector_dbs.register(
    # ... (arguments elided in this diff)
    embedding_dimension=384,
    provider_id="faiss",
)
```
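Note that `embedding_dimension` must match the output size of whichever embedding model the vector DB is registered with. A toy up-front check (illustrative only — the model names and dimension table below are examples, not part of the Llama Stack API):

```python
# Example output dimensions for two well-known sentence-embedding models.
EMBEDDING_DIMS = {"all-MiniLM-L6-v2": 384, "nomic-embed-text-v1.5": 768}

def check_dimension(model: str, dimension: int) -> bool:
    # A vector store rejects vectors whose length differs from the registered
    # dimension, so it pays to validate the pairing before registering.
    return EMBEDDING_DIMS.get(model) == dimension

print(check_dimension("all-MiniLM-L6-v2", 384))  # True
```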
### Ingesting Documents

You can ingest documents into the vector database using two methods: directly inserting pre-chunked documents or using the RAG Tool.

```python
# You can insert a pre-chunked document directly into the vector db
chunks = [
    {
        "content": "Your document text here",
        "mime_type": "text/plain",
        "metadata": {
            "document_id": "doc1",
        },
    },
]
client.vector_io.insert(vector_db_id=vector_db_id, chunks=chunks)
```
### Retrieval

You can query the vector database to retrieve documents based on their embeddings.

```python
# You can then query for these chunks
chunks_response = client.vector_io.query(
    vector_db_id=vector_db_id, query="What do you know about..."
)
```
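Conceptually, a query like this embeds the query string and ranks stored chunks by vector similarity. A toy version of that ranking (illustrative only — real providers such as faiss use optimized indexes, and the hand-written 3-dimensional embeddings below are stand-ins for real model outputs):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy "vector db": chunks with hand-written embeddings standing in for model output.
store = [
    {"content": "Llama Stack overview", "embedding": [1.0, 0.0, 0.0]},
    {"content": "Vector IO details", "embedding": [0.0, 1.0, 0.0]},
    {"content": "Safety shields", "embedding": [0.0, 0.0, 1.0]},
]

def query(query_embedding, top_k=2):
    # Rank every stored chunk by similarity to the query and keep the top_k.
    ranked = sorted(store, key=lambda c: cosine(query_embedding, c["embedding"]), reverse=True)
    return [c["content"] for c in ranked[:top_k]]

print(query([0.9, 0.1, 0.0]))  # ['Llama Stack overview', 'Vector IO details']
```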
@ -52,7 +67,8 @@ chunks_response = client.vector_io.query(
### Using the RAG Tool

A better way to ingest documents is to use the RAG Tool. This tool allows you to ingest documents from URLs, files, etc. and automatically chunks them into smaller pieces.

```python
from llama_stack_client import RAGDocument
```
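To make "chunking strategies" concrete, here is a minimal fixed-size word chunker with overlap (an illustrative toy, not the RAG Tool's actual implementation):

```python
def chunk_words(text: str, chunk_size: int = 5, overlap: int = 1) -> list:
    # Split text into windows of chunk_size words; consecutive windows share
    # `overlap` words so context is not lost at chunk boundaries.
    words = text.split()
    step = chunk_size - overlap
    return [
        " ".join(words[i : i + chunk_size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]

print(chunk_words("one two three four five six seven eight", chunk_size=4, overlap=1))
```

Overlapping chunks like these are a common default because a query whose answer straddles a chunk boundary can still match at least one chunk.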