docs: update Agent documentation (#1333)

Summary:
- [new] Agent concepts (session, turn)
- [new] how to write custom tools
- [new] non-streaming API and how to get outputs
- [update] remaining `memory` -> `rag` rename
- [new] note importance of `instructions`

Test Plan: read
2025-06-28 02:53:30 +00:00 · 2025-03-01 22:34:52 -08:00 · 2025-03-01 22:34:52 -08:00 · 52977e56a8
commit 52977e56a8
parent 46b0a404e8
6 changed files with 170 additions and 64 deletions
--- a/docs/source/building_applications/agent_execution_loop.md
+++ b/docs/source/building_applications/agent_execution_loop.md
@@ -13,7 +13,7 @@ Each agent turn follows these key steps:

 3. **Inference Loop**: The agent enters its main execution loop:
   - The LLM receives a user prompt (with previous tool outputs)
-   - The LLM generates a response, potentially with tool calls
+   - The LLM generates a response, potentially with [tool calls](tools)
   - If tool calls are present:
     - Tool inputs are safety-checked
     - Tools are executed (e.g., web search, code execution)
@@ -68,6 +68,7 @@ Each step in this process can be monitored and controlled through configurations

 ```python
 from llama_stack_client.lib.agents.event_logger import EventLogger
+from rich.pretty import pprint

 agent_config = AgentConfig(
    model="Llama3.2-3B-Instruct",
@@ -108,14 +109,21 @@ response = agent.create_turn(

 # Monitor each step of execution
 for log in EventLogger().log(response):
-    if log.event.step_type == "memory_retrieval":
-        print("Retrieved context:", log.event.retrieved_context)
-    elif log.event.step_type == "inference":
-        print("LLM output:", log.event.model_response)
-    elif log.event.step_type == "tool_execution":
-        print("Tool call:", log.event.tool_call)
-        print("Tool response:", log.event.tool_response)
-    elif log.event.step_type == "shield_call":
-        if log.event.violation:
-            print("Safety violation:", log.event.violation)
+    log.print()
+
+# Using non-streaming API, the response contains input, steps, and output.
+response = agent.create_turn(
+    messages=[{"role": "user", "content": "Analyze this code and run it"}],
+    attachments=[
+        {
+            "content": "https://raw.githubusercontent.com/example/code.py",
+            "mime_type": "text/plain",
+        }
+    ],
+    session_id=session_id,
+)
+
+pprint(f"Input: {response.input_messages}")
+pprint(f"Output: {response.output_message.content}")
+pprint(f"Steps: {response.steps}")
 ```
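
Reviewer note: the inference loop described in the first hunk (the LLM receives the prompt plus earlier tool outputs, may emit tool calls, tool inputs are safety-checked, then tools run) can be illustrated with a small standalone sketch. This is not the llama-stack implementation. `ModelReply`, `run_turn`, `is_safe`, and `max_iters` are hypothetical names made up for this note, and skipping a blocked tool call is an assumption about what a failed safety check does.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ModelReply:
    """Hypothetical model output: text plus optional (tool_name, argument) calls."""
    content: str
    tool_calls: List[Tuple[str, str]] = field(default_factory=list)


def run_turn(
    prompt: str,
    model: Callable[[str, List[str]], ModelReply],
    tools: Dict[str, Callable[[str], str]],
    is_safe: Callable[[str], bool],
    max_iters: int = 5,  # assumption: cap the loop so the sketch always terminates
) -> Tuple[str, List[Tuple[str, str]]]:
    """Repeat inference until the model answers without requesting a tool."""
    steps: List[Tuple[str, str]] = []
    tool_outputs: List[str] = []
    for _ in range(max_iters):
        # The model sees the user prompt together with previous tool outputs.
        reply = model(prompt, tool_outputs)
        steps.append(("inference", reply.content))
        if not reply.tool_calls:
            return reply.content, steps
        for name, arg in reply.tool_calls:
            # Tool inputs are safety-checked before execution.
            if not is_safe(arg):
                steps.append(("shield_call", f"blocked {name}"))
                continue  # assumption: a blocked call is skipped, not fatal
            result = tools[name](arg)
            tool_outputs.append(result)
            steps.append(("tool_execution", f"{name} -> {result}"))
    return "max iterations reached", steps


def fake_model(prompt: str, tool_outputs: List[str]) -> ModelReply:
    """Stand-in model: calls the `echo` tool once, then answers."""
    if not tool_outputs:
        return ModelReply("calling tool", [("echo", prompt)])
    return ModelReply(f"done: {tool_outputs[-1]}")


answer, steps = run_turn("hi", fake_model, {"echo": str.upper}, is_safe=lambda s: True)
print(answer)
for step_type, detail in steps:
    print(step_type, detail)
```

The `steps` list loosely mirrors the idea that a non-streaming response carries the input, the intermediate steps, and the final output. The step type strings are borrowed from the removed lines of the last hunk purely for readability.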