mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 04:04:14 +00:00

Merge 20fd5ff54c into 48a551ecbc

This commit is contained in commit 8c78fd097e

7 changed files with 342 additions and 0 deletions
@ -12,6 +12,7 @@ Here are some key topics that will help you build effective agents:

- **[Agent](agent)**: Understand the components and design patterns of the Llama Stack agent framework.
- **[Agent Execution Loop](agent_execution_loop)**: Understand how agents process information, make decisions, and execute actions in a continuous loop.
- **[Agents vs Responses API](responses_vs_agents)**: Learn the differences between the Agents API and Responses API, and when to use each one.
- **[OpenAI API](more_on_openai_compatibility)**: Learn how Llama Stack's OpenAI API compatibility also allows other AI frameworks to be used on the platform.
- **[Tools](tools)**: Extend your agents' capabilities by integrating with external tools and APIs.
- **[Evals](evals)**: Evaluate your agents' effectiveness and identify areas for improvement.
- **[Telemetry](telemetry)**: Monitor and analyze your agents' performance and behavior.
@ -25,6 +26,7 @@ rag
agent
agent_execution_loop
responses_vs_agents
more_on_openai_compatibility
tools
evals
telemetry
@ -0,0 +1,36 @@

# OpenAI, LangChain, and LangGraph via Llama Stack

One popular AI framework that exposes OpenAI API compatibility is LangChain, with its [OpenAI Provider](https://python.langchain.com/docs/integrations/providers/openai/).

With LangChain's OpenAI API compatibility, and using the Llama Stack OpenAI-compatible endpoint URL as the OpenAI API provider (`http://localhost:8321/v1/openai/v1`, for example, if you are running Llama Stack locally), you can run your existing LangChain AI applications in your Llama Stack environment.

There is also LangGraph, an associated but separate extension to the LangChain framework, to consider. While LangChain is excellent for creating linear sequences of operations (chains), LangGraph allows for more dynamic workflows (graphs) with loops, branching, and persistent state. This makes LangGraph ideal for sophisticated agent-based systems where the flow of control is not predetermined. You can use your existing LangChain components in combination with LangGraph components to create more complex, multi-agent applications.
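The chains-versus-graphs distinction can be sketched in dependency-free Python. Note that `run_chain`, `run_graph`, and the node names below are purely illustrative stand-ins, not LangChain or LangGraph APIs:

```python
# Illustrative sketch only: a toy "chain" vs. a toy "graph" dispatcher.

def run_chain(steps, value):
    """A chain: a fixed, linear sequence of operations."""
    for step in steps:
        value = step(value)
    return value

def run_graph(nodes, edges, state, entry):
    """A graph: each edge picks the next node from the mutable state,
    allowing branching and loops until a terminal marker is reached."""
    current = entry
    while current != "END":
        state = nodes[current](state)
        current = edges[current](state)  # conditional routing
    return state

# Chain: always the same two steps -- double, then increment.
chained = run_chain([lambda v: v * 2, lambda v: v + 1], 5)  # 11

# Graph: loop in the "work" node until the counter reaches 3.
nodes = {"work": lambda s: {**s, "count": s["count"] + 1}}
edges = {"work": lambda s: "END" if s["count"] >= 3 else "work"}
looped = run_graph(nodes, edges, {"count": 0}, "work")  # {'count': 3}
```

The state dict carried through `run_graph` plays the role of LangGraph's persistent state: the routing decision at each step depends on what earlier nodes wrote into it.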
As this LangChain/LangGraph section of the Llama Stack docs iterates and expands, a variety of samples will be provided that vary both in

- how complex the application is
- which aspects of Llama Stack are leveraged in conjunction with the application

along with references to third-party sites with samples.

Local examples:

- **[Starter](langchain_langgraph)**: Explore a simple, graph-based agentic application that exposes a simple tool to add numbers together.

External sites:

- **[Responses](more_on_responses)**: A deeper dive into the newer OpenAI Responses API (vs. the Chat Completions API).

```{toctree}
:hidden:
:maxdepth: 1

langchain_langgraph
more_on_responses
```

@ -0,0 +1,120 @@

# Example: A multi-node LangGraph agent application that registers a simple tool to add two numbers together

### Setup

#### Activate model

```bash
ollama run llama3.2:3b-instruct-fp16 --keepalive 60m
```

Note: this blocks the terminal, as `ollama run` lets you chat with the model. Use

```bash
/bye
```

to return to your command prompt. To confirm the model is in fact running, you can run

```bash
ollama ps
```

#### Start up Llama Stack

```bash
OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack build --distro starter --image-type venv --run
```

#### Install dependencies

To install LangChain, LangGraph, OpenAI, and their related dependencies, run

```bash
uv pip install langgraph langchain openai langchain_openai langchain_community
```

### Application details

To run the application, from the root of your Llama Stack git repository clone, execute:

```bash
python docs/source/building_applications/langgraph-agent-add.py
```

and you should see this in the output:

```bash
HUMAN: What is 16 plus 9?
AI:
TOOL: 25
```

The sample also adds some debug output that illustrates the use of the OpenAI Chat Completions API, as the `response.response_metadata` field equates to the [Chat Completion Object](https://platform.openai.com/docs/api-reference/chat/object).

```bash
LLM returned Chat Completion object: {'token_usage': {'completion_tokens': 23, 'prompt_tokens': 169, 'total_tokens': 192, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama3.2:3b-instruct-fp16', 'system_fingerprint': 'fp_ollama', 'id': 'chatcmpl-51307b80-a1a1-4092-b005-21ea9cde29a0', 'service_tier': None, 'finish_reason': 'tool_calls', 'logprobs': None}
```

This is analogous to the object returned by the Llama Stack Client's `chat.completions.create` call.
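Since `response_metadata` is a plain dict, an application can inspect it directly. A small sketch using the values from the debug line above (the dict literal here is copied from that output, not fetched from a live server):

```python
# Inspecting a Chat Completion-style response_metadata dict.
metadata = {
    "token_usage": {"completion_tokens": 23, "prompt_tokens": 169,
                    "total_tokens": 192},
    "model_name": "llama3.2:3b-instruct-fp16",
    "finish_reason": "tool_calls",
}

usage = metadata["token_usage"]
# total_tokens is the sum of the prompt and completion token counts
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# finish_reason == "tool_calls" signals the model wants a tool invoked
wants_tool = metadata["finish_reason"] == "tool_calls"  # True
```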

The example application leverages a series of LangGraph and LangChain APIs. The two key ones are:

1. [ChatOpenAI](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#chatopenai) is the primary LangChain OpenAI-compatible chat API. The standard parameters for this API supply the Llama Stack OpenAI provider endpoint, followed by a model registered with Llama Stack.
2. [StateGraph](https://langchain-ai.github.io/langgraph/reference/graphs/#langgraph.graph.state.StateGraph) provides the LangGraph API for building the nodes and edges of the graph that define (potential) steps of the agentic workflow.

Additional LangChain APIs are leveraged in order to:

- register or [bind](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.bind_tools) the tool used by the LangGraph agentic workflow
- [process](https://api.python.langchain.com/en/latest/agents/langchain.agents.format_scratchpad.openai_tools.format_to_openai_tool_messages.html) any messages generated by the workflow
- supply [user](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.human.HumanMessage.html) and [tool](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) prompts

Ultimately, this agentic workflow application performs the simple task of adding numbers together.

```{literalinclude} ./langgraph-agent-add.py
:language: python
```

#### Minor application tweak - the OpenAI Responses API

It is very easy to switch from the default OpenAI Chat Completions API to the newer OpenAI Responses API. Simply modify the `ChatOpenAI` instantiation with the additional `use_responses_api=True` flag:

```python
llm = ChatOpenAI(
    model="ollama/llama3.2:3b-instruct-fp16",
    openai_api_key="none",
    openai_api_base="http://localhost:8321/v1/openai/v1",
    use_responses_api=True,
).bind_tools(tools)
```

For convenience, here is the entire sample with that change to the constructor:

```{literalinclude} ./langgraph-agent-add-via-responses.py
:language: python
```

If you examine the Llama Stack server logs while running the application, you'll see use of the `/v1/openai/v1/responses` REST API instead of the `/v1/openai/v1/chat/completions` REST API.

In the sample application's output, the debug statement displaying the response from the LLM will now illustrate that, instead of the Chat Completion Object, the LLM returns a [Response Object from the Responses API](https://platform.openai.com/docs/api-reference/responses/object).

```bash
LLM returned Responses object: {'id': 'resp-9dbaa1e1-7ba4-45cd-978e-e84448aee278', 'created_at': 1756140326.0, 'model': 'ollama/llama3.2:3b-instruct-fp16', 'object': 'response', 'status': 'completed', 'model_name': 'ollama/llama3.2:3b-instruct-fp16'}
```

This is analogous to the object returned by the Llama Stack Client's `responses.create` call.

The Responses API is considered the next generation of OpenAI's core agentic API primitive. For a detailed comparison with, and migration suggestions from, the Chat Completions API, visit the [OpenAI documentation](https://platform.openai.com/docs/guides/migrate-to-responses).

### Comparing Llama Stack Agents and LangGraph Agents

Expressing the agent workflow as a LangGraph `StateGraph` is an alternative approach to the Llama Stack agent execution loop discussed in [this prior section](../agent_execution_loop.md).

To summarize some of the key takeaways detailed earlier:

- LangGraph does not offer the easy integration with Llama Stack's API providers, such as the shields / safety mechanisms, that Llama Stack Agents benefit from.
- Llama Stack agents follow a simpler, predefined sequence of steps incorporated in a loop as the standard execution pattern (similar to LangChain), where multiple LLM calls and tool invocations are possible.
- LangGraph execution order is more flexible, with edges allowing for conditional branching along with loops. Each node is an LLM call or tool call. Also, a mutable state is passed between nodes, enabling complex, multi-turn interactions and adaptive behavior, i.e. workflow orchestration.
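To make the state-passing point concrete, here is a dependency-free sketch of the sample's agent → tool flow with a mutable dict state handed between nodes. The node functions are illustrative stand-ins (the "LLM" is hard-coded), not LangGraph APIs:

```python
# Toy two-node flow mirroring the sample: agent emits a tool call,
# tool executes it, then the graph ends.

def add_numbers(x, y):
    return x + y

def agent_node(state):
    # Stand-in for the LLM call: "decides" to invoke the tool
    # with arguments parsed from the user's question.
    state["intermediate_step"] = {"name": "add_numbers",
                                  "args": {"x": 16, "y": 9}}
    return state

def tool_node(state):
    call = state["intermediate_step"]
    result = add_numbers(**call["args"])
    state["messages"].append(("TOOL", str(result)))
    return state

state = {"messages": [("HUMAN", "What is 16 plus 9?")]}
for node in (agent_node, tool_node):  # fixed agent -> tool -> END path
    state = node(state)

for role, content in state["messages"]:
    print(f"{role}: {content}")
# HUMAN: What is 16 plus 9?
# TOOL: 25
```

In the real sample, the edges between `agent` and `tool` and the tool-call arguments come from LangGraph and the model's function-calling output rather than being hard-coded.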
@ -0,0 +1,72 @@

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, ToolMessage
from langchain.agents import tool
from langchain_openai import ChatOpenAI
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages


# --- Tool ---
@tool
def add_numbers(x: int, y: int) -> int:
    """Add two integers together."""
    return x + y


tools = [add_numbers]

# --- LLM that supports function-calling, via the OpenAI Responses API ---
llm = ChatOpenAI(
    model="ollama/llama3.2:3b-instruct-fp16",
    openai_api_key="none",
    openai_api_base="http://localhost:8321/v1/openai/v1",
    use_responses_api=True,
).bind_tools(tools)


# --- Node that runs the agent ---
def agent_node(state):
    messages = state["messages"]
    if "scratchpad" in state:
        messages += format_to_openai_tool_messages(state["scratchpad"])
    response = llm.invoke(messages)
    print(f"LLM returned Responses object: {response.response_metadata}")
    return {
        "messages": messages + [response],
        "intermediate_step": response,
    }


# --- Node that executes the tool call ---
def tool_node(state):
    tool_call = state["intermediate_step"].tool_calls[0]
    result = add_numbers.invoke(tool_call["args"])
    return {
        "messages": state["messages"] + [
            ToolMessage(tool_call_id=tool_call["id"], content=str(result))
        ]
    }


# --- Build the LangGraph ---
graph = StateGraph(dict)
graph.add_node("agent", agent_node)
graph.add_node("tool", tool_node)

graph.set_entry_point("agent")
graph.add_edge("agent", "tool")
graph.add_edge("tool", END)

compiled_graph = graph.compile()

# --- Run it ---
initial_state = {
    "messages": [HumanMessage(content="What is 16 plus 9?")]
}

final_state = compiled_graph.invoke(initial_state)

# --- Output ---
for msg in final_state["messages"]:
    print(f"{msg.type.upper()}: {msg.content}")
@ -0,0 +1,70 @@

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, ToolMessage
from langchain.agents import tool
from langchain_openai import ChatOpenAI
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages


# --- Tool ---
@tool
def add_numbers(x: int, y: int) -> int:
    """Add two integers together."""
    return x + y


tools = [add_numbers]

# --- LLM that supports function-calling ---
llm = ChatOpenAI(
    model="ollama/llama3.2:3b-instruct-fp16",
    openai_api_key="none",
    openai_api_base="http://localhost:8321/v1/openai/v1",
).bind_tools(tools)


# --- Node that runs the agent ---
def agent_node(state):
    messages = state["messages"]
    if "scratchpad" in state:
        messages += format_to_openai_tool_messages(state["scratchpad"])
    response = llm.invoke(messages)
    print(f"LLM returned Chat Completion object: {response.response_metadata}")
    return {
        "messages": messages + [response],
        "intermediate_step": response,
    }


# --- Node that executes the tool call ---
def tool_node(state):
    tool_call = state["intermediate_step"].tool_calls[0]
    result = add_numbers.invoke(tool_call["args"])
    return {
        "messages": state["messages"] + [
            ToolMessage(tool_call_id=tool_call["id"], content=str(result))
        ]
    }


# --- Build the LangGraph ---
graph = StateGraph(dict)
graph.add_node("agent", agent_node)
graph.add_node("tool", tool_node)

graph.set_entry_point("agent")
graph.add_edge("agent", "tool")
graph.add_edge("tool", END)

compiled_graph = graph.compile()

# --- Run it ---
initial_state = {
    "messages": [HumanMessage(content="What is 16 plus 9?")]
}

final_state = compiled_graph.invoke(initial_state)

# --- Output ---
for msg in final_state["messages"]:
    print(f"{msg.type.upper()}: {msg.content}")
@ -0,0 +1,21 @@

# Deep dive references for Llama Stack, OpenAI Responses API, and LangChain/LangGraph

Examples combining the Llama Stack Client API with, say:

- the OpenAI Responses API
- a wide variety of frameworks, such as the LangChain API

are rapidly evolving throughout various code repositories, blogs, and documentation sites.

The scenarios covered at such locations are impossible to enumerate and keep current, but they minimally include:

- Simple model inference
- RAG with document search
- Tool calling via MCP
- Complex multi-step workflows

Rather than duplicate these Llama Stack Client related examples in this documentation site, this section will provide references to these external sites.

## The AI Alliance

Consider the Responses API examples detailed [here](https://github.com/The-AI-Alliance/llama-stack-examples/blob/main/notebooks/01-responses/README.md).
@ -0,0 +1,21 @@

# More on Llama Stack's OpenAI API Compatibility and other AI Frameworks

Many other agentic frameworks also recognize the value of providing OpenAI API compatibility as a way of coupling with their framework-specific APIs, similar to the use of the OpenAI Responses API from a Llama Stack Client instance as described in the previous [Agents vs Responses API](responses_vs_agents) section.

This OpenAI API compatibility becomes a "least common denominator" of sorts, and allows for migrating agentic applications written with these other frameworks onto AI infrastructure leveraging Llama Stack. Once on Llama Stack, the application maintainer can then leverage all the advantages Llama Stack provides, as summarized in the [Core Concepts section](../concepts/index.md).

As the Llama Stack community continues to dive into these different AI frameworks with OpenAI API compatibility, a variety of documentation sections, examples, and references will be provided. Here is what is currently available:

- **[LangChain/LangGraph](langchain_langgraph/index)**: the LangChain and associated LangGraph AI frameworks.

```{toctree}
:hidden:
:maxdepth: 1

langchain_langgraph/index
```