diff --git a/docs/source/building_applications/index.md b/docs/source/building_applications/index.md
index fddd957ed..b2c8fc75d 100644
--- a/docs/source/building_applications/index.md
+++ b/docs/source/building_applications/index.md
@@ -12,6 +12,7 @@ Here are some key topics that will help you build effective agents:
 - **[Agent](agent)**: Understand the components and design patterns of the Llama Stack agent framework.
 - **[Agent Execution Loop](agent_execution_loop)**: Understand how agents process information, make decisions, and execute actions in a continuous loop.
 - **[Agents vs Responses API](responses_vs_agents)**: Learn the differences between the Agents API and Responses API, and when to use each one.
+- **[OpenAI API](more_on_openai_compatibility)**: Learn how Llama Stack's OpenAI API compatibility also allows other AI frameworks to run on the platform.
 - **[Tools](tools)**: Extend your agents' capabilities by integrating with external tools and APIs.
 - **[Evals](evals)**: Evaluate your agents' effectiveness and identify areas for improvement.
 - **[Telemetry](telemetry)**: Monitor and analyze your agents' performance and behavior.
@@ -25,6 +26,7 @@ rag
 agent
 agent_execution_loop
 responses_vs_agents
+more_on_openai_compatibility
 tools
 evals
 telemetry
diff --git a/docs/source/building_applications/langchain_langgraph/index.md b/docs/source/building_applications/langchain_langgraph/index.md
new file mode 100644
index 000000000..08de2c09c
--- /dev/null
+++ b/docs/source/building_applications/langchain_langgraph/index.md
@@ -0,0 +1,36 @@
+# OpenAI, LangChain, and LangGraph via Llama Stack
+
+One popular AI framework that exposes OpenAI API compatibility is LangChain, with its [OpenAI Provider](https://python.langchain.com/docs/integrations/providers/openai/).
+
+With LangChain's OpenAI API compatibility, and using the Llama Stack OpenAI-compatible endpoint URL (`http://localhost:8321/v1/openai/v1`, for example, if you are running Llama Stack
+locally) as the OpenAI API provider, you can run your existing LangChain AI applications in your Llama Stack environment (a quick endpoint sanity check is sketched at the end of this page).
+
+There is also LangGraph, an associated but separate extension to the LangChain framework, to consider. While LangChain is excellent for creating
+linear sequences of operations (chains), LangGraph allows for more dynamic workflows (graphs) with loops, branching, and persistent state.
+This makes LangGraph ideal for sophisticated agent-based systems where the flow of control is not predetermined.
+You can use your existing LangChain components in combination with LangGraph components to create more complex,
+multi-agent applications.
+
+As this LangChain/LangGraph section of the Llama Stack docs iterates and expands, it will provide a variety of samples that vary both in
+
+- how complex the application is
+- which aspects of Llama Stack are leveraged in conjunction with the application
+
+as well as references to third-party sites with samples.
+
+Local examples:
+
+- **[Starter](langchain_langgraph)**: Explore a simple, graph-based agentic application that exposes a simple tool to add numbers together.
+
+External sites:
+
+- **[Responses](more_on_responses)**: A deeper dive into the newer OpenAI Responses API (vs. the Chat Completions API).
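+
+Before wiring LangChain into an application, you can sanity-check the Llama Stack endpoint with the plain OpenAI client. This is a minimal sketch, assuming a Llama Stack server running locally on port 8321 and the `openai` package installed (it is among the Starter example's dependencies):
+
+```python
+from openai import OpenAI
+
+# Point the standard OpenAI client at Llama Stack's OpenAI-compatible endpoint.
+# Llama Stack does not require a real API key by default, so any placeholder works.
+client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")
+
+# List the models Llama Stack has registered; a LangChain application can use
+# any of these model IDs.
+for model in client.models.list():
+    print(model.id)
+```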
+
+
+```{toctree}
+:hidden:
+:maxdepth: 1
+
+langchain_langgraph
+more_on_responses
+```
\ No newline at end of file
diff --git a/docs/source/building_applications/langchain_langgraph/langchain_langgraph.md b/docs/source/building_applications/langchain_langgraph/langchain_langgraph.md
new file mode 100644
index 000000000..d30033b52
--- /dev/null
+++ b/docs/source/building_applications/langchain_langgraph/langchain_langgraph.md
@@ -0,0 +1,120 @@
+# Example: A multi-node LangGraph agent application that registers a simple tool to add two numbers
+
+### Setup
+
+#### Activate model
+
+```bash
+ollama run llama3.2:3b-instruct-fp16 --keepalive 60m
+```
+
+Note: this blocks the terminal, as `ollama run` lets you chat with the model. Use
+
+```bash
+/bye
+```
+
+to return to your command prompt. To confirm the model is in fact running, you can run
+
+```bash
+ollama ps
+```
+
+#### Start up Llama Stack
+
+```bash
+OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack build --distro starter --image-type venv --run
+```
+
+#### Install dependencies
+
+To install LangChain, LangGraph, OpenAI, and their related dependencies, run
+
+```bash
+uv pip install langgraph langchain openai langchain_openai langchain_community
+```
+
+### Application details
+
+To run the application, from the root of your Llama Stack git repository clone, execute:
+
+```bash
+python docs/source/building_applications/langchain_langgraph/langgraph-agent-add.py
+```
+
+and you should see this in the output:
+
+```bash
+HUMAN: What is 16 plus 9?
+AI:
+TOOL: 25
+```
+
+The sample also prints some debug output that illustrates the use of the OpenAI Chat Completions API, as the `response.response_metadata`
+field corresponds to the [chat completion object](https://platform.openai.com/docs/api-reference/chat/object).
+
+```bash
+LLM returned Chat Completion object: {'token_usage': {'completion_tokens': 23, 'prompt_tokens': 169, 'total_tokens': 192, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama3.2:3b-instruct-fp16', 'system_fingerprint': 'fp_ollama', 'id': 'chatcmpl-51307b80-a1a1-4092-b005-21ea9cde29a0', 'service_tier': None, 'finish_reason': 'tool_calls', 'logprobs': None}
+```
+
+This is analogous to the object returned by the Llama Stack Client's `chat.completions.create` call.
+
+The example application leverages a series of LangGraph and LangChain APIs. The two key ones are:
+1. [ChatOpenAI](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#chatopenai) is the primary LangChain OpenAI-compatible chat API. The standard parameters for this API supply the Llama Stack OpenAI provider endpoint, followed by a model registered with Llama Stack.
+2. [StateGraph](https://langchain-ai.github.io/langgraph/reference/graphs/#langgraph.graph.state.StateGraph) provides the LangGraph API for building the nodes and edges of the graph that define the (potential) steps of the agentic workflow.
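+
+For orientation, here is how these two APIs fit together, condensed from the full sample included below into a minimal one-node sketch; the endpoint URL and model name assume the local Ollama-backed setup described above:
+
+```python
+from langgraph.graph import StateGraph, END
+from langchain_core.messages import HumanMessage
+from langchain_openai import ChatOpenAI
+
+# ChatOpenAI pointed at the Llama Stack OpenAI-compatible endpoint
+llm = ChatOpenAI(
+    model="ollama/llama3.2:3b-instruct-fp16",
+    openai_api_key="none",
+    openai_api_base="http://localhost:8321/v1/openai/v1",
+)
+
+# StateGraph defines the nodes (steps) and edges (transitions) of the workflow;
+# this trivial graph has a single node that calls the LLM and then ends.
+graph = StateGraph(dict)
+graph.add_node("agent", lambda state: {"messages": state["messages"] + [llm.invoke(state["messages"])]})
+graph.set_entry_point("agent")
+graph.add_edge("agent", END)
+
+result = graph.compile().invoke({"messages": [HumanMessage(content="Say hello in one word.")]})
+print(result["messages"][-1].content)
+```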
+
+Additional LangChain APIs are leveraged in order to:
+- register or [bind](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.bind_tools) the tool used by the LangGraph agentic workflow
+- [process](https://api.python.langchain.com/en/latest/agents/langchain.agents.format_scratchpad.openai_tools.format_to_openai_tool_messages.html) any messages generated by the workflow
+- supply [user](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.human.HumanMessage.html) and [tool](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) prompts
+
+Ultimately, this agentic workflow application performs the simple task of adding numbers together.
+
+```{literalinclude} ./langgraph-agent-add.py
+:language: python
+```
+
+#### Minor application tweak: the OpenAI Responses API
+
+It is very easy to switch from the default OpenAI Chat Completions API to the newer OpenAI Responses API. Simply add the `use_responses_api=True` flag to the `ChatOpenAI` constructor:
+
+```python
+llm = ChatOpenAI(
+    model="ollama/llama3.2:3b-instruct-fp16",
+    openai_api_key="none",
+    openai_api_base="http://localhost:8321/v1/openai/v1",
+    use_responses_api=True).bind_tools(tools)
+```
+
+For convenience, here is the entire sample with that change to the constructor:
+
+```{literalinclude} ./langgraph-agent-add-via-responses.py
+:language: python
+```
+
+If you examine the Llama Stack server logs while running the application, you'll see calls to the `/v1/openai/v1/responses` REST API instead of the `/v1/openai/v1/chat/completions` REST API.
+
+In the sample application's output, the debug statement displaying the response from the LLM will now show that, instead of a chat completion object,
+the LLM returns a [response object from the Responses API](https://platform.openai.com/docs/api-reference/responses/object).
+
+```bash
+LLM returned Responses object: {'id': 'resp-9dbaa1e1-7ba4-45cd-978e-e84448aee278', 'created_at': 1756140326.0, 'model': 'ollama/llama3.2:3b-instruct-fp16', 'object': 'response', 'status': 'completed', 'model_name': 'ollama/llama3.2:3b-instruct-fp16'}
+```
+
+This is analogous to the object returned by the Llama Stack Client's `responses.create` call.
+
+The Responses API is considered the next generation of OpenAI's core agentic API primitive.
+For a detailed comparison with, and migration suggestions from, the Chat Completions API, visit the [OpenAI documentation](https://platform.openai.com/docs/guides/migrate-to-responses).
+
+### Comparing Llama Stack Agents and LangGraph Agents
+
+Expressing the agent workflow as a LangGraph `StateGraph` is an alternative to the Llama Stack agent execution
+loop discussed in [this prior section](../agent_execution_loop.md).
+
+To summarize some of the key takeaways detailed earlier:
+- LangGraph does not offer the easy integration with Llama Stack's API providers, such as the shields and safety mechanisms, that Llama Stack Agents benefit from.
+- Llama Stack agents follow a simpler, predefined sequence of steps executed in a loop as the standard execution pattern (similar to LangChain chains), where multiple LLM calls
+and tool invocations are possible.
+- LangGraph execution order is more flexible, with edges allowing for conditional branching along with loops. Each node is an LLM call
+or a tool call. Also, a mutable state is passed between nodes, enabling complex, multi-turn interactions and adaptive behavior, i.e., workflow orchestration.
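+
+To make the contrast concrete, here is a rough sketch of the same add-numbers task expressed as a Llama Stack agent. This is illustrative only, not one of the runnable samples above: it assumes the `llama-stack-client` Python package and its `Agent` helper (which accepts plain Python functions as client tools), and exact names may differ across client versions:
+
+```python
+from llama_stack_client import Agent, LlamaStackClient
+
+
+def add_numbers(x: int, y: int) -> int:
+    """Add two integers together."""
+    return x + y
+
+
+client = LlamaStackClient(base_url="http://localhost:8321")
+
+# No explicit graph here: the execution loop (LLM call -> tool call -> LLM call)
+# is managed by Llama Stack's agent runtime rather than by node/edge wiring.
+agent = Agent(
+    client,
+    model="ollama/llama3.2:3b-instruct-fp16",
+    instructions="You are a helpful assistant. Use the add_numbers tool for arithmetic.",
+    tools=[add_numbers],
+)
+
+session_id = agent.create_session("add-numbers-session")
+response = agent.create_turn(
+    messages=[{"role": "user", "content": "What is 16 plus 9?"}],
+    session_id=session_id,
+    stream=False,
+)
+print(response.output_message.content)
+```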
\ No newline at end of file
diff --git a/docs/source/building_applications/langchain_langgraph/langgraph-agent-add-via-responses.py b/docs/source/building_applications/langchain_langgraph/langgraph-agent-add-via-responses.py
new file mode 100644
index 000000000..e7f83a53f
--- /dev/null
+++ b/docs/source/building_applications/langchain_langgraph/langgraph-agent-add-via-responses.py
@@ -0,0 +1,72 @@
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the terms described in the LICENSE file in
+# the root directory of this source tree.
+
+from langgraph.graph import StateGraph, END
+from langchain_core.messages import HumanMessage, ToolMessage
+from langchain.agents import tool
+from langchain_openai import ChatOpenAI
+from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages

+# --- Tool ---
+@tool
+def add_numbers(x: int, y: int) -> int:
+    """Add two integers together."""
+    return x + y
+
+tools = [add_numbers]
+
+# --- LLM that supports function-calling ---
+llm = ChatOpenAI(
+    model="ollama/llama3.2:3b-instruct-fp16",
+    openai_api_key="none",
+    openai_api_base="http://localhost:8321/v1/openai/v1",
+    use_responses_api=True
+).bind_tools(tools)
+
+# --- Node that runs the agent ---
+def agent_node(state):
+    messages = state["messages"]
+    if "scratchpad" in state:
+        messages += format_to_openai_tool_messages(state["scratchpad"])
+    response = llm.invoke(messages)
+    print(f"LLM returned Responses object: {response.response_metadata}")
+    return {
+        "messages": messages + [response],
+        "intermediate_step": response,
+    }
+
+# --- Node that executes tool call ---
+def tool_node(state):
+    tool_call = state["intermediate_step"].tool_calls[0]
+    result = add_numbers.invoke(tool_call["args"])
+    return {
+        "messages": state["messages"] + [
+            ToolMessage(tool_call_id=tool_call["id"], content=str(result))
+        ]
+    }
+
+# --- Build LangGraph ---
+graph = StateGraph(dict)
+graph.add_node("agent", agent_node)
+graph.add_node("tool", tool_node)
+
+graph.set_entry_point("agent")
+graph.add_edge("agent", "tool")
+graph.add_edge("tool", END)
+
+compiled_graph = graph.compile()
+
+# --- Run it ---
+initial_state = {
+    "messages": [HumanMessage(content="What is 16 plus 9?")]
+}
+
+final_state = compiled_graph.invoke(initial_state)
+
+# --- Output ---
+for msg in final_state["messages"]:
+    print(f"{msg.type.upper()}: {msg.content}")
diff --git a/docs/source/building_applications/langchain_langgraph/langgraph-agent-add.py b/docs/source/building_applications/langchain_langgraph/langgraph-agent-add.py
new file mode 100644
index 000000000..61ab84724
--- /dev/null
+++ b/docs/source/building_applications/langchain_langgraph/langgraph-agent-add.py
@@ -0,0 +1,70 @@
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the terms described in the LICENSE file in
+# the root directory of this source tree.
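+
+# This sample builds a minimal two-node LangGraph workflow: an "agent" node
+# that sends the conversation to the LLM (LangChain's ChatOpenAI pointed at
+# Llama Stack's OpenAI-compatible endpoint), and a "tool" node that executes
+# the tool call requested by the LLM.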
+
+from langgraph.graph import StateGraph, END
+from langchain_core.messages import HumanMessage, ToolMessage
+from langchain.agents import tool
+from langchain_openai import ChatOpenAI
+from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages
+
+# --- Tool ---
+@tool
+def add_numbers(x: int, y: int) -> int:
+    """Add two integers together."""
+    return x + y
+
+tools = [add_numbers]
+
+# --- LLM that supports function-calling ---
+llm = ChatOpenAI(
+    model="ollama/llama3.2:3b-instruct-fp16",
+    openai_api_key="none",
+    openai_api_base="http://localhost:8321/v1/openai/v1"
+).bind_tools(tools)
+
+# --- Node that runs the agent ---
+def agent_node(state):
+    messages = state["messages"]
+    if "scratchpad" in state:
+        messages += format_to_openai_tool_messages(state["scratchpad"])
+    response = llm.invoke(messages)
+    print(f"LLM returned Chat Completion object: {response.response_metadata}")
+    return {
+        "messages": messages + [response],
+        "intermediate_step": response,
+    }
+
+# --- Node that executes tool call ---
+def tool_node(state):
+    tool_call = state["intermediate_step"].tool_calls[0]
+    result = add_numbers.invoke(tool_call["args"])
+    return {
+        "messages": state["messages"] + [
+            ToolMessage(tool_call_id=tool_call["id"], content=str(result))
+        ]
+    }
+
+# --- Build LangGraph ---
+graph = StateGraph(dict)
+graph.add_node("agent", agent_node)
+graph.add_node("tool", tool_node)
+
+graph.set_entry_point("agent")
+graph.add_edge("agent", "tool")
+graph.add_edge("tool", END)
+
+compiled_graph = graph.compile()
+
+# --- Run it ---
+initial_state = {
+    "messages": [HumanMessage(content="What is 16 plus 9?")]
+}
+
+final_state = compiled_graph.invoke(initial_state)
+
+# --- Output ---
+for msg in final_state["messages"]:
+    print(f"{msg.type.upper()}: {msg.content}")
diff --git a/docs/source/building_applications/langchain_langgraph/more_on_responses.md b/docs/source/building_applications/langchain_langgraph/more_on_responses.md
new file mode 100644
index 000000000..4c2c76baf
--- /dev/null
+++ b/docs/source/building_applications/langchain_langgraph/more_on_responses.md
@@ -0,0 +1,21 @@
+# Deep dive references for Llama Stack, OpenAI Responses API, and LangChain/LangGraph
+
+Examples that combine the Llama Stack Client API with, say:
+- the OpenAI Responses API
+- a wide variety of frameworks, such as the LangChain API
+
+are rapidly evolving throughout various code repositories, blogs, and documentation sites.
+
+The scenarios covered at such locations are impossible to enumerate and keep current here, but at a minimum they
+include:
+- Simple model inference
+- RAG with document search
+- Tool calling to MCP servers
+- Complex multi-step workflows
+
+Rather than duplicate these Llama Stack Client examples in this documentation site, this section provides
+references to those external sites.
+
+## The AI Alliance
+
+Consider the Responses API examples detailed [here](https://github.com/The-AI-Alliance/llama-stack-examples/blob/main/notebooks/01-responses/README.md).
\ No newline at end of file
diff --git a/docs/source/building_applications/more_on_openai_compatibility.md b/docs/source/building_applications/more_on_openai_compatibility.md
new file mode 100644
index 000000000..623be5e3b
--- /dev/null
+++ b/docs/source/building_applications/more_on_openai_compatibility.md
@@ -0,0 +1,21 @@
+# More on Llama Stack's OpenAI API Compatibility and Other AI Frameworks
+
+Many other agentic frameworks also recognize the value of providing OpenAI API compatibility to allow for coupling
+with their framework-specific APIs, similar to the use of the OpenAI Responses API from a Llama Stack Client
+instance as described in the previous [Agents vs Responses API](responses_vs_agents) section.
+
+This OpenAI API compatibility becomes a "least common denominator" of sorts: it allows agentic applications written
+with these other frameworks to be migrated onto AI infrastructure leveraging Llama Stack. Once on Llama Stack, the application maintainer
+can then leverage all the advantages Llama Stack provides, as summarized in the [Core Concepts section](../concepts/index.md).
+
+As the Llama Stack community continues to dive into these different AI frameworks with OpenAI API compatibility, a
+variety of documentation sections, examples, and references will be provided. Here is what is currently available:
+
+- **[LangChain/LangGraph](langchain_langgraph/index)**: the LangChain and associated LangGraph AI frameworks.
+
+```{toctree}
+:hidden:
+:maxdepth: 1
+
+langchain_langgraph/index
+```
\ No newline at end of file