mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 04:04:14 +00:00

Merge 20fd5ff54c into 48a551ecbc

This commit is contained in commit 8c78fd097e

7 changed files with 342 additions and 0 deletions
@ -12,6 +12,7 @@ Here are some key topics that will help you build effective agents:

- **[Agent](agent)**: Understand the components and design patterns of the Llama Stack agent framework.
- **[Agent Execution Loop](agent_execution_loop)**: Understand how agents process information, make decisions, and execute actions in a continuous loop.
- **[Agents vs Responses API](responses_vs_agents)**: Learn the differences between the Agents API and Responses API, and when to use each one.
- **[OpenAI API](more_on_openai_compatibility)**: Learn how Llama Stack's OpenAI API compatibility also allows other AI frameworks to be used on the platform.
- **[Tools](tools)**: Extend your agents' capabilities by integrating with external tools and APIs.
- **[Evals](evals)**: Evaluate your agents' effectiveness and identify areas for improvement.
- **[Telemetry](telemetry)**: Monitor and analyze your agents' performance and behavior.
@ -25,6 +26,7 @@ rag
agent
agent_execution_loop
responses_vs_agents
more_on_openai_compatibility
tools
evals
telemetry
@ -0,0 +1,36 @@

# OpenAI, LangChain, and LangGraph via Llama Stack

One popular AI framework that exposes OpenAI API compatibility is LangChain, with its [OpenAI Provider](https://python.langchain.com/docs/integrations/providers/openai/).

With LangChain's OpenAI API compatibility, and using the Llama Stack OpenAI-compatible endpoint URL as the OpenAI API provider (`http://localhost:8321/v1/openai/v1`, for example, if you are running Llama Stack locally), you can run your existing LangChain AI applications in your Llama Stack environment.

There is also LangGraph, an associated but separate extension to the LangChain framework, to consider. While LangChain is excellent for creating linear sequences of operations (chains), LangGraph allows for more dynamic workflows (graphs) with loops, branching, and persistent state. This makes LangGraph ideal for sophisticated agent-based systems where the flow of control is not predetermined. You can use your existing LangChain components in combination with LangGraph components to create more complex, multi-agent applications.
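The chains-versus-graphs distinction can be sketched in dependency-free Python. Note that `run_chain`, `run_graph`, and the node names below are purely illustrative stand-ins, not LangChain or LangGraph APIs:

```python
# Illustrative sketch only: a toy "chain" vs. a toy "graph" dispatcher.

def run_chain(steps, value):
    """A chain: a fixed, linear sequence of operations."""
    for step in steps:
        value = step(value)
    return value

def run_graph(nodes, edges, state, entry):
    """A graph: each edge picks the next node from the mutable state,
    allowing branching and loops until a terminal marker is reached."""
    current = entry
    while current != "END":
        state = nodes[current](state)
        current = edges[current](state)  # conditional routing
    return state

# Chain: always the same two steps -- double, then increment.
chained = run_chain([lambda v: v * 2, lambda v: v + 1], 5)  # 11

# Graph: loop in the "work" node until the counter reaches 3.
nodes = {"work": lambda s: {**s, "count": s["count"] + 1}}
edges = {"work": lambda s: "END" if s["count"] >= 3 else "work"}
looped = run_graph(nodes, edges, {"count": 0}, "work")  # {'count': 3}
```

The state dict carried through `run_graph` plays the role of LangGraph's persistent state: the routing decision at each step depends on what earlier nodes wrote into it.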
As this LangChain/LangGraph section of the Llama Stack docs iterates and expands, a variety of samples will be provided that vary both in

- how complex the application is
- which aspects of Llama Stack are leveraged in conjunction with the application

along with references to third-party sites with samples.

Local examples:

- **[Starter](langchain_langgraph)**: Explore a simple, graph-based agentic application that exposes a simple tool to add numbers together.

External sites:

- **[Responses](more_on_responses)**: A deeper dive into the newer OpenAI Responses API (vs. the Chat Completions API).

```{toctree}
:hidden:
:maxdepth: 1

langchain_langgraph
more_on_responses
```

@ -0,0 +1,120 @@

# Example: A multi-node LangGraph agent application that registers a simple tool to add two numbers together

### Setup

#### Activate model

```bash
ollama run llama3.2:3b-instruct-fp16 --keepalive 60m
```

Note: this blocks the terminal, as `ollama run` lets you chat with the model. Use

```bash
/bye
```

to return to your command prompt. To confirm the model is in fact running, you can run

```bash
ollama ps
```

#### Start up Llama Stack

```bash
OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack build --distro starter --image-type venv --run
```

#### Install dependencies

To install LangChain, LangGraph, OpenAI, and their related dependencies, run

```bash
uv pip install langgraph langchain openai langchain_openai langchain_community
```

### Application details

To run the application, from the root of your Llama Stack git repository clone, execute:

```bash
python docs/source/building_applications/langgraph-agent-add.py
```

and you should see this in the output:

```bash
HUMAN: What is 16 plus 9?
AI:
TOOL: 25
```

The sample also adds some debug output that illustrates the use of the OpenAI Chat Completions API, as the `response.response_metadata` field equates to the [Chat Completion Object](https://platform.openai.com/docs/api-reference/chat/object).

```bash
LLM returned Chat Completion object: {'token_usage': {'completion_tokens': 23, 'prompt_tokens': 169, 'total_tokens': 192, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'llama3.2:3b-instruct-fp16', 'system_fingerprint': 'fp_ollama', 'id': 'chatcmpl-51307b80-a1a1-4092-b005-21ea9cde29a0', 'service_tier': None, 'finish_reason': 'tool_calls', 'logprobs': None}
```

This is analogous to the object returned by the Llama Stack Client's `chat.completions.create` call.
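Since `response_metadata` is a plain dict, an application can inspect it directly. A small sketch using the values from the debug line above (the dict literal here is copied from that output, not fetched from a live server):

```python
# Inspecting a Chat Completion-style response_metadata dict.
metadata = {
    "token_usage": {"completion_tokens": 23, "prompt_tokens": 169,
                    "total_tokens": 192},
    "model_name": "llama3.2:3b-instruct-fp16",
    "finish_reason": "tool_calls",
}

usage = metadata["token_usage"]
# total_tokens is the sum of the prompt and completion token counts
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# finish_reason == "tool_calls" signals the model wants a tool invoked
wants_tool = metadata["finish_reason"] == "tool_calls"  # True
```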

The example application leverages a series of LangGraph and LangChain APIs. The two key ones are:

1. [ChatOpenAI](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#chatopenai) is the primary LangChain OpenAI-compatible chat API. The standard parameters for this API supply the Llama Stack OpenAI provider endpoint, followed by a model registered with Llama Stack.
2. [StateGraph](https://langchain-ai.github.io/langgraph/reference/graphs/#langgraph.graph.state.StateGraph) provides the LangGraph API for building the nodes and edges of the graph that define (potential) steps of the agentic workflow.

Additional LangChain APIs are leveraged in order to:

- register or [bind](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.bind_tools) the tool used by the LangGraph agentic workflow
- [process](https://api.python.langchain.com/en/latest/agents/langchain.agents.format_scratchpad.openai_tools.format_to_openai_tool_messages.html) any messages generated by the workflow
- supply [user](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.human.HumanMessage.html) and [tool](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) prompts

Ultimately, this agentic workflow application performs the simple task of adding numbers together.

```{literalinclude} ./langgraph-agent-add.py
:language: python
```

#### Minor application tweak - the OpenAI Responses API

It is very easy to switch from the default OpenAI Chat Completions API to the newer OpenAI Responses API. Simply modify the `ChatOpenAI` instantiation with the additional `use_responses_api=True` flag:

```python
llm = ChatOpenAI(
    model="ollama/llama3.2:3b-instruct-fp16",
    openai_api_key="none",
    openai_api_base="http://localhost:8321/v1/openai/v1",
    use_responses_api=True,
).bind_tools(tools)
```

For convenience, here is the entire sample with that change to the constructor:

```{literalinclude} ./langgraph-agent-add-via-responses.py
:language: python
```

If you examine the Llama Stack server logs while running the application, you'll see use of the `/v1/openai/v1/responses` REST API instead of the `/v1/openai/v1/chat/completions` REST API.

In the sample application's output, the debug statement displaying the response from the LLM will now illustrate that, instead of the Chat Completion Object, the LLM returns a [Response Object from the Responses API](https://platform.openai.com/docs/api-reference/responses/object).

```bash
LLM returned Responses object: {'id': 'resp-9dbaa1e1-7ba4-45cd-978e-e84448aee278', 'created_at': 1756140326.0, 'model': 'ollama/llama3.2:3b-instruct-fp16', 'object': 'response', 'status': 'completed', 'model_name': 'ollama/llama3.2:3b-instruct-fp16'}
```

This is analogous to the object returned by the Llama Stack Client's `responses.create` call.

The Responses API is considered the next generation of OpenAI's core agentic API primitive. For a detailed comparison with, and migration suggestions from, the Chat Completions API, visit the [OpenAI documentation](https://platform.openai.com/docs/guides/migrate-to-responses).

### Comparing Llama Stack Agents and LangGraph Agents

Expressing the agent workflow as a LangGraph `StateGraph` is an alternative approach to the Llama Stack agent execution loop discussed in [this prior section](../agent_execution_loop.md).

To summarize some of the key takeaways detailed earlier:

- LangGraph does not offer the easy integration with Llama Stack's API providers, such as the shields / safety mechanisms, that Llama Stack Agents benefit from.
- Llama Stack agents follow a simpler, predefined sequence of steps incorporated in a loop as the standard execution pattern (similar to LangChain), where multiple LLM calls and tool invocations are possible.
- LangGraph execution order is more flexible, with edges allowing for conditional branching along with loops. Each node is an LLM call or tool call. Also, a mutable state is passed between nodes, enabling complex, multi-turn interactions and adaptive behavior, i.e. workflow orchestration.
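To make the state-passing point concrete, here is a dependency-free sketch of the sample's agent → tool flow with a mutable dict state handed between nodes. The node functions are illustrative stand-ins (the "LLM" is hard-coded), not LangGraph APIs:

```python
# Toy two-node flow mirroring the sample: agent emits a tool call,
# tool executes it, then the graph ends.

def add_numbers(x, y):
    return x + y

def agent_node(state):
    # Stand-in for the LLM call: "decides" to invoke the tool
    # with arguments parsed from the user's question.
    state["intermediate_step"] = {"name": "add_numbers",
                                  "args": {"x": 16, "y": 9}}
    return state

def tool_node(state):
    call = state["intermediate_step"]
    result = add_numbers(**call["args"])
    state["messages"].append(("TOOL", str(result)))
    return state

state = {"messages": [("HUMAN", "What is 16 plus 9?")]}
for node in (agent_node, tool_node):  # fixed agent -> tool -> END path
    state = node(state)

for role, content in state["messages"]:
    print(f"{role}: {content}")
# HUMAN: What is 16 plus 9?
# TOOL: 25
```

In the real sample, the edges between `agent` and `tool` and the tool-call arguments come from LangGraph and the model's function-calling output rather than being hard-coded.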
@ -0,0 +1,72 @@

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, ToolMessage
from langchain.agents import tool
from langchain_openai import ChatOpenAI
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages


# --- Tool ---
@tool
def add_numbers(x: int, y: int) -> int:
    """Add two integers together."""
    return x + y


tools = [add_numbers]

# --- LLM that supports function-calling, via the OpenAI Responses API ---
llm = ChatOpenAI(
    model="ollama/llama3.2:3b-instruct-fp16",
    openai_api_key="none",
    openai_api_base="http://localhost:8321/v1/openai/v1",
    use_responses_api=True,
).bind_tools(tools)


# --- Node that runs the agent ---
def agent_node(state):
    messages = state["messages"]
    if "scratchpad" in state:
        messages += format_to_openai_tool_messages(state["scratchpad"])
    response = llm.invoke(messages)
    print(f"LLM returned Responses object: {response.response_metadata}")
    return {
        "messages": messages + [response],
        "intermediate_step": response,
    }


# --- Node that executes the tool call ---
def tool_node(state):
    tool_call = state["intermediate_step"].tool_calls[0]
    result = add_numbers.invoke(tool_call["args"])
    return {
        "messages": state["messages"] + [
            ToolMessage(tool_call_id=tool_call["id"], content=str(result))
        ]
    }


# --- Build the LangGraph ---
graph = StateGraph(dict)
graph.add_node("agent", agent_node)
graph.add_node("tool", tool_node)

graph.set_entry_point("agent")
graph.add_edge("agent", "tool")
graph.add_edge("tool", END)

compiled_graph = graph.compile()

# --- Run it ---
initial_state = {
    "messages": [HumanMessage(content="What is 16 plus 9?")]
}

final_state = compiled_graph.invoke(initial_state)

# --- Output ---
for msg in final_state["messages"]:
    print(f"{msg.type.upper()}: {msg.content}")
@ -0,0 +1,70 @@

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from langgraph.graph import StateGraph, END
from langchain_core.messages import HumanMessage, ToolMessage
from langchain.agents import tool
from langchain_openai import ChatOpenAI
from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages


# --- Tool ---
@tool
def add_numbers(x: int, y: int) -> int:
    """Add two integers together."""
    return x + y


tools = [add_numbers]

# --- LLM that supports function-calling ---
llm = ChatOpenAI(
    model="ollama/llama3.2:3b-instruct-fp16",
    openai_api_key="none",
    openai_api_base="http://localhost:8321/v1/openai/v1",
).bind_tools(tools)


# --- Node that runs the agent ---
def agent_node(state):
    messages = state["messages"]
    if "scratchpad" in state:
        messages += format_to_openai_tool_messages(state["scratchpad"])
    response = llm.invoke(messages)
    print(f"LLM returned Chat Completion object: {response.response_metadata}")
    return {
        "messages": messages + [response],
        "intermediate_step": response,
    }


# --- Node that executes the tool call ---
def tool_node(state):
    tool_call = state["intermediate_step"].tool_calls[0]
    result = add_numbers.invoke(tool_call["args"])
    return {
        "messages": state["messages"] + [
            ToolMessage(tool_call_id=tool_call["id"], content=str(result))
        ]
    }


# --- Build the LangGraph ---
graph = StateGraph(dict)
graph.add_node("agent", agent_node)
graph.add_node("tool", tool_node)

graph.set_entry_point("agent")
graph.add_edge("agent", "tool")
graph.add_edge("tool", END)

compiled_graph = graph.compile()

# --- Run it ---
initial_state = {
    "messages": [HumanMessage(content="What is 16 plus 9?")]
}

final_state = compiled_graph.invoke(initial_state)

# --- Output ---
for msg in final_state["messages"]:
    print(f"{msg.type.upper()}: {msg.content}")
@ -0,0 +1,21 @@

# Deep dive references for Llama Stack, OpenAI Responses API, and LangChain/LangGraph

Examples combining the Llama Stack Client API with, say:

- the OpenAI Responses API
- a wide variety of frameworks, such as the LangChain API

are rapidly evolving throughout various code repositories, blogs, and documentation sites.

The scenarios covered at such locations are impossible to enumerate and keep current, but they minimally include:

- Simple model inference
- RAG with document search
- Tool calling via MCP
- Complex multi-step workflows

Rather than duplicate these Llama Stack Client related examples in this documentation site, this section will provide references to these external sites.

## The AI Alliance

Consider the Responses API examples detailed [here](https://github.com/The-AI-Alliance/llama-stack-examples/blob/main/notebooks/01-responses/README.md).
@ -0,0 +1,21 @@

# More on Llama Stack's OpenAI API Compatibility and other AI Frameworks

Many other agentic frameworks also recognize the value of providing OpenAI API compatibility as a way of coupling with their framework-specific APIs, similar to the use of the OpenAI Responses API from a Llama Stack Client instance as described in the previous [Agents vs Responses API](responses_vs_agents) section.

This OpenAI API compatibility becomes a "least common denominator" of sorts, and allows for migrating agentic applications written with these other frameworks onto AI infrastructure leveraging Llama Stack. Once on Llama Stack, the application maintainer can then leverage all the advantages Llama Stack provides, as summarized in the [Core Concepts section](../concepts/index.md).

As the Llama Stack community continues to dive into these different AI frameworks with OpenAI API compatibility, a variety of documentation sections, examples, and references will be provided. Here is what is currently available:

- **[LangChain/LangGraph](langchain_langgraph/index)**: the LangChain and associated LangGraph AI frameworks.

```{toctree}
:hidden:
:maxdepth: 1

langchain_langgraph/index
```