mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-06-27 18:50:41 +00:00
# What does this PR do? These are a couple of fixes to get an example LangChain app working with our OpenAI Responses API implementation. The Responses API spec requires an annotations array in `output[*].content[*].annotations` and we were not providing one. So, this adds that as an empty list, even though we don't do anything to populate it yet. This prevents an error from client libraries like Langchain that expect this field to always exist, even if an empty list. The other fix is `web_search_preview` is a valid name for the web search tool in the Responses API, but we only responded to `web_search` or `web_search_preview_2025_03_11`. ## Test Plan The existing Responses unit tests were expanded to test these cases, via: ``` pytest -sv tests/unit/providers/agents/meta_reference/test_openai_responses.py ``` The existing test_openai_responses.py integration tests still pass with this change, tested as below with Fireworks: ``` uv run llama stack run llama_stack/templates/starter/run.yaml LLAMA_STACK_CONFIG=http://localhost:8321 \ uv run pytest -sv tests/integration/agents/test_openai_responses.py \ --text-model accounts/fireworks/models/llama4-scout-instruct-basic ``` Lastly, this example LangChain app now works with Llama stack (tested with Ollama in the starter template in this case). This LangChain code is using the example snippets for using Responses API at https://python.langchain.com/docs/integrations/chat/openai/#responses-api ```python from langchain_openai import ChatOpenAI llm = ChatOpenAI( base_url="http://localhost:8321/v1/openai/v1", api_key="fake", model="ollama/meta-llama/Llama-3.2-3B-Instruct", ) tool = {"type": "web_search_preview"} llm_with_tools = llm.bind_tools([tool]) response = llm_with_tools.invoke("What was a positive news story from today?") print(response.content) ``` Signed-off-by: Ben Browning <bbrownin@redhat.com> |
||
---|---|---|
.. | ||
_static | ||
notebooks | ||
openapi_generator | ||
resources | ||
source | ||
zero_to_hero_guide | ||
conftest.py | ||
contbuild.sh | ||
dog.jpg | ||
getting_started.ipynb | ||
getting_started_llama4.ipynb | ||
getting_started_llama_api.ipynb | ||
license_header.txt | ||
make.bat | ||
Makefile | ||
readme.md |
Llama Stack Documentation
Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our ReadTheDocs page.
Render locally
From the llama-stack root directory, run the following command to render the docs locally:
uv run --group docs sphinx-autobuild docs/source docs/build/html --write-all
You can open up the docs in your browser at http://localhost:8000
Content
Try out Llama Stack's capabilities through our detailed Jupyter notebooks:
- Building AI Applications Notebook - A comprehensive guide to building production-ready AI applications using Llama Stack
- Benchmark Evaluations Notebook - Detailed performance evaluations and benchmarking results
- Zero-to-Hero Guide - Step-by-step guide for getting started with Llama Stack