llama-stack-mirror/tests/unit
Ben Browning 655d3d0466 fix: annotations list and web_search_preview in Responses
These are a couple of fixes to get an example LangChain app working
with our OpenAI Responses API implementation.

The Responses API spec requires an annotations array in
output[*].content[*].annotations, and we were not providing one. This
change adds it as an empty list, even though we don't populate it yet.
That prevents errors from client libraries like LangChain that expect
the field to always exist, even if it is an empty list.
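
For reference, here is a minimal sketch of the output shape clients now see. Field names follow the OpenAI Responses API; this is illustrative, not the actual implementation code:

```python
# Illustrative shape of a Responses API output item; the real objects are
# built by the meta-reference Responses implementation in Llama Stack.
response = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "Hello!",
                    # Previously missing; now always present, even when empty.
                    "annotations": [],
                }
            ],
        }
    ]
}
```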

The other fix is that `web_search_preview` is a valid name for the web
search tool in the Responses API, but we previously only accepted
`web_search` or `web_search_preview_2025_03_11`.
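
A rough sketch of the alias handling (names here are hypothetical; the real matching lives in the meta-reference Responses provider):

```python
# Hypothetical helper illustrating the fix: all three tool type names
# should resolve to the same built-in web search tool.
WEB_SEARCH_TOOL_TYPES = {
    "web_search",
    "web_search_preview",
    "web_search_preview_2025_03_11",
}


def is_web_search_tool(tool_type: str) -> bool:
    """Return True if a Responses tool spec refers to the web search tool."""
    return tool_type in WEB_SEARCH_TOOL_TYPES
```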

The existing Responses unit tests were expanded to cover these cases
and can be run via:

```
pytest -sv tests/unit/providers/agents/meta_reference/test_openai_responses.py
```
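
As an illustration, the new cases assert roughly the following (sketch only; the actual tests are in test_openai_responses.py):

```python
# Sketch of the kind of assertions added; not the literal test code.
def check_annotations_always_present(response) -> None:
    for output_item in response.output:
        for content_part in output_item.content:
            # Every output text part must carry an annotations list,
            # even when there is nothing to annotate yet.
            assert content_part.annotations == []
```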

The existing test_openai_responses.py integration tests still pass
with this change, verified against Fireworks as shown below:

```
uv run llama stack run llama_stack/templates/starter/run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv tests/integration/agents/test_openai_responses.py \
  --text-model accounts/fireworks/models/llama4-scout-instruct-basic
```

Lastly, this example LangChain app now works with Llama Stack (tested
in this case with Ollama in the starter template):

```python
from langchain_openai import ChatOpenAI

# Point LangChain at Llama Stack's OpenAI-compatible endpoint.
llm = ChatOpenAI(
    base_url="http://localhost:8321/v1/openai/v1",
    api_key="fake",
    model="ollama/meta-llama/Llama-3.2-3B-Instruct",
)

# Bind the built-in web search tool using the `web_search_preview` alias.
tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])

response = llm_with_tools.invoke("What was a positive news story from today?")

print(response.content)
```

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-25 15:14:10 -04:00
| Path | Last commit | Date |
| --- | --- | --- |
| cli | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| distribution | feat: fine grained access control policy (#2264) | 2025-06-03 14:51:12 -07:00 |
| files | feat: support pagination in inference/responses stores (#2397) | 2025-06-16 22:43:35 -07:00 |
| models | chore: remove usage of load_tiktoken_bpe (#2276) | 2025-06-02 07:33:37 -07:00 |
| providers | fix: annotations list and web_search_preview in Responses | 2025-06-25 15:14:10 -04:00 |
| rag | feat: Enable ingestion of precomputed embeddings (#2317) | 2025-05-31 04:03:37 -06:00 |
| registry | feat: fine grained access control policy (#2264) | 2025-06-03 14:51:12 -07:00 |
| server | feat: Add url field to PaginatedResponse and populate it using route … (#2419) | 2025-06-16 11:19:48 +02:00 |
| utils | feat: support auth attributes in inference/responses stores (#2389) | 2025-06-20 10:24:45 -07:00 |
| __init__.py | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| conftest.py | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| fixtures.py | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| README.md | chore: bump python supported version to 3.12 (#2475) | 2025-06-24 09:22:04 +05:30 |

Llama Stack Unit Tests

You can run the unit tests by running:

```
source .venv/bin/activate
./scripts/unit-tests.sh [PYTEST_ARGS]
```

Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., `-vvv` for verbosity). If no test directory is specified, it defaults to "tests/unit", for example:

```
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
```

If you'd like to run against a non-default version of Python (the default is currently 3.12), pass the PYTHON_VERSION variable as follows:

```
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
```