llama-stack/tests/unit

Latest commit: feat: function tools in OpenAI Responses (#2094) (`8e316c9b1e`, Ben Browning)
# What does this PR do?

This is a combination of what were previously three separate PRs: #2069,
#2075, and #2083. It turns out all three are needed to land a working
function calling Responses implementation. The web search builtin tool
was already working, but this wires in support for custom function
calling.

I ended up combining all three into one PR because they had extensive
merge conflicts, both with each other and with the just-landed #1806,
and because landing any of them individually would have left only a
partially working implementation merged.

The new things added here are:
* Storing of input items from previous responses, and restoring of those
input items when adding previous responses to the conversation state
* Handling of multiple input item message roles, not just "user"
messages
* Support for custom tools passed into the Responses API, enabling
function calling beyond the builtin web search tool (see the sketch
after this list)
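
To make that last point concrete, here is a minimal client-side sketch of
passing a custom function tool to the Responses API. The base URL, API key,
and model name are illustrative assumptions, not values taken from this PR:

```
# Hypothetical sketch; base_url, api_key, and model are placeholder
# assumptions, not values defined by this PR.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# A custom function tool, as opposed to the builtin web search tool
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

response = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    input="What is the weather in Paris?",
    tools=tools,
)

# If the model decides to call the function, the output contains a
# function_call item whose arguments the caller executes locally.
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)
```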

Closes #2074
Closes #2080

## Test Plan

### Unit Tests

Several new unit tests were added, and they all pass. Ran via:

```
python -m pytest -s -v tests/unit/providers/agents/meta_reference/test_openai_responses.py
```

### Responses API Verification Tests

I ran our verification run.yaml against multiple providers to ensure we
were getting a decent pass rate. Specifically, I ensured the new custom
tool verification test passed across multiple providers and that the
multi-turn examples passed across at least some of the providers (some
providers still struggle with multi-turn workflows).
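
For reference, the multi-turn shape those tests exercise looks roughly like
the continuation below of the earlier sketch, where the locally computed
function result is fed back and chained to the prior turn via
`previous_response_id`; the result payload here is a made-up example:

```
# Continuing the sketch above: return the function result as a
# function_call_output item, chained via previous_response_id so the
# input items stored for the first turn are restored server-side.
call = next(item for item in response.output if item.type == "function_call")

follow_up = client.responses.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    previous_response_id=response.id,
    input=[
        {
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": '{"temperature_c": 18, "conditions": "cloudy"}',
        }
    ],
    tools=tools,
)
print(follow_up.output_text)
```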

Running the stack setup for verification testing:

```
llama stack run --image-type venv tests/verifications/openai-api-verification-run.yaml
```

As an example, the Together provider passes 100%:

```
pytest -s -v 'tests/verifications/openai_api/test_responses.py' --provider=together-llama-stack
```

## Documentation

We will need to start documenting the OpenAI APIs, but the Responses
implementation is still evolving rapidly, so that documentation is
deferred for now.

---------

Signed-off-by: Derek Higgins <derekh@redhat.com>
Signed-off-by: Ben Browning <bbrownin@redhat.com>
Co-authored-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-05-13 11:29:15 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| `cli` | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `distribution` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `models` | feat: support '-' in tool names (#1807) | 2025-04-12 14:23:03 -07:00 |
| `providers` | feat: function tools in OpenAI Responses (#2094) | 2025-05-13 11:29:15 -07:00 |
| `rag` | fix: raise an error when no vector DB IDs are provided to the RAG tool (#1911) | 2025-05-12 11:25:13 +02:00 |
| `registry` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `server` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `__init__.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `conftest.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `fixtures.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `README.md` | docs: revamp testing documentation (#2155) | 2025-05-13 11:28:29 -07:00 |

# Llama Stack Unit Tests

You can run the unit tests by running:

```
source .venv/bin/activate
./scripts/unit-tests.sh [PYTEST_ARGS]
```

Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., `-vvv` for verbosity). If no test directory is specified, it defaults to `tests/unit`, e.g.:

```
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
```

If you'd like to run against a non-default version of Python (the default is currently 3.10), pass the `PYTHON_VERSION` environment variable as follows:

```
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
```