This adds initial streaming support to the Responses API.

This PR makes sure that the _first_ inference call made to chat completions streams out. There's more to be done:

- tool call output tokens need to stream out when possible
- we need to loop through multiple rounds of inference and they all need to stream out

## Test Plan

Added a test. Executed as:

```
FIREWORKS_API_KEY=... \
  pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
  --provider=stack:fireworks --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```

Then, started a llama stack fireworks distro and tested against it like this:

```
OPENAI_API_KEY=blah \
  pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
  --base-url http://localhost:8321/v1/openai/v1 \
  --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```
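For context, here is a minimal sketch of the kind of check such a test can make, assuming the standard `openai` Python client pointed at the locally running distro from the test plan above. The event names follow the OpenAI Responses streaming spec (`response.created`, `response.output_text.delta`, `response.completed`); the actual test may assert more or different events.

```python
# Minimal sketch (not the actual test): verify the Responses API streams
# the first inference call incrementally instead of returning one blob.
from openai import OpenAI

# Endpoint, key, and model taken from the test plan above.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="blah")

stream = client.responses.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    input="Say hello in one short sentence.",
    stream=True,
)

event_types = [event.type for event in stream]

# A well-formed stream opens with a created event, carries at least one
# incremental text delta, and closes with a completed event.
assert event_types[0] == "response.created"
assert any(t == "response.output_text.delta" for t in event_types)
assert event_types[-1] == "response.completed"
```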
# Llama Stack Unit Tests

You can run the unit tests by running:

```
source .venv/bin/activate
./scripts/unit-tests.sh [PYTEST_ARGS]
```
Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., `-vvv` for verbosity). If no test directory is specified, it defaults to `tests/unit`. For example:

```
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
```
If you'd like to run against a non-default version of Python (the default is currently 3.10), pass the `PYTHON_VERSION` environment variable as follows:

```
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
```
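The two options should compose, assuming the script forwards both as the notes above suggest. For example, to run a single test file verbosely under Python 3.13:

```
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
```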