This adds initial streaming support to the Responses API. This PR makes sure that the _first_ inference call made to chat completions streams out. There's more to be done:

- tool call output tokens need to stream out when possible
- we need to loop through multiple rounds of inference, and they all need to stream out

## Test Plan

Added a test. Executed as:

```
FIREWORKS_API_KEY=... \
  pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
  --provider=stack:fireworks --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```

Then, started a llama stack fireworks distro and tested against it like this:

```
OPENAI_API_KEY=blah \
  pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
  --base-url http://localhost:8321/v1/openai/v1 \
  --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```
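For reference, here is a minimal sketch of how a client might consume the streamed events, assuming a stack distro is running at the base URL from the test plan above and the `openai` Python client is installed (the model name, API key, and prompt are placeholders, not part of this PR):

```python
from openai import OpenAI

# Point the OpenAI client at a locally running llama stack distro
# (base URL and dummy API key taken from the test plan above).
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="blah")

# Request a streamed response; events arrive incrementally instead of
# waiting for the full completion.
stream = client.responses.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    input="Write a haiku about streaming.",
    stream=True,
)

# Print output text deltas as they arrive.
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
print()
```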