llama-stack/llama_stack/providers/inline
Ashwin Bharambe 5cdb29758a
feat(responses): add output_text delta events to responses (#2265)
This adds initial streaming support to the Responses API. 

This PR makes sure that the _first_ inference call made to chat
completions streams out.

There's more to be done:
 - tool call output tokens need to stream out when possible
- we need to loop through multiple rounds of inference and they all need
to stream out.

## Test Plan

Added a test. Executed as:

```
FIREWORKS_API_KEY=... \
  pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
  --provider=stack:fireworks --model meta-llama/Llama-4-Scout-17B-16E-Instruct
```

Then, started a llama stack fireworks distro and tested against it like
this:

```
OPENAI_API_KEY=blah \
   pytest -s -v 'tests/verifications/openai_api/test_responses.py' \
   --base-url http://localhost:8321/v1/openai/v1 \
  --model meta-llama/Llama-4-Scout-17B-16E-Instruct 
```
2025-05-27 13:07:14 -07:00
..
agents feat(responses): add output_text delta events to responses (#2265) 2025-05-27 13:07:14 -07:00
datasetio chore(refact): move paginate_records fn outside of datasetio (#2137) 2025-05-12 10:56:14 -07:00
eval feat: implementation for agent/session list and describe (#1606) 2025-05-07 14:49:23 +02:00
inference chore: make cprint write to stderr (#2250) 2025-05-24 23:39:57 -07:00
ios/inference chore: removed executorch submodule (#1265) 2025-02-25 21:57:21 -08:00
post_training feat: add huggingface post_training impl (#2132) 2025-05-16 14:41:28 -07:00
safety chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
scoring chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
telemetry fix(telemetry): get rid of annoying sqlite span export error (#2245) 2025-05-24 20:24:34 -07:00
tool_runtime fix(tools): do not index tools, only index toolgroups (#2261) 2025-05-25 13:27:52 -07:00
vector_io feat(sqlite-vec): enable keyword search for sqlite-vec (#1439) 2025-05-21 15:24:24 -04:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00