llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 18:00:36 +00:00

History

Ben Browning 8dfce2f596 feat: OpenAI Responses API (#1989 ) # What does this PR do? This provides an initial [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) implementation. The API is not yet complete, and this is more a proof-of-concept to show how we can store responses in our key-value stores and use them to support the Responses API concepts like `previous_response_id`. ## Test Plan I've added a new `tests/integration/openai_responses/test_openai_responses.py` as part of a test-driven development for this new API. I'm only testing this locally with the remote-vllm provider for now, but it should work with any of our inference providers since the only API it requires out of the inference provider is the `openai_chat_completion` endpoint. ``` VLLM_URL="http://localhost:8000/v1" \ INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \ llama stack build --template remote-vllm --image-type venv --run ``` ``` LLAMA_STACK_CONFIG="http://localhost:8321" \ python -m pytest -v \ tests/integration/openai_responses/test_openai_responses.py \ --text-model "meta-llama/Llama-3.2-3B-Instruct" ``` --------- Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>		2025-04-28 14:06:00 -07:00
..
agents	feat: OpenAI Responses API (#1989 )	2025-04-28 14:06:00 -07:00
batch_inference	feat: add batch inference API to llama stack inference (#1945 )	2025-04-12 11:41:12 -07:00
benchmarks	fix: return 4xx for non-existent resources in GET requests (#1635 )	2025-03-18 14:06:53 -07:00
common	refactor: extract pagination logic into shared helper function (#1770 )	2025-03-31 13:08:29 -07:00
datasetio	refactor: extract pagination logic into shared helper function (#1770 )	2025-03-31 13:08:29 -07:00
datasets	chore: Don't set type variables from register_schema() (#1713 )	2025-03-19 20:29:00 -07:00
eval	fix: fix jobs api literal return type (#1757 )	2025-03-21 14:04:21 -07:00
files	feat(api): don't return a payload on file delete (#1640 )	2025-03-25 17:12:36 -07:00
inference	fix: OpenAI spec cleanup for assistant requests (#1963 )	2025-04-17 06:56:10 -07:00
inspect	feat: add health to all providers through providers endpoint (#1418 )	2025-04-14 11:59:36 +02:00
models	feat: OpenAI-Compatible models, completions, chat/completions (#1894 )	2025-04-11 13:14:17 -07:00
post_training	feat: make training config fields optional (#1861 )	2025-04-12 01:13:45 -07:00
providers	feat: add health to all providers through providers endpoint (#1418 )	2025-04-14 11:59:36 +02:00
safety	chore: move all Llama Stack types from llama-models to llama-stack (#1098 )	2025-02-14 09:10:59 -08:00
scoring	docs: api documentation for agents/eval/scoring/datasets (#1400 )	2025-03-05 09:40:24 -08:00
scoring_functions	chore: Don't set type variables from register_schema() (#1713 )	2025-03-19 20:29:00 -07:00
shields	fix: return 4xx for non-existent resources in GET requests (#1635 )	2025-03-18 14:06:53 -07:00
synthetic_data_generation	chore: move all Llama Stack types from llama-models to llama-stack (#1098 )	2025-02-14 09:10:59 -08:00
telemetry	chore: Don't set type variables from register_schema() (#1713 )	2025-03-19 20:29:00 -07:00
tools	fix(api): don't return list for runtime tools (#1686 )	2025-04-01 09:53:11 +02:00
vector_dbs	fix: return 4xx for non-existent resources in GET requests (#1635 )	2025-03-18 14:06:53 -07:00
vector_io	chore: mypy violations cleanup for inline::{telemetry,tool_runtime,vector_io} (#1711 )	2025-03-20 10:01:10 -07:00
__init__.py	API Updates (#73 )	2024-09-17 19:51:35 -07:00
datatypes.py	feat(api): don't return a payload on file delete (#1640 )	2025-03-25 17:12:36 -07:00
resource.py	fix!: update eval-tasks -> benchmarks (#1032 )	2025-02-13 16:40:58 -08:00
version.py	llama-stack version alpha -> v1	2025-01-15 05:58:09 -08:00