llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-08-15 22:18:00 +00:00

Author	SHA1	Message	Date
ashwinb	8ed69978f9	refactor(tests): make the responses tests nicer (#3161) # What does this PR do? A _bunch_ of cleanup for the Responses tests. - Got rid of YAML test cases, moved them to just use simple pydantic models - Split the large monolithic test file into multiple focused test files: - `test_basic_responses.py` for basic and image response tests - `test_tool_responses.py` for tool-related tests - `test_file_search.py` for file search specific tests - Added a `StreamingValidator` helper class to standardize streaming response validation ## Test Plan Run the tests: ``` pytest -s -v tests/integration/non_ci/responses/ \ --stack-config=starter \ --text-model openai/gpt-4o \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \ -k "client_with_models" ```	2025-08-15 00:05:36 +00:00
ashwinb	ba664474de	feat(responses): add mcp list tool streaming event (#3159) # What does this PR do? Adds proper streaming events for MCP tool listing (`mcp_list_tools.in_progress` and `mcp_list_tools.completed`). Also refactors things a bit more. ## Test Plan Verified existing integration tests pass with the refactored code. The test `test_response_streaming_multi_turn_tool_execution` has been updated to check for the new MCP list tools streaming events.	2025-08-15 00:05:36 +00:00
Ashwin Bharambe	e1e161553c	feat(responses): add MCP argument streaming and content part events (#3136 ) # What does this PR do? Adds content part streaming events to the OpenAI-compatible Responses API to support more granular streaming of response content. This introduces: 1. New schema types for content parts: `OpenAIResponseContentPart` with variants for text output and refusals 2. New streaming event types: - `OpenAIResponseObjectStreamResponseContentPartAdded` for when content parts begin - `OpenAIResponseObjectStreamResponseContentPartDone` for when content parts complete 3. Implementation in the reference provider to emit these events during streaming responses. Also emits MCP arguments just like function call ones. ## Test Plan Updated existing streaming tests to verify content part events are properly emitted	2025-08-13 16:34:26 -07:00
Ashwin Bharambe	8638537d14	feat(responses): stream progress of tool calls (#3135 ) # What does this PR do? Enhances tool execution streaming by adding support for real-time progress events during tool calls. This implementation adds streaming events for MCP and web search tools, including in-progress, searching, completed, and failed states. The refactored `_execute_tool_call` method now returns an async iterator that yields streaming events throughout the tool execution lifecycle. ## Test Plan Updated the integration test `test_response_streaming_multi_turn_tool_execution` to verify the presence and structure of new streaming events, including: - Checking for MCP in-progress and completed events - Verifying that progress events contain required fields (item_id, output_index, sequence_number) - Ensuring completed events have the necessary sequence_number field	2025-08-13 16:31:25 -07:00
Ashwin Bharambe	5b312a80b9	feat(responses): improve streaming for function calls (#3124) Emit streaming events for function calls ## Test Plan Improved the test case	2025-08-13 11:23:27 -07:00
Ashwin Bharambe	3d90117891	chore(tests): fix responses and vector_io tests (#3119) Some fixes to MCP tests, and a bunch of fixes for Vector providers. I also enabled a bunch of Vector IO tests to be used with `LlamaStackLibraryClient` ## Test Plan Run Responses tests with llama stack library client: ``` pytest -s -v tests/integration/non_ci/responses/ --stack-config=server:starter \ --text-model openai/gpt-4o \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \ -k "client_with_models" ``` Do the same with `-k openai_client`. The rest should be taken care of by CI.	2025-08-12 16:15:53 -07:00
Ashwin Bharambe	5f1ddd35e4	chore(tests): refactor and move responses tests away from verifications (#3068) This PR kills the verifications infrastructure, which is no longer used. It was relocated to the `llama-stack-evals` (https://github.com/meta-llama/llama-stack-evals) repository previously. Responses tests used this infrastructure but that wasn't quite necessary, just a little useful back when @bbrownin introduced the tests. On Discord, we agreed that tests can be moved to our regular integration test infra. ## Test Plan Some tests currently do fail (although they run!). I will send a follow-up PR which makes them all pass.	2025-08-07 13:48:16 -07:00
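
Commit 8638537d14 describes `_execute_tool_call` returning an async iterator that yields streaming events carrying `item_id`, `output_index`, and `sequence_number`. The pattern can be sketched roughly as below. This is a minimal illustration, not the repository's code: the `ToolProgressEvent` class, the `tool_call.*` event type strings, and the helper functions are hypothetical names chosen for the example. Only the three field names come from the commit message.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class ToolProgressEvent:
    # Hypothetical event shape; only the three field names below
    # (item_id, output_index, sequence_number) appear in the commit message.
    type: str
    item_id: str
    output_index: int
    sequence_number: int


async def execute_tool_call(
    item_id: str, output_index: int, start_seq: int = 0
) -> AsyncIterator[ToolProgressEvent]:
    """Yield an in-progress event, run a stubbed tool, then yield a completed event."""
    seq = start_seq
    # "tool_call.in_progress" / "tool_call.completed" are assumed names for illustration.
    yield ToolProgressEvent("tool_call.in_progress", item_id, output_index, seq)
    await asyncio.sleep(0)  # stand-in for the real tool invocation
    seq += 1
    yield ToolProgressEvent("tool_call.completed", item_id, output_index, seq)


async def collect_events(
    item_id: str = "item_0", output_index: int = 0
) -> List[ToolProgressEvent]:
    """Drain the async iterator into a list, as a streaming test might do."""
    return [event async for event in execute_tool_call(item_id, output_index)]


def sequence_numbers_increase(events: List[ToolProgressEvent]) -> bool:
    """Check that sequence numbers are strictly increasing across the stream."""
    numbers = [e.sequence_number for e in events]
    return all(a < b for a, b in zip(numbers, numbers[1:]))
```

A test in the spirit of the one the commit mentions could call `asyncio.run(collect_events())` and then assert on the event types and on `sequence_numbers_increase(events)`.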

7 commits