llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-04 18:13:44 +00:00

Author	SHA1	Message	Date
Omar Abdelwahab	eb4b6fa23a	precommit	2025-11-13 21:30:14 -08:00
Omar Abdelwahab	0b575f7635	Add MCP authorization parameter support with test recordings - Add 'authorization' parameter to OpenAI response tool configuration - Add security check to prevent Authorization in headers - Add tests for bearer token authorization with recordings - Maintain backward compatibility for tools without authorization	2025-11-13 21:24:20 -08:00
Omar Abdelwahab	3d02349783	test: Keep skip marker for MCP auth tests (recordings needed) After attempting local recording generation, encountered multiple environment issues: 1. Client/server version mismatches (0.3.x vs 0.4.0.dev0) 2. LlamaStackClient API changes (provider_data parameter removed) 3. Dev server network constraints (HTTP 426 errors with OpenAI API) Server logs from CI confirmed recordings are needed: - RuntimeError: Recording not found for request hash: 56ddb450d... - Tests with authorization parameter create different OpenAI request hashes Local recording generation requires complex environment setup that matches CI. Requesting reviewer assistance to generate recordings via CI infrastructure.	2025-11-13 19:58:31 -08:00
Omar Abdelwahab	e13014be23	test: Add skip marker for MCP auth tests in replay mode Analysis of CI server logs revealed that tests with authorization parameter create different OpenAI request hashes than existing MCP tool tests, requiring separate recordings. Server log showed: - RuntimeError: Recording not found for request hash: 56ddb450d... - Tests with authorization need their own recordings for replay mode Since recordings cannot be generated locally (dev server network constraints) and require proper CI infrastructure with OpenAI API access, adding skip marker until recordings can be generated in CI record mode. Tests pass when run with actual OpenAI API key in record mode.	2025-11-13 19:52:27 -08:00
Omar Abdelwahab	f60d72645f	test: Fix error handling test to accept BadRequestError The test was expecting ValueError but the server now raises BadRequestError for security violations. Updated to accept both exception types. Note: 3 tests still failing with 500 Internal Server Error - need to check server logs to diagnose the authorization processing bug.	2025-11-13 19:40:46 -08:00
Omar Abdelwahab	a8c8cd8241	test: Use responses_client and remove library client skips Following PR #4146, MCP tests now work in server mode. Updated tests to: - Replace compat_client with responses_client - Remove LlamaStackAsLibraryClient skip checks - Remove replay mode skip marker Tests can now run in both library and server modes without skipping.	2025-11-13 19:35:46 -08:00
Omar Abdelwahab	0391aaa8eb	test: Remove skip marker from MCP authentication tests These tests use local in-process MCP servers and don't require external API calls or recordings. They can run in both replay and record modes without issues since they don't depend on pre-recorded API responses.	2025-11-13 19:07:37 -08:00
Omar Abdelwahab	8d30c4018d	test: Add timeout to test_conversation_error_handling to prevent CI hang Following the same pattern as test_conversation_context_loading, adding a 60s timeout to prevent CI deadlock after running 25+ tests. This is a known issue with connection pool exhaustion or event loop state in the CI environment.	2025-11-13 18:46:27 -08:00
Omar Abdelwahab	e6c6c36b70	Merge remote-tracking branch 'upstream/main' into add-mcp-authentication-param	2025-11-13 12:04:44 -08:00
Charlie Doern	840ad75fe9	feat: split API and provider specs into separate llama-stack-api pkg (#3895 ) # What does this PR do? Extract API definitions and provider specifications into a standalone llama-stack-api package that can be published to PyPI independently of the main llama-stack server. see: https://github.com/llamastack/llama-stack/pull/2978 and https://github.com/llamastack/llama-stack/pull/2978#issuecomment-3145115942 Motivation External providers currently import from llama-stack, which overrides the installed version and causes dependency conflicts. This separation allows external providers to: - Install only the type definitions they need without server dependencies - Avoid version conflicts with the installed llama-stack package - Be versioned and released independently This enables us to re-enable external provider module tests that were previously blocked by these import conflicts. Changes - Created llama-stack-api package with minimal dependencies (pydantic, jsonschema) - Moved APIs, providers datatypes, strong_typing, and schema_utils - Updated all imports from llama_stack.* to llama_stack_api.* - Configured local editable install for development workflow - Updated linting and type-checking configuration for both packages Next Steps - Publish llama-stack-api to PyPI - Update external provider dependencies - Re-enable external provider module tests Pre-cursor PRs to this one: - #4093 - #3954 - #4064 These PRs moved key pieces _out_ of the Api pkg, limiting the scope of change here. relates to #3237 ## Test Plan Package builds successfully and can be imported independently. All pre-commit hooks pass with expected exclusions maintained. --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-11-13 11:51:17 -08:00
Ashwin Bharambe	fa2b361f46	Merge branch 'main' into add-mcp-authentication-param	2025-11-13 09:42:35 -08:00
Ashwin Bharambe	1e81056a22	feat(tests): enable MCP tests in server mode (#4146 ) We would like to run all OpenAI compatibility tests using only the openai-client library. This is most friendly for contributors since they can run tests without needing to update the client-sdks (which is getting easier but still a long pole.) This is the first step in enabling that -- not using "library client" for any of the Responses tests. This seems like a reasonable trade-off since the usage of an embeddable library client for Responses (or any OpenAI-compatible) behavior seems to be not very common. To do this, we needed to enable MCP tests (which only worked in library client mode) for server mode.	2025-11-13 07:23:23 -08:00
Omar Abdelwahab	607e3cc05c	Merge branch 'main' into add-mcp-authentication-param	2025-11-12 14:55:23 -08:00
Charlie Doern	37853ca558	fix(tests): add OpenAI client connection cleanup to prevent CI hangs (#4119 ) # What does this PR do? Add explicit connection cleanup and shorter timeouts to OpenAI client fixtures. Fixes CI deadlock after 25+ tests due to connection pool exhaustion. Also adds 60s timeout to test_conversation_context_loading as safety net. ## Test Plan tests pass Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-11-12 12:17:13 -05:00
Omar Abdelwahab	c353873774	precommit run	2025-11-07 14:54:33 -08:00
Omar Abdelwahab	0f0aa6a6c5	fix: correct import path for LlamaStackAsLibraryClient in test Fixed incorrect import in test_mcp_authentication.py: - Changed: from llama_stack import LlamaStackAsLibraryClient - To: from llama_stack.core.library_client import LlamaStackAsLibraryClient This aligns with the correct import pattern used in other test files.	2025-11-07 14:49:27 -08:00
Omar Abdelwahab	1a7ba683e3	Merge branch 'main' into add-mcp-authentication-param	2025-11-07 14:26:06 -08:00
Omar Abdelwahab	ccb870c8fb	precommit	2025-11-07 12:14:42 -08:00
Omar Abdelwahab	8ce30b71f4	test: update error message match for authorization validation Updated test_mcp_authorization_error_when_header_provided to match the new validation error message from the Pydantic validator.	2025-11-07 10:52:40 -08:00
Ashwin Bharambe	aa2bd82b1d	fix(ci): add recordings for responses suite due to web search type changing (#4104 ) #4103 broke (even though the PR itself was green) trunk	2025-11-07 10:42:07 -08:00
Aakanksha Duggal	b83184f7ef	feat(responses)!: Add web_search_2025_08_26 to the WebSearchToolTypes (#4103 ) # What does this PR do? Resolves #4102 1. Added `web_search_2025_08_26` to the `WebSearchToolTypes` list and the `OpenAIResponseInputToolWebSearch.type` Literal union 2. No changes needed to tool execution logic - all `web_search` types map to the same underlying tool 3. Backward compatibility is maintained - existing `web_search`, `web_search_preview`, and `web_search_preview_2025_03_11` types continue to work 4. Added an integration test case using {"type": "web_search_2025_08_26"} to verify it works correctly 5. Updated `docs/docs/providers/openai_responses_limitations.mdx` to reflect that `web_search_2025_08_26` is now supported. 6. Removed incorrect references to `MOD1/MOD2/MOD3` (which don't exist in the codebase) <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> --------- Signed-off-by: Aakanksha Duggal <aduggal@redhat.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-11-07 10:01:12 -08:00
Ashwin Bharambe	f49cb0b717	chore: Stack server no longer depends on llama-stack-client (#4094 ) This dependency has been bothering folks for a long time (cc @leseb). We really needed it due to "library client" which is primarily used for our tests and is not a part of the Stack server. Anyone who needs to use the library client can certainly install `llama-stack-client` in their environment to make that work. Updated the notebook references to install `llama-stack-client` additionally when setting things up.	2025-11-07 09:54:09 -08:00
Omar Abdelwahab	d08c529ac0	formatting issues	2025-11-06 12:43:24 -08:00
Omar Abdelwahab	5ce48d2c6a	precommit	2025-11-06 12:02:45 -08:00
Omar Abdelwahab	dbe41d9510	Updated a single test case to not include authorization field in the header	2025-11-06 11:08:27 -08:00
Omar Abdelwahab	d58da03e40	fix: update test to use authorization parameter instead of headers For security reasons, reject Authorization header in headers dict and require use of the dedicated authorization parameter instead.	2025-11-06 11:07:21 -08:00
Omar Abdelwahab	18aff1abaa	rejecting headers that include Authorization in the header and pointing them to the authorization param.	2025-11-06 10:59:45 -08:00
Omar Abdelwahab	411b18a90f	Merge branch 'main' into add-mcp-authentication-param	2025-11-05 14:12:32 -08:00
Omar Abdelwahab	dcb3dc4211	raising an error when the authentication field is present in the authorization field and in the header	2025-11-05 11:41:02 -08:00
Omar Abdelwahab	09ef0b38c1	Updated the authentication field to take just the token	2025-11-05 10:49:35 -08:00
Ashwin Bharambe	4d3069bfa5	chore(ci): remove unused recordings (#4074 ) Added a script to cleanup recordings. While doing this, moved the CI matrix generation to a separate script so there is a single source of truth for the matrix. Ran the cleanup script as: ``` PYTHONPATH=. python scripts/cleanup_recordings.py ``` Also added this as part of the pre-commit workflow to ensure that the recordings are always up to date and that no stale recordings are left in the repo.	2025-11-05 09:21:58 -08:00
Omar Abdelwahab	8632c705aa	Merge branch 'main' into add-mcp-authentication-param	2025-11-04 16:20:38 -08:00
Omar Abdelwahab	5c5f6f7e65	updated the test script	2025-11-04 15:36:09 -08:00
Ashwin Bharambe	cb40da210f	fix: update tests for OpenAI-style models endpoint (#4053 ) The llama-stack-client now uses `/v1/openai/v1/models` which returns OpenAI-compatible model objects with 'id' and 'custom_metadata' fields instead of the Resource-style 'identifier' field. Updated api_recorder to handle the new endpoint and modified tests to access model metadata appropriately. Deleted stale model recordings for re-recording. NOTE: CI will be red on this one since it is dependent on https://github.com/llamastack/llama-stack-client-python/pull/291/files landing. I verified locally that it is green.	2025-11-03 17:30:08 -08:00
Omar Abdelwahab	1143db0f64	added a fix	2025-11-03 16:55:13 -08:00
Omar Abdelwahab	c49fef8087	precommit	2025-11-03 16:12:38 -08:00
Omar Abdelwahab	57eb575ea1	Added minor changes	2025-11-03 15:57:45 -08:00
Omar Abdelwahab	d0a8878337	MCP authentication parameter implementation	2025-11-03 15:48:56 -08:00
Charlie Doern	e8ecc99524	fix!: remove chunk_id property from Chunk class (#3954 ) # What does this PR do? chunk_id in the Chunk class executes actual logic to compute a chunk ID. This sort of logic should not live in the API spec. Instead, the providers should be in charge of calling generate_chunk_id, and pass it to `Chunk`. this removes the incorrect dependency between Provider impl and API impl Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-10-29 18:59:59 -07:00
Ashwin Bharambe	7918188f1e	fix(ci): enable responses tests in CI; suppress expected MCP auth error logs (#3889 ) Let us enable responses suite in CI now. Also a minor fix: MCP tool tests intentionally trigger authentication failures to verify error handling, but the resulting error logs clutter test output.	2025-10-22 14:59:42 -07:00
Ashwin Bharambe	c0c0e337d9	misc(tests): add recordings for responses tests	2025-10-21 16:39:08 -07:00
Ashwin Bharambe	f205ab6f6c	fix(responses): fixes, re-record tests (#3820 ) Wanted to re-enable Responses CI but it seems to hang for some reason due to some interactions with conversations_store or responses_store. ## Test Plan ``` # library client ./scripts/integration-tests.sh --stack-config ci-tests --suite responses # server ./scripts/integration-tests.sh --stack-config server:ci-tests --suite responses ```	2025-10-15 16:37:42 -07:00
slekkala1	99141c29b1	feat: Add responses and safety impl extra_body (#3781 ) # What does this PR do? Have closed the previous PR due to merge conflicts with multiple PRs Addressed all comments from https://github.com/llamastack/llama-stack/pull/3768 (sorry for carrying over to this one) ## Test Plan Added UTs and integration tests	2025-10-15 15:01:37 -07:00
Ashwin Bharambe	8e7e0ddfec	fix(responses): use conversation items when no stored messages exist (#3819 ) Handle a base case when no stored messages exist because no Response call has been made. ## Test Plan ``` ./scripts/integration-tests.sh --stack-config server:ci-tests \ --suite responses --inference-mode record-if-missing --pattern test_conversation_responses ```	2025-10-15 14:43:44 -07:00
Ashwin Bharambe	e9b4278a51	feat(responses)!: improve responses + conversations implementations (#3810 ) This PR updates the Conversation item related types and improves a couple critical parts of the implementation: - it creates a streaming output item for the final assistant message output by the model. until now we only added content parts and included that message in the final response. - rewrites the conversation update code completely to account for items other than messages (tool calls, outputs, etc.) ## Test Plan Used the test script from https://github.com/llamastack/llama-stack-client-python/pull/281 for this ``` TEST_API_BASE_URL=http://localhost:8321/v1 \ pytest tests/integration/test_agent_turn_step_events.py::test_client_side_function_tool -xvs ```	2025-10-15 09:36:11 -07:00
Ashwin Bharambe	7c63aebd64	feat(responses)!: add reasoning and annotation added events (#3793 ) Implements missing streaming events from OpenAI Responses API spec: - reasoning text/summary events for o1/o3 models, - refusal events for safety moderation - annotation events for citations, - and file search streaming events. Added optional reasoning_content field to chat completion chunks to support non-standard provider extensions. NOTE: OpenAI does _not_ fill reasoning_content when users use the chat_completion APIs. This means there is no way for us to implement Responses (with reasoning) by using OpenAI chat completions! We'd need to transparently punt to OpenAI's responses endpoints if we wish to do that. For others though (vLLM, etc.) we can use it. ## Test Plan File search streaming test passes: ``` ./scripts/integration-tests.sh --stack-config server:ci-tests \ --suite responses --setup gpt --inference-mode replay --pattern test_response_file_search_streaming_events ``` Need more complex setup and validation for reasoning tests (need a vLLM powered OSS model maybe gpt-oss which can return reasoning_content). I will do that in a followup PR.	2025-10-11 16:47:14 -07:00
Ashwin Bharambe	1394403360	feat(responses): implement usage tracking in streaming responses (#3771 ) Implements usage accumulation in StreamingResponseOrchestrator. The most important part was to pass `stream_options = { "include_usage": true }` to the chat_completion call. This means I will have to record all responses tests again because request hash will change :) Test changes: - Add usage assertions to streaming and non-streaming tests - Update test recordings with actual usage data from OpenAI	2025-10-10 12:27:03 -07:00
Francisco Arceo	e7d21e1ee3	feat: Add support for Conversations in Responses API (#3743 ) # What does this PR do? This PR adds support for Conversations in Responses. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan Unit tests Integration tests <Details> <Summary>Manual testing with this script: (click to expand)</Summary> ```python from openai import OpenAI client = OpenAI() client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none") def test_conversation_create(): print("Testing conversation create...") conversation = client.conversations.create( metadata={"topic": "demo"}, items=[ {"type": "message", "role": "user", "content": "Hello!"} ] ) print(f"Created: {conversation}") return conversation def test_conversation_retrieve(conv_id): print(f"Testing conversation retrieve for {conv_id}...") retrieved = client.conversations.retrieve(conv_id) print(f"Retrieved: {retrieved}") return retrieved def test_conversation_update(conv_id): print(f"Testing conversation update for {conv_id}...") updated = client.conversations.update( conv_id, metadata={"topic": "project-x"} ) print(f"Updated: {updated}") return updated def test_conversation_delete(conv_id): print(f"Testing conversation delete for {conv_id}...") deleted = client.conversations.delete(conv_id) print(f"Deleted: {deleted}") return deleted def test_conversation_items_create(conv_id): print(f"Testing conversation items create for {conv_id}...") items = client.conversations.items.create( conv_id, items=[ { "type": "message", "role": "user", "content": [{"type": "input_text", "text": "Hello!"}] }, { "type": "message", "role": "user", "content": [{"type": "input_text", "text": "How are you?"}] } ] ) print(f"Items created: {items}") return items def test_conversation_items_list(conv_id): print(f"Testing conversation items list for {conv_id}...") items = client.conversations.items.list(conv_id, limit=10) print(f"Items list: {items}") return items def 
test_conversation_item_retrieve(conv_id, item_id): print(f"Testing conversation item retrieve for {conv_id}/{item_id}...") item = client.conversations.items.retrieve(conversation_id=conv_id, item_id=item_id) print(f"Item retrieved: {item}") return item def test_conversation_item_delete(conv_id, item_id): print(f"Testing conversation item delete for {conv_id}/{item_id}...") deleted = client.conversations.items.delete(conversation_id=conv_id, item_id=item_id) print(f"Item deleted: {deleted}") return deleted def test_conversation_responses_create(): print("\nTesting conversation create for a responses example...") conversation = client.conversations.create() print(f"Created: {conversation}") response = client.responses.create( model="gpt-4.1", input=[{"role": "user", "content": "What are the 5 Ds of dodgeball?"}], conversation=conversation.id, ) print(f"Created response: {response} for conversation {conversation.id}") return response, conversation def test_conversations_responses_create_followup( conversation, content="Repeat what you just said but add 'this is my second time saying this'", ): print(f"Using: {conversation.id}") response = client.responses.create( model="gpt-4.1", input=[{"role": "user", "content": content}], conversation=conversation.id, ) print(f"Created response: {response} for conversation {conversation.id}") conv_items = client.conversations.items.list(conversation.id) print(f"\nRetrieving list of items for conversation {conversation.id}:") print(conv_items.model_dump_json(indent=2)) def test_response_with_fake_conv_id(): fake_conv_id = "conv_zzzzzzzzz5dc81908289d62779d2ac510a2b0b602ef00a44" print(f"Using {fake_conv_id}") try: response = client.responses.create( model="gpt-4.1", input=[{"role": "user", "content": "say hello"}], conversation=fake_conv_id, ) print(f"Created response: {response} for conversation {fake_conv_id}") except Exception as e: print(f"failed to create response for conversation {fake_conv_id} with error {e}") def main(): 
print("Testing OpenAI Conversations API...") # Create conversation conversation = test_conversation_create() conv_id = conversation.id # Retrieve conversation test_conversation_retrieve(conv_id) # Update conversation test_conversation_update(conv_id) # Create items items = test_conversation_items_create(conv_id) # List items items_list = test_conversation_items_list(conv_id) # Retrieve specific item if items_list.data: item_id = items_list.data[0].id test_conversation_item_retrieve(conv_id, item_id) # Delete item test_conversation_item_delete(conv_id, item_id) # Delete conversation test_conversation_delete(conv_id) response, conversation2 = test_conversation_responses_create() print('\ntesting response retrieval') test_conversation_retrieve(conversation2.id) print('\ntesting responses follow up') test_conversations_responses_create_followup(conversation2) print('\ntesting responses follow up x2!') test_conversations_responses_create_followup( conversation2, content="Repeat what you just said but add 'this is my third time saying this'", ) test_response_with_fake_conv_id() print("All tests completed!") if __name__ == "__main__": main() ``` </Details> --------- Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-10-10 11:57:40 -07:00
Ashwin Bharambe	e039b61d26	feat(responses)!: add in_progress, failed, content part events (#3765 ) ## Summary - add schema + runtime support for response.in_progress / response.failed / response.incomplete - stream content parts with proper indexes and reasoning slots - align tests + docs with the richer event payloads ## Testing - uv run pytest tests/unit/providers/agents/meta_reference/test_openai_responses.py::test_create_openai_response_with_string_input - uv run pytest tests/unit/providers/agents/meta_reference/test_response_conversion_utils.py	2025-10-10 07:27:34 -07:00
Ashwin Bharambe	f50ce11a3b	feat(tests): make inference_recorder into api_recorder (include tool_invoke) (#3403 ) Renames `inference_recorder.py` to `api_recorder.py` and extends it to support recording/replaying tool invocations in addition to inference calls. This allows us to record web-search, etc. tool calls and thereafter apply recordings for `tests/integration/responses` ## Test Plan ``` export OPENAI_API_KEY=... export TAVILY_SEARCH_API_KEY=... ./scripts/integration-tests.sh --stack-config ci-tests \ --suite responses --inference-mode record-if-missing ```	2025-10-09 14:27:51 -07:00
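
Several commits on this branch (18aff1abaa, d58da03e40, 09ef0b38c1) describe rejecting an Authorization entry in a tool's headers dict and requiring a dedicated authorization parameter that carries only the token. The following is a minimal sketch of that rule, not the code from the repository: the function name `build_mcp_headers`, the case-insensitive match, the error text, and the `Bearer` prefix are all assumptions made for illustration (the commits mention a Pydantic validator and bearer tokens, but not these details).

```python
from typing import Optional


def build_mcp_headers(
    headers: Optional[dict] = None,
    authorization: Optional[str] = None,
) -> dict:
    """Hypothetical helper: merge custom headers with a bearer token.

    Raises ValueError if the caller tries to smuggle an Authorization
    header through the generic headers dict.
    """
    merged = dict(headers or {})

    # Assumption: header names are compared case-insensitively.
    for name in merged:
        if name.lower() == "authorization":
            raise ValueError(
                "Authorization must not be set in 'headers'; "
                "use the 'authorization' parameter instead."
            )

    # Assumption: the parameter holds the bare token and the
    # 'Bearer ' scheme prefix is added when building the request.
    if authorization is not None:
        merged["Authorization"] = f"Bearer {authorization}"

    return merged
```

Under these assumptions, `build_mcp_headers({"X-Trace": "1"}, "my-token")` yields both the custom header and an `Authorization: Bearer my-token` entry, while passing `{"Authorization": "Bearer my-token"}` as headers raises an error that points the caller at the dedicated parameter.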


54 commits