chore(responses): Refactor Responses Impl to be civilized (#3138)

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-04 10:10:36 +00:00

# What does this PR do?
Refactors the OpenAI responses implementation by extracting streaming and tool execution logic into separate modules. This improves code organization by:

1. Creating a new `StreamingResponseOrchestrator` class in `streaming.py` to handle the streaming response generation logic
2. Moving tool execution functionality to a dedicated `ToolExecutor` class in `tool_executor.py`
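
The split described above can be pictured with a minimal sketch. Only the class names `StreamingResponseOrchestrator` and `ToolExecutor` come from this PR; the method names (`create_response`, `execute`), the `ToolCall` dataclass, the event strings, and the `collect_events` helper are hypothetical and chosen purely for illustration, so this is not the actual llama-stack code.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class ToolCall:
    # Hypothetical stand-in for a tool call requested by the model.
    name: str
    arguments: dict


class ToolExecutor:
    """Runs tool calls; knows nothing about streaming."""

    def __init__(self, tools):
        # Assumption: tools is a mapping of tool name -> plain callable.
        self._tools = tools

    async def execute(self, call: ToolCall) -> str:
        return str(self._tools[call.name](**call.arguments))


class StreamingResponseOrchestrator:
    """Yields response events and delegates tool calls to the executor."""

    def __init__(self, tool_executor: ToolExecutor):
        self._tool_executor = tool_executor

    async def create_response(self, calls: list[ToolCall]) -> AsyncIterator[str]:
        # Event names below are illustrative, not the real event schema.
        yield "response.created"
        for call in calls:
            result = await self._tool_executor.execute(call)
            yield f"tool_result:{call.name}={result}"
        yield "response.completed"


async def collect_events(calls):
    # Wire the two pieces together and gather every streamed event.
    executor = ToolExecutor({"add": lambda a, b: a + b})
    orchestrator = StreamingResponseOrchestrator(executor)
    return [event async for event in orchestrator.create_response(calls)]
```

In a layout like this, the orchestrator owns event ordering while the executor can be tested or replaced on its own, which is the kind of separation the refactor aims for.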

## Test Plan

Existing tests


ashwinb

2025-08-15 00:05:35 +00:00

parent e69acbafbf

commit 47d5af703c


GPG key ID: A7318BD657B83EA8

10 changed files with 1434 additions and 1156 deletions


docs/source/distributions/k8s-benchmark/openai-mock-server.py (Normal file → Executable file)