Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-04 10:10:36 +00:00)
# What does this PR do?

Refactors the OpenAI responses implementation by extracting streaming and tool execution logic into separate modules. This improves code organization by:

1. Creating a new `StreamingResponseOrchestrator` class in `streaming.py` to handle the streaming response generation logic
2. Moving tool execution functionality to a dedicated `ToolExecutor` class in `tool_executor.py`

## Test Plan

Existing tests

| Name |
|---|
| apply.sh |
| locust-k8s.yaml |
| locustfile.py |
| openai-mock-deployment.yaml |
| openai-mock-server.py |
| stack-configmap.yaml |
| stack-k8s.yaml.template |
| stack_run_config.yaml |