Mirror of https://github.com/meta-llama/llama-stack.git
Move the conversation sync logic before the yield so it executes even when streaming consumers break out early after receiving the `response.completed` event.

## Test Plan

```
OLLAMA_URL=http://localhost:11434 \
pytest -sv tests/integration/responses/ \
  --stack-config server:ci-tests \
  --text-model ollama/llama3.2:3b-instruct-fp16 \
  --inference-mode live \
  -k conversation_multi
```

This test now passes.
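For context, here is a minimal sketch of the generator pattern this fix addresses. It assumes a Python generator-based streaming implementation; the names `stream_response`, `sync_conversation`, and the event strings are illustrative, not the actual llama-stack API. The point is that code placed *after* a generator's final yield never runs if the consumer breaks out of its loop, so side effects must happen before that yield.

```python
# Illustrative sketch only: names and event strings are hypothetical,
# not the real llama-stack implementation.

def stream_response(events, sync_conversation):
    """Yield streaming events; persist conversation state before the final yield."""
    for event in events[:-1]:
        yield event

    final = events[-1]  # e.g. the "response.completed" event

    # Sync BEFORE yielding the final event: if the consumer breaks out of
    # its loop right after receiving "response.completed", the generator is
    # closed and any code placed after this yield would never execute.
    sync_conversation()
    yield final


if __name__ == "__main__":
    synced = []
    gen = stream_response(
        events=["response.created", "response.delta", "response.completed"],
        sync_conversation=lambda: synced.append(True),
    )
    for ev in gen:
        if ev == "response.completed":
            break  # consumer stops early, as described above
    assert synced, "conversation sync ran even though the consumer broke early"
```

With the sync after the yield, the `break` closes the generator before the sync runs and the assertion fails; moving it before the yield makes the sync unconditional.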
| Name |
|---|
| agents |
| batches |
| datasetio |
| eval |
| files/localfs |
| inference |
| ios/inference |
| post_training |
| safety |
| scoring |
| telemetry |
| tool_runtime |
| vector_io |
| `__init__.py` |