llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-10 11:39:47 +00:00

Ashwin Bharambe 30ba8c8655 fix(responses): sync conversation before yielding terminal events in streaming (#3888)	2025-10-22 14:31:12 -07:00

Move conversation sync logic before yield to ensure it executes even when streaming consumers break early after receiving response.completed event.

## Test Plan

```
OLLAMA_URL=http://localhost:11434 \
  pytest -sv tests/integration/responses/ \
  --stack-config server:ci-tests \
  --text-model ollama/llama3.2:3b-instruct-fp16 \
  --inference-mode live \
  -k conversation_multi
```

This test now passes.

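
The ordering issue this commit message describes can be illustrated with a minimal sketch of Python async generators. This is not the llama-stack implementation: `sync_conversation`, the two `stream_*` generators, and the event strings other than `response.completed` are hypothetical stand-ins. It only shows why work placed after the final `yield` is skipped when a consumer stops iterating at the terminal event.

```python
import asyncio


async def sync_conversation(log, label):
    # Hypothetical stand-in for persisting the conversation state.
    log.append(label)


async def stream_sync_after_yield(log):
    yield "response.in_progress"
    yield "response.completed"
    # Never reached if the consumer stops at the terminal event:
    # closing the generator raises GeneratorExit at the yield above.
    await sync_conversation(log, "after")


async def stream_sync_before_yield(log):
    yield "response.in_progress"
    # Sync first, so it has already run when the terminal event is delivered.
    await sync_conversation(log, "before")
    yield "response.completed"


async def consume(stream):
    # Consumer that breaks as soon as it sees the terminal event.
    async for event in stream:
        if event == "response.completed":
            break
    # Explicitly close the suspended generator.
    await stream.aclose()


async def main():
    log = []
    await consume(stream_sync_after_yield(log))
    await consume(stream_sync_before_yield(log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))  # ['before']
```

Only the variant that syncs before yielding the terminal event records anything, which matches the behavior the commit message aims for.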
..
inline	fix(responses): sync conversation before yielding terminal events in streaming (#3888)	2025-10-22 14:31:12 -07:00
registry	revert: "chore(cleanup)!: remove tool_runtime.rag_tool" (#3877)	2025-10-21 11:22:06 -07:00
remote	chore(cleanup)!: kill vector_db references as far as possible (#3864)	2025-10-20 20:06:16 -07:00
utils	feat: Add rerank models and rerank API change (#3831)	2025-10-22 12:02:28 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
datatypes.py	chore(cleanup)!: kill vector_db references as far as possible (#3864)	2025-10-20 20:06:16 -07:00