llama-stack/tests/unit/providers
Ilya Kolchinsky 5052c3cbf3
fix: Fixed an "out of token budget" error when attempting a tool call via remote vLLM provider (#2114)
# What does this PR do?
Closes #2113.
Closes #1783.

Fixes a bug in handling the end of a tool-execution request stream when the model provides no `finish_reason`.

## Test Plan
1. Ran existing unit tests
2. Added a dedicated test verifying correct behavior in this edge case
3. Ran the code snapshot from #2113

2025-05-14 13:11:02 -07:00
| Name | Last commit message | Date |
| --- | --- | --- |
| agent | feat: implementation for agent/session list and describe (#1606) | 2025-05-07 14:49:23 +02:00 |
| agents | feat: function tools in OpenAI Responses (#2094) | 2025-05-13 11:29:15 -07:00 |
| inference | fix: Fixed an "out of token budget" error when attempting a tool call via remote vLLM provider (#2114) | 2025-05-14 13:11:02 -07:00 |
| nvidia | fix: Fix messages format in NVIDIA safety check request body (#2063) | 2025-04-30 18:01:28 +02:00 |
| utils | fix: add check for interleavedContent (#1973) | 2025-05-06 09:55:07 -07:00 |
| vector_io | chore: Updating sqlite-vec to make non-blocking calls (#1762) | 2025-03-23 17:25:44 -07:00 |
| test_configs.py | feat(api): don't return a payload on file delete (#1640) | 2025-03-25 17:12:36 -07:00 |