llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-09 21:18:38 +00:00

History

ehhuang cf422da825 fix: responses <> chat completion input conversion (#3645 ) # What does this PR do? closes #3268 closes #3498 When resuming from previous response ID, currently we attempt to convert from the stored responses input to chat completion messages, which is not always possible, e.g. for tool calls where some data is lost once converted from chat completion message to repsonses input format. This PR stores the chat completion messages that correspond to the _last_ call to chat completion, which is sufficient to be resumed from in the next responses API call, where we load these saved messages and skip conversion entirely. Separate issue to optimize storage: https://github.com/llamastack/llama-stack/issues/3646 ## Test Plan existing CI tests		2025-10-02 21:50:13 -07:00
..
agent	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 21:50:13 -07:00
agents	fix: responses <> chat completion input conversion (#3645 )	2025-10-02 21:50:13 -07:00
batches	feat(batches, completions): add /v1/completions support to /v1/batches (#3309 )	2025-09-05 11:59:57 -07:00
files	feat(files): fix expires_after API shape (#3604 )	2025-09-29 21:29:15 -07:00
inference	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 21:50:13 -07:00
inline	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 21:50:13 -07:00
nvidia	feat: add static embedding metadata to dynamic model listings for providers using OpenAIMixin (#3547 )	2025-09-25 17:17:00 -04:00
utils	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 21:50:13 -07:00
vector_io	chore(api): remove deprecated embeddings impls (#3301 )	2025-09-29 14:45:09 -04:00
test_bedrock.py	fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386 )	2025-09-11 11:41:53 +02:00
test_configs.py	chore(rename): move llama_stack.distribution to llama_stack.core (#2975 )	2025-07-30 23:30:53 -07:00