llama-stack/llama_stack/providers/remote/inference
Ben Browning 40e71758d9
fix: inference providers still using tools with tool_choice="none" (#2048)
# What does this PR do?

In our OpenAI API verification tests, some providers were still calling
tools even when `tool_choice="none"` was passed in the chat completion
request. Since not all providers respect `tool_choice` properly, this
PR adjusts our routing implementation to strip both `tools` and
`tool_choice` from the request whenever `tool_choice="none"` is passed,
so the provider never attempts to call any of those tools. Making the
change in the router fixes the behavior across all providers at once.
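
As a rough sketch (the function and parameter names here are
illustrative, not the actual llama-stack router code), the adjustment
amounts to something like:

```
# Illustrative only: strip tool parameters when the caller has
# explicitly opted out of tool calling with tool_choice="none".
def _prepare_chat_completion_params(params: dict) -> dict:
    if params.get("tool_choice") == "none":
        # Some providers still emit tool calls even when
        # tool_choice="none", so drop both fields before forwarding.
        params = dict(params)  # avoid mutating the caller's dict
        params.pop("tools", None)
        params.pop("tool_choice", None)
    return params
```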

This also cleans up the non-streaming together.ai tool responses,
ensuring `tool_calls` is `None` instead of an empty list when no tools
were called, exactly matching the OpenAI API response in that case.
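
A minimal sketch of that normalization (again with hypothetical names,
assuming the response message is handled as a dict):

```
# Illustrative only: OpenAI omits tool_calls (i.e. returns None) when
# no tools were called; normalize an empty list to match that shape.
def _normalize_tool_calls(message: dict) -> dict:
    if message.get("tool_calls") == []:
        return {**message, "tool_calls": None}
    return message
```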

## Test Plan

I observed existing failures in our OpenAI API verification suite; see
https://github.com/bbrowning/llama-stack-tests/blob/main/openai-api-verification/2025-04-27.md#together-llama-stack
for the failing `test_chat_*_tool_choice_none` tests. Both the
streaming and non-streaming variants were failing across all 3 tested
models.

After this change, all six of those failing tests pass, with no
regressions in the other tests.

I verified this by starting the stack server and then running the
verification tests against it:

```
llama stack run --image-type venv \
  tests/verifications/openai-api-verification-run.yaml
```

```
python -m pytest -s -v \
  'tests/verifications/openai_api/test_chat_completion.py' \
  --provider=together-llama-stack
```

The full verification suite does not yet pass 100% on together.ai, but
it's getting closer.

This change also increased the pass rate for fireworks.ai and did not
regress the groq or openai tests at all.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-05-07 14:34:47 +02:00

| Name | Last commit | Date |
| --- | --- | --- |
| anthropic | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| bedrock | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| cerebras | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| cerebras_openai_compat | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| databricks | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| fireworks | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| fireworks_openai_compat | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| gemini | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| groq | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| groq_openai_compat | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| llama_openai_compat | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| nvidia | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| ollama | fix: inference providers still using tools with tool_choice="none" (#2048) | 2025-05-07 14:34:47 +02:00 |
| openai | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| passthrough | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| runpod | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| sambanova | feat(providers): sambanova updated to use LiteLLM openai-compat (#1596) | 2025-05-06 16:50:22 -07:00 |
| sambanova_openai_compat | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| tgi | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| together | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| together_openai_compat | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| vllm | chore: more mypy fixes (#2029) | 2025-05-06 09:52:31 -07:00 |
| watsonx | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `__init__.py` | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |