llama-stack/tests/unit
Ben Browning 10b1056dea
fix: multiple tool calls in remote-vllm chat_completion (#2161)
# What does this PR do?

This fixes an issue in how we used the tool_call_buf for streaming tool
calls in the remote-vllm provider, where it would end up concatenating
parameters from multiple different tool calls instead of aggregating the
results of each tool call separately.
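
As a rough illustration of the idea (the names and delta shapes below are hypothetical, not the provider's actual code), streaming tool-call deltas need to be buffered per tool-call index rather than into a single shared buffer:

```python
from collections import defaultdict

def aggregate_tool_call_deltas(chunks):
    # Buffer name and argument fragments separately for each tool-call index.
    buffers = defaultdict(lambda: {"name": "", "arguments": ""})
    for index, name_piece, arguments_piece in chunks:
        if name_piece:
            buffers[index]["name"] += name_piece
        if arguments_piece:
            buffers[index]["arguments"] += arguments_piece
    return [buffers[i] for i in sorted(buffers)]

# Two interleaved streamed tool calls stay separate instead of being merged:
deltas = [
    (0, "get_weather", '{"city": '),
    (1, "get_time", '{"tz": '),
    (0, None, '"Boston"}'),
    (1, None, '"UTC"}'),
]
print(aggregate_tool_call_deltas(deltas))
# [{'name': 'get_weather', 'arguments': '{"city": "Boston"}'},
#  {'name': 'get_time', 'arguments': '{"tz": "UTC"}'}]
```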

It also fixes an issue found while digging into that, where we were
accidentally mixing the JSON string form of tool call parameters with
the string representation of the Python form, which meant we'd end up
with single quotes in what should be double-quoted JSON strings.
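
A minimal illustration of that second issue, independent of the provider code: serializing the parsed arguments with json.dumps yields valid JSON, while str() on the Python dict produces single-quoted pseudo-JSON:

```python
import json

args = {"city": "Boston", "units": "F"}

print(str(args))         # {'city': 'Boston', 'units': 'F'}  -- Python repr, not valid JSON
print(json.dumps(args))  # {"city": "Boston", "units": "F"}  -- proper JSON string
```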

Closes #1120

## Test Plan

The following tests now pass 100% for the remote-vllm provider; some of
the test_text_inference tests were failing before this change:

```
VLLM_URL="http://localhost:8000/v1" INFERENCE_MODEL="RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic" LLAMA_STACK_CONFIG=remote-vllm python -m pytest -v tests/integration/inference/test_text_inference.py --text-model "RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic"

VLLM_URL="http://localhost:8000/v1" INFERENCE_MODEL="RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic" LLAMA_STACK_CONFIG=remote-vllm python -m pytest -v tests/integration/inference/test_vision_inference.py --vision-model "RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic"

```

All but one of the agent tests are passing (including the multi-tool
one). See the PR at https://github.com/vllm-project/vllm/pull/17917 and
the gist at
https://gist.github.com/bbrowning/4734240ce96b4264340caa9584e47c9e for
the remaining changes, which will have to be made upstream in vLLM.

Agent tests:

```
VLLM_URL="http://localhost:8000/v1" INFERENCE_MODEL="RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic" LLAMA_STACK_CONFIG=remote-vllm python -m pytest -v tests/integration/agents/test_agents.py --text-model "RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic"
```

---------

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-05-15 11:23:29 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| `cli` | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `distribution` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `models` | feat: support '-' in tool names (#1807) | 2025-04-12 14:23:03 -07:00 |
| `providers` | fix: multiple tool calls in remote-vllm chat_completion (#2161) | 2025-05-15 11:23:29 -07:00 |
| `rag` | feat: Adding support for customizing chunk context in RAG insertion and querying (#2134) | 2025-05-14 21:56:20 -04:00 |
| `registry` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `server` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `__init__.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `conftest.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `fixtures.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `README.md` | docs: revamp testing documentation (#2155) | 2025-05-13 11:28:29 -07:00 |

# Llama Stack Unit Tests

You can run the unit tests by running:

```
source .venv/bin/activate
./scripts/unit-tests.sh [PYTEST_ARGS]
```

Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, it defaults to "tests/unit", e.g.:

```
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
```

If you'd like to run against a non-default version of Python (currently 3.10), pass the PYTHON_VERSION variable as follows:

```
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
```