# Llama Stack Integration Tests
We use `pytest` for parameterizing and running tests. You can see all options with:

```bash
cd tests/integration

# this will show a long list of options, look for "Custom options:"
pytest --help
```
Here are the most important options:

- `--stack-config`: specify the stack config to use. You have three ways to point to a stack (see the sketch after this list):
  - a URL which points to a Llama Stack distribution server
  - a template (e.g., `fireworks`, `together`) or a path to a run.yaml file
  - a comma-separated list of api=provider pairs, e.g. `inference=fireworks,safety=llama-guard,agents=meta-reference`. This is most useful for testing a single API surface.
- `--env`: set environment variables, e.g. `--env KEY=value`. This is a utility option to set environment variables required by various providers.
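As a minimal sketch, here is one hypothetical invocation per form of `--stack-config` (the server URL, port, and API key placeholder are assumptions; adjust them to your environment):

```bash
# 1. a URL pointing at an already-running distribution server (port is an assumption)
pytest -s -v tests/integration/inference/ --stack-config=http://localhost:8321

# 2. a template name (a path to a run.yaml works the same way)
pytest -s -v tests/integration/inference/ --stack-config=together

# 3. an adhoc api=provider stack, with --env supplying a provider credential
pytest -s -v tests/integration/inference/ \
  --stack-config=inference=fireworks,safety=llama-guard,agents=meta-reference \
  --env FIREWORKS_API_KEY=<fireworks_api_key>
```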
Model parameters can be influenced by the following options:

- `--text-model`: comma-separated list of text models.
- `--vision-model`: comma-separated list of vision models.
- `--embedding-model`: comma-separated list of embedding models.
- `--safety-shield`: comma-separated list of safety shields.
- `--judge-model`: comma-separated list of judge models.
- `--embedding-dimension`: output dimensionality of the embedding model to use for testing. Default: 384
Each of these is a comma-separated list and can be used to generate multiple parameter combinations, as in the example below. Note that tests will be skipped if no model is specified.
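For example, listing two text models parametrizes each text-inference test once per model (a sketch; the model list is illustrative):

```bash
# runs every test in the file twice, once per model
pytest -s -v tests/integration/inference/test_text_inference.py \
  --stack-config=together \
  --text-model=meta-llama/Llama-3.1-8B-Instruct,meta-llama/Llama-3.1-70B-Instruct
```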
Experimental, under development, options:

- `--record-responses`: record new API responses instead of using cached ones
- `--report`: path where the test report should be written, e.g. `--report=/path/to/report.md`
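A sketch combining the two flags in one run (since these options are under development, their exact behavior may change):

```bash
# re-record API responses and write a markdown report for this run
pytest -s -v tests/integration/inference/ \
  --stack-config=together \
  --text-model=meta-llama/Llama-3.1-8B-Instruct \
  --record-responses \
  --report=report.md
```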
## Examples
Run all text inference tests with the `together` distribution:

```bash
pytest -s -v tests/integration/inference/test_text_inference.py \
   --stack-config=together \
   --text-model=meta-llama/Llama-3.1-8B-Instruct
```
Running all inference tests for a number of models:

```bash
TEXT_MODELS=meta-llama/Llama-3.1-8B-Instruct,meta-llama/Llama-3.1-70B-Instruct
VISION_MODELS=meta-llama/Llama-3.2-11B-Vision-Instruct
EMBEDDING_MODELS=all-MiniLM-L6-v2
export TOGETHER_API_KEY=<together_api_key>

pytest -s -v tests/integration/inference/ \
   --stack-config=together \
   --text-model=$TEXT_MODELS \
   --vision-model=$VISION_MODELS \
   --embedding-model=$EMBEDDING_MODELS
```
Same thing but instead of using the distribution, use an adhoc stack with just one provider (`fireworks` for inference):

```bash
export FIREWORKS_API_KEY=<fireworks_api_key>

pytest -s -v tests/integration/inference/ \
   --stack-config=inference=fireworks \
   --text-model=$TEXT_MODELS \
   --vision-model=$VISION_MODELS \
   --embedding-model=$EMBEDDING_MODELS
```
Running Vector IO tests for a number of embedding models:

```bash
EMBEDDING_MODELS=all-MiniLM-L6-v2

pytest -s -v tests/integration/vector_io/ \
   --stack-config=inference=sentence-transformers,vector_io=sqlite-vec \
   --embedding-model=$EMBEDDING_MODELS
```
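If your embedding model's output size differs from the default of 384, pass `--embedding-dimension` as well. A sketch (the model choice and its dimensionality here are assumptions for illustration):

```bash
# all-mpnet-base-v2 is assumed here to emit 768-dimensional vectors
pytest -s -v tests/integration/vector_io/ \
   --stack-config=inference=sentence-transformers,vector_io=sqlite-vec \
   --embedding-model=all-mpnet-base-v2 \
   --embedding-dimension=768
```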