llama-stack-mirror/llama_stack/core
Ben Browning c658c5912c
fix: Gracefully handle errors when listing MCP tools
When listing (and lazily indexing) tools, it's possible for an error
to get thrown by individual toolgroups if for example an MCP toolgroup
is unable to connect to its `mcp_endpoint`.

This logs a warning in the server when that happens, logs a full stack
trace of the error if debug logging is enabled, and just returns the
list of tools from all working toolgroups instead of throwing an error
to the client when a single toolgroup is temporarily or permanently
misbehaving.

Fixes #2540

A new unit test was added to test this exception handling, which is
run as part of our regular test suite but also manually run to
specifically verify this fix via:

```
uv run pytest -sv --asyncio-mode=auto \
tests/unit/distribution/routers/test_routing_tables.py
```

To verify the additional debug logging is printing properly:

```
LLAMA_STACK_LOGGING=core=debug \
uv run pytest -sv --asyncio-mode=auto \
tests/unit/distribution/routers/test_routing_tables.py
```

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-09-26 16:48:59 +02:00
..
access_control chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
prompts feat: Adding OpenAI Prompts API (#3319) 2025-09-08 11:05:13 -04:00
routers chore: introduce write queue for inference_store (#3383) 2025-09-10 11:57:42 -07:00
routing_tables fix: Gracefully handle errors when listing MCP tools 2025-09-26 16:48:59 +02:00
server feat: introduce API leveling, post_training, eval to v1alpha (#3449) 2025-09-26 16:18:07 +02:00
store fix: Added a bug fix when registering new models (#3453) 2025-09-16 19:09:06 -07:00
ui docs: update documentation links (#3459) 2025-09-17 10:37:35 -07:00
utils refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
build.py feat(distro): no huggingface provider for starter (#3258) 2025-08-26 14:06:36 -07:00
build_container.sh chore: fix build (#3522) 2025-09-22 22:53:48 -07:00
build_venv.sh fix(ci, tests): ensure uv environments in CI are kosher, record tests (#3193) 2025-08-18 17:02:24 -07:00
client.py feat: introduce API leveling, post_training, eval to v1alpha (#3449) 2025-09-26 16:18:07 +02:00
common.sh refactor: remove Conda support from Llama Stack (#2969) 2025-08-02 15:52:59 -07:00
configure.py chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061) 2025-08-20 07:15:35 -04:00
datatypes.py feat: combine ProviderSpec datatypes (#3378) 2025-09-18 16:10:00 +02:00
distribution.py feat: combine ProviderSpec datatypes (#3378) 2025-09-18 16:10:00 +02:00
external.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
inspect.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
library_client.py chore: refactor server.main (#3462) 2025-09-18 21:11:13 -07:00
providers.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
request_headers.py chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061) 2025-08-20 07:15:35 -04:00
resolver.py feat: Adding OpenAI Prompts API (#3319) 2025-09-08 11:05:13 -04:00
stack.py chore: refactor server.main (#3462) 2025-09-18 21:11:13 -07:00
start_stack.sh docs: update documentation links (#3459) 2025-09-17 10:37:35 -07:00