fix: improve model availability checks: Allows use of unavailable models on startup (#3717) · 1970b4aa4b - phoenix-oss/llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

- Allow use of unavailable models on startup
- Add `has_model` method to `ModelsRoutingTable` for checking pre-registered models
- Update `check_model_availability` to check `model_store` before provider APIs
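
The lookup order described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SketchModelStore` and `UnreachableProvider` are hypothetical stand-ins, and returning `False` when the provider call fails is an assumption made for the example. Only the names `has_model`, `check_model_availability`, and `list_provider_model_ids` come from the commit message and logs.

```python
import asyncio


class SketchModelStore:
    """Hypothetical stand-in for the routing table: holds pre-registered model ids."""

    def __init__(self, registered):
        self._registered = set(registered)

    async def has_model(self, model_id: str) -> bool:
        # True only for models that were registered ahead of time.
        return model_id in self._registered


class UnreachableProvider:
    """Hypothetical provider whose model listing always times out."""

    def __init__(self):
        self.list_calls = 0

    async def list_provider_model_ids(self):
        self.list_calls += 1
        raise TimeoutError("Request timed out.")


async def check_model_availability(model_id, model_store, provider) -> bool:
    # 1. Consult the local store first, so a pre-registered model is
    #    accepted even while the provider endpoint is down.
    if await model_store.has_model(model_id):
        return True
    # 2. Otherwise ask the provider. Treating a failure as "not available"
    #    is an assumption of this sketch.
    try:
        return model_id in await provider.list_provider_model_ids()
    except Exception:
        return False
```

With a store containing `"vllm/my-model"`, the check succeeds without ever contacting the provider; an unknown model falls through to the provider call and is reported as unavailable when that call times out.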

# What does this PR do?

## Test Plan

Start Llama Stack and point it at an unavailable vLLM endpoint:

```
VLLM_URL=https://my-unavailable-vllm/v1 MILVUS_DB_PATH=./milvus.db INFERENCE_MODEL=vllm uv run --with llama-stack llama stack build --distro starter --image-type venv --run
```

Llama Stack starts without crashing and only logs the error:

```


         - provider_id: rag-runtime
           toolgroup_id: builtin::rag
         vector_dbs: []
         version: 2

INFO     2025-10-07 06:40:41,804 llama_stack.providers.utils.inference.inference_store:74 inference: Write queue disabled for SQLite to avoid concurrency issues
INFO     2025-10-07 06:40:42,066 llama_stack.providers.utils.responses.responses_store:96 openai_responses: Write queue disabled for SQLite to avoid concurrency issues
ERROR    2025-10-07 06:40:58,882 llama_stack.providers.utils.inference.openai_mixin:436 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: Request timed out.
WARNING  2025-10-07 06:40:58,883 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider vllm: Request timed out.
[...]
INFO     2025-10-07 06:40:59,036 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
INFO     2025-10-07 06:41:04,064 openai._base_client:1618 uncategorized: Retrying request to /models in 0.398814 seconds
INFO     2025-10-07 06:41:09,497 openai._base_client:1618 uncategorized: Retrying request to /models in 0.781908 seconds
ERROR    2025-10-07 06:41:15,282 llama_stack.providers.utils.inference.openai_mixin:436 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: Request timed out.
WARNING  2025-10-07 06:41:15,283 llama_stack.core.routing_tables.models:36 core::routing_tables: Model refresh failed for provider vllm: Request timed out.
```
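
The behaviour visible in the log (a warning per failed provider, with the server still coming up) corresponds to a refresh loop that tolerates per-provider failures. The sketch below illustrates that pattern only; `refresh_models`, `FlakyProvider`, and `HealthyProvider` are hypothetical names, and the actual refresh logic in `llama_stack.core.routing_tables.models` may differ.

```python
import asyncio
import logging

logger = logging.getLogger("sketch.routing_tables")


class FlakyProvider:
    """Hypothetical provider that cannot be reached."""

    async def list_provider_model_ids(self):
        raise TimeoutError("Request timed out.")


class HealthyProvider:
    """Hypothetical provider that answers normally."""

    async def list_provider_model_ids(self):
        return ["model-a"]


async def refresh_models(providers):
    """Collect model ids per provider, skipping providers that fail."""
    refreshed = {}
    for provider_id, provider in providers.items():
        try:
            refreshed[provider_id] = list(await provider.list_provider_model_ids())
        except Exception as exc:
            # Log and continue instead of aborting startup.
            logger.warning("Model refresh failed for provider %s: %s", provider_id, exc)
    return refreshed
```

Calling `asyncio.run(refresh_models({"vllm": FlakyProvider(), "ok": HealthyProvider()}))` returns entries only for the provider that responded and emits one warning for the other.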

- Author: Akram Ben Aissi (committed by GitHub)
- Date: 2025-10-07 19:27:24 +01:00
- Parent: d5b136ac66
- Commit: 1970b4aa4b
- GPG key ID: B5690EEEBB952194 (no known key found for this signature in database)

4 changed files with 64 additions and 4 deletions

									
tests/unit/distribution/routers/test_routing_tables.py (6 changes)

    @@ -201,6 +201,12 @@ async def test_models_routing_table(cached_disk_dist_registry):
        non_existent = await table.get_object_by_identifier("model", "non-existent-model")
        assert non_existent is None

        # Test has_model
        assert await table.has_model("test_provider/test-model")
        assert await table.has_model("test_provider/test-model-2")
        assert not await table.has_model("non-existent-model")
        assert not await table.has_model("test_provider/non-existent-model")

        await table.unregister_model(model_id="test_provider/test-model")
        await table.unregister_model(model_id="test_provider/test-model-2")
