llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-04 10:10:36 +00:00

History

Akram Ben Aissi a548169b99 fix: allow skipping model availability check for vLLM (#3739 ) # What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> Allows model check to fail gracefully instead of crashing on startup. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> ## Test Plan <!-- Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed. --> set VLLM_URL to your VLLM server ``` (base) akram@Mac llama-stack % LAMA_STACK_LOGGING="all=debug" VLLM_ENABLE_MODEL_DISCOVERY=false MILVUS_DB_PATH=./milvus.db INFERENCE_MODEL=vllm uv run --with llama-stack llama stack build --distro starter --image-type venv --run ``` ``` INFO 2025-10-08 20:11:24,637 llama_stack.providers.utils.inference.inference_store:74 inference: Write queue disabled for SQLite to avoid concurrency issues INFO 2025-10-08 20:11:24,866 llama_stack.providers.utils.responses.responses_store:96 openai_responses: Write queue disabled for SQLite to avoid concurrency issues ERROR 2025-10-08 20:11:26,160 llama_stack.providers.utils.inference.openai_mixin:439 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: <a href="https://oauth.akram.a1ey.p3.openshiftapps.com:443/oauth/authorize?approval_prompt=force&client_id=system%3Aserviceaccount%3Arhoai-30-genai%3Adefault&redirect_uri=ht tps%3A%2F%2Fvllm-rhoai-30-genai.apps.rosa.akram.a1ey.p3.openshiftapps.com%2Foauth%2Fcallback&response_type=code&scope=user%3Ainfo+user%3Acheck-access&state=9fba207425 5851c718aca717a5887d76%3A%2Fmodels">Found</a>. [...] INFO 2025-10-08 20:11:26,295 uvicorn.error:84 uncategorized: Started server process [83144] INFO 2025-10-08 20:11:26,296 uvicorn.error:48 uncategorized: Waiting for application startup. INFO 2025-10-08 20:11:26,297 llama_stack.core.server.server:170 core::server: Starting up INFO 2025-10-08 20:11:26,297 llama_stack.core.stack:399 core: starting registry refresh task INFO 2025-10-08 20:11:26,311 uvicorn.error:62 uncategorized: Application startup complete. INFO 2025-10-08 20:11:26,312 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit) ERROR 2025-10-08 20:11:26,791 llama_stack.providers.utils.inference.openai_mixin:439 providers::utils: VLLMInferenceAdapter.list_provider_model_ids() failed with: <a href="https://oauth.akram.a1ey.p3.openshiftapps.com:443/oauth/authorize?approval_prompt=force&client_id=system%3Aserviceaccount%3Arhoai-30-genai%3Adefault&redirect_uri=ht tps%3A%2F%2Fvllm-rhoai-30-genai.apps.rosa.akram.a1ey.p3.openshiftapps.com%2Foauth%2Fcallback&response_type=code&scope=user%3Ainfo+user%3Acheck-access&state=8ef0cba3e1 71a4f8b04cb445cfb91a4c%3A%2Fmodels">Found</a>. ```		2025-10-10 07:23:13 -07:00
..
agents	test: add unit test to ensure all config types are instantiable (#1601 )	2025-03-12 22:29:58 -07:00
datasetio	chore(misc): make tests and starter faster (#3042 )	2025-08-05 14:55:05 -07:00
eval	feat: add static embedding metadata to dynamic model listings for providers using OpenAIMixin (#3547 )	2025-09-25 17:17:00 -04:00
files/s3	feat(tests): make inference_recorder into api_recorder (include tool_invoke) (#3403 )	2025-10-09 14:27:51 -07:00
inference	fix: allow skipping model availability check for vLLM (#3739 )	2025-10-10 07:23:13 -07:00
post_training	fix: remove inference.completion from docs (#3589 )	2025-09-29 13:14:41 -07:00
safety	refactor(logging): rename llama_stack logger categories (#3065 )	2025-08-21 17:31:04 -07:00
tool_runtime	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 15:12:03 -07:00
vector_io	chore: fix flaky unit test and add proper shutdown for file batches (#3725 )	2025-10-07 14:23:14 -07:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381 )	2024-11-06 14:54:05 -08:00