Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-10-04 04:04:14 +00:00)
feat: add static embedding metadata to dynamic model listings for providers using OpenAIMixin
- remove auto-download of ollama embedding models
- add embedding model metadata to dynamic listing, with unit test
- add support and tests for allowed_models
- remove inference provider models.py files where dynamic listing is enabled
- store embedding metadata in an embedding_model_metadata field on inference providers (see the sketch after this list)
- make model_entries optional on ModelRegistryHelper and LiteLLMOpenAIMixin
- make OpenAIMixin a ModelRegistryHelper
- skip the base64 embedding test for remote::ollama, which always returns floats
- only use the OpenAI client for ollama model listing
- remove unused build_model_entry function
- remove unused get_huggingface_repo function
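The core idea is that some metadata for embedding models (such as the embedding dimension) cannot be discovered from a provider's model-listing endpoint, so providers carry it statically and merge it in during dynamic listing. Below is a minimal sketch of that pattern. The field name embedding_model_metadata comes from the commit message; the method names, the per-model metadata keys, and the model_type values are illustrative assumptions, not the project's exact code.

```python
# A minimal sketch of merging static embedding metadata into dynamic
# model listing. Field name `embedding_model_metadata` is from the commit
# message; everything else here is an illustrative assumption.
class OpenAIMixin:
    # Static metadata for embedding models, keyed by provider model id.
    embedding_model_metadata: dict[str, dict[str, int]] = {}

    async def list_models(self) -> list[dict]:
        """Combine dynamically listed model ids with static embedding metadata."""
        models: list[dict] = []
        for model_id in await self._fetch_model_ids():
            meta = self.embedding_model_metadata.get(model_id)
            if meta is not None:
                # Known embedding model: register it with its static metadata.
                models.append({"identifier": model_id, "model_type": "embedding", **meta})
            else:
                # Anything else the endpoint lists is treated as an LLM.
                models.append({"identifier": model_id, "model_type": "llm"})
        return models

    async def _fetch_model_ids(self) -> list[str]:
        # Placeholder for the provider's dynamic listing (e.g. a GET /v1/models call).
        raise NotImplementedError
```

A concrete provider would then only need to populate embedding_model_metadata for the embedding models it knows about, instead of shipping a full static models.py.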
Parent: a50b63906c
Commit: 466ef6f490
43 changed files with 370 additions and 1016 deletions
@@ -292,7 +292,7 @@ class VLLMInferenceAdapter(OpenAIMixin, LiteLLMOpenAIMixin, Inference, ModelsPro
     def __init__(self, config: VLLMInferenceAdapterConfig) -> None:
         LiteLLMOpenAIMixin.__init__(
             self,
-            build_hf_repo_model_entries(),
+            model_entries=build_hf_repo_model_entries(),
             litellm_provider_name="vllm",
             api_key_from_config=config.api_token,
             provider_data_api_key_field="vllm_api_token",
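The hunk above switches the vLLM adapter to passing model_entries by keyword, which pairs with the "make model_entries optional" item: adapters that rely purely on dynamic listing can now omit the argument entirely. A sketch of the assumed signatures follows; the parameter names match the call site in the hunk, but the defaults and bodies are assumptions rather than the project's exact code.

```python
# Assumed sketch of the signature change implied by the diff above.
class ModelRegistryHelper:
    def __init__(self, model_entries: list | None = None):
        # model_entries is now optional: providers with fully dynamic
        # model listing no longer need a static models.py.
        self.model_entries = model_entries or []


class LiteLLMOpenAIMixin(ModelRegistryHelper):
    def __init__(
        self,
        model_entries: list | None = None,  # optional as of this commit (assumed default)
        litellm_provider_name: str = "",
        api_key_from_config: str | None = None,
        provider_data_api_key_field: str | None = None,
    ):
        super().__init__(model_entries=model_entries)
        self.litellm_provider_name = litellm_provider_name
        self.api_key_from_config = api_key_from_config
        self.provider_data_api_key_field = provider_data_api_key_field
```

Passing model_entries by keyword at the call site keeps the vLLM adapter working unchanged while the parameter gains a default for providers that skip static entries.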