feat: add refresh_models support to inference adapters (default: false)

inference adapters can now configure `refresh_models: bool` to control periodic model listing from their providers

BREAKING CHANGE: together inference adapter default changed. previously it always refreshed; now it follows the `refresh_models` config (default: false).
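
A minimal sketch of the behavior described above, not the actual llama_stack code. `SketchAdapterConfig` and `SketchAdapter` are hypothetical stand-ins for an adapter config and a class built on `OpenAIMixin`; only the `refresh_models` field, its default of false, and the `should_refresh_models` method come from this commit.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SketchAdapterConfig:
    # Hypothetical config class. Only the field name `refresh_models`
    # and its default of False are taken from the commit message.
    refresh_models: bool = False


class SketchAdapter:
    # Hypothetical stand-in for an adapter built on OpenAIMixin.
    def __init__(self, config: SketchAdapterConfig) -> None:
        self.config = config

    async def should_refresh_models(self) -> bool:
        # Mirrors the one-line change in the diff: defer to the config
        # instead of returning a hard-coded False.
        return self.config.refresh_models


# Default config: periodic model listing stays off.
default_result = asyncio.run(
    SketchAdapter(SketchAdapterConfig()).should_refresh_models()
)

# Opting in through config turns it on.
enabled_result = asyncio.run(
    SketchAdapter(SketchAdapterConfig(refresh_models=True)).should_refresh_models()
)
```

Under these assumptions, an adapter that used to refresh unconditionally would need `refresh_models` set to true in its config to keep the old behavior.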
2025-12-16 14:49:30 +00:00 · 2025-10-07 07:23:07 -04:00 · 2025-10-07 07:23:07 -04:00 · bc47900ec0
commit bc47900ec0
parent 509ac4a659
31 changed files with 33 additions and 67 deletions
--- a/llama_stack/providers/utils/inference/openai_mixin.py
+++ b/llama_stack/providers/utils/inference/openai_mixin.py
@@ -484,7 +484,7 @@ class OpenAIMixin(NeedsRequestProviderData, ABC, BaseModel):
        return model in self._model_cache

    async def should_refresh_models(self) -> bool:
-        return False
+        return self.config.refresh_models

    #
    # The model_dump implementations are to avoid serializing the extra fields,