mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-16 13:42:35 +00:00
feat: add refresh_models support to inference adapters (default: false)
Inference adapters can now set `refresh_models: bool` to control whether the model list is periodically refreshed from their providers. BREAKING CHANGE: the Together inference adapter's default behavior has changed — it previously always refreshed the model list; it now follows the configuration.
This commit is contained in:
parent
509ac4a659
commit
bc47900ec0
31 changed files with 33 additions and 67 deletions
|
|
@ -63,9 +63,6 @@ class TogetherInferenceAdapter(OpenAIMixin, NeedsRequestProviderData):
|
|||
# Together's /v1/models is not compatible with OpenAI's /v1/models. Together support ticket #13355 -> will not fix, use Together's own client
|
||||
return [m.id for m in await self._get_client().models.list()]
|
||||
|
||||
async def should_refresh_models(self) -> bool:
|
||||
return True
|
||||
|
||||
async def openai_embeddings(
|
||||
self,
|
||||
model: str,
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue