feat(vllm): periodically refresh models

2025-12-24 06:53:57 +00:00 · 2025-07-18 15:33:33 -07:00 · 2025-07-18 15:33:33 -07:00 · 1bf710bec0
commit 1bf710bec0
parent 68a2dfbad7
6 changed files with 95 additions and 13 deletions
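
The two options documented in this commit, `refresh_models` and `refresh_models_interval`, suggest a background loop that re-lists models on a timer. The sketch below is one way such a loop could look, not the provider's actual implementation. `RefreshConfig`, `refresh_loop`, `list_models`, `on_update`, and `max_cycles` are hypothetical names introduced for illustration; only the two option names and their defaults come from the diff.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


@dataclass
class RefreshConfig:
    # Hypothetical container; field names and defaults mirror the documented options.
    refresh_models: bool = False
    refresh_models_interval: int = 300  # seconds


async def refresh_loop(
    config: RefreshConfig,
    list_models: Callable[[], Awaitable[List[str]]],  # assumed async model lister
    on_update: Callable[[List[str]], None],           # assumed callback for new lists
    max_cycles: Optional[int] = None,                 # illustration-only stop condition
) -> int:
    """Re-list models every `refresh_models_interval` seconds.

    Returns the number of refresh attempts made (0 when refreshing is disabled).
    """
    if not config.refresh_models:
        return 0

    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            on_update(await list_models())
        except Exception as exc:
            # A failed refresh should not end the loop; report it and retry later.
            print(f"model refresh failed: {exc}")
        cycles += 1
        if max_cycles is not None and cycles >= max_cycles:
            break
        await asyncio.sleep(config.refresh_models_interval)
    return cycles
```

In a long-running service, a loop like this would typically be started with `asyncio.create_task(...)` and cancelled on shutdown; `max_cycles` exists here only so the sketch can terminate on its own.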
--- a/docs/source/providers/inference/remote_vllm.md
+++ b/docs/source/providers/inference/remote_vllm.md
@@ -12,11 +12,13 @@ Remote vLLM inference provider for connecting to vLLM servers.
 | `max_tokens` | `<class 'int'>` | No | 4096 | Maximum number of tokens to generate. |
 | `api_token` | `str \| None` | No | fake | The API token |
 | `tls_verify` | `bool \| str` | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file. |
+| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically |
+| `refresh_models_interval` | `<class 'int'>` | No | 300 | Interval in seconds to refresh models |

 ## Sample Configuration

 ```yaml
-url: ${env.VLLM_URL}
+url: ${env.VLLM_URL:=}
 max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
 api_token: ${env.VLLM_API_TOKEN:=fake}
 tls_verify: ${env.VLLM_TLS_VERIFY:=true}