---
description: "Remote vLLM inference provider for connecting to vLLM servers."
sidebar_label: Remote - vLLM
title: remote::vllm
---

# remote::vllm

## Description

Remote vLLM inference provider for connecting to vLLM servers.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `str \| None` | No | | The URL for the vLLM model serving endpoint |
| `max_tokens` | `int` | No | 4096 | Maximum number of tokens to generate |
| `api_token` | `MySecretStr` | No | | The API token |
| `tls_verify` | `bool \| str` | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically |
## Sample Configuration

```yaml
url: ${env.VLLM_URL:=}
max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
api_token: ${env.VLLM_API_TOKEN:=fake}
tls_verify: ${env.VLLM_TLS_VERIFY:=true}
```
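Each `${env.VAR:=default}` reference above is resolved at startup: the environment variable's value is substituted if set, otherwise the default after `:=` is used, and the resulting string is then coerced into the type declared on the provider's config class. A rough sketch of the substitution step — illustrative only, not the actual Llama Stack resolver, and the regex is a simplification of the full template syntax:

```python
import os
import re

# Matches ${env.NAME:=default} — simplified for illustration.
_ENV_REF = re.compile(r"\$\{env\.(\w+):=([^}]*)\}")


def resolve_env_refs(value: str) -> str:
    """Substitute each ${env.NAME:=default} with os.environ or the default."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), m.group(2)), value)


print(resolve_env_refs("max_tokens: ${env.VLLM_MAX_TOKENS:=4096}"))
# With VLLM_MAX_TOKENS unset, prints: max_tokens: 4096
```

Note that an unset variable with an empty default (such as `VLLM_URL` above) resolves to an empty string; the subsequent type coercion against the config class determines how that empty value is interpreted.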