mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-27 14:38:05 +00:00
chore: Enabling Milvus for VectorIO CI
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
This commit is contained in:
parent
709eb7da33
commit
c8d41d45ec
115 changed files with 2919 additions and 184 deletions
21
docs/source/providers/inference/remote_hf_serverless.md
Normal file
21
docs/source/providers/inference/remote_hf_serverless.md
Normal file
|
|
@ -0,0 +1,21 @@
|
|||
# remote::hf::serverless
|
||||
|
||||
## Description
|
||||
|
||||
HuggingFace Inference API serverless provider for on-demand model inference.
|
||||
|
||||
## Configuration
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `huggingface_repo` | `<class 'str'>` | No | PydanticUndefined | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
|
||||
| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
huggingface_repo: ${env.INFERENCE_MODEL}
|
||||
api_token: ${env.HF_API_TOKEN}
|
||||
|
||||
```
|
||||
|
||||
Loading…
Add table
Add a link
Reference in a new issue