remote::hf::serverless

Description

Hugging Face Inference API serverless provider for on-demand model inference.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `huggingface_repo` | `<class 'str'>` | No | PydanticUndefined | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |

Sample Configuration

huggingface_repo: ${env.INFERENCE_MODEL}
api_token: ${env.HF_API_TOKEN}
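
The two values above are placeholders of the form `${env.NAME}`, filled in from environment variables when the configuration is loaded. The following is a minimal sketch of how such placeholders could be resolved. It is not llama-stack's actual resolver, which may support additional syntax. The function name `resolve_env_placeholders`, the regular expression, and the example token value are all hypothetical and used only for illustration.

```python
import os
import re

# Illustrative pattern for placeholders written as ${env.NAME}.
# Assumption: this only covers the simple form shown in the sample config.
_ENV_PATTERN = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_placeholders(config, environ=None):
    """Return a copy of a flat config dict with ${env.NAME} placeholders filled in.

    Hypothetical helper: raises KeyError if a referenced variable is missing.
    """
    env = os.environ if environ is None else environ

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return {
        key: _ENV_PATTERN.sub(substitute, value) if isinstance(value, str) else value
        for key, value in config.items()
    }


# The sample configuration from this page, expressed as a Python dict.
sample_config = {
    "huggingface_repo": "${env.INFERENCE_MODEL}",
    "api_token": "${env.HF_API_TOKEN}",
}

# Stand-in environment so the sketch runs without touching real variables.
example_env = {
    "INFERENCE_MODEL": "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "HF_API_TOKEN": "hf_example_token",  # placeholder, not a real token
}

resolved = resolve_env_placeholders(sample_config, example_env)
print(resolved["huggingface_repo"])
```

In practice, `INFERENCE_MODEL` and `HF_API_TOKEN` would be exported in the shell before starting the stack. Passing an explicit mapping, as done here, keeps the sketch deterministic and avoids reading a real token.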