---
description: "HuggingFace Inference Endpoints provider for dedicated model serving."
sidebar_label: Remote - Hf - Endpoint
title: remote::hf::endpoint
---

# remote::hf::endpoint

## Description

HuggingFace Inference Endpoints provider for dedicated model serving.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `endpoint_name` | `<class 'str'>` | No | | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |

## Sample Configuration

```yaml
endpoint_name: ${env.INFERENCE_ENDPOINT_NAME}
api_token: ${env.HF_API_TOKEN}
```
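To make the sample configuration more concrete, here is a minimal Python sketch of how the `${env.NAME}` placeholders could be filled in and how an `endpoint_name` of the form `{namespace}/{endpoint_name}` could be split. It is not the provider's actual code: `resolve_env` and `split_endpoint_name` are hypothetical helpers, the token value is a dummy, and only the plain `${env.NAME}` placeholder form is handled.

```python
from __future__ import annotations

import re

# Matches plain ${env.NAME} placeholders only (assumption: no other
# placeholder forms are handled in this sketch).
_ENV_PATTERN = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(value: str, env: dict[str, str]) -> str:
    """Replace each ${env.NAME} in value with env[NAME]; raise if NAME is missing."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_PATTERN.sub(_lookup, value)


def split_endpoint_name(endpoint_name: str) -> tuple[str | None, str]:
    """Split '{namespace}/{endpoint_name}' into (namespace, name).

    The namespace is optional, so a bare name yields (None, name).
    """
    namespace, sep, name = endpoint_name.rpartition("/")
    return (namespace if sep else None, name)


# Hypothetical stand-ins for the sample configuration and the process environment.
sample_config = {
    "endpoint_name": "${env.INFERENCE_ENDPOINT_NAME}",
    "api_token": "${env.HF_API_TOKEN}",
}
fake_env = {
    "INFERENCE_ENDPOINT_NAME": "my-cool-org/meta-llama-3-1-8b-instruct-rce",
    "HF_API_TOKEN": "hf_dummy_token",
}

resolved = {key: resolve_env(value, fake_env) for key, value in sample_config.items()}
print(split_endpoint_name(resolved["endpoint_name"]))
# → ('my-cool-org', 'meta-llama-3-1-8b-instruct-rce')
```

In a real deployment the values would come from `os.environ` rather than a hand-built dictionary; passing the mapping explicitly keeps the sketch easy to test.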