Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 09:53:45 +00:00)

Commit 63887f2a21 (parent 5f02620a97): Generate updated docs

59 changed files with 173 additions and 167 deletions
@@ -1,7 +1,8 @@
 ---
-description: "Agents
+description: |
+Agents
 
-APIs for creating and interacting with agentic systems."
+APIs for creating and interacting with agentic systems.
 sidebar_label: Agents
 title: Agents
 ---
@@ -12,6 +13,6 @@ title: Agents
 
 Agents
 
 APIs for creating and interacting with agentic systems.
 
 This section contains documentation for all available providers for the **agents** API.
@@ -14,7 +14,7 @@ Meta's reference implementation of an agent system that can use tools, access ve
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `persistence` | `<class 'inline.agents.meta_reference.config.AgentPersistenceConfig'>` | No | | |
+| `persistence` | `inline.agents.meta_reference.config.AgentPersistenceConfig` | No | | |
 
 ## Sample Configuration
 
@@ -1,23 +1,6 @@
 ---
-description: "The Batches API enables efficient processing of multiple requests in a single operation,
-particularly useful for processing large datasets, batch evaluation workflows, and
-cost-effective inference at scale.
-
-The API is designed to allow use of openai client libraries for seamless integration.
-
-This API provides the following extensions:
-- idempotent batch creation
-
-Note: This API is currently under active development and may undergo changes."
-sidebar_label: Batches
-title: Batches
----
-
-# Batches
-
-## Overview
-
-The Batches API enables efficient processing of multiple requests in a single operation,
+description: |
+The Batches API enables efficient processing of multiple requests in a single operation,
 particularly useful for processing large datasets, batch evaluation workflows, and
 cost-effective inference at scale.
 
@@ -27,5 +10,23 @@ The Batches API enables efficient processing of multiple requests in a single op
 - idempotent batch creation
 
 Note: This API is currently under active development and may undergo changes.
+sidebar_label: Batches
+title: Batches
+---
+
+# Batches
+
+## Overview
+
+The Batches API enables efficient processing of multiple requests in a single operation,
+particularly useful for processing large datasets, batch evaluation workflows, and
+cost-effective inference at scale.
+
+The API is designed to allow use of openai client libraries for seamless integration.
+
+This API provides the following extensions:
+- idempotent batch creation
+
+Note: This API is currently under active development and may undergo changes.
 
 This section contains documentation for all available providers for the **batches** API.
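The updated description above notes that the Batches API is designed for use with the OpenAI client libraries. A minimal sketch of that workflow, assuming a locally running Llama Stack server; the base URL, API key, and input file name below are placeholders, not values from this commit:

```python
# Minimal sketch (assumptions: a Llama Stack server exposing OpenAI-compatible
# routes at this base_url; adjust the path to your distribution).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # placeholder base path
    api_key="none",                                 # placeholder; local servers often ignore it
)

# 1) Upload a JSONL file where each line is one chat-completion request.
batch_input = client.files.create(
    file=open("batch_requests.jsonl", "rb"),
    purpose="batch",
)

# 2) Create the batch against the chat-completions endpoint.
batch = client.batches.create(
    input_file_id=batch_input.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3) Check status later; when finished, the output file id points at the results.
status = client.batches.retrieve(batch.id)
print(status.status, status.output_file_id)
```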
@@ -14,9 +14,9 @@ Reference implementation of batches API with KVStore persistence.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Configuration for the key-value store backend. |
+| `kvstore` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Configuration for the key-value store backend. |
-| `max_concurrent_batches` | `<class 'int'>` | No | 1 | Maximum number of concurrent batches to process simultaneously. |
+| `max_concurrent_batches` | `int` | No | 1 | Maximum number of concurrent batches to process simultaneously. |
-| `max_concurrent_requests_per_batch` | `<class 'int'>` | No | 10 | Maximum number of concurrent requests to process per batch. |
+| `max_concurrent_requests_per_batch` | `int` | No | 10 | Maximum number of concurrent requests to process per batch. |
 
 ## Sample Configuration
 
@@ -14,7 +14,7 @@ Local filesystem-based dataset I/O provider for reading and writing datasets to
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `kvstore` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |
 
 ## Sample Configuration
 
@@ -14,7 +14,7 @@ HuggingFace datasets provider for accessing and managing datasets from the Huggi
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `kvstore` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |
 
 ## Sample Configuration
 
@@ -17,7 +17,7 @@ NVIDIA's dataset I/O provider for accessing datasets from NVIDIA's data platform
 | `api_key` | `str \| None` | No | | The NVIDIA API key. |
 | `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
 | `project_id` | `str \| None` | No | test-project | The NVIDIA project ID. |
-| `datasets_url` | `<class 'str'>` | No | http://nemo.test | Base URL for the NeMo Dataset API |
+| `datasets_url` | `str` | No | http://nemo.test | Base URL for the NeMo Dataset API |
 
 ## Sample Configuration
 
@@ -1,7 +1,8 @@
 ---
-description: "Evaluations
+description: |
+Evaluations
 
-Llama Stack Evaluation API for running evaluations on model and agent candidates."
+Llama Stack Evaluation API for running evaluations on model and agent candidates.
 sidebar_label: Eval
 title: Eval
 ---
@@ -12,6 +13,6 @@ title: Eval
 
 Evaluations
 
 Llama Stack Evaluation API for running evaluations on model and agent candidates.
 
 This section contains documentation for all available providers for the **eval** API.
@@ -14,7 +14,7 @@ Meta's reference implementation of evaluation tasks with support for multiple la
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `kvstore` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |
 
 ## Sample Configuration
 
@@ -14,7 +14,7 @@ NVIDIA's evaluation provider for running evaluation tasks on NVIDIA's platform.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `evaluator_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |
+| `evaluator_url` | `str` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |
 
 ## Sample Configuration
 
@@ -1,7 +1,8 @@
 ---
-description: "Files
+description: |
+Files
 
-This API is used to upload documents that can be used with other Llama Stack APIs."
+This API is used to upload documents that can be used with other Llama Stack APIs.
 sidebar_label: Files
 title: Files
 ---
@@ -12,6 +13,6 @@ title: Files
 
 Files
 
 This API is used to upload documents that can be used with other Llama Stack APIs.
 
 This section contains documentation for all available providers for the **files** API.
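Since the Files API is described above as the way to upload documents for use with other Llama Stack APIs, here is a hedged sketch of driving it through the OpenAI-compatible files routes; the base URL, file name, and `purpose` value are assumptions, not values from this commit:

```python
# Minimal sketch (assumption: the stack exposes OpenAI-compatible /files routes).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # placeholder base path
    api_key="none",                                 # placeholder
)

# Upload a document so other APIs can reference it by file id.
uploaded = client.files.create(
    file=open("handbook.pdf", "rb"),
    purpose="assistants",  # assumed purpose value; check the provider's accepted values
)
print(uploaded.id)

# Enumerate stored files.
print([f.id for f in client.files.list().data])
```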
@@ -14,9 +14,9 @@ Local filesystem-based file storage provider for managing files and documents lo
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `storage_dir` | `<class 'str'>` | No | | Directory to store uploaded files |
+| `storage_dir` | `str` | No | | Directory to store uploaded files |
-| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
+| `metadata_store` | `llama_stack.core.storage.datatypes.SqlStoreReference` | No | | SQL store configuration for file metadata |
-| `ttl_secs` | `<class 'int'>` | No | 31536000 | |
+| `ttl_secs` | `int` | No | 31536000 | |
 
 ## Sample Configuration
 
@@ -14,8 +14,8 @@ OpenAI Files API provider for managing files through OpenAI's native file storag
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `api_key` | `<class 'str'>` | No | | OpenAI API key for authentication |
+| `api_key` | `str` | No | | OpenAI API key for authentication |
-| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
+| `metadata_store` | `llama_stack.core.storage.datatypes.SqlStoreReference` | No | | SQL store configuration for file metadata |
 
 ## Sample Configuration
 
@@ -14,13 +14,13 @@ AWS S3-based file storage provider for scalable cloud file management with metad
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `bucket_name` | `<class 'str'>` | No | | S3 bucket name to store files |
+| `bucket_name` | `str` | No | | S3 bucket name to store files |
-| `region` | `<class 'str'>` | No | us-east-1 | AWS region where the bucket is located |
+| `region` | `str` | No | us-east-1 | AWS region where the bucket is located |
 | `aws_access_key_id` | `str \| None` | No | | AWS access key ID (optional if using IAM roles) |
 | `aws_secret_access_key` | `str \| None` | No | | AWS secret access key (optional if using IAM roles) |
 | `endpoint_url` | `str \| None` | No | | Custom S3 endpoint URL (for MinIO, LocalStack, etc.) |
-| `auto_create_bucket` | `<class 'bool'>` | No | False | Automatically create the S3 bucket if it doesn't exist |
+| `auto_create_bucket` | `bool` | No | False | Automatically create the S3 bucket if it doesn't exist |
-| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
+| `metadata_store` | `llama_stack.core.storage.datatypes.SqlStoreReference` | No | | SQL store configuration for file metadata |
 
 ## Sample Configuration
 
@@ -1,12 +1,13 @@
 ---
-description: "Inference
+description: |
+Inference
 
 Llama Stack Inference API for generating completions, chat completions, and embeddings.
 
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
-- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
+- LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models: these models reorder the documents based on their relevance to a query."
+- Rerank models: these models reorder the documents based on their relevance to a query.
 sidebar_label: Inference
 title: Inference
 ---
@@ -17,11 +18,11 @@ title: Inference
 
 Inference
 
 Llama Stack Inference API for generating completions, chat completions, and embeddings.
 
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
 - Rerank models: these models reorder the documents based on their relevance to a query.
 
 This section contains documentation for all available providers for the **inference** API.
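To make the two most common model kinds listed above concrete, here is a minimal sketch of a chat completion and an embedding call through an OpenAI-compatible client; the base URL and model ids are placeholders and must match models actually registered on your server:

```python
# Minimal sketch (assumptions: OpenAI-compatible routes at this base_url and
# model ids that exist on the server).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # placeholder
    api_key="none",                                 # placeholder
)

# LLM model: chat (conversational) completion.
chat = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
    messages=[{"role": "user", "content": "In one sentence, what does a rerank model do?"}],
)
print(chat.choices[0].message.content)

# Embedding model: vectors to be used for semantic search.
emb = client.embeddings.create(
    model="all-MiniLM-L6-v2",  # placeholder embedding model id
    input=["llama stack inference providers"],
)
print(len(emb.data[0].embedding))
```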
@@ -16,12 +16,12 @@ Meta's reference implementation of inference with support for various model form
 |-------|------|----------|---------|-------------|
 | `model` | `str \| None` | No | | |
 | `torch_seed` | `int \| None` | No | | |
-| `max_seq_len` | `<class 'int'>` | No | 4096 | |
+| `max_seq_len` | `int` | No | 4096 | |
-| `max_batch_size` | `<class 'int'>` | No | 1 | |
+| `max_batch_size` | `int` | No | 1 | |
 | `model_parallel_size` | `int \| None` | No | | |
-| `create_distributed_process_group` | `<class 'bool'>` | No | True | |
+| `create_distributed_process_group` | `bool` | No | True | |
 | `checkpoint_dir` | `str \| None` | No | | |
-| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig, annotation=NoneType, required=True, discriminator='type'` | No | | |
+| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig` | No | | |
 
 ## Sample Configuration
 
@@ -14,8 +14,8 @@ Anthropic inference provider for accessing Claude models and Anthropic's AI serv
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
 
 ## Sample Configuration
@@ -21,10 +21,10 @@ https://learn.microsoft.com/en-us/azure/ai-foundry/openai/overview
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `api_base` | `<class 'pydantic.networks.HttpUrl'>` | No | | Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com) |
+| `api_base` | `pydantic.networks.HttpUrl` | No | | Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com) |
 | `api_version` | `str \| None` | No | | Azure API version for Azure (e.g., 2024-12-01-preview) |
 | `api_type` | `str \| None` | No | azure | Azure API type for Azure (e.g., azure) |
 
@@ -14,10 +14,10 @@ Cerebras inference provider for running models on Cerebras Cloud platform.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `base_url` | `<class 'str'>` | No | https://api.cerebras.ai | Base URL for the Cerebras API |
+| `base_url` | `str` | No | https://api.cerebras.ai | Base URL for the Cerebras API |
 
 ## Sample Configuration
 
@@ -14,8 +14,8 @@ Databricks inference provider for running models on Databricks' unified analytic
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | The Databricks API token |
 | `url` | `str \| None` | No | | The URL for the Databricks model serving endpoint |
 
@@ -14,10 +14,10 @@ Fireworks AI inference provider for Llama models and other AI models on the Fire
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |
+| `url` | `str` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |
 
 ## Sample Configuration
 
@@ -14,8 +14,8 @@ Google Gemini inference provider for accessing Gemini models and Google's AI ser
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
 
 ## Sample Configuration
@@ -14,10 +14,10 @@ Groq inference provider for ultra-fast inference using Groq's LPU technology.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://api.groq.com | The URL for the Groq AI server |
+| `url` | `str` | No | https://api.groq.com | The URL for the Groq AI server |
 
 ## Sample Configuration
 
@@ -14,7 +14,7 @@ HuggingFace Inference Endpoints provider for dedicated model serving.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `endpoint_name` | `<class 'str'>` | No | | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
+| `endpoint_name` | `str` | No | | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
 
 ## Sample Configuration
@@ -14,7 +14,7 @@ HuggingFace Inference API serverless provider for on-demand model inference.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `huggingface_repo` | `<class 'str'>` | No | | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
+| `huggingface_repo` | `str` | No | | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
 
 ## Sample Configuration
@@ -14,10 +14,10 @@ Llama OpenAI-compatible provider for using Llama models with OpenAI API format.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `openai_compat_api_base` | `<class 'str'>` | No | https://api.llama.com/compat/v1/ | The URL for the Llama API server |
+| `openai_compat_api_base` | `str` | No | https://api.llama.com/compat/v1/ | The URL for the Llama API server |
 
 ## Sample Configuration
 
@@ -14,13 +14,13 @@ NVIDIA inference provider for accessing NVIDIA NIM models and AI services.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://integrate.api.nvidia.com | A base url for accessing the NVIDIA NIM |
+| `url` | `str` | No | https://integrate.api.nvidia.com | A base url for accessing the NVIDIA NIM |
-| `timeout` | `<class 'int'>` | No | 60 | Timeout for the HTTP requests |
+| `timeout` | `int` | No | 60 | Timeout for the HTTP requests |
-| `append_api_version` | `<class 'bool'>` | No | True | When set to false, the API version will not be appended to the base_url. By default, it is true. |
+| `append_api_version` | `bool` | No | True | When set to false, the API version will not be appended to the base_url. By default, it is true. |
-| `rerank_model_to_url` | `dict[str, str` | No | `{'nv-rerank-qa-mistral-4b:1': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/reranking', 'nvidia/nv-rerankqa-mistral-4b-v3': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/nv-rerankqa-mistral-4b-v3/reranking', 'nvidia/llama-3.2-nv-rerankqa-1b-v2': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/llama-3_2-nv-rerankqa-1b-v2/reranking'}` | Mapping of rerank model identifiers to their API endpoints. |
+| `rerank_model_to_url` | `dict[str, str]` | No | `{'nv-rerank-qa-mistral-4b:1': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/reranking', 'nvidia/nv-rerankqa-mistral-4b-v3': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/nv-rerankqa-mistral-4b-v3/reranking', 'nvidia/llama-3.2-nv-rerankqa-1b-v2': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/llama-3_2-nv-rerankqa-1b-v2/reranking'}` | Mapping of rerank model identifiers to their API endpoints. |
 
 ## Sample Configuration
 
@@ -14,9 +14,9 @@ Ollama inference provider for running local models through the Ollama runtime.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
-| `url` | `<class 'str'>` | No | http://localhost:11434 | |
+| `url` | `str` | No | http://localhost:11434 | |
 
 ## Sample Configuration
 
@@ -14,10 +14,10 @@ OpenAI inference provider for accessing GPT models and other OpenAI services.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `base_url` | `<class 'str'>` | No | https://api.openai.com/v1 | Base URL for OpenAI API |
+| `base_url` | `str` | No | https://api.openai.com/v1 | Base URL for OpenAI API |
 
 ## Sample Configuration
 
@@ -14,8 +14,8 @@ RunPod inference provider for running models on RunPod's cloud GPU platform.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | The API token |
 | `url` | `str \| None` | No | | The URL for the Runpod model serving endpoint |
 
@@ -14,10 +14,10 @@ SambaNova inference provider for running models on SambaNova's dataflow architec
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
+| `url` | `str` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
 
 ## Sample Configuration
 
@@ -14,9 +14,9 @@ Text Generation Inference (TGI) provider for HuggingFace model serving.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
-| `url` | `<class 'str'>` | No | | The URL for the TGI serving endpoint |
+| `url` | `str` | No | | The URL for the TGI serving endpoint |
 
 ## Sample Configuration
 
@@ -14,10 +14,10 @@ Together AI inference provider for open-source models and collaborative AI devel
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://api.together.xyz/v1 | The URL for the Together AI server |
+| `url` | `str` | No | https://api.together.xyz/v1 | The URL for the Together AI server |
 
 ## Sample Configuration
 
@@ -53,10 +53,10 @@ Available Models:
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
-| `project` | `<class 'str'>` | No | | Google Cloud project ID for Vertex AI |
+| `project` | `str` | No | | Google Cloud project ID for Vertex AI |
-| `location` | `<class 'str'>` | No | us-central1 | Google Cloud location for Vertex AI |
+| `location` | `str` | No | us-central1 | Google Cloud location for Vertex AI |
 
 ## Sample Configuration
 
@@ -14,11 +14,11 @@ Remote vLLM inference provider for connecting to vLLM servers.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | The API token |
 | `url` | `str \| None` | No | | The URL for the vLLM model serving endpoint |
-| `max_tokens` | `<class 'int'>` | No | 4096 | Maximum number of tokens to generate. |
+| `max_tokens` | `int` | No | 4096 | Maximum number of tokens to generate. |
 | `tls_verify` | `bool \| str` | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file. |
 
 ## Sample Configuration
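Because vLLM itself serves an OpenAI-compatible API, one quick way to sanity-check the `url` and `api_token` you plan to put in this provider config is to hit the vLLM endpoint directly; the host, port, and token below are assumptions for a local server, not values from this commit:

```python
# Minimal sketch (assumptions: a vLLM server with its OpenAI-compatible frontend
# on localhost:8000 and at least one served model).
from openai import OpenAI

vllm = OpenAI(
    base_url="http://localhost:8000/v1",  # would become the provider's `url`
    api_key="fake-token",                 # would become the provider's `api_token`, if set
)

models = vllm.models.list()               # confirm the endpoint answers and list served models
print([m.id for m in models.data])

resp = vllm.chat.completions.create(
    model=models.data[0].id,
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=16,                        # keep well under the provider's `max_tokens` cap
)
print(resp.choices[0].message.content)
```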
@@ -14,12 +14,12 @@ IBM WatsonX inference provider for accessing AI models on IBM's WatsonX platform
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
 | `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://us-south.ml.cloud.ibm.com | A base url for accessing the watsonx.ai |
+| `url` | `str` | No | https://us-south.ml.cloud.ibm.com | A base url for accessing the watsonx.ai |
 | `project_id` | `str \| None` | No | | The watsonx.ai project ID |
-| `timeout` | `<class 'int'>` | No | 60 | Timeout for the HTTP requests |
+| `timeout` | `int` | No | 60 | Timeout for the HTTP requests |
 
 ## Sample Configuration
 
@@ -14,23 +14,23 @@ HuggingFace-based post-training provider for fine-tuning models using the Huggin
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `device` | `<class 'str'>` | No | cuda | |
+| `device` | `str` | No | cuda | |
-| `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
+| `distributed_backend` | `Literal['fsdp', 'deepspeed' \| None]` | No | | |
-| `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
+| `checkpoint_format` | `Literal['full_state', 'huggingface' \| None]` | No | huggingface | |
-| `chat_template` | `<class 'str'>` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
+| `chat_template` | `str` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
-| `model_specific_config` | `<class 'dict'>` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
+| `model_specific_config` | `dict` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
-| `max_seq_length` | `<class 'int'>` | No | 2048 | |
+| `max_seq_length` | `int` | No | 2048 | |
-| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
+| `gradient_checkpointing` | `bool` | No | False | |
-| `save_total_limit` | `<class 'int'>` | No | 3 | |
+| `save_total_limit` | `int` | No | 3 | |
-| `logging_steps` | `<class 'int'>` | No | 10 | |
+| `logging_steps` | `int` | No | 10 | |
-| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
+| `warmup_ratio` | `float` | No | 0.1 | |
-| `weight_decay` | `<class 'float'>` | No | 0.01 | |
+| `weight_decay` | `float` | No | 0.01 | |
-| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
+| `dataloader_num_workers` | `int` | No | 4 | |
-| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
+| `dataloader_pin_memory` | `bool` | No | True | |
-| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
+| `dpo_beta` | `float` | No | 0.1 | |
-| `use_reference_model` | `<class 'bool'>` | No | True | |
+| `use_reference_model` | `bool` | No | True | |
-| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair'` | No | sigmoid | |
+| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair']` | No | sigmoid | |
-| `dpo_output_dir` | `<class 'str'>` | No | | |
+| `dpo_output_dir` | `str` | No | | |
 
 ## Sample Configuration
 
@@ -15,7 +15,7 @@ TorchTune-based post-training provider for fine-tuning and optimizing models usi
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
 | `torch_seed` | `int \| None` | No | | |
-| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |
+| `checkpoint_format` | `Literal['meta', 'huggingface' \| None]` | No | meta | |
 
 ## Sample Configuration
 
@@ -15,7 +15,7 @@ TorchTune-based post-training provider for fine-tuning and optimizing models usi
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
 | `torch_seed` | `int \| None` | No | | |
-| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |
+| `checkpoint_format` | `Literal['meta', 'huggingface' \| None]` | No | meta | |
 
 ## Sample Configuration
 
@@ -18,9 +18,9 @@ NVIDIA's post-training provider for fine-tuning models on NVIDIA's platform.
 | `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
 | `project_id` | `str \| None` | No | test-example-model@v1 | The NVIDIA project ID. |
 | `customizer_url` | `str \| None` | No | | Base URL for the NeMo Customizer API |
-| `timeout` | `<class 'int'>` | No | 300 | Timeout for the NVIDIA Post Training API |
+| `timeout` | `int` | No | 300 | Timeout for the NVIDIA Post Training API |
-| `max_retries` | `<class 'int'>` | No | 3 | Maximum number of retries for the NVIDIA Post Training API |
+| `max_retries` | `int` | No | 3 | Maximum number of retries for the NVIDIA Post Training API |
-| `output_model_dir` | `<class 'str'>` | No | test-example-model@v1 | Directory to save the output model |
+| `output_model_dir` | `str` | No | test-example-model@v1 | Directory to save the output model |
 
 ## Sample Configuration
 
@@ -1,7 +1,8 @@
 ---
-description: "Safety
+description: |
+Safety
 
-OpenAI-compatible Moderations API."
+OpenAI-compatible Moderations API.
 sidebar_label: Safety
 title: Safety
 ---
@@ -12,6 +13,6 @@ title: Safety
 
 Safety
 
 OpenAI-compatible Moderations API.
 
 This section contains documentation for all available providers for the **safety** API.
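Since the Safety API is described above as an OpenAI-compatible Moderations API, a minimal sketch of calling it through the OpenAI client follows; the base URL and shield/model id are placeholders, not values from this commit:

```python
# Minimal sketch (assumptions: OpenAI-compatible /moderations route on the stack
# and a registered safety model id).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # placeholder
    api_key="none",                                 # placeholder
)

result = client.moderations.create(
    model="meta-llama/Llama-Guard-3-8B",  # placeholder shield/model id
    input="How do I pick a lock?",
)
print(result.results[0].flagged)
```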
@@ -14,7 +14,7 @@ Llama Guard safety provider for content moderation and safety filtering using Me
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `excluded_categories` | `list[str` | No | [] | |
+| `excluded_categories` | `list[str]` | No | [] | |
 
 ## Sample Configuration
 
@@ -14,7 +14,7 @@ Prompt Guard safety provider for detecting and filtering unsafe prompts and cont
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `guard_type` | `<class 'str'>` | No | injection | |
+| `guard_type` | `str` | No | injection | |
 
 ## Sample Configuration
 
@@ -14,13 +14,13 @@ AWS Bedrock safety provider for content moderation using AWS's safety services.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `aws_access_key_id` | `str \| None` | No | | The AWS access key to use. Default use environment variable: AWS_ACCESS_KEY_ID |
| `aws_secret_access_key` | `str \| None` | No | | The AWS secret access key to use. Default use environment variable: AWS_SECRET_ACCESS_KEY |
| `aws_session_token` | `str \| None` | No | | The AWS session token to use. Default use environment variable: AWS_SESSION_TOKEN |
| `region_name` | `str \| None` | No | | The default AWS Region to use, for example, us-west-1 or us-west-2.Default use environment variable: AWS_DEFAULT_REGION |
-| `profile_name` | `str \| None` | No | | The profile name that contains credentials to use.Default use environment variable: AWS_PROFILE |
+| `profile_name` | `str \| None` | No | tpetkos | The profile name that contains credentials to use.Default use environment variable: AWS_PROFILE |
| `total_max_attempts` | `int \| None` | No | | An integer representing the maximum number of attempts that will be made for a single request, including the initial attempt. Default use environment variable: AWS_MAX_ATTEMPTS |
| `retry_mode` | `str \| None` | No | | A string representing the type of retries Boto3 will perform.Default use environment variable: AWS_RETRY_MODE |
| `connect_timeout` | `float \| None` | No | 60.0 | The time in seconds till a timeout exception is thrown when attempting to make a connection. The default is 60 seconds. |

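The Bedrock fields above map directly onto the provider's config block. A minimal sketch, assuming the YAML config-body form used by the Sample Configuration sections and `${env.…}` substitution; the region value and the choice of fields are illustrative, since every field is optional and falls back to its environment variable:

```yaml
# Illustrative config body for the Bedrock safety provider (field names from the table above)
aws_access_key_id: ${env.AWS_ACCESS_KEY_ID}          # omit to rely on the environment variable
aws_secret_access_key: ${env.AWS_SECRET_ACCESS_KEY}
region_name: us-west-2                               # example region, not a documented default
total_max_attempts: 3                                # optional retry ceiling
retry_mode: standard                                 # Boto3 retry mode
connect_timeout: 60.0                                # documented default
```
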
@@ -14,7 +14,7 @@ NVIDIA's safety provider for content moderation and safety filtering.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `guardrails_service_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the Guardrails service |
+| `guardrails_service_url` | `str` | No | http://0.0.0.0:7331 | The url for accessing the Guardrails service |
| `config_id` | `str \| None` | No | self-check | Guardrails configuration ID to use from the Guardrails configuration store |

## Sample Configuration

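Both NVIDIA fields carry documented defaults, so a config body can simply restate them. A minimal sketch, assuming the same YAML config-body form:

```yaml
# Illustrative config body for the NVIDIA safety provider
guardrails_service_url: http://0.0.0.0:7331   # documented default
config_id: self-check                         # documented default Guardrails configuration ID
```
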
@@ -14,7 +14,7 @@ SambaNova's safety provider for content moderation and safety filtering.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `url` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
+| `url` | `str` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The SambaNova cloud API Key |

## Sample Configuration

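A minimal sketch for the SambaNova safety provider in the same YAML config-body form; the environment variable name is an assumption, not taken from the source:

```yaml
# Illustrative config body for the SambaNova safety provider
url: https://api.sambanova.ai/v1     # documented default
api_key: ${env.SAMBANOVA_API_KEY}    # SecretStr; env var name is an assumption
```
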
@@ -15,7 +15,7 @@ Bing Search tool for web search capabilities using Microsoft's search engine.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | |
-| `top_k` | `<class 'int'>` | No | 3 | |
+| `top_k` | `int` | No | 3 | |

## Sample Configuration

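A minimal sketch for the Bing Search tool runtime, assuming the YAML config-body form; the environment variable name is illustrative:

```yaml
# Illustrative config body for the Bing Search tool runtime
api_key: ${env.BING_API_KEY}   # env var name is an assumption
top_k: 3                       # documented default
```
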
@@ -15,7 +15,7 @@ Brave Search tool for web search capabilities with privacy-focused results.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Brave Search API Key |
-| `max_results` | `<class 'int'>` | No | 3 | The maximum number of results to return |
+| `max_results` | `int` | No | 3 | The maximum number of results to return |

## Sample Configuration

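Brave Search follows the same shape; a minimal sketch with an illustrative environment variable name:

```yaml
# Illustrative config body for the Brave Search tool runtime
api_key: ${env.BRAVE_SEARCH_API_KEY}   # env var name is an assumption
max_results: 3                         # documented default
```
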
@@ -15,7 +15,7 @@ Tavily Search tool for AI-optimized web search with structured results.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Tavily Search API Key |
-| `max_results` | `<class 'int'>` | No | 3 | The maximum number of results to return |
+| `max_results` | `int` | No | 3 | The maximum number of results to return |

## Sample Configuration

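Tavily mirrors the other search tools; a minimal sketch with an illustrative environment variable name:

```yaml
# Illustrative config body for the Tavily Search tool runtime
api_key: ${env.TAVILY_SEARCH_API_KEY}   # env var name is an assumption
max_results: 3                          # documented default
```
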
@@ -78,8 +78,8 @@ See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introducti

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | | |
+| `db_path` | `str` | No | | |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Config for KV store backend |

## Sample Configuration

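For the inline Chroma provider only `db_path` is a plain scalar; `persistence` takes a KV store reference whose exact shape is shown in the provider's own Sample Configuration, so it is only hinted at here. The path is illustrative:

```yaml
# Illustrative config body for the inline Chroma vector_io provider
db_path: ~/.llama/chromadb/chroma.db   # illustrative location, not a documented default
# persistence: <KVStoreReference>      (see the provider's Sample Configuration for the exact shape)
```
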
@@ -95,7 +95,7 @@ more details about Faiss in general.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |

## Sample Configuration

@@ -14,7 +14,7 @@ Meta's reference implementation of a vector database.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |

## Sample Configuration

@@ -16,9 +16,9 @@ Please refer to the remote provider documentation.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | | |
+| `db_path` | `str` | No | | |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend (SQLite only for now) |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Config for KV store backend (SQLite only for now) |
-| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
+| `consistency_level` | `str` | No | Strong | The consistency level of the Milvus server |

## Sample Configuration

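A minimal sketch for the inline Milvus provider; the path is illustrative and `persistence` is left to the Sample Configuration:

```yaml
# Illustrative config body for the inline Milvus vector_io provider
db_path: ~/.llama/milvus.db       # illustrative location
consistency_level: Strong         # documented default
# persistence: <KVStoreReference> (see the provider's Sample Configuration for the exact shape)
```
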
@@ -97,8 +97,8 @@ See the [Qdrant documentation](https://qdrant.tech/documentation/) for more details.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `path` | `<class 'str'>` | No | | |
+| `path` | `str` | No | | |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |

## Sample Configuration

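A minimal sketch for the inline Qdrant provider; the on-disk path is illustrative:

```yaml
# Illustrative config body for the inline Qdrant vector_io provider
path: ~/.llama/qdrant             # illustrative storage location
# persistence: <KVStoreReference> (see the provider's Sample Configuration for the exact shape)
```
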
@@ -407,8 +407,8 @@ See [sqlite-vec's GitHub repo](https://github.com/asg017/sqlite-vec/tree/main) f

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | | Path to the SQLite database file |
+| `db_path` | `str` | No | | Path to the SQLite database file |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend (SQLite only for now) |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Config for KV store backend (SQLite only for now) |

## Sample Configuration

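A minimal sketch for the sqlite-vec provider; the database path is illustrative:

```yaml
# Illustrative config body for the sqlite-vec vector_io provider
db_path: ~/.llama/sqlite_vec.db   # illustrative path to the SQLite database file
# persistence: <KVStoreReference> (see the provider's Sample Configuration for the exact shape)
```
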
@@ -16,8 +16,8 @@ Please refer to the sqlite-vec provider documentation.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | | Path to the SQLite database file |
+| `db_path` | `str` | No | | Path to the SQLite database file |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend (SQLite only for now) |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Config for KV store backend (SQLite only for now) |

## Sample Configuration

@@ -78,7 +78,7 @@ See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introducti

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `str \| None` | No | | |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Config for KV store backend |

## Sample Configuration

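A minimal sketch for the remote Chroma provider; the server URL is illustrative:

```yaml
# Illustrative config body for the remote Chroma vector_io provider
url: http://localhost:8000        # illustrative Chroma server endpoint
# persistence: <KVStoreReference> (see the provider's Sample Configuration for the exact shape)
```
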
@@ -405,10 +405,10 @@ For more details on TLS configuration, refer to the [TLS setup guide](https://mi

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `uri` | `<class 'str'>` | No | | The URI of the Milvus server |
+| `uri` | `str` | No | | The URI of the Milvus server |
| `token` | `str \| None` | No | | The token of the Milvus server |
-| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
+| `consistency_level` | `str` | No | Strong | The consistency level of the Milvus server |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | Config for KV store backend |
| `config` | `dict` | No | `{}` | This configuration allows additional fields to be passed through to the underlying Milvus client. See the [Milvus](https://milvus.io/docs/install-overview.md) documentation for more details about Milvus in general. |

:::note

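A minimal sketch for the remote Milvus provider; the endpoint and token are illustrative, and `config` restates the documented pass-through default:

```yaml
# Illustrative config body for the remote Milvus vector_io provider
uri: http://localhost:19530       # illustrative Milvus endpoint
token: ${env.MILVUS_TOKEN}        # env var name is an assumption
consistency_level: Strong         # documented default
config: {}                        # extra options passed through to the Milvus client
# persistence: <KVStoreReference> (see the provider's Sample Configuration for the exact shape)
```
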
@@ -19,14 +19,14 @@ Please refer to the inline provider documentation.

| `location` | `str \| None` | No | | |
| `url` | `str \| None` | No | | |
| `port` | `int \| None` | No | 6333 | |
-| `grpc_port` | `<class 'int'>` | No | 6334 | |
+| `grpc_port` | `int` | No | 6334 | |
-| `prefer_grpc` | `<class 'bool'>` | No | False | |
+| `prefer_grpc` | `bool` | No | False | |
| `https` | `bool \| None` | No | | |
| `api_key` | `str \| None` | No | | |
| `prefix` | `str \| None` | No | | |
| `timeout` | `int \| None` | No | | |
| `host` | `str \| None` | No | | |
-| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference` | No | | |

## Sample Configuration

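A minimal sketch for the remote Qdrant provider, assuming a connection by URL; `location` and `host` are alternative ways to address the server, and every value below is illustrative apart from the documented port and gRPC defaults:

```yaml
# Illustrative config body for the remote Qdrant vector_io provider
url: http://localhost             # illustrative; host/location are alternative addressing fields
port: 6333                        # documented default
grpc_port: 6334                   # documented default
prefer_grpc: false                # documented default
api_key: ${env.QDRANT_API_KEY}    # env var name is an assumption
# persistence: <KVStoreReference> (see the provider's Sample Configuration for the exact shape)
```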