mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-12-03 18:00:36 +00:00
Merge branch 'main' into auto_instrument_1
commit 7d8cef6c71
80 changed files with 1314 additions and 642 deletions
@@ -104,23 +104,19 @@ client.toolgroups.register(
 )
 ```

-Note that most of the more useful MCP servers need you to authenticate with them. Many of them use OAuth2.0 for authentication. You can provide authorization headers to send to the MCP server using the "Provider Data" abstraction provided by Llama Stack. When making an agent call,
+Note that most of the more useful MCP servers need you to authenticate with them. Many of them use OAuth2.0 for authentication. You can provide the authorization token when creating the Agent:

 ```python
 agent = Agent(
     ...,
-    tools=["mcp::deepwiki"],
-    extra_headers={
-        "X-LlamaStack-Provider-Data": json.dumps(
-            {
-                "mcp_headers": {
-                    "http://mcp.deepwiki.com/sse": {
-                        "Authorization": "Bearer <your_access_token>",
-                    },
-                },
-            }
-        ),
-    },
+    tools=[
+        {
+            "type": "mcp",
+            "server_url": "https://mcp.deepwiki.com/sse",
+            "server_label": "mcp::deepwiki",
+            "authorization": "<your_access_token>", # OAuth token (without "Bearer " prefix)
+        }
+    ],
 )
 agent.create_turn(...)
 ```
|
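The added lines in the hunk above pass the MCP server and its token as a single tool entry. The following is a minimal sketch of building such an entry. It assumes a hypothetical helper named `mcp_tool`, which is not part of Llama Stack, and it strips an accidental `Bearer ` prefix because the diff's comment asks for the bare token.

```python
def mcp_tool(server_url: str, server_label: str, token: str) -> dict:
    """Build an MCP tool entry shaped like the one in the hunk above.

    Hypothetical helper for illustration only; not a Llama Stack API.
    """
    # The diff's comment asks for the bare OAuth token, so drop a leading
    # "Bearer " if the caller pasted a full Authorization header value.
    prefix = "Bearer "
    if token.startswith(prefix):
        token = token[len(prefix):]
    return {
        "type": "mcp",
        "server_url": server_url,
        "server_label": server_label,
        "authorization": token,
    }


tool = mcp_tool("https://mcp.deepwiki.com/sse", "mcp::deepwiki", "Bearer abc123")
print(tool["authorization"])
```

The resulting dictionary could then be placed in the `tools=[...]` list shown in the hunk; whether the server accepts the token depends on the MCP server itself.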
@@ -1,7 +1,8 @@
 ---
-description: "Agents
+description: |
+Agents

-APIs for creating and interacting with agentic systems."
+APIs for creating and interacting with agentic systems.
 sidebar_label: Agents
 title: Agents
 ---
@@ -14,7 +14,7 @@ Meta's reference implementation of an agent system that can use tools, access ve

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `persistence` | `<class 'inline.agents.meta_reference.config.AgentPersistenceConfig'>` | No | | |
+| `persistence` | `AgentPersistenceConfig` | No | | |

 ## Sample Configuration

@@ -1,14 +1,15 @@
 ---
-description: "The Batches API enables efficient processing of multiple requests in a single operation,
-particularly useful for processing large datasets, batch evaluation workflows, and
-cost-effective inference at scale.
+description: |
+The Batches API enables efficient processing of multiple requests in a single operation,
+particularly useful for processing large datasets, batch evaluation workflows, and
+cost-effective inference at scale.

-The API is designed to allow use of openai client libraries for seamless integration.
+The API is designed to allow use of openai client libraries for seamless integration.

-This API provides the following extensions:
-- idempotent batch creation
+This API provides the following extensions:
+- idempotent batch creation

-Note: This API is currently under active development and may undergo changes."
+Note: This API is currently under active development and may undergo changes.
 sidebar_label: Batches
 title: Batches
 ---
@@ -14,9 +14,9 @@ Reference implementation of batches API with KVStore persistence.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Configuration for the key-value store backend. |
-| `max_concurrent_batches` | `<class 'int'>` | No | 1 | Maximum number of concurrent batches to process simultaneously. |
-| `max_concurrent_requests_per_batch` | `<class 'int'>` | No | 10 | Maximum number of concurrent requests to process per batch. |
+| `kvstore` | `KVStoreReference` | No | | Configuration for the key-value store backend. |
+| `max_concurrent_batches` | `int` | No | 1 | Maximum number of concurrent batches to process simultaneously. |
+| `max_concurrent_requests_per_batch` | `int` | No | 10 | Maximum number of concurrent requests to process per batch. |

 ## Sample Configuration

|
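The two limits in the table above, `max_concurrent_batches` and `max_concurrent_requests_per_batch`, describe nested concurrency caps. The sketch below shows one common way such caps are enforced, using two levels of `asyncio.Semaphore`. It is an illustration under stated assumptions, not the provider's implementation: `run_batches`, `handle_batch`, and `handle_request` are hypothetical names, and the doubling step stands in for a real inference call.

```python
import asyncio


async def run_batches(batches, max_concurrent_batches=1, max_concurrent_requests_per_batch=10):
    # Outer cap: how many batches may be in flight at once.
    batch_slots = asyncio.Semaphore(max_concurrent_batches)

    async def handle_request(request, request_slots):
        # Inner cap: how many requests of one batch may run at once.
        async with request_slots:
            await asyncio.sleep(0)  # stand-in for a real inference call
            return request * 2

    async def handle_batch(requests):
        async with batch_slots:
            request_slots = asyncio.Semaphore(max_concurrent_requests_per_batch)
            return await asyncio.gather(
                *(handle_request(r, request_slots) for r in requests)
            )

    # gather preserves input order, so results line up with the batches.
    return await asyncio.gather(*(handle_batch(b) for b in batches))


results = asyncio.run(run_batches([[1, 2, 3], [4, 5]]))
print(results)
```

With the defaults from the table (1 and 10), batches run one after another while up to ten requests inside the active batch overlap.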
@@ -14,7 +14,7 @@ Local filesystem-based dataset I/O provider for reading and writing datasets to

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `kvstore` | `KVStoreReference` | No | | |

 ## Sample Configuration

@@ -14,7 +14,7 @@ HuggingFace datasets provider for accessing and managing datasets from the Huggi

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `kvstore` | `KVStoreReference` | No | | |

 ## Sample Configuration

@@ -17,7 +17,7 @@ NVIDIA's dataset I/O provider for accessing datasets from NVIDIA's data platform
 | `api_key` | `str \| None` | No | | The NVIDIA API key. |
 | `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
 | `project_id` | `str \| None` | No | test-project | The NVIDIA project ID. |
-| `datasets_url` | `<class 'str'>` | No | http://nemo.test | Base URL for the NeMo Dataset API |
+| `datasets_url` | `str` | No | http://nemo.test | Base URL for the NeMo Dataset API |

 ## Sample Configuration

@@ -1,7 +1,8 @@
 ---
-description: "Evaluations
+description: |
+Evaluations

-Llama Stack Evaluation API for running evaluations on model and agent candidates."
+Llama Stack Evaluation API for running evaluations on model and agent candidates.
 sidebar_label: Eval
 title: Eval
 ---
@@ -14,7 +14,7 @@ Meta's reference implementation of evaluation tasks with support for multiple la

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `kvstore` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
+| `kvstore` | `KVStoreReference` | No | | |

 ## Sample Configuration

@@ -14,7 +14,7 @@ NVIDIA's evaluation provider for running evaluation tasks on NVIDIA's platform.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `evaluator_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |
+| `evaluator_url` | `str` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |

 ## Sample Configuration

@@ -1,7 +1,8 @@
 ---
-description: "Files
+description: |
+Files

-This API is used to upload documents that can be used with other Llama Stack APIs."
+This API is used to upload documents that can be used with other Llama Stack APIs.
 sidebar_label: Files
 title: Files
 ---
@@ -14,9 +14,9 @@ Local filesystem-based file storage provider for managing files and documents lo

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `storage_dir` | `<class 'str'>` | No | | Directory to store uploaded files |
-| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
-| `ttl_secs` | `<class 'int'>` | No | 31536000 | |
+| `storage_dir` | `str` | No | | Directory to store uploaded files |
+| `metadata_store` | `SqlStoreReference` | No | | SQL store configuration for file metadata |
+| `ttl_secs` | `int` | No | 31536000 | |

 ## Sample Configuration

|
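The `ttl_secs` row above has no description. Assuming the name means a time-to-live in seconds (an assumption, since the table does not say), the default can be converted to days with a short check. `DEFAULT_TTL_SECS` is a name introduced here for illustration.

```python
from datetime import timedelta

# Default copied from the `ttl_secs` row in the table above.
DEFAULT_TTL_SECS = 31536000

ttl = timedelta(seconds=DEFAULT_TTL_SECS)
print(ttl.days)  # 365
```

Under that reading, the default corresponds to 365 days.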
@@ -14,8 +14,8 @@ OpenAI Files API provider for managing files through OpenAI's native file storag

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `api_key` | `<class 'str'>` | No | | OpenAI API key for authentication |
-| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
+| `api_key` | `str` | No | | OpenAI API key for authentication |
+| `metadata_store` | `SqlStoreReference` | No | | SQL store configuration for file metadata |

 ## Sample Configuration

@@ -14,13 +14,13 @@ AWS S3-based file storage provider for scalable cloud file management with metad

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `bucket_name` | `<class 'str'>` | No | | S3 bucket name to store files |
-| `region` | `<class 'str'>` | No | us-east-1 | AWS region where the bucket is located |
+| `bucket_name` | `str` | No | | S3 bucket name to store files |
+| `region` | `str` | No | us-east-1 | AWS region where the bucket is located |
 | `aws_access_key_id` | `str \| None` | No | | AWS access key ID (optional if using IAM roles) |
 | `aws_secret_access_key` | `str \| None` | No | | AWS secret access key (optional if using IAM roles) |
 | `endpoint_url` | `str \| None` | No | | Custom S3 endpoint URL (for MinIO, LocalStack, etc.) |
-| `auto_create_bucket` | `<class 'bool'>` | No | False | Automatically create the S3 bucket if it doesn't exist |
-| `metadata_store` | `<class 'llama_stack.core.storage.datatypes.SqlStoreReference'>` | No | | SQL store configuration for file metadata |
+| `auto_create_bucket` | `bool` | No | False | Automatically create the S3 bucket if it doesn't exist |
+| `metadata_store` | `SqlStoreReference` | No | | SQL store configuration for file metadata |

 ## Sample Configuration

@@ -1,12 +1,13 @@
 ---
-description: "Inference
+description: |
+Inference

-Llama Stack Inference API for generating completions, chat completions, and embeddings.
+Llama Stack Inference API for generating completions, chat completions, and embeddings.

-This API provides the raw interface to the underlying models. Three kinds of models are supported:
-- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
-- Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models: these models reorder the documents based on their relevance to a query."
+This API provides the raw interface to the underlying models. Three kinds of models are supported:
+- LLM models: these models generate "raw" and "chat" (conversational) completions.
+- Embedding models: these models generate embeddings to be used for semantic search.
+- Rerank models: these models reorder the documents based on their relevance to a query.
 sidebar_label: Inference
 title: Inference
 ---
@@ -16,12 +16,12 @@ Meta's reference implementation of inference with support for various model form
 |-------|------|----------|---------|-------------|
 | `model` | `str \| None` | No | | |
 | `torch_seed` | `int \| None` | No | | |
-| `max_seq_len` | `<class 'int'>` | No | 4096 | |
-| `max_batch_size` | `<class 'int'>` | No | 1 | |
+| `max_seq_len` | `int` | No | 4096 | |
+| `max_batch_size` | `int` | No | 1 | |
 | `model_parallel_size` | `int \| None` | No | | |
-| `create_distributed_process_group` | `<class 'bool'>` | No | True | |
+| `create_distributed_process_group` | `bool` | No | True | |
 | `checkpoint_dir` | `str \| None` | No | | |
-| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig, annotation=NoneType, required=True, discriminator='type'` | No | | |
+| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig \| None` | No | | |

 ## Sample Configuration

@@ -14,9 +14,9 @@ Anthropic inference provider for accessing Claude models and Anthropic's AI serv

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |

 ## Sample Configuration

|
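Most hunks in this commit change how types are printed in the table: `<class 'bool'>` becomes `bool`, and the truncated `list[str \| None` becomes `list[str] \| None`. The sketch below shows one way a documentation generator might render annotations for a Markdown table cell. It is an illustration only and is not claimed to be the project's generator; `render_type` is a hypothetical name.

```python
from typing import get_origin


def render_type(annotation) -> str:
    """Render a type annotation for a Markdown table cell (illustrative only)."""
    # Plain classes such as int or str: use the bare name, because repr()
    # would produce "<class 'int'>". The get_origin check keeps parametrized
    # generics like list[str] out of this branch on every Python version.
    if get_origin(annotation) is None and isinstance(annotation, type):
        return annotation.__name__
    # Generics, unions, and string annotations: use their string form, drop
    # the "typing." prefix, and escape pipes so they do not split the cell.
    text = str(annotation).replace("typing.", "")
    return text.replace("|", "\\|")


print(render_type(bool))
print(render_type(list[str]))
print(render_type("list[str] | None"))
```

Escaping the pipe matters because an unescaped `|` inside a cell is read as a column separator, which is one plausible way a type such as `list[str] | None` ends up cut short.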
@@ -21,10 +21,10 @@ https://learn.microsoft.com/en-us/azure/ai-foundry/openai/overview

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `api_base` | `<class 'pydantic.networks.HttpUrl'>` | No | | Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com) |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `api_base` | `HttpUrl` | No | | Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com) |
 | `api_version` | `str \| None` | No | | Azure API version for Azure (e.g., 2024-12-01-preview) |
 | `api_type` | `str \| None` | No | azure | Azure API type for Azure (e.g., azure) |

@@ -14,10 +14,10 @@ AWS Bedrock inference provider using OpenAI compatible endpoint.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `region_name` | `<class 'str'>` | No | us-east-2 | AWS Region for the Bedrock Runtime endpoint |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `region_name` | `str` | No | us-east-2 | AWS Region for the Bedrock Runtime endpoint |

 ## Sample Configuration

@@ -14,10 +14,10 @@ Cerebras inference provider for running models on Cerebras Cloud platform.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `base_url` | `<class 'str'>` | No | https://api.cerebras.ai | Base URL for the Cerebras API |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `base_url` | `str` | No | https://api.cerebras.ai | Base URL for the Cerebras API |

 ## Sample Configuration

@@ -14,9 +14,9 @@ Databricks inference provider for running models on Databricks' unified analytic

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_token` | `pydantic.types.SecretStr \| None` | No | | The Databricks API token |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_token` | `SecretStr \| None` | No | | The Databricks API token |
 | `url` | `str \| None` | No | | The URL for the Databricks model serving endpoint |

 ## Sample Configuration
@@ -14,10 +14,10 @@ Fireworks AI inference provider for Llama models and other AI models on the Fire

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `url` | `str` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |

 ## Sample Configuration

@@ -14,9 +14,9 @@ Google Gemini inference provider for accessing Gemini models and Google's AI ser

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |

 ## Sample Configuration

@@ -14,10 +14,10 @@ Groq inference provider for ultra-fast inference using Groq's LPU technology.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://api.groq.com | The URL for the Groq AI server |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `url` | `str` | No | https://api.groq.com | The URL for the Groq AI server |

 ## Sample Configuration

@@ -14,8 +14,8 @@ HuggingFace Inference Endpoints provider for dedicated model serving.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `endpoint_name` | `<class 'str'>` | No | | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
-| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
+| `endpoint_name` | `str` | No | | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
+| `api_token` | `SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |

 ## Sample Configuration

@@ -14,8 +14,8 @@ HuggingFace Inference API serverless provider for on-demand model inference.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `huggingface_repo` | `<class 'str'>` | No | | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
-| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
+| `huggingface_repo` | `str` | No | | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
+| `api_token` | `SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |

 ## Sample Configuration

@@ -14,10 +14,10 @@ Llama OpenAI-compatible provider for using Llama models with OpenAI API format.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `openai_compat_api_base` | `<class 'str'>` | No | https://api.llama.com/compat/v1/ | The URL for the Llama API server |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `openai_compat_api_base` | `str` | No | https://api.llama.com/compat/v1/ | The URL for the Llama API server |

 ## Sample Configuration

@@ -14,13 +14,13 @@ NVIDIA inference provider for accessing NVIDIA NIM models and AI services.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `url` | `<class 'str'>` | No | https://integrate.api.nvidia.com | A base url for accessing the NVIDIA NIM |
-| `timeout` | `<class 'int'>` | No | 60 | Timeout for the HTTP requests |
-| `append_api_version` | `<class 'bool'>` | No | True | When set to false, the API version will not be appended to the base_url. By default, it is true. |
-| `rerank_model_to_url` | `dict[str, str` | No | `{'nv-rerank-qa-mistral-4b:1': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/reranking', 'nvidia/nv-rerankqa-mistral-4b-v3': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/nv-rerankqa-mistral-4b-v3/reranking', 'nvidia/llama-3.2-nv-rerankqa-1b-v2': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/llama-3_2-nv-rerankqa-1b-v2/reranking'}` | Mapping of rerank model identifiers to their API endpoints. |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `url` | `str` | No | https://integrate.api.nvidia.com | A base url for accessing the NVIDIA NIM |
+| `timeout` | `int` | No | 60 | Timeout for the HTTP requests |
+| `append_api_version` | `bool` | No | True | When set to false, the API version will not be appended to the base_url. By default, it is true. |
+| `rerank_model_to_url` | `dict[str, str]` | No | `{'nv-rerank-qa-mistral-4b:1': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/reranking', 'nvidia/nv-rerankqa-mistral-4b-v3': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/nv-rerankqa-mistral-4b-v3/reranking', 'nvidia/llama-3.2-nv-rerankqa-1b-v2': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/llama-3_2-nv-rerankqa-1b-v2/reranking'}` | Mapping of rerank model identifiers to their API endpoints. |

 ## Sample Configuration

@@ -21,14 +21,14 @@ https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `oci_auth_type` | `<class 'str'>` | No | instance_principal | OCI authentication type (must be one of: instance_principal, config_file) |
-| `oci_region` | `<class 'str'>` | No | us-ashburn-1 | OCI region (e.g., us-ashburn-1) |
-| `oci_compartment_id` | `<class 'str'>` | No | | OCI compartment ID for the Generative AI service |
-| `oci_config_file_path` | `<class 'str'>` | No | ~/.oci/config | OCI config file path (required if oci_auth_type is config_file) |
-| `oci_config_profile` | `<class 'str'>` | No | DEFAULT | OCI config profile (required if oci_auth_type is config_file) |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `oci_auth_type` | `str` | No | instance_principal | OCI authentication type (must be one of: instance_principal, config_file) |
+| `oci_region` | `str` | No | us-ashburn-1 | OCI region (e.g., us-ashburn-1) |
+| `oci_compartment_id` | `str` | No | | OCI compartment ID for the Generative AI service |
+| `oci_config_file_path` | `str` | No | ~/.oci/config | OCI config file path (required if oci_auth_type is config_file) |
+| `oci_config_profile` | `str` | No | DEFAULT | OCI config profile (required if oci_auth_type is config_file) |

 ## Sample Configuration

|
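The OCI table above states two rules in its descriptions: `oci_auth_type` must be `instance_principal` or `config_file`, and the config file path and profile are required when it is `config_file`. The sketch below encodes those two rules with a plain dataclass. It assumes nothing about the provider's real config class; `OCISettings` and `validate` are hypothetical names, and the defaults are copied from the table.

```python
from dataclasses import dataclass

ALLOWED_AUTH_TYPES = ("instance_principal", "config_file")


@dataclass
class OCISettings:
    # Hypothetical stand-in for the provider config; defaults from the table.
    oci_auth_type: str = "instance_principal"
    oci_region: str = "us-ashburn-1"
    oci_compartment_id: str = ""
    oci_config_file_path: str = "~/.oci/config"
    oci_config_profile: str = "DEFAULT"

    def validate(self) -> None:
        # Rule 1: the auth type must be one of the two documented values.
        if self.oci_auth_type not in ALLOWED_AUTH_TYPES:
            raise ValueError(
                f"oci_auth_type must be one of {ALLOWED_AUTH_TYPES}, "
                f"got {self.oci_auth_type!r}"
            )
        # Rule 2: config-file auth needs both a file path and a profile.
        if self.oci_auth_type == "config_file":
            if not self.oci_config_file_path or not self.oci_config_profile:
                raise ValueError(
                    "oci_config_file_path and oci_config_profile are required "
                    "when oci_auth_type is 'config_file'"
                )


OCISettings().validate()  # defaults satisfy both rules
```

Checking both rules up front gives a clear error at configuration time instead of a failure on the first request.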
@@ -14,9 +14,9 @@ Ollama inference provider for running local models through the Ollama runtime.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `url` | `<class 'str'>` | No | http://localhost:11434 | |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `url` | `str` | No | http://localhost:11434 | |

 ## Sample Configuration

@@ -14,10 +14,10 @@ OpenAI inference provider for accessing GPT models and other OpenAI services.

 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
-| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
-| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
-| `base_url` | `<class 'str'>` | No | https://api.openai.com/v1 | Base URL for OpenAI API |
+| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
+| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
+| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
+| `base_url` | `str` | No | https://api.openai.com/v1 | Base URL for OpenAI API |

 ## Sample Configuration

@ -14,10 +14,10 @ Passthrough inference provider for connecting to any external inference service

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `<class 'str'>` | No | | The URL for the passthrough endpoint |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `str` | No | | The URL for the passthrough endpoint |

## Sample Configuration

@ -14,9 +14,9 @ RunPod inference provider for running models on RunPod's cloud GPU platform.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | The API token |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_token` | `SecretStr \| None` | No | | The API token |
| `url` | `str \| None` | No | | The URL for the Runpod model serving endpoint |

## Sample Configuration

@ -14,10 +14,10 @ SambaNova inference provider for running models on SambaNova's dataflow architec

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `str` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |

## Sample Configuration

@ -14,9 +14,9 @ Text Generation Inference (TGI) provider for HuggingFace model serving.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `url` | `<class 'str'>` | No | | The URL for the TGI serving endpoint |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `url` | `str` | No | | The URL for the TGI serving endpoint |

## Sample Configuration

@ -14,10 +14,10 @ Together AI inference provider for open-source models and collaborative AI devel

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `<class 'str'>` | No | https://api.together.xyz/v1 | The URL for the Together AI server |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `str` | No | https://api.together.xyz/v1 | The URL for the Together AI server |

## Sample Configuration

@ -53,10 +53,10 @ Available Models:

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `project` | `<class 'str'>` | No | | Google Cloud project ID for Vertex AI |
| `location` | `<class 'str'>` | No | us-central1 | Google Cloud location for Vertex AI |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `project` | `str` | No | | Google Cloud project ID for Vertex AI |
| `location` | `str` | No | us-central1 | Google Cloud location for Vertex AI |

## Sample Configuration

@ -14,11 +14,11 @ Remote vLLM inference provider for connecting to vLLM servers.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | The API token |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_token` | `SecretStr \| None` | No | | The API token |
| `url` | `str \| None` | No | | The URL for the vLLM model serving endpoint |
| `max_tokens` | `<class 'int'>` | No | 4096 | Maximum number of tokens to generate. |
| `max_tokens` | `int` | No | 4096 | Maximum number of tokens to generate. |
| `tls_verify` | `bool \| str` | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file. |
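
The `tls_verify` field accepts either a boolean or a path to a CA certificate file. A minimal sketch of how such a value might be turned into a TLS context with Python's standard `ssl` module follows; the helper name `build_ssl_context` is hypothetical, and this is an illustration of the documented semantics rather than the provider's actual code.

```python
import ssl
from typing import Union


def build_ssl_context(tls_verify: Union[bool, str]) -> ssl.SSLContext:
    """Hypothetical helper: map a tls_verify-style value to an SSLContext."""
    if isinstance(tls_verify, bool):
        # True: verify against the system's default trust store.
        ctx = ssl.create_default_context()
        if not tls_verify:
            # False: disable verification. check_hostname must be cleared
            # before verify_mode can be set to CERT_NONE.
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # A string is treated as a path to a CA certificate file (assumption).
    return ssl.create_default_context(cafile=tls_verify)
```

Passing `True` yields a context that requires valid certificates, `False` yields one that skips verification, and a string is loaded as the CA file (which raises an error if the file does not exist).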

## Sample Configuration

@ -14,12 +14,12 @ IBM WatsonX inference provider for accessing AI models on IBM's WatsonX platform

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `<class 'str'>` | No | https://us-south.ml.cloud.ibm.com | A base url for accessing the watsonx.ai |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `url` | `str` | No | https://us-south.ml.cloud.ibm.com | A base url for accessing the watsonx.ai |
| `project_id` | `str \| None` | No | | The watsonx.ai project ID |
| `timeout` | `<class 'int'>` | No | 60 | Timeout for the HTTP requests |
| `timeout` | `int` | No | 60 | Timeout for the HTTP requests |

## Sample Configuration

@ -14,23 +14,23 @ HuggingFace-based post-training provider for fine-tuning models using the Huggin

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `<class 'str'>` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
| `chat_template` | `<class 'str'>` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
| `model_specific_config` | `<class 'dict'>` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
| `save_total_limit` | `<class 'int'>` | No | 3 | |
| `logging_steps` | `<class 'int'>` | No | 10 | |
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
| `use_reference_model` | `<class 'bool'>` | No | True | |
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair'` | No | sigmoid | |
| `dpo_output_dir` | `<class 'str'>` | No | | |
| `device` | `str` | No | cuda | |
| `distributed_backend` | `Literal[fsdp, deepspeed] \| None` | No | | |
| `checkpoint_format` | `Literal[full_state, huggingface] \| None` | No | huggingface | |
| `chat_template` | `str` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
| `model_specific_config` | `dict` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
| `max_seq_length` | `int` | No | 2048 | |
| `gradient_checkpointing` | `bool` | No | False | |
| `save_total_limit` | `int` | No | 3 | |
| `logging_steps` | `int` | No | 10 | |
| `warmup_ratio` | `float` | No | 0.1 | |
| `weight_decay` | `float` | No | 0.01 | |
| `dataloader_num_workers` | `int` | No | 4 | |
| `dataloader_pin_memory` | `bool` | No | True | |
| `dpo_beta` | `float` | No | 0.1 | |
| `use_reference_model` | `bool` | No | True | |
| `dpo_loss_type` | `Literal[sigmoid, hinge, ipo, kto_pair]` | No | sigmoid | |
| `dpo_output_dir` | `str` | No | | |

## Sample Configuration

@ -15,7 +15,7 @ TorchTune-based post-training provider for fine-tuning and optimizing models usi
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |
| `checkpoint_format` | `Literal[meta, huggingface] \| None` | No | meta | |

## Sample Configuration

@ -15,7 +15,7 @ TorchTune-based post-training provider for fine-tuning and optimizing models usi
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |
| `checkpoint_format` | `Literal[meta, huggingface] \| None` | No | meta | |

## Sample Configuration

@ -18,9 +18,9 @ NVIDIA's post-training provider for fine-tuning models on NVIDIA's platform.
| `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
| `project_id` | `str \| None` | No | test-example-model@v1 | The NVIDIA project ID. |
| `customizer_url` | `str \| None` | No | | Base URL for the NeMo Customizer API |
| `timeout` | `<class 'int'>` | No | 300 | Timeout for the NVIDIA Post Training API |
| `max_retries` | `<class 'int'>` | No | 3 | Maximum number of retries for the NVIDIA Post Training API |
| `output_model_dir` | `<class 'str'>` | No | test-example-model@v1 | Directory to save the output model |
| `timeout` | `int` | No | 300 | Timeout for the NVIDIA Post Training API |
| `max_retries` | `int` | No | 3 | Maximum number of retries for the NVIDIA Post Training API |
| `output_model_dir` | `str` | No | test-example-model@v1 | Directory to save the output model |

## Sample Configuration

@ -1,7 +1,8 @
---
description: "Safety
description: |
Safety

OpenAI-compatible Moderations API."
OpenAI-compatible Moderations API.
sidebar_label: Safety
title: Safety
---

@ -14,7 +14,7 @ Llama Guard safety provider for content moderation and safety filtering using Me

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `excluded_categories` | `list[str` | No | [] | |
| `excluded_categories` | `list[str]` | No | [] | |

## Sample Configuration

@ -14,7 +14,7 @ Prompt Guard safety provider for detecting and filtering unsafe prompts and cont

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `guard_type` | `<class 'str'>` | No | injection | |
| `guard_type` | `str` | No | injection | |

## Sample Configuration

@ -14,8 +14,8 @ AWS Bedrock safety provider for content moderation using AWS's safety services.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `aws_access_key_id` | `str \| None` | No | | The AWS access key to use. Default use environment variable: AWS_ACCESS_KEY_ID |
| `aws_secret_access_key` | `str \| None` | No | | The AWS secret access key to use. Default use environment variable: AWS_SECRET_ACCESS_KEY |
| `aws_session_token` | `str \| None` | No | | The AWS session token to use. Default use environment variable: AWS_SESSION_TOKEN |

@ -14,7 +14,7 @ NVIDIA's safety provider for content moderation and safety filtering.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `guardrails_service_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the Guardrails service |
| `guardrails_service_url` | `str` | No | http://0.0.0.0:7331 | The url for accessing the Guardrails service |
| `config_id` | `str \| None` | No | self-check | Guardrails configuration ID to use from the Guardrails configuration store |

## Sample Configuration

@ -14,8 +14,8 @ SambaNova's safety provider for content moderation and safety filtering.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The SambaNova cloud API Key |
| `url` | `str` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
| `api_key` | `SecretStr \| None` | No | | The SambaNova cloud API Key |

## Sample Configuration

@ -15,7 +15,7 @ Bing Search tool for web search capabilities using Microsoft's search engine.
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | |
| `top_k` | `<class 'int'>` | No | 3 | |
| `top_k` | `int` | No | 3 | |

## Sample Configuration

@ -15,7 +15,7 @ Brave Search tool for web search capabilities with privacy-focused results.
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Brave Search API Key |
| `max_results` | `<class 'int'>` | No | 3 | The maximum number of results to return |
| `max_results` | `int` | No | 3 | The maximum number of results to return |

## Sample Configuration

@ -15,7 +15,7 @ Tavily Search tool for AI-optimized web search with structured results.
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Tavily Search API Key |
| `max_results` | `<class 'int'>` | No | 3 | The maximum number of results to return |
| `max_results` | `int` | No | 3 | The maximum number of results to return |

## Sample Configuration

@ -78,8 +78,8 @ See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introducti

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | | |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend |
| `db_path` | `str` | No | | |
| `persistence` | `KVStoreReference` | No | | Config for KV store backend |

## Sample Configuration

@ -95,7 +95,7 @ more details about Faiss in general.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `persistence` | `KVStoreReference` | No | | |

## Sample Configuration

@ -14,7 +14,7 @ Meta's reference implementation of a vector database.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `persistence` | `KVStoreReference` | No | | |

## Sample Configuration

@ -16,9 +16,9 @ Please refer to the remote provider documentation.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | | |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend (SQLite only for now) |
| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
| `db_path` | `str` | No | | |
| `persistence` | `KVStoreReference` | No | | Config for KV store backend (SQLite only for now) |
| `consistency_level` | `str` | No | Strong | The consistency level of the Milvus server |

## Sample Configuration

@ -97,8 +97,8 @ See the [Qdrant documentation](https://qdrant.tech/documentation/) for more deta

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `path` | `<class 'str'>` | No | | |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `path` | `str` | No | | |
| `persistence` | `KVStoreReference` | No | | |

## Sample Configuration

@ -407,8 +407,8 @ See [sqlite-vec's GitHub repo](https://github.com/asg017/sqlite-vec/tree/main) f

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | | Path to the SQLite database file |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend (SQLite only for now) |
| `db_path` | `str` | No | | Path to the SQLite database file |
| `persistence` | `KVStoreReference` | No | | Config for KV store backend (SQLite only for now) |

## Sample Configuration

@ -16,8 +16,8 @ Please refer to the sqlite-vec provider documentation.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | | Path to the SQLite database file |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend (SQLite only for now) |
| `db_path` | `str` | No | | Path to the SQLite database file |
| `persistence` | `KVStoreReference` | No | | Config for KV store backend (SQLite only for now) |

## Sample Configuration

@ -78,7 +78,7 @ See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introducti
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `str \| None` | No | | |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend |
| `persistence` | `KVStoreReference` | No | | Config for KV store backend |

## Sample Configuration

@ -405,10 +405,10 @ For more details on TLS configuration, refer to the [TLS setup guide](https://mi

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `uri` | `<class 'str'>` | No | | The URI of the Milvus server |
| `uri` | `str` | No | | The URI of the Milvus server |
| `token` | `str \| None` | No | | The token of the Milvus server |
| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | Config for KV store backend |
| `consistency_level` | `str` | No | Strong | The consistency level of the Milvus server |
| `persistence` | `KVStoreReference` | No | | Config for KV store backend |
| `config` | `dict` | No | `{}` | This configuration allows additional fields to be passed through to the underlying Milvus client. See the [Milvus](https://milvus.io/docs/install-overview.md) documentation for more details about Milvus in general. |

:::note

@ -218,7 +218,7 @ See [PGVector's documentation](https://github.com/pgvector/pgvector) for more de
| `db` | `str \| None` | No | postgres | |
| `user` | `str \| None` | No | postgres | |
| `password` | `str \| None` | No | mysecretpassword | |
| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference \| None` | No | | Config for KV store backend (SQLite only for now) |
| `persistence` | `KVStoreReference \| None` | No | | Config for KV store backend (SQLite only for now) |

## Sample Configuration

@ -19,14 +19,14 @ Please refer to the inline provider documentation.
| `location` | `str \| None` | No | | |
| `url` | `str \| None` | No | | |
| `port` | `int \| None` | No | 6333 | |
| `grpc_port` | `<class 'int'>` | No | 6334 | |
| `prefer_grpc` | `<class 'bool'>` | No | False | |
| `grpc_port` | `int` | No | 6334 | |
| `prefer_grpc` | `bool` | No | False | |
| `https` | `bool \| None` | No | | |
| `api_key` | `str \| None` | No | | |
| `prefix` | `str \| None` | No | | |
| `timeout` | `int \| None` | No | | |
| `host` | `str \| None` | No | | |
| `persistence` | `<class 'llama_stack.core.storage.datatypes.KVStoreReference'>` | No | | |
| `persistence` | `KVStoreReference` | No | | |

## Sample Configuration

@ -75,7 +75,7 @ See [Weaviate's documentation](https://weaviate.io/developers/weaviate) for more
|-------|------|----------|---------|-------------|
| `weaviate_api_key` | `str \| None` | No | | The API key for the Weaviate instance |
| `weaviate_cluster_url` | `str \| None` | No | localhost:8080 | The URL of the Weaviate cluster |
| `persistence` | `llama_stack.core.storage.datatypes.KVStoreReference \| None` | No | | Config for KV store backend (SQLite only for now) |
| `persistence` | `KVStoreReference \| None` | No | | Config for KV store backend (SQLite only for now) |

## Sample Configuration
