docs: auto generated documentation for providers (#2543)
# What does this PR do?

Simple approach to get some provider pages in the docs. Add or update description fields in the provider configuration class using Pydantic's `Field`, ensuring these descriptions are clear and complete, as they will be used to auto-generate provider documentation via `./scripts/distro_codegen.py` instead of editing the docs manually.
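As a concrete illustration of the convention this PR relies on (the class and field names below are hypothetical; only the `Field(description=...)` pattern is what the generator consumes), a provider configuration class might look like:

```python
from pydantic import BaseModel, Field


class ExampleProviderConfig(BaseModel):
    """Hypothetical provider config; each Field description below is what
    ./scripts/distro_codegen.py harvests into the generated Markdown tables."""

    url: str = Field(
        default="http://localhost:8000",
        description="The URL for the example model serving endpoint",
    )
    api_key: str | None = Field(
        default=None,
        description="The API key, only needed when using a hosted service",
    )
```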
Signed-off-by: Sébastien Han <seb@redhat.com>

parent 8d8e90d78e
commit c9a49a80e8

96 changed files with 2562 additions and 65 deletions
@@ -6,7 +6,7 @@ Llama Stack is a stateful service with REST APIs to support the seamless transit
 environments. You can build and test using a local server first and deploy to a hosted endpoint for production.
 
 In this guide, we'll walk through how to build a RAG application locally using Llama Stack with [Ollama](https://ollama.com/)
-as the inference [provider](../providers/index.md#inference) for a Llama Model.
+as the inference [provider](../providers/inference/index) for a Llama Model.
 
 #### Step 1: Install and setup
 1. Install [uv](https://docs.astral.sh/uv/)
`docs/source/providers/agents/index.md` (new file, 5 lines)

# Agents Providers

This section contains documentation for all available providers for the **agents** API.

- [inline::meta-reference](inline_meta-reference.md)

`docs/source/providers/agents/inline_meta-reference.md` (new file, 26 lines)

# inline::meta-reference

## Description

Meta's reference implementation of an agent system that can use tools, access vector databases, and perform complex reasoning tasks.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `persistence_store` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |
| `responses_store` | `utils.sqlstore.sqlstore.SqliteSqlStoreConfig \| utils.sqlstore.sqlstore.PostgresSqlStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
persistence_store:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/agents_store.db
responses_store:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/responses_store.db
```
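The sample configurations throughout this commit lean heavily on `${env.…}` placeholders that are expanded from environment variables when the stack starts. The diff itself does not spell out the semantics, but the `:=` form evidently supplies a fallback when the variable is unset. Here is a minimal, hypothetical sketch of such a resolver under that assumption (the `:+` form that appears in later samples is not modeled here):

```python
import os
import re

_ENV_VAR = re.compile(r"\$\{env\.(?P<name>[A-Za-z0-9_]+)(?::=(?P<default>[^}]*))?\}")


def resolve(value: str) -> str:
    """Expand ${env.VAR} and ${env.VAR:=default} placeholders (assumed semantics)."""

    def replace(match: re.Match) -> str:
        env_value = os.environ.get(match.group("name"))
        if env_value is None:
            # Fall back to the := default (or empty) when the variable is unset.
            env_value = match.group("default") or ""
        return env_value

    return _ENV_VAR.sub(replace, value)


# With SQLITE_STORE_DIR unset, this prints "~/.llama/dummy/agents_store.db".
print(resolve("${env.SQLITE_STORE_DIR:=~/.llama/dummy}/agents_store.db"))
```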
`docs/source/providers/datasetio/index.md` (new file, 7 lines)

# Datasetio Providers

This section contains documentation for all available providers for the **datasetio** API.

- [inline::localfs](inline_localfs.md)
- [remote::huggingface](remote_huggingface.md)
- [remote::nvidia](remote_nvidia.md)

`docs/source/providers/datasetio/inline_localfs.md` (new file, 22 lines)

# inline::localfs

## Description

Local filesystem-based dataset I/O provider for reading and writing datasets to local storage.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
kvstore:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/localfs_datasetio.db
```
`docs/source/providers/datasetio/remote_huggingface.md` (new file, 22 lines)

# remote::huggingface

## Description

HuggingFace datasets provider for accessing and managing datasets from the HuggingFace Hub.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
kvstore:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/huggingface_datasetio.db
```
`docs/source/providers/datasetio/remote_nvidia.md` (new file, 25 lines)

# remote::nvidia

## Description

NVIDIA's dataset I/O provider for accessing datasets from NVIDIA's data platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The NVIDIA API key. |
| `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
| `project_id` | `str \| None` | No | test-project | The NVIDIA project ID. |
| `datasets_url` | `<class 'str'>` | No | http://nemo.test | Base URL for the NeMo Dataset API |

## Sample Configuration

```yaml
api_key: ${env.NVIDIA_API_KEY:+}
dataset_namespace: ${env.NVIDIA_DATASET_NAMESPACE:=default}
project_id: ${env.NVIDIA_PROJECT_ID:=test-project}
datasets_url: ${env.NVIDIA_DATASETS_URL:=http://nemo.test}
```
`docs/source/providers/eval/index.md` (new file, 6 lines)

# Eval Providers

This section contains documentation for all available providers for the **eval** API.

- [inline::meta-reference](inline_meta-reference.md)
- [remote::nvidia](remote_nvidia.md)
`docs/source/providers/eval/inline_meta-reference.md` (new file, 22 lines)

# inline::meta-reference

## Description

Meta's reference implementation of evaluation tasks with support for multiple languages and evaluation metrics.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
kvstore:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/meta_reference_eval.db
```
`docs/source/providers/eval/remote_nvidia.md` (new file, 19 lines)

# remote::nvidia

## Description

NVIDIA's evaluation provider for running evaluation tasks on NVIDIA's platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `evaluator_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the evaluator service |

## Sample Configuration

```yaml
evaluator_url: ${env.NVIDIA_EVALUATOR_URL:=http://localhost:7331}
```
`docs/source/providers/files/index.md` (new file, 5 lines)

# Files Providers

This section contains documentation for all available providers for the **files** API.

- [inline::localfs](inline_localfs.md)
`docs/source/providers/files/inline_localfs.md` (new file, 24 lines)

# inline::localfs

## Description

Local filesystem-based file storage provider for managing files and documents locally.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `storage_dir` | `<class 'str'>` | No | PydanticUndefined | Directory to store uploaded files |
| `metadata_store` | `utils.sqlstore.sqlstore.SqliteSqlStoreConfig \| utils.sqlstore.sqlstore.PostgresSqlStoreConfig` | No | sqlite | SQL store configuration for file metadata |
| `ttl_secs` | `<class 'int'>` | No | 31536000 | |

## Sample Configuration

```yaml
storage_dir: ${env.FILES_STORAGE_DIR:=~/.llama/dummy/files}
metadata_store:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/files_metadata.db
```
`docs/source/providers/index.md` (modified)

@@ -18,60 +18,92 @@ Llama Stack supports external providers that live outside of the main codebase.
 ## Agents
 Run multi-step agentic workflows with LLMs with tool usage, memory (RAG), etc.
 
+```{toctree}
+:maxdepth: 1
+
+agents/index
+```
+
 ## DatasetIO
 Interfaces with datasets and data loaders.
 
-## Eval
-Generates outputs (via Inference or Agents) and performs scoring.
-
-## Inference
-Runs inference with an LLM.
-
-## Post Training
-Fine-tunes a model.
-
-#### Post Training Providers
-The following providers are available for Post Training:
-
 ```{toctree}
 :maxdepth: 1
 
-external
-post_training/huggingface
-post_training/torchtune
-post_training/nvidia_nemo
+datasetio/index
 ```
 
+## Eval
+Generates outputs (via Inference or Agents) and performs scoring.
+
+```{toctree}
+:maxdepth: 1
+
+eval/index
+```
+
+## Inference
+Runs inference with an LLM.
+
+```{toctree}
+:maxdepth: 1
+
+inference/index
+```
+
+## Post Training
+Fine-tunes a model.
+
+```{toctree}
+:maxdepth: 1
+
+post_training/index
+```
+
 ## Safety
 Applies safety policies to the output at a Systems (not only model) level.
 
+```{toctree}
+:maxdepth: 1
+
+safety/index
+```
+
 ## Scoring
 Evaluates the outputs of the system.
 
+```{toctree}
+:maxdepth: 1
+
+scoring/index
+```
+
 ## Telemetry
 Collects telemetry data from the system.
 
+```{toctree}
+:maxdepth: 1
+
+telemetry/index
+```
+
 ## Tool Runtime
 Is associated with the ToolGroup resources.
 
+```{toctree}
+:maxdepth: 1
+
+tool_runtime/index
+```
+
 ## Vector IO
 
 Vector IO refers to operations on vector databases, such as adding documents, searching, and deleting documents.
 Vector IO plays a crucial role in [Retrieval Augmented Generation (RAG)](../../building_applications/rag), where the vector
 io and database are used to store and retrieve documents for retrieval.
 
 #### Vector IO Providers
 The following providers (i.e., databases) are available for Vector IO:
 
 ```{toctree}
 :maxdepth: 1
 
 external
 vector_io/faiss
 vector_io/sqlite-vec
 vector_io/chromadb
 vector_io/pgvector
 vector_io/qdrant
 vector_io/milvus
 vector_io/weaviate
+vector_io/index
 ```
`docs/source/providers/inference/index.md` (new file, 32 lines)

# Inference Providers

This section contains documentation for all available providers for the **inference** API.

- [inline::meta-reference](inline_meta-reference.md)
- [inline::sentence-transformers](inline_sentence-transformers.md)
- [inline::vllm](inline_vllm.md)
- [remote::anthropic](remote_anthropic.md)
- [remote::bedrock](remote_bedrock.md)
- [remote::cerebras](remote_cerebras.md)
- [remote::cerebras-openai-compat](remote_cerebras-openai-compat.md)
- [remote::databricks](remote_databricks.md)
- [remote::fireworks](remote_fireworks.md)
- [remote::fireworks-openai-compat](remote_fireworks-openai-compat.md)
- [remote::gemini](remote_gemini.md)
- [remote::groq](remote_groq.md)
- [remote::groq-openai-compat](remote_groq-openai-compat.md)
- [remote::hf::endpoint](remote_hf_endpoint.md)
- [remote::hf::serverless](remote_hf_serverless.md)
- [remote::llama-openai-compat](remote_llama-openai-compat.md)
- [remote::nvidia](remote_nvidia.md)
- [remote::ollama](remote_ollama.md)
- [remote::openai](remote_openai.md)
- [remote::passthrough](remote_passthrough.md)
- [remote::runpod](remote_runpod.md)
- [remote::sambanova](remote_sambanova.md)
- [remote::sambanova-openai-compat](remote_sambanova-openai-compat.md)
- [remote::tgi](remote_tgi.md)
- [remote::together](remote_together.md)
- [remote::together-openai-compat](remote_together-openai-compat.md)
- [remote::vllm](remote_vllm.md)
- [remote::watsonx](remote_watsonx.md)
`docs/source/providers/inference/inline_meta-reference.md` (new file, 32 lines)

# inline::meta-reference

## Description

Meta's reference implementation of inference with support for various model formats and optimization techniques.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `model` | `str \| None` | No | | |
| `torch_seed` | `int \| None` | No | | |
| `max_seq_len` | `<class 'int'>` | No | 4096 | |
| `max_batch_size` | `<class 'int'>` | No | 1 | |
| `model_parallel_size` | `int \| None` | No | | |
| `create_distributed_process_group` | `<class 'bool'>` | No | True | |
| `checkpoint_dir` | `str \| None` | No | | |
| `quantization` | `Bf16QuantizationConfig \| Fp8QuantizationConfig \| Int4QuantizationConfig` | No | | |

## Sample Configuration

```yaml
model: Llama3.2-3B-Instruct
checkpoint_dir: ${env.CHECKPOINT_DIR:=null}
quantization:
  type: ${env.QUANTIZATION_TYPE:=bf16}
model_parallel_size: ${env.MODEL_PARALLEL_SIZE:=0}
max_batch_size: ${env.MAX_BATCH_SIZE:=1}
max_seq_len: ${env.MAX_SEQ_LEN:=4096}
```
`docs/source/providers/inference/inline_sentence-transformers.md` (new file, 13 lines)

# inline::sentence-transformers

## Description

Sentence Transformers inference provider for text embeddings and similarity search.

## Sample Configuration

```yaml
{}
```
`docs/source/providers/inference/inline_vllm.md` (new file, 29 lines)

# inline::vllm

## Description

vLLM inference provider for high-performance model serving with PagedAttention and continuous batching.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `tensor_parallel_size` | `<class 'int'>` | No | 1 | Number of tensor parallel replicas (number of GPUs to use). |
| `max_tokens` | `<class 'int'>` | No | 4096 | Maximum number of tokens to generate. |
| `max_model_len` | `<class 'int'>` | No | 4096 | Maximum context length to use during serving. |
| `max_num_seqs` | `<class 'int'>` | No | 4 | Maximum parallel batch size for generation. |
| `enforce_eager` | `<class 'bool'>` | No | False | Whether to use eager mode for inference (otherwise cuda graphs are used). |
| `gpu_memory_utilization` | `<class 'float'>` | No | 0.3 | How much GPU memory will be allocated when this provider has finished loading, including memory that was already allocated before loading. |

## Sample Configuration

```yaml
tensor_parallel_size: ${env.TENSOR_PARALLEL_SIZE:=1}
max_tokens: ${env.MAX_TOKENS:=4096}
max_model_len: ${env.MAX_MODEL_LEN:=4096}
max_num_seqs: ${env.MAX_NUM_SEQS:=4}
enforce_eager: ${env.ENFORCE_EAGER:=False}
gpu_memory_utilization: ${env.GPU_MEMORY_UTILIZATION:=0.3}
```
`docs/source/providers/inference/remote_anthropic.md` (new file, 19 lines)

# remote::anthropic

## Description

Anthropic inference provider for accessing Claude models and Anthropic's AI services.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | API key for Anthropic models |

## Sample Configuration

```yaml
api_key: ${env.ANTHROPIC_API_KEY}
```
`docs/source/providers/inference/remote_bedrock.md` (new file, 28 lines)

# remote::bedrock

## Description

AWS Bedrock inference provider for accessing various AI models through AWS's managed service.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `aws_access_key_id` | `str \| None` | No | | The AWS access key to use. Default use environment variable: AWS_ACCESS_KEY_ID |
| `aws_secret_access_key` | `str \| None` | No | | The AWS secret access key to use. Default use environment variable: AWS_SECRET_ACCESS_KEY |
| `aws_session_token` | `str \| None` | No | | The AWS session token to use. Default use environment variable: AWS_SESSION_TOKEN |
| `region_name` | `str \| None` | No | | The default AWS Region to use, for example, us-west-1 or us-west-2. Default use environment variable: AWS_DEFAULT_REGION |
| `profile_name` | `str \| None` | No | | The profile name that contains credentials to use. Default use environment variable: AWS_PROFILE |
| `total_max_attempts` | `int \| None` | No | | An integer representing the maximum number of attempts that will be made for a single request, including the initial attempt. Default use environment variable: AWS_MAX_ATTEMPTS |
| `retry_mode` | `str \| None` | No | | A string representing the type of retries Boto3 will perform. Default use environment variable: AWS_RETRY_MODE |
| `connect_timeout` | `float \| None` | No | 60 | The time in seconds till a timeout exception is thrown when attempting to make a connection. The default is 60 seconds. |
| `read_timeout` | `float \| None` | No | 60 | The time in seconds till a timeout exception is thrown when attempting to read from a connection. The default is 60 seconds. |
| `session_ttl` | `int \| None` | No | 3600 | The time in seconds till a session expires. The default is 3600 seconds (1 hour). |

## Sample Configuration

```yaml
{}
```
`docs/source/providers/inference/remote_cerebras-openai-compat.md` (new file, 21 lines)

# remote::cerebras-openai-compat

## Description

Cerebras OpenAI-compatible provider for using Cerebras models with OpenAI API format.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Cerebras API key |
| `openai_compat_api_base` | `<class 'str'>` | No | https://api.cerebras.ai/v1 | The URL for the Cerebras API server |

## Sample Configuration

```yaml
openai_compat_api_base: https://api.cerebras.ai/v1
api_key: ${env.CEREBRAS_API_KEY}
```
`docs/source/providers/inference/remote_cerebras.md` (new file, 21 lines)

# remote::cerebras

## Description

Cerebras inference provider for running models on Cerebras Cloud platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `base_url` | `<class 'str'>` | No | https://api.cerebras.ai | Base URL for the Cerebras API |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Cerebras API Key |

## Sample Configuration

```yaml
base_url: https://api.cerebras.ai
api_key: ${env.CEREBRAS_API_KEY}
```
`docs/source/providers/inference/remote_databricks.md` (new file, 21 lines)

# remote::databricks

## Description

Databricks inference provider for running models on Databricks' unified analytics platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | | The URL for the Databricks model serving endpoint |
| `api_token` | `<class 'str'>` | No | | The Databricks API token |

## Sample Configuration

```yaml
url: ${env.DATABRICKS_URL}
api_token: ${env.DATABRICKS_API_TOKEN}
```
`docs/source/providers/inference/remote_fireworks-openai-compat.md` (new file, 21 lines)

# remote::fireworks-openai-compat

## Description

Fireworks AI OpenAI-compatible provider for using Fireworks models with OpenAI API format.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Fireworks API key |
| `openai_compat_api_base` | `<class 'str'>` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks API server |

## Sample Configuration

```yaml
openai_compat_api_base: https://api.fireworks.ai/inference/v1
api_key: ${env.FIREWORKS_API_KEY}
```
`docs/source/providers/inference/remote_fireworks.md` (new file, 21 lines)

# remote::fireworks

## Description

Fireworks AI inference provider for Llama models and other AI models on the Fireworks platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks server |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The Fireworks.ai API Key |

## Sample Configuration

```yaml
url: https://api.fireworks.ai/inference/v1
api_key: ${env.FIREWORKS_API_KEY}
```
`docs/source/providers/inference/remote_gemini.md` (new file, 19 lines)

# remote::gemini

## Description

Google Gemini inference provider for accessing Gemini models and Google's AI services.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | API key for Gemini models |

## Sample Configuration

```yaml
api_key: ${env.GEMINI_API_KEY}
```
`docs/source/providers/inference/remote_groq-openai-compat.md` (new file, 21 lines)

# remote::groq-openai-compat

## Description

Groq OpenAI-compatible provider for using Groq models with OpenAI API format.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Groq API key |
| `openai_compat_api_base` | `<class 'str'>` | No | https://api.groq.com/openai/v1 | The URL for the Groq API server |

## Sample Configuration

```yaml
openai_compat_api_base: https://api.groq.com/openai/v1
api_key: ${env.GROQ_API_KEY}
```
`docs/source/providers/inference/remote_groq.md` (new file, 21 lines)

# remote::groq

## Description

Groq inference provider for ultra-fast inference using Groq's LPU technology.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Groq API key |
| `url` | `<class 'str'>` | No | https://api.groq.com | The URL for the Groq AI server |

## Sample Configuration

```yaml
url: https://api.groq.com
api_key: ${env.GROQ_API_KEY}
```
`docs/source/providers/inference/remote_hf_endpoint.md` (new file, 21 lines)

# remote::hf::endpoint

## Description

HuggingFace Inference Endpoints provider for dedicated model serving.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `endpoint_name` | `<class 'str'>` | No | PydanticUndefined | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |

## Sample Configuration

```yaml
endpoint_name: ${env.INFERENCE_ENDPOINT_NAME}
api_token: ${env.HF_API_TOKEN}
```
`docs/source/providers/inference/remote_hf_serverless.md` (new file, 21 lines)

# remote::hf::serverless

## Description

HuggingFace Inference API serverless provider for on-demand model inference.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `huggingface_repo` | `<class 'str'>` | No | PydanticUndefined | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
| `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |

## Sample Configuration

```yaml
huggingface_repo: ${env.INFERENCE_MODEL}
api_token: ${env.HF_API_TOKEN}
```
`docs/source/providers/inference/remote_llama-openai-compat.md` (new file, 21 lines)

# remote::llama-openai-compat

## Description

Llama OpenAI-compatible provider for using Llama models with OpenAI API format.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Llama API key |
| `openai_compat_api_base` | `<class 'str'>` | No | https://api.llama.com/compat/v1/ | The URL for the Llama API server |

## Sample Configuration

```yaml
openai_compat_api_base: https://api.llama.com/compat/v1/
api_key: ${env.LLAMA_API_KEY}
```
`docs/source/providers/inference/remote_nvidia.md` (new file, 24 lines)

# remote::nvidia

## Description

NVIDIA inference provider for accessing NVIDIA NIM models and AI services.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://integrate.api.nvidia.com | A base URL for accessing the NVIDIA NIM |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The NVIDIA API key, only needed if using the hosted service |
| `timeout` | `<class 'int'>` | No | 60 | Timeout for the HTTP requests |
| `append_api_version` | `<class 'bool'>` | No | True | When set to false, the API version will not be appended to the base_url. By default, it is true. |

## Sample Configuration

```yaml
url: ${env.NVIDIA_BASE_URL:=https://integrate.api.nvidia.com}
api_key: ${env.NVIDIA_API_KEY:+}
append_api_version: ${env.NVIDIA_APPEND_API_VERSION:=True}
```
`docs/source/providers/inference/remote_ollama.md` (new file, 21 lines)

# remote::ollama

## Description

Ollama inference provider for running local models through the Ollama runtime.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | http://localhost:11434 | |
| `raise_on_connect_error` | `<class 'bool'>` | No | True | |

## Sample Configuration

```yaml
url: ${env.OLLAMA_URL:=http://localhost:11434}
raise_on_connect_error: true
```
`docs/source/providers/inference/remote_openai.md` (new file, 19 lines)

# remote::openai

## Description

OpenAI inference provider for accessing GPT models and other OpenAI services.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | API key for OpenAI models |

## Sample Configuration

```yaml
api_key: ${env.OPENAI_API_KEY}
```
`docs/source/providers/inference/remote_passthrough.md` (new file, 21 lines)

# remote::passthrough

## Description

Passthrough inference provider for connecting to any external inference service not directly supported.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | | The URL for the passthrough endpoint |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | API Key for the passthrough endpoint |

## Sample Configuration

```yaml
url: ${env.PASSTHROUGH_URL}
api_key: ${env.PASSTHROUGH_API_KEY}
```
`docs/source/providers/inference/remote_runpod.md` (new file, 21 lines)

# remote::runpod

## Description

RunPod inference provider for running models on RunPod's cloud GPU platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `str \| None` | No | | The URL for the Runpod model serving endpoint |
| `api_token` | `str \| None` | No | | The API token |

## Sample Configuration

```yaml
url: ${env.RUNPOD_URL:+}
api_token: ${env.RUNPOD_API_TOKEN:+}
```
`docs/source/providers/inference/remote_sambanova-openai-compat.md` (new file, 21 lines)

# remote::sambanova-openai-compat

## Description

SambaNova OpenAI-compatible provider for using SambaNova models with OpenAI API format.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The SambaNova API key |
| `openai_compat_api_base` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova API server |

## Sample Configuration

```yaml
openai_compat_api_base: https://api.sambanova.ai/v1
api_key: ${env.SAMBANOVA_API_KEY}
```
`docs/source/providers/inference/remote_sambanova.md` (new file, 21 lines)

# remote::sambanova

## Description

SambaNova inference provider for running models on SambaNova's dataflow architecture.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The SambaNova cloud API Key |

## Sample Configuration

```yaml
url: https://api.sambanova.ai/v1
api_key: ${env.SAMBANOVA_API_KEY}
```
`docs/source/providers/inference/remote_tgi.md` (new file, 19 lines)

# remote::tgi

## Description

Text Generation Inference (TGI) provider for HuggingFace model serving.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | PydanticUndefined | The URL for the TGI serving endpoint |

## Sample Configuration

```yaml
url: ${env.TGI_URL}
```
`docs/source/providers/inference/remote_together-openai-compat.md` (new file, 21 lines)

# remote::together-openai-compat

## Description

Together AI OpenAI-compatible provider for using Together models with OpenAI API format.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Together API key |
| `openai_compat_api_base` | `<class 'str'>` | No | https://api.together.xyz/v1 | The URL for the Together API server |

## Sample Configuration

```yaml
openai_compat_api_base: https://api.together.xyz/v1
api_key: ${env.TOGETHER_API_KEY}
```
`docs/source/providers/inference/remote_together.md` (new file, 21 lines)

# remote::together

## Description

Together AI inference provider for open-source models and collaborative AI development.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://api.together.xyz/v1 | The URL for the Together AI server |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The Together AI API Key |

## Sample Configuration

```yaml
url: https://api.together.xyz/v1
api_key: ${env.TOGETHER_API_KEY:+}
```
`docs/source/providers/inference/remote_vllm.md` (new file, 25 lines)

# remote::vllm

## Description

Remote vLLM inference provider for connecting to vLLM servers.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `str \| None` | No | | The URL for the vLLM model serving endpoint |
| `max_tokens` | `<class 'int'>` | No | 4096 | Maximum number of tokens to generate. |
| `api_token` | `str \| None` | No | fake | The API token |
| `tls_verify` | `bool \| str` | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file. |

## Sample Configuration

```yaml
url: ${env.VLLM_URL}
max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
api_token: ${env.VLLM_API_TOKEN:=fake}
tls_verify: ${env.VLLM_TLS_VERIFY:=true}
```
`docs/source/providers/inference/remote_watsonx.md` (new file, 24 lines)

# remote::watsonx

## Description

IBM WatsonX inference provider for accessing AI models on IBM's WatsonX platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://us-south.ml.cloud.ibm.com | A base URL for accessing watsonx.ai |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The watsonx API key, only needed if using the hosted service |
| `project_id` | `str \| None` | No | | The Project ID key, only needed if using the hosted service |
| `timeout` | `<class 'int'>` | No | 60 | Timeout for the HTTP requests |

## Sample Configuration

```yaml
url: ${env.WATSONX_BASE_URL:=https://us-south.ml.cloud.ibm.com}
api_key: ${env.WATSONX_API_KEY:+}
project_id: ${env.WATSONX_PROJECT_ID:+}
```
`docs/source/providers/post_training/index.md` (new file, 7 lines)

# Post_Training Providers

This section contains documentation for all available providers for the **post_training** API.

- [inline::huggingface](inline_huggingface.md)
- [inline::torchtune](inline_torchtune.md)
- [remote::nvidia](remote_nvidia.md)
`docs/source/providers/post_training/inline_huggingface.md` (new file, 36 lines)

# inline::huggingface

## Description

HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `<class 'str'>` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed']` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface']` | No | huggingface | |
| `chat_template` | `<class 'str'>` | No | `<\|user\|>\n{input}\n<\|assistant\|>\n{output}` | |
| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
| `save_total_limit` | `<class 'int'>` | No | 3 | |
| `logging_steps` | `<class 'int'>` | No | 10 | |
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |

## Sample Configuration

```yaml
checkpoint_format: huggingface
distributed_backend: null
device: cpu
```
`docs/source/providers/post_training/inline_torchtune.md` (new file, 20 lines)

# inline::torchtune

## Description

TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface']` | No | meta | |

## Sample Configuration

```yaml
checkpoint_format: meta
```
`docs/source/providers/post_training/remote_nvidia.md` (new file, 28 lines)

# remote::nvidia

## Description

NVIDIA's post-training provider for fine-tuning models on NVIDIA's platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The NVIDIA API key. |
| `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
| `project_id` | `str \| None` | No | test-example-model@v1 | The NVIDIA project ID. |
| `customizer_url` | `str \| None` | No | | Base URL for the NeMo Customizer API |
| `timeout` | `<class 'int'>` | No | 300 | Timeout for the NVIDIA Post Training API |
| `max_retries` | `<class 'int'>` | No | 3 | Maximum number of retries for the NVIDIA Post Training API |
| `output_model_dir` | `<class 'str'>` | No | test-example-model@v1 | Directory to save the output model |

## Sample Configuration

```yaml
api_key: ${env.NVIDIA_API_KEY:+}
dataset_namespace: ${env.NVIDIA_DATASET_NAMESPACE:=default}
project_id: ${env.NVIDIA_PROJECT_ID:=test-project}
customizer_url: ${env.NVIDIA_CUSTOMIZER_URL:=http://nemo.test}
```
`docs/source/providers/safety/index.md` (new file, 10 lines)

# Safety Providers

This section contains documentation for all available providers for the **safety** API.

- [inline::code-scanner](inline_code-scanner.md)
- [inline::llama-guard](inline_llama-guard.md)
- [inline::prompt-guard](inline_prompt-guard.md)
- [remote::bedrock](remote_bedrock.md)
- [remote::nvidia](remote_nvidia.md)
- [remote::sambanova](remote_sambanova.md)
`docs/source/providers/safety/inline_code-scanner.md` (new file, 13 lines)

# inline::code-scanner

## Description

Code Scanner safety provider for detecting security vulnerabilities and unsafe code patterns.

## Sample Configuration

```yaml
{}
```
`docs/source/providers/safety/inline_llama-guard.md` (new file, 19 lines)

# inline::llama-guard

## Description

Llama Guard safety provider for content moderation and safety filtering using Meta's Llama Guard model.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `excluded_categories` | `list[str]` | No | [] | |

## Sample Configuration

```yaml
excluded_categories: []
```
`docs/source/providers/safety/inline_prompt-guard.md` (new file, 19 lines)

# inline::prompt-guard

## Description

Prompt Guard safety provider for detecting and filtering unsafe prompts and content.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `guard_type` | `<class 'str'>` | No | injection | |

## Sample Configuration

```yaml
guard_type: injection
```
`docs/source/providers/safety/remote_bedrock.md` (new file, 28 lines)

# remote::bedrock

## Description

AWS Bedrock safety provider for content moderation using AWS's safety services.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `aws_access_key_id` | `str \| None` | No | | The AWS access key to use. Default use environment variable: AWS_ACCESS_KEY_ID |
| `aws_secret_access_key` | `str \| None` | No | | The AWS secret access key to use. Default use environment variable: AWS_SECRET_ACCESS_KEY |
| `aws_session_token` | `str \| None` | No | | The AWS session token to use. Default use environment variable: AWS_SESSION_TOKEN |
| `region_name` | `str \| None` | No | | The default AWS Region to use, for example, us-west-1 or us-west-2. Default use environment variable: AWS_DEFAULT_REGION |
| `profile_name` | `str \| None` | No | | The profile name that contains credentials to use. Default use environment variable: AWS_PROFILE |
| `total_max_attempts` | `int \| None` | No | | An integer representing the maximum number of attempts that will be made for a single request, including the initial attempt. Default use environment variable: AWS_MAX_ATTEMPTS |
| `retry_mode` | `str \| None` | No | | A string representing the type of retries Boto3 will perform. Default use environment variable: AWS_RETRY_MODE |
| `connect_timeout` | `float \| None` | No | 60 | The time in seconds till a timeout exception is thrown when attempting to make a connection. The default is 60 seconds. |
| `read_timeout` | `float \| None` | No | 60 | The time in seconds till a timeout exception is thrown when attempting to read from a connection. The default is 60 seconds. |
| `session_ttl` | `int \| None` | No | 3600 | The time in seconds till a session expires. The default is 3600 seconds (1 hour). |

## Sample Configuration

```yaml
{}
```
`docs/source/providers/safety/remote_nvidia.md` (new file, 21 lines)

# remote::nvidia

## Description

NVIDIA's safety provider for content moderation and safety filtering.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `guardrails_service_url` | `<class 'str'>` | No | http://0.0.0.0:7331 | The url for accessing the Guardrails service |
| `config_id` | `str \| None` | No | self-check | Guardrails configuration ID to use from the Guardrails configuration store |

## Sample Configuration

```yaml
guardrails_service_url: ${env.GUARDRAILS_SERVICE_URL:=http://localhost:7331}
config_id: ${env.NVIDIA_GUARDRAILS_CONFIG_ID:=self-check}
```
`docs/source/providers/safety/remote_sambanova.md` (new file, 21 lines)

# remote::sambanova

## Description

SambaNova's safety provider for content moderation and safety filtering.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `<class 'str'>` | No | https://api.sambanova.ai/v1 | The URL for the SambaNova AI server |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | The SambaNova cloud API Key |

## Sample Configuration

```yaml
url: https://api.sambanova.ai/v1
api_key: ${env.SAMBANOVA_API_KEY}
```
`docs/source/providers/scoring/index.md` (new file, 7 lines)

# Scoring Providers

This section contains documentation for all available providers for the **scoring** API.

- [inline::basic](inline_basic.md)
- [inline::braintrust](inline_braintrust.md)
- [inline::llm-as-judge](inline_llm-as-judge.md)
`docs/source/providers/scoring/inline_basic.md` (new file, 13 lines)

# inline::basic

## Description

Basic scoring provider for simple evaluation metrics and scoring functions.

## Sample Configuration

```yaml
{}
```
`docs/source/providers/scoring/inline_braintrust.md` (new file, 19 lines)

# inline::braintrust

## Description

Braintrust scoring provider for evaluation and scoring using the Braintrust platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `openai_api_key` | `str \| None` | No | | The OpenAI API Key |

## Sample Configuration

```yaml
openai_api_key: ${env.OPENAI_API_KEY:+}
```
`docs/source/providers/scoring/inline_llm-as-judge.md` (new file, 13 lines)

# inline::llm-as-judge

## Description

LLM-as-judge scoring provider that uses language models to evaluate and score responses.

## Sample Configuration

```yaml
{}
```
`docs/source/providers/telemetry/index.md` (new file, 5 lines)

# Telemetry Providers

This section contains documentation for all available providers for the **telemetry** API.

- [inline::meta-reference](inline_meta-reference.md)
`docs/source/providers/telemetry/inline_meta-reference.md` (new file, 25 lines)

# inline::meta-reference

## Description

Meta's reference implementation of telemetry and observability using OpenTelemetry.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `otel_trace_endpoint` | `str \| None` | No | | The OpenTelemetry collector endpoint URL for traces |
| `otel_metric_endpoint` | `str \| None` | No | | The OpenTelemetry collector endpoint URL for metrics |
| `service_name` | `<class 'str'>` | No | | The service name to use for telemetry |
| `sinks` | `list[inline.telemetry.meta_reference.config.TelemetrySink]` | No | [<TelemetrySink.CONSOLE: 'console'>, <TelemetrySink.SQLITE: 'sqlite'>] | List of telemetry sinks to enable (possible values: otel, sqlite, console) |
| `sqlite_db_path` | `<class 'str'>` | No | ~/.llama/runtime/trace_store.db | The path to the SQLite database to use for storing traces |

## Sample Configuration

```yaml
service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
sinks: ${env.TELEMETRY_SINKS:=console,sqlite}
sqlite_db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/trace_store.db
```
`docs/source/providers/tool_runtime/index.md` (new file, 10 lines)

# Tool_Runtime Providers

This section contains documentation for all available providers for the **tool_runtime** API.

- [inline::rag-runtime](inline_rag-runtime.md)
- [remote::bing-search](remote_bing-search.md)
- [remote::brave-search](remote_brave-search.md)
- [remote::model-context-protocol](remote_model-context-protocol.md)
- [remote::tavily-search](remote_tavily-search.md)
- [remote::wolfram-alpha](remote_wolfram-alpha.md)
`docs/source/providers/tool_runtime/inline_rag-runtime.md` (new file, 13 lines)

# inline::rag-runtime

## Description

RAG (Retrieval-Augmented Generation) tool runtime for document ingestion, chunking, and semantic search.

## Sample Configuration

```yaml
{}
```
`docs/source/providers/tool_runtime/remote_bing-search.md` (new file, 20 lines)

# remote::bing-search

## Description

Bing Search tool for web search capabilities using Microsoft's search engine.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | |
| `top_k` | `<class 'int'>` | No | 3 | |

## Sample Configuration

```yaml
api_key: ${env.BING_API_KEY:}
```
`docs/source/providers/tool_runtime/remote_brave-search.md` (new file, 21 lines)

# remote::brave-search

## Description

Brave Search tool for web search capabilities with privacy-focused results.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Brave Search API Key |
| `max_results` | `<class 'int'>` | No | 3 | The maximum number of results to return |

## Sample Configuration

```yaml
api_key: ${env.BRAVE_SEARCH_API_KEY:+}
max_results: 3
```
`docs/source/providers/tool_runtime/remote_model-context-protocol.md` (new file, 13 lines)

# remote::model-context-protocol

## Description

Model Context Protocol (MCP) tool for standardized tool calling and context management.

## Sample Configuration

```yaml
{}
```
21
docs/source/providers/tool_runtime/remote_tavily-search.md
Normal file

@ -0,0 +1,21 @@
# remote::tavily-search

## Description

Tavily Search tool for AI-optimized web search with structured results.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The Tavily Search API Key |
| `max_results` | `<class 'int'>` | No | 3 | The maximum number of results to return |

## Sample Configuration

```yaml
api_key: ${env.TAVILY_SEARCH_API_KEY:+}
max_results: 3

```
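For a sense of the underlying call, a hedged sketch against Tavily's REST API. Passing the key in the JSON body matches Tavily's documented interface at the time of writing (newer versions may prefer a Bearer header); everything else is illustrative:

```python
import os

import requests

# Illustrative POST to Tavily's search endpoint.
resp = requests.post(
    "https://api.tavily.com/search",
    json={
        "api_key": os.environ["TAVILY_SEARCH_API_KEY"],
        "query": "llama stack",
        "max_results": 3,
    },
    timeout=10,
)
resp.raise_for_status()
for item in resp.json().get("results", []):
    print(item["title"], item["url"])
```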
19
docs/source/providers/tool_runtime/remote_wolfram-alpha.md
Normal file

@ -0,0 +1,19 @@
# remote::wolfram-alpha

## Description

Wolfram Alpha tool for computational knowledge and mathematical calculations.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | |

## Sample Configuration

```yaml
api_key: ${env.WOLFRAM_ALPHA_API_KEY:+}

```
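For orientation, a hedged sketch using Wolfram|Alpha's Short Answers API, which returns a plain-text result; the endpoint and `appid`/`i` parameters are Wolfram's documented ones, the query is made up:

```python
import os

import requests

# Illustrative call to the Wolfram|Alpha Short Answers API.
resp = requests.get(
    "https://api.wolframalpha.com/v1/result",
    params={"appid": os.environ["WOLFRAM_ALPHA_API_KEY"], "i": "integrate x^2"},
    timeout=10,
)
resp.raise_for_status()
print(resp.text)  # plain-text answer, e.g. "x^3/3 + constant"
```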
16
docs/source/providers/vector_io/index.md
Normal file

@ -0,0 +1,16 @@
# Vector_Io Providers

This section contains documentation for all available providers for the **vector_io** API.

- [inline::chromadb](inline_chromadb.md)
- [inline::faiss](inline_faiss.md)
- [inline::meta-reference](inline_meta-reference.md)
- [inline::milvus](inline_milvus.md)
- [inline::qdrant](inline_qdrant.md)
- [inline::sqlite-vec](inline_sqlite-vec.md)
- [inline::sqlite_vec](inline_sqlite_vec.md)
- [remote::chromadb](remote_chromadb.md)
- [remote::milvus](remote_milvus.md)
- [remote::pgvector](remote_pgvector.md)
- [remote::qdrant](remote_qdrant.md)
- [remote::weaviate](remote_weaviate.md)
52
docs/source/providers/vector_io/inline_chromadb.md
Normal file

@ -0,0 +1,52 @@
# inline::chromadb

## Description

[Chroma](https://www.trychroma.com/) is an inline and remote vector
database provider for Llama Stack. It allows you to store and query vectors directly within a Chroma database.
That means you're not limited to storing vectors in memory or in a separate service.

## Features
Chroma supports:
- Storing embeddings and their metadata
- Vector search
- Full-text search
- Document storage
- Metadata filtering
- Multi-modal retrieval

## Usage

To use Chroma in your Llama Stack project, follow these steps:

1. Install the necessary dependencies.
2. Configure your Llama Stack project to use Chroma.
3. Start storing and querying vectors.

## Installation

You can install Chroma using pip:

```bash
pip install chromadb
```

## Documentation
See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introduction) for more details about Chroma in general.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | PydanticUndefined | |

## Sample Configuration

```yaml
db_path: ${env.CHROMADB_PATH}

```
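A minimal sketch of using the `chromadb` client directly against an on-disk path like the one `db_path` points at; the collection name and documents below are made up for illustration:

```python
import chromadb

# Persistent, on-disk Chroma client; the path is a placeholder standing
# in for whatever `db_path` resolves to.
client = chromadb.PersistentClient(path="/tmp/chromadb-demo")
collection = client.get_or_create_collection(name="docs")
collection.add(
    ids=["doc-1", "doc-2"],
    documents=["Llama Stack is a stateful service.", "Chroma stores embeddings."],
    metadatas=[{"source": "guide"}, {"source": "guide"}],
)
results = collection.query(query_texts=["what stores embeddings?"], n_results=1)
print(results["documents"])
```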
@ -1,7 +1,7 @@
---
orphan: true
---
# Faiss
# inline::faiss

## Description

[Faiss](https://github.com/facebookresearch/faiss) is an inline vector database provider for Llama Stack. It
allows you to store and query vectors directly in memory.

@ -31,3 +31,21 @@ pip install faiss-cpu
## Documentation
See [Faiss' documentation](https://faiss.ai/) or the [Faiss Wiki](https://github.com/facebookresearch/faiss/wiki) for
more details about Faiss in general.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
kvstore:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/faiss_store.db

```
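To see the in-memory storage model in miniature, a sketch using the `faiss` library directly; dimensions and data are made up, and the provider wraps comparable index operations behind the vector_io API:

```python
import faiss
import numpy as np

# Illustrative in-memory Faiss index with exact L2-distance search.
dim = 8
index = faiss.IndexFlatL2(dim)
vectors = np.random.rand(100, dim).astype("float32")
index.add(vectors)                          # store vectors in memory
query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, k=3)   # top-3 nearest neighbors
print(ids[0], distances[0])
```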
26
docs/source/providers/vector_io/inline_meta-reference.md
Normal file

@ -0,0 +1,26 @@
# inline::meta-reference

## Description

Meta's reference implementation of a vector database.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
kvstore:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/faiss_store.db

```

## Deprecation Notice

⚠️ **Warning**: Please use the `inline::faiss` provider instead.
26
docs/source/providers/vector_io/inline_milvus.md
Normal file

@ -0,0 +1,26 @@
# inline::milvus

## Description

Please refer to the remote provider documentation.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | PydanticUndefined | |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
db_path: ${env.MILVUS_DB_PATH:=~/.llama/dummy/milvus.db}
kvstore:
  type: sqlite
  namespace: null
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/${env.MILVUS_KVSTORE_DB_PATH:=~/.llama/dummy/milvus_registry.db}

```
@ -1,7 +1,7 @@
---
orphan: true
---
# Qdrant
# inline::qdrant

## Description

[Qdrant](https://qdrant.tech/documentation/) is an inline and remote vector database provider for Llama Stack. It
allows you to store and query vectors directly in memory.

@ -44,3 +44,18 @@ docker pull qdrant/qdrant
```
## Documentation
See the [Qdrant documentation](https://qdrant.tech/documentation/) for more details about Qdrant in general.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `path` | `<class 'str'>` | No | PydanticUndefined | |

## Sample Configuration

```yaml
path: ${env.QDRANT_PATH:=~/.llama/~/.llama/dummy}/qdrant.db

```
@ -1,7 +1,7 @@
---
orphan: true
---
# SQLite-Vec
# inline::sqlite-vec

## Description

[SQLite-Vec](https://github.com/asg017/sqlite-vec) is an inline vector database provider for Llama Stack. It
allows you to store and query vectors directly within an SQLite database.

@ -199,3 +199,18 @@ pip install sqlite-vec
See [sqlite-vec's GitHub repo](https://github.com/asg017/sqlite-vec/tree/main) for more details about sqlite-vec in general.

[^1]: Cormack, G. V., Clarke, C. L., & Buettcher, S. (2009). [Reciprocal rank fusion outperforms condorcet and individual rank learning methods](https://dl.acm.org/doi/10.1145/1571941.1572114). In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (pp. 758-759).

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | PydanticUndefined | |

## Sample Configuration

```yaml
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/sqlite_vec.db

```
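A hedged sketch of what storing and querying vectors in SQLite looks like with the `sqlite-vec` extension loaded directly; the table name, dimension, and data are made up, and the query shape follows sqlite-vec's documented `vec0` virtual table:

```python
import sqlite3

import sqlite_vec  # pip install sqlite-vec

# Load the sqlite-vec extension into an in-memory database (illustrative).
db = sqlite3.connect(":memory:")
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)

db.execute("CREATE VIRTUAL TABLE vec_items USING vec0(embedding float[4])")
db.execute(
    "INSERT INTO vec_items(rowid, embedding) VALUES (1, ?)",
    (sqlite_vec.serialize_float32([0.1, 0.2, 0.3, 0.4]),),
)
row = db.execute(
    "SELECT rowid, distance FROM vec_items "
    "WHERE embedding MATCH ? ORDER BY distance LIMIT 1",
    (sqlite_vec.serialize_float32([0.1, 0.2, 0.3, 0.4]),),
).fetchone()
print(row)  # nearest stored vector and its distance
```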
25
docs/source/providers/vector_io/inline_sqlite_vec.md
Normal file

@ -0,0 +1,25 @@
# inline::sqlite_vec

## Description

Please refer to the sqlite-vec provider documentation.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `db_path` | `<class 'str'>` | No | PydanticUndefined | |

## Sample Configuration

```yaml
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/sqlite_vec.db

```

## Deprecation Notice

⚠️ **Warning**: Please use the `inline::sqlite-vec` provider (notice the hyphen instead of underscore) instead.
@ -1,7 +1,7 @@
---
orphan: true
---
# Chroma
# remote::chromadb

## Description

[Chroma](https://www.trychroma.com/) is an inline and remote vector
database provider for Llama Stack. It allows you to store and query vectors directly within a Chroma database.

@ -34,3 +34,18 @@ pip install chromadb

## Documentation
See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introduction) for more details about Chroma in general.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `url` | `str \| None` | No | PydanticUndefined | |

## Sample Configuration

```yaml
url: ${env.CHROMADB_URL}

```
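Where the inline provider opens a local path, the remote one talks to a Chroma server. A brief sketch with the `chromadb` HTTP client; host and port are placeholders standing in for whatever `${env.CHROMADB_URL}` points at:

```python
import chromadb

# Illustrative remote Chroma client; connection details are placeholders.
client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # round-trip check against the Chroma server
```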
@ -1,7 +1,7 @@
---
orphan: true
---
# Milvus
# remote::milvus

## Description

[Milvus](https://milvus.io/) is an inline and remote vector database provider for Llama Stack. It
allows you to store and query vectors directly within a Milvus database.

@ -96,7 +96,7 @@ vector_io:
#### Key Parameters for TLS Configuration

- **`secure`**: Enables TLS encryption when set to `true`. Defaults to `false`.
- **`server_pem_path`**: Path to the **server certificate** for verifying the server’s identity (used in one-way TLS).
- **`server_pem_path`**: Path to the **server certificate** for verifying the server's identity (used in one-way TLS).
- **`ca_pem_path`**: Path to the **Certificate Authority (CA) certificate** for validating the server certificate (required in mTLS).
- **`client_pem_path`**: Path to the **client certificate** file (required for mTLS).
- **`client_key_path`**: Path to the **client private key** file (required for mTLS).

@ -105,3 +105,24 @@ vector_io:
See the [Milvus documentation](https://milvus.io/docs/install-overview.md) for more details about Milvus in general.

For more details on TLS configuration, refer to the [TLS setup guide](https://milvus.io/docs/tls.md).

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `uri` | `<class 'str'>` | No | PydanticUndefined | The URI of the Milvus server |
| `token` | `str \| None` | No | PydanticUndefined | The token of the Milvus server |
| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
| `config` | `dict` | No | {} | This configuration allows additional fields to be passed through to the underlying Milvus client. See the [Milvus](https://milvus.io/docs/install-overview.md) documentation for more details about Milvus in general. |

> **Note**: This configuration class accepts additional fields beyond those listed above. You can pass any additional configuration options that will be forwarded to the underlying provider.

## Sample Configuration

```yaml
uri: ${env.MILVUS_ENDPOINT}
token: ${env.MILVUS_TOKEN}

```
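A minimal sketch of connecting with `pymilvus` using the same `uri`/`token` pair the sample configuration describes; the values below are placeholders, not a working deployment:

```python
from pymilvus import MilvusClient  # pip install pymilvus

# Illustrative connection mirroring the sample configuration above.
client = MilvusClient(uri="http://localhost:19530", token="root:Milvus")
print(client.list_collections())  # sanity check against the server
```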
@ -1,7 +1,7 @@
---
orphan: true
---
# Postgres PGVector
# remote::pgvector

## Description

[PGVector](https://github.com/pgvector/pgvector) is a remote vector database provider for Llama Stack. It
allows you to store and query vectors directly in memory.

@ -29,3 +29,26 @@ docker pull pgvector/pgvector:pg17
```
## Documentation
See [PGVector's documentation](https://github.com/pgvector/pgvector) for more details about PGVector in general.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `host` | `str \| None` | No | localhost | |
| `port` | `int \| None` | No | 5432 | |
| `db` | `str \| None` | No | postgres | |
| `user` | `str \| None` | No | postgres | |
| `password` | `str \| None` | No | mysecretpassword | |

## Sample Configuration

```yaml
host: ${env.PGVECTOR_HOST:=localhost}
port: ${env.PGVECTOR_PORT:=5432}
db: ${env.PGVECTOR_DB}
user: ${env.PGVECTOR_USER}
password: ${env.PGVECTOR_PASSWORD}

```
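A hedged round trip against the connection parameters above, using `psycopg2` and pgvector's documented SQL (`vector(n)` column type and `<->` distance operator); the table name and vector dimension are made up:

```python
import psycopg2  # pip install psycopg2-binary

# Illustrative pgvector usage with the defaults from the config table.
conn = psycopg2.connect(
    host="localhost", port=5432, dbname="postgres",
    user="postgres", password="mysecretpassword",
)
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute(
        "CREATE TABLE IF NOT EXISTS items (id bigserial PRIMARY KEY, embedding vector(3))"
    )
    cur.execute("INSERT INTO items (embedding) VALUES ('[1,2,3]')")
    # Nearest-neighbor search by L2 distance using pgvector's <-> operator.
    cur.execute("SELECT id FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5")
    print(cur.fetchall())
```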
30
docs/source/providers/vector_io/remote_qdrant.md
Normal file

@ -0,0 +1,30 @@
# remote::qdrant

## Description

Please refer to the inline provider documentation.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `location` | `str \| None` | No | | |
| `url` | `str \| None` | No | | |
| `port` | `int \| None` | No | 6333 | |
| `grpc_port` | `<class 'int'>` | No | 6334 | |
| `prefer_grpc` | `<class 'bool'>` | No | False | |
| `https` | `bool \| None` | No | | |
| `api_key` | `str \| None` | No | | |
| `prefix` | `str \| None` | No | | |
| `timeout` | `int \| None` | No | | |
| `host` | `str \| None` | No | | |

## Sample Configuration

```yaml
api_key: ${env.QDRANT_API_KEY}

```
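For orientation, a short sketch with the `qdrant-client` library using the `url` and `api_key` fields from the table above; both values are placeholders:

```python
from qdrant_client import QdrantClient  # pip install qdrant-client

# Illustrative client mirroring the configuration fields above.
client = QdrantClient(url="http://localhost:6333", api_key="my-qdrant-key")
print(client.get_collections())  # lists collections on the remote server
```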
@ -1,7 +1,7 @@
---
orphan: true
---
# Weaviate
# remote::weaviate

## Description

[Weaviate](https://weaviate.io/) is a vector database provider for Llama Stack.
It allows you to store and query vectors directly within a Weaviate database.

@ -31,3 +31,12 @@ To install Weaviate see the [Weaviate quickstart documentation](https://weaviate

## Documentation
See [Weaviate's documentation](https://weaviate.io/developers/weaviate) for more details about Weaviate in general.

## Sample Configuration

```yaml
{}

```
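The provider itself needs no configuration; connection details are supplied when the vector database is registered. As a hedged sketch, a readiness check against a local Weaviate instance with the v4 Python client (connection details are placeholders):

```python
import weaviate  # pip install weaviate-client

# Illustrative connection to a locally running Weaviate instance.
client = weaviate.connect_to_local()
print(client.is_ready())  # True once the server is reachable
client.close()
```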