Merge branch 'main' into implement-search-for-PGVector

Francisco Arceo 2025-08-28 10:20:25 -06:00 committed by GitHub
commit 4c03cddf6f
176 changed files with 8344 additions and 734 deletions


@@ -2,12 +2,15 @@
## Overview
Protocol for batch processing API operations.
The Batches API enables efficient processing of multiple requests in a single operation,
particularly useful for processing large datasets, batch evaluation workflows, and
cost-effective inference at scale.
The API is designed to allow use of the openai client libraries for seamless integration.
This API provides the following extensions:
- idempotent batch creation (illustrated in the sketch below)
Note: This API is currently under active development and may undergo changes.
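Because the surface mirrors the OpenAI Batches API, a plain `openai` client can drive it. Below is a minimal sketch, assuming a Llama Stack server exposing its OpenAI-compatible endpoint at the `base_url` shown; the `idempotency_key` field name is an assumption used only to illustrate the idempotent-creation extension.

```python
from openai import OpenAI

# Assumed local endpoint; adjust for your deployment.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# Upload a JSONL file in which each line is one request to run in the batch.
batch_input = client.files.create(
    file=open("batch_requests.jsonl", "rb"),
    purpose="batch",
)

# Create the batch. Re-sending the same idempotency key should return the
# existing batch rather than creating a duplicate (field name is hypothetical).
batch = client.batches.create(
    input_file_id=batch_input.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={"idempotency_key": "nightly-eval-001"},
)
print(batch.id, batch.status)
```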
This section contains documentation for all available providers for the **batches** API.


@@ -10,4 +10,5 @@ This section contains documentation for all available providers for the **files**
:maxdepth: 1
inline_localfs
remote_s3
```


@@ -0,0 +1,33 @@
# remote::s3
## Description
AWS S3-based file storage provider for scalable cloud file management with metadata persistence.
## Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `bucket_name` | `str` | No | | S3 bucket name to store files |
| `region` | `str` | No | us-east-1 | AWS region where the bucket is located |
| `aws_access_key_id` | `str \| None` | No | | AWS access key ID (optional if using IAM roles) |
| `aws_secret_access_key` | `str \| None` | No | | AWS secret access key (optional if using IAM roles) |
| `endpoint_url` | `str \| None` | No | | Custom S3 endpoint URL (for MinIO, LocalStack, etc.) |
| `auto_create_bucket` | `bool` | No | False | Automatically create the S3 bucket if it doesn't exist |
| `metadata_store` | `SqliteSqlStoreConfig \| PostgresSqlStoreConfig` | No | sqlite | SQL store configuration for file metadata |
## Sample Configuration
```yaml
bucket_name: ${env.S3_BUCKET_NAME}
region: ${env.AWS_REGION:=us-east-1}
aws_access_key_id: ${env.AWS_ACCESS_KEY_ID:=}
aws_secret_access_key: ${env.AWS_SECRET_ACCESS_KEY:=}
endpoint_url: ${env.S3_ENDPOINT_URL:=}
auto_create_bucket: ${env.S3_AUTO_CREATE_BUCKET:=false}
metadata_store:
type: sqlite
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/s3_files_metadata.db
```
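Once this provider is wired into a running stack, files go through the standard Files API: the bytes land in the configured S3 bucket while metadata is kept in the `metadata_store`. A minimal sketch using the `openai` client, assuming the OpenAI-compatible endpoint below:

```python
from openai import OpenAI

# Assumed local endpoint; adjust for your deployment.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# Upload: the object is written to the S3 bucket configured above.
uploaded = client.files.create(file=open("report.pdf", "rb"), purpose="assistants")

# Metadata comes from the SQL metadata store; content is read back from S3.
info = client.files.retrieve(uploaded.id)
content = client.files.content(uploaded.id)
print(info.filename, info.bytes, len(content.read()))

# Clean up the uploaded file.
client.files.delete(uploaded.id)
```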


@@ -9,7 +9,8 @@ This section contains documentation for all available providers for the **post_training**
```{toctree}
:maxdepth: 1
inline_huggingface
inline_torchtune
inline_huggingface-gpu
inline_torchtune-cpu
inline_torchtune-gpu
remote_nvidia
```


@@ -0,0 +1,41 @@
# inline::huggingface-cpu
## Description
HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.
## Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `str` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed']` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface']` | No | huggingface | |
| `chat_template` | `str` | No | `<\|user\|>\n{input}\n<\|assistant\|>\n{output}` | |
| `model_specific_config` | `dict` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
| `max_seq_length` | `int` | No | 2048 | |
| `gradient_checkpointing` | `bool` | No | False | |
| `save_total_limit` | `int` | No | 3 | |
| `logging_steps` | `int` | No | 10 | |
| `warmup_ratio` | `float` | No | 0.1 | |
| `weight_decay` | `float` | No | 0.01 | |
| `dataloader_num_workers` | `int` | No | 4 | |
| `dataloader_pin_memory` | `bool` | No | True | |
| `dpo_beta` | `float` | No | 0.1 | |
| `use_reference_model` | `bool` | No | True | |
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair']` | No | sigmoid | |
| `dpo_output_dir` | `str` | No | | |
## Sample Configuration
```yaml
checkpoint_format: huggingface
distributed_backend: null
device: cpu
dpo_output_dir: ~/.llama/dummy/dpo_output
```
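For orientation, here is a hedged client-side sketch of launching a fine-tuning job that this provider would execute; the `llama_stack_client` method and the nested config field names follow the post-training API as commonly documented and may differ across versions, and the model and dataset IDs are placeholders.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server

# Field names below are assumptions and may not match your installed version.
job = client.post_training.supervised_fine_tune(
    job_uuid="sft-demo-001",
    model="ibm-granite/granite-3.3-2b-instruct",  # placeholder model id
    algorithm_config={
        "type": "LoRA",
        "lora_attn_modules": ["q_proj", "v_proj"],
        "apply_lora_to_mlp": True,
        "apply_lora_to_output": False,
        "rank": 8,
        "alpha": 16,
    },
    training_config={
        "n_epochs": 1,
        "data_config": {
            "dataset_id": "my-sft-dataset",  # placeholder, assumed already registered
            "batch_size": 2,
            "shuffle": True,
            "data_format": "instruct",
        },
    },
    hyperparam_search_config={},
    logger_config={},
    checkpoint_dir="~/.llama/checkpoints",
)
print(job.job_uuid)
```

Server-side options from the table above (for example `max_seq_length` or `gradient_checkpointing`) are set in the provider config, not in this request.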


@@ -0,0 +1,41 @@
# inline::huggingface-gpu
## Description
HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.
## Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `str` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed']` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface']` | No | huggingface | |
| `chat_template` | `str` | No | `<\|user\|>\n{input}\n<\|assistant\|>\n{output}` | |
| `model_specific_config` | `dict` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
| `max_seq_length` | `int` | No | 2048 | |
| `gradient_checkpointing` | `bool` | No | False | |
| `save_total_limit` | `int` | No | 3 | |
| `logging_steps` | `int` | No | 10 | |
| `warmup_ratio` | `float` | No | 0.1 | |
| `weight_decay` | `float` | No | 0.01 | |
| `dataloader_num_workers` | `int` | No | 4 | |
| `dataloader_pin_memory` | `bool` | No | True | |
| `dpo_beta` | `float` | No | 0.1 | |
| `use_reference_model` | `bool` | No | True | |
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair']` | No | sigmoid | |
| `dpo_output_dir` | `str` | No | | |
## Sample Configuration
```yaml
checkpoint_format: huggingface
distributed_backend: null
device: cpu
dpo_output_dir: ~/.llama/dummy/dpo_output
```
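Since several fields above (`dpo_beta`, `dpo_loss_type`, `dpo_output_dir`, `use_reference_model`) govern preference optimization, a hedged sketch of starting a DPO job may help; the `preference_optimize` method and the shape of `algorithm_config` are assumptions and may differ in your `llama_stack_client` version.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server

# Parameter and field names below are assumptions, not a confirmed signature.
job = client.post_training.preference_optimize(
    job_uuid="dpo-demo-001",
    finetuned_model="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model id
    algorithm_config={"beta": 0.1, "loss_type": "sigmoid"},  # assumed shape, mirrors the dpo_* defaults
    training_config={
        "n_epochs": 1,
        "data_config": {
            "dataset_id": "my-preference-dataset",  # placeholder, assumed registered
            "batch_size": 2,
            "shuffle": True,
            "data_format": "dialog",
        },
    },
    hyperparam_search_config={},
    logger_config={},
)
print(job.job_uuid)
```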


@@ -0,0 +1,20 @@
# inline::torchtune-cpu
## Description
TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.
## Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface']` | No | meta | |
## Sample Configuration
```yaml
checkpoint_format: meta
```
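Once a TorchTune job has been submitted (see the supervised fine-tuning sketch earlier), its progress and outputs can be polled. A minimal sketch; the job-helper method names are assumptions and may differ by `llama_stack_client` version.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server

# Method names are assumptions; check your client version.
status = client.post_training.job.status(job_uuid="sft-demo-001")
print(status.status)  # e.g. scheduled / in_progress / completed

artifacts = client.post_training.job.artifacts(job_uuid="sft-demo-001")
print(artifacts.checkpoints)  # checkpoints saved in the configured format (meta / huggingface)
```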


@@ -0,0 +1,20 @@
# inline::torchtune-gpu
## Description
TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.
## Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface']` | No | meta | |
## Sample Configuration
```yaml
checkpoint_format: meta
```