mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 04:04:14 +00:00
Merge branch 'main' into chroma
This commit is contained in:
commit
d3958fae4f
192 changed files with 7088 additions and 853 deletions
|
@ -225,8 +225,32 @@ server:
|
|||
port: 8321 # Port to listen on (default: 8321)
|
||||
tls_certfile: "/path/to/cert.pem" # Optional: Path to TLS certificate for HTTPS
|
||||
tls_keyfile: "/path/to/key.pem" # Optional: Path to TLS key for HTTPS
|
||||
cors: true # Optional: Enable CORS (dev mode) or full config object
|
||||
```
|
||||
|
||||
### CORS Configuration
|
||||
|
||||
CORS (Cross-Origin Resource Sharing) can be configured in two ways:
|
||||
|
||||
**Local development** (allows localhost origins only):
|
||||
```yaml
|
||||
server:
|
||||
cors: true
|
||||
```
|
||||
|
||||
**Explicit configuration** (custom origins and settings):
|
||||
```yaml
|
||||
server:
|
||||
cors:
|
||||
allow_origins: ["https://myapp.com", "https://app.example.com"]
|
||||
allow_methods: ["GET", "POST", "PUT", "DELETE"]
|
||||
allow_headers: ["Content-Type", "Authorization"]
|
||||
allow_credentials: true
|
||||
max_age: 3600
|
||||
```
|
||||
|
||||
When `cors: true`, the server enables secure localhost-only access for local development. For production, specify exact origins to maintain security.
|
||||
|
||||
### Authentication Configuration
|
||||
|
||||
> **Breaking Change (v0.2.14)**: The authentication configuration structure has changed. The previous format with `provider_type` and `config` fields has been replaced with a unified `provider_config` field that includes the `type` field. Update your configuration files accordingly.
|
||||
|
@ -618,6 +642,54 @@ Content-Type: application/json
|
|||
}
|
||||
```
|
||||
|
||||
### CORS Configuration
|
||||
|
||||
Configure CORS to allow web browsers to make requests to the server from other origins. CORS is disabled by default.
|
||||
|
||||
#### Quick Setup
|
||||
|
||||
For development, use the simple boolean flag:
|
||||
|
||||
```yaml
|
||||
server:
|
||||
cors: true # Auto-enables localhost with any port
|
||||
```
|
||||
|
||||
This automatically allows `http://localhost:*` and `https://localhost:*` with secure defaults.
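As a rough illustration of what "localhost with any port" means for origin matching, the sketch below checks an `Origin` value against a localhost-only pattern. The regex and the `is_localhost_origin` helper are assumptions made for this example; the server's actual matching logic may differ.

```python
import re

# Hypothetical pattern: http or https, host exactly "localhost", optional numeric port.
LOCALHOST_ORIGIN = re.compile(r"https?://localhost(:\d+)?")


def is_localhost_origin(origin: str) -> bool:
    """Return True only if the whole origin is http(s)://localhost[:port]."""
    # fullmatch (rather than match) rejects look-alikes such as localhost.evil.com.
    return LOCALHOST_ORIGIN.fullmatch(origin) is not None
```

Using a full match rather than a prefix match is the important design choice here: a prefix check would also accept hosts that merely start with `localhost`.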
|
||||
|
||||
#### Custom Configuration
|
||||
|
||||
For specific origins and full control:
|
||||
|
||||
```yaml
|
||||
server:
|
||||
cors:
|
||||
allow_origins: ["https://myapp.com", "https://staging.myapp.com"]
|
||||
allow_credentials: true
|
||||
allow_methods: ["GET", "POST", "PUT", "DELETE"]
|
||||
allow_headers: ["Content-Type", "Authorization"]
|
||||
allow_origin_regex: "https://.*\\.example\\.com" # Optional regex pattern
|
||||
expose_headers: ["X-Total-Count"]
|
||||
max_age: 86400
|
||||
```
|
||||
|
||||
#### Configuration Options
|
||||
|
||||
| Field | Description | Default |
|
||||
| -------------------- | ---------------------------------------------- | ------- |
|
||||
| `allow_origins` | List of allowed origins. Use `["*"]` to allow any origin. | `["*"]` |
|
||||
| `allow_origin_regex` | Regex pattern for allowed origins (optional). | `None` |
|
||||
| `allow_methods` | Allowed HTTP methods. | `["*"]` |
|
||||
| `allow_headers` | Allowed headers. | `["*"]` |
|
||||
| `allow_credentials` | Allow credentials (cookies, auth headers). | `false` |
|
||||
| `expose_headers` | Headers exposed to browser. | `[]` |
|
||||
| `max_age` | Preflight cache time (seconds). | `600` |
|
||||
|
||||
**Security Notes**:
|
||||
- `allow_credentials: true` requires explicit origins (no wildcards)
|
||||
- `cors: true` enables localhost access only (secure for development)
|
||||
- For public APIs, always specify exact allowed origins
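The first security note can be checked before a configuration is deployed. The following is a minimal sketch, assuming a stand-in `CorsSettings` dataclass that mirrors the fields and defaults in the table above; it is not the server's own configuration class, and `check_cors` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CorsSettings:
    """Illustrative stand-in for the documented CORS fields and their defaults."""

    allow_origins: List[str] = field(default_factory=lambda: ["*"])
    allow_origin_regex: Optional[str] = None
    allow_methods: List[str] = field(default_factory=lambda: ["*"])
    allow_headers: List[str] = field(default_factory=lambda: ["*"])
    allow_credentials: bool = False
    expose_headers: List[str] = field(default_factory=list)
    max_age: int = 600


def check_cors(settings: CorsSettings) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    # Credentials combined with a wildcard origin is the case the note warns about.
    if settings.allow_credentials and "*" in settings.allow_origins:
        problems.append("allow_credentials: true requires explicit origins, not '*'")
    return problems
```

For example, `check_cors(CorsSettings(allow_credentials=True))` reports a problem because the default origin list is the wildcard, while listing `https://myapp.com` explicitly passes the check.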
|
||||
|
||||
## Extending to handle Safety
|
||||
|
||||
Configuring Safety can be a little involved, so it is instructive to go through an example.
|
||||
|
|
|
@ -17,7 +17,6 @@ client = LlamaStackAsLibraryClient(
|
|||
# provider_data is optional, but if you need to pass in any provider specific data, you can do so here.
|
||||
provider_data={"tavily_search_api_key": os.environ["TAVILY_SEARCH_API_KEY"]},
|
||||
)
|
||||
client.initialize()
|
||||
```
|
||||
|
||||
This will parse your config and set up any inline implementations and remote clients needed for your implementation.
|
||||
|
@ -32,5 +31,4 @@ If you've created a [custom distribution](https://llama-stack.readthedocs.io/en/
|
|||
|
||||
```python
|
||||
client = LlamaStackAsLibraryClient(config_path)
|
||||
client.initialize()
|
||||
```
|
||||
|
|
|
@ -2,13 +2,16 @@
|
|||
|
||||
## Overview
|
||||
|
||||
Protocol for batch processing API operations.
|
||||
|
||||
The Batches API enables efficient processing of multiple requests in a single operation,
|
||||
particularly useful for processing large datasets, batch evaluation workflows, and
|
||||
cost-effective inference at scale.
|
||||
|
||||
The API is designed to allow use of OpenAI client libraries for seamless integration.
|
||||
|
||||
This API provides the following extensions:
|
||||
- idempotent batch creation
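To make the idea of idempotent creation concrete, here is a toy in-memory sketch. The `InMemoryBatchStore` class, the `idempotency_key` parameter name, and the shape of the returned dictionary are all assumptions for illustration; they do not describe the actual Batches API or its request fields.

```python
import uuid


class InMemoryBatchStore:
    """Toy store: repeated create() calls with the same key return the same batch."""

    def __init__(self):
        self._by_key = {}

    def create(self, input_file_id, idempotency_key=None):
        # A retry that carries a known key gets the original batch back.
        if idempotency_key is not None and idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        batch = {"id": f"batch_{uuid.uuid4().hex}", "input_file_id": input_file_id}
        if idempotency_key is not None:
            self._by_key[idempotency_key] = batch
        return batch
```

With this behavior a client can safely retry a creation request after a timeout: the retry does not start a second batch as long as it reuses the same key.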
|
||||
|
||||
Note: This API is currently under active development and may undergo changes.
|
||||
|
||||
This section contains documentation for all available providers for the **batches** API.
|
||||
|
||||
|
|
|
@ -10,4 +10,5 @@ This section contains documentation for all available providers for the **files*
|
|||
:maxdepth: 1
|
||||
|
||||
inline_localfs
|
||||
remote_s3
|
||||
```
|
||||
|
|
33
docs/source/providers/files/remote_s3.md
Normal file
|
@ -0,0 +1,33 @@
|
|||
# remote::s3
|
||||
|
||||
## Description
|
||||
|
||||
AWS S3-based file storage provider for scalable cloud file management with metadata persistence.
|
||||
|
||||
## Configuration
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `bucket_name` | `<class 'str'>` | No | | S3 bucket name to store files |
|
||||
| `region` | `<class 'str'>` | No | us-east-1 | AWS region where the bucket is located |
|
||||
| `aws_access_key_id` | `str \| None` | No | | AWS access key ID (optional if using IAM roles) |
|
||||
| `aws_secret_access_key` | `str \| None` | No | | AWS secret access key (optional if using IAM roles) |
|
||||
| `endpoint_url` | `str \| None` | No | | Custom S3 endpoint URL (for MinIO, LocalStack, etc.) |
|
||||
| `auto_create_bucket` | `<class 'bool'>` | No | False | Automatically create the S3 bucket if it doesn't exist |
|
||||
| `metadata_store` | `utils.sqlstore.sqlstore.SqliteSqlStoreConfig \| utils.sqlstore.sqlstore.PostgresSqlStoreConfig` | No | sqlite | SQL store configuration for file metadata |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
bucket_name: ${env.S3_BUCKET_NAME}
|
||||
region: ${env.AWS_REGION:=us-east-1}
|
||||
aws_access_key_id: ${env.AWS_ACCESS_KEY_ID:=}
|
||||
aws_secret_access_key: ${env.AWS_SECRET_ACCESS_KEY:=}
|
||||
endpoint_url: ${env.S3_ENDPOINT_URL:=}
|
||||
auto_create_bucket: ${env.S3_AUTO_CREATE_BUCKET:=false}
|
||||
metadata_store:
|
||||
type: sqlite
|
||||
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/s3_files_metadata.db
|
||||
|
||||
```
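The sample uses `${env.NAME}` and `${env.NAME:=default}` placeholders. The sketch below shows one way such placeholders could be expanded; it is an illustrative resolver written for this example, with `resolve_env_refs` as a hypothetical name, and the stack's own substitution rules may differ (for instance in how empty values are treated).

```python
import os
import re

# Matches ${env.NAME} and ${env.NAME:=default}; the default may be empty.
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")


def resolve_env_refs(value, env=None):
    """Expand every placeholder in `value` using `env` (os.environ if omitted)."""
    env = os.environ if env is None else env

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _ENV_REF.sub(_substitute, value)
```

Under these assumptions, `bucket_name: ${env.S3_BUCKET_NAME}` fails loudly when the variable is missing, while `region` falls back to `us-east-1`.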
|
||||
|
|
@ -9,7 +9,9 @@ This section contains documentation for all available providers for the **post_t
|
|||
```{toctree}
|
||||
:maxdepth: 1
|
||||
|
||||
inline_huggingface
|
||||
inline_torchtune
|
||||
inline_huggingface-cpu
|
||||
inline_huggingface-gpu
|
||||
inline_torchtune-cpu
|
||||
inline_torchtune-gpu
|
||||
remote_nvidia
|
||||
```
|
||||
|
|
|
@ -0,0 +1,41 @@
|
|||
# inline::huggingface-cpu
|
||||
|
||||
## Description
|
||||
|
||||
HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.
|
||||
|
||||
## Configuration
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `device` | `<class 'str'>` | No | cuda | |
|
||||
| `distributed_backend` | `Literal['fsdp', 'deepspeed']` | No | | |
|
||||
| `checkpoint_format` | `Literal['full_state', 'huggingface']` | No | huggingface | |
|
||||
| `chat_template` | `<class 'str'>` | No | `<\|user\|>\n{input}\n<\|assistant\|>\n{output}` | |
|
||||
| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
|
||||
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
|
||||
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
|
||||
| `save_total_limit` | `<class 'int'>` | No | 3 | |
|
||||
| `logging_steps` | `<class 'int'>` | No | 10 | |
|
||||
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
|
||||
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
|
||||
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
|
||||
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
|
||||
| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
|
||||
| `use_reference_model` | `<class 'bool'>` | No | True | |
|
||||
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair']` | No | sigmoid | |
|
||||
| `dpo_output_dir` | `<class 'str'>` | No | | |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
checkpoint_format: huggingface
|
||||
distributed_backend: null
|
||||
device: cpu
|
||||
dpo_output_dir: ~/.llama/dummy/dpo_output
|
||||
|
||||
```
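The default `chat_template` in the table contains `{input}` and `{output}` placeholders. As a small sketch of how one training example might be rendered from it, the code below fills the placeholders with Python's `str.format`. That substitution mechanism is an assumption for illustration, and `render_example` is a hypothetical helper; the provider may apply the template differently.

```python
# Default template from the configuration table, with real newlines.
DEFAULT_CHAT_TEMPLATE = "<|user|>\n{input}\n<|assistant|>\n{output}"


def render_example(user_text, assistant_text, template=DEFAULT_CHAT_TEMPLATE):
    """Fill the {input} and {output} placeholders of a chat template."""
    return template.format(input=user_text, output=assistant_text)
```

A custom template only needs to keep the two placeholder names for this sketch to work unchanged.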
|
||||
|
|
@ -0,0 +1,41 @@
|
|||
# inline::huggingface-gpu
|
||||
|
||||
## Description
|
||||
|
||||
HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.
|
||||
|
||||
## Configuration
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `device` | `<class 'str'>` | No | cuda | |
|
||||
| `distributed_backend` | `Literal['fsdp', 'deepspeed']` | No | | |
|
||||
| `checkpoint_format` | `Literal['full_state', 'huggingface']` | No | huggingface | |
|
||||
| `chat_template` | `<class 'str'>` | No | `<\|user\|>\n{input}\n<\|assistant\|>\n{output}` | |
|
||||
| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
|
||||
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
|
||||
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
|
||||
| `save_total_limit` | `<class 'int'>` | No | 3 | |
|
||||
| `logging_steps` | `<class 'int'>` | No | 10 | |
|
||||
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
|
||||
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
|
||||
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
|
||||
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
|
||||
| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
|
||||
| `use_reference_model` | `<class 'bool'>` | No | True | |
|
||||
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair']` | No | sigmoid | |
|
||||
| `dpo_output_dir` | `<class 'str'>` | No | | |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
checkpoint_format: huggingface
|
||||
distributed_backend: null
|
||||
device: cpu
|
||||
dpo_output_dir: ~/.llama/dummy/dpo_output
|
||||
|
||||
```
|
||||
|
20
docs/source/providers/post_training/inline_torchtune-cpu.md
Normal file
|
@ -0,0 +1,20 @@
|
|||
# inline::torchtune-cpu
|
||||
|
||||
## Description
|
||||
|
||||
TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.
|
||||
|
||||
## Configuration
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `torch_seed` | `int \| None` | No | | |
|
||||
| `checkpoint_format` | `Literal['meta', 'huggingface']` | No | meta | |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
checkpoint_format: meta
|
||||
|
||||
```
|
||||
|
20
docs/source/providers/post_training/inline_torchtune-gpu.md
Normal file
|
@ -0,0 +1,20 @@
|
|||
# inline::torchtune-gpu
|
||||
|
||||
## Description
|
||||
|
||||
TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.
|
||||
|
||||
## Configuration
|
||||
|
||||
| Field | Type | Required | Default | Description |
|
||||
|-------|------|----------|---------|-------------|
|
||||
| `torch_seed` | `int \| None` | No | | |
|
||||
| `checkpoint_format` | `Literal['meta', 'huggingface']` | No | meta | |
|
||||
|
||||
## Sample Configuration
|
||||
|
||||
```yaml
|
||||
checkpoint_format: meta
|
||||
|
||||
```
|
||||
|