llama-stack-mirror/llama_stack/providers/remote/files/s3
2025-09-29 18:41:53 -07:00
..
__init__.py feat(s3 auth): add authorization support for s3 files provider (#3265) 2025-08-29 16:14:00 +02:00
config.py feat: Add S3 Files Provider (#3202) 2025-08-22 10:38:59 -04:00
files.py fix signatures 2025-09-29 18:41:53 -07:00
README.md feat: Add S3 Files Provider (#3202) 2025-08-22 10:38:59 -04:00

S3 Files Provider

A remote S3-based implementation of the Llama Stack Files API that provides scalable cloud file storage with metadata persistence.

Features

  • AWS S3 Storage: Store files in AWS S3 buckets for scalable, durable storage
  • Metadata Management: Uses SQL database for efficient file metadata queries
  • OpenAI API Compatibility: Full compatibility with OpenAI Files API endpoints
  • Flexible Authentication: Support for IAM roles and access keys
  • Custom S3 Endpoints: Support for MinIO and other S3-compatible services

Configuration

Basic Configuration

api: files
provider_type: remote::s3
config:
  bucket_name: my-llama-stack-files
  region: us-east-1
  metadata_store:
    type: sqlite
    db_path: ./s3_files_metadata.db

Advanced Configuration

api: files
provider_type: remote::s3
config:
  bucket_name: my-llama-stack-files
  region: us-east-1
  aws_access_key_id: YOUR_ACCESS_KEY
  aws_secret_access_key: YOUR_SECRET_KEY
  endpoint_url: https://s3.amazonaws.com  # Optional for custom endpoints
  metadata_store:
    type: sqlite
    db_path: ./s3_files_metadata.db

Environment Variables

The configuration supports environment variable substitution:

config:
  bucket_name: "${env.S3_BUCKET_NAME}"
  region: "${env.AWS_REGION:=us-east-1}"
  aws_access_key_id: "${env.AWS_ACCESS_KEY_ID:=}"
  aws_secret_access_key: "${env.AWS_SECRET_ACCESS_KEY:=}"
  endpoint_url: "${env.S3_ENDPOINT_URL:=}"

Note: S3_BUCKET_NAME has no default value since S3 bucket names must be globally unique.

Authentication

For production deployments, use IAM roles:

config:
  bucket_name: my-bucket
  region: us-east-1
  # No credentials needed - will use IAM role

Access Keys

For development or specific use cases:

config:
  bucket_name: my-bucket
  region: us-east-1
  aws_access_key_id: AKIAIOSFODNN7EXAMPLE
  aws_secret_access_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

S3 Bucket Setup

Required Permissions

The S3 provider requires the following permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

Automatic Bucket Creation

By default, the S3 provider expects the bucket to already exist. If you want the provider to automatically create the bucket when it doesn't exist, set auto_create_bucket: true in your configuration:

config:
  bucket_name: my-bucket
  auto_create_bucket: true  # Will create bucket if it doesn't exist
  region: us-east-1

Note: When auto_create_bucket is enabled, the provider will need additional permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket",
        "s3:CreateBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

Bucket Policy (Optional)

For additional security, you can add a bucket policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LlamaStackAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::YOUR-ACCOUNT:role/LlamaStackRole"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    },
    {
      "Sid": "LlamaStackBucketAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::YOUR-ACCOUNT:role/LlamaStackRole"
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name"
    }
  ]
}

Features

Metadata Persistence

File metadata is stored in a SQL database for fast queries and OpenAI API compatibility. The metadata includes:

  • File ID
  • Original filename
  • Purpose (assistants, batch, etc.)
  • File size in bytes
  • Created and expiration timestamps

TTL and Cleanup

Files currently have a fixed long expiration time (100 years).

Development and Testing

Using MinIO

For self-hosted S3-compatible storage:

config:
  bucket_name: test-bucket
  region: us-east-1
  endpoint_url: http://localhost:9000
  aws_access_key_id: minioadmin
  aws_secret_access_key: minioadmin

Monitoring and Logging

The provider logs important operations and errors. For production deployments, consider:

  • CloudWatch monitoring for S3 operations
  • Custom metrics for file upload/download rates
  • Error rate monitoring
  • Performance metrics tracking

Error Handling

The provider handles various error scenarios:

  • S3 connectivity issues
  • Bucket access permissions
  • File not found errors
  • Metadata consistency checks

Known Limitations

  • Fixed long TTL (100 years) instead of configurable expiration
  • No server-side encryption enabled by default
  • No support for AWS session tokens
  • No S3 key prefix organization support
  • No multipart upload support (all files uploaded as single objects)