feat: Add S3 Files Provider implementation

Implements a complete S3-based file storage provider for Llama Stack with:

Core Implementation:
- S3FilesImpl class with full OpenAI Files API compatibility (see the usage sketch after this list)
- Support for file upload, download, listing, and deletion operations
- SQLite-based metadata storage for fast queries and API compliance
- Configurable S3 endpoints (AWS, MinIO, LocalStack support)
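
Because the provider is OpenAI Files API-compatible, any OpenAI client can exercise it once a stack is running. A minimal sketch, assuming the stack serves its OpenAI-compatible routes at `http://localhost:8321/v1/openai/v1` (the exact base path may vary by Llama Stack version):

```python
# Sketch only: base_url, port, and the purpose value are assumptions, not
# from this commit. Uses the standard openai-python client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

uploaded = client.files.create(file=open("notes.txt", "rb"), purpose="assistants")  # upload
print([f.id for f in client.files.list().data])     # list (served from the metadata store)
print(client.files.retrieve(uploaded.id).filename)  # metadata lookup
print(client.files.content(uploaded.id).read())     # download raw bytes from S3
client.files.delete(uploaded.id)                    # delete
```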

Key Features:
- Automatic S3 bucket creation and management
- Metadata persistence backed by the SQLite store
- Proper error handling for S3 connectivity and permissions (see the sketch after this list)
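
A minimal sketch of the bucket-creation and error-handling pattern; the helper name and exact error mapping are illustrative rather than this commit's actual code, though the boto3 calls (`head_bucket`, `create_bucket`) are real:

```python
import boto3
from botocore.exceptions import ClientError

def ensure_bucket(s3_client, bucket_name: str) -> None:
    """Create the bucket if it does not exist; make permission failures explicit."""
    try:
        s3_client.head_bucket(Bucket=bucket_name)  # cheap existence + permission probe
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "404":
            # Outside us-east-1, create_bucket also needs a
            # CreateBucketConfiguration={"LocationConstraint": region}.
            s3_client.create_bucket(Bucket=bucket_name)
        elif code == "403":
            raise RuntimeError(f"Access denied to S3 bucket '{bucket_name}'") from e
        else:
            raise
```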

Dependencies:
- Adds boto3 for AWS S3 integration
- Adds moto[s3] for testing infrastructure (see the example test after this list)
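
To illustrate what moto provides, a sketch of a mocked round-trip test (not taken from this commit's test suite; `mock_aws` is the moto 5.x entry point, older releases expose `mock_s3`):

```python
import boto3
from moto import mock_aws  # moto>=5; use `from moto import mock_s3` on older versions

@mock_aws
def test_s3_roundtrip():
    # All boto3 calls inside the decorator hit moto's in-memory S3, not AWS.
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="llama-stack-files")
    s3.put_object(Bucket="llama-stack-files", Key="file-abc123", Body=b"hello")
    body = s3.get_object(Bucket="llama-stack-files", Key="file-abc123")["Body"].read()
    assert body == b"hello"
```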

Testing:

- Unit: `./scripts/unit-tests.sh tests/unit/files tests/unit/providers/files`
- Integration:
  1. Start MinIO: `podman run --rm -it -p 9000:9000 minio/minio server /data`
  2. Start stack w/ S3 provider: `S3_ENDPOINT_URL=http://localhost:9000 AWS_ACCESS_KEY_ID=minioadmin AWS_SECRET_ACCESS_KEY=minioadmin S3_BUCKET_NAME=llama-stack-files uv run llama stack build --image-type venv --providers files=remote::s3 --run`
  3. Run integration tests: `./scripts/integration-tests.sh --stack-config http://localhost:8321 --provider ollama --test-subdirs files`

Author: Matthew Farrellee
Date: 2025-08-18 18:18:18 -04:00
Commit: 8cdfdbe884 (parent c2c859a6b0)
11 changed files with 948 additions and 2 deletions


@@ -0,0 +1,38 @@
```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from typing import Any

from pydantic import BaseModel, Field

from llama_stack.providers.utils.sqlstore.sqlstore import SqliteSqlStoreConfig, SqlStoreConfig


class S3FilesImplConfig(BaseModel):
    """Configuration for S3-based files provider."""

    bucket_name: str = Field(description="S3 bucket name to store files")
    region: str = Field(default="us-east-1", description="AWS region where the bucket is located")
    aws_access_key_id: str | None = Field(default=None, description="AWS access key ID (optional if using IAM roles)")
    aws_secret_access_key: str | None = Field(
        default=None, description="AWS secret access key (optional if using IAM roles)"
    )
    endpoint_url: str | None = Field(default=None, description="Custom S3 endpoint URL (for MinIO, LocalStack, etc.)")
    metadata_store: SqlStoreConfig = Field(description="SQL store configuration for file metadata")

    @classmethod
    def sample_run_config(cls, __distro_dir__: str) -> dict[str, Any]:
        return {
            "bucket_name": "${env.S3_BUCKET_NAME}",  # no default, buckets must be globally unique
            "region": "${env.AWS_REGION:=us-east-1}",
            "aws_access_key_id": "${env.AWS_ACCESS_KEY_ID:=}",
            "aws_secret_access_key": "${env.AWS_SECRET_ACCESS_KEY:=}",
            "endpoint_url": "${env.S3_ENDPOINT_URL:=}",
            "metadata_store": SqliteSqlStoreConfig.sample_run_config(
                __distro_dir__=__distro_dir__,
                db_name="s3_files_metadata.db",
            ),
        }
```
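
The `${env.VAR:=default}` strings are Llama Stack's environment-variable substitution syntax, resolved when the run config is loaded; `bucket_name` deliberately has no default because S3 bucket names must be globally unique. Outside a distribution template the config can also be built directly; a sketch, assuming `SqliteSqlStoreConfig` accepts a `db_path` (values here mirror the MinIO integration steps above):

```python
# Illustrative only: field values are examples, and the db_path keyword for
# SqliteSqlStoreConfig is an assumption based on its role, not verified here.
config = S3FilesImplConfig(
    bucket_name="llama-stack-files",
    region="us-east-1",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
    endpoint_url="http://localhost:9000",  # local MinIO
    metadata_store=SqliteSqlStoreConfig(db_path="/tmp/s3_files_metadata.db"),
)
```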