S3 Files

Architecture & Semantics

  • S3 Files is described as “EFS with bidirectional sync to S3”: an NFS-like filesystem in which EFS acts as a cache, with changes periodically synced to S3 (roughly every 60 seconds, aggregated into PUTs).
  • File/object views are consistent within each system but only eventually consistent between EFS and S3.
  • Conflict handling: if a file is changed via the filesystem and also replaced directly in S3 before sync, the filesystem version is moved to a per-filesystem “lost+found” directory and the S3 version wins.
  • No atomic rename support at launch; several commenters flag this as a fundamental limitation for a serious distributed filesystem.
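The sync and conflict semantics above can be sketched as a toy in-memory model (all class and method names here are invented for illustration; this is not how S3 Files is implemented):

```python
# Toy model of the described behavior: EFS acts as a cache, a periodic
# sync pushes dirty files to S3, and if the S3 object was replaced
# out-of-band since the last sync, the S3 version wins and the local
# copy is shelved under lost+found. Illustrative only.

class ToySync:
    def __init__(self):
        self.efs = {}             # path -> content (filesystem/cache view)
        self.s3 = {}              # key -> (content, version counter)
        self.synced_version = {}  # key -> S3 version seen at last sync
        self.dirty = set()        # paths changed via the filesystem
        self.lost_found = {}      # shelved conflicting local versions

    def write_file(self, path, content):
        """Write through the filesystem (EFS side)."""
        self.efs[path] = content
        self.dirty.add(path)

    def put_object_directly(self, key, content):
        """Replace the object directly in S3, bypassing the filesystem."""
        _, version = self.s3.get(key, (None, 0))
        self.s3[key] = (content, version + 1)

    def sync(self):
        """Runs roughly every 60 seconds in the described design."""
        for path in sorted(self.dirty):
            _, version = self.s3.get(path, (None, 0))
            if version != self.synced_version.get(path, 0):
                # Conflict: S3 changed since our last sync. S3 wins;
                # the local edit moves to lost+found.
                self.lost_found[path] = self.efs[path]
                self.efs[path] = self.s3[path][0]
                self.synced_version[path] = version
            else:
                # No conflict: aggregate the local change into a PUT.
                self.s3[path] = (self.efs[path], version + 1)
                self.synced_version[path] = version + 1
        self.dirty.clear()
```

Calling `write_file`, then `put_object_directly` on the same key, then `sync` leaves the S3 version visible in the filesystem and the local edit under `lost_found`, matching the “S3 wins” rule above.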

Comparison to Existing Solutions

  • Many note that S3-as-filesystem has existed for years via s3fs, goofys, GeeseFS, JuiceFS, ZeroFS, mountpoint-s3, and others.
  • Key differentiator cited: EFS serves as a durable, shared metadata/data layer, rather than each client handling non-atomic operations and write buffering on its own.
  • Some see the blog as “reinventing sliced bread” and underplaying prior art; others argue this is a more robust, supported, and better-performing version of a pattern users already rely on.

Cost & Performance Concerns

  • Strong focus on pricing:
    • Writes: ~$0.06/GB (all writes go to EFS).
    • Cached reads: ~$0.03/GB.
    • Cache storage: ~$0.30/GB-month.
  • Reads above a configurable size threshold (default 128 KB) stream directly from S3, avoiding EFS read charges but inheriting S3 latency; several commenters consider 128 KB too low and worry about random-access workloads.
  • Some praise the latency benefits for many small files and metadata-heavy workloads; others say the EFS layer undercuts S3’s appeal as a cheap store.
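As a rough illustration, the quoted per-GB figures can be combined into a back-of-envelope monthly estimate (the rates come from the numbers above; the example workload is an assumption, and real AWS pricing varies by region and tier):

```python
# Back-of-envelope cost sketch using the figures quoted above:
# ~$0.06/GB writes, ~$0.03/GB cached reads, ~$0.30/GB-month cache storage.
# Illustrative only; not an official AWS price calculator.

WRITE_PER_GB = 0.06
CACHED_READ_PER_GB = 0.03
CACHE_STORAGE_PER_GB_MONTH = 0.30

def monthly_cost(write_gb, cached_read_gb, cache_gb):
    """Rough monthly cost for a hypothetical workload at the quoted rates."""
    return (write_gb * WRITE_PER_GB
            + cached_read_gb * CACHED_READ_PER_GB
            + cache_gb * CACHE_STORAGE_PER_GB_MONTH)

# Hypothetical: 100 GB written, 500 GB of cached reads, 50 GB cache held:
# 100*0.06 + 500*0.03 + 50*0.30 = 6 + 15 + 15 = $36/month
print(round(monthly_cost(100, 500, 50), 2))  # → 36.0
```

Even this small example shows why write-heavy workloads draw concern: the EFS write charge dominates long before S3 storage costs matter.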

Use Cases & Limitations

  • Potentially attractive for:
    • Existing EFS users wanting to move data to cheaper S3 without rewriting apps.
    • Data lake / analytics workloads (e.g., DuckDB, DuckLake, log storage) where S3 listings and small files dominate.
  • Less suitable for:
    • Write-heavy workloads due to EFS write charges.
    • Workloads requiring NFS-safe locking (e.g., SQLite) or true POSIX semantics.
    • Use cases needing in-place updates or efficient large renames; S3 immutability still applies.

Motivation, Adoption & Internal Dynamics

  • Some see this as AWS formalizing a pattern customers already forced into existence (“people will use S3 as a filesystem anyway”).
  • Others worry it will encourage misuse by less-expert users and lead to “surprise” bills or misunderstanding of eventual consistency.
  • One commenter describes past internal attempts to merge EFS and S3, characterizing prior efforts as politically fraught and technically overambitious; this design is seen as a pragmatic fallback with acknowledged warts.