Launch HN: Regatta Storage (YC F24) – Turn S3 into a local-like, POSIX cloud FS

Overview

  • Service turns S3 (and S3‑compatible stores) into a cloud file system, currently exposed over NFSv3; a custom FUSE-based protocol is planned.
  • Goal: local‑like, POSIX‑style semantics (random writes, appends, renames, locking, symlinks) on top of cheap, “infinite” object storage, so apps can stay file-based and stateless.
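
To make the list of semantics concrete, here is a minimal Python sketch that exercises each one (append, random write, fsync, advisory lock, rename, symlink) under a directory. The function name `exercise_posix_ops` and the file names are hypothetical, and the sketch says nothing about how Regatta implements these operations; it only shows what a file-based application would expect to work on such a mount. It relies on `fcntl`, so it assumes a Unix-like client.

```python
import fcntl
import os


def exercise_posix_ops(root: str) -> dict:
    """Run the file operations listed above under `root` (hypothetical helper)."""
    path = os.path.join(root, "data.bin")

    # Create a file, then append to it.
    with open(path, "wb") as f:
        f.write(b"hello world")
    with open(path, "ab") as f:
        f.write(b"!")

    with open(path, "r+b") as f:
        # Random write: overwrite bytes in the middle of the file in place.
        f.seek(6)
        f.write(b"WORLD")
        f.flush()
        os.fsync(f.fileno())  # ask the file system to make the write durable

        # Advisory locking: take and release an exclusive lock on the file.
        fcntl.lockf(f, fcntl.LOCK_EX)
        fcntl.lockf(f, fcntl.LOCK_UN)

    # Rename the file within the same directory.
    renamed = os.path.join(root, "renamed.bin")
    os.rename(path, renamed)

    # Symlink pointing at the renamed file (relative target).
    link = os.path.join(root, "latest")
    os.symlink("renamed.bin", link)

    with open(link, "rb") as f:
        content = f.read()
    return {
        "content": content,
        "link_target": os.readlink(link),
        "old_path_exists": os.path.exists(path),
    }
```

Pointing `root` at a temporary local directory gives a baseline; pointing it at a mounted export would be one way to compare behavior, though that is an assumption about usage rather than something described in the thread.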

Architecture & Semantics

  • Clients mount an NFSv3 export from Regatta’s fleet; that fleet talks to a customer-owned S3 bucket.
  • Every file is a single S3 object, stored in native S3 layout (no proprietary block format), so existing buckets can be mounted and data remains usable via the S3 API.
  • Writes go to a durable, replicated cache; fsync guarantees durability in this cache, not in S3. Data is then asynchronously flushed to S3, typically within minutes, with write coalescing.
  • File clients see strong read-after-write consistency; cross‑protocol consistency (NFS writes vs. direct S3 edits of the same object) is “undefined,” though there is some ETag-based detection and alerting.
  • POSIX ACLs are not yet supported; filesystem metadata lives in the cache.
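
The write path described above (writes land in a cache, are coalesced, and are flushed later, with ETag checks to notice out-of-band edits) can be illustrated with a toy model. This is a sketch under stated assumptions, not Regatta's implementation: `FakeObjectStore` and `WriteBackCache` are hypothetical names, the store is an in-memory dict, the "ETag" is simply an MD5 of the content, and there is no replication or durability at all. On a detected conflict the toy reports the key and still overwrites, a choice made only to keep the example short, since the thread describes the real behavior as undefined.

```python
import hashlib


class FakeObjectStore:
    """In-memory stand-in for a bucket (hypothetical): key -> bytes."""

    def __init__(self):
        self.objects = {}
        self.put_count = 0

    def put(self, key, data):
        self.objects[key] = bytes(data)
        self.put_count += 1
        return self.etag(key)

    def etag(self, key):
        # Content hash standing in for an ETag; None if the key is absent.
        data = self.objects.get(key)
        return None if data is None else hashlib.md5(data).hexdigest()


class WriteBackCache:
    """Toy write-back cache: one file maps to one object (hypothetical)."""

    def __init__(self, store):
        self.store = store
        self.files = {}         # key -> bytearray holding the latest contents
        self.dirty = set()      # keys changed since the last flush
        self.flushed_etag = {}  # key -> ETag returned by our last flush

    def write(self, key, offset, data):
        buf = self.files.setdefault(key, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data
        self.dirty.add(key)  # many writes to one key collapse into one flush

    def read(self, key):
        # Readers are served from the cache, so they see writes immediately.
        return bytes(self.files[key])

    def flush(self):
        """Upload each dirty file once; return keys that changed out of band."""
        conflicts = []
        for key in sorted(self.dirty):
            if key in self.flushed_etag and self.store.etag(key) != self.flushed_etag[key]:
                conflicts.append(key)  # object was edited outside the cache
            self.flushed_etag[key] = self.store.put(key, self.files[key])
        self.dirty.clear()
        return conflicts
```

In this model, three writes to the same file followed by one `flush()` produce a single `put`, and editing the object directly in the store between two flushes makes the next `flush()` return that key as a conflict.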

Comparisons to Other Systems

  • Versus s3fs/rclone/goofys/Mountpoint: emphasizes full POSIX semantics and a shared durable cache for multi‑client consistency, rather than a simple FUSE bridge.
  • Versus JuiceFS, Nasuni, S3QL: main differentiator is storing objects 1:1 in native S3 format instead of proprietary block layouts.
  • Versus AWS Storage Gateway/EFS/FSx for Lustre/FlexFS: pitched as cloud‑native, elastic, high‑IOPS/throughput, and cheaper when replacing overprovisioned EBS/EFS; aims to reach Lustre‑like scale.

Use Cases & Limits

  • Discussed workloads: ML/MLOps (training corpora, notebooks), analytics engines (ClickHouse, DuckDB), databases (Postgres/SQLite debated), backups, bioinformatics, email stores.
  • Currently AWS us‑east‑1 only and effectively EC2‑only; broader regions, GCP, Azure, K8s CSI, and Docker plugins are planned.
  • Not recommended for flaky consumer networks (e.g., laptops) in its current form.

Pricing & Economics

  • Charges for cache capacity and Regatta↔S3 data transfer; S3 storage itself is BYO and billed by the cloud provider.
  • Proponents argue it can be cheaper than EBS/EFS due to typical under‑utilization of block volumes; critics find cache pricing high relative to DIY or bare metal.
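
The two positions above come down to simple arithmetic, sketched here in Python. All function names, parameters, and numbers are hypothetical placeholders in abstract cost units, not actual Regatta or AWS prices; the only thing taken from the thread is the shape of the bill (cache capacity plus transfer, with object storage billed separately) versus paying for provisioned block capacity whether or not it is used.

```python
def provisioned_cost(provisioned_gb: int, price_per_gb: int) -> int:
    """Block-volume style: pay for what is provisioned, regardless of usage."""
    return provisioned_gb * price_per_gb


def cache_backed_cost(
    used_gb: int,
    cache_gb: int,
    transfer_gb: int,
    object_price_per_gb: int,
    cache_price_per_gb: int,
    transfer_price_per_gb: int,
) -> int:
    """Cache-over-object-store style: pay for data stored, cache size, and transfer."""
    return (
        used_gb * object_price_per_gb
        + cache_gb * cache_price_per_gb
        + transfer_gb * transfer_price_per_gb
    )
```

With placeholder inputs, a 1000 GB volume at 8 units/GB costs 8000 units even if only 250 GB is used. Storing those 250 GB at 2 units/GB with a 50 GB cache at 20 units/GB and 100 GB of transfer at 1 unit/GB comes to 1600 units. Raising the cache to 400 GB pushes the total to 8600 units. The comparison therefore hinges on volume utilization and on how much cache a workload actually needs, which is roughly where proponents and critics disagree.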

Concerns and Reception

  • Technical skepticism around crash consistency, NFS semantics for databases, and “write‑back cache” failure modes; some call for Jepsen-style testing.
  • Others are enthusiastic, seeing it as a long‑missing primitive that could underpin new serverless data platforms and simplify working with S3-backed datasets.