Introducing S2

What S2 Is

  • Described as “S3 for streams”: append-only, ordered logs/streams as a cloud storage primitive.
  • Conceptually overlaps with message queues and Kafka-like event streams, but at a lower-level “log/record” abstraction rather than a full messaging system.
  • Meant as a building block for data systems (buffering, decoupling, journaling, event sourcing, WALs).
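
To make the "log/record" abstraction concrete, here is a minimal in-memory sketch of an append-only stream that assigns sequence numbers. The class and method names (Stream, append, read, tail) are hypothetical and chosen for illustration; this is not the S2 API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Stream:
    """Toy in-memory append-only stream (hypothetical; not the S2 API)."""

    _records: List[bytes] = field(default_factory=list)

    def append(self, body: bytes) -> int:
        """Append one record and return the sequence number it was assigned."""
        self._records.append(body)
        return len(self._records) - 1

    def read(self, start_seq: int, limit: int = 100) -> List[Tuple[int, bytes]]:
        """Return up to `limit` (seq, body) pairs, in order, from `start_seq`."""
        if start_seq < 0:
            raise ValueError("start_seq must be non-negative")
        end = min(start_seq + limit, len(self._records))
        return [(seq, self._records[seq]) for seq in range(start_seq, end)]

    def tail(self) -> int:
        """Sequence number that the next appended record will receive."""
        return len(self._records)
```

A consumer that remembers the last sequence number it processed can resume with read(last_seq + 1), which is the property that makes such a log usable for journaling or event sourcing.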

Differences vs Existing Systems (Kafka, Kinesis, WarpStream, S3)

  • Higher ordered throughput per stream/partition than typical managed streaming services (the team claims roughly 125 MiB/s for appends and 500 MiB/s for real-time reads).
  • “Unlimited” number of streams, avoiding shard/partition count limits in Kinesis/Kafka-like services.
  • Object-store-backed, but hides blob/byte-range complexity behind ordered records and sequence numbers.
  • Unlike plain S3 append objects, supports tailing reads and record semantics; unlike Kafka/Kinesis, exposes concurrency control (fencing) for safe distributed writes.
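
The fencing idea mentioned above can be sketched as follows: the stream remembers a current token, and any append carrying a different token is rejected, so a writer that has been replaced cannot keep writing. All names here (FencedStream, set_fence, FencingError) are assumptions for illustration and do not reflect the actual S2 interface.

```python
from typing import List, Optional


class FencingError(Exception):
    """Raised when an append carries a stale or missing fencing token."""


class FencedStream:
    """Toy stream with a fencing token (hypothetical; not the S2 API)."""

    def __init__(self) -> None:
        self._records: List[bytes] = []
        self._fence: Optional[str] = None  # None means no fence is set

    def set_fence(self, token: str) -> None:
        """Install a new token; writers holding an older token are locked out."""
        self._fence = token

    def append(self, body: bytes, fencing_token: Optional[str] = None) -> int:
        """Append only if the caller's token matches the current fence."""
        if self._fence is not None and fencing_token != self._fence:
            raise FencingError("stale fencing token")
        self._records.append(body)
        return len(self._records) - 1
```

In this sketch a new leader calls set_fence with its own token before writing, so a delayed append from the previous leader fails instead of interleaving with the new leader's records.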

Performance & Architecture

  • Fully object-storage backed (the service runs no disks of its own); writes from many tenants are batched into shared chunks to keep S3 write sizes efficient.
  • Different storage classes to trade off latency vs cost; planned “native” NVMe-backed tier for very low tail latency.
  • Similar in spirit to systems like WarpStream or Gazette, but with different latency/architecture tradeoffs.
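
One way to picture the chunk batching described above is a builder that packs records from many streams into a single blob and keeps a per-stream index of byte ranges. This is a rough sketch under assumed names (ChunkBuilder, Chunk, flush_bytes); the real chunk layout and flush policy are not described in the discussion.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Chunk:
    """One object-store write: a blob plus a per-stream (offset, length) index."""

    data: bytes
    index: Dict[str, List[Tuple[int, int]]]

    def records_for(self, stream: str) -> List[bytes]:
        """Slice a single stream's records back out of the shared blob."""
        return [self.data[off:off + n] for off, n in self.index.get(stream, [])]


class ChunkBuilder:
    """Accumulates records from many streams until a size threshold is hit."""

    def __init__(self, flush_bytes: int) -> None:
        self.flush_bytes = flush_bytes
        self._buf = bytearray()
        self._index: Dict[str, List[Tuple[int, int]]] = {}

    def add(self, stream: str, body: bytes) -> Optional[Chunk]:
        """Buffer one record; return a finished Chunk once the threshold is reached."""
        self._index.setdefault(stream, []).append((len(self._buf), len(body)))
        self._buf += body
        if len(self._buf) >= self.flush_bytes:
            return self.flush()
        return None

    def flush(self) -> Chunk:
        """Emit the buffered bytes as one chunk and reset the builder."""
        chunk = Chunk(bytes(self._buf), self._index)
        self._buf = bytearray()
        self._index = {}
        return chunk
```

A production system would also flush on a timer to bound latency; the size-only trigger here is a simplification.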

Security & Multi-tenancy

  • Data from multiple tenants is colocated in shared S3 objects, which raised concerns about cross-tenant leaks.
  • The team plans per-stream or per-bucket authenticated encryption and encourages client-side encryption; single-tenant cells were also mentioned as a future option.
  • Lack of per-tenant encryption today is seen by some as a blocker for serious workloads.

Pricing, Egress & Sustainability

  • Initial public pricing for internet egress was below the AWS list price; this drew strong skepticism about viability and fears of future price hikes.
  • After feedback, planned egress pricing was adjusted upward; service is free during preview.
  • Some commenters argue that retail cloud bandwidth costs make this a tough business unless steep discounts at scale are secured.
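
To get a feel for the volumes behind the bandwidth concern, the arithmetic below converts a sustained read rate into monthly egress. The 500 MiB/s figure is the claimed per-stream read rate quoted earlier; the per-GiB price is a caller-supplied parameter, and any value passed in is a placeholder, not a quoted rate from S2 or AWS.

```python
SECONDS_PER_DAY = 86_400
MIB_PER_GIB = 1024


def monthly_egress_gib(mib_per_s: float, days: int = 30) -> float:
    """GiB transferred by a reader sustaining `mib_per_s` for `days` days."""
    return mib_per_s * SECONDS_PER_DAY * days / MIB_PER_GIB


def monthly_egress_cost(mib_per_s: float, usd_per_gib: float, days: int = 30) -> float:
    """Egress bill for that volume at an assumed (hypothetical) per-GiB price."""
    return monthly_egress_gib(mib_per_s, days) * usd_per_gib


if __name__ == "__main__":
    # One reader saturating the claimed 500 MiB/s for a 30-day month.
    print(monthly_egress_gib(500))  # 1265625.0 GiB, about 1.2 PiB
```

Even a single saturated reader moves over a pebibyte a month, so small differences in the per-GiB rate translate into large differences in the bill.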

Developer Experience & APIs

  • Current SDK focus is Rust and CLI; lack of Java/Python SDKs seen as a barrier for Kafka-heavy, Spring-based orgs.
  • Suggestions to build SDKs in non-Rust languages early to flush out “Rust-isms” in the API.
  • Desire for BYO-S3 or S3-compatible backends and self-hostable or source-available options to reduce lock-in.

Positioning, Use Cases & Market Concerns

  • Some find the landing page too focused on low-level primitives, not enough on concrete business problems and examples.
  • Feedback that adoption depends on making it trivially swappable with existing tooling (Kafka API compatibility, Iceberg integration, Debezium pipelines, IoT/MQTT, etc.).
  • There is both enthusiasm (“beautiful API”, “useful primitive”) and skepticism that the addressable market for a raw stream primitive is narrow without higher-level offerings.
  • Concern that large cloud vendors could easily ship a similar service (e.g., S3 append + record semantics) and undercut or overshadow S2.

Branding & Naming Reactions

  • Many joke about confusion with S3 and other “letter+number” products; some think “S2” sounds like a downgrade from S3.
  • Genuine concern raised that the name plus explicit S3 comparisons may invite trademark friction with Amazon.
  • Others view the name as clearly engineer-led and like the honest, S3-inspired positioning.

Future Directions & Feature Requests

  • Requested: compaction, event-sourcing helpers, GDPR-friendly deletion patterns, Athena/Presto-style querying, Kafka compatibility layer, IoT protocol adapters, emulator for local dev (possibly SQLite-backed).
  • Interest in integrating with emerging table formats (Iceberg, S3 Table buckets) by buffering small writes and flushing optimized Parquet files.
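
As an illustration of what the requested compaction could look like, here is a sketch of Kafka-style key-based compaction: only the latest record per key survives, and a record with a None value acts as a tombstone that removes the key entirely (one possible deletion pattern). The record shape and the compact function are assumptions for this example; S2 has not published such a feature.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, key, value); a value of None is a tombstone for that key.
Record = Tuple[int, str, Optional[bytes]]


def compact(records: List[Record]) -> List[Record]:
    """Keep the last record per key, drop tombstoned keys, preserve seq order."""
    latest: Dict[str, Record] = {}
    for rec in records:  # records are assumed to arrive in sequence order
        latest[rec[1]] = rec
    survivors = [rec for rec in latest.values() if rec[2] is not None]
    return sorted(survivors, key=lambda rec: rec[0])
```

Surviving records keep their original sequence numbers, so readers that stored an offset before compaction can still resume from it.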