Colossus for Rapid Storage
What Rapid Storage / Colossus-Based Buckets Are
- A new zonal Cloud Storage bucket type, colocated with GPUs/TPUs, offering much higher random-read throughput (claimed up to 20x that of regional buckets).
- Built directly on Colossus’ stateful protocol; a gRPC client is planned that is essentially a thin wrapper over Colossus.
- Targeted at AI/ML workloads and analytics that need very high random-read bandwidth (e.g., large Parquet datasets, LLM training).
Zonal vs Regional / Multi-Region Semantics
- “Zonal” = tied to a single availability zone; data may still be replicated, but the replicas can share a failure domain.
- In Google’s terminology, “regional” usually implies transparent multi‑zone replication; “zonal” does not.
- Rapid Storage complements existing regional and dual‑region buckets; users can choose latency/durability/cost tradeoffs via the same GCS API.
Comparison to AWS S3 Express One Zone and Other Providers
- Several comments frame Rapid Storage as GCP’s answer to S3 Express One Zone (low-latency, single‑AZ object storage).
- S3 Express offers much lower latency but is significantly more expensive than standard S3; commenters criticize its name as misleading.
- Some argue GCP now uniquely offers low‑latency zonal, standard regional, and transparent dual‑region object storage under one consistent API; others counter that S3 has overlapping but not identical multi-region features.
Performance, AI Branding, and Hypercomputer Marketing
- Mixed reaction to marketing: some praise finally exposing Colossus-like capabilities; others see “AI infrastructure” and hypercomputer FLOPS comparisons as heavy spin.
- Confusion over the claim that a TPU pod exceeds the world’s largest supercomputer; commenters clarify that Google is comparing 8‑bit AI FLOPS, not the 64‑bit FLOPS traditionally used for supercomputers.
Cost, Elasticity, and DIY Alternatives
- Detailed back-of-the-envelope comparisons argue Google’s “HDD prices” rhetoric is overstated versus self-built storage and cheaper cloud storage (e.g., Backblaze, Hetzner).
- Counterpoints emphasize elasticity and operational convenience: instant bucket creation, scaling to TBs then deleting, avoiding hardware deployment/maintenance, and fine‑grained isolation.
Colossus Semantics and Tradeoffs
- Colossus objects are append-only with a single writer; objects can be “finalized” to disallow further writes; no random writes.
- Advocates: dropping POSIX features like multi-writer atomic updates enables much higher performance, cost efficiency, and strong multi‑tenant isolation at scale.
- Skeptics note that such semantics can be hard to retrofit into existing POSIX-based applications, which likely delayed a direct Colossus offering.
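The append-only, single-writer, finalizable semantics described above can be illustrated with a small in-memory model. This is only a sketch of the semantics as summarized in the discussion, not Colossus or Cloud Storage code; the class name `AppendOnlyObject`, its methods, and the exception types are hypothetical names invented for illustration.

```python
class NotOwnerError(Exception):
    """Raised when a writer other than the owning writer tries to mutate."""


class FinalizedError(Exception):
    """Raised when a write is attempted on a finalized object."""


class AppendOnlyObject:
    """Toy model (hypothetical): one writer, appends only, optional finalize."""

    def __init__(self, writer_id: str) -> None:
        self._writer_id = writer_id   # the single allowed writer
        self._data = bytearray()      # object contents, grows at the end only
        self._finalized = False

    def _check_writable(self, writer_id: str) -> None:
        if self._finalized:
            raise FinalizedError("object is finalized; no further writes")
        if writer_id != self._writer_id:
            raise NotOwnerError("only the owning writer may modify the object")

    def append(self, writer_id: str, chunk: bytes) -> int:
        """Append bytes at the end and return the offset where they landed."""
        self._check_writable(writer_id)
        offset = len(self._data)
        self._data.extend(chunk)
        return offset

    def finalize(self, writer_id: str) -> None:
        """Seal the object so that later appends are rejected."""
        self._check_writable(writer_id)
        self._finalized = True

    def read(self, offset: int, length: int) -> bytes:
        """Random reads are always allowed, before or after finalize."""
        return bytes(self._data[offset:offset + length])
```

Note that the model has no write-at-offset method at all: an application that relies on in-place updates or multiple concurrent writers has to be restructured around appends, which is the retrofit difficulty the skeptics point to.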
Anywhere Cache vs Rapid Storage
- Anywhere Cache = SSD cache in front of normal (often multi‑regional) buckets; improves latency and avoids egress on cache hits.
- Rapid Storage = new bucket type in which all data is stored locally, so both reads and writes are fast; it also offers fast durable appends, semantics not available in standard buckets.
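The caching half of this contrast can be sketched as a generic read-through cache: only repeated reads get faster, while first reads still go to the backing bucket and writes are not covered at all. This is a toy in-memory model, not the Anywhere Cache implementation; `BackingBucket` and `ReadThroughCache` are hypothetical names, and eviction, TTLs, and capacity limits are deliberately left out.

```python
class BackingBucket:
    """Stand-in (hypothetical) for a remote bucket; counts fetches."""

    def __init__(self, objects: dict[str, bytes]) -> None:
        self._objects = dict(objects)
        self.fetches = 0  # number of reads that reached the backing store

    def get(self, name: str) -> bytes:
        self.fetches += 1
        return self._objects[name]


class ReadThroughCache:
    """Serves repeated reads locally; misses fall through to the bucket."""

    def __init__(self, backing: BackingBucket) -> None:
        self._backing = backing
        self._cache: dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, name: str) -> bytes:
        if name in self._cache:
            self.hits += 1            # served locally, no remote fetch
            return self._cache[name]
        self.misses += 1              # first read pays the remote cost
        data = self._backing.get(name)
        self._cache[name] = data
        return data
```

Reading the same object twice through `ReadThroughCache` yields one miss and one hit, with a single fetch from the backing bucket. In a Rapid Storage bucket, by contrast, the data already lives in the zone, so there is no miss path to model.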
Adoption and Product Stability Concerns
- Some excitement from users in scientific computing and analytics who expect major speedups from data locality.
- Others caution against early adoption due to Google’s history of killing or reshaping products; recommendation to wait and see if Rapid Storage “sticks.”