Using S3 as a Container Registry

OCI spec and sequential uploads

  • Multiple commenters question why the spec mandates ordered chunk uploads for a single blob.
  • Suggested rationales:
    • Simpler cleanup of partial uploads.
    • Ability for the registry to stream data into a running digest computation, avoiding rereading large blobs on finalize.
  • Others argue this is now mostly a performance limitation and could be relaxed.
  • There is confusion about whether registries could support non-standard parallel upload APIs alongside the spec; some wish cloud providers did this.
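
The "running digest" rationale above can be illustrated with a minimal sketch. It assumes a registry that feeds each chunk into SHA-256 as it arrives; the function name digest_in_order and the 4096-byte chunk size are illustrative choices, not anything taken from the spec or from a real registry.

```python
import hashlib

def digest_in_order(chunks):
    """Feed chunks into SHA-256 in arrival order; return (digest string, total bytes)."""
    h = hashlib.sha256()
    total = 0
    for chunk in chunks:
        h.update(chunk)      # state advances with every chunk, so order matters
        total += len(chunk)
    return "sha256:" + h.hexdigest(), total

# Hypothetical blob split into fixed-size chunks, as a client might upload it.
blob = b"layer-bytes" * 1000
chunks = [blob[i:i + 4096] for i in range(0, len(blob), 4096)]

streamed, size = digest_in_order(chunks)
whole = "sha256:" + hashlib.sha256(blob).hexdigest()
print(streamed == whole, size)
```

Because SHA-256 keeps a single sequential state, the digest is ready as soon as the last chunk lands, but only if chunks arrive in order. Accepting chunks in parallel would mean buffering them or rereading the blob at finalize.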

Layer size, parallelism, and image building

  • Large AI/ML layers (1–2+ GiB) make sequential uploads painful.
  • You can push multiple layers in parallel, but not chunks of a single layer.
  • Workarounds:
    • Split content into multiple layers while keeping each file within a single layer.
    • Use multi-stage builds and many COPY steps to create evenly sized layers, improving pull parallelism at the expense of slower builds.
  • Several people want more explicit control over layer boundaries (e.g., LAYER directives, transactional BEGIN/COMMIT blocks, heredocs, BuildKit-based syntaxes).
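
One way to plan the "evenly sized layers" workaround is a greedy packing pass that keeps each file whole. This is a sketch under stated assumptions: the file names, the sizes (in MiB), and the plan_layers helper are all hypothetical, and the resulting groups would still have to be turned into separate COPY steps by hand or by a build script.

```python
def plan_layers(file_sizes, num_layers):
    """Greedy packing: place the largest remaining file into the currently smallest layer.

    Every file stays whole inside exactly one layer.
    """
    layers = [{"files": [], "size": 0} for _ in range(num_layers)]
    by_size = sorted(file_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        target = min(layers, key=lambda layer: layer["size"])
        target["files"].append(name)
        target["size"] += size
    return layers

# Hypothetical model artifacts, sizes in MiB.
files = {
    "weights-a.bin": 900,
    "weights-b.bin": 850,
    "weights-c.bin": 600,
    "vocab.bin": 400,
    "tokenizer.json": 50,
    "config.json": 1,
}

plan = plan_layers(files, 3)
for index, layer in enumerate(plan):
    print(index, layer["size"], layer["files"])
```

Greedy packing is not optimal in general, but it tends to get layer sizes close enough that parallel pulls finish at roughly the same time.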

Using S3/R2 and other storage-backed registries

  • S3 (and R2) can back a registry; existing software (Docker’s registry, GitLab, Nexus, Gitea, zot, Cloudflare’s serverless-registry) already does variants of this.
  • One approach is a very thin “control plane” that returns redirects to blobs stored in S3/R2, so heavy traffic bypasses the registry.
  • Pure S3-based approaches may miss newer OCI features (e.g., the referrers API) and response headers such as Docker-Content-Digest, and they need HTTPS for integrity.
  • Private repos require some auth/proxy layer; fully static, auth-free S3 only fits public images.
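
The thin "control plane" idea can be sketched as a pure routing function that answers blob requests with a redirect into object storage. Everything here is an assumption for illustration: the route_blob name, the simplified repository-name pattern, the blobs/sha256/<hex> key layout, and the bucket URL are hypothetical, and a private setup would return a signed, expiring URL rather than a plain one.

```python
import re

# Simplified matcher for GET /v2/<name>/blobs/<digest>; real repository-name
# rules are stricter than this pattern.
BLOB_PATH = re.compile(
    r"^/v2/(?P<name>[a-z0-9]+(?:[._/-][a-z0-9]+)*)/blobs/(?P<digest>sha256:[a-f0-9]{64})$"
)

def route_blob(path, storage_base):
    """Return (status, headers) for a blob request without serving any bytes."""
    match = BLOB_PATH.match(path)
    if match is None:
        return 404, {}
    digest = match.group("digest")
    # Hypothetical object key layout: blobs/sha256/<hex>
    location = f"{storage_base}/blobs/{digest.replace(':', '/')}"
    return 307, {"Location": location, "Docker-Content-Digest": digest}

example_digest = "sha256:" + "a" * 64
status, headers = route_blob(
    f"/v2/team/app/blobs/{example_digest}", "https://bucket.example.com"
)
print(status, headers["Location"])
```

The registry process only handles small requests this way; the client follows the redirect and pulls the bytes directly from storage.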

Performance, cost, and ECR/other cloud registries

  • Multiple reports of slow ECR and Artifact Registry pushes/pulls, especially for multi‑GB layers, even within the same region.
  • Benchmarks mentioned show S3 significantly faster than ECR for the same data.
  • ECR storage is noted as ~5× more expensive per GB than S3 standard; S3/R2 can be attractive as cheaper backends.
  • Some point out ECR does security scanning and uses multipart APIs, which may contribute to overhead; ordered chunk uploads are still enforced.

Critiques of the OCI spec and ecosystem

  • The Distribution spec is described as under-designed:
    • Chunked upload is often broken in practice, so clients fall back to whole-blob uploads.
    • Content-Range examples don’t match HTTP RFCs.
    • Tag listing pagination text was accidentally removed, so each registry invented its own scheme.
  • Some wish the spec allowed “dumb” static HTTP/file:// registries, since manifests already hold the metadata.
  • Hashing choices (SHA-256) are criticized for being unfriendly to incremental/parallel verification; tree hashes or alternative algorithms are proposed but would require ecosystem-wide support.
  • Compression (especially gzip) is also seen as a bottleneck.
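
To make the tree-hash proposal concrete, here is a minimal sketch of a binary hash tree built from SHA-256. It is not part of any OCI spec and not a format any registry uses: the tree_hash name, the 1 MiB default chunk size, the 0x00/0x01 leaf and node prefixes, and the rule of promoting an unpaired node are all assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def _leaf(chunk):
    """Hash one chunk, with a 0x00 prefix to mark it as a leaf."""
    h = hashlib.sha256(b"\x00")
    h.update(chunk)
    return h.digest()

def tree_hash(data, chunk_size=1024 * 1024):
    """Return the hex root of a binary hash tree over fixed-size chunks."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Leaves are independent of each other, so they can be hashed concurrently.
    with ThreadPoolExecutor() as pool:
        level = list(pool.map(_leaf, chunks))
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                nxt.append(hashlib.sha256(b"\x01" + pair[0] + pair[1]).digest())
            else:
                nxt.append(pair[0])  # unpaired node is promoted unchanged
        level = nxt
    return level[0].hex()

sample = bytes(range(256)) * 64
root = tree_hash(sample, chunk_size=4096)
print(root)
```

Unlike a flat SHA-256 over the whole blob, each chunk can be hashed or verified on its own. The catch raised in the thread still applies: clients, registries, and manifests would all have to agree on the same tree construction.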

Debate on containers and private registries

  • One camp questions why private registries are needed at all, suggesting tar files or custom distribution.
  • Others list advantages:
    • Central, versioned artifact store.
    • Standard tooling (docker pull, CI/CD, orchestration).
    • AuthN/AuthZ integration, vulnerability scanning, lifecycle policies.
    • Co-location with cloud infrastructure for latency and egress control.
  • Broader argument over Docker/containers:
    • Pro side: dramatically simplified deployment, dependency management, replication, and long-term reproducibility; especially transformative for ops and CI/CD.
    • Skeptical side: increased complexity, large images, and “dependency hell” shifted into containers; perceived limited improvement in end‑user reliability and feature delivery.
    • Several note that pre-container setups could be efficient with discipline, but containers standardized good practices for the wider ecosystem.