2024-08-08

How we migrated onto K8s in less than 12 months

Languages & tooling around Kubernetes

For programmatic integration and controllers, Go is viewed as the de facto standard; Rust is mentioned as increasingly popular for operators.
Many interactions are config‑centric: YAML manifests, Helm charts, Kustomize, and tools like Kyverno.
Terraform and Pulumi are common for surrounding cloud infra; some use Terraform to drive Helm, though this can become unwieldy at scale.
Alternative packaging/config tools (e.g., Timoni) are discussed but seen as less established.
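
As a rough illustration of what "controllers" means here, the sketch below shows the level‑triggered reconcile pattern that Kubernetes controllers and operators follow: compare desired state with observed state and emit the actions that close the gap. It uses plain Go with no Kubernetes libraries. The `State` type, the `reconcile` function, and the workload names are hypothetical and come neither from the thread nor from client-go.

```go
package main

import (
	"fmt"
	"sort"
)

// State maps a workload name to its replica count (hypothetical type).
type State map[string]int

// reconcile returns the actions needed to move observed toward desired.
// It is idempotent: when the two states match, it returns no actions.
func reconcile(desired, observed State) []string {
	var actions []string
	for name, want := range desired {
		have, ok := observed[name]
		switch {
		case !ok:
			actions = append(actions, fmt.Sprintf("create %s replicas=%d", name, want))
		case have != want:
			actions = append(actions, fmt.Sprintf("scale %s %d->%d", name, have, want))
		}
	}
	for name := range observed {
		if _, ok := desired[name]; !ok {
			actions = append(actions, fmt.Sprintf("delete %s", name))
		}
	}
	sort.Strings(actions) // map iteration order is random; sort for stable output
	return actions
}

func main() {
	desired := State{"api": 3, "worker": 2}
	observed := State{"api": 1, "cron": 1}
	for _, a := range reconcile(desired, observed) {
		fmt.Println(a)
	}
}
```

A real controller would run this comparison repeatedly against the API server and apply the actions instead of printing them. The point is only that the loop converges on the declared state rather than reacting to individual events.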

Motivations for migrating from ECS to K8s

Desire to reuse the broader CNCF ecosystem: Helm charts, etcd, and other “Kubernetes‑first” tools.
Better support for stateful workloads (e.g., etcd) and storage primitives than ECS historically offered.
Operational features: easier node cordoning/draining, pod rescheduling, health‑based restarts, and autoscaling (often via tools like KEDA).
Platform standardization, better internal developer experience, and easier hiring on a widely known stack.
Reduced perceived vendor lock‑in, the option of multi‑cloud or on‑prem deployment, and more negotiating leverage with cloud providers.
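
For the autoscaling point above, a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler, which KEDA feeds with external metrics: desired = ceil(current × currentMetric / targetMetric), skipped when the ratio is within a tolerance. The thread gives no numbers, so the function name `desiredReplicas` and the sample values are hypothetical, and min/max replica bounds and stabilization windows are deliberately left out.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the HPA scaling rule:
// ceil(current * currentMetric / targetMetric), unless the ratio is
// within `tolerance` of 1.0, in which case the count is left unchanged.
func desiredReplicas(current int, currentMetric, targetMetric, tolerance float64) int {
	ratio := currentMetric / targetMetric
	if math.Abs(ratio-1.0) <= tolerance {
		return current
	}
	return int(math.Ceil(float64(current) * ratio))
}

func main() {
	// Hypothetical load figures; 0.1 matches the documented default tolerance.
	fmt.Println(desiredReplicas(4, 200, 100, 0.1)) // load is double the target
	fmt.Println(desiredReplicas(4, 105, 100, 0.1)) // within tolerance
	fmt.Println(desiredReplicas(4, 50, 100, 0.1))  // load is half the target
}
```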

Critiques of the rationale

Several argue the ECS setup was simply mis‑designed: autoscaling, blue‑green deploys, and templates could have been built on ECS with less upheaval.
Some see “we want Helm/etcd” as tool‑driven, not user‑ or business‑driven, with unclear ROI and large opportunity cost.
Skepticism that multi‑cloud or lock‑in mitigation actually yields measurable cost savings; supporting data is rarely presented.
Concern that migrations of this scale are often resume‑driven or fad‑driven “grand migrations.”

Helm, Terraform, and deployment patterns

Helm is praised as the default distribution mechanism for many third‑party apps and for its atomic rollbacks.
Others strongly dislike Helm’s templated YAML, citing poor maintainability, “indent hell,” and the frequent need to fork charts.
Terraform is seen as better for cloud infra than for K8s workloads due to state complexity and slow plans at scale.
A common pattern: Terraform for cluster/infra, GitOps (ArgoCD/Flux) + Helm/Kustomize for app deployments.

Kubernetes vs ECS/Fargate and managed services

Pro‑ECS/Fargate side: very low operational burden, good enough autoscaling, integrated ALB, CloudWatch, and no cluster upgrades; suits most “normal” apps and small teams.
Pro‑K8s side: unified abstractions for discovery, scaling, rollouts, secrets/config, and storage across clouds and on‑prem; especially valuable with many services or hybrid/regulatory constraints.
Some report K8s clusters (especially those with many add‑ons/operators) as fragile and upgrade‑heavy; others claim managed K8s (EKS/GKE/AKS) has become “not that hard” and is simpler than stitching together bespoke tooling.

Complexity, overengineering, and scale

Strong thread of frustration with industry‑wide complexity, microservice overuse, and HA “at any cost” without business justification.
Others argue that once you need service discovery, autoscaling, HA, and multi‑team infrastructure, rolling your own on VMs/Ansible/Chef tends to be worse than K8s.
Overall, opinion is split: K8s is powerful and appropriate for some orgs and products (Figma‑scale, multi‑tenant, on‑prem offerings), but easily overkill or a distraction for smaller, simpler systems.
