
Why We Killed Our Kubernetes Cluster
(and What We Replaced It With)

600 MB of control plane binaries to run three microservices. etcd consensus on a five-node cluster. A YAML file longer than the service it deployed. We tore it out and replaced it with 3.3 MB of C23 and Rust. Here is exactly what we did and what we measured.

Scott Baker
Systems engineer — C23, Rust, NixOS, post-quantum security

I want to be precise about something before we start. This post is not a Kubernetes hate piece. Kubernetes is an impressive piece of engineering built by serious people solving a real problem at Google scale in 2014. The problem is that most teams are not Google in 2014, and they are paying the full complexity tax anyway.

We were running three services: a Node.js API, a React frontend served as static files, and a Postgres-backed worker process. Five nodes in a bare-metal cluster at a datacenter. Here is what Kubernetes cost us to run those three services.

The Inventory

Before we touched anything, we ran ps aux on the control plane node and inventoried every process that existed purely to serve Kubernetes, not our application:

PROCESS                    RESIDENT MEMORY   PURPOSE
etcd                       ~180 MB           Distributed consensus store. Stores pod specs, ConfigMaps, Secrets.
kube-apiserver             ~200 MB           REST frontend to etcd. Every cluster operation goes through this.
kube-scheduler             ~50 MB            Watches the apiserver for unscheduled pods. Assigns them to nodes.
kube-controller-manager    ~60 MB            Runs reconciliation loops for deployments, replica sets, endpoints.
kubelet (×5 nodes)         ~40 MB each       Node agent. Talks to the apiserver, manages the container runtime.
kube-proxy (×5 nodes)      ~20 MB each       iptables rules for service routing. Reprograms netfilter on every change.
containerd (×5 nodes)      ~30 MB each       Container runtime daemon. Pulls images, manages overlay filesystems.
CoreDNS                    ~30 MB            In-cluster DNS. Required for service name resolution.
nginx-ingress-controller   ~90 MB            Routes external HTTP to services. Watches the apiserver for Ingress objects.
Total overhead             ~1,060 MB         None of this runs our application.

Just over a gigabyte of resident memory, cluster-wide, to run a 12 MB Node.js API, a 3 MB static site, and an 8 MB worker. The orchestration overhead outweighs the application by roughly 46:1.

This is not a memory argument. Memory is cheap. This is a complexity argument. Every one of those processes is a failure domain, a configuration surface, a security attack surface, and an upgrade risk. We were spending more time managing the orchestrator than managing our application.

The Breaking Points

1. The YAML Surface Area

To deploy that Node.js API with three replicas, health checks, and an HTTP route, we needed: a Deployment, a Service, an Ingress, a HorizontalPodAutoscaler, a PodDisruptionBudget, and a ConfigMap for the nginx-ingress annotations. Six resource types. 214 lines of YAML. Here is a sample of what the ingress alone looked like:

api-ingress.yaml — 47 lines to route HTTP
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "30"
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /api(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080

The equivalent in Skr8tr — start the ingress binary with a flag:

one flag, same result
skr8tr_ingress --listen 80 --tower 127.0.0.1 \
  --route /api:api-service \
  --route /:frontend

2. The Auth Model

Kubernetes authentication is credential-file based. Your kubeconfig contains a token field that is base64-encoded. Base64 is not encryption. It is not hashing. It is a reversible encoding scheme that anyone who has the file can trivially decode with base64 -d. The token is effectively a plaintext password stored in a YAML file that gets copied to every developer's laptop.

There is more. ServiceAccount tokens for in-cluster workloads expire by default but are mounted as files into every pod. Anyone who can exec into a pod in a default RBAC config can read those tokens. This is not hypothetical — it is a documented attack surface with CVE records.

Skr8tr's auth model is different in kind, not degree. Every mutating command is signed with an ML-DSA-65 key (CRYSTALS-Dilithium Level 3, NIST post-quantum standard). The signing key is a 4032-byte file that lives on the operator's machine with chmod 600. It never goes to the server. The server only sees the public key (1952 bytes). The signature on the wire is a 3309-byte binary blob, hex-encoded, appended to the command.

what the conductor receives for a signed SUBMIT
# wire payload (truncated for readability)
SUBMIT|/opt/apps/api.skr8tr|1743890400|3a4f8b2c...6618 hex chars...e9d1a0
# ^cmd  ^manifest path      ^unix_ts   ^ML-DSA-65 signature
#
# The conductor verifies: OQS_SIG_verify(payload, sig, pubkey)
# If the timestamp is outside ±30s: replay attack rejected
# If the signature is invalid: ERR|UNAUTHORIZED
# The signing key never left the operator's laptop

3. The Rollout Ceremony

A zero-downtime rolling update in Kubernetes requires you to understand and configure at minimum: strategy.rollingUpdate.maxSurge, strategy.rollingUpdate.maxUnavailable, readinessProbe (correctly — a wrong probe causes the rollout to stall forever), and PodDisruptionBudget (if you want to survive a node drain during rollout). Get any of these wrong and you get either downtime or a stuck rollout that requires manual intervention.

In Skr8tr:

rolling update
skr8tr --key ~/.skr8tr/signing.sec rollout api-v2.skr8tr
# rolling out /opt/apps/api-v2.skr8tr... ok
# app     api-server
# status  new replicas launching, old replicas draining (8s settle)

The rollout thread in the Conductor launches a new-generation replica, waits 8 seconds for it to settle, then sends SIGTERM to the old-generation replica followed by SIGKILL after a 2-second grace window. One at a time. No probe YAML. No PodDisruptionBudget. At any point during the rollout, N−1 replicas are live.

What We Built Instead

Skr8tr is three C23 daemons and a Rust CLI. Here is the full component inventory:

BINARY           SIZE      PURPOSE
skr8tr_reg       ~40 KB    Service registry. UDP. Register, lookup, round-robin across replicas.
skr8tr_sched     ~80 KB    Conductor. Schedules workloads, tracks placements, handles auth, rolling updates.
skr8tr_node      ~60 KB    Fleet node. Runs workloads via fork+exec. Health checks. Log ring buffer.
skr8tr_ingress   ~45 KB    HTTP reverse proxy. Longest-prefix routing. Dynamic backend via Tower.
skr8tr (CLI)     ~3 MB     Operator interface. Rust. PQC signing built in.
Total            ~3.3 MB   Everything. Including auth. Including ingress.

The Manifest Format

We did not want YAML. YAML is a data serialization format that was pressed into service as a configuration language. It has significant whitespace, implicit type coercion (in YAML 1.1 parsers, NO parses as boolean false, which is how Norway's country code famously becomes a boolean), and no native schema. We built our own format.

api-server.skr8tr
app api-server
exec /usr/local/bin/myapi
args --port 8080 --db postgres.internal:5432
port 8080
replicas 3

health {
  check GET /healthz 200
  interval 10s
  retries 3
}

scale {
  min 1
  max 8
  cpu-above 80
  cpu-below 20
}

That is the complete deployment manifest for our API server with health checks and auto-scaling. 18 lines. No anchors. No indentation ambiguity. No implicit type coercion. The parser is 200 lines of C23.

The Numbers

We ran both stacks side by side on identical hardware for two weeks. Here is what we measured:

METRIC                                            KUBERNETES                                       SKR8TR
Orchestrator resident memory (cluster-wide)       ~1,060 MB                                        ~12 MB
git push to new replica serving traffic           ~45 s (image pull + scheduling + readiness)      ~1.2 s (fork + exec, no image)
Rolling update, 3 replicas                        ~90 s                                            ~26 s (3 × 8 s settle)
New node joins cluster                            ~3 min (kubelet registration, cert approval)     <6 s (first heartbeat)
Config lines to deploy one service with ingress   214 lines (6 resource types)                     18 lines (1 manifest)
Auth model                                        base64 token (plaintext equivalent)              ML-DSA-65 post-quantum signature
Binary size of control plane                      ~620 MB (all binaries)                           ~3.3 MB

The 1.2-second deploy time is not a trick. Skr8tr does not pull a container image. It does not set up an overlay filesystem. It does not configure network namespaces. It calls fork() and execve() with the binary path from the manifest. The binary was already on disk. That is the entire deployment step.

What Skr8tr Does Not Do (Yet)

Honest accounting. These are genuine gaps relative to a mature Kubernetes installation: no multi-tenant container isolation, no network policies, no distributed block storage. If you need those, Kubernetes is a reasonable answer. If you are running your own services on nodes you control, it is likely overkill.

The Source

Skr8tr is Apache 2.0. The full source is on GitHub. The control plane is ~2000 lines of C23 across four files. The CLI is ~500 lines of Rust. The parser for .skr8tr manifests is 200 lines. It is small enough to read in an afternoon.

If you are running a Kubernetes cluster for three services, I would invite you to spend that afternoon reading Skr8tr's source and considering whether the complexity you are carrying is load-bearing.


Questions, corrections, or war stories from your own k8s migration: open an issue or email me directly.
