
Rolling Updates Without ReadinessProbes, PodDisruptionBudgets, or YAML

We built the entire rolling update system in a single detached pthread — roughly 80 lines of C23. Launch new replica, wait 8 seconds, kill old replica. Repeat per replica. Verified on a local single-node cluster. Here is exactly how it works.

Scott Baker
Systems engineer — C23, Rust, NixOS, post-quantum security

A Kubernetes rolling update requires you to configure and correctly understand at least five things before it works without incident: maxSurge, maxUnavailable, readinessProbe, PodDisruptionBudget, and the termination grace period. Get the readiness probe wrong and the rollout stalls indefinitely — waiting for a pod to become ready that will never report ready because the probe path is wrong. We have all been there.

Skr8tr's rollout is one command:

triggering a rolling update
```
$ skr8tr --key ~/.skr8tr/signing.sec rollout api-v2.skr8tr
rolling out /home/sbaker/skr8tr/examples/api-v2.skr8tr... ok
app     api-server
status  new replicas launching, old replicas draining (8s settle)
```

How rollout_thread Works

When the Conductor receives a ROLLOUT command it does two things: responds immediately with OK|ROLLOUT|<app> (non-blocking), then spawns a detached pthread to do the actual work. The thread owns the rollout from that point forward.

src/daemon/skr8tr_sched.c — rollout_thread (simplified)
```c
static void *rollout_thread(void *arg) {
    RolloutArgs *ra = arg;

    /* Bump the generation counter — new replicas get gen N+1 */
    pthread_mutex_lock(&g_mu);
    Workload *wl = workload_find(ra->app_name);
    wl->current_gen++;
    int new_gen = wl->current_gen;
    pthread_mutex_unlock(&g_mu);

    /* For each existing (old-gen) placement, one at a time: */
    for (int i = 0; i < old_count; i++) {
        /* 1. Launch a new-generation replica on the best available node */
        NodeEntry *node = node_least_loaded_for_port(wl->port);
        launch_replica(wl, node);   /* sends LAUNCH to node, waits for LAUNCHED+PID */

        /* 2. Wait for the settle window — workload starts, stabilises */
        sleep(ROLLOUT_WAIT_S);      /* ROLLOUT_WAIT_S = 8 */

        /* 3. Kill the old-generation replica */
        send_kill(old_placements[i].node_id, old_placements[i].app_name);
        node_port_release(old_node, wl->port);
    }

    free(ra);
    return NULL;
}
```

The generation counter is the key: every Placement struct carries an integer generation field. When a rollout starts, all existing placements are old-gen (N). New placements get gen N+1. The thread walks the old-gen list, replacing each one. During the rollout, both generations coexist briefly — there is no downtime gap for multi-replica workloads.

Port Collision Safety

One problem we hit early: if a workload binds port 8080, and both the old and new replica land on the same node, the new process fails to bind. We fixed this by tracking bound ports per node in the Conductor.

NodeEntry struct — port tracking added for rollout safety
```c
typedef struct {
    char   node_id[64];
    char   ip[64];
    int    cpu_pct;
    int    ram_free_mb;
    time_t last_heartbeat;

    /* Added for rollout port safety: */
    int used_ports[64];
    int used_port_count;
} NodeEntry;
```

node_least_loaded_for_port(port) skips any node that already has that port in its used_ports array. Ports are claimed when the node confirms LAUNCHED, released on KILL or node expiry.

The Verified Test

We ran this on a single Arch Linux workstation — one Conductor, one node, both on localhost. The workload was /bin/sleep 3600 (the simplest possible long-running process).

end-to-end rollout verification — 2026-04-06
```
# Deploy initial workload — PID 1031776
$ skr8tr --key ~/.skr8tr/signing.sec up examples/my-server.skr8tr
submitting... ok | app my-server | node 638eb13e...

$ skr8tr list
my-server   638eb13e...   1031776

# Trigger rollout
$ skr8tr --key ~/.skr8tr/signing.sec rollout examples/my-server.skr8tr
rolling out... ok — new replicas launching, old replicas draining (8s settle)

# Wait 10s, then check
$ skr8tr list
my-server   638eb13e...   1032014   ← new PID

# Confirm old PID is dead
$ ps -p 1031776
# (no output: old PID 1031776 is dead — rollout replaced it)
```

The 8-second settle window is a pragmatic choice, not a fundamental constraint. For services that start in milliseconds (most compiled binaries), 8 seconds is conservative. A future version will support a configurable settle field in the manifest and an optional HTTP probe to cut the window short once the new replica is responding.

What We Have Not Done

Honest gaps in the current implementation:

- No readiness check: the old replica is killed after a fixed 8 seconds whether or not the new one is actually serving.
- The settle window is hard-coded; there is no per-manifest settle field yet.
- Verified only on a single-node localhost cluster, and the test workload (/bin/sleep 3600) never binds its port or serves traffic.

Source: src/daemon/skr8tr_sched.c — search for rollout_thread.


Questions: scott.bakerphx@gmail.com
