state, servertest: property-test HA election + invariant catalogue

Expand TestPrimaryRoutesProperty (5 -> 9 ops). New ops mirror the
production shapes the failure cases hit: BatchProbeResults via
UpdateNodes, SimultaneousDisconnect via UpdateNodes, SetApprovedRoutes
that leaves announced RoutableIPs intact, OfflineExpiry that keeps
Unhealthy set. The model now tracks announced and approved separately
and recomputes the intersection.

Strengthen the per-op assertions to cover invariants the model alone
cannot prove: every primary must be online, every primary must
currently advertise its prefix, no flap onto an unhealthy candidate
when a healthy one was available, no flap off a previous primary that
remains a healthy candidate. The check now takes a pre-op snapshot so
the anti-flap rule has a stable reference.

Add TestHAProberProperty in servertest. It drives a real TestServer
with three HA-route-advertising clients through rapid-drawn sequences
of ClientDisconnect / ClientReconnect / ProberTick / WaitForSnapshot
ops and re-checks the same shape invariants after every step.

Document the system in hscontrol/state/HA_INVARIANTS.md: a state
machine over (Healthy+Online, Unhealthy+Online, Offline,
OfflineExpired), fifteen numbered invariants with predicates and
violation paths, and a coverage matrix mapping each invariant to its
unit, servertest, and integration tests. Three rows pin the recent
fixes to the invariants they enforce.
This commit is contained in:
Kristoffer Dalby
2026-05-17 20:32:53 +00:00
parent c7630b505b
commit e2f2f9211f
5 changed files with 968 additions and 43 deletions

View File

@@ -4066,9 +4066,9 @@ func TestHASubnetRouterFailoverBothOffline(t *testing.T) {
// connection until kernel keepalives time out (often >60 s). When
// the cable returns, two server-side longpoll sessions can overlap.
//
// Issue #3203 reports the bug after cable pulls; this variant blocks all
// traffic between the router container and headscale via iptables and then
// removes the block to mimic that behaviour.
// This variant blocks all traffic between the router container and
// headscale via iptables and then removes the block to mimic the
// cable-pull behaviour.
func TestHASubnetRouterFailoverBothOfflineCablePull(t *testing.T) {
IntegrationSkip(t)