Files
headscale/hscontrol/servertest
Kristoffer Dalby fb8eecae25 state: defer HA failover when probe target reconnected mid-cycle
The HA prober dispatches a PingRequest, waits ProbeTimeout (5s), and
marks the node unhealthy if no callback arrives. A node that bounced
its poll session between probe cycles satisfies two conditions that
conspire to fail TestHASubnetRouterFailover: a probe queued against
the previous session is silently dropped when the worker writes to
the closed connection (timeout always fires), and a probe sent
immediately after reconnect lands while wgengine is still rebuilding
magicsock state from the new netmap. Either path installs a spurious
unhealthy bit, which sends the preserved-primary anti-flap the wrong
way.

Record the session observed at dispatch time and drop the timeout
path if the node reconnected since. Require the session to survive
a full probe cycle before a timeout can drive a failover.
2026-05-18 17:18:08 +02:00
..