TestGrantViaSubnetFilterRules pins exact-equality dst. Add a sibling
for the broader-dst case so the regression sits at the server level
alongside the policy-engine unit test.
Updates #3267
slices.Contains required exact equality between grant dst and the
advertised subnet route. Any non-identical pair was rejected, so a
via grant with broader (or narrower) dst emitted no filter rule and
added no route to the viewer's AllowedIPs. Tailscale SaaS uses
containment in either direction.
Switch to slices.ContainsFunc(routes, dst.Overlaps) for filter rule
emission (keep dst literal in DstPorts), and append overlapping
advertised routes to ViaRoutesForPeer.Include / Exclude. Rewrite the
multi-router HA election and regular-grant overlap detection to key
off the matched routes rather than the dst. Resolve *Host aliases to
*Prefix once in compileOneViaGrant and at the top of ViaRoutesForPeer
so the switch arms reach them.
Fixes #3267
Both helpers existed to write the literal "[]" when clearing a slice
column — a workaround for GORM's struct-Updates skipping nil slices.
The State path goes exclusively through persistNodeToDB, which is now
correct end-to-end thanks to the named IsZero slice types, so the
helpers are dead in production. The remaining callers were tests.
TestSetTags is dropped — TestSetTags_* in hscontrol/grpcv1_test.go
already covers the State path that production uses. TestAutoApproveRoutes
now writes routes via DB.Save on the loaded node, which is the path
gRPC SetApprovedRoutes drives in production.
Updates #3110
Drives the persist path for ApprovedRoutes, Tags and Endpoints —
seed a non-empty value, clear to nil, read the column back from disk,
then close the State and reopen one against the same sqlite file to
simulate a server restart. Pins the contract the named IsZero slice
types enforce so future changes to the persist path cannot silently
drop a cleared slice column.
Updates #3110
Endpoints, Tags and ApprovedRoutes serialize as JSON on Node. GORM's
struct Updates path skips fields it considers zero, and reflect treats
a nil slice as zero — clearing any of these columns via the State
persist path would leave the previous value in the database.
Introduce Strings, Prefixes and AddrPorts as named slice types whose
IsZero() always reports false, so GORM keeps the column in the UPDATE
regardless of the slice being nil or empty. JSON marshalling is
unchanged: nil serializes to null, empty to []. List() returns the
underlying unnamed slice for callers (mainly testify assertions over
reflect.DeepEqual) that distinguish the named type from its base.
Regenerated types_clone.go and types_view.go follow the field-type
swap. Test assertions across hscontrol/{db,state,servertest} updated
to call .List() where reflect.DeepEqual previously matched the raw
slice type.
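A rough sketch of one such named slice type, assuming the shape described above. GORM itself is not imported here; the sketch only shows the IsZero and List methods and confirms that JSON output is unaffected by the named type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Strings is a named slice type, sketched after the description above.
type Strings []string

// IsZero always reports false, so a zero-check that consults this
// method never treats a nil or empty slice as a field to skip.
func (Strings) IsZero() bool { return false }

// List returns the underlying unnamed slice for callers that compare
// against []string (for example via reflect.DeepEqual).
func (s Strings) List() []string { return []string(s) }

// mustJSON is a small helper for the demo only.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	var cleared Strings // nil
	empty := Strings{}  // non-nil, length 0

	fmt.Println(cleared.IsZero(), mustJSON(cleared)) // false null
	fmt.Println(empty.IsZero(), mustJSON(empty))     // false []
}
```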
Fixes #3110
Replays recorded policy responses for the sshTests block. 200 captures must evaluate; non-200 captures must reject with the recorded body as a substring of the headscale error. Divergences are listed in knownSSHTesterDivergences.
SetPolicy and policy check now compile per-dst SSH rules and replay each sshTests entry. The accept assertion treats check-action rules as reachable; the check assertion requires HoldAndDelegate on the matching rule. Boot reload warns and continues.
Adds SSHPolicyTest plus parse-time validation: empty src/dst, port/CIDR/autogroup-internet destinations, and tag references missing from tagOwners are rejected. Engine evaluation comes in a follow-up.
The guard added for #2862 in handleRegister checked
node.Expiry().Valid() before preserving node state on
Auth=nil + Expiry=zero registration requests. Valid() returns false
when node.Expiry is nil, the default for tagged nodes and for untagged
nodes registered against a preauth key with no default node.expiry
configured. Both fell through to handleLogout, which wrote
&time.Time{} (0001-01-01T00:00:00Z) over the original nil — the
user-visible 0001-01-01 expiry that `headscale nodes list` reports
after restart.
IsExpired() already returns false for both nil and zero-time, so the
Valid() check was redundant. Drop it so all nil-expiry nodes are
covered by the same preservation path.
Fixes #3170
Fixes #3262
When a node carries the disable-ipv4 nodeAttr documented at
https://tailscale.com/docs/reference/troubleshooting/network-configuration/cgnat-conflicts,
SaaS stops sending the node's CGNAT IPv4 prefix in MapResponse. The
allocator keeps assigning IPv4 server-side; only the wire-shape
delivery is filtered. Subnet routes the node advertises -- including
IPv4 prefixes -- survive in AllowedIPs and PrimaryRoutes.
TailNode now drops Is4 prefixes from Addresses and from the node's
own /32 slot in AllowedIPs when selfPolicyCaps carries
disable-ipv4. Mapper.buildTailPeers passes each peer's policy
CapMap so the filter applies in viewer netmaps too; the CapMap
merge that follows is overwritten by PeerCapMap so only the address
filter survives on the peer path.
Two captures land in testdata/nodeattrs_results to anchor the
behaviour:
- nodeattrs-attr-c15-disable-ipv4 (on tag:client)
- nodeattrs-attr-c16-disable-ipv4-router (on tag:router, which
advertises 10.33.0.0/16, confirming subnet routes survive)
Tailscale stamps tailcfg.NodeAttrDefaultAutoUpdate on every node's
CapMap with a JSON bool reflecting the tailnet-wide auto-update
default. Headscale grows an auto_update.enabled config option and
emits the cap accordingly from TailNode -- the cap leaves the
unmodelledTailnetStateCaps strip list and is compared in full by the
nodeAttrs compat suite.
testNodeAttrsSuccess drives cfg.AutoUpdate.Enabled from
tf.Input.Tailnet.Settings.DevicesAutoUpdatesOn so each capture's
expected emission matches the SaaS state it was taken under. Two
captures cover both branches:
- nodeattrs-tailnet-devices-auto-updates-on -> [true]
- nodeattrs-tailnet-devices-auto-updates-off -> [false]
The Tailscale v2 TailnetSettings API does not expose the Send Files
toggle, so the compat suite cannot vary cfg.Taildrop.Enabled per
capture. TestTaildropDisabledWithholdsFileSharingCap covers the off
path directly in servertest.
Taildrive (drive:share and drive:access) is policy-driven per
Tailscale's documented behaviour
(https://tailscale.com/docs/features/taildrive). The previous
always-on baseline emission diverged from SaaS for every node not
targeted by a drive nodeAttr -- a real semantic divergence that the
compat suite caught once the test moved to comparing TailNode output
against the captured netmaps.
types.Node.TailNode no longer stamps the drive pair. Operators
wanting taildrive add a nodeAttrs entry:
    "nodeAttrs": [
      { "target": ["*"], "attr": ["drive:share", "drive:access"] }
    ]
unmodelledTailnetStateCaps shrinks accordingly. The baseline-divergence
group is gone; every entry left in the list is genuinely unmodelled
(user-role caps, unimplemented features, tailnet metadata, internal
tuning).
servertest's TestNodeAttrsBaselineCapsAlwaysOn expects the smaller
baseline (admin + ssh + file-sharing). Integration TestGrantCapDrive
grants the drive caps explicitly via NodeAttrs to exercise the
policy-driven emission path.
types.NodeView.TailNode takes a selfPolicyCaps tailcfg.NodeCapMap
parameter and merges it into the baseline. The mapper's WithSelfNode
hands it the policy result via state.NodeCapMap; peer-path callers
pass nil because peer-side CapMap is set downstream via
policyv2.PeerCapMap.
The nodeAttrs compat test now diffs the full TailNode self-view
output against captured SaaS netmaps. Before this change the test
compared compileNodeAttrs alone -- the policy-only output -- and
needed a strip list to compensate for the missing baseline. With
TailNode on the diff path, baseline emission is exercised end-to-end
by every capture; a regression in TailNode breaks the suite.
unmodelledTailnetStateCaps drops cap/ssh and cap/file-sharing now
that both sides emit them identically. The file header is rewritten
to read as 'caps SaaS emits where headscale has no equivalent yet'
rather than the more confusing 'shape divergence' framing.
The Tailscale client surfaces 'use this peer as your exit node' when
the peer's CapMap carries the tailcfg.NodeAttrSuggestExitNode cap.
SaaS emits it only on peers whose advertised exit routes are
approved -- not every peer that just has the cap in its own
nodeAttrs slot.
policyv2.PeerCapMap encodes that emission rule: it walks the
peer's own self-CapMap (built from compileNodeAttrs) and surfaces
the gated entries (today just suggest-exit-node when the peer
IsExitNode). Mapper.buildTailPeers calls it for each peer instead
of merging the peer's full nodeAttrs CapMap onto its peer view.
allCapMaps snapshots the full per-node CapMap once per peer-list
build so pm.mu is acquired once rather than per peer.
Adds a data-driven test that loads testdata/nodeattrs_results/*.hujson
and diffs the captured SaaS-rendered netmaps against headscale's
compileNodeAttrs output. Each capture is one scenario the SaaS
control plane has rendered against the same policy headscale is asked
to compile -- the test enforces shape parity per node.
tailnet_state_caps.go enumerates the caps SaaS emits where headscale
has no equivalent concept yet (user-role admin/owner, tailnet lock,
services host, app connectors, internal magicsock and SSH tuning,
tailnet-state metadata) plus the always-on baseline (admin, ssh,
file-sharing) and the taildrive pair. stripUnmodelledTailnetStateCaps
filters both sides of cmp.Diff so the comparison focuses on the
policy-driven caps. PeerCapMap encodes which caps the Tailscale
client reads from the peer view (suggest-exit-node when exit routes
are approved, etc.) for use by the mapper.
testcapture switches to typed tailcfg/netmap/filtertype/apitype
values so schema drift between the capture tool and headscale
becomes a compile error rather than a silent test failure. Existing
compat suites (acl, grants, routes, ssh, issue_3212) move to the
typed shape.
The 53 SelfNode netmap captures and the 7 anonymizer-corrupted
suggest-charmander -> suggest-exit-node restorations in
routes_results / issue_3212 ride along.
Tailscale models the randomize-client-port toggle as a top-level
field on the ACL policy. Headscale now matches that shape: the
server-config randomize_client_port key is removed, the toggle
lives in the policy file as randomizeClientPort, and per-node
opt-in via nodeAttrs is also supported.
Operators upgrading from a config-set randomize_client_port hit
depr.fatalWithHint at startup, which prints the deprecation message
and points at the new policy field rather than silently dropping
the toggle. The default carries over (false) so operators who never
set it are unaffected. config-example.yaml ships a REMOVED stanza
showing the migration.
types/node.go drops the cfg.RandomizeClientPort read from
TailNode -- the cap is now policy-driven through compileNodeAttrs
and the tail_test.go expectations follow.
WithSelfNode and buildTailPeers merge each node's policy CapMap
into the tailcfg.Node.CapMap they emit. State.NodeCapMap and
State.NodeCapMaps wrap the policy manager: NodeCapMap returns a
defensive clone per call; NodeCapMaps snapshots the full per-node
map once for batched callers, amortising pm.mu acquisition across
a peer build.
generateDNSConfig grew a per-node CapMap argument so it can apply
nodeAttr-driven DNS overlays. The nextdns DoH rewrite hardens against
policy-controlled inputs:
- nextDNSDoHHost anchors the prefix match instead of substring,
so a hostile resolver URL cannot smuggle a nextdns hostname in
a path or query.
- nextDNSProfileFromCapMap accepts only profile names matching
[A-Za-z0-9._-]{1,64} and picks the lexicographically first when
multiple are granted -- deterministic, no shell metacharacters
or URL fragments through.
- addNextDNSMetadata composes the rewritten URL via url.Parse +
url.Values rather than fmt.Sprintf, so existing query strings
on the resolver URL survive and metadata cannot inject a new
component.
WithTaildropEnabled in servertest controls cfg.Taildrop.Enabled per
test so cap/file-sharing emission can be toggled in tests that need
to verify the off path.
ACL policies now accept a top-level nodeAttrs block. Each entry hands
a list of tailcfg node capabilities to every node matching target.
Accepted target forms are the same as acls.src and grants.src: users,
groups, tags, hosts, prefixes, autogroup:member, autogroup:tagged,
and *. autogroup:self, autogroup:internet, and autogroup:danger-all
are rejected at validate time because none describes a stable
identity set a node-level attribute can attach to.
NodeAttrGrant carries Targets, Attrs, and IPPool. IPPool is parsed
but rejected at validate time -- the allocator that consumes it is
not yet implemented. nodeAttrUnsupportedCaps lists caps SaaS accepts
that headscale cannot act on (funnel today) and rejects them with a
tracking-issue link in the error.
compileNodeAttrs resolves each entry's targets, then maps every
targeted node to a tailcfg.NodeCapMap of the entry's attrs. Per-node
IPs are cached once per call so the inner attr loop is O(grants)
instead of O(grants * nodes) IP allocations.
PolicyManager grows NodeCapMap (per-node), NodeCapMaps (snapshot for
batched callers), and NodesWithChangedCapMap (drain buffer for the
self-broadcast diff). refreshNodeAttrsLocked appends to the drain
rather than overwriting so a SetUsers/SetNodes between SetPolicy and
the drain cannot lose the policy-reload diff.
The policy `tests` block lets entries omit `proto`. Tailscale's client
maps that to the default protocol set {TCP, UDP, ICMP, ICMPv6} — the
captured packet_filter_matches show all four IANA numbers explicitly
when no proto is set — and a rule restricted to any one of them
satisfies an empty-proto reachability test.
srcReachesDst was passing the empty Protocol through unchanged, which
landed an empty []int in ruleMatchesProto. The matcher then short-
circuited to "no match" for every rule with a non-empty IPProto
restriction, including TCP-only grants compiled from `ip: ["tcp:80"]`.
The bug surfaced in the captured allpass-acls-and-grants-mixed
scenario: the grant `tag:client → webserver:80` was reachable in the
compiled filter but the empty-proto test could not see it.
Expand the empty Protocol to the default set at the call site so
ruleMatchesProto's intersection check sees the right requested
protocols. Drop the now-dead empty-requestedProtos branch from the
matcher. The last divergence drops out of knownPolicyTesterDivergences
as a result.
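A small sketch of the call-site expansion, assuming protocols are carried as IANA numbers in an []int. The names requestedProtos and ruleMatches are hypothetical stand-ins for the real call site and matcher, and treating an empty rule list as unrestricted is an assumption.

```go
package main

import (
	"fmt"
	"slices"
)

// IANA protocol numbers for the default set named above.
const (
	protoICMP   = 1
	protoTCP    = 6
	protoUDP    = 17
	protoICMPv6 = 58
)

var defaultProtos = []int{protoTCP, protoUDP, protoICMP, protoICMPv6}

// requestedProtos expands an omitted proto to the default set before
// the matcher sees it.
func requestedProtos(req []int) []int {
	if len(req) == 0 {
		return defaultProtos
	}
	return req
}

// ruleMatches reports whether the rule's protocol restriction
// intersects the requested set. An empty rule list is treated as
// unrestricted (an assumption of this sketch).
func ruleMatches(ruleProtos, requested []int) bool {
	if len(ruleProtos) == 0 {
		return true
	}
	for _, p := range requested {
		if slices.Contains(ruleProtos, p) {
			return true
		}
	}
	return false
}

func main() {
	tcpOnly := []int{protoTCP}

	// Passing the empty request through unchanged never matches.
	fmt.Println(ruleMatches(tcpOnly, nil)) // false

	// Expanding first lets a TCP-only rule satisfy the test.
	fmt.Println(ruleMatches(tcpOnly, requestedProtos(nil))) // true
}
```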
Updates #1803
Tailscale accepts both named ("tcp") and numeric IANA ("6") protocol
forms wherever a Protocol value is allowed. Headscale stored whichever
form the user wrote, leaving downstream code with two equivalents to
handle separately. validateProtocolPortCompatibility only recognised
the named constants and rejected the numeric form, so a policy with
`proto: "6", dst: ["host:443"]` was rejected at parse time even though
SaaS accepts it.
Resolve the disagreement by normalising to the named form during
Protocol.UnmarshalJSON. Every downstream consumer now sees one form
regardless of what the user wrote, so layered guards like
`|| protocol == "6"` in the validator are unnecessary.
Updates #1803
A `tests` entry describes one connection attempt to one specific
host on one specific port over a connection-oriented protocol, and
asserts whether it is allowed or denied. Five shape rules follow —
single-port dst, proto in {tcp, udp, sctp, ""}, no
autogroup:internet dst, no CIDR-typed dst (raw `/N` or hosts:-alias
to a multi-host prefix), at least one of accept/deny — and every
one was previously silently accepted by headscale even though
Tailscale SaaS rejects them as "test(s) failed".
Enforce them in one pass over `pol.Tests` from `Policy.validate()`,
reusing the existing parse-time multierr aggregation. The same
shapes remain valid inside ACL or Grant destinations where the rule
does not apply; the validator only walks the tests array.
The compat runner now treats parse-time errors equivalently to
SetPolicy errors so the captured Tailscale body still matches via
substring regardless of which step surfaces the rejection. Nine
divergences resolved by this validation pass drop out of
knownPolicyTesterDivergences.
Updates #1803
57 captures covering the alias × outcome matrix for the tests block,
recorded against a real Tailscale SaaS tailnet. Replayed by
TestPolicyTesterCompat.
Bump the check-added-large-files pre-commit threshold to 1024 KB —
captures include verbose per-node netmaps and one is 620 KB.
Updates #1803
Pin headscale's accept/reject decision and error body against
Tailscale SaaS by replaying captures recorded from a real tailnet.
Mirrors the tailscale_grants_compat_test.go pattern: glob over
testdata/policytest_results/, one t.Run per file, parse-or-SetPolicy
error must contain the captured api_response_body.message.
errPolicyTestsFailed is "test(s) failed" — Tailscale's literal body —
so substring match works against captured response bodies. Per-test
detail (src, dst, expected vs got) is preserved below the prefix for
the CLI / config-reload paths that don't have an audit endpoint.
knownPolicyTesterDivergences gates the 12 mismatches the captures
will surface so the suite stays green; engine fixes in follow-up
commits drop the entries as each is resolved.
Updates #1803
v2 silently dropped policy.tests, so a policy that contradicted its
own assertions still applied. Resolve src/dst via the existing Alias
machinery, walk the compiled global filter rules (acls and grants
both contribute), and run on every user-write boundary: SetPolicy,
the file watcher, and `headscale policy check`. A failing test
rejects the write before it mutates live state.
Boot-time reload skips evaluation; an already-stored policy that
references a deleted user shouldn't lock the server out.
`headscale policy check` is a thin frontend for the new CheckPolicy
gRPC method. The server-side handler builds a fresh PolicyManager
from the request bytes and the state's live users/nodes, runs
SetPolicy on the sandbox so the tests block executes, and returns
the result through gRPC status. No persistence, no policy_mode
coupling. --bypass-grpc-and-access-database-directly opens the DB
directly when the server is not running.
cmd/headscale/cli/root.go no longer special-cases `policy check` in
init() (the early return from PR #2580 broke --config registration
and viper priming for --bypass).
integration/cli_policy_test.go covers policy_mode={file,database} x
fixture={acl-only, acl+passing-tests, acl+failing-tests} x
bypass={false,true} = 12 rows.
Updates #1803
Co-authored-by: Janis Jansons <janhouse@gmail.com>
MatchFromFilterRule only read DstPorts[].IP into the destination
IPSet. Cap-grant-only filter rules (e.g. tailscale.com/cap/relay)
carry their destinations in CapGrant[].Dsts, so the derived matchers
had empty dest sets and BuildPeerMap / ReduceNodes never exposed the
cap target to its source nodes. Without a companion IP-level grant
the relay node stayed invisible, so clients never tried to use it
and connections sat on DERP.
Union CapGrant[].Dsts into the destination IPSet alongside DstPorts.
Restores peer-visibility for any cap-grant-only relationship; the
peer-relay flow is the most visible instance.
Fixes #3256
When a user@ token resolved to more than one DB row, ACL and SSH
rules referencing it were silently dropped at compile time, leaving
clients with SSHPolicy={rules: null} and no signal to the admin.
Validate every Username reference in groups, tagOwners,
autoApprovers, ACLs and SSH rules at NewPolicyManager and SetPolicy
and return ErrMultipleUsersFound. Missing-user tokens stay tolerant
per #2863.
Updates #3160
Mirror the guard from HandleNodeFromPreAuthKey in HandleNodeFromAuthPath.
Both functions log the old user's name in the "different user" branch
when an existing NodeStore entry under the same machine key belongs to
another user. UserView.Name dereferences the backing User pointer
unconditionally, so when the cached node was loaded with a non-nil
UserID but a nil User (Preload join missed the row, or upstream code
left the snapshot in that shape), the log call panics with a nil-pointer
dereference at hscontrol/types/types_view.go:97.
The panic is caught by the http2 server's runHandler for the noise
control plane, so the process keeps running but every retry produces a
new panic — production has observed bursts of ~1.9k panics per hour
during a tailscaled reconnect loop. The gRPC/OIDC entry has no equivalent
recover and would surface the panic to the caller.
Guard both call sites with oldUser.Valid() and fall back to an empty
old-user name when the pointer is nil. The "Creating new node for
different user" log line still includes the existing node ID, hostname,
machine key, and new user, so operator visibility is preserved.
Add reproduction tests for both handlers seeding the orphan shape
directly into NodeStore via PutNodeInStoreForTest.
Co-Authored-By: Kristoffer Dalby <kristoffer@dalby.cc>
The 39 SSH-*.hujson files in hscontrol/policy/v2/testdata/ssh_results/
were legacy hand-written "expected SSH rules" snippets superseded by
the lowercase tscap captures (ssh-*.hujson). The active loader in
TestSSHDataCompat globs ssh-*.hujson; filepath.Glob is case-sensitive
on Linux so the uppercase set was loaded by no test.
The duplication caused permanent dirty git state on case-insensitive
filesystems (APFS, NTFS) where only one of SSH-A1.hujson and
ssh-a1.hujson can physically exist in the working tree.
Add an assertion to TestSSHDataCompat that the loader picks up every
*.hujson under ssh_results/ so future fixture migrations cannot leave
stranded files behind.
Fixes #3240
Some OIDC providers (notably JumpCloud) return the `groups` claim as
a plain string when the user belongs to a single group, rather than
a single-element array:
    Single group:    {"groups": "MyGroup"}
    Multiple groups: {"groups": ["Group1", "Group2"]}
This causes `json.Unmarshal` to fail with:
    cannot unmarshal string into Go struct field OIDCClaims.groups of type []string
This is the same class of issue as juanfont#2293 (FlexibleBoolean for
email_verified). The fix follows the same pattern: introduce a
FlexibleStringSlice type with a custom UnmarshalJSON that accepts
both a string and a []string, and use it for the Groups field in
both OIDCClaims and OIDCUserInfo.
TestGrantViaExitNodeInternetVisibility boots a server, applies a
policy that scopes autogroup:internet to a tag, registers a tagged
exit advertiser and a regular client, and asserts the client's netmap
surfaces the exit node with 0.0.0.0/0 and ::/0 in AllowedIPs — the
substrate the Tailscale client reads to populate
`tailscale exit-node list`.
TestGrantViaExitNodeNoFilterRules retains its assertion (literal /0
absent from the exit node's PacketFilter, matching SaaS PacketFilter
encoding); only its docstring is updated to reflect that the exit
node now does receive a TheInternet-shaped rule, just not the
literal /0 form.
Updates #3233
A grant of the form `{src: alice, dst: autogroup:internet, via:
tag:exit1}` was loading without error but stripping every exit node
from alice's view: `tailscale exit-node list` returned "no exit nodes
found".
Two sites skipped autogroup:internet at the compile / steering layer:
compileViaForNode's *AutoGroup arm produced no FilterRule for the
via-tagged exit node, and ViaRoutesForPeer's *AutoGroup arm produced
no Include/Exclude. With pm.needsPerNodeFilter true, the exit node's
matchers were empty, BuildPeerMap could not link source to exit, and
RoutesForPeer's ReduceRoutes stripped 0.0.0.0/0 and ::/0 from
AllowedIPs.
The skip belongs at the wire-format layer (ReduceFilterRules), not at
the compile layer that also feeds internal matchers. Lift
autogroup:internet handling into both *AutoGroup arms with the same
shape used for *Prefix destinations: emit a TheInternet rule on
via-tagged exit advertisers; surface peer.ExitRoutes() in Include
when the peer carries the via tag, Exclude otherwise.
ReduceFilterRules continues to keep the rule on exit-route
advertisers' wire output and strip it elsewhere, preserving SaaS
PacketFilter encoding.
Also drop compileViaForNode's early len(SubnetRoutes)==0 return:
SubnetRoutes excludes exit routes, so the early return pre-empted the
autogroup:internet branch on nodes that only advertise exit routes.
Existing tests pinning the buggy behaviour (TestViaRoutesForPeer
subtests, TestCompileViaGrant case) flipped to the new contract.
Fixes #3233
`time.After(ProbeTimeout)` returned a single channel shared by every
probe goroutine in the cycle. Only the first goroutine to receive the
deadline tick drains the channel; any other goroutine still waiting on
its `responseCh` is then stuck forever, `wg.Wait()` never returns, and
the scheduler loop in `app.go` stalls on the next tick. The condition
fires whenever two or more nodes time out in the same cycle — common
under cable-pull where IsOnline lags reality and both routers stay in
the candidate set as half-open TCP.
Move the timer inside each goroutine so every probe has its own
deadline.
Updates #3234
electPrimaryRoutes' all-unhealthy fallback picked candidates[0]
(lowest NodeID) regardless of who was prev. Under cable-pull
semantics IsOnline lags reality (long-poll TCP half-open), so
both routers stay in candidates and both go Unhealthy via the
prober — the fallback then churned primary to a node that was
itself unreachable.
Prefer prev when still in candidates; fall through to
candidates[0] only when prev is gone. Anti-blackhole holds.
Update the property test reference model and split the unit
test into existence (KeepsAPrimary) and identity
(PreservesPrevious) cases.
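The fallback rule, reduced to a sketch. electFallback is a hypothetical name, node IDs are plain uint64 values, and candidates are assumed to be sorted by ascending NodeID so that candidates[0] is the lowest.

```go
package main

import (
	"fmt"
	"slices"
)

// electFallback picks a primary when every candidate is unhealthy.
// It keeps prev if prev is still a candidate and otherwise falls
// through to the first candidate. The bool is false when there is no
// candidate at all.
func electFallback(candidates []uint64, prev uint64) (uint64, bool) {
	if len(candidates) == 0 {
		return 0, false
	}
	if slices.Contains(candidates, prev) {
		return prev, true
	}
	return candidates[0], true
}

func main() {
	fmt.Println(electFallback([]uint64{1, 2}, 2)) // 2 true
	fmt.Println(electFallback([]uint64{1, 3}, 2)) // 1 true
	fmt.Println(electFallback(nil, 2))            // 0 false
}
```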
Fixes #3203
Three regression tests for the user scenario: an in-process
Disconnect/Reconnect, a tailscale-down/up integration test, and
an iptables -j DROP cable-pull integration test.
Updates #3203
Restore the legacy auto-clear at write boundaries that drop HA
candidacy: Disconnect, SetApprovedRoutes(empty), and
UpdateNodeFromMapRequest shrinking advertised routes to empty.
Plus a defensive guard in SetNodeUnhealthy.
Updates #3203
Replace routes.PrimaryRoutes reads with NodeStore. Connect bumps
SessionEpoch; Disconnect re-checks it inside UpdateNode so the
check and mutation are atomic against a concurrent Connect on
the same node.
The connect_race regression test is carried in its final
SessionEpoch form.
Updates #3203
- test: comment that the !regReq.Expiry.IsZero() gate also covers
the tags-only PreAuthKey path
- CHANGELOG: note pre-existing 0001-01-01 rows self-heal on
re-registration rather than being backfilled
When a user-owned node registers or re-registers with a PreAuthKey and the
client sends a zero expiry while node.expiry is set to 0, the expiry
column ends up stored as 0001-01-01 00:00:00 instead of NULL. Two sites in
HandleNodeFromPreAuthKey build a non-nil pointer to regReq.Expiry even when
the value is the zero time, and the needsDefaultExpiry guard only replaces it
when s.cfg.Node.Expiry > 0, so the pointer to the zero time survives to the
database.
Convert an unset regReq.Expiry to nil before handing it off so the
needsDefaultExpiry path is the only place that assigns a non-nil pointer.
This is a narrower sibling of #3170 on the user-owned PreAuthKey path. The
regression was introduced alongside the fix for #3111 in 6337a3db.
compileFilterRules skipped autogroup:internet destinations to keep them
out of the wire-format PacketFilter, but those same compiled rules are
the source of pm.matchers — and Node.CanAccess relies on a matcher whose
DestsIsTheInternet covers the public internet to surface exit-node peers
to ACL sources. With the skip in place no such matcher existed, exit
nodes silently dropped out of the source's peer list, and the docs'
exit-node walkthrough stopped working: `tailscale exit-node list`
returned "no exit nodes found" and `tailscale set --exit-node=<ip>`
returned "no node found in netmap with IP".
Drop the compile-time skip so autogroup:internet flows through normal
matcher derivation, and teach ReduceFilterRules to keep the resulting
client packet-filter rule on exit-route advertisers — Tailscale SaaS
sends those rules to exit nodes so the kernel filter accepts traffic
forwarded by autogroup:internet sources.
Verified against a live tailnet on 2026-04-28 via tscap; the b17/b18
captures land under testdata/issue_3212/ as a regression guard. The
captures are isolated from testdata/routes_results/ because the broader
TestRoutesCompat machinery assumes a CIDR-prefix wire format that
differs from the IPSet-range form SaaS emits for autogroup:internet —
aligning that wire format is tracked separately.
Fixes #3212
compileFilterRules, compileGrants, and updateLocked guarded the
"no rules so allow all" fallback with len(pol.Grants) == 0, which
matches both an absent grants field and an explicit empty array.
JSON {"grants": []} unmarshals to a non-nil empty slice; it should
compile to zero filter rules (deny all) to match Tailscale SaaS,
but the length check sent it down the FilterAllowAll path.
Distinguish absent (nil) from explicit-empty by switching the guard
to pol.Grants == nil, the same asymmetry already used for ACLs.
{} keeps allowing all; {"acls": []} and {"grants": []} now both
deny all.
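A short sketch of why the nil check separates the two cases. The policy struct is a cut-down stand-in with only the two fields, and combining both checks in one allowAll function is an assumption about how the guards fit together.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy is a cut-down stand-in holding only the two rule lists.
type policy struct {
	ACLs   []json.RawMessage `json:"acls"`
	Grants []json.RawMessage `json:"grants"`
}

// allowAll reports whether the "no rules so allow all" fallback
// applies: only when both fields are absent (nil), not merely empty.
func allowAll(raw string) (bool, error) {
	var p policy
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return false, err
	}
	return p.ACLs == nil && p.Grants == nil, nil
}

func main() {
	for _, raw := range []string{`{}`, `{"acls": []}`, `{"grants": []}`} {
		ok, err := allowAll(raw)
		fmt.Println(raw, ok, err)
	}
}
```

An absent field leaves the slice nil, while an explicit [] unmarshals to a non-nil slice of length zero, so len(...) == 0 cannot tell them apart but a nil comparison can.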
Fixes #3211