Endpoints, Tags and ApprovedRoutes serialize as JSON on Node. GORM's
struct Updates path skips fields it considers zero, and reflect treats
a nil slice as zero — clearing any of these columns via the State
persist path would leave the previous value in the database.
Introduce Strings, Prefixes and AddrPorts as named slice types whose
IsZero() always reports false, so GORM keeps the column in the UPDATE
regardless of the slice being nil or empty. JSON marshalling is
unchanged: nil serializes to null, empty to []. List() returns the
underlying unnamed slice for callers (mainly testify assertions over
reflect.DeepEqual) that distinguish the named type from its base.
Regenerated types_clone.go and types_view.go follow the field-type
swap. Test assertions across hscontrol/{db,state,servertest} updated
to call .List() where reflect.DeepEqual previously matched the raw
slice type.
Fixes#3110
Replays recorded policy responses for the sshTests block. 200 captures must evaluate; non-200 captures must reject with the recorded body as a substring of the headscale error. Divergences are listed in knownSSHTesterDivergences.
SetPolicy and policy check now compile per-dst SSH rules and replay each sshTests entry. The accept assertion treats check-action rules as reachable; the check assertion requires HoldAndDelegate on the matching rule. Boot reload warns and continues.
Adds SSHPolicyTest plus parse-time validation: empty src/dst, port/CIDR/autogroup-internet destinations, and tag references missing from tagOwners are rejected. Engine evaluation comes in a follow-up.
The guard added for #2862 in handleRegister checked
node.Expiry().Valid() before preserving node state on
Auth=nil + Expiry=zero registration requests. Valid() returns false
when node.Expiry is nil, the default for tagged nodes and for untagged
nodes registered against a preauth key with no default node.expiry
configured. Both fell through to handleLogout, which wrote
&time.Time{} (0001-01-01T00:00:00Z) over the original nil — the
user-visible 0001-01-01 expiry that `headscale nodes list` reports
after restart.
IsExpired() already returns false for both nil and zero-time, so the
Valid() check was redundant. Drop it so all nil-expiry nodes are
covered by the same preservation path.
Fixes#3170Fixes#3262
When a node carries the disable-ipv4 nodeAttr documented at
https://tailscale.com/docs/reference/troubleshooting/network-configuration/cgnat-conflicts,
SaaS stops sending the node's CGNAT IPv4 prefix in MapResponse. The
allocator keeps assigning IPv4 server-side; only the wire-shape
delivery is filtered. Subnet routes the node advertises -- including
IPv4 prefixes -- survive in AllowedIPs and PrimaryRoutes.
TailNode now drops Is4 prefixes from Addresses and from the node's
own /32 slot in AllowedIPs when selfPolicyCaps carries
disable-ipv4. Mapper.buildTailPeers passes each peer's policy
CapMap so the filter applies in viewer netmaps too; the CapMap
merge that follows is overwritten by PeerCapMap so only the address
filter survives on the peer path.
Two captures land in testdata/nodeattrs_results to anchor the
behaviour:
- nodeattrs-attr-c15-disable-ipv4 (on tag:client)
- nodeattrs-attr-c16-disable-ipv4-router (on tag:router, which
advertises 10.33.0.0/16, confirming subnet routes survive)
Tailscale stamps tailcfg.NodeAttrDefaultAutoUpdate on every node's
CapMap with a JSON bool reflecting the tailnet-wide auto-update
default. Headscale grows an auto_update.enabled config option and
emits the cap accordingly from TailNode -- the cap leaves the
unmodelledTailnetStateCaps strip list and is compared in full by the
nodeAttrs compat suite.
testNodeAttrsSuccess drives cfg.AutoUpdate.Enabled from
tf.Input.Tailnet.Settings.DevicesAutoUpdatesOn so each capture's
expected emission matches the SaaS state it was taken under. Two
captures cover both branches:
- nodeattrs-tailnet-devices-auto-updates-on -> [true]
- nodeattrs-tailnet-devices-auto-updates-off -> [false]
The Tailscale v2 TailnetSettings API does not expose the Send Files
toggle, so the compat suite cannot vary cfg.Taildrop.Enabled per
capture. TestTaildropDisabledWithholdsFileSharingCap covers the off
path directly in servertest.
Taildrive (drive:share and drive:access) is policy-driven per
Tailscale's documented behaviour
(https://tailscale.com/docs/features/taildrive). The previous
always-on baseline emission diverged from SaaS for every node not
targeted by a drive nodeAttr -- a real semantic divergence that the
compat suite caught once the test moved to comparing TailNode output
against the captured netmaps.
types.Node.TailNode no longer stamps the drive pair. Operators
wanting taildrive add a nodeAttrs entry:
"nodeAttrs": [
{ "target": ["*"], "attr": ["drive:share", "drive:access"] }
]
unmodelledTailnetStateCaps shrinks accordingly. The baseline-divergence
group is gone; every entry left in the list is genuinely unmodelled
(user-role caps, unimplemented features, tailnet metadata, internal
tuning).
servertest's TestNodeAttrsBaselineCapsAlwaysOn expects the smaller
baseline (admin + ssh + file-sharing). Integration TestGrantCapDrive
grants the drive caps explicitly via NodeAttrs to exercise the
policy-driven emission path.
types.NodeView.TailNode takes a selfPolicyCaps tailcfg.NodeCapMap
parameter and merges it into the baseline. The mapper's WithSelfNode
hands it the policy result via state.NodeCapMap; peer-path callers
pass nil because peer-side CapMap is set downstream via
policyv2.PeerCapMap.
The nodeAttrs compat test now diffs the full TailNode self-view
output against captured SaaS netmaps. Before this change the test
compared compileNodeAttrs alone -- the policy-only output -- and
needed a strip list to compensate for the missing baseline. With
TailNode on the diff path, baseline emission is exercised end-to-end
by every capture; a regression in TailNode breaks the suite.
unmodelledTailnetStateCaps drops cap/ssh and cap/file-sharing now
that both sides emit them identically. The file header is rewritten
to read as 'caps SaaS emits where headscale has no equivalent yet'
rather than the more confusing 'shape divergence' framing.
The Tailscale client surfaces 'use this peer as your exit node' when
the peer's CapMap carries the tailcfg.NodeAttrSuggestExitNode cap.
SaaS emits it only on peers whose advertised exit routes are
approved -- not every peer that just has the cap in its own
nodeAttrs slot.
policyv2.PeerCapMap encodes that emission rule: it walks the
peer's own self-CapMap (built from compileNodeAttrs) and surfaces
the gated entries (today just suggest-exit-node when the peer
IsExitNode). Mapper.buildTailPeers calls it for each peer instead
of merging the peer's full nodeAttrs CapMap onto its peer view.
allCapMaps snapshots the full per-node CapMap once per peer-list
build so pm.mu is acquired once rather than per peer.
Adds a data-driven test that loads testdata/nodeattrs_results/*.hujson
and diffs the captured SaaS-rendered netmaps against headscale's
compileNodeAttrs output. Each capture is one scenario the SaaS
control plane has rendered against the same policy headscale is asked
to compile -- the test enforces shape parity per node.
tailnet_state_caps.go enumerates the caps SaaS emits where headscale
has no equivalent concept yet (user-role admin/owner, tailnet lock,
services host, app connectors, internal magicsock and SSH tuning,
tailnet-state metadata) plus the always-on baseline (admin, ssh,
file-sharing) and the taildrive pair. stripUnmodelledTailnetStateCaps
filters both sides of cmp.Diff so the comparison focuses on the
policy-driven caps. PeerCapMap encodes which caps the Tailscale
client reads from the peer view (suggest-exit-node when exit routes
are approved, etc.) for use by the mapper.
testcapture switches to typed tailcfg/netmap/filtertype/apitype
values so schema drift between the capture tool and headscale
becomes a compile error rather than a silent test failure. Existing
compat suites (acl, grants, routes, ssh, issue_3212) move to the
typed shape.
The 53 SelfNode netmap captures and the 7 anonymizer-corrupted
suggest-charmander -> suggest-exit-node restorations in
routes_results / issue_3212 ride along.
Tailscale models the randomize-client-port toggle as a top-level
field on the ACL policy. Headscale now matches that shape: the
server-config randomize_client_port key is removed, the toggle
lives in the policy file as randomizeClientPort, and per-node
opt-in via nodeAttrs is also supported.
Operators upgrading from a config-set randomize_client_port hit
depr.fatalWithHint at startup, which prints the deprecation message
and points at the new policy field rather than silently dropping
the toggle. The default carries over (false) so operators who never
set it are unaffected. config-example.yaml ships a REMOVED stanza
showing the migration.
types/node.go drops the cfg.RandomizeClientPort read from
TailNode -- the cap is now policy-driven through compileNodeAttrs
and the tail_test.go expectations follow.
WithSelfNode and buildTailPeers merge each node's policy CapMap
into the tailcfg.Node.CapMap they emit. State.NodeCapMap and
State.NodeCapMaps wrap the policy manager: NodeCapMap returns a
defensive clone per call; NodeCapMaps snapshots the full per-node
map once for batched callers, amortising pm.mu acquisition across
a peer build.
generateDNSConfig grew a per-node CapMap argument so it can apply
nodeAttr-driven DNS overlays. The nextdns DoH rewrite hardens against
policy-controlled inputs:
- nextDNSDoHHost anchors the prefix match instead of substring,
so a hostile resolver URL cannot smuggle a nextdns hostname in
a path or query.
- nextDNSProfileFromCapMap accepts only profile names matching
[A-Za-z0-9._-]{1,64} and picks the lexicographically first when
multiple are granted -- deterministic, no shell metacharacters
or URL fragments through.
- addNextDNSMetadata composes the rewritten URL via url.Parse +
url.Values rather than fmt.Sprintf, so existing query strings
on the resolver URL survive and metadata cannot inject a new
component.
WithTaildropEnabled in servertest controls cfg.Taildrop.Enabled per
test so cap/file-sharing emission can be toggled in tests that need
to verify the off path.
ACL policies now accept a top-level nodeAttrs block. Each entry hands
a list of tailcfg node capabilities to every node matching target.
Accepted target forms are the same as acls.src and grants.src: users,
groups, tags, hosts, prefixes, autogroup:member, autogroup:tagged,
and *. autogroup:self, autogroup:internet, and autogroup:danger-all
are rejected at validate time because none describes a stable
identity set a node-level attribute can attach to.
NodeAttrGrant carries Targets, Attrs, and IPPool. IPPool is parsed
but rejected at validate time -- the allocator that consumes it is
not yet implemented. nodeAttrUnsupportedCaps lists caps SaaS accepts
that headscale cannot act on (funnel today) and rejects them with a
tracking-issue link in the error.
compileNodeAttrs resolves each entry's targets, then maps every
targeted node to a tailcfg.NodeCapMap of the entry's attrs. Per-node
IPs are cached once per call so the inner attr loop is O(grants)
instead of O(grants * nodes) IP allocations.
PolicyManager grows NodeCapMap (per-node), NodeCapMaps (snapshot for
batched callers), and NodesWithChangedCapMap (drain buffer for the
self-broadcast diff). refreshNodeAttrsLocked appends to the drain
rather than overwriting so a SetUsers/SetNodes between SetPolicy and
the drain cannot lose the policy-reload diff.
- Mention policy as generic term that covers ACLs or Grants
- Refresh routes policy examples
- Remove Headscale specific exit node separation. Use via instead.
Fixes: #3087
- Rename from acl.md to policy.md and setup redirect links
- Mention both ACLs and Grants
- Remove most old ACL docs and replace with links to Tailscale docs
- Add "Getting started" section
- Add section about notable differences
The policy `tests` block lets entries omit `proto`. Tailscale's client
maps that to the default protocol set {TCP, UDP, ICMP, ICMPv6} — the
captured packet_filter_matches show all four IANA numbers explicitly
when no proto is set — and a rule restricted to any one of them
satisfies an empty-proto reachability test.
srcReachesDst was passing the empty Protocol through unchanged, which
landed an empty []int in ruleMatchesProto. The matcher then short-
circuited to "no match" for every rule with a non-empty IPProto
restriction, including TCP-only grants compiled from `ip: ["tcp:80"]`.
The bug surfaced in the captured allpass-acls-and-grants-mixed
scenario: the grant `tag:client → webserver:80` was reachable in the
compiled filter but the empty-proto test could not see it.
Expand the empty Protocol to the default set at the call site so
ruleMatchesProto's intersection check sees the right requested
protocols. Drop the now-dead empty-requestedProtos branch from the
matcher. The last divergence drops out of knownPolicyTesterDivergences
as a result.
Updates #1803
Tailscale accepts both named ("tcp") and numeric IANA ("6") protocol
forms wherever a Protocol value is allowed. Headscale stored whichever
form the user wrote, leaving downstream code with two equivalents to
handle separately. validateProtocolPortCompatibility only recognised
the named constants and rejected the numeric form, so a policy with
`proto: "6", dst: ["host:443"]` was rejected at parse time even though
SaaS accepts it.
Resolve the disagreement by normalising to the named form during
Protocol.UnmarshalJSON. Every downstream consumer now sees one form
regardless of what the user wrote, so layered guards like
`|| protocol == "6"` in the validator are unnecessary.
Updates #1803
A `tests` entry describes one connection attempt to one specific
host on one specific port over a connection-oriented protocol, and
asserts whether it is allowed or denied. Five shape rules follow —
single-port dst, proto in {tcp, udp, sctp, ""}, no
autogroup:internet dst, no CIDR-typed dst (raw `/N` or hosts:-alias
to a multi-host prefix), at least one of accept/deny — and every
one was previously silently accepted by headscale even though
Tailscale SaaS rejects them as "test(s) failed".
Enforce them in one pass over `pol.Tests` from `Policy.validate()`,
reusing the existing parse-time multierr aggregation. The same
shapes remain valid inside ACL or Grant destinations where the rule
does not apply; the validator only walks the tests array.
The compat runner now treats parse-time errors equivalently to
SetPolicy errors so the captured Tailscale body still matches via
substring regardless of which step surfaces the rejection. Nine
divergences resolved by this validation pass drop out of
knownPolicyTesterDivergences.
Updates #1803
57 captures covering the alias × outcome matrix for the tests block,
recorded against a real Tailscale SaaS tailnet. Replayed by
TestPolicyTesterCompat.
Bump the check-added-large-files pre-commit threshold to 1024 KB —
captures include verbose per-node netmaps and one is 620 KB.
Updates #1803
Pin headscale's accept/reject decision and error body against
Tailscale SaaS by replaying captures recorded from a real tailnet.
Mirrors the tailscale_grants_compat_test.go pattern: glob over
testdata/policytest_results/, one t.Run per file, parse-or-SetPolicy
error must contain the captured api_response_body.message.
errPolicyTestsFailed is "test(s) failed" — Tailscale's literal body —
so substring match works against captured response bodies. Per-test
detail (src, dst, expected vs got) is preserved below the prefix for
the CLI / config-reload paths that don't have an audit endpoint.
knownPolicyTesterDivergences gates the 12 mismatches the captures
will surface so the suite stays green; engine fixes in follow-up
commits drop the entries as each is resolved.
Updates #1803
v2 silently dropped policy.tests, so a policy that contradicted its
own assertions still applied. Resolve src/dst via the existing Alias
machinery, walk the compiled global filter rules (acls and grants
both contribute), and run on every user-write boundary: SetPolicy,
the file watcher, and `headscale policy check`. A failing test
rejects the write before it mutates live state.
Boot-time reload skips evaluation; an already-stored policy that
references a deleted user shouldn't lock the server out.
`headscale policy check` is a thin frontend for the new CheckPolicy
gRPC method. The server-side handler builds a fresh PolicyManager
from the request bytes and the state's live users/nodes, runs
SetPolicy on the sandbox so the tests block executes, and returns
the result through gRPC status. No persistence, no policy_mode
coupling. --bypass-grpc-and-access-database-directly opens the DB
directly when the server is not running.
cmd/headscale/cli/root.go no longer special-cases `policy check` in
init() (the early return from PR #2580 broke --config registration
and viper priming for --bypass).
integration/cli_policy_test.go covers policy_mode={file,database} x
fixture={acl-only, acl+passing-tests, acl+failing-tests} x
bypass={false,true} = 12 rows.
Updates #1803
Co-authored-by: Janis Jansons <janhouse@gmail.com>
CheckPolicy validates a candidate policy against a running server's
live users and nodes (running its tests block) without persisting
anything. Used by 'headscale policy check' to replace the in-process
validation path the CLI runs today, which would otherwise need its
own database connection.
Updates #1803
MatchFromFilterRule only read DstPorts[].IP into the destination
IPSet. Cap-grant-only filter rules (e.g. tailscale.com/cap/relay)
carry their destinations in CapGrant[].Dsts, so the derived matchers
had empty dest sets and BuildPeerMap / ReduceNodes never exposed the
cap target to its source nodes. Without a companion IP-level grant
the relay node stayed invisible, so clients never tried to use it
and connections sat on DERP.
Union CapGrant[].Dsts into the destination IPSet alongside DstPorts.
Restores peer-visibility for any cap-grant-only relationship; the
peer-relay flow is the most visible instance.
Fixes#3256
revgrep needs pull_request.base.sha in the local clone to compute
the diff against new code. With fetch-depth: 2, only HEAD and one
parent are fetched, so a stale base SHA (when main moves between
PR syncs) is not reachable and revgrep falls through, surfacing
pre-existing issues outside the PR scope.
Replace the grep/awk hash extraction in build.yml with a structured
vendorhash check step; the PR review comment now reads expected/
actual values directly from $GITHUB_OUTPUT instead of scraping Nix
stderr. Add a prek hook so divergence is caught locally before push.
Move the headscale vendorHash out of flake.nix into a content-
addressed flakehashes.json maintained by a small Go tool. The
schema and goModFingerprint algorithm mirror upstream tailscale's
tool/updateflakes so a future shared library extraction is trivial.
vendorhash check verifies flakehashes.json against the current
go.mod/go.sum. Hot path is a sha256 over those two files, so
re-runs without input change are essentially free; only an actual
fingerprint drift triggers go mod vendor + nardump.SRI.
vendorhash update recomputes both fields and rewrites the JSON.
The nix-vendor-sri devShell shim now wraps it.
Pulls in the cmd/nardump library split (tailscale/tailscale#19551)
so flakehashes.json tooling can import nardump.SRI directly.
Side effects: Go directive bumps to 1.26.2 and the nixpkgs lock
advances to a revision shipping go 1.26.2.
Add --bypass-grpc-and-access-database-directly to policy check so
the new ambiguous-user validator runs against the live user list.
Without the flag, policy check stays a syntax-only check and the
success message says so.
Updates #3160