573 Commits

Author SHA1 Message Date
Kristoffer Dalby
742878d172 all: regenerate generated files for new tool versions
The nix dev shell refresh in 758fef9b pulled in protoc-gen-go-grpc
v1.6.1 and newer tailscale.com/cmd/{viewer,cloner}, so rerunning
`make generate` updates the version header comments in the three
affected generated files. No semantic changes.
2026-04-09 18:42:25 +01:00
Jacky
7c756b8201 db: scope DestroyUser to only delete the target user's pre-auth keys
DestroyUser called ListPreAuthKeys(tx) which returns ALL pre-auth keys
across all users, then deleted every one of them. This caused deleting
any single user to wipe out pre-auth keys for every other user.

Extract a ListPreAuthKeysByUser function (consistent with the existing
ListNodesByUser pattern) and use it in DestroyUser to scope key deletion
to the user being destroyed.

Add unit test (table-driven in TestDestroyUserErrors) and integration
test to prevent regression.

Fixes #3154

Co-authored-by: Kristoffer Dalby <kristoffer@dalby.cc>
2026-04-09 08:30:21 +01:00
Kristoffer Dalby
6ae182696f state: fix policy change race in UpdateNodeFromMapRequest
When UpdateNodeFromMapRequest and SetNodeTags race on persistNodeToDB,
the first caller to run updatePolicyManagerNodes detects the tag change
and returns a PolicyChange. The second caller finds no change and falls
back to NodeAdded.

If UpdateNodeFromMapRequest wins the race, it checked
policyChange.IsFull() which is always false for PolicyChange (only sets
IncludePolicy and RequiresRuntimePeerComputation). This caused the
PolicyChange to be dropped, so affected clients never received
PeersRemoved and the stale peer remained in their NetMap indefinitely.

Fix: check !policyChange.IsEmpty() instead, which correctly detects
any non-trivial policy change including PolicyChange().

This fixes the root cause of TestACLTagPropagation/multiple-tags-partial-
removal flaking at ~20% on CI.

Updates #3125
2026-04-08 14:32:08 +01:00
Kristoffer Dalby
ccddeceeec state: fix GORM not persisting user_id=NULL on tagged node conversion
GORM's struct-based Updates() silently skips nil pointer fields.
When SetNodeTags sets node.UserID = nil to transfer ownership to tags,
the in-memory NodeStore is correct but the database retains the old
user_id value. This causes tagged nodes to remain associated with the
original user in the database, preventing user deletion and risking
ON DELETE CASCADE destroying tagged nodes.

Add Select("*") before Omit() on all three node persistence paths
to force GORM to include all fields in the UPDATE statement, including
nil pointers. This is the same pattern already used in db/ip.go for
IPv4/IPv6 nil handling, and is documented GORM behavior:

  db.Select("*").Omit("excluded").Updates(struct)

The three affected paths are:
- persistNodeToDB: used by SetNodeTags and MapRequest updates
- applyAuthNodeUpdate: used by re-authentication with --advertise-tags
- HandleNodeFromPreAuthKey: used by PAK re-registration

Fixes #3161
2026-04-08 14:32:08 +01:00
Kristoffer Dalby
580dcad683 hscontrol: add tests for SetTags user_id database persistence
Add four tests that verify the tags-as-identity ownership transition
correctly persists to the database when converting a user-owned node
to a tagged node via SetTags:

- TestSetTags_ClearsUserIDInDatabase: verifies user_id is NULL in DB
- TestSetTags_NodeDisappearsFromUserListing: verifies ListNodes by user
- TestSetTags_NodeStoreAndDBConsistency: verifies in-memory and DB agree
- TestSetTags_UserDeletionDoesNotCascadeToTaggedNode: verifies user
  deletion does not cascade-delete tagged nodes

Three of these tests currently fail because GORM's struct-based
Updates() silently skips nil pointer fields, so user_id is never
written as NULL to the database after SetNodeTags clears it in memory.

Updates #3161
2026-04-08 14:32:08 +01:00
Kristoffer Dalby
442fcdbd33 capver: regenerate for tailscale v1.96
go generate ./hscontrol/capver/...

Adds v1.96 (capVer 133) to tailscaleToCapVer and capVerToTailscaleVer,
rolls the 10-version support window forward so MinSupportedCapabilityVersion
is now 109 (v1.78), and refreshes the test fixture accordingly.
2026-04-08 13:00:22 +01:00
Kristoffer Dalby
380f531342 state: trigger PolicyChange on every Connect and Disconnect
Connect and Disconnect previously only appended a PolicyChange when
the affected node was a subnet router (routeChange) or the database
persist returned a full change. For every other node the peers just
received a small PeerChangedPatch{Online: ...} and no filter rules
were recomputed. That was too narrow: a node going offline or coming
online can affect policy compilation in ways beyond subnet routes.

TestGrantCapRelay Phase 4 exposed this. When the cap/relay target node
went down with `tailscale down`, headscale only sent an Online=false
patch, peers never got a recomputed netmap, and their cached
PeerRelay allocation stayed populated until the 120s assertion
timeout. With a PolicyChange queued on Disconnect, peers immediately
receive a full netmap on relay loss and clear PeerRelay as expected;
the symmetric change on Connect lets Phase 5 re-publish the policy
when the relay comes back.

Drop the now-unused routeChange return from the Disconnect gate.

Updates #2180
2026-04-08 13:00:22 +01:00
Kristoffer Dalby
ff29af63f6 servertest: use memnet networking and add WithNodeExpiry option
Replace httptest (real TCP sockets) with tailscale.com/net/memnet
so all connections stay in-process. Wire the client's tsdial.Dialer
to the server's memnet.Network via SetSystemDialerForTest,
preserving the full Noise protocol path.

Also update servertest to use the new Node.Ephemeral.InactivityTimeout
config path introduced in the types refactor, and add WithNodeExpiry
server option for testing default node key expiry behaviour.

Updates #1711
2026-04-08 13:00:22 +01:00
Kristoffer Dalby
7e8930c507 hscontrol: add tests for default node key expiry
Add tests covering the core expiry scenarios:
- Untagged auth key with zero expiry gets configured default
- Tagged nodes ignore node.expiry
- node.expiry=0 disables default (backwards compatible)
- Client-requested expiry takes precedence
- Re-registration refreshes the default expiry

Updates #1711
2026-04-08 13:00:22 +01:00
Kristoffer Dalby
6337a3dbc4 state: apply default node key expiry on registration
Use the node.expiry config to apply a default expiry to non-tagged
nodes when the client does not request a specific expiry. This covers
all registration paths: new node creation, re-authentication, and
pre-auth key re-registration.

Tagged nodes remain exempt and never expire.

Fixes #1711
2026-04-08 13:00:22 +01:00
Kristoffer Dalby
4d0b273b90 types: add node.expiry config, deprecate oidc.expiry
Introduce a structured NodeConfig that replaces the flat
EphemeralNodeInactivityTimeout field with a nested Node section.

Add node.expiry config (default: no expiry) as the unified default key
expiry for all non-tagged nodes regardless of registration method.

Remove oidc.expiry entirely — node.expiry now applies to OIDC nodes
the same as all other registration methods. Using oidc.expiry in the
config is a hard error. determineNodeExpiry() returns nil (no expiry)
unless use_expiry_from_token is enabled, letting state.go apply the
node.expiry default uniformly.

The old ephemeral_node_inactivity_timeout key is preserved for
backwards compatibility.

Updates #1711
2026-04-08 13:00:22 +01:00
Kristoffer Dalby
835db974b5 testdata: strip unused fields from all test data files (23MB -> 4MB)
Strip fields not consumed by any test from all 594 HuJSON test data files:

grant_results/ (248 files, 21MB -> 1.8MB):
  - Remove: timestamp, propagation_wait_seconds, input.policy_file,
    input.grants_section, input.api_endpoint, input.api_method,
    topology.nodes.mts_name, topology.nodes.socket, topology.nodes.user_id,
    captures.commands, captures.packet_filter_matches, captures.whois
  - V14-V16, V26-V36: keep stripped netmap (Peers.Name/AllowedIPs/PrimaryRoutes
    + PacketFilterRules) for via_compat_test.go compatibility
  - V17-V25: strip netmap (old topology, incompatible with via_compat harness)

acl_results/ (215 files, 1.4MB -> 1.2MB):
  - Remove: timestamp, propagation_wait_seconds, input.policy_file,
    input.api_endpoint, input.api_response_code, entire topology section
    (parsed by Go struct but completely ignored — nodes are hardcoded)

routes_results/ (92 files, unchanged — topology is actively used):
  - Remove: timestamp, propagation_wait_seconds, input.policy_file,
    input.api_endpoint, input.api_response_code

ssh_results/ (39 files, unchanged — minimal to begin with):
  - Remove: policy_file
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
30dce30a9d testdata: convert .json to .hujson with header comments
Rename all 594 test data files from .json to .hujson and add
descriptive header comments to each file documenting what policy
rules are under test and what outcome is expected.

Update test loaders in all 5 _test.go files to parse HuJSON via
hujson.Parse/Standardize/Pack before json.Unmarshal.

Add cross-dependency warning to via_compat_test.go documenting
that GRANT-V29/V30/V31/V36 are shared with TestGrantsCompat.

Add .gitignore exemption for testdata HuJSON files.
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
abd2b15db5 policy/v2: clean up dead error variables, stale TODO, and test skip reasons
Remove unused error variables (ErrGrantViaNotSupported, ErrGrantEmptySources, ErrGrantEmptyDestinations, ErrGrantViaOnlyTag) and the stale TODO for via implementation. Update compat test skip reasons to reflect that user:*@passkey wildcard is a known unsupported feature, not a pending implementation.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
b762e4c350 integration: remove exit node via grant tests
Remove TestGrantViaExitNodeSteering and TestGrantViaMixedSteering.
Exit node traffic forwarding through via grants cannot be validated
with curl/traceroute in Docker containers because Tailscale exit nodes
strip locally-connected subnets from their forwarding filter.

The correctness of via exit steering is validated by:
- Golden MapResponse comparison (TestViaGrantMapCompat with GRANT-V31
  and GRANT-V36) comparing full netmap output against Tailscale SaaS
- Filter rule compatibility (TestGrantsCompat with GRANT-V14 through
  GRANT-V36) comparing per-node PacketFilter rules against Tailscale SaaS
- TestGrantViaSubnetSteering (kept) validates via subnet steering with
  actual curl/traceroute through Docker, which works for subnet routes

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
c36cedc32f policy/v2: fix via grants in BuildPeerMap, MatchersForNode, and ViaRoutesForPeer
Use per-node compilation path for via grants in BuildPeerMap and MatchersForNode to ensure via-granted nodes appear in peer maps. Fix ViaRoutesForPeer golden test route inference to correctly resolve via grant effects.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
6a55f7d731 policy/v2: add via exit steering golden captures and tests
Add golden test data for via exit route steering and fix via exit grant compilation to match Tailscale SaaS behavior. Includes MapResponse golden tests for via grant route steering verification.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
0431039f2a servertest: add regression tests for via grant filter rules
Add three tests that verify control plane behavior for grant policies:

- TestGrantViaSubnetFilterRules: verifies the router's PacketFilter
  contains destination rules for via-steered subnets. Without per-node
  filter compilation for via grants, these rules were missing and the
  router would drop forwarded traffic.

- TestGrantViaExitNodeFilterRules: same verification for exit nodes
  with via grants steering autogroup:internet traffic.

- TestGrantIPv6OnlyPrefixACL: verifies that address-based aliases
  (Prefix, Host) resolve to exactly the literal prefix and do not
  expand to include the matching node's other IP addresses. An
  IPv6-only host definition produces only IPv6 filter rules.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
ccd284c0a5 policy/v2: use per-node filter compilation for via grants
Via grants compile filter rules that depend on the node's route state
(SubnetRoutes, ExitRoutes). Without per-node compilation, these rules
were only included in the global filter path which explicitly skips via
grants (compileFilterRules skips grants with non-empty Via fields).

Add a needsPerNodeFilter flag that is true when the policy uses either
autogroup:self or via grants. filterForNodeLocked now uses this flag
instead of usesAutogroupSelf alone, ensuring via grant rules are
compiled per-node through compileFilterRulesForNode/compileViaGrant.

The filter cache also needs to account for route-dependent compilation:

- nodesHavePolicyAffectingChanges now treats route changes as
  policy-affecting when needsPerNodeFilter is true, so SetNodes
  triggers updateLocked and clears caches through the normal flow.

- invalidateGlobalPolicyCache now clears compiledFilterRulesMap
  (the unreduced per-node cache) alongside filterRulesMap when
  needsPerNodeFilter is true and routes changed.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
3ca4ff8f3f state,servertest: add grant control plane tests and fix via route ReduceRoutes filtering
Add servertest grant policy control plane tests covering basic grants, via grants, and cap grants. Fix ReduceRoutes in State to apply route reduction to non-via routes first, then append via-included routes, preventing via grant routes from being incorrectly filtered.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
5cd5e5de69 policy/v2: add unit tests for ViaRoutesForPeer
Test via route computation for viewer-peer pairs: self-steering returns
empty, viewer not in source returns empty, peer without advertised
destination returns empty, peer with/without via tag populates
Include/Exclude respectively, mixed prefix and autogroup:internet
destinations, and exit route steering.

7 subtests covering all code paths in ViaRoutesForPeer.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
08d26e541c policy/v2: add unit tests for grant filter compilation helpers
Test companionCapGrantRules, sourcesHaveWildcard, sourcesHaveDangerAll,
srcIPsWithRoutes, the FilterAllowAll fix for grant-only policies,
compileViaGrant, compileGrantWithAutogroupSelf grant paths, and
destinationsToNetPortRange autogroup:internet skipping.

51 subtests across 8 test functions covering all grant-specific code
paths in filter.go that previously had no test coverage.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
d243adaedd types,mapper,integration: enable Taildrive and add cap/drive grant lifecycle test
Add NodeAttrsTaildriveShare and NodeAttrsTaildriveAccess to the node capability map, enabling Taildrive file sharing when granted via policy. Add integration test verifying the full cap/drive grant lifecycle.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
8573ff9158 policy/v2: fix grant-only policies returning FilterAllowAll
compileFilterRules checked only pol.ACLs == nil to decide whether
to return FilterAllowAll (permit-any). Policies that use only Grants
(no ACLs) had nil ACLs, so the function short-circuited before
compiling any CapGrant rules. This meant cap/relay, cap/drive, and
any other App-based grant capabilities were silently ignored.

Check both ACLs and Grants are empty before returning FilterAllowAll.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
8358017dcf policy/v2,state,mapper: implement per-viewer via route steering
Via grants steer routes to specific nodes per viewer. Until now,
all clients saw the same routes for each peer because route
assembly was viewer-independent. This implements per-viewer route
visibility so that via-designated peers serve routes only to
matching viewers, while non-designated peers have those routes
withdrawn.

Add ViaRouteResult type (Include/Exclude prefix lists) and
ViaRoutesForPeer to the PolicyManager interface. The v2
implementation iterates via grants, resolves sources against the
viewer, matches destinations against the peer's advertised routes
(both subnet and exit), and categorizes prefixes by whether the
peer has the via tag.

Add RoutesForPeer to State which composes global primary election,
via Include/Exclude filtering, exit routes, and ACL reduction.
When no via grants exist, it falls back to existing behavior.

Update the mapper to call RoutesForPeer per-peer instead of using
a single route function for all peers. The route function now
returns all routes (subnet + exit), and TailNode filters exit
routes out of the PrimaryRoutes field for HA tracking.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
28be15f8ea policy/v2: handle autogroup:internet in via grant compilation
compileViaGrant only handled *Prefix destinations, skipping
*AutoGroup entirely. This meant via grants with
dst=[autogroup:internet] produced no filter rules even when the
node was an exit node with approved exit routes.

Switch the destination loop from a type assertion to a type switch
that handles both *Prefix (subnet routes) and *AutoGroup (exit
routes via autogroup:internet). Also check ExitRoutes() in
addition to SubnetRoutes() so the function doesn't bail early
when a node only has exit routes.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
687cf0882f policy/v2: implement autogroup:danger-all support
Add autogroup:danger-all as a valid source alias that matches ALL IP
addresses including non-Tailscale addresses. When used as a source,
it resolves to 0.0.0.0/0 + ::/0 internally but produces SrcIPs: ["*"]
in filter rules. When used as a destination, it is rejected with an
error matching Tailscale SaaS behavior.

Key changes:
- Add AutoGroupDangerAll constant and validation
- Add sourcesHaveDangerAll() helper and hasDangerAll parameter to
  srcIPsWithRoutes() across all compilation paths
- Add ErrAutogroupDangerAllDst for destination rejection
- Remove 3 AUTOGROUP_DANGER_ALL skip entries (K6, K7, K8)

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
4f040dead2 policy/v2: implement grant validation rules matching Tailscale SaaS
Implement comprehensive grant validation including: accept empty sources/destinations (they produce no rules), validate grant ip/app field requirements, capability name format, autogroup constraints, via tag existence, and default route CIDR restrictions.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
54db47badc policy/v2: implement via route compilation for grants
Compile grants with "via" field into FilterRules that are placed only
on nodes matching the via tag and actually advertising the destination
subnets. Key behavior:

- Filter rules go exclusively to via-nodes with matching approved routes
- Destination subnets not advertised by the via node are silently dropped
- App-only via grants (no ip field) produce no packet filter rules
- Via grants are skipped in the global compileFilterRules since they
  are node-specific

Reduces grant compat test skips from 41 to 30 (11 newly passing).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
0e3acdd8ec policy/v2: implement CapGrant compilation with companion capabilities
Compile grant app fields into CapGrant FilterRules matching Tailscale
SaaS behavior. Key changes:

- Generate CapGrant rules in compileFilterRules and
  compileGrantWithAutogroupSelf, with node-specific /32 and /128
  Dsts for autogroup:self grants
- Add reversed companion rules for drive→drive-sharer and
  relay→relay-target capabilities, ordered by original cap name
- Narrow broad CapGrant Dsts to node-specific prefixes in
  ReduceFilterRules via new reduceCapGrantRule helper
- Skip merging CapGrant rules in mergeFilterRules to preserve
  per-capability structure
- Remove ip+app mutual exclusivity validation (Tailscale accepts both)
- Add semantic JSON comparison for RawMessage types and netip.Prefix
  comparators in test infrastructure

Reduces grant compat test skips from 99 to 41 (58 newly passing).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
ebe0f4078d policy/v2: preserve non-wildcard source IPs alongside wildcard ranges
When an ACL source list contains a wildcard (*) alongside explicit
sources (tags, groups, hosts, etc.), Tailscale preserves the individual
IPs from non-wildcard sources in SrcIPs alongside the merged wildcard
CGNAT ranges. Previously, headscale's IPSetBuilder would merge all
sources into a single set, absorbing the explicit IPs into the wildcard
range.

Track non-wildcard resolved addresses separately during source
resolution, then append their individual IP strings to the output
when a wildcard is also present. This fixes the remaining 5 ACL
compat test failures (K01 and M06 subtests).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
dda35847b0 policy/v2: reorder ACL self grants to match Tailscale rule ordering
When an ACL has non-autogroup destinations (groups, users, tags, hosts)
alongside autogroup:self, emit non-self grants before self grants to
match Tailscale's filter rule ordering. ACLs with only autogroup
destinations (self + member) preserve the policy-defined order.

This fixes ACL-A17, ACL-SF07, and ACL-SF11 compat test failures.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
f95b254ea9 policy/v2: exclude exit routes from ReduceFilterRules
Add exit route check in ReduceFilterRules to prevent exit nodes from receiving packet filter rules for destinations that only overlap via exit routes. Remove resolved SUBNET_ROUTE_FILTER_RULES grant skip entries and update error message formatting for grant validation.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
e05f45cfb1 policy/v2: use approved node routes in wildcard SrcIPs
Per Tailscale documentation, the wildcard (*) source includes "any
approved subnets" — the actually-advertised-and-approved routes from
nodes, not the autoApprover policy prefixes.

Change Asterix.resolve() to return just the base CGNAT+ULA set, and
add approved subnet routes as separate SrcIPs entries in the filter
compilation path. This preserves individual route prefixes that would
otherwise be merged by IPSet (e.g., 10.0.0.0/8 absorbing 10.33.0.0/16).

Also swap rule ordering in compileGrantWithAutogroupSelf() to emit
non-self destination rules before autogroup:self rules, matching the
Tailscale FilterRule wire format ordering.

Remove the unused AutoApproverPolicy.prefixes() method.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
995ed0187c policy/v2: add advertised routes to compat test topologies
Add routable_ips and approved_routes fields to the node topology
definitions in all golden test files. These represent the subnet
routes actually advertised by nodes on the Tailscale SaaS network
during data capture:

  Routes topology (92 files, 6 router nodes):
    big-router:     10.0.0.0/8
    subnet-router:  10.33.0.0/16
    ha-router1:     192.168.1.0/24
    ha-router2:     192.168.1.0/24
    multi-router:   172.16.0.0/24
    exit-node:      0.0.0.0/0, ::/0

  ACL topology (199 files, 1 router node):
    subnet-router:  10.33.0.0/16

  Grants topology (203 files, 1 router node):
    subnet-router:  10.33.0.0/16

The route assignments were deduced from the golden data by analyzing
which router nodes receive FilterRules for which destination CIDRs
across all test files, and cross-referenced with the MTS setup
script (setup_grant_nodes.sh).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
927ce418d2 policy/v2: use bare IPs in autogroup:self DstPorts
Use ip.String() instead of netip.PrefixFrom(ip, ip.BitLen()).String()
when building DstPorts for autogroup:self destinations. This produces
bare IPs like "100.90.199.68" instead of CIDR notation like
"100.90.199.68/32", matching the Tailscale FilterRule wire format.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
93d79d8da9 policy: include IPv6 in identity-based alias resolution
AppendToIPSet now adds both IPv4 and IPv6 addresses for nodes, matching Tailscale's FilterRule wire format where identity-based aliases (tags, users, groups, autogroups) resolve to both address families. Update ReduceFilterRules test expectations to include IPv6 entries.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
500442c8f1 policy/v2: convert routes compat tests to data-driven format with Tailscale SaaS captures
Replace 8,286 lines of inline Go test expectations with 92 JSON golden files captured from Tailscale SaaS. The data-driven test driver validates route filtering, auto-approval, HA routing, and exit node behavior against real Tailscale output.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
2fb71690e8 policy/v2: convert ACL compat tests to data-driven format with Tailscale SaaS captures
Replace 9,937 lines of inline Go test expectations with 215 JSON golden files captured from Tailscale SaaS. The new data-driven test driver compares headscale's filter compilation output against real Tailscale behavior for each node in an 8-node topology.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
9f7aa55689 policy/v2: refactor alias resolution to use ResolvedAddresses
Introduce ResolvedAddresses type for structured IP set results. Refactor all Alias.Resolve() methods to return ResolvedAddresses instead of raw IPSets. Restrict identity-based aliases to matching address families, fix nil dereferences in partial resolution paths, and update test expectations for the new IP format (bare IPs, IP ranges instead of CIDR prefixes).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
0fa9dcaff8 policy/v2: add data-driven grants compatibility test with Tailscale SaaS captures
Rename tailscale_compat_test.go to tailscale_acl_compat_test.go to make room for the grants compat test. Add 237 GRANT-*.json golden test files captured from Tailscale SaaS and a data-driven test driver that compares headscale's grant filter compilation against real Tailscale behavior.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
f74ea5b8ed hscontrol/policy/v2: add Grant policy format support
Add support for the Grant policy format as an alternative to ACL format,
following Tailscale's policy v2 specification. Grants provide a more
structured way to define network access rules with explicit separation
of IP-based and capability-based permissions.

Key changes:

- Add Grant struct with Sources, Destinations, InternetProtocols (ip),
  and App (capabilities) fields
- Add ProtocolPort type for unmarshaling protocol:port strings
- Add Grant validation in Policy.validate() to enforce:
  - Mutual exclusivity of ip and app fields
  - Required ip or app field presence
  - Non-empty sources and destinations
- Refactor compileFilterRules to support both ACLs and Grants
- Convert ACLs to Grants internally via aclToGrants() for unified
  processing
- Extract destinationsToNetPortRange() helper for cleaner code
- Rename parseProtocol() to toIANAProtocolNumbers() for clarity
- Add ProtocolNumberToName mapping for reverse lookups

The Grant format allows policies to be written using either the legacy
ACL format or the new Grant format. ACLs are converted to Grants
internally, ensuring backward compatibility while enabling the new
format's benefits.

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
53b8a81d48 servertest: support tagged pre-auth keys in test clients
WithTags was defined but never passed through to CreatePreAuthKey.
Fix NewClient to use CreateTaggedPreAuthKey when tags are specified,
enabling tests that need tagged nodes (e.g. via grant steering).

Updates #2180
2026-04-01 14:10:42 +01:00
Kristoffer Dalby
15c1cfd778 types: include ExitRoutes in HasNetworkChanges
When exit routes are approved, SubnetRoutes remains empty because exit
routes (0.0.0.0/0, ::/0) are classified separately. Without checking
ExitRoutes, the PolicyManager cache is not invalidated on exit route
approval, causing stale filter rules that lack via grant entries for
autogroup:internet destinations.

Updates #2180
2026-04-01 14:10:42 +01:00
Tanayk07
568baf3d02 fix: align banner right-side border to consistent 64-char width 2026-03-19 07:08:35 +01:00
Tanayk07
5105033224 feat: add prominent warning banner for non-standard IP prefixes
Add a highly visible ASCII-art warning banner that is printed at
startup when the configured IP prefixes fall outside the standard
Tailscale CGNAT (100.64.0.0/10) or ULA (fd7a:115c:a1e0::/48) ranges.

The warning fires once even if both v4 and v6 are non-standard, and
the warnBanner() function is reusable for other critical configuration
warnings in the future.

Also updates config-example.yaml to clarify that subsets of the
default ranges are fine, but ranges outside CGNAT/ULA are not.

Closes #3055
2026-03-19 07:08:35 +01:00
Kristoffer Dalby
3d53f97c82 hscontrol/servertest: fix test expectations for eventual consistency
Three corrections to issue tests that had wrong assumptions about
when data becomes available:

1. initial_map_should_include_peer_online_status: use WaitForCondition
   instead of checking the initial netmap. Online status is set by
   Connect() which sends a PeerChange patch after the initial
   RegisterResponse, so it may not be present immediately.

2. disco_key_should_propagate_to_peers: use WaitForCondition. The
   DiscoKey is sent in the first MapRequest (not RegisterRequest),
   so peers may not see it until a subsequent map update.

3. approved_route_without_announcement: invert the test expectation.
   Tailscale uses a strict advertise-then-approve model -- routes are
   only distributed when the node advertises them (Hostinfo.RoutableIPs)
   AND they are approved. An approval without advertisement is a dormant
   pre-approval. The test now asserts the route does NOT appear in
   AllowedIPs, matching upstream Tailscale semantics.

Also fix TestClient.Reconnect to clear the cached netmap and drain
pending updates before re-registering. Without this, WaitForPeers
returned immediately based on the old session's stale data.
2026-03-19 07:05:58 +01:00
Kristoffer Dalby
1053fbb16b hscontrol/state: fix online status reset during re-registration
Two fixes to how online status is handled during registration:

1. Re-registration (applyAuthNodeUpdate, HandleNodeFromPreAuthKey) no
   longer resets IsOnline to false. Online status is managed exclusively
   by Connect()/Disconnect() in the poll session lifecycle. The reset
   caused a false offline blip: the auth handler's change notification
   triggered a map regeneration showing the node as offline to peers,
   even though Connect() would set it back to true moments later.

2. New node creation (createAndSaveNewNode) now explicitly sets
   IsOnline=false instead of leaving it nil. This ensures peers always
   receive a known online status rather than an ambiguous nil/unknown.
2026-03-19 07:05:58 +01:00
Kristoffer Dalby
b09af3846b hscontrol/poll,state: fix grace period disconnect TOCTOU race
When a node disconnects, serveLongPoll defers a cleanup that starts a
grace period goroutine. This goroutine polls batcher.IsConnected() and,
if the node has not reconnected within ~10 seconds, calls
state.Disconnect() to mark it offline. A TOCTOU race exists: the node
can reconnect (calling Connect()) between the IsConnected check and
the Disconnect() call, causing the stale Disconnect() to overwrite
the new session's online status.

Fix with a monotonic per-node generation counter:

- State.Connect() increments the counter and returns the current
  generation alongside the change list.
- State.Disconnect() accepts the generation from the caller and
  rejects the call if a newer generation exists, making stale
  disconnects from old sessions a no-op.
- serveLongPoll captures the generation at Connect() time and passes
  it to Disconnect() in the deferred cleanup.
- RemoveNode's return value is now checked: if another session already
  owns the batcher slot (reconnect happened), the old session skips
  the grace period entirely.

Update batcher_test.go to track per-node connect generations and
pass them through to Disconnect(), matching production behavior.

Fixes the following test failures:
- server_state_online_after_reconnect_within_grace
- update_history_no_false_offline
- nodestore_correct_after_rapid_reconnect
- rapid_reconnect_peer_never_sees_offline
2026-03-19 07:05:58 +01:00
Kristoffer Dalby
00c41b6422 hscontrol/servertest: add race, stress, and poll race tests
Add three test files designed to stress the control plane under
concurrent and adversarial conditions:

- race_test.go: 14 tests exercising concurrent mutations, session
  replacement, batcher contention, NodeStore access, and map response
  delivery during disconnect. All pass the Go race detector.

- poll_race_test.go: 8 tests targeting the poll.go grace period
  interleaving. These confirm a logical TOCTOU race: when a node
  disconnects and reconnects within the grace period, the old
  session's deferred Disconnect() can overwrite the new session's
  Connect(), leaving IsOnline=false despite an active poll session.

- stress_test.go: sustained churn, rapid mutations, rolling
  replacement, data integrity checks under load, and verification
  that rapid reconnects do not leak false-offline notifications.

Known failing tests (grace period TOCTOU race):
- server_state_online_after_reconnect_within_grace
- update_history_no_false_offline
- rapid_reconnect_peer_never_sees_offline
2026-03-19 07:05:58 +01:00