headscale

mirror of https://github.com/juanfont/headscale.git synced 2026-05-23 18:48:42 +09:00

Author	SHA1	Message	Date
Kristoffer Dalby	b5b786f519	servertest: cover broader-dst via grant in filter test TestGrantViaSubnetFilterRules pins exact-equality dst. Add a sibling for the broader-dst case so the regression sits at the server level alongside the policy-engine unit test. Updates #3267	2026-05-18 14:02:00 +02:00
Kristoffer Dalby	2cb914df59	policy/v2: add SaaS goldens for via-grant prefix containment Captures from Tailscale SaaS exercising broader, narrower, host alias, disjoint, and 4via6 grant destinations against advertised subnet routes. TestGrantsCompat replays them. Updates #3267	2026-05-18 14:02:00 +02:00
Kristoffer Dalby	e5fcd01ee6	policy/v2: match via-grant destinations by prefix overlap slices.Contains required exact equality between grant dst and the advertised subnet route. Any non-identical pair was rejected, so a via grant with broader (or narrower) dst emitted no filter rule and added no route to the viewer's AllowedIPs. Tailscale SaaS uses containment in either direction. Switch to slices.ContainsFunc(routes, dst.Overlaps) for filter rule emission (keep dst literal in DstPorts), and append overlapping advertised routes to ViaRoutesForPeer.Include / Exclude. Rewrite the multi-router HA election and regular-grant overlap detection to key off the matched routes rather than the dst. Resolve Host aliases to Prefix once in compileOneViaGrant and at the top of ViaRoutesForPeer so the switch arms reach them. Fixes #3267	2026-05-18 14:02:00 +02:00
Kristoffer Dalby	af7e7a4560	db: remove unused SetApprovedRoutes and SetTags helpers Both helpers existed to write the literal "[]" when clearing a slice column — a workaround for GORM's struct-Updates skipping nil slices. The State path goes exclusively through persistNodeToDB, which is now correct end-to-end thanks to the named IsZero slice types, so the helpers are dead in production. The remaining callers were tests. TestSetTags is dropped — TestSetTags_* in hscontrol/grpcv1_test.go already covers the State path that production uses. TestAutoApproveRoutes now writes routes via DB.Save on the loaded node, which is the path gRPC SetApprovedRoutes drives in production. Updates #3110	2026-05-15 11:21:58 +02:00
Kristoffer Dalby	b1196baf6d	state: add regression test for Node slice persistence Drives the persist path for ApprovedRoutes, Tags and Endpoints — seed a non-empty value, clear to nil, read the column back from disk, then close the State and reopen one against the same sqlite file to simulate a server restart. Pins the contract the named IsZero slice types enforce so future changes to the persist path cannot silently drop a cleared slice column. Updates #3110	2026-05-15 11:21:58 +02:00
Kristoffer Dalby	7a20db9f49	types: persist Node JSON slices via named IsZero types Endpoints, Tags and ApprovedRoutes serialize as JSON on Node. GORM's struct Updates path skips fields it considers zero, and reflect treats a nil slice as zero — clearing any of these columns via the State persist path would leave the previous value in the database. Introduce Strings, Prefixes and AddrPorts as named slice types whose IsZero() always reports false, so GORM keeps the column in the UPDATE regardless of the slice being nil or empty. JSON marshalling is unchanged: nil serializes to null, empty to []. List() returns the underlying unnamed slice for callers (mainly testify assertions over reflect.DeepEqual) that distinguish the named type from its base. Regenerated types_clone.go and types_view.go follow the field-type swap. Test assertions across hscontrol/{db,state,servertest} updated to call .List() where reflect.DeepEqual previously matched the raw slice type. Fixes #3110	2026-05-15 11:21:58 +02:00
Kristoffer Dalby	26eebcea5a	policy/v2: add sshtester compat runner Replays recorded policy responses for the sshTests block. 200 captures must evaluate; non-200 captures must reject with the recorded body as a substring of the headscale error. Divergences are listed in knownSSHTesterDivergences.	2026-05-13 21:10:13 +02:00
Kristoffer Dalby	013dea4f40	policy/v2: evaluate sshTests at write boundary SetPolicy and policy check now compile per-dst SSH rules and replay each sshTests entry. The accept assertion treats check-action rules as reachable; the check assertion requires HoldAndDelegate on the matching rule. Boot reload warns and continues.	2026-05-13 21:10:13 +02:00
Kristoffer Dalby	6a0a297c7f	policy/v2: validate sshTests at parse Adds SSHPolicyTest plus parse-time validation: empty src/dst, port/CIDR/autogroup-internet destinations, and tag references missing from tagOwners are rejected. Engine evaluation comes in a follow-up.	2026-05-13 21:10:13 +02:00
Kristoffer Dalby	d600090f2c	policy/v2: align SSH rule validation with Tailscale Trim whitespace on action, users, src, dst; reject empty/wildcard users; reject empty acceptEnv; reject negative and over-max checkPeriod; reject hosts-table aliases as SSH dst; reject non-ASCII tag names; tolerate tag-owner cycles; match group-nesting wording.	2026-05-13 21:10:13 +02:00
Kristoffer Dalby	4ad200ab73	hscontrol: preserve nil expiry on tailscaled restart The guard added for #2862 in handleRegister checked node.Expiry().Valid() before preserving node state on Auth=nil + Expiry=zero registration requests. Valid() returns false when node.Expiry is nil, the default for tagged nodes and for untagged nodes registered against a preauth key with no default node.expiry configured. Both fell through to handleLogout, which wrote &time.Time{} (0001-01-01T00:00:00Z) over the original nil — the user-visible 0001-01-01 expiry that `headscale nodes list` reports after restart. IsExpired() already returns false for both nil and zero-time, so the Valid() check was redundant. Drop it so all nil-expiry nodes are covered by the same preservation path. Fixes #3170 Fixes #3262	2026-05-13 17:06:16 +02:00
Kristoffer Dalby	5d502bfb88	types/node, mapper: strip own IPv4 from emission when node has disable-ipv4 cap When a node carries the disable-ipv4 nodeAttr documented at https://tailscale.com/docs/reference/troubleshooting/network-configuration/cgnat-conflicts, SaaS stops sending the node's CGNAT IPv4 prefix in MapResponse. The allocator keeps assigning IPv4 server-side; only the wire-shape delivery is filtered. Subnet routes the node advertises -- including IPv4 prefixes -- survive in AllowedIPs and PrimaryRoutes. TailNode now drops Is4 prefixes from Addresses and from the node's own /32 slot in AllowedIPs when selfPolicyCaps carries disable-ipv4. Mapper.buildTailPeers passes each peer's policy CapMap so the filter applies in viewer netmaps too; the CapMap merge that follows is overwritten by PeerCapMap so only the address filter survives on the peer path. Two captures land in testdata/nodeattrs_results to anchor the behaviour: - nodeattrs-attr-c15-disable-ipv4 (on tag:client) - nodeattrs-attr-c16-disable-ipv4-router (on tag:router, which advertises 10.33.0.0/16, confirming subnet routes survive)	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	64d13f77e8	types/config, types/node: model default-auto-update from auto_update.enabled Tailscale stamps tailcfg.NodeAttrDefaultAutoUpdate on every node's CapMap with a JSON bool reflecting the tailnet-wide auto-update default. Headscale grows an auto_update.enabled config option and emits the cap accordingly from TailNode -- the cap leaves the unmodelledTailnetStateCaps strip list and is compared in full by the nodeAttrs compat suite. testNodeAttrsSuccess drives cfg.AutoUpdate.Enabled from tf.Input.Tailnet.Settings.DevicesAutoUpdatesOn so each capture's expected emission matches the SaaS state it was taken under. Two captures cover both branches: - nodeattrs-tailnet-devices-auto-updates-on -> [true] - nodeattrs-tailnet-devices-auto-updates-off -> [false] The Tailscale v2 TailnetSettings API does not expose the Send Files toggle, so the compat suite cannot vary cfg.Taildrop.Enabled per capture. TestTaildropDisabledWithholdsFileSharingCap covers the off path directly in servertest.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	8ea4cd3faa	types/node, policy/v2: drop taildrive caps from baseline emission Taildrive (drive:share and drive:access) is policy-driven per Tailscale's documented behaviour (https://tailscale.com/docs/features/taildrive). The previous always-on baseline emission diverged from SaaS for every node not targeted by a drive nodeAttr -- a real semantic divergence that the compat suite caught once the test moved to comparing TailNode output against the captured netmaps. types.Node.TailNode no longer stamps the drive pair. Operators wanting taildrive add a nodeAttrs entry: "nodeAttrs": [ { "target": ["*"], "attr": ["drive:share", "drive:access"] } ] unmodelledTailnetStateCaps shrinks accordingly. The baseline-divergence group is gone; every entry left in the list is genuinely unmodelled (user-role caps, unimplemented features, tailnet metadata, internal tuning). servertest's TestNodeAttrsBaselineCapsAlwaysOn expects the smaller baseline (admin + ssh + file-sharing). Integration TestGrantCapDrive grants the drive caps explicitly via NodeAttrs to exercise the policy-driven emission path.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	5ebc53c29e	types/node, mapper, policy/v2: assemble self CapMap inside TailNode types.NodeView.TailNode takes a selfPolicyCaps tailcfg.NodeCapMap parameter and merges it into the baseline. The mapper's WithSelfNode hands it the policy result via state.NodeCapMap; peer-path callers pass nil because peer-side CapMap is set downstream via policyv2.PeerCapMap. The nodeAttrs compat test now diffs the full TailNode self-view output against captured SaaS netmaps. Before this change the test compared compileNodeAttrs alone -- the policy-only output -- and needed a strip list to compensate for the missing baseline. With TailNode on the diff path, baseline emission is exercised end-to-end by every capture; a regression in TailNode breaks the suite. unmodelledTailnetStateCaps drops cap/ssh and cap/file-sharing now that both sides emit them identically. The file header is rewritten to read as 'caps SaaS emits where headscale has no equivalent yet' rather than the more confusing 'shape divergence' framing.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	b3f795f0b4	mapper, policy/v2: stamp suggest-exit-node on Peer.CapMap when exit routes approved The Tailscale client surfaces 'use this peer as your exit node' when the peer's CapMap carries the tailcfg.NodeAttrSuggestExitNode cap. SaaS emits it only on peers whose advertised exit routes are approved -- not every peer that just has the cap in its own nodeAttrs slot. policyv2.PeerCapMap encodes that emission rule: it walks the peer's own self-CapMap (built from compileNodeAttrs) and surfaces the gated entries (today just suggest-exit-node when the peer IsExitNode). Mapper.buildTailPeers calls it for each peer instead of merging the peer's full nodeAttrs CapMap onto its peer view. allCapMaps snapshots the full per-node CapMap once per peer-list build so pm.mu is acquired once rather than per peer.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	078b9e308f	policy/v2: SaaS-derived compat tests for nodeAttrs Adds a data-driven test that loads testdata/nodeattrs_results/*.hujson and diffs the captured SaaS-rendered netmaps against headscale's compileNodeAttrs output. Each capture is one scenario the SaaS control plane has rendered against the same policy headscale is asked to compile -- the test enforces shape parity per node. tailnet_state_caps.go enumerates the caps SaaS emits where headscale has no equivalent concept yet (user-role admin/owner, tailnet lock, services host, app connectors, internal magicsock and SSH tuning, tailnet-state metadata) plus the always-on baseline (admin, ssh, file-sharing) and the taildrive pair. stripUnmodelledTailnetStateCaps filters both sides of cmp.Diff so the comparison focuses on the policy-driven caps. PeerCapMap encodes which caps the Tailscale client reads from the peer view (suggest-exit-node when exit routes are approved, etc.) for use by the mapper. testcapture switches to typed tailcfg/netmap/filtertype/apitype values so schema drift between the capture tool and headscale becomes a compile error rather than a silent test failure. Existing compat suites (acl, grants, routes, ssh, issue_3212) move to the typed shape. The 53 SelfNode netmap captures and the 7 anonymizer-corrupted suggest-charmander -> suggest-exit-node restorations in routes_results / issue_3212 ride along.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	3f73ed5404	config, types: move randomize_client_port from server config to policy file Tailscale models the randomize-client-port toggle as a top-level field on the ACL policy. Headscale now matches that shape: the server-config randomize_client_port key is removed, the toggle lives in the policy file as randomizeClientPort, and per-node opt-in via nodeAttrs is also supported. Operators upgrading from a config-set randomize_client_port hit depr.fatalWithHint at startup, which prints the deprecation message and points at the new policy field rather than silently dropping the toggle. The default carries over (false) so operators who never set it are unaffected. config-example.yaml ships a REMOVED stanza showing the migration. types/node.go drops the cfg.RandomizeClientPort read from TailNode -- the cap is now policy-driven through compileNodeAttrs and the tail_test.go expectations follow.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	6fcff9e352	mapper, state: deliver nodeAttrs through MapResponse and harden nextdns DoH rewrite WithSelfNode and buildTailPeers merge each node's policy CapMap into the tailcfg.Node.CapMap they emit. State.NodeCapMap and State.NodeCapMaps wrap the policy manager: NodeCapMap returns a defensive clone per call; NodeCapMaps snapshots the full per-node map once for batched callers, amortising pm.mu acquisition across a peer build. generateDNSConfig grew a per-node CapMap argument so it can apply nodeAttr-driven DNS overlays. The nextdns DoH rewrite hardens against policy-controlled inputs: - nextDNSDoHHost anchors the prefix match instead of substring, so a hostile resolver URL cannot smuggle a nextdns hostname in a path or query. - nextDNSProfileFromCapMap accepts only profile names matching [A-Za-z0-9._-]{1,64} and picks the lexicographically first when multiple are granted -- deterministic, no shell metacharacters or URL fragments through. - addNextDNSMetadata composes the rewritten URL via url.Parse + url.Values rather than fmt.Sprintf, so existing query strings on the resolver URL survive and metadata cannot inject a new component. WithTaildropEnabled in servertest controls cfg.Taildrop.Enabled per test so cap/file-sharing emission can be toggled in tests that need to verify the off path.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	a4f05b0962	policy/v2: parse, validate, and compile nodeAttrs ACL policies now accept a top-level nodeAttrs block. Each entry hands a list of tailcfg node capabilities to every node matching target. Accepted target forms are the same as acls.src and grants.src: users, groups, tags, hosts, prefixes, autogroup:member, autogroup:tagged, and *. autogroup:self, autogroup:internet, and autogroup:danger-all are rejected at validate time because none describes a stable identity set a node-level attribute can attach to. NodeAttrGrant carries Targets, Attrs, and IPPool. IPPool is parsed but rejected at validate time -- the allocator that consumes it is not yet implemented. nodeAttrUnsupportedCaps lists caps SaaS accepts that headscale cannot act on (funnel today) and rejects them with a tracking-issue link in the error. compileNodeAttrs resolves each entry's targets, then maps every targeted node to a tailcfg.NodeCapMap of the entry's attrs. Per-node IPs are cached once per call so the inner attr loop is O(grants) instead of O(grants * nodes) IP allocations. PolicyManager grows NodeCapMap (per-node), NodeCapMaps (snapshot for batched callers), and NodesWithChangedCapMap (drain buffer for the self-broadcast diff). refreshNodeAttrsLocked appends to the drain rather than overwriting so a SetUsers/SetNodes between SetPolicy and the drain cannot lose the policy-reload diff.	2026-05-13 14:22:30 +02:00
Kristoffer Dalby	d5b2837231	policy/v2: match default proto set for tests with no proto The policy `tests` block lets entries omit `proto`. Tailscale's client maps that to the default protocol set {TCP, UDP, ICMP, ICMPv6} — the captured packet_filter_matches show all four IANA numbers explicitly when no proto is set — and a rule restricted to any one of them satisfies an empty-proto reachability test. srcReachesDst was passing the empty Protocol through unchanged, which landed an empty []int in ruleMatchesProto. The matcher then short-circuited to "no match" for every rule with a non-empty IPProto restriction, including TCP-only grants compiled from `ip: ["tcp:80"]`. The bug surfaced in the captured allpass-acls-and-grants-mixed scenario: the grant `tag:client → webserver:80` was reachable in the compiled filter but the empty-proto test could not see it. Expand the empty Protocol to the default set at the call site so ruleMatchesProto's intersection check sees the right requested protocols. Drop the now-dead empty-requestedProtos branch from the matcher. The last divergence drops out of knownPolicyTesterDivergences as a result. Updates #1803	2026-05-12 11:54:54 +01:00
Kristoffer Dalby	e4e209f919	policy/v2: canonicalize Protocol form during unmarshal Tailscale accepts both named ("tcp") and numeric IANA ("6") protocol forms wherever a Protocol value is allowed. Headscale stored whichever form the user wrote, leaving downstream code with two equivalents to handle separately. validateProtocolPortCompatibility only recognised the named constants and rejected the numeric form, so a policy with `proto: "6", dst: ["host:443"]` was rejected at parse time even though SaaS accepts it. Resolve the disagreement by normalising to the named form during Protocol.UnmarshalJSON. Every downstream consumer now sees one form regardless of what the user wrote, so layered guards like `\|\| protocol == "6"` in the validator are unnecessary. Updates #1803	2026-05-12 11:54:54 +01:00
Kristoffer Dalby	f172dba0e3	policy/v2: validate tests block at parse boundary A `tests` entry describes one connection attempt to one specific host on one specific port over a connection-oriented protocol, and asserts whether it is allowed or denied. Five shape rules follow — single-port dst, proto in {tcp, udp, sctp, ""}, no autogroup:internet dst, no CIDR-typed dst (raw `/N` or hosts:-alias to a multi-host prefix), at least one of accept/deny — and every one was previously silently accepted by headscale even though Tailscale SaaS rejects them as "test(s) failed". Enforce them in one pass over `pol.Tests` from `Policy.validate()`, reusing the existing parse-time multierr aggregation. The same shapes remain valid inside ACL or Grant destinations where the rule does not apply; the validator only walks the tests array. The compat runner now treats parse-time errors equivalently to SetPolicy errors so the captured Tailscale body still matches via substring regardless of which step surfaces the rejection. Nine divergences resolved by this validation pass drop out of knownPolicyTesterDivergences. Updates #1803	2026-05-12 11:54:54 +01:00
Kristoffer Dalby	c0774a739b	policy/v2: add policytester captures recorded from Tailscale SaaS 57 captures covering the alias × outcome matrix for the tests block, recorded against a real Tailscale SaaS tailnet. Replayed by TestPolicyTesterCompat. Bump the check-added-large-files pre-commit threshold to 1024 KB — captures include verbose per-node netmaps and one is 620 KB. Updates #1803	2026-05-12 11:54:54 +01:00
Kristoffer Dalby	7bc701179b	policy/v2: add policytester compat test runner Pin headscale's accept/reject decision and error body against Tailscale SaaS by replaying captures recorded from a real tailnet. Mirrors the tailscale_grants_compat_test.go pattern: glob over testdata/policytest_results/, one t.Run per file, parse-or-SetPolicy error must contain the captured api_response_body.message. errPolicyTestsFailed is "test(s) failed" — Tailscale's literal body — so substring match works against captured response bodies. Per-test detail (src, dst, expected vs got) is preserved below the prefix for the CLI / config-reload paths that don't have an audit endpoint. knownPolicyTesterDivergences gates the 12 mismatches the captures will surface so the suite stays green; engine fixes in follow-up commits drop the entries as each is resolved. Updates #1803	2026-05-12 11:54:54 +01:00
Kristoffer Dalby	b29ae25356	policy/v2: evaluate the tests block on user-initiated writes v2 silently dropped policy.tests, so a policy that contradicted its own assertions still applied. Resolve src/dst via the existing Alias machinery, walk the compiled global filter rules (acls and grants both contribute), and run on every user-write boundary: SetPolicy, the file watcher, and `headscale policy check`. A failing test rejects the write before it mutates live state. Boot-time reload skips evaluation; an already-stored policy that references a deleted user shouldn't lock the server out. `headscale policy check` is a thin frontend for the new CheckPolicy gRPC method. The server-side handler builds a fresh PolicyManager from the request bytes and the state's live users/nodes, runs SetPolicy on the sandbox so the tests block executes, and returns the result through gRPC status. No persistence, no policy_mode coupling. --bypass-grpc-and-access-database-directly opens the DB directly when the server is not running. cmd/headscale/cli/root.go no longer special-cases `policy check` in init() (the early return from PR #2580 broke --config registration and viper priming for --bypass). integration/cli_policy_test.go covers policy_mode={file,database} x fixture={acl-only, acl+passing-tests, acl+failing-tests} x bypass={false,true} = 12 rows. Updates #1803 Co-authored-by: Janis Jansons <janhouse@gmail.com>	2026-05-12 11:54:54 +01:00
Kristoffer Dalby	c3df84e354	policy/matcher: include CapGrant.Dsts in match destinations MatchFromFilterRule only read DstPorts[].IP into the destination IPSet. Cap-grant-only filter rules (e.g. tailscale.com/cap/relay) carry their destinations in CapGrant[].Dsts, so the derived matchers had empty dest sets and BuildPeerMap / ReduceNodes never exposed the cap target to its source nodes. Without a companion IP-level grant the relay node stayed invisible, so clients never tried to use it and connections sat on DERP. Union CapGrant[].Dsts into the destination IPSet alongside DstPorts. Restores peer-visibility for any cap-grant-only relationship; the peer-relay flow is the most visible instance. Fixes #3256	2026-05-11 14:55:06 +01:00
Lealem Amedie	542091e82b	Add unit test	2026-05-11 09:25:26 +01:00
Lealem Amedie	6cd919d411	mapper: include UserProfiles in policy-change MapResponses	2026-05-11 09:25:26 +01:00
Kristoffer Dalby	2f907edf87	hscontrol/types: regenerate types_clone.go for viewer bump cmd/viewer in tailscale.com/cmd v1.97.0-pre emits new(x) instead of ptr.To(x). No behaviour change.	2026-05-11 08:46:12 +01:00
Kristoffer Dalby	bc9fb6d403	hscontrol/policy/v2: reject ambiguous user references at load time When a user@ token resolved to more than one DB row, ACL and SSH rules referencing it were silently dropped at compile time, leaving clients with SSHPolicy={rules: null} and no signal to the admin. Validate every Username reference in groups, tagOwners, autoApprovers, ACLs and SSH rules at NewPolicyManager and SetPolicy and return ErrMultipleUsersFound. Missing-user tokens stay tolerant per #2863. Updates #3160	2026-05-09 11:28:12 +01:00
SAY-5	01e548e030	state: avoid nil deref in registration handlers when old user is missing Mirror the guard from HandleNodeFromPreAuthKey in HandleNodeFromAuthPath. Both functions log the old user's name in the "different user" branch when an existing NodeStore entry under the same machine key belongs to another user. UserView.Name dereferences the backing User pointer unconditionally, so when the cached node was loaded with a non-nil UserID but a nil User (Preload join missed the row, or upstream code left the snapshot in that shape), the log call panics with a nil-pointer dereference at hscontrol/types/types_view.go:97. The panic is caught by the http2 server's runHandler for the noise control plane, so the process keeps running but every retry produces a new panic — production has observed bursts of ~1.9k panics per hour during a tailscaled reconnect loop. The gRPC/OIDC entry has no equivalent recover and would surface the panic to the caller. Guard both call sites with oldUser.Valid() and fall back to an empty old-user name when the pointer is nil. The "Creating new node for different user" log line still includes the existing node ID, hostname, machine key, and new user, so operator visibility is preserved. Add reproduction tests for both handlers seeding the orphan shape directly into NodeStore via PutNodeInStoreForTest. Co-Authored-By: Kristoffer Dalby <kristoffer@dalby.cc>	2026-05-06 07:23:02 +01:00
Kristoffer Dalby	9482cdf590	testdata: drop unused uppercase SSH-*.hujson fixtures The 39 SSH-*.hujson files in hscontrol/policy/v2/testdata/ssh_results/ were legacy hand-written "expected SSH rules" snippets superseded by the lowercase tscap captures (ssh-*.hujson). The active loader in TestSSHDataCompat globs ssh-*.hujson; filepath.Glob is case-sensitive on Linux so the uppercase set was loaded by no test. The duplication caused permanent dirty git state on case-insensitive filesystems (APFS, NTFS) where only one of SSH-A1.hujson and ssh-a1.hujson can physically exist in the working tree. Add an assertion to TestSSHDataCompat that the loader picks up every *.hujson under ssh_results/ so future fixture migrations cannot leave stranded files behind. Fixes #3240	2026-05-05 11:59:01 +01:00
primewildy	3d0f597b23	oidc: handle groups claim as string or array (FlexibleStringSlice) Some OIDC providers (notably JumpCloud) return the `groups` claim as a plain string when the user belongs to a single group, rather than a single-element array: Single group: {"groups": "MyGroup"} Multiple groups: {"groups": ["Group1", "Group2"]} This causes `json.Unmarshal` to fail with: cannot unmarshal string into Go struct field OIDCClaims.groups of type []string This is the same class of issue as juanfont#2293 (FlexibleBoolean for email_verified). The fix follows the same pattern: introduce a FlexibleStringSlice type with a custom UnmarshalJSON that accepts both a string and a []string, and use it for the Groups field in both OIDCClaims and OIDCUserInfo.	2026-05-04 15:26:53 +02:00
Kristoffer Dalby	76ee29352b	servertest: cover via-grant exit-node visibility end-to-end TestGrantViaExitNodeInternetVisibility boots a server, applies a policy that scopes autogroup:internet to a tag, registers a tagged exit advertiser and a regular client, and asserts the client's netmap surfaces the exit node with 0.0.0.0/0 and ::/0 in AllowedIPs — the substrate the Tailscale client reads to populate `tailscale exit-node list`. TestGrantViaExitNodeNoFilterRules retains its assertion (literal /0 absent from the exit node's PacketFilter, matching SaaS PacketFilter encoding); only its docstring is updated to reflect that the exit node now does receive a TheInternet-shaped rule, just not the literal /0 form. Updates #3233	2026-04-30 19:22:45 +01:00
Kristoffer Dalby	2b7f15abaa	policy/v2: surface autogroup:internet via grants on exit nodes A grant of the form `{src: alice, dst: autogroup:internet, via: tag:exit1}` was loading without error but stripping every exit node from alice's view: `tailscale exit-node list` returned "no exit nodes found". Two sites skipped autogroup:internet at the compile / steering layer: compileViaForNode's AutoGroup arm produced no FilterRule for the via-tagged exit node, and ViaRoutesForPeer's AutoGroup arm produced no Include/Exclude. With pm.needsPerNodeFilter true, the exit node's matchers were empty, BuildPeerMap could not link source to exit, and RoutesForPeer's ReduceRoutes stripped 0.0.0.0/0 and ::/0 from AllowedIPs. The skip belongs at the wire-format layer (ReduceFilterRules), not at the compile layer that also feeds internal matchers. Lift autogroup:internet handling into both AutoGroup arms with the same shape used for Prefix destinations: emit a TheInternet rule on via-tagged exit advertisers; surface peer.ExitRoutes() in Include when the peer carries the via tag, Exclude otherwise. ReduceFilterRules continues to keep the rule on exit-route advertisers' wire output and strip it elsewhere, preserving SaaS PacketFilter encoding. Also drop compileViaForNode's early len(SubnetRoutes)==0 return: SubnetRoutes excludes exit routes, so the early return pre-empted the autogroup:internet branch on nodes that only advertise exit routes. Existing tests pinning the buggy behaviour (TestViaRoutesForPeer subtests, TestCompileViaGrant case) flipped to the new contract. Fixes #3233	2026-04-30 19:22:45 +01:00
Kristoffer Dalby	94ec607bca	state: per-goroutine deadline in HA probe cycle `time.After(ProbeTimeout)` returned a single channel shared by every probe goroutine in the cycle. Only the first goroutine to receive the deadline tick drains the channel; any other goroutine still waiting on its `responseCh` is then stuck forever, `wg.Wait()` never returns, and the scheduler loop in `app.go` stalls on the next tick. The condition fires whenever two or more nodes time out in the same cycle — common under cable-pull where IsOnline lags reality and both routers stay in the candidate set as half-open TCP. Move the timer inside each goroutine so every probe has its own deadline. Updates #3234	2026-04-30 12:52:05 +01:00
Kristoffer Dalby	3d5c0af4e7	state: preserve previous primary when all HA advertisers unhealthy electPrimaryRoutes' all-unhealthy fallback picked candidates[0] (lowest NodeID) regardless of who was prev. Under cable-pull semantics IsOnline lags reality (long-poll TCP half-open), so both routers stay in candidates and both go Unhealthy via the prober — the fallback then churned primary to a node that was itself unreachable. Prefer prev when still in candidates; fall through to candidates[0] only when prev is gone. Anti-blackhole holds. Update the property test reference model and split the unit test into existence (KeepsAPrimary) and identity (PreservesPrevious) cases. Fixes #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	863fa2f815	servertest, integration: cover HA both-offline recovery Three regression tests for the user scenario: an in-process Disconnect/Reconnect, a tailscale-down/up integration test, and an iptables -j DROP cable-pull integration test. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	9f7c8e9a07	state: clear Unhealthy when node leaves HA candidate set Restore the legacy auto-clear at write boundaries that drop HA candidacy: Disconnect, SetApprovedRoutes(empty), and UpdateNodeFromMapRequest shrinking advertised routes to empty. Plus a defensive guard in SetNodeUnhealthy. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	66ac785c22	state: delete routes package, port primary route tests Remove hscontrol/routes/. Port the named scenarios and the rapid property test to hscontrol/state/. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	437754aeea	state: switch consumers to NodeStore primary routes Replace routes.PrimaryRoutes reads with NodeStore. Connect bumps SessionEpoch; Disconnect re-checks it inside UpdateNode so the check and mutation are atomic against a concurrent Connect on the same node. The connect_race regression test is carried in its final SessionEpoch form. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	da927eb018	state: compute primary routes inside NodeStore snapshot Add primaries and isPrimary maps to Snapshot plus an election algorithm. No callers yet. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	942313a10a	types: move DebugRoutes from routes to types Unblocks deletion of the routes package. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	1fe682b141	types: add Unhealthy and SessionEpoch fields to Node Runtime-only (gorm:"-") fields read by the HA primary route refactor. Updates #3203	2026-04-29 18:08:39 +01:00
Kristoffer Dalby	010a5564c5	all: rephrase prose to fit codebase voice Reword comments, one doc paragraph, and one test failure message so the prose reads naturally. No behaviour change.	2026-04-29 16:22:19 +01:00
Akhilesh Arora	de60982d83	state: note tagged-path coverage and self-healing behaviour for #3199 - test: comment that the !regReq.Expiry.IsZero() gate also covers the tags-only PreAuthKey path - CHANGELOG: note pre-existing 0001-01-01 rows self-heal on re-registration rather than being backfilled	2026-04-29 13:06:38 +01:00
Akhilesh Arora	0e10ca4e9a	state: preserve nil expiry on user owned registration when no default is configured When a user owned node registers or re registers with a PreAuthKey and the client sends zero client expiry while node.expiry is set to 0, the expiry column ends up stored as 0001-01-01 00:00:00 instead of NULL. Two sites in HandleNodeFromPreAuthKey build a non nil pointer to regReq.Expiry even when the value is zero time, and the needsDefaultExpiry guard only replaces it when s.cfg.Node.Expiry > 0, so the pointer to zero time survives to the database. Convert an unset regReq.Expiry to nil before handing it off so the needsDefaultExpiry path is the only place that assigns a non nil pointer. This is a narrower sibling of #3170 on the user owned PreAuthKey path. The regression was introduced alongside the fix for #3111 in `6337a3db`.	2026-04-29 13:06:38 +01:00
Kristoffer Dalby	c7a0ca709f	policy: surface exit nodes via autogroup:internet (#3212) compileFilterRules skipped autogroup:internet destinations to keep them out of the wire-format PacketFilter, but those same compiled rules are the source of pm.matchers — and Node.CanAccess relies on a matcher whose DestsIsTheInternet covers the public internet to surface exit-node peers to ACL sources. With the skip in place no such matcher existed, exit nodes silently dropped out of the source's peer list, and the docs' exit-node walkthrough stopped working: `tailscale exit-node list` returned "no exit nodes found" and `tailscale set --exit-node=<ip>` returned "no node found in netmap with IP". Drop the compile-time skip so autogroup:internet flows through normal matcher derivation, and teach ReduceFilterRules to keep the resulting client packet-filter rule on exit-route advertisers — Tailscale SaaS sends those rules to exit nodes so the kernel filter accepts traffic forwarded by autogroup:internet sources. Verified against a live tailnet on 2026-04-28 via tscap; the b17/b18 captures land under testdata/issue_3212/ as a regression guard. The captures are isolated from testdata/routes_results/ because the broader TestRoutesCompat machinery assumes a CIDR-prefix wire format that differs from the IPSet-range form SaaS emits for autogroup:internet — aligning that wire format is tracked separately. Fixes #3212	2026-04-29 11:24:33 +01:00
Kristoffer Dalby	2e1a716a9a	policy/v2: fix empty grants/acls returning FilterAllowAll compileFilterRules, compileGrants, and updateLocked guarded the "no rules so allow all" fallback with len(pol.Grants) == 0, which matches both an absent grants field and an explicit empty array. JSON {"grants": []} unmarshals to a non-nil empty slice; it should compile to zero filter rules (deny all) to match Tailscale SaaS, but the length check sent it down the FilterAllowAll path. Distinguish absent (nil) from explicit-empty by switching the guard to pol.Grants == nil, the same asymmetry already used for ACLs. {} keeps allowing all; {"acls": []} and {"grants": []} now both deny all. Fixes #3211	2026-04-29 08:55:07 +01:00
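The nil-versus-empty distinction described in 2e1a716a9a can be reproduced with the standard library alone. The following is a minimal sketch, not headscale's code: the `policy` struct and the `fallback` function are hypothetical stand-ins, and only the `grants` field is modelled. It shows that `encoding/json` leaves an absent array field as a nil slice but decodes an explicit `[]` into a non-nil empty slice, so a `len(...) == 0` guard cannot tell the two apart while a `== nil` guard can.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy is a hypothetical, heavily reduced stand-in for a policy document.
// Only the grants field is modelled here.
type policy struct {
	Grants []json.RawMessage `json:"grants"`
}

// fallback reports which path a document would take under a nil-based guard:
// an absent grants field allows all, an explicit empty array denies all,
// and anything else would be compiled into filter rules.
func fallback(doc string) (string, error) {
	var pol policy
	if err := json.Unmarshal([]byte(doc), &pol); err != nil {
		return "", err
	}

	switch {
	case pol.Grants == nil:
		// Field absent: encoding/json never touched the slice.
		return "allow-all", nil
	case len(pol.Grants) == 0:
		// Field present but empty: non-nil slice of length zero.
		return "deny-all", nil
	default:
		return "compile", nil
	}
}

func main() {
	for _, doc := range []string{`{}`, `{"grants": []}`} {
		path, err := fallback(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-16s -> %s\n", doc, path)
	}
}
```

Under a `len(pol.Grants) == 0` guard both documents would take the first branch, which is the behaviour the commit message describes as the bug.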


679 Commits