Add end-to-end integration test that validates localpart:*@domain
SSH user mapping with real Tailscale clients. The test sets up an
SSH policy with localpart entries and verifies that users can SSH
into tagged servers using their email local-part as the username.
Updates #3049
Add support for localpart:*@<domain> entries in SSH policy users.
When a user SSHes into a target, their email local-part becomes the
OS username (e.g. alice@example.com → OS user alice).
Type system (types.go):
- SSHUser.IsLocalpart() and ParseLocalpart() for validation
- SSHUsers.LocalpartEntries(), NormalUsers(), ContainsLocalpart()
- Enforces format: localpart:*@<domain> (wildcard-only)
- UserWildcard.Resolve for user:*@domain SSH source aliases
- acceptEnv passthrough for SSH rules
Compilation (filter.go):
- resolveLocalparts: pure function mapping users to local-parts
by email domain. No node walking, easy to test.
- groupSourcesByUser: single walk producing per-user principals
with sorted user IDs, and tagged principals separately.
- ipSetToPrincipals: shared helper replacing 6 inline copies.
- selfPrincipalsForNode: self-access using pre-computed byUser.
The approach separates data gathering from rule assembly. Localpart
rules are interleaved per source user to match Tailscale SaaS
first-match-wins ordering.
Updates #3049
Build / build-cross (GOARCH=amd64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=amd64 GOOS=linux) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=linux) (push) Has been cancelled
Check Generated Files / check-generated (push) Has been cancelled
NixOS Module Tests / nix-module-check (push) Has been cancelled
Tests / test (push) Has been cancelled
NodeView.CanAccess called node2.AsStruct() on every check. In peer-map construction we run CanAccess in O(n^2) pair scans (often twice per pair), so that per-call clone multiplied into large heap churn
Add ReadLog method to headscale integration container for log
inspection. Split SSH check mode tests into CLI and OIDC variants
and add comprehensive test coverage:
- TestSSHOneUserToOneCheckModeCLI: basic check mode with CLI approval
- TestSSHOneUserToOneCheckModeOIDC: check mode with OIDC approval
- TestSSHCheckModeUnapprovedTimeout: rejection on cache expiry
- TestSSHCheckModeCheckPeriodCLI: session expiry and re-auth
- TestSSHCheckModeAutoApprove: auto-approval within check period
- TestSSHCheckModeNegativeCLI: explicit rejection via CLI
Update existing integration tests to use headscale auth register.
Updates #1850
Add SSH check period tracking so that recently authenticated users
are auto-approved without requiring manual intervention each time.
Introduce SSHCheckPeriod type with validation (min 1m, max 168h,
"always" for every request) and encode the compiled check period
as URL query parameters in the HoldAndDelegate URL.
The SSHActionHandler checks recorded auth times before creating a
new HoldAndDelegate flow. Auth timestamps are stored in-memory:
- Default period (no explicit checkPeriod): auth covers any
destination, keyed by source node with Dst=0 sentinel
- Explicit period: auth covers only that specific destination,
keyed by (source, destination) pair
Auth times are cleared on policy changes.
Updates #1850
Add gRPC service definitions for managing auth requests:
AuthRegister to register interactive auth sessions and
AuthApprove/AuthReject to approve or deny pending requests
(used for SSH check mode).
Updates #1850
Implement the SSH "check" action which requires additional
verification before allowing SSH access. The policy compiler generates
a HoldAndDelegate URL that the Tailscale client calls back to
headscale. The SSHActionHandler creates an auth session and waits for
approval via the generalised auth flow.
Sort check (HoldAndDelegate) rules before accept rules to match
Tailscale's first-match-wins evaluation order.
Updates #1850
Extract shared HTML/CSS design into a common template and create
generalised auth success and web auth templates that work for both
node registration and SSH check authentication flows.
Updates #1850
Generalise the registration pipeline to a more general auth pipeline
supporting both node registrations and SSH check auth requests.
Rename RegistrationID to AuthID, unexport AuthRequest fields, and
introduce AuthVerdict to unify the auth finish API.
Add the urlParam generic helper for extracting typed URL parameters
from chi routes, used by the new auth request handler.
Updates #1850
Replace gorilla/mux with go-chi/chi as the HTTP router and add a
custom zerolog-based request logger to replace chi's default
stdlib-based middleware.Logger, consistent with the rest of the
application.
Updates #1850
Build / build-cross (GOARCH=amd64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=amd64 GOOS=linux) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=linux) (push) Has been cancelled
Check Generated Files / check-generated (push) Has been cancelled
NixOS Module Tests / nix-module-check (push) Has been cancelled
Tests / test (push) Has been cancelled
Needs More Info - Timer / remove-label-on-response (push) Has been cancelled
Needs More Info - Timer / close-stale (push) Has been cancelled
update-flake-lock / lockfile (push) Has been cancelled
GitHub Actions Version Updater / build (push) Has been cancelled
Close inactive issues / close-issues (push) Has been cancelled
Extract the existing-node lookup logic from HandleNodeFromPreAuthKey
into a separate method. This reduces the cyclomatic complexity from
32 to 28, below the gocyclo limit of 30.
Updates #3077
Tagged nodes no longer have user_id set, so ListNodes(user) cannot
find them. Update integration tests to use ListNodes() (all nodes)
when looking up tagged nodes.
Add a findNode helper to locate nodes by predicate from an
unfiltered list, used in ACL tests that have multiple nodes per
scenario.
Updates #3077
Test the tagged-node-survives-user-deletion scenario at two layers:
DB layer (users_test.go):
- success_user_only_has_tagged_nodes: tagged nodes with nil
user_id do not block user deletion and survive it
- error_user_has_tagged_and_owned_nodes: user-owned nodes
still block deletion even when tagged nodes coexist
App layer (grpcv1_test.go):
- TestDeleteUser_TaggedNodeSurvives: full registration flow
with tagged PreAuthKey verifies nil UserID after registration,
absence from nodesByUser index, user deletion succeeds, and
tagged node remains in global node list
Also update auth_tags_test.go assertions to expect nil UserID
on tagged nodes, consistent with the new invariant.
Updates #3077
Tagged nodes are owned by their tags, not a user. Enforce this
invariant at every write path:
- createAndSaveNewNode: do not set UserID for tagged PreAuthKey
registration; clear UserID when advertise-tags are applied
during OIDC/CLI registration
- SetNodeTags: clear UserID/User when tags are assigned
- processReauthTags: clear UserID/User when tags are applied
during re-authentication
- validateNodeOwnership: reject tagged nodes with non-nil UserID
- NodeStore: skip nodesByUser indexing for tagged nodes since
they have no owning user
- HandleNodeFromPreAuthKey: add fallback lookup for tagged PAK
re-registration (tagged nodes indexed under UserID(0)); guard
against nil User deref for tagged nodes in different-user check
Since tagged nodes now have user_id = NULL, ListNodesByUser
will not return them and DestroyUser naturally allows deleting
users whose nodes have all been tagged. The ON DELETE CASCADE
FK cannot reach tagged nodes through a NULL foreign key.
Also tone down shouty comments throughout state.go.
Fixes#3077
Tagged nodes are owned by their tags, not a user. Previously
user_id was kept as "created by" tracking, but this prevents
deleting users whose nodes have all been tagged, and the
ON DELETE CASCADE FK would destroy the tagged nodes.
Add a migration that sets user_id = NULL on all existing tagged
nodes. Subsequent commits enforce this invariant at write time.
Updates #3077
Add --disable flag to "headscale nodes expire" CLI command and
disable_expiry field handling in the gRPC API to allow disabling
key expiry for nodes. When disabled, the node's expiry is set to
NULL and IsExpired() returns false.
The CLI follows the new grpcRunE/RunE/printOutput patterns
introduced in the recent CLI refactor.
Also fix NodeSetExpiry to persist directly to the database instead
of going through persistNodeToDB which omits the expiry field.
Fixes#2681
Co-authored-by: Marco Santos <me@marcopsantos.com>
Add bool disable_expiry field (field 3) to ExpireNodeRequest proto
and regenerate all protobuf, gRPC gateway, and OpenAPI files.
Fixes#2681
Co-authored-by: Marco Santos <me@marcopsantos.com>
Add end-to-end test cases to TestUnmarshalPolicy that verify bracketed
IPv6 addresses are correctly parsed through the full policy pipeline
(JSON unmarshal -> splitDestinationAndPort -> parseAlias -> parsePortRange)
and survive JSON round-trips.
Cover single port, multiple ports, wildcard port, CIDR prefix, port
range, bracketed IPv4, and hostname rejection.
Updates #2754
Headscale rejects IPv6 addresses with square brackets in ACL policy
destinations (e.g. "[fd7a:115c:a1e0::87e1]:80,443"), while Tailscale
SaaS accepts them. The root cause is that splitDestinationAndPort uses
strings.LastIndex(":") which leaves brackets on the destination string,
and netip.ParseAddr does not accept brackets.
Add a bracket-handling branch at the top of splitDestinationAndPort that
uses net.SplitHostPort for RFC 3986 parsing when input starts with "[".
The extracted host is validated with netip.ParseAddr/ParsePrefix to
ensure brackets are only accepted around IP addresses and CIDR prefixes,
not hostnames or other alias types like tags and groups.
Fixes#2754
Build / build-cross (GOARCH=amd64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=amd64 GOOS=linux) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=linux) (push) Has been cancelled
Check Generated Files / check-generated (push) Has been cancelled
NixOS Module Tests / nix-module-check (push) Has been cancelled
Tests / test (push) Has been cancelled
Needs More Info - Timer / remove-label-on-response (push) Has been cancelled
Needs More Info - Timer / close-stale (push) Has been cancelled
Deploy docs / deploy (push) Has been cancelled
Close inactive issues / close-issues (push) Has been cancelled
This just fixes a small issue I noticed reading the docs: the two 'scenarios' listed in the scaling section end up showing up as a numbered list of five items, instead of the desired two items + their descriptions.
Build / build-cross (GOARCH=amd64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=amd64 GOOS=linux) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=darwin) (push) Has been cancelled
Build / build-cross (GOARCH=arm64 GOOS=linux) (push) Has been cancelled
Check Generated Files / check-generated (push) Has been cancelled
NixOS Module Tests / nix-module-check (push) Has been cancelled
Tests / test (push) Has been cancelled
Remove dead if-resp-nil checks in tagCmd and approveRoutesCmd; gRPC
returns either an error or a valid response, never (nil, nil).
Rename HasMachineOutputFlag to hasMachineOutputFlag since it has a
single internal caller in root.go.
Add bypassDatabase() to consolidate the repeated LoadServerConfig +
NewHeadscaleDatabase pattern in getPolicy and setPolicy.
Replace os.Open + io.ReadAll with os.ReadFile in setPolicy and
checkPolicy, removing the manual file-handle management.
Move errMissingParameter from users.go to utils.go alongside the
other shared sentinel errors; the variable is referenced by
api_key.go and preauthkeys.go.
Move the Error constant-error type from debug.go to mockoidc.go,
its only consumer.
Replace five hardcoded "2006-01-02 15:04:05" strings with the
HeadscaleDateTimeFormat constant already defined in utils.go.
Replace two literal 10 base arguments to strconv.FormatUint with
util.Base10 to match the convention in api_key.go and nodes.go.
Flags registered on a cobra.Command cannot fail to read at runtime;
GetString/GetUint64/GetStringSlice only error when the flag name is
unknown. The error-handling blocks for these calls are unreachable
dead code.
Adopt the value, _ := pattern already used in api_key.go,
preauthkeys.go and users.go, removing ~40 lines of dead code from
nodes.go and debug.go.
The --namespace flag on nodes list/register and debug create-node was
never wired to the --user flag, so its value was silently ignored.
Remove it along with the deprecateNamespaceMessage constant.
Also remove the namespace/ns command aliases on users and
machine/machines aliases on nodes, which have been deprecated since
the naming changes in 0.23.0.
Add expirationFromFlag helper that parses the --expiration flag into a
timestamppb.Timestamp, replacing identical duration-parsing blocks in
api_key.go and preauthkeys.go.
Add apiKeyIDOrPrefix helper to validate the mutually-exclusive --id and
--prefix flags, replacing the duplicated switch block in expireAPIKeyCmd
and deleteAPIKeyCmd.
Centralise the repeated force-flag-check + YesNo-prompt logic into a
single confirmAction(cmd, prompt) helper. Callers still decide what
to return on decline (error, message, or nil).
Replace three inconsistent MarkFlagRequired error-handling styles
(stdlib log.Fatal, zerolog log.Fatal, silently discarded) with a
single mustMarkRequired helper that panics on programmer error.
Also fixes a bug where renameNodeCmd.MarkFlagRequired("new-name")
targeted the wrong command (should be renameUserCmd), making the
--new-name flag effectively never required on "headscale users rename".
Add a helper that checks the --output flag and either serialises as
JSON/YAML or invokes a table-rendering callback. This removes the
repeated format,_ := cmd.Flags().GetString("output") + if-branch from
the five list commands.
Convert the 10 commands that were still using Run with
ErrorOutput/SuccessOutput or log.Fatal/os.Exit:
- backfillNodeIPsCmd: use grpcRunE-style manual connection with
error returns; simplify the confirm/force logic
- getPolicy, setPolicy, checkPolicy: replace ErrorOutput with
fmt.Errorf returns in both the bypass-gRPC and gRPC paths
- serveCmd, configTestCmd: replace log.Fatal with error returns
- mockOidcCmd: replace log.Error+os.Exit with error return
- versionCmd, generatePrivateKeyCmd: replace SuccessOutput with
printOutput
- dumpConfigCmd: return the error instead of swallowing it
Rename grpcRun to grpcRunE: the inner closure now returns error
and the wrapper returns a cobra RunE-compatible function.
Change newHeadscaleCLIWithConfig to return an error instead of
calling log.Fatal/os.Exit, making connection failures propagate
through the normal error path.
Add formatOutput (returns error) and printOutput (writes to stdout)
as non-exiting replacements for the old output/SuccessOutput pair.
Extract output format string literals into package-level constants.
Mark the old ErrorOutput, SuccessOutput and output helpers as
deprecated; they remain temporarily for the unconverted commands.
Convert all 22 grpcRunE commands from Run+ErrorOutput+SuccessOutput
to RunE+fmt.Errorf+printOutput. Change usernameAndIDFromFlag to
return an error instead of calling ErrorOutput directly.
Update backfillNodeIPsCmd and policy.go callers of
newHeadscaleCLIWithConfig for the new 5-return signature while
keeping their Run-based pattern for now.
Set SilenceErrors and SilenceUsage on the root command so that
cobra never prints usage text for runtime errors. A SetFlagErrorFunc
callback re-enables usage output specifically for flag-parsing
errors (the kubectl pattern).
Add printError to utils.go and switch Execute() to ExecuteC() so
the returned error is formatted as JSON/YAML when --output requests
machine-readable output.
Add a grpcRun helper that wraps cobra RunFuncs, injecting a ready
gRPC client and context. The connection lifecycle (cancel, close)
is managed by the wrapper, eliminating the duplicated 3-line
boilerplate (newHeadscaleCLIWithConfig + defer cancel + defer
conn.Close) from 22 command handlers across 7 files.
Three call sites are intentionally left unconverted:
- backfillNodeIPsCmd: creates the client only after user confirmation
- getPolicy/setPolicy: conditionally use gRPC vs direct DB access