Skip to content

feat: GitOps platform-user management + bootstrap superuser SA (protected, config-only)#1709

Closed
rohilsurana wants to merge 25 commits into
mainfrom
feat/authoritative-superuser-reconciliation
Closed

feat: GitOps platform-user management + bootstrap superuser SA (protected, config-only)#1709
rohilsurana wants to merge 25 commits into
mainfrom
feat/authoritative-superuser-reconciliation

Conversation

@rohilsurana

@rohilsurana rohilsurana commented Jun 21, 2026

Copy link
Copy Markdown
Member

What

Moves platform-superuser management to a GitOps model. Config no longer seeds human
superusers; instead it seeds a single bootstrap superuser service account, and all
ongoing platform-user management (humans and service accounts, admin and member) happens
out-of-band via a declarative frontier reconcile CLI driven from the IAM repo. The
bootstrap SA is treated as a protected, config-only break-glass identity.

Note: this PR started as "opt-in authoritative reconciliation" and then pivoted to GitOps
(dropping authoritative reconciliation and the whole config human-superuser flow), and
later grew a bootstrap-SA hardening pass. Commits are kept as-is by request, so the history
is winding — the final diff is the source of truth; this description summarises the end
state.

Changes

Remove the config human-superuser flow

  • Dropped app.admin.users + MakeSuperUsers + the boot-time authoritative reconciliation
    that an earlier revision of this PR had added. No human superusers are seeded from config.

Config-bootstrapped superuser service account (app.admin.bootstrap.{client_id, client_secret})

  • At boot, idempotently ensures a ClientSecret service-account credential exists in the
    platform org and is promoted to superuser; rotates the secret if it changed. This is the only
    config-seeded superuser and the break-glass identity. client_id must be a UUID (it is the
    credential id); validated with a fail-fast error.
  • The SA is created with a fixed, well-known service-user id
    (00000000-0000-0000-0000-000000000001, a deliberately non-nil sentinel — uuid.Nil is the
    audit "system" actor and must not be a real principal).
  • Automation (the reconcile workflow) authenticates as this SA via HTTP Basic.
  • If creds.Create fails after the service-user row is created, the orphan row is rolled back so
    repeated boot failures don't accumulate credential-less service users.

Harden the bootstrap SA — immutable via the API (security)

  • The SA is managed only via app.admin.bootstrap at boot. The API refuses, by its well-known
    id, to: remove its platform access (RemovePlatformUser), delete it
    (DeleteServiceUser), or mint credentials/tokens/keys for it
    (CreateServiceUserCredential / Token / JWK). Not even platform superusers may — this
    prevents a rotation-proof superuser backdoor on a publicly-known id.
  • The authoritative reconciler excludes the bootstrap SA by id, so an apply never plans to
    remove it. (Validated end-to-end on a live instance, including the escalation path where the
    per-object manage authz is satisfied — the guard still blocks it.)
  • Boot-time secret rotation is unaffected (it writes via the repositories, not these handlers).

Fix platform-user grant/revoke correctness

  • Sudo is now idempotent on the exact relation tuple, not the derived permission. check
    is granted by both admin and member, so the old permission check skipped creating the
    member tuple for an existing admin — silently breaking admin→member downgrades. Sudo and
    UnSudo are now symmetric (both relation-level).
  • UnSudo(ctx, id, relationName) is parameterized and actually strips the requested relation
    (previously it could only remove member, never admin).
  • RemovePlatformUser honors the new optional relation field (feat(frontier): add optional relation field to RemovePlatformUserRequest proton#489): when set,
    removes just that relation (e.g. demote admin → member); empty keeps the remove-both default.
  • ListPlatformUsers reports all relations a principal holds (metadata.relations), not just
    the last-seen one, so the reconciler converges to the desired state exactly (admin-only,
    member-only, or both — extras are removed). metadata.relation is retained for backward
    compatibility.

Audit platform grants/revokes

  • New platform.{admin,member}_{added,removed} events via the auditrecord system, on both the
    RPC path (real principal) and boot/config path (system actor, uuid.Nil). Change-only.

Declarative reconcile framework + CLI

  • internal/reconcile: a Reconciler interface + registry + kind-based (kind:) multi-doc YAML,
    with PlatformUser as the first kind (parse → ListPlatformUsers → diff (principal, relation)
    → apply). Future kinds (roles, plans, traits…) plug in without changing the command/format.
  • frontier reconcile -f <file> [--dry-run] -H <auth> — authoritative converge of the desired
    state; auth via the existing --header mechanism.

Supporting fix

  • serviceuser_credentials org_name made nullable-tolerant (sql.NullString) — a platform-level
    service account has no real organization row.

Testing

  • Unit suite green (incl. new internal/reconcile, bootstrap SA, audit, relation-selective
    removal, the Sudo/list correctness fixes, and the API immutability guards). Tests run at
    bcrypt.MinCost so they don't blow the -race -count 2 timeout.
  • Full e2e regression suite passes against real Postgres + SpiceDB — the testbench seeds its
    superuser the production way (authenticates as the bootstrap SA and grants the test admin via
    AddPlatformUser). The e2e caught two real integration bugs the unit mocks couldn't (UUID
    client_id, nullable org_name), both fixed here.
  • Live security validation against a real instance: confirmed the bootstrap SA cannot be
    removed, deleted, or have credentials/tokens/keys minted — as a superuser, a regular user, or
    unauthenticated — and that the API guard still blocks minting even when the per-object manage
    authorization is deliberately satisfied.
  • go build ./..., go vet, gofmt, golangci-lint all clean.

Related

Notes

  • Default-safe: with no app.admin.bootstrap configured, boot is a no-op (no superusers seeded);
    existing platform relations are untouched.
  • Breaking for operators relying on app.admin.users: that config is removed — seed the bootstrap
    SA instead and manage humans via GitOps.

@vercel

vercel Bot commented Jun 21, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
frontier Ready Ready Preview, Comment Jun 30, 2026 9:51am

@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds audit recording for platform grant and revoke operations in core/user and core/serviceuser, propagates relationName through API interfaces and handler calls, and introduces bootstrap superuser seeding with updated startup wiring.

Changes

Platform audit recording and bootstrap superuser setup

Layer / File(s) Summary
Audit constants and repository contracts
pkg/auditrecord/consts.go, core/user/service.go, core/serviceuser/service.go, core/user/mocks/audit_record_repository.go, core/serviceuser/mocks/audit_record_repository.go
Adds platform audit event and entity constants, introduces AuditRecordRepository in both services, and adds generated mocks for the new repository interface.
user service audit flow
core/user/service.go, core/user/mocks/audit_record_repository.go, core/user/service_test.go
Sudo and UnSudo now record platform grant/revoke audits, accept relationName, validate and check the specific relation before deletion, and use helpers to map relation names to audit events. Tests cover the new dependency and audit calls.
service-user audit flow
core/serviceuser/service.go, core/serviceuser/mocks/audit_record_repository.go, core/serviceuser/service_test.go
Sudo and UnSudo now record platform grant/revoke audits, accept relationName, and delete only the requested relation. Tests cover the new dependency and audit calls.
API relationName propagation
internal/api/v1beta1connect/interfaces.go, internal/api/v1beta1connect/mocks/*.go, internal/api/v1beta1connect/platform.go, internal/api/v1beta1connect/platform_test.go
Propagates the UnSudo signature change through API interfaces and generated mocks, updates RemovePlatformUser to remove both admin and member relations, and adds handler tests for user, service-user, and missing-ID cases.
bootstrap superuser seeding
internal/bootstrap/service.go, internal/bootstrap/bootstrapuser.go, internal/bootstrap/bootstrapuser_test.go, config/sample.config.yaml, cmd/serve.go
Adds bootstrap superuser config and dependencies, implements create-or-rotate credential handling and promotion, documents the config, tests the bootstrap flow, and invokes bootstrap ensuring during startup.
startup wiring for audit repositories
cmd/serve.go
Passes auditRecordRepository into both service constructors and preserves the updated bootstrap service wiring.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • raystack/frontier#1567: Both PRs change core/user/service.go and cmd/serve.go wiring around user.NewService.
  • raystack/frontier#1616: Both PRs extend the bootstrap startup path in internal/bootstrap/service.go and cmd/serve.go.

Suggested reviewers

  • AmanGIT07
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coveralls

coveralls commented Jun 21, 2026

Copy link
Copy Markdown

Coverage Report for CI Build 28435616158

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage increased (+0.7%) to 44.459%

Details

  • Coverage increased (+0.7%) from the base build.
  • Patch coverage: 141 uncovered changes across 11 files (396 of 537 lines covered, 73.74%).
  • 74 coverage regressions across 2 files.

Uncovered Changes

Top 10 Files by Coverage Impact Changed Covered %
cmd/reconcile.go 60 31 51.67%
internal/reconcile/platformuser_reconciler.go 103 76 73.79%
internal/reconcile/reconcile.go 27 0 0.0%
internal/bootstrap/bootstrapuser.go 89 66 74.16%
internal/api/v1beta1connect/platform.go 65 58 89.23%
cmd/serve.go 6 0 0.0%
core/serviceuser/service.go 51 45 88.24%
core/user/service.go 50 44 88.0%
internal/reconcile/platformuser.go 63 58 92.06%
internal/bootstrap/service.go 3 0 0.0%
Total (13 files) 537 396 73.74%

Coverage Regressions

74 previously-covered lines in 2 files lost coverage.

File Lines Losing Coverage Coverage
pkg/server/connect_interceptors/authorization.go 72 0.0%
core/user/service.go 2 67.28%

Coverage Stats

Coverage Status
Relevant Lines: 37531
Covered Lines: 16686
Line Coverage: 44.46%
Coverage Strength: 12.44 hits per line

💛 - Coveralls

@rohilsurana rohilsurana marked this pull request as ready for review June 22, 2026 05:03

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/serviceuser/service.go (1)

518-531: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make member revocation idempotent in UnSudo.

UnSudo currently returns an error when deleting a non-existent member relation. In remove flows that revoke both admin and member, this can fail after successful admin demotion and leave the operation reported as failed.

Suggested fix
-	if err := s.relationService.Delete(ctx, relation.Relation{
+	if err := s.relationService.Delete(ctx, relation.Relation{
 		Object: relation.Object{
 			ID:        schema.PlatformID,
 			Namespace: schema.PlatformNamespace,
 		},
 		Subject: relation.Subject{
 			ID:        currentUser.ID,
 			Namespace: schema.ServiceUserPrincipal,
 		},
 		RelationName: relationName,
-	}); err != nil {
+	}); err != nil && !errors.Is(err, relation.ErrNotExist) {
 		return err
 	}
core/user/service.go (1)

255-267: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat missing member relation as no-op during UnSudo.

When relationService.Delete returns relation.ErrNotExist for member, UnSudo fails even though the desired end state is already satisfied. This can break platform-user removal flows that revoke both relations.

Suggested fix
-	if err := s.relationService.Delete(ctx, relation.Relation{
+	if err := s.relationService.Delete(ctx, relation.Relation{
 		Object: relation.Object{
 			ID:        schema.PlatformID,
 			Namespace: schema.PlatformNamespace,
 		},
 		Subject: relation.Subject{
 			ID:        currentUser.ID,
 			Namespace: schema.UserPrincipal,
 		},
 		RelationName: relationName,
-	}); err != nil {
+	}); err != nil && !errors.Is(err, relation.ErrNotExist) {
 		return err
 	}

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f1ae0dde-555e-4c9a-b963-1eee98883348

📥 Commits

Reviewing files that changed from the base of the PR and between 6dc16f9 and e175a9a.

📒 Files selected for processing (16)
  • cmd/serve.go
  • config/sample.config.yaml
  • core/serviceuser/mocks/audit_record_repository.go
  • core/serviceuser/service.go
  • core/serviceuser/service_test.go
  • core/user/mocks/audit_record_repository.go
  • core/user/service.go
  • core/user/service_test.go
  • internal/api/v1beta1connect/interfaces.go
  • internal/api/v1beta1connect/mocks/service_user_service.go
  • internal/api/v1beta1connect/mocks/user_service.go
  • internal/api/v1beta1connect/platform.go
  • internal/api/v1beta1connect/platform_test.go
  • internal/bootstrap/service.go
  • internal/bootstrap/service_test.go
  • pkg/auditrecord/consts.go

Comment thread internal/bootstrap/service.go Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/user/service.go (1)

185-200: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Switch Sudo idempotency to relation-level checks.

Line 186-Line 199 uses permission checks (PlatformCheckPermission) for member. That can short-circuit Sudo(..., member) for already-admin users without creating the member relation tuple, while UnSudo now operates on exact relation tuples. This asymmetry can break downgrade flows that expect tuple-level state transitions.

Proposed fix
-	// check if already su
-	permissionName := ""
-	switch relationName {
-	case schema.MemberRelationName:
-		permissionName = schema.PlatformCheckPermission
-	case schema.AdminRelationName:
-		permissionName = schema.PlatformSudoPermission
-	}
-	if permissionName == "" {
+	// validate requested platform relation
+	switch relationName {
+	case schema.MemberRelationName, schema.AdminRelationName:
+	default:
 		return fmt.Errorf("invalid relation name, possible options are: %s, %s", schema.MemberRelationName, schema.AdminRelationName)
 	}
 
-	if ok, err := s.IsSudo(ctx, currentUser.ID, permissionName); err != nil {
+	// check if the exact relation already exists
+	if ok, err := s.IsSudo(ctx, currentUser.ID, relationName); err != nil {
 		return err
 	} else if ok {
 		return nil
 	}

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6af69031-afee-4758-b2f3-fef5aec7469f

📥 Commits

Reviewing files that changed from the base of the PR and between e175a9a and 5a346d7.

📒 Files selected for processing (5)
  • core/serviceuser/service.go
  • core/serviceuser/service_test.go
  • core/user/service.go
  • core/user/service_test.go
  • pkg/auditrecord/consts.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/serviceuser/service_test.go
  • core/serviceuser/service.go
  • core/user/service_test.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
core/user/service_test.go (1)

703-704: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Strengthen audit matcher assertions to lock the full payload contract.

These matchers only validate Event and Target.ID. Please also assert Resource (ID/Type), Target.Type, and Metadata["relation"] so payload regressions are caught.

Proposed matcher shape
- return r.Event == pkgAuditRecord.PlatformAdminAddedEvent && r.Target != nil && r.Target.ID == "test-id"
+ return r.Event == pkgAuditRecord.PlatformAdminAddedEvent &&
+   r.Target != nil &&
+   r.Target.ID == "test-id" &&
+   r.Target.Type == pkgAuditRecord.UserType &&
+   r.Resource.ID == schema.PlatformID &&
+   r.Resource.Type == pkgAuditRecord.PlatformType &&
+   r.Metadata["relation"] == schema.AdminRelationName

Also applies to: 817-818, 903-904, 958-959


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc102442-61a1-47be-b539-28d52a165871

📥 Commits

Reviewing files that changed from the base of the PR and between 5a346d7 and 1625cf6.

📒 Files selected for processing (7)
  • core/serviceuser/service.go
  • core/serviceuser/service_test.go
  • core/user/service.go
  • core/user/service_test.go
  • internal/bootstrap/service.go
  • internal/bootstrap/service_test.go
  • pkg/auditrecord/consts.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/serviceuser/service_test.go
  • core/user/service.go
  • internal/bootstrap/service_test.go
  • internal/bootstrap/service.go
  • core/serviceuser/service.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0783d969-56a8-45e7-8f00-957f3e88825f

📥 Commits

Reviewing files that changed from the base of the PR and between b059826 and b3ca1b6.

📒 Files selected for processing (5)
  • cmd/serve.go
  • config/sample.config.yaml
  • internal/bootstrap/bootstrapuser.go
  • internal/bootstrap/bootstrapuser_test.go
  • internal/bootstrap/service.go

Comment thread internal/bootstrap/bootstrapuser_test.go Outdated
Comment thread internal/bootstrap/bootstrapuser.go
Comment thread internal/bootstrap/bootstrapuser.go Outdated
@rohilsurana rohilsurana changed the title feat: add opt-in authoritative superuser reconciliation feat: GitOps platform-user management (bootstrap superuser SA + reconcile CLI) Jun 29, 2026
@rohilsurana rohilsurana changed the title feat: GitOps platform-user management (bootstrap superuser SA + reconcile CLI) feat: GitOps platform-user management + bootstrap superuser SA (protected, config-only) Jun 30, 2026
@rohilsurana

Copy link
Copy Markdown
Member Author

Superseded by the 2-PR split of this work:

Together they equal this PR's final diff. Closing in favour of the smaller, independently-reviewable PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RemovePlatformUser is not relation-selective (asymmetric with AddPlatformUser)

2 participants