Skip to content

Wire OBO SecretEnvVars into MCPServer and MCPRemoteProxy#5540

Merged
tgrunnagle merged 4 commits into
mainfrom
obo_issue_5537
Jun 16, 2026
Merged

Wire OBO SecretEnvVars into MCPServer and MCPRemoteProxy#5540
tgrunnagle merged 4 commits into
mainfrom
obo_issue_5537

Conversation

@tgrunnagle

@tgrunnagle tgrunnagle commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Summary

The operator's OBOHandler extension point exposes three dispatch points — Validate, ApplyRunConfig, and SecretEnvVars. Validate and ApplyRunConfig are reached uniformly from every MCPExternalAuthConfig consumer, but SecretEnvVars was dispatched from only one of the three (VirtualMCPServer). As a result, in a build with a registered (enterprise) OBO handler, an obo-typed MCPExternalAuthConfig referenced from an MCPServer or MCPRemoteProxy had its OBO middleware wired into the RunConfig but never received the env var(s) the handler's SecretEnvVars returns — so the middleware could not read its credential at startup. (This is inert in stock builds: the default handler returns obo.ErrEnterpriseRequired.)

This PR closes that gap by giving the secret-env path a shared OBO dispatcher and wiring it into the two missing consumers.

  • Why: the secret-env path lacked the single shared dispatch point that the middleware path (AddExternalAuthConfigOptions) already has, so two of three consumers silently omitted the OBO env vars — a "every consumer must remember to call it" failure mode.
  • What: added controllerutil.AddOBOSecretEnvVars and wired it into the MCPServer deployment builder + drift check and into MCPRemoteProxy's buildEnvVarsForProxy. VirtualMCPServer's obo arm was also reconciled to forward all handler env vars (it previously kept only the first).

Closes #5537

Type of change

  • Bug fix

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Validation run: full task operator-test passes; golangci-lint is clean on all changed files. (One repo-wide pre-existing gosec finding in cmd/thv/app/upgrade.go is unrelated and untouched.)

Test coverage added/updated:

  • Dispatcher unit tests (tokenexchange_test.go): every ExternalAuthType (nil ref, missing config, and tokenExchange/bearerToken/headerInjection/unauthenticated/embeddedAuthServer/awsSts/upstreamInject all contribute no env vars here; obo with the default handler is inert), plus a registered-handler case asserting the handler's env vars are forwarded verbatim and a genuine-error case. The non-obo rows now seed real secret refs so the "must stay empty" assertions act as regression guards.
  • Controller tests: TestMCPServerDeployment_OBOSecretEnvVars and TestMCPRemoteProxyDeployment_OBOSecretEnvVars assert the OBO env var is injected into the proxy container and that the freshly built deployment does not drift; TestMCPServerDeployment_OBOSecretEnvVars_GenuineErrorDivergence locks in the documented builder/drift behavior on a genuine (non-enterprise) handler error.
  • VirtualMCPServer: its obo dispatch test is updated for the slice return and still asserts obo.ErrEnterpriseRequired is propagated (the wired-but-disabled contract from Wire OBO dispatch arms and reconciler branch; add integration tests #5328).

Changes

File Change
cmd/thv-operator/pkg/controllerutil/tokenexchange.go Add AddOBOSecretEnvVars — resolves the ref, dispatches only the obo type through OBOSecretEnvVars, swallows obo.ErrEnterpriseRequired to (nil, nil). Name carries the obo-only scope so it isn't mistaken for full type coverage.
cmd/thv-operator/controllers/mcpserver_controller.go Call AddOBOSecretEnvVars in both deploymentForMCPServer (builder) and deploymentNeedsUpdate (drift check), at the same position, to keep the two symmetric. Add a shared logOBOSecretEnvVarError helper that logs verbatim only for *obo.ValidationError and generically otherwise (avoids leaking Secret names).
cmd/thv-operator/controllers/mcpremoteproxy_deployment.go Call AddOBOSecretEnvVars in buildEnvVarsForProxy, which feeds both builder and drift paths; log errors via the shared logOBOSecretEnvVarError.
cmd/thv-operator/controllers/virtualmcpserver_deployment.go Rename getExternalAuthConfigSecretEnvVargetExternalAuthConfigSecretEnvVars (slice return); its obo arm now forwards all handler env vars via OBOSecretEnvVars (propagating ErrEnterpriseRequired per #5328); remove firstOBOSecretEnvVar.
cmd/thv-operator/pkg/controllerutil/tokenexchange_test.go Table-driven unit tests for the dispatcher across every auth type (with secret-ref regression guards), plus registered-handler and error cases.
cmd/thv-operator/controllers/mcpserver_externalauth_test.go Controller tests asserting OBO env injection + no drift for MCPServer, and the genuine-error builder/drift divergence.
cmd/thv-operator/controllers/mcpremoteproxy_deployment_test.go Controller test asserting OBO env injection and no drift for MCPRemoteProxy.
cmd/thv-operator/controllers/virtualmcpserver_externalauth_test.go Update the obo dispatch test for the slice return; assert ErrEnterpriseRequired is still propagated.

Does this introduce a user-facing change?

No upstream behavior change — obo is inert in stock builds. For out-of-tree builds with a registered OBO handler, an obo-typed MCPExternalAuthConfig referenced from an MCPServer or MCPRemoteProxy now correctly injects the handler's secret env var(s) into the proxy container, matching VirtualMCPServer. VirtualMCPServer itself now forwards all of a handler's env vars rather than only the first — a no-op for today's single-var handlers, but it removes a latent divergence.

Implementation plan

Approved implementation plan (+ PR-review updates)

The issue's literal proposal was to lift VirtualMCPServer's whole env-var switch into a shared dispatcher and route all three consumers through it. That is not safe: the consumers use different env var naming for the same non-obo types — MCPServer/MCPRemoteProxy rely on the fixed name TOOLHIVE_TOKEN_EXCHANGE_CLIENT_SECRET that the runtime middleware hardcodes (pkg/oauthproto/tokenexchange/middleware.go), while VirtualMCPServer uses unique per-config names. Routing them through vMCP's switch would break the runtime and violate the issue's own "byte-identical pod env per consumer" acceptance criterion.

The dispatcher is therefore scoped to only the obo type, whose env var name is handler-defined and identical across consumers. Every other type keeps its existing per-consumer helper. This scoping ("Option A") was explicitly approved before implementation.

AddOBOSecretEnvVars swallows obo.ErrEnterpriseRequired (returns nil, nil), so the MCPServer builder and drift check stay symmetric in handler-less builds and there is no reconcile hot-loop.

Updated after PR review (#5540): five non-blocking findings (0 HIGH · 2 MEDIUM · 3 LOW) were addressed in two commits:

  • F2 (MEDIUM) — renamed AddExternalAuthConfigSecretEnvVarsAddOBOSecretEnvVars so the identifier reflects the obo-only scope.
  • F1 (MEDIUM) — corrected the drift-site comment: the builder/drift divergence on a genuine handler error persists every reconcile while the handler keeps erroring (it does not "self-resolve" for persistent errors).
  • F5 (LOW) — log raw handler errors verbatim only for *obo.ValidationError; otherwise a generic message pointing at the MCPExternalAuthConfig status, so transient errors don't leak Secret names.
  • F3 (LOW) — secret-ref regression guards in the dispatcher table + a controller test locking in the genuine-error divergence.
  • F4 (LOW) — reconciled VirtualMCPServer rather than deferring: it now forwards all obo env vars. It still calls OBOSecretEnvVars directly (not AddOBOSecretEnvVars), because that wrapper swallows ErrEnterpriseRequired and vMCP must propagate it per Wire OBO dispatch arms and reconciler branch; add integration tests #5328 (its test enforces the wired-but-disabled contract). So all three consumers now forward all obo env vars and differ only in the intentional, separately-tested ErrEnterpriseRequired handling.

All five acceptance criteria from the issue are met and verified by the tests above.

Special notes for reviewers

  • Scoping is deliberate. AddOBOSecretEnvVars handles only the obo type by design; the doc comment explains why the other types must stay on their per-consumer helpers (consumer-specific env var names that must remain byte-identical). Please do not "complete" the switch by moving other types into it — that would reintroduce the runtime breakage described in the implementation plan.
  • Builder/drift symmetry. In MCPServer, the dispatcher is called at the same position in both deploymentForMCPServer and deploymentNeedsUpdate; in MCPRemoteProxy the single buildEnvVarsForProxy feeds both paths. The ErrEnterpriseRequired-to-(nil, nil) swallow is what keeps the two sides computing identical env in handler-less builds. On a genuine handler error the MCPServer builder logs-and-continues while the drift check reports drift — a pre-existing pattern (token-exchange behaves the same), documented at the drift site and out of scope to unify here.
  • VirtualMCPServer keeps a separate dispatch on purpose. It calls OBOSecretEnvVars directly rather than AddOBOSecretEnvVars, because it must propagate ErrEnterpriseRequired (the swallowing wrapper would mask the wired-but-disabled state, breaking the Wire OBO dispatch arms and reconciler branch; add integration tests #5328 contract its test enforces). All three consumers now forward every handler env var; only the ErrEnterpriseRequired handling differs, and that difference is intentional and tested on both sides.

tgrunnagle and others added 2 commits June 16, 2026 11:25
The OBOHandler exposes three operator dispatch points, but SecretEnvVars
was dispatched from VirtualMCPServer only. For a build with a registered
OBO handler, an obo-typed MCPExternalAuthConfig referenced from an
MCPServer or MCPRemoteProxy got its middleware wired into the RunConfig
yet never received the env var(s) the handler asks for, so the OBO
middleware could not read its credential at startup.

Add AddExternalAuthConfigSecretEnvVars in controllerutil — the secret-env
sibling of AddExternalAuthConfigOptions — and call it from MCPServer (both
the deployment builder and the drift check) and MCPRemoteProxy. The
dispatcher handles only the obo type, whose env var name is defined by the
handler and is therefore identical across consumers; every other type
keeps its existing per-consumer helper so pod env stays byte-identical and
VirtualMCPServer is untouched. ErrEnterpriseRequired is treated as "no env
vars" so builder and drift agree in handler-less builds (no hot-loop).

Implements changes for issue #5537:
- Add the shared obo-scoped secret-env dispatcher with unit tests covering
  every ExternalAuthType plus the registered-handler and inert cases
- Inject OBO env vars in MCPServer builder + drift and MCPRemoteProxy
- Add controller tests asserting the env var is present and does not drift

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixed issues from code review (comment-only, no behavior change):
- MEDIUM: Document the known builder/drift divergence on a genuine OBO
  handler error at the MCPServer drift site; the inert/handler-less path
  stays symmetric, and unifying the transient-error case is out of scope
- LOW: Note in AddExternalAuthConfigSecretEnvVars that VirtualMCPServer's
  firstOBOSecretEnvVar keeps only the first env var, so the two are
  reconciled if multi-var handlers appear

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the size/M Medium PR: 300-599 lines changed label Jun 16, 2026
@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 83.63636% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.85%. Comparing base (3ebb76b) to head (9944d6f).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...-operator/controllers/mcpremoteproxy_deployment.go 70.00% 1 Missing and 2 partials ⚠️
...d/thv-operator/controllers/mcpserver_controller.go 86.36% 2 Missing and 1 partial ⚠️
...perator/controllers/virtualmcpserver_deployment.go 72.72% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5540   +/-   ##
=======================================
  Coverage   69.84%   69.85%           
=======================================
  Files         647      647           
  Lines       65796    65833   +37     
=======================================
+ Hits        45957    45985   +28     
- Misses      16494    16498    +4     
- Partials     3345     3350    +5     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tgrunnagle tgrunnagle left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-agent review summary

Reviewed by 5 specialist passes (Go correctness, operator/reconciliation, OBO/auth, test coverage, general quality). Recommendation: COMMENT — no blocking issues. 0 HIGH · 2 MEDIUM · 3 LOW. The fix correctly closes #5537 for both MCPServer and MCPRemoteProxy, all five acceptance criteria are met, and the deliberate narrowing from the issue's literal proposal (obo-only dispatcher, VirtualMCPServer left unchanged) is sound and well documented.

What's strong

  • Builder/drift symmetry is correct: identical call and position in both MCPServer paths; structural in MCPRemoteProxy via the single buildEnvVarsForProxy.
  • ErrEnterpriseRequired(nil, nil) keeps stock builds inert and prevents a reconcile hot-loop.
  • Thorough tests: every ExternalAuthType, nil-ref, missing-config, inert handler, verbatim forwarding, and genuine-error propagation; controller tests assert both injection and no-drift.

Findings (details in inline comments)

# Sev Finding
F1 MEDIUM Builder/drift error-handling asymmetry can churn on a persistently-failing handler; the "self-resolves" comment is imprecise.
F2 MEDIUM AddExternalAuthConfigSecretEnvVars name implies full type coverage but only handles obo.
F3 LOW The obo-only boundary is guarded only by prose; the controller-level genuine-error divergence is untested.
F4 LOW firstOBOSecretEnvVar (vMCP) keeps only the first env var while this dispatcher forwards all — latent split if a handler ever returns 2+ vars.
F5 LOW Builder/proxy log the raw handler error, which per the error contract may include Secret names.

Highest-value quick wins: F2 (rename) and F1 (comment wording). F3–F5 are reasonable follow-ups.

🤖 Generated with Claude Code

Comment thread cmd/thv-operator/controllers/mcpserver_controller.go
Comment thread cmd/thv-operator/pkg/controllerutil/tokenexchange.go Outdated
Comment thread cmd/thv-operator/pkg/controllerutil/tokenexchange_test.go
Comment thread cmd/thv-operator/pkg/controllerutil/tokenexchange.go Outdated
Comment thread cmd/thv-operator/controllers/mcpserver_controller.go Outdated
tgrunnagle and others added 2 commits June 16, 2026 12:20
Addresses #5540 review comments:
- MEDIUM tokenexchange.go (3423253267): rename AddExternalAuthConfigSecretEnvVars
  to AddOBOSecretEnvVars so the name reflects its obo-only scope and can't be
  misread as the full type coverage its sibling AddExternalAuthConfigOptions has
- MEDIUM mcpserver_controller.go (3423253261): correct the drift-site comment —
  the builder/drift divergence on a genuine handler error persists every reconcile
  while the handler keeps erroring; it does not "self-resolve" for persistent errors
- LOW mcpserver_controller.go (3423253288): log raw handler errors verbatim only
  for *obo.ValidationError (contractually safe); otherwise log a generic message
  pointing at the MCPExternalAuthConfig status, so transient errors don't leak
  Secret names. Shared logOBOSecretEnvVarError helper used by both controllers
- LOW tokenexchange_test.go (3423253271): populate secret refs on the non-obo
  table rows so the "must stay empty" assertions are real regression guards, and
  add a controller test locking in the genuine-error builder/drift divergence

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses #5540 review comments:
- LOW virtualmcpserver_deployment.go (3423253276): vMCP's firstOBOSecretEnvVar
  kept only the first handler env var while MCPServer/MCPRemoteProxy forward all,
  a latent split that would silently reproduce the class of bug #5537 fixed if a
  handler ever returned 2+ vars. Reconcile it now rather than deferring.

Change getExternalAuthConfigSecretEnvVar to return a slice
(getExternalAuthConfigSecretEnvVars) and have its obo arm forward every env var
via OBOSecretEnvVars, removing firstOBOSecretEnvVar. vMCP keeps calling
OBOSecretEnvVars directly (not the swallowing AddOBOSecretEnvVars wrapper) so it
still propagates obo.ErrEnterpriseRequired per #5328 — its existing test and the
wired-but-disabled contract are preserved.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed size/L Large PR: 600-999 lines changed labels Jun 16, 2026
@tgrunnagle tgrunnagle marked this pull request as ready for review June 16, 2026 19:30
@tgrunnagle tgrunnagle merged commit 64b0eff into main Jun 16, 2026
76 of 77 checks passed
@tgrunnagle tgrunnagle deleted the obo_issue_5537 branch June 16, 2026 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Operator: OBOHandler.SecretEnvVars is not dispatched for MCPServer and MCPRemoteProxy

2 participants