[WIP] test(e2e): reuse one runtime per worker + shared readiness helper (Tier 2)#9607
Open
nishantmonu51 wants to merge 4 commits into
Open
[WIP] test(e2e): reuse one runtime per worker + shared readiness helper (Tier 2)#9607nishantmonu51 wants to merge 4 commits into
nishantmonu51 wants to merge 4 commits into
Conversation
…er 2) Reuse the rill binary across tests instead of respawning per test: - web-common rillDev fixture now starts rill once per Playwright worker (scope: worker) and resets the project between tests rather than killing the process. Reset clears the previous project's source files (preserving rill.yaml and the runtime's tmp/ DB dir, since deleting rill.yaml orphans resources and an open DuckDB can't be wiped), waits for the old resources to tear down, copies the new project, then waits for full reconciliation. - Tests needing an uninitialized/pristine project (welcome flow: init, rill-yaml empty-project, motherduck) opt out via a new freshInstance option. - Promote waitForReconciliation into web-common/tests/utils and add gotoWhenReady; the fixture now gates every test start on reconciliation instead of a fixed waitForTimeout(1500). Repoint the 5 importers. - Delete the unused startRuntimeForEachTest.ts. workers stays 1 (Tier 2 sharding deferred). Local validation: a 16-test subset across Blank/AdBids transitions with file mutation ran with 1 binary spawn (was ~16) and zero controller-closed/readonly-database teardown races.
- clickhouse/postgres connectors: mark freshInstance. They drive the welcome flow (needs an uninitialized project) and clickhouse restarts the controller by changing olap_connector — both incompatible with the shared per-worker runtime. Fixes the two web-local failures. - waitForProjectReady: key stability on resource names only, not stateVersion. A resource that re-reconciles back to idle (common during a deploy) was resetting the stability counter and stalling readiness, which failed web-integration with 'did not become ready' and no pending resources. - cliHomeDir: default to undefined and route any test that sets it to its own instance. The deploy journey relies on CLI auth state in a specific home, which the shared runtime's worker home would not provide.
… deploy churn - clickhouse/postgres: move freshInstance onto only the 'Welcome screen' blocks (which need an uninitialized project). The 'Home page' blocks stay on the shared runtime — they passed there, and the per-test reset recovers the controller restart that clickhouse's olap_connector change triggers. Marking the whole describe freshInstance had regressed the Home page tests. - waitForProjectReady: if the resource set never stabilizes within the timeout but the project did reach a fully-idle state, proceed instead of failing. The deploy tests' background cloud sync churns the resource set continuously, which previously failed with 'did not become ready (none pending)'. Validated locally: clickhouse welcome (fresh) + home page (shared) + a trailing shared spec all pass, confirming the olap change doesn't corrupt the reused runtime.
The per-test (freshInstance) path copies the project into its dir, but startRuntime then rmSync'd that dir before spawning, leaving the runtime with an empty project. This only surfaced for freshInstance tests that use a project (the deploy journey: freshInstance via cliHomeDir + project=AdBids); the welcome-flow freshInstance tests use no project so an empty dir was correct. startRuntime now assumes the caller prepared projectDir. Validated locally: a freshInstance + project:AdBids instance now loads the project's resources (waitForProjectReady no longer times out on an empty project).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Tier 2 of the e2e performance & flakiness work (follow-up to #9596). Keeps
workers: 1for now; horizontal sharding is deferred.rillbinary across tests instead of respawning per test. Theweb-commonrillDevfixture now startsrillonce per Playwright worker (scope: "worker") and resets the project between tests rather than killing the process. This eliminates ~one binary spawn per test and thecontroller closed/readonly databaseDuckDB teardown races that came with it.rill.yaml(deleting it orphans the parser's resources) and the runtime'stmp/DB dir (the open DuckDB can't be wiped). It then waits for the old resources to tear down, copies the new project, and waits for full reconciliation. Two runtime behaviours discovered while validating this are documented in code: theProjectParserstaysRUNNINGwhile watching the repo (so readiness keys on data resources only), and deletingrill.yamlleaves resources orphaned.init,rill-yamlempty-project,motherduck) opt out via a newfreshInstancefixture option.waitForReconciliationintoweb-common/tests/utilsand addgotoWhenReady. The fixture now gates every test start on full reconciliation instead of a fixedwaitForTimeout(1500), addressing the recurring navigate-before-reconcile flakiness centrally.startRuntimeForEachTest.ts.Local validation (
workers: 1): a 16-test subset spanningBlank↔AdBidstransitions with heavy file mutation ran with 1 binary spawn (was ~16) and zerocontroller closed/readonly databaseteardown races; a mixed shared +freshInstancebatch also passed.Not included: Tier 2 workers/sharding (deferred per request —
workers: 1retained) and the web-admindescribe.serialblast-radius reduction. The latter is deferred because those blocks are legitimate create→apply→delete lifecycles sharing one explore; the spec's own TODO notes per-feature explores are needed first, andadminPagecan't be used inbeforeAll. Decoupling them blind would introduce conflicts.Checklist:
Developed in collaboration with Claude Code