Compute classpath container updates in parallel #2375
Test Results: 129 files ±0, 129 suites ±0, 38m 8s ⏱️ +13s.
Results for commit 5a29036. ± Comparison against base commit dd790b2.
This pull request removes 6 and adds 1 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results.
f2245aa to
ef65c4a
If it is independent, it should be made a separate PR so that it can be merged ahead of the rest, which makes review easier.
I don't think a system property is the right choice here; instead, this should be a preference that the user can set, as with other PDE options (e.g. API tools).
Thanks for the feedback @laeubi. I agree with both points; they will be addressed before I convert this out of draft.
ef65c4a to
fa598e3
Some wall-clock numbers from my slow client: Without this parallelization, the classpath update takes between 3 and 8 minutes.
Replace the work queue of UpdateClasspathsJob with a map keyed by project so that deduplication happens structurally when a request is added instead of while draining the queue.

This allows UpdateRequest to become private again and removes the drainRequests helper that was made public only for testing, together with the isolated unit tests for it.

Addresses review feedback in eclipse-pde#2363
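The structural deduplication described in this commit message can be pictured with a small sketch. It is not the code from the commit: `ProjectRequestQueue`, its `Request` record, the `force` flag, and the use of plain project names as keys are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the names do not mirror the real UpdateClasspathsJob.
final class ProjectRequestQueue {
    // Illustrative payload: a project name plus a "force" flag (an assumption).
    record Request(String project, boolean force) {}

    // Insertion-ordered map: each project appears at most once,
    // in the order in which it was first queued.
    private final Map<String, Request> pending = new LinkedHashMap<>();

    // A second request for an already queued project is merged into the
    // existing entry, so no separate deduplication pass is needed later.
    synchronized void add(String project, boolean force) {
        pending.merge(project, new Request(project, force),
                (old, fresh) -> new Request(project, old.force() || fresh.force()));
    }

    // Hands out all pending requests and empties the queue.
    synchronized List<Request> drain() {
        List<Request> result = new ArrayList<>(pending.values());
        pending.clear();
        return result;
    }
}
```

Because the map itself guarantees one entry per project, draining reduces to copying the values, which is presumably why a helper like drainRequests no longer needs separate tests.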
When the classpath containers of many projects are updated in one run of UpdateClasspathsJob, the transitive dependency closure used to add transitive dependencies with forbidden access rules was computed from scratch for every project, although the dependency graphs of the projects in a workspace overlap to a large degree.

Introduce DependencyClosureCache, which memoizes the build-relevant closure per bundle. The closure of a project's direct dependencies is the union of the per-bundle closures. This composition is exact because the build-relevant namespaces include Fragment-Host, so a fragment always pulls its host into the same closure and requirements declared by attached fragments are followed in every closure containing the fragment.

The cache is scoped to a single job run because the resolved state can change between runs.

Contributes to eclipse-pde#2361
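To illustrate the per-bundle memoization and the union step, here is a minimal sketch over a plain string graph. It is not the actual DependencyClosureCache: `ClosureCacheSketch`, the `requires` map, and the `misses` counter are hypothetical, and real PDE code works on the resolved OSGi state rather than on strings. In this sketch a closure also contains the bundle itself.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; bundles are plain strings instead of resolved OSGi resources.
final class ClosureCacheSketch {
    private final Map<String, List<String>> requires; // bundle -> direct requirements
    private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();
    final AtomicInteger misses = new AtomicInteger(); // for demonstration only

    ClosureCacheSketch(Map<String, List<String>> requires) {
        this.requires = requires;
    }

    // Returns the memoized closure of one bundle (the bundle itself included).
    Set<String> closureOf(String bundle) {
        Set<String> cached = cache.get(bundle);
        if (cached != null) {
            return cached;
        }
        misses.incrementAndGet();
        // Iterative traversal; avoids recursive updates of the concurrent map.
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(bundle);
        while (!todo.isEmpty()) {
            String next = todo.pop();
            if (seen.add(next)) {
                for (String dep : requires.getOrDefault(next, List.of())) {
                    todo.push(dep);
                }
            }
        }
        Set<String> closure = Set.copyOf(seen);
        // If another thread stored a result first, use that one.
        Set<String> raced = cache.putIfAbsent(bundle, closure);
        return raced != null ? raced : closure;
    }

    // Closure of a project = union of the closures of its direct dependencies.
    Set<String> closureOfAll(Collection<String> directDependencies) {
        Set<String> union = new LinkedHashSet<>();
        for (String dep : directDependencies) {
            union.addAll(closureOf(dep));
        }
        return union;
    }
}
```

A new instance per job run would match the scoping described above, since a cache that outlives the run could serve closures from an outdated resolved state.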
The classpath containers of all queued projects were computed one after another in UpdateClasspathsJob, although the per-project computation only reads the resolved target state and the project's own saved state. For a workspace with hundreds of plug-ins this dominated the time spent reloading the target platform.

Split the job into a parallel compute phase and the existing batched apply phase. The entries of each project are computed on a bounded thread pool (one less than the number of available processors), collected in request order, and then applied to the Java model in the single existing JavaCore.setClasspathContainer call.

The shared dependency closure cache is already backed by a concurrent map, the resolved OSGi state is only read, and the per-project state files are written from the sequential apply phase, so no additional locking is required. Projects backed by a bnd file are computed sequentially because they share a single bnd workspace that is not safe for concurrent access.

The parallelization can be disabled with -Dpde.classpath.parallelUpdate=false and automatically falls back to sequential computation for a single request or on machines with few cores.

Contributes to eclipse-pde#2361
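The compute phase described in this commit message might look roughly like the following sketch. It is not the code from the commit: `ParallelComputeSketch` and `computeAll` are hypothetical names, the per-project work is abstracted into a `Function`, and the threshold of fewer than two worker threads for "few cores" is an assumption. Only the system property name and the "processors minus one" pool size come from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch of a parallel compute phase with a sequential fallback.
final class ParallelComputeSketch {

    // Computes one result per project and returns them in request order.
    static <P, R> List<R> computeAll(List<P> projects, Function<P, R> compute) {
        boolean enabled = Boolean.parseBoolean(
                System.getProperty("pde.classpath.parallelUpdate", "true"));
        int threads = Runtime.getRuntime().availableProcessors() - 1;
        // Assumed fallback rule: disabled, a single request, or too few cores.
        if (!enabled || projects.size() < 2 || threads < 2) {
            List<R> results = new ArrayList<>();
            for (P project : projects) {
                results.add(compute.apply(project));
            }
            return results;
        }
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.min(threads, projects.size()));
        try {
            List<Callable<R>> tasks = new ArrayList<>();
            for (P project : projects) {
                tasks.add(() -> compute.apply(project));
            }
            // invokeAll returns the futures in the same order as the tasks.
            List<R> results = new ArrayList<>();
            for (Future<R> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while computing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("computation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The returned list would then feed the sequential apply phase, which is where the single batched JavaCore.setClasspathContainer call and the state-file writes stay. Work that is not safe to run concurrently, such as the bnd-backed projects mentioned above, would have to be kept out of the task list and computed on the calling thread.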
fa598e3 to
5a29036
Speeds up the PDE classpath container update when many projects are refreshed in one run, for example after a target platform reload. Because the per-project computation only reads the resolved target state and each project's own saved state, it now runs on a bounded thread pool (one less than the number of cores) and the results are applied to the Java model in the existing single batched JavaCore.setClasspathContainer call. It builds on a shared, concurrent dependency-closure cache so the heavily overlapping dependency graphs of workspace plug-ins are traversed once instead of from scratch per project. Parallelization can be disabled with -Dpde.classpath.parallelUpdate=false and falls back to sequential computation for a single request, on low-core machines, and for bnd-backed projects. The base commit is a small refactor of the update queue that the cache and parallel changes build on.

Draft for now. Contributes to #2361.