Skip to content

Implement a make jobserver (continuation of #9210)#9625

Merged
hsbt merged 8 commits into
masterfrom
claude/silly-joliot-7993f2
Jun 16, 2026
Merged

Implement a make jobserver (continuation of #9210)#9625
hsbt merged 8 commits into
masterfrom
claude/silly-joliot-7993f2

Conversation

@hsbt

@hsbt hsbt commented Jun 16, 2026

Copy link
Copy Markdown
Member

Resumes #9210. @Edouard-chin I'd like to get this solved, so I'm picking your work back up here. This rebases your jobserver implementation onto the current master, resolving the conflicts from the spec directory move and the priority queue change.

Fixes #9170.

Make sure the following tasks are checked

- Fix #9170
- In #9131, we added running `make` in parallel with the `-j` flag.

  It's problematic because since Bundler installs gems in parallel,
  we could end up installing `N gem x M processors` which would exhaust
  the machine resources and causes issues like #9170.

  In this patch, I have implemented a make [jobserver](https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html),
  so that each thread installing a gem can take available job slots
  from the pool.

  I also want to highlight that with this patch, we can still end up
  running 2 more jobs than what's available. The reason being that
  if a gem is waiting for slots to be available, then the whole
  `bundle install` takes much longer which defeats the whole purpose
  of the `make -j` optimization. So if there is no more slots
  available, we compile the extension without parallelization
  (we still use one cpu though).

  Example
  ----------

  If `bundle install -j 6` is running, then we have 6 slots
  available. Each gem with native extension that gets installed may
  take up to *3* slots (see my reasoning on this number below).

  In the event where 3 gems with native extensions get installed,
  2 gems will consume all 6 slots, leaving the third gem without any.
  Ultimately, this means that we could use *at maximum 2 more
  slots than what's available* (e.g. `bundle install -j 3` for 3 gems
  will use 5 slots).

  I chose `3` slots per `make` process because after installing
  multiple gems, I didn't see any benefit to adding more jobs.
  It's possible that some gems have many `make` recipes that could
  run more than 3 in parallel, but I don't think this is very frequent.
Comment thread bundler/lib/bundler/installer/parallel_installer.rb Fixed
Comment thread bundler/lib/bundler/installer/parallel_installer.rb Fixed
hsbt and others added 6 commits June 16, 2026 10:41
IO.new defaults to read mode when no mode is given. On POSIX, Ruby
inspects the descriptor's access mode and opens the write end of the
jobserver pipe for writing anyway, but on Windows that inspection is not
available, so releasing slots raised "IOError: not opened for writing".
Pass explicit modes so the write end is usable on every platform.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The guard that avoids appending a duplicate -j used /-j\d?(\s|\Z)/, but
\d? matches at most one digit, so an existing -j10 (or any two-digit job
count) failed to match and a second -j was appended anyway. Use \d* so a
job count of any width is recognized.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
connect_to_jobserver matched the first --jobserver-auth in MAKEFLAGS, but
the value is appended after any pre-existing flags. When bundle install
runs under a parent make jobserver, MAKEFLAGS already carries the parent's
--jobserver-auth, so the leftmost match returned the parent's descriptors
instead of the pool ParallelInstaller just created. Scan and take the last
match so we always connect to our own pipe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
If the read end raised while closing, the original ensure skipped both the
write-end close and the MAKEFLAGS restore, leaving --jobserver-auth with
closed descriptors in the process environment. Restore MAKEFLAGS first and
guard the closes so the environment is always cleaned up. Building the value
with compact.join also drops the stray leading space when MAKEFLAGS was
previously unset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
redefine_build_jobs coordinated the two worker threads with `sleep(0.1)
while flag` loops. If a thread died before clearing its flag the other
spun forever, and the polling added latency to every run. Use blocking
Thread::Queue handoffs so the ordering is deterministic and a failure
surfaces instead of hanging on a flag that never flips.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Name the per-gem slot cap MAX_JOBS_PER_GEM with a comment explaining the
limit, rename the local to acquired_jobs since it counts grabbed tokens
rather than free ones, and rescue EOFError alongside IO::WaitReadable so a
closed jobserver pipe falls back to a single job instead of raising.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bundler's quality spec rejects weak modifiers in lib source, and the
connect_to_jobserver comment used "the pool we just created". Name
ParallelInstaller instead so the meaning stays clear without "just".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hsbt hsbt merged commit fbcd4f8 into master Jun 16, 2026
110 checks passed
@hsbt hsbt deleted the claude/silly-joliot-7993f2 branch June 16, 2026 04:51
@hsbt hsbt mentioned this pull request Jun 16, 2026
4 tasks
@Edouard-chin

Copy link
Copy Markdown
Member

Thank you soo much for picking this again @hsbt ❤️ !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Running make with -j causes issues in docker environments

2 participants