fix(ssl): serialize ssl operations with a process-wide file lock #498

Open
mrrobot47 wants to merge 1 commit into EasyEngine:develop from mrrobot47:fix/ssl-concurrency-lock

Conversation

@mrrobot47

Member

Problem

Nothing serialized SSL operations. A cron ssl-renew --all running concurrently with a manual ee site ssl/ssl-verify (or two overlapping crons) would read and write the same certificate_order.json, account/key.private.pem, and acme-conf/var/{domain}/* files, risking corrupted JSON, duplicate ACME orders, and account-key overwrites. No flock existed anywhere on the SSL path.

Fix

Acquire a single global, exclusive, non-blocking flock (EE_ROOT_DIR/ssl-global.lock) at the three ACME entry points: init_le() (before register()/authorize() write account/order state), ssl_verify(), and ssl_renew(). A concurrent SSL operation in another process fails fast with a clear "another SSL operation is already in progress" error instead of racing. The lock is held for the whole operation and released automatically on process exit (it is an advisory flock, so it is crash-safe).
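For readers unfamiliar with the fail-fast pattern, here is a minimal sketch of an exclusive, non-blocking flock. It is a Python analogue for illustration, not the PHP code in this PR; the names try_acquire, SslLockBusy, and refused_in_child are hypothetical, and the fork-based helper assumes a POSIX system.

```python
import fcntl
import os


class SslLockBusy(RuntimeError):
    """Raised when some other holder already has the lock (hypothetical name)."""


def try_acquire(lock_path):
    """Take an exclusive, non-blocking flock on lock_path.

    Returns the open file object. The lock lasts until that object is
    closed or the process exits, so the caller must keep a reference.
    """
    handle = open(lock_path, "a")  # create if missing, never truncate
    try:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # LOCK_NB makes flock fail immediately instead of waiting.
        handle.close()
        raise SslLockBusy("another SSL operation is already in progress")
    return handle


def refused_in_child(lock_path):
    """Fork a child that tries the lock; return True if it was refused."""
    pid = os.fork()
    if pid == 0:
        # Child: exit code 1 means "lock busy", 0 means "acquired".
        try:
            try_acquire(lock_path)
        except SslLockBusy:
            os._exit(1)
        os._exit(0)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1
```

While the parent keeps the object returned by try_acquire open, refused_in_child reports a refusal; once the parent closes it (or exits), the next attempt succeeds without any cleanup step.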

The handle is a process-level static with a reentrancy short-circuit, and this is load-bearing. ssl-renew --all dispatches each site via EE::run_command in one process (a fresh command instance per site), and flock denies a second lock on the same file through a separately opened fd even within the same process, so an instance-level guard would make site #2 of --all wrongly error. With the static handle, the first acquire takes the lock and every later or nested acquire (init_le → ssl_verify; each --all site) returns reentrantly. This intentionally differs from the backup lock's instance handle + blocking wait + explicit release: SSL's --all-in-one-process pattern needs a process-wide handle, fail-fast behavior for cron, and no per-site release.
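The reentrancy argument can be sketched as follows. This is again a Python analogue under stated assumptions, not the PR's implementation: the module-level _ssl_lock_handle stands in for the process-level static, and acquire_ssl_lock and second_fd_is_denied are hypothetical names. The behavior shown is that of flock(2) on Linux and the BSDs; systems that emulate flock with fcntl record locks may behave differently.

```python
import fcntl

# Stand-in for the process-level static: one handle shared by every caller
# in this process, regardless of which command instance asks for the lock.
_ssl_lock_handle = None


def acquire_ssl_lock(lock_path):
    """Return True once this process holds the lock; raise if another holder has it."""
    global _ssl_lock_handle
    if _ssl_lock_handle is not None:
        # Reentrancy short-circuit: this process already holds the lock,
        # so nested or repeated acquires succeed without touching flock.
        return True
    handle = open(lock_path, "a")  # create if missing, never truncate
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        raise RuntimeError("another SSL operation is already in progress")
    _ssl_lock_handle = handle  # kept open so the lock lives until process exit
    return True


def second_fd_is_denied(lock_path):
    """Return True if a fresh open() of the same file is refused the lock.

    This is what a per-instance guard would run into on the second site:
    the new descriptor conflicts with the one this process already holds.
    """
    other = open(lock_path, "a")
    try:
        fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return False
    except BlockingIOError:
        return True
    finally:
        other.close()
```

Calling acquire_ssl_lock twice in one process succeeds both times, while second_fd_is_denied returns True for the same path, which is the failure an instance-level handle would produce for the second site of --all.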

Testing

Manual: start a long-running ee site create <le-site> (or ssl-renew --all) and concurrently run ee site ssl-verify <other>. The second command exits immediately with the "in progress" error; sequential runs and --all (many sites in one process) work normally.

Copilot AI left a comment

Contributor

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


2 participants