
Fix reload reconciliation of keyed children #72

Closed
samuel-williams-shopify wants to merge 1 commit into main from reload-reconciliation-tests



@samuel-williams-shopify commented Jul 1, 2026

Reloading a container is supposed to reconcile keyed children against the current configuration: reuse unchanged keys, spawn newly-appeared keys, and stop keys that have disappeared. The stop half was broken.

Bugs fixed

  1. NoMethodError on removal. Keyed#stop? called @value.stop, but Forked::Child/Threaded::Child have no stop method (only interrupt!/terminate!/kill!). So dropping a keyed child on reload always raised. Never covered by a test.

  2. Respawn of removed restart: true children. Even once stopped, a restart: true keyed child's supervising fiber would immediately respawn it (the restart && !@stopping gate is container-wide). Falcon's virtual hosting spawns exactly these (spawn(restart: true, key: path)).

  3. wait_until_ready hang after a reuse/remove-only reload. wait_until_ready slept before checking readiness, so a reload that spawned no new child (nothing sends a readiness message) blocked forever in select even though everything was already ready.
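The ordering problem in item 3 can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyReadiness`, `report_ready`, `wait_sleep_first` and `wait_check_first` are hypothetical names invented for illustration, not part of the container's API, and a timeout stands in for what would be an indefinite hang in the real code.

```ruby
# Hypothetical toy model, not the container's real implementation.
# Children report readiness by writing their name to a shared pipe.
class ToyReadiness
  def initialize(ready)
    @ready = ready # Hash of child name => true/false
    @reader, @writer = IO.pipe
    @writer.sync = true
  end

  # Simulates a child sending its readiness message.
  def report_ready(name)
    @writer.puts(name)
  end

  def ready?
    @ready.values.all?
  end

  # Old ordering: block on the pipe first, check readiness afterwards.
  # Returns :timeout where the real code would block forever.
  def wait_sleep_first(timeout)
    loop do
      return :timeout unless IO.select([@reader], nil, nil, timeout)
      @ready[@reader.gets.chomp] = true
      return :ready if ready?
    end
  end

  # Fixed ordering: check readiness first, block only while a child is pending.
  def wait_check_first(timeout)
    until ready?
      return :timeout unless IO.select([@reader], nil, nil, timeout)
      @ready[@reader.gets.chomp] = true
    end
    :ready
  end
end

# A reuse/remove-only reload: every child is already ready, nothing will write.
reused = ToyReadiness.new({ "a" => true, "b" => true })
reused.wait_sleep_first(0.05) # => :timeout
reused.wait_check_first(0.05) # => :ready
```

With the check moved ahead of the blocking call, a reload that spawns nothing returns immediately, while a reload that does spawn a child still waits for its message.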

Changes

  • Group#stop_child(channel, graceful) — stop a single child with the same multi-phase sequence as Group#stop (SIGINT + wait, then SIGKILL + wait).
  • Channel#stopping! / #stopping? — a per-child flag, checked by the supervising fiber's restart gate, so a deliberately-stopped child is not respawned. Lives on the child (whose lifetime matches the flag and which resets naturally on restart), keeping the child-reported state hash free of container control flags.
  • Generic#stop_child marks the child stopping and delegates to the group; the supervising fiber removes the entry from @keyed/@state when the child exits.
  • Generic#reload snapshots keyed values and stops any that were not re-marked (outside the @keyed iteration, to avoid re-entrant mutation).
  • Generic#wait_until_ready checks readiness before sleeping.
  • Removed the broken Keyed#stop?.
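Taken together, the changes above amount to a mark-and-sweep over keyed children with a per-child restart gate. The following is a minimal sketch of that shape, assuming a simplified synchronous model: `ToyChild`, `ToyReloader`, `child_exited` and the `log` array are hypothetical illustrations, and no real processes, signals or fibers are involved.

```ruby
# Hypothetical toy model of reload reconciliation, not the real container code.
class ToyChild
  attr_reader :key

  def initialize(key)
    @key = key
    @marked = true
    @stopping = false
  end

  def mark!
    @marked = true
  end

  def clear!
    @marked = false
  end

  def marked?
    @marked
  end

  # Per-child flag consulted by the restart gate.
  def stopping!
    @stopping = true
  end

  def stopping?
    @stopping
  end
end

class ToyReloader
  attr_reader :keyed, :log

  def initialize
    @keyed = {}
    @log = []
  end

  # Reuse an existing key (re-marking it) or spawn a new child.
  def spawn(key)
    if (child = @keyed[key])
      child.mark!
      @log << [:reuse, key]
    else
      @keyed[key] = ToyChild.new(key)
      @log << [:spawn, key]
    end
  end

  def reload
    @keyed.each_value(&:clear!)
    yield self
    # Snapshot first, so stopping children never mutates @keyed mid-iteration.
    stale = @keyed.values.reject(&:marked?)
    stale.each { |child| stop_child(child) }
  end

  def stop_child(child)
    child.stopping!
    @log << [:stop, child.key]
    child_exited(child, restart: true)
  end

  # Stand-in for the supervising fiber's logic when a child exits.
  def child_exited(child, restart:)
    if restart && !child.stopping?
      @log << [:respawn, child.key]
    else
      @keyed.delete(child.key)
    end
  end
end

container = ToyReloader.new
container.reload { |c| c.spawn("a"); c.spawn("b") }
container.reload { |c| c.spawn("b"); c.spawn("c") }
container.keyed.keys # => ["b", "c"]
```

In this model the `restart: true` child keyed `"a"` is removed rather than respawned, because the gate looks at the child's own `stopping?` flag instead of a container-wide one.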

Tests

Adds two behavioural tests under with "#reload": "spawns a newly configured keyed child" and "stops a keyed child that is no longer configured". The removal test is the regression guard. The full suite (193) is green.

Scope note: this implements synchronous (blocking) reap in reload, correct for the current (non-reactor) controller. Once the controller runs under Async, Group#stop_child and Channel#stopping! can be replaced by per-child Async::Task cancellation.

Assisted-By: devx/904563b8-dbee-48b0-9726-f036df3ed96d
@samuel-williams-shopify

Superseded by #73 — rather than fix the incomplete reconcile-remove, we're removing reload/keyed reconciliation and will revisit with a simpler (adoption-based) design.
