Skip to content

fix(mcp-bridge): terminate backend process on client disconnect and bound concurrent sessions#13482

Open
shreemaan-abhishek wants to merge 1 commit into
apache:masterfrom
shreemaan-abhishek:fix/mcp-bridge-session-cleanup
Open

fix(mcp-bridge): terminate backend process on client disconnect and bound concurrent sessions#13482
shreemaan-abhishek wants to merge 1 commit into
apache:masterfrom
shreemaan-abhishek:fix/mcp-bridge-session-cleanup

Conversation

@shreemaan-abhishek

Copy link
Copy Markdown
Contributor

Description

The mcp-bridge plugin spawns a backend process per SSE connection and relies on the SSE session loop returning to tear that process down. Today the loop only ends when a keepalive write fails, which can take up to two 30s ping cycles or, if writes keep being absorbed by the connection, may not happen at all. As a result a backend process started for a client that has already gone away can stay alive until the worker reloads. Separately, there is no ceiling on how many sessions a worker will keep open, so a route can keep spawning backend processes for as many connections as are opened.

This PR makes session teardown deterministic and bounds concurrency:

  • Register an ngx.on_abort handler so a client disconnect stops the session promptly instead of waiting for the next keepalive write to fail.
  • Make the ping loop wake early once the session has been asked to stop, so the backend process is released without waiting out the keepalive interval.
  • Always run teardown (backend process + broker state) and free the session slot through a guarded path, regardless of how the loop ended.
  • Add a per-worker concurrent-session ceiling via a new max_sessions config field (default 100); connections beyond the ceiling get 429.
  • Enable lua_check_client_abort in the main http block so on_abort is usable.

The concurrent-session bookkeeping is factored into a small apisix/plugins/mcp/session_limit.lua module so it can be unit-tested.

Behaviour changes

  • lua_check_client_abort is now enabled globally in the main http block. This means long-running Lua handler phases are terminated when the client disconnects. Proxied requests were already aborted by nginx in this case, so the practical effect is limited to Lua streaming handlers, which now stop work when the client goes away.
  • New optional config field max_sessions (integer, default 100). Existing configs keep working unchanged; the default applies when the field is omitted.

Which issue(s) this PR fixes:

Fixes #

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 8, 2026
@shreemaan-abhishek shreemaan-abhishek force-pushed the fix/mcp-bridge-session-cleanup branch from 223720d to 18cab31 Compare June 12, 2026 07:01

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the mcp-bridge plugin’s SSE session lifecycle by making backend teardown deterministic on client disconnect and by introducing a per-worker cap on concurrent sessions to prevent unbounded backend process spawning.

Changes:

  • Add client-abort handling (ngx.on_abort) + a server:stop() flag so sessions exit promptly on disconnect.
  • Add a per-worker session ceiling via new max_sessions config (default 100) backed by a small session_limit module.
  • Enable lua_check_client_abort on; globally in the Nginx http block template.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
t/plugin/mcp-bridge.t Adds schema validation coverage for max_sessions and unit-tests the session limit bookkeeping.
apisix/plugins/mcp/session_limit.lua New per-worker counter for tracking concurrent MCP SSE sessions.
apisix/plugins/mcp/server.lua Makes the ping loop wake early when asked to stop; adds stop() to request exit.
apisix/plugins/mcp/server_wrapper.lua Registers ngx.on_abort, enforces max_sessions, and centralizes teardown/release logic.
apisix/plugins/mcp-bridge.lua Extends plugin schema with max_sessions (min 1, default 100).
apisix/cli/ngx_tpl.lua Enables lua_check_client_abort to allow ngx.on_abort usage.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread apisix/plugins/mcp/server_wrapper.lua

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment thread apisix/plugins/mcp/session_limit.lua
Comment thread apisix/plugins/mcp/server_wrapper.lua Outdated
Comment thread t/plugin/mcp-bridge.t
…ound concurrent sessions

The SSE session loop only ended when a keepalive write failed, which can
take two 30s ping cycles or never happen, so a backend process spawned for
a disconnected client could stay alive until worker reload. There was also
no ceiling on concurrent sessions, so a route could keep spawning backend
processes for as many connections as were opened.

- register an ngx.on_abort handler so a client disconnect stops the session
  promptly instead of waiting for the next keepalive write to fail
- make the ping loop wake early once the session is asked to stop
- always run teardown (process + broker) and free the session slot via a
  guarded path, regardless of how the loop ended
- add a per-worker concurrent-session ceiling (max_sessions, default 100);
  excess SSE connections get 429
- enable lua_check_client_abort in the main http block so on_abort works

Behaviour change: lua_check_client_abort is now on globally, so long-running
Lua handler phases are terminated when the client disconnects (proxied
requests were already aborted by nginx). New config field max_sessions.
@shreemaan-abhishek shreemaan-abhishek force-pushed the fix/mcp-bridge-session-cleanup branch from ab9f040 to 8dafd1b Compare June 15, 2026 05:57

@membphis membphis left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The session teardown path is now deterministic, the per-worker session limit is bounded and released through a guarded cleanup path, and CI is passing. No merge blockers found.

@nic-6443 nic-6443 requested a review from bzp2010 June 16, 2026 01:25

@bzp2010 bzp2010 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the comments above, I also suggest that we begin deprecate this plugin, as it no longer serves a practical purpose. We have also been plagued by false-positive vulnerability reports related to it.

local pcall = pcall
local core = require("apisix.core")
local mcp_server = require("apisix.plugins.mcp.server")
local session_limit = require("apisix.plugins.mcp.session_limit")

@bzp2010 bzp2010 Jun 16, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies globally, right? That means no matter how many mcp-bridge routes are configured, the total number of sessions cannot exceed the limit?

It doesn't seem to be at the routes level.

Comment on lines +46 to +48
local ok, err = ngx_on_abort(function()
server:stop()
end)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit risky, as ngx.on_abort can only be called once per request. If a plugin calls this hook, it effectively hijacks the call to that hook for that single plugin. If other plugins or the core need to use this hook, it will fail.

I suggest adding some comments to clarify this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants