deerflow2/backend/app/gateway/routers
AochenShen99 a5599c100c
fix(gateway): honour on_disconnect on /wait endpoints (#3267)
* fix(gateway): honour on_disconnect on /wait endpoints (#3265)

The non-streaming /threads/{tid}/runs/wait and /runs/wait handlers used
to await record.task directly with no disconnect handling and silently
swallow CancelledError. When a long tool call (e.g. pip install inside
a custom skill) kept the connection idle long enough for an
intermediate HTTP layer to time out, the handler would still read the
in-progress checkpoint and return it as if the run had completed
normally -- masking a half-finished run as a successful response.

Add wait_for_run_completion in app.gateway.services that mirrors
sse_consumer's bridge-consumption pattern: subscribe to the stream
bridge until END_SENTINEL, poll request.is_disconnected on every
wake-up, and on real client disconnect cancel the background run when
record.on_disconnect is "cancel". Wire it into both wait endpoints.

The streaming path was unaffected because sse_consumer already has
this loop; this just brings /wait to parity.

* fix(gateway): skip checkpoint serialization on /wait disconnect

Copilot review on #3267 caught a follow-on of the same #3265 bug: when
the client disconnects, wait_for_run_completion breaks out of the bridge
loop and cancels the run, but the /wait endpoint then continues to read
the checkpointer and serializes whatever partial checkpoint exists as a
normal 200 response.

Have the helper return a bool — True only when END_SENTINEL was observed
— and skip the checkpoint serialization path on False. Also reorder the
inner check so END_SENTINEL is honoured even when is_disconnected() flips
true in the same iteration; the run truly finished so the real final
checkpoint is still valid.
2026-05-28 07:22:39 +08:00
..
__init__.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
agents.py feat(agent): add custom-agent self-updates with user isolation (#2713) 2026-05-05 23:17:42 +08:00
artifacts.py fix(gateway): cap skill artifact preview size (#2963) 2026-05-15 22:15:58 +08:00
assistants_compat.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
auth.py fix(auth): replace setup-status 429 rate limit with cached response (#2915) 2026-05-18 22:07:01 +08:00
channels.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
feedback.py feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134) 2026-04-26 11:09:55 +08:00
mcp.py fix(security): mask sensitive values in MCP config API responses (#2667) 2026-05-21 10:28:57 +08:00
memory.py feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153) 2026-04-26 11:13:01 +08:00
models.py refactor: thread release config through lead path (#2612) 2026-04-28 14:53:18 +08:00
runs.py fix(gateway): honour on_disconnect on /wait endpoints (#3267) 2026-05-28 07:22:39 +08:00
skills.py refactor(skills): Unified skill storage capability (#2613) 2026-05-01 13:23:26 +08:00
suggestions.py refactor: thread release config through lead path (#2612) 2026-04-28 14:53:18 +08:00
thread_runs.py fix(gateway): honour on_disconnect on /wait endpoints (#3267) 2026-05-28 07:22:39 +08:00
threads.py perf(harness): push thread metadata filters into SQL (#2865) 2026-05-12 23:21:22 +08:00
uploads.py fix(sandbox): add group/other read permissions to uploaded files for Docker sandbox (#3127) (#3134) 2026-05-25 09:26:18 +08:00