deerflow2/backend/docs
AochenShen99 dc2ababf00
feat(skill): add blocking-io-guard — SOP skill for blocking-IO triage and runtime anchors (#3503)
* feat(blocking-io): add changed-lines blocking-IO scanner (L1)

* feat(blocking-io): add scan-changed CLI wrapper

* feat(skill): add blocking-io-guard developer SOP skill

* docs(blocking-io): point contributors at the blocking-io-guard skill

* style(blocking-io): apply ruff format to scanner and tests

* docs(backend): document changed-lines blocking-IO scanner in CLAUDE.md

* feat(skill): add post-fix re-scan check and PR batching policy

* refactor(skill): fix SOP step ordering, align template with repo conventions

- Move re-scan into an explicit 'apply the fix' step (was wedged after
  anchor generation while telling you to go back before the anchor)
- Renumber steps 0-6; drop undefined 'L1' jargon
- Mode A: document that the diff is <base>...HEAD (commit first)
- Mode B: prefer make detect-blocking-io + findings JSON file
- anchor template: module-level pytestmark per tests/blocking_io convention
- CLAUDE.md: fix 'git diff --base' phrasing

* fix(skill): catch findings introduced without touching the blocking line

Review follow-up: changed-line intersection alone misses the case where a
new async caller exposes an old sync helper — the static finding sits on
the untouched blocking line, so Mode A returned empty and the SOP stopped
on a false 'no blocking-IO surface'.

Selection is now a union over the changed files:
- findings on added lines of git diff <base>...HEAD (kept: a second
  identical symbol in an already-flagged function collides on the stable
  key and only this selection sees it);
- findings new versus the merge base, matched by (path, function,
  symbol) — never line numbers.

Base sources are materialized via git show <merge-base>:<path>; files
absent at base count every head finding as new. SKILL.md now states the
residual same-file-only blind spot (cross-file async callers) instead of
treating an empty list as proof of zero exposure, and only requires
reading sop-skeleton.md when generalizing to another detector domain.

* docs(skill): examples teach test-writing, the teeth check defines the rule

All examples in the references/template are filesystem-flavored; make
explicit that they are instances, not the SOP's boundary — the same rules
apply to every detector category (FILE_IO, HTTP, SUBPROCESS, SLEEP) and
acceptance is always red/green teeth, never similarity to an example.
Neutralize the template's arrange comment accordingly.

* fix(blocking-io): harden changed-lines scanner per review

- Dedup the union selection by the stable key (path, function, symbol)
  instead of dict identity, so a future selector returning copied dicts
  cannot silently empty the result.
- parse_changed_lines now handles any unified diff: context lines advance
  the new-file counter, \-markers and deletions do not, and the counter
  resets at each +++ header. Previously correct only for --unified=0.
- Add blocking_io_static.scan_source (in-memory scan); base-version
  comparison no longer round-trips through temp files.
- Empty Mode A report now prints the same-file-only reachability caveat
  at the point of use instead of relying on the SOP text alone.

* docs(skill): bound best-effort cleanup when the offload sits in finally

Lesson from the #3505 review: the SOP routinely drives 'offload the
cleanup branch' transformations, and an awaited cleanup in finally can
mask or stall the primary exception. One sentence in Step 2 closes that
gap at the point where the fix is written.
2026-06-12 10:20:38 +08:00
..
API.md fix(security): harden MCP config endpoint (#3425) 2026-06-08 12:21:02 +08:00
APPLE_CONTAINER.md Fix command syntax for container image pull (#1349) 2026-03-26 00:14:08 +08:00
ARCHITECTURE.md docs: clarify LangGraph compatibility entrypoints (#2914) 2026-05-12 23:15:11 +08:00
AUTH_DESIGN.md docs: document auth design and user isolation (#2913) 2026-05-12 23:07:11 +08:00
AUTH_TEST_DOCKER_GAP.md docs: clean gateway runtime transition remnants (#3334) 2026-06-02 10:03:28 +08:00
AUTH_TEST_PLAN.md docs: clean standalone LangGraph server remnants (#3301) 2026-05-29 11:36:45 +08:00
AUTH_UPGRADE.md docs: clean gateway runtime transition remnants (#3334) 2026-06-02 10:03:28 +08:00
AUTO_TITLE_GENERATION.md docs: fix some broken links (#1864) 2026-04-05 15:35:42 +08:00
BLOCKING_IO_DETECTION.md feat(skill): add blocking-io-guard — SOP skill for blocking-IO triage and runtime anchors (#3503) 2026-06-12 10:20:38 +08:00
CONFIGURATION.md feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437) 2026-06-08 22:04:38 +08:00
FILE_UPLOAD.md fix(uploads): enforce streaming upload limits in gateway (#2589) 2026-05-01 20:19:30 +08:00
GUARDRAILS.md fix: rename present_file to present_files in docs and prompts (#2393) 2026-04-21 16:10:14 +08:00
MCP_SERVER.md docs: discourage MCP filesystem workspace config (#3141) 2026-05-22 09:19:23 +08:00
MEMORY_IMPROVEMENTS_SUMMARY.md refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
MEMORY_IMPROVEMENTS.md feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429) (#3465) 2026-06-10 23:26:15 +08:00
MEMORY_SETTINGS_REVIEW.md feat: support manual add and edit for memory facts (#1538) 2026-03-29 23:53:23 +08:00
memory-settings-sample.json feat: support manual add and edit for memory facts (#1538) 2026-03-29 23:53:23 +08:00
middleware-execution-flow.md feat(loop-detection): defer warning injection (#2752) 2026-05-21 14:36:07 +08:00
PATH_EXAMPLES.md refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
plan_mode_usage.md refactor(lead-agent): make build_middlewares public to drop the last cross-module private import (#3458) 2026-06-09 11:56:28 +08:00
README.md chore: add sandbox memory profiling tools (#3249) 2026-06-03 22:02:27 +08:00
REPLAY_E2E.md fix(replay-e2e): key fixtures by caller and conversation (#3453) 2026-06-09 21:58:31 +08:00
rfc-create-deerflow-agent.md feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203) 2026-03-29 15:31:18 +08:00
rfc-extract-shared-modules.md refactor: extract shared skill installer and upload manager to harness (#1202) 2026-03-25 16:28:33 +08:00
rfc-grep-glob-tools.md feat(sandbox): add built-in grep and glob tools (#1784) 2026-04-03 16:03:06 +08:00
SANDBOX_MEMORY_PROFILING.md chore: add sandbox memory profiling tools (#3249) 2026-06-03 22:02:27 +08:00
SETUP.md fix(harness): resolve runtime paths from project root (#2642) 2026-05-01 22:19:50 +08:00
STREAMING.md fix(backend): stream DeerFlowClient AI text as token deltas (#1969) (#1974) 2026-04-10 18:16:38 +08:00
summarization.md fix(middleware): avoid rescuing non-skill tool outputs during summarization (#2458) 2026-04-24 21:19:46 +08:00
task_tool_improvements.md refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
TITLE_GENERATION_IMPLEMENTATION.md feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134) 2026-04-26 11:09:55 +08:00
TODO.md docs: clean standalone LangGraph server remnants (#3301) 2026-05-29 11:36:45 +08:00

Documentation

This directory contains detailed documentation for the DeerFlow backend.

Document Description
ARCHITECTURE.md System architecture overview
API.md Complete API reference
AUTH_DESIGN.md User authentication, CSRF, and per-user isolation design
CONFIGURATION.md Configuration options
SETUP.md Quick setup guide

Feature Documentation

Document Description
STREAMING.md Token-level streaming design: Gateway vs DeerFlowClient paths, stream_mode semantics, per-id dedup
FILE_UPLOAD.md File upload functionality
PATH_EXAMPLES.md Path types and usage examples
SANDBOX_MEMORY_PROFILING.md Sandbox memory baseline and runtime comparison guide
summarization.md Context summarization feature
plan_mode_usage.md Plan mode with TodoList
AUTO_TITLE_GENERATION.md Automatic title generation

Development

Document Description
TODO.md Planned features and known issues

Getting Started

  1. New to DeerFlow? Start with SETUP.md for quick installation
  2. Configuring the system? See CONFIGURATION.md
  3. Understanding the architecture? Read ARCHITECTURE.md
  4. Building integrations? Check API.md for API reference

Document Organization

docs/
├── README.md                  # This file
├── ARCHITECTURE.md            # System architecture
├── API.md                     # API reference
├── AUTH_DESIGN.md             # User authentication and isolation design
├── CONFIGURATION.md           # Configuration guide
├── SETUP.md                   # Setup instructions
├── FILE_UPLOAD.md             # File upload feature
├── PATH_EXAMPLES.md           # Path usage examples
├── summarization.md           # Summarization feature
├── plan_mode_usage.md         # Plan mode feature
├── STREAMING.md               # Token-level streaming design
├── AUTO_TITLE_GENERATION.md   # Title generation
├── TITLE_GENERATION_IMPLEMENTATION.md  # Title implementation details
└── TODO.md                    # Roadmap and issues