deerflow2/.planning/phases/07-phase-06-mention-upload/07-RESEARCH.md


Phase 07: Archiving Post-Acceptance Patches from Phase 06 (mention/upload Semantics and Attachment Preview Reuse) - Research

Researched: 2026-04-15
Domain: Frontend/backend mention/upload semantics convergence, attachment preview component reuse, memory cleanup, and verification archiving
Confidence: HIGH

User Constraints (from CONTEXT.md)

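No *-CONTEXT.md exists under the 07-phase-06-mention-upload directory, so there are no Locked Decisions/Discretion/Deferred items to copy verbatim. [VERIFIED: codebase grep .planning/phases/07-phase-06-mention-upload/*-CONTEXT.md]

The hard constraints derived from this objective are: the workaround changes already accepted in Phase 06 must be formally brought into Phase 07, and the scope must cover unified mention/upload semantics, attachment preview reuse, memory cleanup, and a verifiable submission path. [VERIFIED: user objective]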

Summary

The key code-level patches from Phase 06 have already landed in the repository: the frontend sends uploads and mentions through a single additional_kwargs.files envelope; the backend UploadsMiddleware distinguishes ref_kind=mention and injects <mentioned_files> separately; and new_files no longer absorbs mentions by mistake. [VERIFIED: codebase grep frontend/src/core/threads/hooks.ts, frontend/src/core/threads/submit-files.ts, backend/.../uploads_middleware.py]

The memory side also has a cleanup chain in place: MemoryMiddleware strips <uploaded_files>/<mentioned_files> before enqueueing, MemoryUpdater removes upload-event sentences and facts before persisting, and the corresponding regression tests exist and pass locally. [VERIFIED: codebase grep backend/.../memory_middleware.py, backend/.../memory/updater.py, backend/tests/test_memory_upload_filtering.py; VERIFIED: test run uv run pytest -q tests/test_memory_upload_filtering.py]

The core of Phase 07 is not building new features but closing the archive-and-verify loop: unify the terminology contract, fix the boundary of attachment preview reuse, repair the E2E selector drift, and sync the UAT/Validation/Requirements document status so that the submission path is auditable. [VERIFIED: codebase grep .planning/phases/06-/06-VERIFICATION.md, .planning/phases/06-/06-UAT.md, .planning/REQUIREMENTS.md; VERIFIED: test run pnpm -s test:e2e --grep "DF-INPUT-007|DF-INPUT-008|DF-INPUT-009"]

Primary recommendation: Execute Phase 07 in four stages, docs/contract-fix -> test-fix -> re-verify -> archive, and do not expand the feature surface any further. [VERIFIED: repo state + phase goal]

Project Constraints (from CLAUDE.md)

No CLAUDE.md exists in the project root, so there are no additional project-level mandatory constraints. [VERIFIED: filesystem check test -f CLAUDE.md]

Standard Stack

Core

| Library | Version | Purpose | Why Standard |
| --- | --- | --- | --- |
| @radix-ui/react-dropdown-menu | repo: ^2.1.16; npm latest: 2.1.16 (2025-08-13) | mention candidate panel (keyboard/focus/positioning) | Already implemented in the input box and consistent with the existing shadcn system; avoids a forked custom popover. [VERIFIED: codebase grep frontend/src/components/workspace/input-box.tsx; VERIFIED: npm registry npm view @radix-ui/react-dropdown-menu version time] |
| sonner | repo: ^2.0.7; npm latest: 2.0.7 (2025-08-02) | stale/limit notices | Existing error notices already use toast semantics, which keeps soft-failure behavior consistent. [VERIFIED: codebase grep toast.error in hooks.ts/input-box.tsx; VERIFIED: npm registry npm view sonner version time] |
| PromptInputAttachment (internal component) | repo internal | Thumbnail preview of attachments/references in the input area | Reference previews already reuse this component; it is the reuse baseline that Phase 07 should lock in. [VERIFIED: codebase grep frontend/src/components/workspace/input-box.tsx, frontend/src/components/ai-elements/prompt-input.tsx] |
| UploadsMiddleware + MemoryMiddleware (internal middleware) | repo internal | upload/mention injection and memory enqueue cleanup | Semantic layering is in place: uploaded_files and mentioned_files are separated, and memory filtering provides two lines of defense. [VERIFIED: codebase grep backend/.../uploads_middleware.py, backend/.../memory_middleware.py, backend/.../memory/updater.py] |

Supporting

| Library | Version | Purpose | When to Use |
| --- | --- | --- | --- |
| @playwright/test | repo: ^1.48.0; CLI: 1.48.0 | Frontend @-reference regression | Verify that DF-INPUT-007/008/009 are consistent with the testid contract. [VERIFIED: frontend/package.json; VERIFIED: command pnpm exec playwright --version] |
| pytest via uv run | backend dev: pytest>=8.0.0 | Backend middleware/memory regression | Use uv run pytest when no global pytest is installed on the machine. [VERIFIED: backend/pyproject.toml; VERIFIED: env check command -v pytest; VERIFIED: test run] |

Alternatives Considered

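| Instead of | Could Use | Tradeoff |
| --- | --- | --- |
| DropdownMenu | Custom absolutely positioned popover | A custom layer is more prone to drifting away from focus management and E2E selectors. [VERIFIED: historical phase docs + current selector mismatch] |
| PromptInputAttachment reuse | New mention-only preview component | Would re-implement the remove and image-thumbnail behavior and increase UI behavior divergence. [VERIFIED: code comparison in input-box.tsx + prompt-input.tsx] |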

Installation:

cd frontend && pnpm install
cd backend && uv sync

Architecture Patterns

frontend/src/components/workspace/input-box.tsx        # mention candidates + reference preview
frontend/src/core/threads/submit-files.ts             # files envelope normalization
frontend/src/core/threads/hooks.ts                    # send path + stale soft failure
backend/packages/harness/.../uploads_middleware.py    # uploaded/mentioned semantic split
backend/packages/harness/.../memory_middleware.py     # strip tags before enqueueing
backend/packages/harness/.../memory/updater.py        # clean upload events before persisting
backend/tests/test_uploads_middleware_core_logic.py   # backend mention/upload regression
backend/tests/test_memory_upload_filtering.py         # memory cleanup regression
frontend/tests/e2e/input-and-compose.spec.ts          # DF-INPUT-007/008/009

[VERIFIED: codebase grep]

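Pattern 1: Single Submission Envelope + Semantic Discriminator Fields

What: Route everything through additional_kwargs.files and use ref_kind/ref_source to distinguish mentions from uploads. [VERIFIED: submit-files.ts, hooks.ts, uploads_middleware.py]
When to use: All message-level file context (uploads/references) should follow this. [VERIFIED: current implementation]
Example: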

// Source: frontend/src/core/threads/submit-files.ts
referenceFiles.push({
  filename: reference.filename,
  size: reference.size ?? 0,
  path: reference.path,
  status: "uploaded",
  ref_kind: "mention",
  ref_source: reference.ref_source,
});

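Pattern 2: Reuse PromptInputAttachment for Input-Area Previews

What: Reference previews and uploaded-attachment previews use the same rendering component. [VERIFIED: input-box.tsx + prompt-input.tsx]
When to use: The preview strip at the top of the input area (including image thumbnails and the remove action). [VERIFIED: current UI structure]
Example: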

// Source: frontend/src/components/workspace/input-box.tsx
<PromptInputAttachment
  data={{ type: "file", id: `reference:${reference.ref_source}:${reference.path ?? reference.filename}`, filename, mediaType, url }}
  onRemove={() => onRemoveReference(reference)}
/>

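Pattern 3: Two-Layer Memory Cleanup

What: Strip tags before enqueueing, then remove sentences/facts before persisting. [VERIFIED: memory_middleware.py, updater.py]
When to use: Any middleware chain that writes session-transient file paths into context. [VERIFIED: existing middleware design]
Example: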

# Source: backend/packages/harness/deerflow/agents/middlewares/memory_middleware.py
stripped = _UPLOAD_BLOCK_RE.sub("", content_str).strip()

Anti-Patterns to Avoid

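  • Adding another parallel field (e.g. mentions): breaks the existing additional_kwargs.files consumption chain. [VERIFIED: hooks.ts, message-list-item.tsx]
  • Letting mentions enter new_files: references are misjudged as uploads from the current turn, polluting <uploaded_files>. [VERIFIED: uploads_middleware.py tests]
  • E2E relying on a nonexistent testid: reference-chip-remove currently has no implementation, which causes false-red regressions. [VERIFIED: grep reference-chip-remove only in test files]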

Don't Hand-Roll

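| Problem | Don't Build | Use Instead | Why |
| --- | --- | --- | --- |
| mention candidate popover | Custom positioning/focus layer | DropdownMenu* component family | Avoids divergence in keyboard focus and dismissal timing. [VERIFIED: input-box.tsx] |
| Reference thumbnail preview | A new chip/thumbnail implementation | PromptInputAttachment | Already includes image and file rendering plus the remove interaction. [VERIFIED: prompt-input.tsx] |
| memory upload cleanup | Single-point string replacement | memory_middleware + updater two-layer filtering | If one layer misses something, the other still catches it. [VERIFIED: code + test_memory_upload_filtering.py] |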

Key insight: The value of Phase 07 lies in consolidating, not expanding. Any newly reinvented wheel would reintroduce inconsistencies that Phase 06 already resolved. [VERIFIED: phase artifacts + current code]

Common Pitfalls

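Pitfall 1: Test selector drift causes false regression verdicts

What goes wrong: The E2E assertion on reference-chip-remove fails even though the feature may still work. [VERIFIED: test run output]
Why it happens: After the preview component was reused, the remove button's testid was not aligned with the old test cases. [VERIFIED: grep results]
How to avoid: Add a stable selector to the reused component, or update the test cases to query by aria-label. [ASSUMED]
Warning signs: DF-INPUT-007 fails in isolation while reference-chip is still visible. [VERIFIED: test run output]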

Pitfall 2: mention/upload semantics regress

What goes wrong: Mentions are counted as uploaded_files. [VERIFIED: historical issue + tests]
Why it happens: _files_from_kwargs does not filter ref_kind=mention. [VERIFIED: uploads_middleware.py]
How to avoid: Keep the filter and guard it with a mixed-list test. [VERIFIED: test_uploads_middleware_core_logic.py]
Warning signs: Entries with source=mention appear in <uploaded_files>. [VERIFIED: middleware behavior]

Pitfall 3: Session-transient file paths are written to long-term memory

What goes wrong: Later sessions repeatedly try to retrieve old paths that no longer exist. [VERIFIED: updater.py docstring/comments]
Why it happens: Upload tags/sentences are not stripped in the memory pipeline. [VERIFIED: memory_middleware.py, updater.py]
How to avoid: Keep the two-layer cleanup and run test_memory_upload_filtering.py. [VERIFIED: test pass]
Warning signs: /mnt/user-data/uploads/ appears in memory facts. [VERIFIED: regex intent]
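
The warning sign above can be turned into a simple check. The following is a minimal sketch, assuming facts are dicts with a `content` string; the helper name `drop_upload_facts` and the fact shape are hypothetical and are not taken from `updater.py`.

```python
import re
from typing import Any

# Assumed marker for session-scoped upload paths, based on the warning sign above.
_UPLOAD_PATH_RE = re.compile(r"/mnt/user-data/uploads/")


def drop_upload_facts(facts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only facts whose text does not reference an upload path.

    Hypothetical helper for illustration; the real updater may filter differently.
    """
    return [
        fact
        for fact in facts
        if not _UPLOAD_PATH_RE.search(str(fact.get("content", "")))
    ]


# Made-up sample data: one durable preference, one transient upload event.
facts = [
    {"content": "User prefers concise summaries."},
    {"content": "User uploaded /mnt/user-data/uploads/report.pdf for review."},
]
kept = drop_upload_facts(facts)
```

A check like this could also run as an assertion in a regression test, so that a fact carrying a transient path fails loudly instead of being persisted.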

Code Examples

Splitting mention from upload (backend)

# Source: backend/packages/harness/deerflow/agents/middlewares/uploads_middleware.py
if f.get("ref_kind") == "mention":
    continue
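
To show the effect of this filter on a mixed list, here is a minimal sketch, assuming each envelope entry is a plain dict carrying the `ref_kind` field from Pattern 1. The helper name `partition_files` and the sample values are hypothetical; the real `_files_from_kwargs` may be structured differently.

```python
from typing import Any

FileEntry = dict[str, Any]


def partition_files(files: list[FileEntry]) -> tuple[list[FileEntry], list[FileEntry]]:
    """Split envelope entries into (uploads, mentions) by ref_kind.

    Hypothetical helper for illustration only.
    """
    uploads: list[FileEntry] = []
    mentions: list[FileEntry] = []
    for f in files:
        if f.get("ref_kind") == "mention":
            # Mentions are references to existing files, not new uploads.
            mentions.append(f)
        else:
            uploads.append(f)
    return uploads, mentions


# Made-up mixed list: one fresh upload and one mention of an existing file.
mixed = [
    {"filename": "report.pdf", "path": "/mnt/user-data/uploads/report.pdf"},
    {
        "filename": "notes.md",
        "path": "/mnt/user-data/uploads/notes.md",
        "ref_kind": "mention",
        "ref_source": "upload",  # assumed value, for illustration
    },
]
uploads, mentions = partition_files(mixed)
```

Keeping the two lists separate is what allows the upload block and the mention block to be rendered independently.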

Building the single files envelope (frontend)

// Source: frontend/src/core/threads/hooks.ts
const { files: filesForSubmit, staleCount } = buildFilesForSubmit(
  uploadedFileInfo,
  normalizedReferences,
);

Stripping memory tags (middleware)

# Source: backend/packages/harness/deerflow/agents/middlewares/memory_middleware.py
import re

# Matches an injected <uploaded_files> or <mentioned_files> block plus trailing newlines.
_UPLOAD_BLOCK_RE = re.compile(
    r"<(?:uploaded_files|mentioned_files)>[\s\S]*?</(?:uploaded_files|mentioned_files)>\n*",
    re.IGNORECASE,
)
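
As a quick illustration of what this pattern removes, here is a minimal sketch that applies it to a made-up message. The wrapper name `strip_file_blocks` and the bullet layout inside the tags are assumptions; only the tag names come from the source.

```python
import re

# Pattern in the shape quoted above; inside a raw string the escapes use single backslashes.
_UPLOAD_BLOCK_RE = re.compile(
    r"<(?:uploaded_files|mentioned_files)>[\s\S]*?</(?:uploaded_files|mentioned_files)>\n*",
    re.IGNORECASE,
)


def strip_file_blocks(content: str) -> str:
    """Remove injected file blocks and trim surrounding whitespace."""
    return _UPLOAD_BLOCK_RE.sub("", content).strip()


# Made-up message content with both block types ahead of the user's text.
sample = (
    "<uploaded_files>\n- report.pdf\n</uploaded_files>\n\n"
    "<mentioned_files>\n- notes.md\n</mentioned_files>\n"
    "Summarize the report."
)
cleaned = strip_file_blocks(sample)
```

The lazy quantifier keeps each match inside a single block, so text between two blocks is preserved.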

State of the Art

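| Old Approach | Current Approach | When Changed | Impact |
| --- | --- | --- | --- |
| mention and upload handled in one pool | Explicitly distinguished by ref_kind/ref_source and injected as separate blocks | Late Phase 06 (2026-04-15) | Eliminates the side effect of references being treated as uploads. [VERIFIED: git log + middleware code] |
| memory relied only on prompt constraints to avoid recording uploads | Two-layer code filtering in middleware + updater | Already in the current working tree | Reduces long-term memory pollution. [VERIFIED: memory_middleware.py, updater.py, tests] |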

Deprecated/outdated:

  • Judging Phase 06 completion solely from document status (unsynced documents lead to misjudgment). [VERIFIED: status differences between 06-VERIFICATION.md and 06-UAT.md/REQUIREMENTS.md]

Assumptions Log

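| # | Claim | Section | Risk if Wrong |
| --- | --- | --- | --- |
| A1 | Adding a data-testid or switching to aria assertions is enough to stabilize DF-INPUT-007 | Common Pitfalls | Deeper UI structure changes may be needed. |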

Open Questions

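  1. Should Phase 07 change code, or only archive documents and fix tests?

    • What we know: The main code paths for semantics and memory are already in place. [VERIFIED: code + tests]
    • What's unclear: Whether you accept fixing only the test contract and closing the documentation loop, without touching the feature implementation again.
    • Recommendation: Lock in a minimal-change principle first, so that Phase 07 does not introduce further behavior drift. [ASSUMED]
  2. Should the E2E assertions switch to accessibility semantics?

    • What we know: The reference-chip-remove testid is currently missing. [VERIFIED: grep + test output]
    • What's unclear: Whether the team prefers stable testids or aria text assertions.
    • Recommendation: For stability across refactors, prefer aria; for minimal change, add the testid. [ASSUMED]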

Environment Availability

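| Dependency | Required By | Available | Version | Fallback |
| --- | --- | --- | --- | --- |
| Node.js | frontend tests/tooling | yes | v24.14.0 | |
| pnpm | frontend scripts | yes | 10.32.1 | npm (not recommended; lockfile mismatch) |
| Playwright CLI | DF-INPUT E2E | yes | 1.48.0 | |
| Python | backend tests | yes | 3.12.3 | |
| uv | backend test runner | yes | 0.10.10 | |
| pytest (global) | backend tests | no | | uv run pytest |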

[VERIFIED: local command checks]

Missing dependencies with no fallback:

  • None. [VERIFIED: local checks]

Missing dependencies with fallback:

  • Global pytest is missing; use uv run pytest. [VERIFIED: local checks + successful runs]

Validation Architecture

Test Framework

| Property | Value |
| --- | --- |
| Framework | Node test runner + Playwright + pytest (via uv) |
| Config file | frontend/playwright.config.ts, backend/pyproject.toml |
| Quick run command | cd frontend && node --test src/core/threads/hooks.test.ts |
| Full suite command | `cd backend && uv run pytest -q tests/test_uploads_middleware_core_logic.py tests/test_memory_upload_filtering.py && cd ../frontend && pnpm -s test:e2e --grep "DF-INPUT-007 |

[VERIFIED: codebase files + executed commands]

Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
| --- | --- | --- | --- | --- |
| P7-SEM-01 | mention is not counted as a new upload | unit | cd backend && uv run pytest -q tests/test_uploads_middleware_core_logic.py -k "mention or files_from_kwargs" | |
| P7-MEM-01 | memory does not retain upload events | unit | cd backend && uv run pytest -q tests/test_memory_upload_filtering.py | |
| P7-UI-01 | @-candidate and reference chip interactions are stable | e2e | `cd frontend && pnpm -s test:e2e --grep "DF-INPUT-007 DF-INPUT-008 | |
| P7-DOC-01 | Acceptance status documents are closed out | docs check | `rg -n "ATREF-01 ATREF-02 | |

Sampling Rate

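  • Per task commit: Run the corresponding minimal command (frontend unit or backend targeted pytest). [VERIFIED: commit guide + current tests]
  • Per wave merge: Run both backend tests plus the three frontend E2E cases. [VERIFIED: current phase scope]
  • Phase gate: Enter verify-work only after all three test types are green and document status is synced. [VERIFIED: verification gaps]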

Wave 0 Gaps

  • frontend/tests/e2e/input-and-compose.spec.ts is not aligned with the component selector contract (reference-chip-remove). [VERIFIED: test failure + grep]
  • The status in .planning/phases/06-/06-UAT.md has not been updated to the latest results. [VERIFIED: file content]
  • In .planning/REQUIREMENTS.md, ATREF-01..04 are still Pending. [VERIFIED: file content]

Security Domain

Applicable ASVS Categories

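| ASVS Category | Applies | Standard Control |
| --- | --- | --- |
| V2 Authentication | no | This phase adds no new auth surface. [VERIFIED: scope] |
| V3 Session Management | no | The session mechanism is unchanged. [VERIFIED: scope] |
| V4 Access Control | yes | mention candidates are restricted to the current thread's data sources. [VERIFIED: input-box.tsx + phase docs] |
| V5 Input Validation | yes | The backend _files_from_kwargs validates filename/path. [VERIFIED: uploads_middleware.py] |
| V6 Cryptography | no | No changes to cryptographic implementation. [VERIFIED: scope] |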

Known Threat Patterns for this phase stack

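| Pattern | STRIDE | Standard Mitigation |
| --- | --- | --- |
| Cross-thread file reference leakage | Information Disclosure | Candidates are taken only from the current thread's artifacts/uploads. [VERIFIED: input-box.tsx] |
| Forged additional_kwargs.files injection | Tampering | The backend validates the basename and the /mnt/user-data/ prefix. [VERIFIED: uploads_middleware.py] |
| Temporary paths leaking through memory | Information Disclosure | Two-layer filtering in middleware + updater removes upload tags and sentences. [VERIFIED: memory code + tests] |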
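
The tampering mitigation (basename plus path-prefix validation) can be sketched as follows. This is an illustration under stated assumptions, not the actual logic in `uploads_middleware.py`: the function name `is_valid_file_entry`, the constant `ALLOWED_PREFIX`, and the requirement that both `filename` and `path` are present are all hypothetical.

```python
import posixpath
from typing import Any

# Assumed allowed root, taken from the mitigation described above.
ALLOWED_PREFIX = "/mnt/user-data/"


def is_valid_file_entry(entry: dict[str, Any]) -> bool:
    """Return True if filename is a bare basename and path stays under the prefix.

    Hypothetical validator for illustration only.
    """
    filename = entry.get("filename")
    path = entry.get("path")
    if not isinstance(filename, str) or not isinstance(path, str):
        return False
    # A bare basename has no directory component and is not a dot segment.
    if not filename or filename in {".", ".."}:
        return False
    if posixpath.basename(filename) != filename:
        return False
    # Normalize first so that "../" segments cannot escape the allowed prefix.
    normalized = posixpath.normpath(path)
    return normalized.startswith(ALLOWED_PREFIX)
```

Normalizing before the prefix check matters: a raw `startswith` on the unnormalized string would accept a path such as `/mnt/user-data/uploads/../../etc/passwd`.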

Sources

Primary (HIGH confidence)

  • Repository code: frontend/src/components/workspace/input-box.tsx, frontend/src/components/ai-elements/prompt-input.tsx, frontend/src/core/threads/hooks.ts, frontend/src/core/threads/submit-files.ts. [VERIFIED: codebase grep]
  • Repository code: backend/packages/harness/deerflow/agents/middlewares/uploads_middleware.py, memory_middleware.py, memory/updater.py. [VERIFIED: codebase grep]
  • Local execution results: node --test, uv run pytest, pnpm test:e2e --grep .... [VERIFIED: command output]
  • npm registry: versions and release dates of @radix-ui/react-dropdown-menu and sonner. [VERIFIED: npm view]

Secondary (MEDIUM confidence)

  • Cross-comparison of status across .planning/phases/06-/06-VERIFICATION.md, 06-UAT.md, 06-VALIDATION.md, and .planning/REQUIREMENTS.md. [VERIFIED: local docs]

Tertiary (LOW confidence)

  • None.

Metadata

Confidence breakdown:

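  • Standard stack: HIGH - Based on checks against the current repository dependencies and the npm registry.
  • Architecture: HIGH - Key paths are all backed by code and test evidence.
  • Pitfalls: MEDIUM - Some are currently observed failures; others are experience-based anti-regression recommendations.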

Research date: 2026-04-15
Valid until: 2026-05-15 (30 days)