diff --git a/.planning/phases/07-phase-06-mention-upload/07-RESEARCH.md b/.planning/phases/07-phase-06-mention-upload/07-RESEARCH.md
new file mode 100644
index 00000000..a55505c7
--- /dev/null
+++ b/.planning/phases/07-phase-06-mention-upload/07-RESEARCH.md
@@ -0,0 +1,287 @@
+# Phase 07: Phase 06 Post-Acceptance Patch Archival (mention/upload Semantics and Attachment Preview Reuse) - Research
+
+**Researched:** 2026-04-15
+**Domain:** Frontend/backend mention/upload semantic convergence, attachment preview component reuse, memory cleanup, and verification archival
+**Confidence:** HIGH
+
+## User Constraints (from CONTEXT.md)
+
+No `*-CONTEXT.md` exists under the `07-phase-06-mention-upload` directory, so there are no Locked Decisions/Discretion/Deferred items to copy verbatim. [VERIFIED: codebase grep `.planning/phases/07-phase-06-mention-upload/*-CONTEXT.md`]
+
+The hard constraints derived from this objective are: formally bring the already-accepted, out-of-band Phase 06 changes into Phase 07, with a scope that must cover mention/upload semantic unification, attachment preview reuse, memory cleanup, and a verifiable commit path. [VERIFIED: user objective]
+
+## Summary
+
+The key Phase 06 code patches have already landed in the repository: the frontend sends uploads and mentions through a single `additional_kwargs.files` envelope, the backend `UploadsMiddleware` distinguishes `ref_kind=mention` and injects a separate `<mentioned_files>` block, and `new_files` no longer wrongly absorbs mentions. [VERIFIED: codebase grep `frontend/src/core/threads/hooks.ts`, `frontend/src/core/threads/submit-files.ts`, `backend/.../uploads_middleware.py`]
+
+The memory side also has a cleanup chain: `MemoryMiddleware` strips `<uploaded_files>`/`<mentioned_files>` before enqueueing, and `MemoryUpdater` removes upload-event sentences and facts before persisting; the corresponding regression tests exist and pass locally. [VERIFIED: codebase grep `backend/.../memory_middleware.py`, `backend/.../memory/updater.py`, `backend/tests/test_memory_upload_filtering.py`; VERIFIED: test run `uv run pytest -q tests/test_memory_upload_filtering.py`]
+
+The core of Phase 07 is not "building new features" but "closing the archival and verification loop": unify the terminology contract, fix the boundary of attachment preview reuse, repair E2E selector drift, and sync the UAT/Validation/Requirements document status, so that the result is an auditable commit path. [VERIFIED: codebase grep `.planning/phases/06-/06-VERIFICATION.md`, `.planning/phases/06-/06-UAT.md`, `.planning/REQUIREMENTS.md`; VERIFIED: test run `pnpm -s test:e2e --grep "DF-INPUT-007|DF-INPUT-008|DF-INPUT-009"`]
+
+**Primary recommendation:** Execute Phase 07 in four stages, `docs/contract-fix -> test-fix -> re-verify -> archive`, and do not expand the feature surface any further. [VERIFIED: repo state + phase goal]
+
+## Project Constraints (from CLAUDE.md)
+
+No `CLAUDE.md` exists at the project root, so there are no additional project-level mandatory constraints. [VERIFIED: filesystem check `test -f CLAUDE.md`]
+
+## Standard Stack
+
+### Core
+| Library | Version | Purpose | Why Standard |
+|---------|---------|---------|--------------|
+| `@radix-ui/react-dropdown-menu` | repo: `^2.1.16`; npm latest: `2.1.16` (2025-08-13) | Mention candidate panel (keyboard/focus/positioning) | Already implemented in the input box and consistent with the existing shadcn stack, which avoids a divergent custom popover. [VERIFIED: codebase grep `frontend/src/components/workspace/input-box.tsx`; VERIFIED: npm registry `npm view @radix-ui/react-dropdown-menu version time`] |
+| `sonner` | repo: `^2.0.7`; npm latest: `2.0.7` (2025-08-02) | Stale/limit notifications | Existing error notifications already use toast semantics, which keeps soft-failure behavior consistent. [VERIFIED: codebase grep `toast.error` in `hooks.ts`/`input-box.tsx`; VERIFIED: npm registry `npm view sonner version time`] |
+| `PromptInputAttachment` (internal component) | repo internal | Thumbnail preview of attachments/references in the input area | The current reference preview already reuses this component; it is the reuse baseline Phase 07 should lock in. [VERIFIED: codebase grep `frontend/src/components/workspace/input-box.tsx`, `frontend/src/components/ai-elements/prompt-input.tsx`] |
+| `UploadsMiddleware` + `MemoryMiddleware` (internal middleware) | repo internal | Upload/mention injection and pre-enqueue memory cleanup | Semantic layering is in place: `uploaded_files` and `mentioned_files` are separate, and memory filtering has two lines of defense. [VERIFIED: codebase grep `backend/.../uploads_middleware.py`, `backend/.../memory_middleware.py`, `backend/.../memory/updater.py`] |
+
+### Supporting
+| Library | Version | Purpose | When to Use |
+|---------|---------|---------|-------------|
+| `@playwright/test` | repo: `^1.48.0`; CLI: `1.48.0` | Frontend @-reference regression | Verify DF-INPUT-007/008/009 and consistency with the testid contract. [VERIFIED: `frontend/package.json`; VERIFIED: command `pnpm exec playwright --version`] |
+| `pytest` via `uv run` | backend dev: `pytest>=8.0.0` | Backend middleware/memory regression | Use `uv run pytest` when no global `pytest` is installed on the machine. [VERIFIED: `backend/pyproject.toml`; VERIFIED: env check `command -v pytest`; VERIFIED: test run] |
+
+### Alternatives Considered
+| Instead of | Could Use | Tradeoff |
+|------------|-----------|----------|
+| `DropdownMenu` | Custom absolutely positioned popover | A custom layer drifts more easily from focus management and E2E selectors. [VERIFIED: historical phase docs + current selector mismatch] |
+| `PromptInputAttachment` reuse | New mention-only preview component | Would duplicate the remove/image-thumbnail behavior and increase UI divergence. [VERIFIED: code comparison in `input-box.tsx` + `prompt-input.tsx`] |
+
+**Installation:**
+```bash
+(cd frontend && pnpm install)
+(cd backend && uv sync)
+```
+
+## Architecture Patterns
+
+### Recommended Project Structure
+```text
+frontend/src/components/workspace/input-box.tsx         # mention candidates + reference preview
+frontend/src/core/threads/submit-files.ts               # files envelope normalization
+frontend/src/core/threads/hooks.ts                      # send path + stale soft failure
+backend/packages/harness/.../uploads_middleware.py      # uploaded/mentioned semantic split
+backend/packages/harness/.../memory_middleware.py       # strip tags before enqueue
+backend/packages/harness/.../memory/updater.py          # remove upload events before persisting
+backend/tests/test_uploads_middleware_core_logic.py     # mention/upload backend regression
+backend/tests/test_memory_upload_filtering.py           # memory cleanup regression
+frontend/tests/e2e/input-and-compose.spec.ts            # DF-INPUT-007/008/009
+```
+[VERIFIED: codebase grep]
+
+### Pattern 1: Single Submit Envelope + Semantic Discriminator Fields
+**What:** Everything goes through `additional_kwargs.files`, with `ref_kind/ref_source` distinguishing mentions from uploads. [VERIFIED: `submit-files.ts`, `hooks.ts`, `uploads_middleware.py`]
+**When to use:** All message-level file context (uploads and references) should follow this pattern. [VERIFIED: current implementation]
+**Example:**
+```typescript
+// Source: frontend/src/core/threads/submit-files.ts
+referenceFiles.push({
+  filename: reference.filename,
+  size: reference.size ?? 0,
+  path: reference.path,
+  status: "uploaded",
+  ref_kind: "mention",
+  ref_source: reference.ref_source,
+});
+```
+
+### Pattern 2: Input-Area Preview Reuses `PromptInputAttachment`
+**What:** Reference previews and upload attachment previews use the same rendering component. [VERIFIED: `input-box.tsx` + `prompt-input.tsx`]
+**When to use:** The preview strip at the top of the input area (including image thumbnails and the remove action). [VERIFIED: current UI structure]
+**Example:**
+```tsx
+// Source: frontend/src/components/workspace/input-box.tsx
+<PromptInputAttachment
+  … => onRemoveReference(reference)}
+/>
+```
+
+### Pattern 3: Two-Layer Memory Cleanup
+**What:** Strip tags before enqueueing, then remove sentences/facts before persisting. [VERIFIED: `memory_middleware.py`, `updater.py`]
+**When to use:** Any middleware chain that writes session-transient file paths into context. [VERIFIED: existing middleware design]
+**Example:**
+```python
+# Source: backend/packages/harness/deerflow/agents/middlewares/memory_middleware.py
+stripped = _UPLOAD_BLOCK_RE.sub("", content_str).strip()
+```
+
+### Anti-Patterns to Avoid
+- **Adding another parallel field (e.g. `mentions`):** breaks the existing `additional_kwargs.files` consumption chain. [VERIFIED: `hooks.ts`, `message-list-item.tsx`]
+- **Letting mentions enter `new_files`:** misclassifies references as uploads from the current turn and pollutes `<uploaded_files>`. [VERIFIED: `uploads_middleware.py` tests]
+- **E2E tests depending on a nonexistent testid:** `reference-chip-remove` currently has no implementation, which produces false-red regressions. [VERIFIED: grep `reference-chip-remove` only in test files]
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| Mention candidate popover | Custom positioning/focus layer | `DropdownMenu*` component family | Avoids divergence in keyboard focus and dismissal timing. [VERIFIED: `input-box.tsx`] |
+| Reference thumbnail preview | A new chip/thumbnail implementation | `PromptInputAttachment` | Already covers both image and file rendering plus the remove interaction. [VERIFIED: `prompt-input.tsx`] |
+| Memory upload cleanup | Single-point string replacement | Two-layer filtering in `memory_middleware` + `updater` | What one layer misses, the other can still catch. [VERIFIED: code + `test_memory_upload_filtering.py`] |
+
+**Key insight:** The value of Phase 07 lies in consolidating, not expanding. Any reinvented wheel will reintroduce inconsistencies that Phase 06 already resolved. [VERIFIED: phase artifacts + current code]
+
+## Common Pitfalls
+
+### Pitfall 1: Test Selector Drift Misread as a Regression
+**What goes wrong:** The E2E assertion on `reference-chip-remove` fails even though the feature itself may still work. [VERIFIED: test run output]
+**Why it happens:** After the preview component was reused, the remove button's testid was not aligned with the old test case. [VERIFIED: grep results]
+**How to avoid:** Add a stable selector to the reused component, or update the test case to query by aria-label. [ASSUMED]
+**Warning signs:** `DF-INPUT-007` is the only failure and `reference-chip` is still visible. [VERIFIED: test run output]
+
+### Pitfall 2: mention/upload Semantic Regression
+**What goes wrong:** Mentions are counted as `uploaded_files`. [VERIFIED: historical issue + tests]
+**Why it happens:** `_files_from_kwargs` does not filter out `ref_kind=mention`. [VERIFIED: `uploads_middleware.py`]
+**How to avoid:** Keep the filter and guard it with the mixed-list tests. [VERIFIED: `test_uploads_middleware_core_logic.py`]
+**Warning signs:** `<uploaded_files>` contains entries with source=mention. [VERIFIED: middleware behavior]
+
+### Pitfall 3: Session-Transient File Paths Written into Long-Term Memory
+**What goes wrong:** Later sessions repeatedly look up old paths that no longer exist. [VERIFIED: `updater.py` docstring/comments]
+**Why it happens:** Upload tags/sentences are not stripped in the memory pipeline. [VERIFIED: `memory_middleware.py`, `updater.py`]
+**How to avoid:** Keep both cleanup layers and run `test_memory_upload_filtering.py`. [VERIFIED: test pass]
+**Warning signs:** Memory facts contain `/mnt/user-data/uploads/`. [VERIFIED: regex intent]
+
+## Code Examples
+
+### Splitting mentions from uploads (backend)
+```python
+# Source: backend/packages/harness/deerflow/agents/middlewares/uploads_middleware.py
+if f.get("ref_kind") == "mention":
+    continue
+```
+
+### Building the single files envelope (frontend)
+```typescript
+// Source: frontend/src/core/threads/hooks.ts
+const { files: filesForSubmit, staleCount } = buildFilesForSubmit(
+  uploadedFileInfo,
+  normalizedReferences,
+);
+```
+
+### Stripping memory tags (middleware)
+```python
+# Source: backend/packages/harness/deerflow/agents/middlewares/memory_middleware.py
+_UPLOAD_BLOCK_RE = re.compile(
+    r"<(?:uploaded_files|mentioned_files)>[\s\S]*?</(?:uploaded_files|mentioned_files)>\n*",
+    re.IGNORECASE,
+)
+```
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| Mentions and uploads handled in one pool | `ref_kind/ref_source` explicitly distinguish them, and they are injected as separate blocks | Late Phase 06 (2026-04-15) | Removes the "reference treated as upload" side effect. [VERIFIED: git log + middleware code] |
+| Memory relied only on prompt instructions not to record uploads | Two-layer code filtering in middleware + updater | Already in the current working tree | Reduces long-term memory pollution. [VERIFIED: `memory_middleware.py`, `updater.py`, tests] |
+
+**Deprecated/outdated:**
+- Judging Phase 06 completeness from document status alone (unsynced documents lead to wrong conclusions). [VERIFIED: status differences between `06-VERIFICATION.md` and `06-UAT.md`/`REQUIREMENTS.md`]
+
+## Assumptions Log
+
+| # | Claim | Section | Risk if Wrong |
+|---|-------|---------|---------------|
+| A1 | Adding a `data-testid` or switching to aria assertions is enough to stabilize DF-INPUT-007 | Common Pitfalls | A deeper UI structure change may be required. |
+
+## Open Questions
+
+1. **Should Phase 07 "change code" or "only archive documents and fix tests"?**
+   - What we know: The main semantic and memory code paths are already in place. [VERIFIED: code + tests]
+   - What's unclear: Whether you accept fixing only the test contract and closing the documentation loop, with no further changes to the feature implementation.
+   - Recommendation: Lock in a "minimal change" principle first, so that Phase 07 does not introduce new behavior drift. [ASSUMED]
+
+2. **Should the E2E assertions switch to accessibility semantics?**
+   - What we know: The `reference-chip-remove` testid is currently missing. [VERIFIED: grep + test output]
+   - What's unclear: Whether the team prefers a stable testid or assertions on aria text.
+   - Recommendation: For stability across refactors, prefer aria; for the smallest change, add the testid. [ASSUMED]
+
+## Environment Availability
+
+| Dependency | Required By | Available | Version | Fallback |
+|------------|------------|-----------|---------|----------|
+| Node.js | frontend tests/tooling | ✓ | v24.14.0 | - |
+| pnpm | frontend scripts | ✓ | 10.32.1 | `npm` (not recommended; lockfile mismatch) |
+| Playwright CLI | DF-INPUT E2E | ✓ | 1.48.0 | - |
+| Python | backend tests | ✓ | 3.12.3 | - |
+| uv | backend test runner | ✓ | 0.10.10 | - |
+| pytest (global) | backend tests | ✗ | - | `uv run pytest` |
+
+[VERIFIED: local command checks]
+
+**Missing dependencies with no fallback:**
+- None. [VERIFIED: local checks]
+
+**Missing dependencies with fallback:**
+- Global `pytest` is missing; use `uv run pytest`. [VERIFIED: local checks + successful runs]
+
+## Validation Architecture
+
+### Test Framework
+| Property | Value |
+|----------|-------|
+| Framework | Node test runner + Playwright + pytest (via uv) |
+| Config file | `frontend/playwright.config.ts`, `backend/pyproject.toml` |
+| Quick run command | `cd frontend && node --test src/core/threads/hooks.test.ts` |
+| Full suite command | `cd backend && uv run pytest -q tests/test_uploads_middleware_core_logic.py tests/test_memory_upload_filtering.py && cd ../frontend && pnpm -s test:e2e --grep "DF-INPUT-007\|DF-INPUT-008\|DF-INPUT-009"` |
+
+[VERIFIED: codebase files + executed commands]
+
+### Phase Requirements → Test Map
+| Req ID | Behavior | Test Type | Automated Command | File Exists? |
+|--------|----------|-----------|-------------------|-------------|
+| P7-SEM-01 | Mentions are not counted as new uploads | unit | `cd backend && uv run pytest -q tests/test_uploads_middleware_core_logic.py -k "mention or files_from_kwargs"` | ✅ |
+| P7-MEM-01 | Memory does not retain upload events | unit | `cd backend && uv run pytest -q tests/test_memory_upload_filtering.py` | ✅ |
+| P7-UI-01 | @-candidate/reference chip interaction is stable | e2e | `cd frontend && pnpm -s test:e2e --grep "DF-INPUT-007\|DF-INPUT-008\|DF-INPUT-009"` | ✅ (currently has failures) |
+| P7-DOC-01 | Acceptance-status documents are in sync | docs check | `rg -n "ATREF-01\|ATREF-02\|ATREF-03\|ATREF-04\|status:" .planning/REQUIREMENTS.md .planning/phases/06-/06-UAT.md .planning/phases/06-/06-VALIDATION.md` | ✅ |
+
+### Sampling Rate
+- **Per task commit:** Run the matching minimal command (frontend unit or targeted backend pytest). [VERIFIED: commit guide + current tests]
+- **Per wave merge:** Run both backend test files plus the three frontend E2E cases. [VERIFIED: current phase scope]
+- **Phase gate:** Enter verify-work only after all three test categories are green and the document status is synced. [VERIFIED: verification gaps]
+
+### Wave 0 Gaps
+- [ ] The selector contract between `frontend/tests/e2e/input-and-compose.spec.ts` and the component is not aligned (`reference-chip-remove`). [VERIFIED: test failure + grep]
+- [ ] The status in `.planning/phases/06-/06-UAT.md` has not been updated with the latest results. [VERIFIED: file content]
+- [ ] `ATREF-01..04` are still Pending in `.planning/REQUIREMENTS.md`. [VERIFIED: file content]
+
+## Security Domain
+
+### Applicable ASVS Categories
+| ASVS Category | Applies | Standard Control |
+|---------------|---------|-----------------|
+| V2 Authentication | no | This phase adds no auth surface. [VERIFIED: scope] |
+| V3 Session Management | no | The session mechanism is unchanged. [VERIFIED: scope] |
+| V4 Access Control | yes | Mention candidates are restricted to the current thread's data sources. [VERIFIED: `input-box.tsx` + phase docs] |
+| V5 Input Validation | yes | The backend `_files_from_kwargs` validates filename/path. [VERIFIED: `uploads_middleware.py`] |
+| V6 Cryptography | no | No changes to cryptographic code. [VERIFIED: scope] |
+
+### Known Threat Patterns for this phase stack
+| Pattern | STRIDE | Standard Mitigation |
+|---------|--------|---------------------|
+| Cross-thread file reference leakage | Information Disclosure | Candidates come only from the current thread's artifacts/uploads. [VERIFIED: `input-box.tsx`] |
+| Forged `additional_kwargs.files` injection | Tampering | The backend validates the basename and the `/mnt/user-data/` prefix. [VERIFIED: `uploads_middleware.py`] |
+| Memory leaking temporary paths | Information Disclosure | Middleware + updater filter upload tags and sentences in two layers. [VERIFIED: memory code + tests] |
+
+## Sources
+
+### Primary (HIGH confidence)
+- Code in this repository: `frontend/src/components/workspace/input-box.tsx`, `frontend/src/components/ai-elements/prompt-input.tsx`, `frontend/src/core/threads/hooks.ts`, `frontend/src/core/threads/submit-files.ts`. [VERIFIED: codebase grep]
+- Code in this repository: `backend/packages/harness/deerflow/agents/middlewares/uploads_middleware.py`, `memory_middleware.py`, `memory/updater.py`. [VERIFIED: codebase grep]
+- Local execution results: `node --test`, `uv run pytest`, `pnpm test:e2e --grep ...`. [VERIFIED: command output]
+- npm registry: versions and release dates of `@radix-ui/react-dropdown-menu` and `sonner`. [VERIFIED: npm view]
+
+### Secondary (MEDIUM confidence)
+- Cross-comparison of status in `.planning/phases/06-/06-VERIFICATION.md`, `06-UAT.md`, `06-VALIDATION.md`, and `.planning/REQUIREMENTS.md`. [VERIFIED: local docs]
+
+### Tertiary (LOW confidence)
+- None.
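+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH - based on direct checks of the current repository dependencies and the npm registry.
+- Architecture: HIGH - every key path is backed by code and test evidence.
+- Pitfalls: MEDIUM - some are currently observed failures, others are experience-based recommendations for preventing regressions.
+
+**Research date:** 2026-04-15
+**Valid until:** 2026-05-15 (30 days)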
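
Reviewer sketch (not part of the diff): the research note describes two backend behaviors, keeping `ref_kind=mention` entries out of the new-upload list and stripping the `<uploaded_files>`/`<mentioned_files>` blocks before text reaches memory. The Python below is a minimal illustration of both under stated assumptions. The names `split_files`, `strip_file_blocks`, `USER_DATA_ROOT`, and the sample `ref_source` value are hypothetical and are not taken from `uploads_middleware.py` or `memory_middleware.py`; only the `ref_kind` field, the tag names, and the `/mnt/user-data/` prefix come from the note.

```python
import re

# Assumption: the note says the backend checks the basename and the
# "/mnt/user-data/" prefix; the exact rules in the repository may differ.
USER_DATA_ROOT = "/mnt/user-data/"

# Matches one whole tag block (opening tag through its matching closing tag)
# plus any trailing newlines. Hypothetical pattern, written for this sketch.
FILE_BLOCK_RE = re.compile(
    r"<(uploaded_files|mentioned_files)>[\s\S]*?</\1>\n*",
    re.IGNORECASE,
)


def split_files(files):
    """Split an additional_kwargs.files-style list into (uploads, mentions).

    Entries that fail the basic filename/path checks are dropped.
    """
    uploads, mentions = [], []
    for entry in files or []:
        if not isinstance(entry, dict):
            continue
        filename = entry.get("filename")
        path = entry.get("path")
        if not isinstance(filename, str) or not isinstance(path, str):
            continue
        # The filename must be a bare basename: no separators, no dot entries.
        if not filename or filename in (".", "..") or "/" in filename:
            continue
        # The path must stay under the assumed user-data root.
        if not path.startswith(USER_DATA_ROOT) or ".." in path.split("/"):
            continue
        if entry.get("ref_kind") == "mention":
            mentions.append(entry)  # referenced, not newly uploaded
        else:
            uploads.append(entry)
    return uploads, mentions


def strip_file_blocks(content):
    """Remove the injected file-tag blocks from message text."""
    return FILE_BLOCK_RE.sub("", content).strip()
```

Returning two lists rather than one filtered list keeps the mention entries available for a separate injection step, which mirrors the split the note describes. A test in the style of the mixed-list cases mentioned under Pitfall 2 could pass one upload, one mention, and one malformed entry and assert that each ends up in the expected bucket.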