deerflow2

SHIYAO ZHANG 87f41d3ae8 feat(uploads): inject document outline into agent context for converted files (#1738)

* feat(uploads): inject document outline into agent context for converted files

  Extract headings from converted .md files and inject them into the <uploaded_files> context block so the agent can navigate large documents by line number before reading.

  - Add `extract_outline()` to `file_conversion.py`: recognises standard Markdown headings (#/##/###) and SEC-style bold structural headings (ITEM N. BUSINESS, PART II); caps at 50 entries; excludes cover-page boilerplate (WASHINGTON DC, CURRENT REPORT, SIGNATURES)
  - Add `_extract_outline_for_file()` helper in `uploads_middleware.py`: looks for a sibling `.md` file produced by the conversion pipeline
  - Update `UploadsMiddleware._create_files_message()` to render the outline under each file entry with `L{line}: {title}` format and a `read_file` prompt for range-based reading
  - Tests: 10 new tests for `extract_outline()`, 4 new tests for outline injection in `UploadsMiddleware`; existing test updated for new `outline` field in `uploaded_files` state

  Partially addresses #1647 (agent ignores uploaded files).

* fix(uploads): stream outline file reads and strip inline bold from heading titles

  - Switch extract_outline() from read_text().splitlines() to open()+line iteration so large converted documents are not loaded into memory on every agent turn; exits as soon as MAX_OUTLINE_ENTRIES is reached (Copilot suggestion)
  - Strip ... wrapper from standard Markdown heading titles before appending to outline so agent context stays clean (e.g. "## Overview" → "Overview") (Copilot suggestion)
  - Remove unused pathlib.Path import and fix import sort order in test_file_conversion.py to satisfy ruff CI lint

* fix(uploads): show truncation hint when outline exceeds MAX_OUTLINE_ENTRIES

  When extract_outline() hits the cap it now appends a sentinel entry {"truncated": True} instead of silently dropping the rest of the headings. UploadsMiddleware reads the sentinel and renders a hint line:

      ... (showing first 50 headings; use `read_file` to explore further)

  Without this the agent had no way to know the outline was incomplete and would treat the first 50 headings as the full document structure.

* fix(uploads): fall back to configurable.thread_id when runtime.context lacks thread_id

  runtime.context does not always carry thread_id (depends on LangGraph invocation path). ThreadDataMiddleware already falls back to get_config().configurable.thread_id; apply the same pattern so UploadsMiddleware can resolve the uploads directory and attach outlines in all invocation paths.

* style: apply ruff format

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>

2026-04-03 20:52:47 +08:00
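The outline behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from `file_conversion.py` or `uploads_middleware.py`: the regular expressions, the boilerplate matching, the `render_outline` helper, and the choice to accept an iterable of lines instead of a file path are all assumptions. Only the names `extract_outline`, `MAX_OUTLINE_ENTRIES`, the cap of 50, the `{"truncated": True}` sentinel, and the `L{line}: {title}` format come from the message itself.

```python
import re
from typing import Iterable

MAX_OUTLINE_ENTRIES = 50  # cap stated in the commit message

# Assumed patterns; the real regexes are not shown in the commit message.
_MD_HEADING = re.compile(r"^(#{1,3})\s+(.+?)\s*$")   # "#", "##", "###" headings
_BOLD_LINE = re.compile(r"^\*\*(.+?)\*\*\s*$")        # a line that is entirely bold
_SEC_HEADING = re.compile(r"^(ITEM\s+\d+[A-Z]?\.|PART\s+[IVX]+)", re.IGNORECASE)
_BOILERPLATE = ("WASHINGTON", "CURRENT REPORT", "SIGNATURES")  # assumed matching rule


def extract_outline(lines: Iterable[str]) -> list[dict]:
    """Collect headings with 1-based line numbers, stopping early at the cap."""
    outline: list[dict] = []
    for line_no, raw in enumerate(lines, start=1):
        text = raw.strip()
        title = None
        md = _MD_HEADING.match(text)
        if md:
            # Drop inline bold markers so the title stays clean.
            title = md.group(2).replace("**", "").strip()
        else:
            bold = _BOLD_LINE.match(text)
            if bold and _SEC_HEADING.match(bold.group(1)):
                title = bold.group(1).strip()
        if not title or any(b in title.upper() for b in _BOILERPLATE):
            continue
        if len(outline) >= MAX_OUTLINE_ENTRIES:
            # Signal that more headings exist instead of dropping them silently.
            outline.append({"truncated": True})
            break
        outline.append({"line": line_no, "title": title})
    return outline


def render_outline(outline: list[dict]) -> list[str]:
    """Hypothetical renderer for the `L{line}: {title}` lines and the hint."""
    rendered = []
    for entry in outline:
        if entry.get("truncated"):
            rendered.append(
                f"... (showing first {MAX_OUTLINE_ENTRIES} headings; "
                "use `read_file` to explore further)"
            )
        else:
            rendered.append(f"L{entry['line']}: {entry['title']}")
    return rendered
```

Because the function only iterates, passing an open file handle (for example `with open(path, encoding="utf-8") as fh: extract_outline(fh)`) would read the document line by line and stop as soon as the cap is exceeded, which is the streaming behaviour the second fix describes.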
..
conftest.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_acp_config.py	feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447)	2026-03-27 20:03:30 +08:00
test_aio_sandbox_local_backend.py	fix: use safe docker bind mount syntax for sandbox mounts (#1655)	2026-04-01 11:42:12 +08:00
test_aio_sandbox_provider.py	fix Windows Docker sandbox path mounting (#1634)	2026-03-31 22:19:27 +08:00
test_aio_sandbox.py	fix: prevent concurrent subagent file write conflicts in sandbox tools (#1714)	2026-04-02 15:39:41 +08:00
test_app_config_reload.py	fix(config): reload AppConfig when config path or mtime changes (#1239)	2026-03-22 20:34:01 +08:00
test_artifacts_router.py	fix(gateway): enforce safe download for active artifact MIME types to mitigate stored XSS (#1389)	2026-03-26 17:44:25 +08:00
test_channel_file_attachments.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_channels.py	Improve Python reliability in channel retries and thread typing (#1776)	2026-04-03 07:50:11 +08:00
test_checkpointer_none_fix.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_checkpointer.py	fix: normalize structured LLM content in serialization and memory updater (#1215)	2026-03-22 17:29:29 +08:00
test_claude_provider_oauth_billing.py	fix(oauth): Harden Claude OAuth cache-control handling (#1583)	2026-03-30 07:41:18 +08:00
test_cli_auth_providers.py	feat(harness): integration ACP agent tool (#1344)	2026-03-26 14:20:18 +08:00
test_client_e2e.py	[Security] Address critical host-shell escape in LocalSandboxProvider (#1547)	2026-03-29 21:03:58 +08:00
test_client_live.py	[Security] Address critical host-shell escape in LocalSandboxProvider (#1547)	2026-03-29 21:03:58 +08:00
test_client.py	feat(client): add `available_skills` parameter to DeerFlowClient (#1779)	2026-04-03 11:22:58 +08:00
test_config_version.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_create_deerflow_agent_live.py	feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203)	2026-03-29 15:31:18 +08:00
test_create_deerflow_agent.py	style: format unformatted files and add .omc/ to prettierignore (#1539)	2026-03-29 16:45:31 +08:00
test_credential_loader.py	feat: add Claude Code OAuth and Codex CLI as LLM providers (#1166)	2026-03-22 22:39:50 +08:00
test_custom_agent.py	feat/per agent skill filter (#1650)	2026-04-02 15:02:09 +08:00
test_dangling_tool_call_middleware.py	test: add unit tests for DanglingToolCallMiddleware (#1305)	2026-03-26 00:20:08 +08:00
test_docker_sandbox_mode_detection.py	fix Windows Docker sandbox path mounting (#1634)	2026-03-31 22:19:27 +08:00
test_feishu_parser.py	fix: avoid treating Feishu file paths as commands (#1654)	2026-04-01 23:23:00 +08:00
test_file_conversion.py	feat(uploads): inject document outline into agent context for converted files (#1738)	2026-04-03 20:52:47 +08:00
test_gateway_services.py	fix(gateway): prevent 400 error when client sends context with configurable (#1660)	2026-04-01 23:21:32 +08:00
test_guardrail_middleware.py	feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240)	2026-03-23 18:07:33 +08:00
test_harness_boundary.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_infoquest_client.py	feat(harness): integration ACP agent tool (#1344)	2026-03-26 14:20:18 +08:00
test_invoke_acp_agent_tool.py	fix ACP mcpServers payload (#1735)	2026-04-03 15:28:56 +08:00
test_jina_client.py	refactor: replace sync requests with async httpx in Jina AI client (#1603)	2026-04-01 17:02:39 +08:00
test_lead_agent_model_resolution.py	ci: enforce code formatting checks for backend and frontend (#1536)	2026-03-29 15:34:38 +08:00
test_lead_agent_prompt.py	fix: surface configured sandbox mounts to agents (#1638)	2026-03-31 22:22:30 +08:00
test_lead_agent_skills.py	feat/per agent skill filter (#1650)	2026-04-02 15:02:09 +08:00
test_llm_error_handling_middleware.py	Fix/1681 llm call retry handling (#1683)	2026-04-02 10:12:17 +08:00
test_local_bash_tool_loading.py	[Security] Address critical host-shell escape in LocalSandboxProvider (#1547)	2026-03-29 21:03:58 +08:00
test_local_sandbox_encoding.py	fix: add Windows shell fallback for local sandbox (#1505)	2026-03-29 21:31:29 +08:00
test_local_sandbox_provider_mounts.py	feat(sandbox): add read-only support for local sandbox path mappings (#1808)	2026-04-03 19:46:22 +08:00
test_loop_detection_middleware.py	fix(middleware): use HumanMessage in LoopDetectionMiddleware for Anthropic compat (#1300)	2026-03-25 08:00:01 +08:00
test_mcp_client_config.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_mcp_oauth.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_mcp_sync_wrapper.py	feat(harness): integration ACP agent tool (#1344)	2026-03-26 14:20:18 +08:00
test_memory_prompt_injection.py	fix: inject longTermBackground into memory prompt (#1734)	2026-04-03 11:21:58 +08:00
test_memory_queue.py	feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668)	2026-04-01 16:45:29 +08:00
test_memory_router.py	feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668)	2026-04-01 16:45:29 +08:00
test_memory_storage.py	ci: enforce code formatting checks for backend and frontend (#1536)	2026-03-29 15:34:38 +08:00
test_memory_updater.py	feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668)	2026-04-01 16:45:29 +08:00
test_memory_upload_filtering.py	feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668)	2026-04-01 16:45:29 +08:00
test_model_config.py	feat(codex): support explicit OpenAI Responses API config (#1235)	2026-03-22 20:39:26 +08:00
test_model_factory.py	feat(tracing): add optional Langfuse support (#1717)	2026-04-02 13:06:10 +08:00
test_patched_minimax.py	fix: improve MiniMax code plan integration (#1169)	2026-03-20 17:18:59 +08:00
test_patched_openai.py	fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180) (#1205)	2026-03-26 15:07:05 +08:00
test_present_file_tool_core_logic.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_provisioner_kubeconfig.py	feat(subagents): make subagent timeout configurable via config.yaml (#897)	2026-02-25 08:39:29 +08:00
test_readability.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_reflection_resolvers.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_run_manager.py	fix: surface configured sandbox mounts to agents (#1638)	2026-03-31 22:22:30 +08:00
test_sandbox_audit_middleware.py	feat(sandbox): add SandboxAuditMiddleware for bash command security auditing (#1532)	2026-03-30 07:48:31 +08:00
test_sandbox_search_tools.py	feat(sandbox): add built-in grep and glob tools (#1784)	2026-04-03 16:03:06 +08:00
test_sandbox_tools_security.py	feat(sandbox): add read-only support for local sandbox path mappings (#1808)	2026-04-03 19:46:22 +08:00
test_serialization.py	feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403)	2026-03-30 16:02:23 +08:00
test_serialize_message_content.py	feat(harness): integration ACP agent tool (#1344)	2026-03-26 14:20:18 +08:00
test_skills_archive_root.py	refactor: extract shared skill installer and upload manager to harness (#1202)	2026-03-25 16:28:33 +08:00
test_skills_installer.py	Fix Windows backend test compatibility (#1384)	2026-03-26 17:39:16 +08:00
test_skills_loader.py	feat(harness): integration ACP agent tool (#1344)	2026-03-26 14:20:18 +08:00
test_skills_parser.py	fix(skills): support parsing multiline YAML strings in SKILL.md frontmatter (#1703)	2026-04-01 23:08:30 +08:00
test_skills_validation.py	test: add unit tests for skill frontmatter validation (#1309)	2026-03-27 20:20:31 +08:00
test_sse_format.py	feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403)	2026-03-30 16:02:23 +08:00
test_stream_bridge.py	fix: guarantee END sentinel delivery when stream bridge queue is full (#1695)	2026-04-03 20:12:30 +08:00
test_subagent_executor.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_subagent_limit_middleware.py	test: add unit tests for SubagentLimitMiddleware (#1306)	2026-03-25 10:20:16 +08:00
test_subagent_prompt_security.py	[Security] Address critical host-shell escape in LocalSandboxProvider (#1547)	2026-03-29 21:03:58 +08:00
test_subagent_timeout_config.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_suggestions_router.py	fix: use SystemMessage+HumanMessage for follow-up question generation (#1751)	2026-04-03 20:09:01 +08:00
test_task_tool_core_logic.py	[Security] Address critical host-shell escape in LocalSandboxProvider (#1547)	2026-03-29 21:03:58 +08:00
test_thread_data_middleware.py	Fix Windows backend test compatibility (#1384)	2026-03-26 17:39:16 +08:00
test_threads_router.py	fix(threads): clean up local thread data after thread deletion (#1262)	2026-03-24 00:36:08 +08:00
test_title_generation.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_title_middleware_core_logic.py	fix: add sync after_model to TitleMiddleware (#1190)	2026-03-19 15:46:31 +08:00
test_todo_middleware.py	test: add unit tests for TodoMiddleware (#1307)	2026-03-26 00:20:50 +08:00
test_token_usage.py	feat(harness): integration ACP agent tool (#1344)	2026-03-26 14:20:18 +08:00
test_tool_error_handling_middleware.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131)	2026-03-14 22:55:52 +08:00
test_tool_output_truncation.py	feat(sandbox): truncate oversized bash and read_file tool outputs (#1677)	2026-04-02 09:22:41 +08:00
test_tool_search.py	fix: promote deferred tools after tool_search returns schema (#1570)	2026-03-30 11:23:15 +08:00
test_tracing_config.py	feat(tracing): add optional Langfuse support (#1717)	2026-04-02 13:06:10 +08:00
test_tracing_factory.py	feat(tracing): add optional Langfuse support (#1717)	2026-04-02 13:06:10 +08:00
test_uploads_manager.py	Fix Windows backend test compatibility (#1384)	2026-03-26 17:39:16 +08:00
test_uploads_middleware_core_logic.py	feat(uploads): inject document outline into agent context for converted files (#1738)	2026-04-03 20:52:47 +08:00
test_uploads_router.py	fix(sandbox): Relax upload permissions for aio sandbox sync (#1409)	2026-03-27 17:37:44 +08:00