deerflow2/backend/tests
ZHANG Ning 5b633449f8
fix(middleware): add per-tool-type frequency detection to LoopDetectionMiddleware (#1988)
* fix(middleware): add per-tool-type frequency detection to LoopDetectionMiddleware

The existing hash-based loop detection only catches identical tool call
sets. When the agent calls the same tool type (e.g. read_file) on many
different files, each call produces a unique hash and bypasses detection.
This causes the agent to exhaust recursion_limit, consuming 150K-225K
tokens per failed run.

Add a second detection layer that tracks cumulative call counts per tool
type per thread. Warns at 30 calls (configurable) and forces stop at 50.
The hard stop message now uses the actual returned message instead of a
hardcoded constant, so both hash-based and frequency-based stops produce
accurate diagnostics.

Also fix _apply() to use the warning message returned by
_track_and_check() for hard stops, instead of always using _HARD_STOP_MSG.

Closes #1987

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(lint): remove unused imports and fix line length

- Remove unused _TOOL_FREQ_HARD_STOP_MSG and _TOOL_FREQ_WARNING_MSG
  imports from test file (F401)
- Break long _TOOL_FREQ_WARNING_MSG string to fit within 240 char limit (E501)

* style: apply ruff format

* test: add LRU eviction and per-thread reset coverage for frequency state

Address review feedback from @WillemJiang:
- Verify _tool_freq and _tool_freq_warned are cleaned on LRU eviction
- Add test for reset(thread_id=...) clearing only the target thread's
  frequency state while leaving others intact

* fix(makefile): route Windows shell-script targets through Git Bash (#2060)

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Asish Kumar <87874775+officialasishkumar@users.noreply.github.com>
2026-04-11 17:33:27 +08:00
..
conftest.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_acp_config.py feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447) 2026-03-27 20:03:30 +08:00
test_aio_sandbox.py fix: prevent concurrent subagent file write conflicts in sandbox tools (#1714) 2026-04-02 15:39:41 +08:00
test_aio_sandbox_local_backend.py fix: use safe docker bind mount syntax for sandbox mounts (#1655) 2026-04-01 11:42:12 +08:00
test_aio_sandbox_provider.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_app_config_reload.py fix(config): reload AppConfig when config path or mtime changes (#1239) 2026-03-22 20:34:01 +08:00
test_artifacts_router.py fix(gateway): enforce safe download for active artifact MIME types to mitigate stored XSS (#1389) 2026-03-26 17:44:25 +08:00
test_channel_file_attachments.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_channels.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_checkpointer.py Move async SQLite mkdir off the event loop (#1921) 2026-04-07 10:47:20 +08:00
test_checkpointer_none_fix.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_clarification_middleware.py fix(middleware): handle string-serialized options in ClarificationMiddleware (#1997) 2026-04-08 21:04:20 +08:00
test_claude_provider_oauth_billing.py fix(oauth): Harden Claude OAuth cache-control handling (#1583) 2026-03-30 07:41:18 +08:00
test_cli_auth_providers.py fix(provider): preserve streamed Codex output when response.completed.output is empty (#1928) 2026-04-07 18:21:22 +08:00
test_client.py fix(backend): stream DeerFlowClient AI text as token deltas (#1969) (#1974) 2026-04-10 18:16:38 +08:00
test_client_e2e.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_client_live.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_codex_provider.py fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025) 2026-04-09 16:07:16 +08:00
test_config_version.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_create_deerflow_agent.py fix(backend): preserve viewed image reducer metadata (#1900) 2026-04-06 16:47:19 +08:00
test_create_deerflow_agent_live.py feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203) 2026-03-29 15:31:18 +08:00
test_credential_loader.py feat: add Claude Code OAuth and Codex CLI as LLM providers (#1166) 2026-03-22 22:39:50 +08:00
test_custom_agent.py fix: include soul field in GET /api/agents list response (fixes #1819) (#1863) 2026-04-05 10:49:58 +08:00
test_dangling_tool_call_middleware.py test: add unit tests for DanglingToolCallMiddleware (#1305) 2026-03-26 00:20:08 +08:00
test_docker_sandbox_mode_detection.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_doctor.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_exa_tools.py feat(community): add Exa search as community tool provider (#1357) 2026-04-08 17:13:39 +08:00
test_feishu_parser.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_file_conversion.py fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838) 2026-04-04 14:25:08 +08:00
test_firecrawl_tools.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_gateway_services.py fix(gateway): prevent 400 error when client sends context with configurable (#1660) 2026-04-01 23:21:32 +08:00
test_guardrail_middleware.py feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240) 2026-03-23 18:07:33 +08:00
test_harness_boundary.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_infoquest_client.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_invoke_acp_agent_tool.py fix ACP mcpServers payload (#1735) 2026-04-03 15:28:56 +08:00
test_jina_client.py refactor: replace sync requests with async httpx in Jina AI client (#1603) 2026-04-01 17:02:39 +08:00
test_lead_agent_model_resolution.py ci: enforce code formatting checks for backend and frontend (#1536) 2026-03-29 15:34:38 +08:00
test_lead_agent_prompt.py fix(agent): file-io path guidance in agent prompts (#2019) 2026-04-09 16:12:34 +08:00
test_lead_agent_skills.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_llm_error_handling_middleware.py Fix/1681 llm call retry handling (#1683) 2026-04-02 10:12:17 +08:00
test_local_bash_tool_loading.py fix(sandbox): improve sandbox security and preserve multimodal content (#2114) 2026-04-11 16:52:10 +08:00
test_local_sandbox_encoding.py fix: add Windows shell fallback for local sandbox (#1505) 2026-03-29 21:31:29 +08:00
test_local_sandbox_provider_mounts.py feat(sandbox): add read-only support for local sandbox path mappings (#1808) 2026-04-03 19:46:22 +08:00
test_loop_detection_middleware.py fix(middleware): add per-tool-type frequency detection to LoopDetectionMiddleware (#1988) 2026-04-11 17:33:27 +08:00
test_mcp_client_config.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_oauth.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_sync_wrapper.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_memory_prompt_injection.py fix: inject longTermBackground into memory prompt (#1734) 2026-04-03 11:21:58 +08:00
test_memory_queue.py fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804) 2026-04-05 16:23:00 +08:00
test_memory_router.py feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668) 2026-04-01 16:45:29 +08:00
test_memory_storage.py ci: enforce code formatting checks for backend and frontend (#1536) 2026-03-29 15:34:38 +08:00
test_memory_updater.py fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804) 2026-04-05 16:23:00 +08:00
test_memory_upload_filtering.py fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804) 2026-04-05 16:23:00 +08:00
test_model_config.py feat(codex): support explicit OpenAI Responses API config (#1235) 2026-03-22 20:39:26 +08:00
test_model_factory.py feat(config): add when_thinking_disabled support for model configs (#1970) 2026-04-09 18:49:00 +08:00
test_patched_deepseek.py fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025) 2026-04-09 16:07:16 +08:00
test_patched_minimax.py fix: improve MiniMax code plan integration (#1169) 2026-03-20 17:18:59 +08:00
test_patched_openai.py fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180) (#1205) 2026-03-26 15:07:05 +08:00
test_present_file_tool_core_logic.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_provisioner_kubeconfig.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_provisioner_pvc_volumes.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_readability.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_reflection_resolvers.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_run_manager.py fix: surface configured sandbox mounts to agents (#1638) 2026-03-31 22:22:30 +08:00
test_run_worker_rollback.py feat: implement full checkpoint rollback on user cancellation (#1867) 2026-04-09 17:56:36 +08:00
test_sandbox_audit_middleware.py feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881) 2026-04-07 17:15:24 +08:00
test_sandbox_orphan_reconciliation.py fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976) 2026-04-09 17:21:23 +08:00
test_sandbox_orphan_reconciliation_e2e.py fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976) 2026-04-09 17:21:23 +08:00
test_sandbox_search_tools.py feat(sandbox): add built-in grep and glob tools (#1784) 2026-04-03 16:03:06 +08:00
test_sandbox_tools_security.py fix(sandbox): prevent memory leak in file operation locks using WeakValueDictionary (#2096) 2026-04-10 22:55:53 +08:00
test_security_scanner.py Implement skill self-evolution and skill_manage flow (#1874) 2026-04-06 22:07:11 +08:00
test_serialization.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_serialize_message_content.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_setup_wizard.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_skill_manage_tool.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_skills_archive_root.py refactor: extract shared skill installer and upload manager to harness (#1202) 2026-03-25 16:28:33 +08:00
test_skills_custom_router.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_skills_installer.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_skills_loader.py Implement skill self-evolution and skill_manage flow (#1874) 2026-04-06 22:07:11 +08:00
test_skills_parser.py fix(skills): support parsing multiline YAML strings in SKILL.md frontmatter (#1703) 2026-04-01 23:08:30 +08:00
test_skills_validation.py test: add unit tests for skill frontmatter validation (#1309) 2026-03-27 20:20:31 +08:00
test_sse_format.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_stream_bridge.py Fix(#1702): stream resume run (#1858) 2026-04-06 14:51:10 +08:00
test_subagent_executor.py Fix(subagent): Event loop conflict in SubagentExecutor.execute() (#1965) 2026-04-08 11:46:06 +08:00
test_subagent_limit_middleware.py test: add unit tests for SubagentLimitMiddleware (#1306) 2026-03-25 10:20:16 +08:00
test_subagent_prompt_security.py fix(agent): file-io path guidance in agent prompts (#2019) 2026-04-09 16:12:34 +08:00
test_subagent_timeout_config.py chroe(config):Increase subagent max-turn limits (#1852) 2026-04-05 15:41:00 +08:00
test_suggestions_router.py fix: unblock concurrent threads and workspace hydration (#1839) 2026-04-04 21:19:35 +08:00
test_task_tool_core_logic.py fix(subagents): add cooperative cancellation for subagent threads (#1873) 2026-04-07 11:12:25 +08:00
test_thread_data_middleware.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_threads_router.py fix(threads): clean up local thread data after thread deletion (#1262) 2026-03-24 00:36:08 +08:00
test_title_generation.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_title_middleware_core_logic.py fix: unblock concurrent threads and workspace hydration (#1839) 2026-04-04 21:19:35 +08:00
test_todo_middleware.py test: add unit tests for TodoMiddleware (#1307) 2026-03-26 00:20:50 +08:00
test_token_usage.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_tool_error_handling_middleware.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_tool_output_truncation.py fix: add output truncation to ls_tool to prevent context window overflow (#1896) 2026-04-06 15:09:57 +08:00
test_tool_search.py fix: promote deferred tools after tool_search returns schema (#1570) 2026-03-30 11:23:15 +08:00
test_tracing_config.py feat(tracing): add optional Langfuse support (#1717) 2026-04-02 13:06:10 +08:00
test_tracing_factory.py feat(tracing): add optional Langfuse support (#1717) 2026-04-02 13:06:10 +08:00
test_uploads_manager.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_uploads_middleware_core_logic.py fix(sandbox): improve sandbox security and preserve multimodal content (#2114) 2026-04-11 16:52:10 +08:00
test_uploads_router.py fix(sandbox): Relax upload permissions for aio sandbox sync (#1409) 2026-03-27 17:37:44 +08:00
test_vllm_provider.py feat(models): add vLLM provider support (#1860) 2026-04-06 15:18:34 +08:00
test_wechat_channel.py feat: add WeChat channel integration (#1869) 2026-04-10 20:49:28 +08:00