deerflow2/backend/tests
KKK 3b3e8e1b0b
feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881)
* fix(sandbox): strengthen regex coverage in SandboxAuditMiddleware

Expand high-risk patterns from 6 to 13 and medium-risk from 4 to 6,
closing several bypass vectors identified by cross-referencing Claude
Code's BashSecurity validator chain against DeerFlow's threat model.

High-risk additions:
- Generalised pipe-to-sh (replaces narrow curl|sh rule)
- Targeted command substitution ($() / backtick with dangerous executables)
- base64 decode piped to execution
- Overwrite system binaries (/usr/bin/, /bin/, /sbin/)
- Overwrite shell startup files (~/.bashrc, ~/.profile, etc.)
- /proc/*/environ leakage
- LD_PRELOAD / LD_LIBRARY_PATH hijack
- /dev/tcp/ bash built-in networking

Medium-risk additions:
- sudo/su (no-op under Docker root, warn only)
- PATH= modification (long attack chain, warn only)

Design decisions:
- Command substitution uses targeted matching (curl/wget/bash/sh/python/
  ruby/perl/base64) rather than blanket block to avoid false positives
  on safe usage like $(date) or `whoami`.
- Skipped encoding/obfuscation checks (hex, octal, Unicode homoglyphs)
  as ROI is low in Docker sandbox — LLMs don't generate encoded commands
  and container isolation bounds the blast radius.
- Merged pip/pip3 into single pip3? pattern.
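
The targeted-matching decision above can be sketched roughly as follows. This is an illustration only, not the regex used in SandboxAuditMiddleware: the names `TARGETED_SUBSTITUTION` and `has_dangerous_substitution` are hypothetical, and the pattern is a deliberately simple heuristic built from the executable list in the design notes.

```python
import re

# Hypothetical pattern (an assumption, not DeerFlow's actual rule):
# flag "$(" or a backtick only when it is immediately followed by one of
# the executables named in the design notes.
_DANGEROUS_EXECUTABLES = r"(?:curl|wget|bash|sh|python3?|ruby|perl|base64)"
TARGETED_SUBSTITUTION = re.compile(r"(?:\$\(|`)\s*" + _DANGEROUS_EXECUTABLES + r"\b")


def has_dangerous_substitution(command: str) -> bool:
    """Return True if a command substitution starts with a listed executable."""
    return TARGETED_SUBSTITUTION.search(command) is not None


# Benign substitutions such as $(date) or `whoami` are not flagged.
print(has_dangerous_substitution("echo $(date)"))
print(has_dangerous_substitution("echo $(curl http://example.com/x)"))
```

A blanket rule on `$(` or backticks would flag both calls; the targeted form trades some coverage for fewer false positives, which is the trade-off the commit describes.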

* feat(sandbox): compound command splitting and fork bomb detection

Split compound bash commands (&&, ||, ;) into sub-commands and classify
each independently — prevents dangerous commands hidden after safe
prefixes (e.g. "cd /workspace && rm -rf /") from bypassing detection.

- Add _split_compound_command() with shlex quote-aware splitting
- Add fork bomb detection patterns (classic and while-loop variants)
- Most severe verdict wins; block short-circuits
- 15 new tests covering compound commands, splitting, and fork bombs
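
A minimal sketch of the "most severe verdict wins; block short-circuits" rule, assuming the three verdicts are named pass, warn, and block. `combine_verdicts` and `toy_classifier` are hypothetical names used for illustration; the real middleware's classifier and patterns are not reproduced here.

```python
from typing import Callable, Iterable

# Assumed severity ordering: pass < warn < block.
_SEVERITY = {"pass": 0, "warn": 1, "block": 2}


def combine_verdicts(sub_commands: Iterable[str], classify_one: Callable[[str], str]) -> str:
    """Classify each sub-command and return the most severe verdict."""
    worst = "pass"
    for sub in sub_commands:
        verdict = classify_one(sub)
        if verdict == "block":
            # Nothing can outrank a block, so stop scanning immediately.
            return "block"
        if _SEVERITY[verdict] > _SEVERITY[worst]:
            worst = verdict
    return worst


def toy_classifier(sub: str) -> str:
    """Stand-in classifier for the example only."""
    if sub.startswith("rm -rf /"):
        return "block"
    if sub.startswith("pip install"):
        return "warn"
    return "pass"


# The safe "cd" prefix no longer hides the destructive second command.
print(combine_verdicts(["cd /workspace", "rm -rf /"], toy_classifier))
```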

* test(sandbox): add async tests for fork bomb and compound commands

Cover awrap_tool_call path for fork bomb detection (3 variants) and
compound command splitting (block/warn/pass scenarios).

* fix(sandbox): address Copilot review — no-whitespace operators, >>/etc/, whole-command scan

- _split_compound_command: replace shlex-based implementation with a
  character-by-character quote/escape-aware scanner. shlex.split only
  separates '&&' / '||' / ';' when they are surrounded by whitespace,
  so payloads like 'rm -rf /&&echo ok' or 'safe;rm -rf /' bypassed the
  previous splitter and therefore the per-sub-command classifier.
- _HIGH_RISK_PATTERNS: change r'>\s*/etc/' to r'>+\s*/etc/' so append
  redirection ('>>/etc/hosts') is also blocked.
- _classify_command: run a whole-command high-risk scan *before*
  splitting. Structural attacks like 'while true; do bash & done'
  span multiple shell statements — splitting on ';' destroys the
  pattern context, so the raw command must be scanned first.
- tests: add no-whitespace operator cases to TestSplitCompoundCommand
  and test_compound_command_classification to lock in the bypass fix.
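
The scanner and the whole-command scan described above might look roughly like this. It is a sketch under stated assumptions, not the code from the PR: `split_compound_command` is a simplified stand-in for `_split_compound_command`, and `FORK_LOOP` is a hypothetical pattern for the while-loop fork bomb.

```python
import re
from typing import List


def split_compound_command(command: str) -> List[str]:
    """Split on &&, || and ; outside quotes, with or without surrounding spaces."""
    parts: List[str] = []
    current: List[str] = []
    quote = ""  # the active quote character, or "" when outside quotes
    i, n = 0, len(command)
    while i < n:
        ch = command[i]
        # A backslash escapes the next character, except inside single quotes.
        if ch == "\\" and quote != "'" and i + 1 < n:
            current.append(command[i:i + 2])
            i += 2
            continue
        if quote:
            if ch == quote:
                quote = ""
            current.append(ch)
            i += 1
            continue
        if ch in ("'", '"'):
            quote = ch
            current.append(ch)
            i += 1
            continue
        if command.startswith(("&&", "||"), i):
            parts.append("".join(current).strip())
            current = []
            i += 2
            continue
        if ch == ";":
            parts.append("".join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    return [p for p in parts if p]


# Hypothetical structural pattern; it only matches the unsplit command.
FORK_LOOP = re.compile(r"while\s+true\s*;\s*do\b.*&\s*done")

command = "while true; do bash & done"
print(split_compound_command("rm -rf /&&echo ok"))
print(bool(FORK_LOOP.search(command)))
print(any(FORK_LOOP.search(sub) for sub in split_compound_command(command)))
```

The last two calls show why the order matters: the loop pattern matches the raw command, but once the command is split on `;` no single sub-command contains the whole structure.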
2026-04-07 17:15:24 +08:00
conftest.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_acp_config.py feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447) 2026-03-27 20:03:30 +08:00
test_aio_sandbox.py fix: prevent concurrent subagent file write conflicts in sandbox tools (#1714) 2026-04-02 15:39:41 +08:00
test_aio_sandbox_local_backend.py fix: use safe docker bind mount syntax for sandbox mounts (#1655) 2026-04-01 11:42:12 +08:00
test_aio_sandbox_provider.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_app_config_reload.py fix(config): reload AppConfig when config path or mtime changes (#1239) 2026-03-22 20:34:01 +08:00
test_artifacts_router.py fix(gateway): enforce safe download for active artifact MIME types to mitigate stored XSS (#1389) 2026-03-26 17:44:25 +08:00
test_channel_file_attachments.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_channels.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_checkpointer.py Move async SQLite mkdir off the event loop (#1921) 2026-04-07 10:47:20 +08:00
test_checkpointer_none_fix.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_claude_provider_oauth_billing.py fix(oauth): Harden Claude OAuth cache-control handling (#1583) 2026-03-30 07:41:18 +08:00
test_cli_auth_providers.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_client.py fix: expose custom events from DeerFlowClient.stream() (#1827) 2026-04-06 10:09:39 +08:00
test_client_e2e.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_client_live.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_config_version.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_create_deerflow_agent.py fix(backend): preserve viewed image reducer metadata (#1900) 2026-04-06 16:47:19 +08:00
test_create_deerflow_agent_live.py feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203) 2026-03-29 15:31:18 +08:00
test_credential_loader.py feat: add Claude Code OAuth and Codex CLI as LLM providers (#1166) 2026-03-22 22:39:50 +08:00
test_custom_agent.py fix: include soul field in GET /api/agents list response (fixes #1819) (#1863) 2026-04-05 10:49:58 +08:00
test_dangling_tool_call_middleware.py test: add unit tests for DanglingToolCallMiddleware (#1305) 2026-03-26 00:20:08 +08:00
test_docker_sandbox_mode_detection.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_feishu_parser.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_file_conversion.py fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838) 2026-04-04 14:25:08 +08:00
test_gateway_services.py fix(gateway): prevent 400 error when client sends context with configurable (#1660) 2026-04-01 23:21:32 +08:00
test_guardrail_middleware.py feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240) 2026-03-23 18:07:33 +08:00
test_harness_boundary.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_infoquest_client.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_invoke_acp_agent_tool.py fix ACP mcpServers payload (#1735) 2026-04-03 15:28:56 +08:00
test_jina_client.py refactor: replace sync requests with async httpx in Jina AI client (#1603) 2026-04-01 17:02:39 +08:00
test_lead_agent_model_resolution.py ci: enforce code formatting checks for backend and frontend (#1536) 2026-03-29 15:34:38 +08:00
test_lead_agent_prompt.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_lead_agent_skills.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_llm_error_handling_middleware.py Fix/1681 llm call retry handling (#1683) 2026-04-02 10:12:17 +08:00
test_local_bash_tool_loading.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_local_sandbox_encoding.py fix: add Windows shell fallback for local sandbox (#1505) 2026-03-29 21:31:29 +08:00
test_local_sandbox_provider_mounts.py feat(sandbox): add read-only support for local sandbox path mappings (#1808) 2026-04-03 19:46:22 +08:00
test_loop_detection_middleware.py fix(middleware): handle list-type AIMessage.content in LoopDetectionMiddleware (#1823) 2026-04-04 10:38:22 +08:00
test_mcp_client_config.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_oauth.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_sync_wrapper.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_memory_prompt_injection.py fix: inject longTermBackground into memory prompt (#1734) 2026-04-03 11:21:58 +08:00
test_memory_queue.py fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804) 2026-04-05 16:23:00 +08:00
test_memory_router.py feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668) 2026-04-01 16:45:29 +08:00
test_memory_storage.py ci: enforce code formatting checks for backend and frontend (#1536) 2026-03-29 15:34:38 +08:00
test_memory_updater.py fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804) 2026-04-05 16:23:00 +08:00
test_memory_upload_filtering.py fix(memory): case-insensitive fact deduplication and positive reinforcement detection (#1804) 2026-04-05 16:23:00 +08:00
test_model_config.py feat(codex): support explicit OpenAI Responses API config (#1235) 2026-03-22 20:39:26 +08:00
test_model_factory.py feat(models): add vLLM provider support (#1860) 2026-04-06 15:18:34 +08:00
test_patched_minimax.py fix: improve MiniMax code plan integration (#1169) 2026-03-20 17:18:59 +08:00
test_patched_openai.py fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180) (#1205) 2026-03-26 15:07:05 +08:00
test_present_file_tool_core_logic.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_provisioner_kubeconfig.py feat(subagents): make subagent timeout configurable via config.yaml (#897) 2026-02-25 08:39:29 +08:00
test_readability.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_reflection_resolvers.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_run_manager.py fix: surface configured sandbox mounts to agents (#1638) 2026-03-31 22:22:30 +08:00
test_sandbox_audit_middleware.py feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881) 2026-04-07 17:15:24 +08:00
test_sandbox_search_tools.py feat(sandbox): add built-in grep and glob tools (#1784) 2026-04-03 16:03:06 +08:00
test_sandbox_tools_security.py fix: preserve virtual path separator style (#1828) 2026-04-05 15:52:22 +08:00
test_security_scanner.py Implement skill self-evolution and skill_manage flow (#1874) 2026-04-06 22:07:11 +08:00
test_serialization.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_serialize_message_content.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_skill_manage_tool.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_skills_archive_root.py refactor: extract shared skill installer and upload manager to harness (#1202) 2026-03-25 16:28:33 +08:00
test_skills_custom_router.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_skills_installer.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_skills_loader.py Implement skill self-evolution and skill_manage flow (#1874) 2026-04-06 22:07:11 +08:00
test_skills_parser.py fix(skills): support parsing multiline YAML strings in SKILL.md frontmatter (#1703) 2026-04-01 23:08:30 +08:00
test_skills_validation.py test: add unit tests for skill frontmatter validation (#1309) 2026-03-27 20:20:31 +08:00
test_sse_format.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_stream_bridge.py Fix(#1702): stream resume run (#1858) 2026-04-06 14:51:10 +08:00
test_subagent_executor.py fix(subagents): add cooperative cancellation for subagent threads (#1873) 2026-04-07 11:12:25 +08:00
test_subagent_limit_middleware.py test: add unit tests for SubagentLimitMiddleware (#1306) 2026-03-25 10:20:16 +08:00
test_subagent_prompt_security.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_subagent_timeout_config.py chore(config): Increase subagent max-turn limits (#1852) 2026-04-05 15:41:00 +08:00
test_suggestions_router.py fix: unblock concurrent threads and workspace hydration (#1839) 2026-04-04 21:19:35 +08:00
test_task_tool_core_logic.py fix(subagents): add cooperative cancellation for subagent threads (#1873) 2026-04-07 11:12:25 +08:00
test_thread_data_middleware.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_threads_router.py fix(threads): clean up local thread data after thread deletion (#1262) 2026-03-24 00:36:08 +08:00
test_title_generation.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_title_middleware_core_logic.py fix: unblock concurrent threads and workspace hydration (#1839) 2026-04-04 21:19:35 +08:00
test_todo_middleware.py test: add unit tests for TodoMiddleware (#1307) 2026-03-26 00:20:50 +08:00
test_token_usage.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_tool_error_handling_middleware.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_tool_output_truncation.py fix: add output truncation to ls_tool to prevent context window overflow (#1896) 2026-04-06 15:09:57 +08:00
test_tool_search.py fix: promote deferred tools after tool_search returns schema (#1570) 2026-03-30 11:23:15 +08:00
test_tracing_config.py feat(tracing): add optional Langfuse support (#1717) 2026-04-02 13:06:10 +08:00
test_tracing_factory.py feat(tracing): add optional Langfuse support (#1717) 2026-04-02 13:06:10 +08:00
test_uploads_manager.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_uploads_middleware_core_logic.py fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838) 2026-04-04 14:25:08 +08:00
test_uploads_router.py fix(sandbox): Relax upload permissions for aio sandbox sync (#1409) 2026-03-27 17:37:44 +08:00
test_vllm_provider.py feat(models): add vLLM provider support (#1860) 2026-04-06 15:18:34 +08:00