deerflow2/backend/tests
Octopus 1f59e945af
fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (#2449)
* fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (fixes #2448)

The previous _apply_prompt_caching() attached cache_control to every text
block in the system prompt, every content block in the last N messages, and
the last tool definition. In multi-turn conversations with structured content
blocks this easily exceeded the 4-breakpoint hard limit enforced by both the
Anthropic API and AWS Bedrock, producing a 400 Bad Request (or, when
streaming, a silent "No generations found in stream" failure).

Fix: collect all candidate blocks in document order, then apply cache_control
only to the last MAX_CACHE_BREAKPOINTS (4) of them. Later breakpoints cover a
larger prefix and therefore yield better cache hit rates, making this the
optimal placement strategy as well as the safe one.
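
A minimal sketch of the capping strategy described above, not the actual _apply_prompt_caching() code. The function name cap_cache_breakpoints, its parameters, the recent_messages default, and the step that clears stale markers are assumptions made for illustration. The candidate order follows the description in this message (system text blocks, content blocks of the recent messages, last tool definition).

```python
from typing import Any

MAX_CACHE_BREAKPOINTS = 4  # hard limit cited in the commit message


def cap_cache_breakpoints(
    system: list[dict[str, Any]],
    messages: list[dict[str, Any]],
    tools: list[dict[str, Any]],
    recent_messages: int = 3,  # hypothetical default, not taken from the source
) -> int:
    """Mark at most MAX_CACHE_BREAKPOINTS blocks in place; return how many were marked."""
    candidates: list[dict[str, Any]] = []

    # 1. Text blocks of the system prompt.
    candidates += [b for b in system if isinstance(b, dict) and b.get("type") == "text"]

    # 2. Structured content blocks of the last `recent_messages` messages.
    #    Messages whose content is a plain string are skipped in this sketch.
    recent = messages[-recent_messages:] if recent_messages > 0 else []
    for message in recent:
        content = message.get("content")
        if isinstance(content, list):
            candidates += [b for b in content if isinstance(b, dict)]

    # 3. The last tool definition, if there is one.
    if tools:
        candidates.append(tools[-1])

    # Assumption: drop any stale markers on the candidates so the total
    # can never exceed the budget after this call.
    for block in candidates:
        block.pop("cache_control", None)

    # Keep only the last few candidates: they cover the longest prefix.
    chosen = candidates[-MAX_CACHE_BREAKPOINTS:]
    for block in chosen:
        block["cache_control"] = {"type": "ephemeral"}
    return len(chosen)
```

With three system text blocks, two messages of two blocks each, and two tools, there are eight candidates; only the last four (the final three message blocks and the last tool) end up marked.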

Adds 13 unit tests covering the budget cap, edge cases, and correct
last-candidate placement.

* docs: clarify _apply_prompt_caching docstring includes tool definitions

Per Copilot review: the implementation also caches the last tool definition
(see the candidates list at lines 202-205), so the docstring summary should
explicitly mention tools alongside system and recent messages.

* Fix the lint error

* style: fix ruff format check for test_claude_provider_prompt_caching.py

Add the missing blank line before the 'Edge cases' section comment so
that ruff format --check passes in CI.

---------

Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-25 19:40:06 +08:00
conftest.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_acp_config.py feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447) 2026-03-27 20:03:30 +08:00
test_aio_sandbox_local_backend.py fix: use safe docker bind mount syntax for sandbox mounts (#1655) 2026-04-01 11:42:12 +08:00
test_aio_sandbox_provider.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_aio_sandbox.py fix: prevent concurrent subagent file write conflicts in sandbox tools (#1714) 2026-04-02 15:39:41 +08:00
test_app_config_reload.py fix: disable custom-agent management API by default (#2161) 2026-04-14 00:03:38 +08:00
test_artifacts_router.py fix(gateway): enforce safe download for active artifact MIME types to mitigate stored XSS (#1389) 2026-03-26 17:44:25 +08:00
test_channel_file_attachments.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_channels.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_check_script.py fix(check): windows pnpm version detection in check script (#2189) 2026-04-14 10:29:44 +08:00
test_checkpointer_none_fix.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_checkpointer.py fix(checkpointer): create parent directory before opening SQLite in sync provider (#2272) 2026-04-16 09:06:38 +08:00
test_clarification_middleware.py fix(backend): make clarification messages idempotent (#2350) (#2351) 2026-04-19 22:00:58 +08:00
test_claude_provider_oauth_billing.py fix(oauth): Harden Claude OAuth cache-control handling (#1583) 2026-03-30 07:41:18 +08:00
test_claude_provider_prompt_caching.py fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (#2449) 2026-04-25 19:40:06 +08:00
test_cli_auth_providers.py fix(provider): preserve streamed Codex output when response.completed.output is empty (#1928) 2026-04-07 18:21:22 +08:00
test_client_e2e.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_client_live.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_client.py feat: show token usage per assistant response (#2270) 2026-04-16 08:56:49 +08:00
test_codex_provider.py fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025) 2026-04-09 16:07:16 +08:00
test_config_version.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_create_deerflow_agent_live.py feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203) 2026-03-29 15:31:18 +08:00
test_create_deerflow_agent.py fix(backend): preserve viewed image reducer metadata (#1900) 2026-04-06 16:47:19 +08:00
test_credential_loader.py feat: add Claude Code OAuth and Codex CLI as LLM providers (#1166) 2026-03-22 22:39:50 +08:00
test_custom_agent.py fix: disable custom-agent management API by default (#2161) 2026-04-14 00:03:38 +08:00
test_dangling_tool_call_middleware.py fix(middleware): repair dangling tool-call history after loop interru… (#2035) 2026-04-12 19:11:22 +08:00
test_discord_channel.py feat(channels): add Discord channel integration (#1806) 2026-04-11 17:48:04 +08:00
test_docker_sandbox_mode_detection.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_doctor.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_exa_tools.py feat(community): add Exa search as community tool provider (#1357) 2026-04-08 17:13:39 +08:00
test_feishu_parser.py Feature/feishu receive file (#1608) 2026-04-06 22:14:12 +08:00
test_file_conversion.py [security] fix(uploads): require explicit opt-in for host-side document conversion (#2332) 2026-04-18 22:47:42 +08:00
test_firecrawl_tools.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_gateway_lifespan_shutdown.py fix(gateway): bound lifespan shutdown hooks to prevent worker hang under uvicorn reload (#2331) 2026-04-23 19:41:26 +08:00
test_gateway_services.py fix: read lead agent options from context (#2515) 2026-04-24 22:46:51 +08:00
test_guardrail_middleware.py feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240) 2026-03-23 18:07:33 +08:00
test_harness_boundary.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_infoquest_client.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_invoke_acp_agent_tool.py fix ACP mcpServers payload (#1735) 2026-04-03 15:28:56 +08:00
test_jina_client.py fix(jina): log transient failures at WARNING without traceback (#2484) (#2485) 2026-04-24 16:00:14 +08:00
test_lead_agent_model_resolution.py fix: read lead agent options from context (#2515) 2026-04-24 22:46:51 +08:00
test_lead_agent_prompt.py fix(agent): file-io path guidance in agent prompts (#2019) 2026-04-09 16:12:34 +08:00
test_lead_agent_skills.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_llm_error_handling_middleware.py fix: Catch httpx.ReadError in the error handling (#2309) 2026-04-19 22:30:22 +08:00
test_local_bash_tool_loading.py fix(sandbox): improve sandbox security and preserve multimodal content (#2114) 2026-04-11 16:52:10 +08:00
test_local_sandbox_encoding.py fix: add Windows shell fallback for local sandbox (#1505) 2026-03-29 21:31:29 +08:00
test_local_sandbox_provider_mounts.py fix: use subprocess instead of os.system in local_backend.py (#2494) 2026-04-25 08:59:31 +08:00
test_loop_detection_middleware.py fix(middleware): repair dangling tool-call history after loop interru… (#2035) 2026-04-12 19:11:22 +08:00
test_mcp_client_config.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_custom_interceptors.py feat(mcp): support custom tool interceptors via extensions_config.json (#2451) 2026-04-25 09:18:13 +08:00
test_mcp_oauth.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_sync_wrapper.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_memory_prompt_injection.py fix: inject longTermBackground into memory prompt (#1734) 2026-04-03 11:21:58 +08:00
test_memory_queue.py feat: flush memory before summarization (#2176) 2026-04-14 15:01:06 +08:00
test_memory_router.py feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620) (#1668) 2026-04-01 16:45:29 +08:00
test_memory_storage.py fix: Memory update system has cache corruption, data loss, and thread-safety bugs (#2251) 2026-04-17 12:00:31 +08:00
test_memory_updater.py feat(trace):Add run_name to the trace info for system agents. (#2492) 2026-04-24 17:06:55 +08:00
test_memory_upload_filtering.py feat: flush memory before summarization (#2176) 2026-04-14 15:01:06 +08:00
test_mindie_provider.py feat(models): Provider for MindIE model engine (#2483) 2026-04-25 08:59:03 +08:00
test_model_config.py feat(codex): support explicit OpenAI Responses API config (#1235) 2026-03-22 20:39:26 +08:00
test_model_factory.py fix(token-usage): enable stream usage for openai-compatible models (#2217) 2026-04-19 22:42:55 +08:00
test_patched_deepseek.py fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025) 2026-04-09 16:07:16 +08:00
test_patched_minimax.py fix: improve MiniMax code plan integration (#1169) 2026-03-20 17:18:59 +08:00
test_patched_openai.py fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180) (#1205) 2026-03-26 15:07:05 +08:00
test_present_file_tool_core_logic.py fix(middleware): fix present_files thread id fallback (#2181) 2026-04-13 22:59:13 +08:00
test_provisioner_kubeconfig.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_provisioner_pvc_volumes.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_readability.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_reflection_resolvers.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_run_manager.py fix: surface configured sandbox mounts to agents (#1638) 2026-03-31 22:22:30 +08:00
test_run_worker_rollback.py feat: implement full checkpoint rollback on user cancellation (#1867) 2026-04-09 17:56:36 +08:00
test_sandbox_audit_middleware.py feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881) 2026-04-07 17:15:24 +08:00
test_sandbox_orphan_reconciliation_e2e.py fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976) 2026-04-09 17:21:23 +08:00
test_sandbox_orphan_reconciliation.py fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976) 2026-04-09 17:21:23 +08:00
test_sandbox_search_tools.py fix(sandbox): add missing path masking in ls_tool output (#2317) 2026-04-18 08:46:59 +08:00
test_sandbox_tools_security.py fix(sandbox): prevent memory leak in file operation locks using WeakValueDictionary (#2096) 2026-04-10 22:55:53 +08:00
test_security_scanner.py feat(trace):Add run_name to the trace info for system agents. (#2492) 2026-04-24 17:06:55 +08:00
test_serialization.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_serialize_message_content.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_setup_agent_tool.py fix(setup-agent): prevent data loss when setup fails on existing agen… (#2254) 2026-04-20 20:17:30 +08:00
test_setup_wizard.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_skill_manage_tool.py fix(skill): make skill prompt cache refresh nonblocking (#1924) 2026-04-07 10:50:34 +08:00
test_skills_archive_root.py refactor: extract shared skill installer and upload manager to harness (#1202) 2026-03-25 16:28:33 +08:00
test_skills_bundled.py fix(skills): validate bundled SKILL.md front-matter in CI (fixes #2443) (#2457) 2026-04-23 14:06:14 +08:00
test_skills_custom_router.py fix(skills): avoid blocking custom skill deletion on readonly history writes (#2197) 2026-04-14 09:00:29 +08:00
test_skills_installer.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_skills_loader.py Implement skill self-evolution and skill_manage flow (#1874) 2026-04-06 22:07:11 +08:00
test_skills_parser.py fix: resolve tool duplication and skill parser YAML inconsistencies (#1803) (#2107) 2026-04-20 20:25:03 +08:00
test_skills_validation.py test: add unit tests for skill frontmatter validation (#1309) 2026-03-27 20:20:31 +08:00
test_sse_format.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_stream_bridge.py Fix(#1702): stream resume run (#1858) 2026-04-06 14:51:10 +08:00
test_subagent_executor.py Fix(subagent): Event loop conflict in SubagentExecutor.execute() (#1965) 2026-04-08 11:46:06 +08:00
test_subagent_limit_middleware.py test: add unit tests for SubagentLimitMiddleware (#1306) 2026-03-25 10:20:16 +08:00
test_subagent_prompt_security.py feat(subagents): support per-subagent skill loading and custom subagent types (#2253) 2026-04-23 23:59:47 +08:00
test_subagent_skills_config.py feat(subagents): support per-subagent skill loading and custom subagent types (#2253) 2026-04-23 23:59:47 +08:00
test_subagent_timeout_config.py feat(subagents): allow model override per subagent in config.yaml (#2064) 2026-04-12 16:40:21 +08:00
test_suggestions_router.py feat(trace):Add run_name to the trace info for system agents. (#2492) 2026-04-24 17:06:55 +08:00
test_summarization_middleware.py fix(middleware): avoid rescuing non-skill tool outputs during summarization (#2458) 2026-04-24 21:19:46 +08:00
test_task_tool_core_logic.py fix: inherit subagent skill allowlists (#2514) 2026-04-24 21:24:42 +08:00
test_thread_data_middleware.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_threads_router.py fix(threads): clean up local thread data after thread deletion (#1262) 2026-03-24 00:36:08 +08:00
test_title_generation.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_title_middleware_core_logic.py feat(trace):Add run_name to the trace info for system agents. (#2492) 2026-04-24 17:06:55 +08:00
test_todo_middleware.py fix(todo-middleware): prevent premature agent exit with incomplete todos (#2135) 2026-04-14 11:11:26 +08:00
test_token_usage.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_tool_deduplication.py fix: resolve tool duplication and skill parser YAML inconsistencies (#1803) (#2107) 2026-04-20 20:25:03 +08:00
test_tool_error_handling_middleware.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_tool_output_truncation.py fix: add output truncation to ls_tool to prevent context window overflow (#1896) 2026-04-06 15:09:57 +08:00
test_tool_search.py fix: gate deferred MCP tool execution (#2513) 2026-04-24 22:45:41 +08:00
test_tracing_config.py feat(tracing): add optional Langfuse support (#1717) 2026-04-02 13:06:10 +08:00
test_tracing_factory.py feat(tracing): add optional Langfuse support (#1717) 2026-04-02 13:06:10 +08:00
test_uploads_manager.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_uploads_middleware_core_logic.py fix(sandbox): improve sandbox security and preserve multimodal content (#2114) 2026-04-11 16:52:10 +08:00
test_uploads_router.py [security] fix(uploads): require explicit opt-in for host-side document conversion (#2332) 2026-04-18 22:47:42 +08:00
test_view_image_middleware.py test: add unit tests for ViewImageMiddleware (#2256) 2026-04-15 23:54:30 +08:00
test_vllm_provider.py feat(models): add vLLM provider support (#1860) 2026-04-06 15:18:34 +08:00
test_wechat_channel.py feat: add WeChat channel integration (#1869) 2026-04-10 20:49:28 +08:00