deerflow2


Huixin615 64d923b0fd fix(middleware): externalize oversized tool output into sandbox for non-mounted sandboxes (#3417)		2026-06-08 12:24:48 +08:00

* fix(middleware): externalize oversized tool output into sandbox for non-mounted sandboxes

  ToolOutputBudgetMiddleware persisted oversized tool results to the host filesystem and returned a /mnt/user-data/outputs virtual path. For sandboxes that do not use thread-data mounts (e.g. remote AIO sandbox), that virtual path does not exist inside the sandbox, so the model's read_file tool could not read it back and reported 'file not found'.

  Branch on SandboxProvider.uses_thread_data_mounts:

  - Mounted sandboxes (local Docker, AIO + LocalContainerBackend) keep the original host-disk path; the host outputs dir is bind-mounted to the same virtual path inside the sandbox, so behavior is unchanged.
  - Non-mounted (remote) sandboxes externalize into the sandbox itself via execute_command('mkdir -p ...') + write_file + 'test -s' validation. The validation step is required because AIO sandbox execute_command returns 'Error: ...' as a string on failure instead of raising, so a silent mkdir failure would otherwise leak through.

  Any failure (rejected subdir, mkdir/write/validate error) falls back to the existing inline head+tail truncation, so an unreadable path is never returned to the model.

  The sandbox resolver reads the sandbox_id that SandboxMiddleware already writes into runtime.state['sandbox']; it never calls provider.acquire(), keeping the tool-call hot path free of blocking I/O. Tools that do not use a sandbox (web_search, MCP, ...) resolve to None and fall through to inline truncation, which is the safe behavior for them.

  Fixes #3416

* fix(middleware): address Copilot review feedback on sandbox externalization

  - Make get_sandbox_provider() lookup best-effort in _budget_content: only query when outputs_path or sandbox is available, and fall back to inline truncation if provider initialization raises rather than propagating the error. A resolved sandbox instance is sufficient on its own to take the non-mounted externalization branch.
  - Strict-match the sandbox post-write validation echo (check.strip() == 'OK') to avoid false positives if execute_command ever surfaces unrelated stdout/stderr containing 'OK' as a substring.

  Refs: #3417

* test: fix flaky tests relying on /nonexistent/... path under container root

  Two tests in this module (test_returns_none_on_invalid_path and test_fallback_when_disk_write_fails) used paths like '/nonexistent/impossible/path' to trigger _externalize's OSError fallback. These paths are creatable when the test process runs as root inside the CI container: os.makedirs(..., exist_ok=True) successfully creates the entire chain under /, so the OSError branch is never hit and the tests fail. Reproducible on main independently of this PR.

  Switch to '/dev/null/cannot-mkdir-here'. /dev/null is a character device on both Linux and macOS, so os.makedirs always fails with NotADirectoryError regardless of privileges, reliably exercising the OSError fallback.

* fix(tool-output-budget): only consult sandbox provider when a sandbox is resolved

  The previous revision called get_sandbox_provider() whenever externalization was triggered, including on the legacy host-disk path. Environments without a configured sandbox -- in particular CI runners without a config.yaml -- would raise FileNotFoundError there, get caught, and silently fall back to inline truncation. That defeated the host-disk externalization path that predates this PR and was the root cause of the regressing legacy tests.

  Restructure the branching so the provider is only consulted when a sandbox has actually been resolved for the current tool call:

  - sandbox resolved + provider.uses_thread_data_mounts: host-disk write (bind-mounted into the sandbox, equivalent to a sandbox-side write).
  - sandbox resolved + non-mounted provider: sandbox write (#3416).
  - no sandbox + outputs_path: host-disk write (legacy / non-sandbox tools, no provider call at all).
  - otherwise: inline fallback.

  No test changes; the legacy externalization tests are provider-agnostic by construction and now pass without monkeypatching.

  Refs: #3416

* test(tool-output-budget): assert legacy path does not call sandbox provider

  Lock in the contract introduced by d6e2d25b: when no sandbox is resolved for a tool call, _budget_content must externalize to the host outputs directory without consulting get_sandbox_provider(). Regressing this would re-break legacy / non-sandbox tools in environments without a configured sandbox (e.g. CI without config.yaml), which is the failure mode #3416's fix avoids. The test injects a get_sandbox_provider that raises on call, so any future refactor that moves the provider lookup out of the sandbox-only branch will fail loudly.

  Refs: #3416
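
The four-way branching described in the commit message above can be sketched as a small pure function. This is only an illustration of the stated rules, not the code in ToolOutputBudgetMiddleware: the names `Target`, `choose_target` and `sandbox_write_validated` are hypothetical, and two details are assumptions (a failed provider lookup is modelled as `None`, and a mounted sandbox without an outputs path is treated as an inline fallback).

```python
from enum import Enum
from typing import Optional


class Target(Enum):
    """Where an oversized tool result ends up (hypothetical names)."""
    HOST_DISK = "host_disk"   # write to the host outputs dir
    SANDBOX = "sandbox"       # write inside the (remote) sandbox
    INLINE = "inline"         # head+tail truncation, no file written


def choose_target(
    sandbox_resolved: bool,
    uses_thread_data_mounts: Optional[bool],
    outputs_path: Optional[str],
) -> Target:
    """Pick the externalization target for one tool call.

    uses_thread_data_mounts is None when the provider could not be
    consulted (assumption: this models a raising provider lookup).
    """
    if sandbox_resolved:
        if uses_thread_data_mounts is None:
            # Provider lookup failed: never propagate, fall back inline.
            return Target.INLINE
        if uses_thread_data_mounts:
            # Host dir is bind-mounted into the sandbox, so a host write
            # is readable from inside. Assumption: needs an outputs path.
            return Target.HOST_DISK if outputs_path else Target.INLINE
        return Target.SANDBOX
    if outputs_path:
        # Legacy / non-sandbox tools: no provider call at all.
        return Target.HOST_DISK
    return Target.INLINE


def sandbox_write_validated(check_output: str) -> bool:
    """Strict match on the post-write echo, as the commit describes."""
    return check_output.strip() == "OK"
```

Keeping the provider flag out of the no-sandbox branch mirrors the contract the last commit tests for: when no sandbox is resolved, the decision depends on `outputs_path` alone.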
..
blocking_io	fix(middleware): offload memory injection off event loop to prevent tiktoken blocking (#3402 ) (#3411 )	2026-06-08 12:21:55 +08:00
support	Add static blocking IO inventory (#3208 )	2026-05-26 23:30:24 +08:00
_agent_e2e_helpers.py	fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867 )	2026-05-12 23:18:54 +08:00
_router_auth_helpers.py	fix the lint error in backend	2026-04-26 15:09:25 +08:00
_run_message_pagination_helpers.py	fix: load paginated run history messages (#3305 )	2026-06-01 15:50:39 +08:00
conftest.py	Add static blocking IO inventory (#3208 )	2026-05-26 23:30:24 +08:00
test_acp_config.py	feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447 )	2026-03-27 20:03:30 +08:00
test_aio_sandbox_local_backend.py	[security] fix(sandbox): bind local Docker ports to loopback (#2633 )	2026-04-30 11:40:28 +08:00
test_aio_sandbox_provider.py	fix(sandbox): close AioSandbox HTTP client during provider teardown (#2872 ) (#3245 )	2026-06-02 22:55:59 +08:00
test_aio_sandbox_readiness.py	fix(sandbox): avoid blocking sandbox readiness polling (#2822 )	2026-05-21 14:44:34 +08:00
test_aio_sandbox.py	fix(sandbox): close AioSandbox HTTP client during provider teardown (#2872 ) (#3245 )	2026-06-02 22:55:59 +08:00
test_app_config_reload.py	fix(config): reset config-backed singletons on hot reload (#2588 )	2026-05-06 10:17:55 +08:00
test_artifacts_router.py	fix(gateway): cap skill artifact preview size (#2963 )	2026-05-15 22:15:58 +08:00
test_assistant_payload_replay.py	refactor(provider): share assistant payload replay matching (#3307 )	2026-05-29 23:05:59 +08:00
test_auth_config.py	fix(auth): persist auto-generated JWT secret to survive restarts (#2933 )	2026-05-16 09:24:40 +08:00
test_auth_errors.py	feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008 )	2026-04-26 11:08:11 +08:00
test_auth_middleware.py	feat: implement process-local internal authentication for Gateway and enhance CSRF handling	2026-04-26 22:20:57 +08:00
test_auth_type_system.py	feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008 )	2026-04-26 11:08:11 +08:00
test_auth.py	fix(security): harden auth system and fix run journal logic bug (#2593 )	2026-04-28 11:34:07 +08:00
test_cancel_run_idempotent.py	fix(runtime): make RunManager.cancel() idempotent for already-interrupted runs (#3055 ) (#3058 )	2026-05-20 16:37:36 +08:00
test_channel_file_attachments.py	[security] fix(upload): reject symlinked upload destinations (#2623 )	2026-05-02 15:19:28 +08:00
test_channels.py	fix(mcp): add auth interceptor with channel user_id and keep header propagation to mcp tools (#3294 )	2026-06-03 15:48:19 +08:00
test_check_script.py	fix(check): windows pnpm version detection in check script (#2189 )	2026-04-14 10:29:44 +08:00
test_checkpointer_none_fix.py	feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134 )	2026-04-26 11:09:55 +08:00
test_checkpointer.py	fix(runtime): protect sync singleton init and reset (#3413 )	2026-06-08 08:38:36 +08:00
test_clarification_middleware.py	fix(backend): make clarification messages idempotent (#2350 ) (#2351 )	2026-04-19 22:00:58 +08:00
test_claude_provider_oauth_billing.py	fix(oauth): Harden Claude OAuth cache-control handling (#1583 )	2026-03-30 07:41:18 +08:00
test_claude_provider_prompt_caching.py	fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (#2449 )	2026-04-25 19:40:06 +08:00
test_cli_auth_providers.py	fix(provider): preserve streamed Codex output when response.completed.output is empty (#1928 )	2026-04-07 18:21:22 +08:00
test_client_e2e.py	fix(harness): resolve runtime paths from project root (#2642 )	2026-05-01 22:19:50 +08:00
test_client_langfuse_metadata.py	fix(tracing): propagate session_id and user_id into Langfuse traces (#2944 )	2026-05-21 16:49:31 +08:00
test_client_live.py	[Security] Address critical host-shell escape in LocalSandboxProvider (#1547 )	2026-03-29 21:03:58 +08:00
test_client_message_serialization.py	feat: refine token usage display modes (#2329 )	2026-05-04 09:56:16 +08:00
test_client.py	fix upload file size contract (#3408 )	2026-06-06 15:12:17 +08:00
test_codex_provider.py	chroe(2585): keep polishing the code of codex token usage (#2689 )	2026-05-02 15:04:11 +08:00
test_config_version.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131 )	2026-03-14 22:55:52 +08:00
test_converters.py	feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930 )	2026-04-26 11:05:47 +08:00
test_create_deerflow_agent_live.py	feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203 )	2026-03-29 15:31:18 +08:00
test_create_deerflow_agent.py	feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711 )	2026-05-07 16:15:15 +08:00
test_credential_loader.py	feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711 )	2026-05-07 16:15:15 +08:00
test_csrf_middleware.py	feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801 )	2026-05-09 09:27:02 +08:00
test_custom_agent.py	feat(agent): add custom-agent self-updates with user isolation (#2713 )	2026-05-05 23:17:42 +08:00
test_dangling_tool_call_middleware.py	fix(runtime): guide malformed write_file recovery (#3040 )	2026-05-29 17:46:24 +08:00
test_ddg_search_tools.py	fix(search): fix DDGS Wikipedia region handling (#3423 )	2026-06-08 07:59:50 +08:00
test_deferred_catalog.py	refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370 )	2026-06-05 15:21:41 +08:00
test_deferred_filter_middleware.py	fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342 )	2026-06-02 22:43:22 +08:00
test_deferred_promotion_integration.py	refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370 )	2026-06-05 15:21:41 +08:00
test_deferred_setup.py	refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370 )	2026-06-05 15:21:41 +08:00
test_deferred_tool_crosscontext.py	refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370 )	2026-06-05 15:21:41 +08:00
test_deferred_tool_promotion_real_llm.py	fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342 )	2026-06-02 22:43:22 +08:00
test_detect_blocking_io_static.py	Add static blocking IO inventory (#3208 )	2026-05-26 23:30:24 +08:00
test_detect_thread_boundaries.py	chore(dev): add async/thread boundary detector (#2936 )	2026-05-20 10:00:17 +08:00
test_detect_uv_extras.py	fix(scripts): preserve uv extras across `make dev` restarts (#2754 ) (#2767 )	2026-05-10 22:28:29 +08:00
test_dev_entrypoint.py	fix(scripts): preserve uv extras across `make dev` restarts (#2754 ) (#2767 )	2026-05-10 22:28:29 +08:00
test_dingtalk_channel.py	feat(channels): add DingTalk channel integration (#2628 )	2026-04-30 11:25:33 +08:00
test_discord_channel.py	feat(channels): add Discord channel integration (#1806 )	2026-04-11 17:48:04 +08:00
test_docker_sandbox_mode_detection.py	fix Windows Docker sandbox path mounting (#1634 )	2026-03-31 22:19:27 +08:00
test_doctor.py	feat(dx): Setup Wizard + doctor command — closes #2030 (#2034 )	2026-04-10 17:43:39 +08:00
test_dynamic_context_middleware.py	fix(harness): preserve dynamic context across summarization (#2823 )	2026-05-09 19:39:36 +08:00
test_ensure_admin.py	refactor: Remove init_token handling from admin initialization logic and related tests	2026-04-26 11:09:56 +08:00
test_exa_tools.py	feat(community): add Exa search as community tool provider (#1357 )	2026-04-08 17:13:39 +08:00
test_feedback.py	feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134 )	2026-04-26 11:09:55 +08:00
test_feishu_parser.py	fix(channels): preserve Feishu clarification thread continuity (#3285 )	2026-05-31 22:43:07 +08:00
test_file_conversion.py	[security] fix(uploads): require explicit opt-in for host-side document conversion (#2332 )	2026-04-18 22:47:42 +08:00
test_firecrawl_tools.py	feat(dx): Setup Wizard + doctor command — closes #2030 (#2034 )	2026-04-10 17:43:39 +08:00
test_gateway_config_freshness.py	fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107 ) (#3131 )	2026-05-21 21:18:10 +08:00
test_gateway_docs_toggle.py	fix(nginx): defer CORS to gateway allowlist (#2861 )	2026-05-11 17:38:37 +08:00
test_gateway_lifespan_shutdown.py	fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107 ) (#3131 )	2026-05-21 21:18:10 +08:00
test_gateway_run_drain_shutdown.py	fix(gateway): drain in-flight runs before closing checkpointer on shutdown (#3381 )	2026-06-07 11:24:30 +08:00
test_gateway_run_recovery.py	fix(gateway): drain in-flight runs before closing checkpointer on shutdown (#3381 )	2026-06-07 11:24:30 +08:00
test_gateway_runtime_cleanup.py	chore: remove stale LangGraph server runtime remnants (#3344 )	2026-06-03 22:04:05 +08:00
test_gateway_services.py	fix(mcp): add auth interceptor with channel user_id and keep header propagation to mcp tools (#3294 )	2026-06-03 15:48:19 +08:00
test_guardrail_middleware.py	feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240 )	2026-03-23 18:07:33 +08:00
test_harness_boundary.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131 )	2026-03-14 22:55:52 +08:00
test_infoquest_client.py	feat(harness): integration ACP agent tool (#1344 )	2026-03-26 14:20:18 +08:00
test_initialize_admin.py	fix(auth): replace setup-status 429 rate limit with cached response (#2915 )	2026-05-18 22:07:01 +08:00
test_internal_auth.py	fix(auth): share internal gateway token across workers (#3184 )	2026-05-26 23:19:57 +08:00
test_invoke_acp_agent_tool.py	fix(harness): wrap all async-only tools for sync clients (#2935 )	2026-05-19 22:11:46 +08:00
test_jina_client.py	fix(jina): log transient failures at WARNING without traceback (#2484 ) (#2485 )	2026-04-24 16:00:14 +08:00
test_jsonl_event_store_async_io.py	fix(runtime): harden JSONL async I/O and DB put_batch thread validation (#3084 )	2026-05-29 09:27:53 +08:00
test_langgraph_auth.py	fix(security): harden auth system and fix run journal logic bug (#2593 )	2026-04-28 11:34:07 +08:00
test_lead_agent_model_resolution.py	fix(chat): preserve messages after summarization (#3280 )	2026-05-29 08:24:47 +08:00
test_lead_agent_prompt.py	fix(#3189 ): prevent write_file streaming timeout on long reports (#3195 )	2026-06-07 17:47:11 +08:00
test_lead_agent_skills.py	fix(skills): enforce allowed-tools metadata (#2626 )	2026-05-07 08:34:43 +08:00
test_llm_error_handling_middleware.py	fix(#3189 ): prevent write_file streaming timeout on long reports (#3195 )	2026-06-07 17:47:11 +08:00
test_local_bash_tool_loading.py	fix(sandbox): improve sandbox security and preserve multimodal content (#2114 )	2026-04-11 16:52:10 +08:00
test_local_sandbox_encoding.py	fix(sandbox): disable msys path conversion (#2766 )	2026-05-08 10:13:11 +08:00
test_local_sandbox_provider_mounts.py	feat(sandbox) Adds download file interface in Sandbox (#3038 )	2026-05-20 10:16:31 +08:00
test_local_sandbox_virtual_path_contract.py	fix(sandbox): uphold /mnt/user-data contract at Sandbox API boundary (#2873 ) (#2881 )	2026-05-17 08:26:04 +08:00
test_local_skill_storage_write.py	Fix custom skill install permissions (#3241 )	2026-05-28 15:48:32 +08:00
test_logging_level_from_config.py	fix(config): unify log_level from config.yaml across Gateway and debug entry points (#2601 )	2026-04-30 22:27:14 +08:00
test_loop_detection_config.py	feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711 )	2026-05-07 16:15:15 +08:00
test_loop_detection_middleware.py	feat(loop-detection): defer warning injection (#2752 )	2026-05-21 14:36:07 +08:00
test_mcp_client_config.py	fix(mcp): accept transport field as alias for type (#3238 ) (#3243 )	2026-06-03 18:11:38 +08:00
test_mcp_config_secrets.py	fix(security): harden MCP config endpoint (#3425 )	2026-06-08 12:21:02 +08:00
test_mcp_custom_interceptors.py	feat(mcp): support custom tool interceptors via extensions_config.json (#2451 )	2026-04-25 09:18:13 +08:00
test_mcp_oauth.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131 )	2026-03-14 22:55:52 +08:00
test_mcp_session_pool.py	fix(mcp): close stdio sessions on their owning loop to avoid cross-task cancel-scope error (#3379 ) (#3392 )	2026-06-07 21:37:30 +08:00
test_mcp_sync_wrapper.py	fix(harness): wrap all async-only tools for sync clients (#2935 )	2026-05-19 22:11:46 +08:00
test_memory_prompt_injection.py	fix: inject longTermBackground into memory prompt (#1734 )	2026-04-03 11:21:58 +08:00
test_memory_queue_user_isolation.py	fix(memory): isolate queued memory updates by agent (#2941 )	2026-05-15 10:26:35 +08:00
test_memory_queue.py	fix(memory): isolate queued memory updates by agent (#2941 )	2026-05-15 10:26:35 +08:00
test_memory_router.py	feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153 )	2026-04-26 11:13:01 +08:00
test_memory_storage_user_isolation.py	fix the lint error in backend	2026-04-26 15:09:25 +08:00
test_memory_storage.py	fix: Memory update system has cache corruption, data loss, and thread-safety bugs (#2251 )	2026-04-17 12:00:31 +08:00
test_memory_thread_meta_isolation.py	feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134 )	2026-04-26 11:09:55 +08:00
test_memory_updater_user_isolation.py	fix the lint error in backend	2026-04-26 15:09:25 +08:00
test_memory_updater.py	fix(memory): parse wrapped memory update json responses (#3252 )	2026-05-28 07:46:44 +08:00
test_memory_upload_filtering.py	feat: flush memory before summarization (#2176 )	2026-04-14 15:01:06 +08:00
test_migration_user_isolation.py	feat(agent): add custom-agent self-updates with user isolation (#2713 )	2026-05-05 23:17:42 +08:00
test_mindie_provider.py	feat(channels): enhance Discord with mention-only mode, thread routing, and typing indicators (#2842 )	2026-05-15 22:30:05 +08:00
test_model_config.py	feat(codex): support explicit OpenAI Responses API config (#1235 )	2026-03-22 20:39:26 +08:00
test_model_factory.py	fix(#3189 ): prevent write_file streaming timeout on long reports (#3195 )	2026-06-07 17:47:11 +08:00
test_openapi_operation_ids.py	fix(gateway): split stream_existing_run into per-method routes for unique OpenAPI operationIds (#3228 )	2026-05-28 08:20:52 +08:00
test_owner_isolation.py	feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134 )	2026-04-26 11:09:55 +08:00
test_patched_deepseek.py	fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025 )	2026-04-09 16:07:16 +08:00
test_patched_mimo.py	feat(provider) Add patched MiMo reasoning content support (#3298 )	2026-05-28 18:24:32 +08:00
test_patched_minimax.py	feat: upgrade MiniMax default model to M3 (#3357 )	2026-06-03 17:04:16 +08:00
test_patched_openai.py	fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180 ) (#1205 )	2026-03-26 15:07:05 +08:00
test_paths_user_isolation.py	fix(mcp): add auth interceptor with channel user_id and keep header propagation to mcp tools (#3294 )	2026-06-03 15:48:19 +08:00
test_persistence_scaffold.py	fix(packaging): add postgres extra for store/checkpointer supportFix postgres extra install guidance (#2584 )	2026-05-09 09:49:08 +08:00
test_persistence_timezone.py	fix(persistence): emit tz-aware timestamps from SQLite-backed stores (#3130 )	2026-05-21 16:22:09 +08:00
test_present_file_tool_core_logic.py	feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153 )	2026-04-26 11:13:01 +08:00
test_provisioner_kubeconfig.py	feat(provisioner): add optional PVC support for sandbox volumes (#2020 )	2026-04-10 20:40:30 +08:00
test_provisioner_pvc_volumes.py	fix(sandbox): scope provisioner PVC data by user (#2973 )	2026-05-17 15:23:42 +08:00
test_readability.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131 )	2026-03-14 22:55:52 +08:00
test_reflection_resolvers.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131 )	2026-03-14 22:55:52 +08:00
test_reload_boundary.py	fix(config): make the reload boundary discoverable from code (#3144 ) (#3153 )	2026-06-07 21:27:14 +08:00
test_remote_sandbox_backend.py	fix(sandbox): avoid blocking sandbox readiness polling (#2822 )	2026-05-21 14:44:34 +08:00
test_run_event_store_pagination.py	fix the lint error in backend	2026-04-26 15:09:25 +08:00
test_run_event_store.py	fix(runtime): avoid postgres aggregate row lock (#2962 )	2026-05-15 10:32:09 +08:00
test_run_journal.py	fix(runs): expose active progress counters (#3148 )	2026-05-22 21:42:14 +08:00
test_run_manager.py	fix(runtime): make run creation persistence atomic (#3152 )	2026-05-23 22:43:34 +08:00
test_run_naming.py	feat(trace):LangGraph -> lead_agent and set custom agent_name to run_name (#3101 )	2026-05-21 14:48:28 +08:00
test_run_repository.py	fix: harden run finalization persistence (#3155 )	2026-05-23 00:09:06 +08:00
test_run_worker_rollback.py	fix(middleware): fix LLM fallback run status (#3321 )	2026-05-31 22:42:13 +08:00
test_runs_api_endpoints.py	fix: load paginated run history messages (#3305 )	2026-06-01 15:50:39 +08:00
test_runtime_lifecycle_e2e.py	fix(gateway): split stream_existing_run into per-method routes for unique OpenAPI operationIds (#3228 )	2026-05-28 08:20:52 +08:00
test_runtime_paths.py	fix(harness): restore legacy skills path fallback (#2694 ) (#2696 )	2026-05-03 23:40:59 +08:00
test_safety_finish_reason_graph_integration.py	fix(runtime): suppress tool execution when provider safety-terminates with tool_calls (#3035 )	2026-05-22 21:20:28 +08:00
test_safety_finish_reason_middleware.py	fix(runtime): suppress tool execution when provider safety-terminates with tool_calls (#3035 )	2026-05-22 21:20:28 +08:00
test_safety_termination_detectors.py	fix(runtime): suppress tool execution when provider safety-terminates with tool_calls (#3035 )	2026-05-22 21:20:28 +08:00
test_sandbox_audit_middleware.py	feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881 )	2026-04-07 17:15:24 +08:00
test_sandbox_memory_profile_script.py	chore: add sandbox memory profiling tools (#3249 )	2026-06-03 22:02:27 +08:00
test_sandbox_middleware.py	fix(sandbox): avoid blocking sandbox readiness polling (#2822 )	2026-05-21 14:44:34 +08:00
test_sandbox_orphan_reconciliation_e2e.py	fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976 )	2026-04-09 17:21:23 +08:00
test_sandbox_orphan_reconciliation.py	fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976 )	2026-04-09 17:21:23 +08:00
test_sandbox_search_tools.py	fix(sandbox): add missing path masking in ls_tool output (#2317 )	2026-04-18 08:46:59 +08:00
test_sandbox_tools_security.py	fix(runtime): bound write_file execution-failure observations (#3133 )	2026-05-21 20:35:46 +08:00
test_security_scanner.py	fix(skills): make security scanner JSON parsing robust for LLM output variations (#2987 )	2026-05-17 08:59:42 +08:00
test_serialization.py	feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403 )	2026-03-30 16:02:23 +08:00
test_serialize_message_content.py	feat(harness): integration ACP agent tool (#1344 )	2026-03-26 14:20:18 +08:00
test_serper_tools.py	feat(community): add Serper web search provider (#2630 )	2026-05-02 16:22:35 +08:00
test_setup_agent_e2e_user_isolation.py	fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867 )	2026-05-12 23:18:54 +08:00
test_setup_agent_http_e2e_real_server.py	fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867 )	2026-05-12 23:18:54 +08:00
test_setup_agent_tool.py	fix: keep new agent bootstrap in user scope (#2784 )	2026-05-09 19:43:50 +08:00
test_setup_wizard.py	fix(setup): refresh LLM provider wizard defaults (#3421 )	2026-06-08 08:33:24 +08:00
test_skill_manage_tool.py	refactor(skills): Unified skill storage capability (#2613 )	2026-05-01 13:23:26 +08:00
test_skill_permissions.py	Fix custom skill install permissions (#3241 )	2026-05-28 15:48:32 +08:00
test_skills_archive_root.py	refactor: extract shared skill installer and upload manager to harness (#1202 )	2026-03-25 16:28:33 +08:00
test_skills_bundled.py	fix(skills): validate bundled SKILL.md front-matter in CI (fixes #2443 ) (#2457 )	2026-04-23 14:06:14 +08:00
test_skills_custom_router.py	Fix custom skill install permissions (#3241 )	2026-05-28 15:48:32 +08:00
test_skills_installer.py	Fix custom skill install permissions (#3241 )	2026-05-28 15:48:32 +08:00
test_skills_loader.py	fix(harness): restore legacy skills path fallback (#2694 ) (#2696 )	2026-05-03 23:40:59 +08:00
test_skills_parser.py	fix(skills): surface offending line and quoting hint on SKILL.md YAML… (#3335 )	2026-06-03 21:53:52 +08:00
test_skills_validation.py	fix(skills): enforce allowed-tools metadata (#2626 )	2026-05-07 08:34:43 +08:00
test_sse_format.py	feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403 )	2026-03-30 16:02:23 +08:00
test_stream_bridge.py	Fix(#1702 ): stream resume run (#1858 )	2026-04-06 14:51:10 +08:00
test_subagent_executor.py	fix(subagents): make subagent timeout terminal state atomic (#2583 )	2026-05-18 22:19:32 +08:00
test_subagent_limit_middleware.py	fix(middleware): sync raw tool call metadata (#2757 )	2026-05-08 10:08:53 +08:00
test_subagent_prompt_security.py	feat(subagents): support per-subagent skill loading and custom subagent types (#2253 )	2026-04-23 23:59:47 +08:00
test_subagent_skills_config.py	refactor: thread app_config through lead and subagent task path (#2666 )	2026-05-02 06:37:49 +08:00
test_subagent_status_contract.py	fix(subagent): structured subagent_status field over text parsing (#3146 ) (#3154 )	2026-06-07 22:49:55 +08:00
test_subagent_timeout_config.py	feat(subagents): allow model override per subagent in config.yaml (#2064 )	2026-04-12 16:40:21 +08:00
test_subagent_token_collector.py	fix: bucket subagent token usage into parent run totals (#2838 )	2026-05-10 22:47:30 +08:00
test_suggestions_router.py	refactor: thread release config through lead path (#2612 )	2026-04-28 14:53:18 +08:00
test_summarization_middleware.py	fix(summarization): tag summary LLM calls nostream to stop phantom stream messages (#2503 ) (#3378 )	2026-06-07 17:55:04 +08:00
test_task_tool_core_logic.py	fix(task-tool): cancel and schedule deferred cleanup on polling safety timeout (#3097 )	2026-05-21 07:47:19 +08:00
test_task_tool_usage_recorder.py	fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107 ) (#3131 )	2026-05-21 21:18:10 +08:00
test_thread_data_middleware.py	Fix Windows backend test compatibility (#1384 )	2026-03-26 17:39:16 +08:00
test_thread_meta_repo.py	perf(harness): push thread metadata filters into SQL (#2865 )	2026-05-12 23:21:22 +08:00
test_thread_run_messages_pagination.py	fix: load paginated run history messages (#3305 )	2026-06-01 15:50:39 +08:00
test_thread_state_promoted.py	fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342 )	2026-06-02 22:43:22 +08:00
test_thread_state_reducers.py	fix(agents): preserve todos state across node updates (#3180 )	2026-05-23 23:25:38 +08:00
test_thread_token_usage.py	fix(runs): expose active progress counters (#3148 )	2026-05-22 21:42:14 +08:00
test_threads_router.py	perf(harness): push thread metadata filters into SQL (#2865 )	2026-05-12 23:21:22 +08:00
test_tiktoken_cache_and_count_tokens.py	fix(middleware): offload memory injection off event loop to prevent tiktoken blocking (#3402 ) (#3411 )	2026-06-08 12:21:55 +08:00
test_title_generation.py	refactor: split backend into harness (deerflow.) and app (app.) (#1131 )	2026-03-14 22:55:52 +08:00
test_title_middleware_core_logic.py	fix(tracing): propagate session_id and user_id into Langfuse traces (#2944 )	2026-05-21 16:49:31 +08:00
test_todo_middleware.py	fix(todo): reuse thread state schema (#3206 )	2026-05-26 23:58:08 +08:00
test_token_usage_config.py	enable token usage by default (#2841 )	2026-05-10 22:00:57 +08:00
test_token_usage_middleware.py	feat: stream subagent token usage to header via terminal task events (#2882 )	2026-05-13 23:52:19 +08:00
test_token_usage.py	feat(harness): integration ACP agent tool (#1344 )	2026-03-26 14:20:18 +08:00
test_tool_args_schema_no_pydantic_warning.py	fix(tools): make write_file append discoverable in model-facing schema (#2843 )	2026-05-10 23:09:03 +08:00
test_tool_deduplication.py	fix(harness): wrap all async-only tools for sync clients (#2935 )	2026-05-19 22:11:46 +08:00
test_tool_error_handling_middleware.py	feat(agent): add ToolOutputBudgetMiddleware for oversized tool output protection (#3303 )	2026-05-29 22:59:26 +08:00
test_tool_error_handling_subagent_stamp.py	fix(subagent): structured subagent_status field over text parsing (#3146 ) (#3154 )	2026-06-07 22:49:55 +08:00
test_tool_output_budget_middleware.py	fix(middleware): externalize oversized tool output into sandbox for non-mounted sandboxes (#3417 )	2026-06-08 12:24:48 +08:00
test_tool_output_truncation.py	fix: add output truncation to ls_tool to prevent context window overflow (#1896 )	2026-04-06 15:09:57 +08:00
test_tool_search.py	fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342 )	2026-06-02 22:43:22 +08:00
test_tracing_config.py	fix(tracing): propagate session_id and user_id into Langfuse traces (#2944 )	2026-05-21 16:49:31 +08:00
test_tracing_factory.py	fix(tracing): propagate session_id and user_id into Langfuse traces (#2944 )	2026-05-21 16:49:31 +08:00
test_tracing_metadata.py	fix(tracing): propagate session_id and user_id into Langfuse traces (#2944 )	2026-05-21 16:49:31 +08:00
test_update_agent_e2e_user_isolation.py	fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867 )	2026-05-12 23:18:54 +08:00
test_update_agent_tool.py	fix(agents): harden update_agent null-like args (#3237 )	2026-06-04 07:10:59 +08:00
test_uploads_manager.py	fix(uploads): add Windows support for safe symlink-protected uploads (#2794 )	2026-05-09 18:21:54 +08:00
test_uploads_middleware_core_logic.py	feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153 )	2026-04-26 11:13:01 +08:00
test_uploads_router.py	fix upload file size contract (#3408 )	2026-06-06 15:12:17 +08:00
test_user_context.py	fix the lint error in backend	2026-04-26 15:09:25 +08:00
test_utils_time.py	fix(gateway): return ISO 8601 timestamps from threads endpoints (#2599 )	2026-05-02 15:16:16 +08:00
test_view_image_middleware.py	test: add unit tests for ViewImageMiddleware (#2256 )	2026-04-15 23:54:30 +08:00
test_view_image_tool.py	fix(harness): constrain view_image to thread data paths (#2557 )	2026-04-28 11:13:17 +08:00
test_vllm_provider.py	feat(models): add vLLM provider support (#1860 )	2026-04-06 15:18:34 +08:00
test_wait_disconnect_handling.py	fix(gateway): honour on_disconnect on /wait endpoints (#3267 )	2026-05-28 07:22:39 +08:00
test_wechat_channel.py	feat: add WeChat channel integration (#1869 )	2026-04-10 20:49:28 +08:00
test_worker_langfuse_metadata.py	fix(tracing): propagate session_id and user_id into Langfuse traces (#2944 )	2026-05-21 16:49:31 +08:00
test_write_file_tool_size_guard.py	fix(#3189 ): prevent write_file streaming timeout on long reports (#3195 )	2026-06-07 17:47:11 +08:00