deerflow2/backend/tests
Huixin615 919d8bc279
fix(sandbox): persist lazily-acquired sandbox state via Command (#3464)
* fix(sandbox): persist lazily-acquired sandbox state via Command

ensure_sandbox_initialized mutates runtime.state in place, which is local
to the current tool invocation and is not picked up by LangGraph's channel
reducer. Subsequent graph steps and downstream consumers (such as
ToolOutputBudgetMiddleware and the sub-agent task_tool) therefore cannot
observe the sandbox id from state.

Wrap tool calls in SandboxMiddleware (wrap_tool_call / awrap_tool_call) to
detect fresh lazy initialization by diffing runtime.state before and after
the handler, and emit a proper state update via Command(update=...):

- ToolMessage results are wrapped into Command(update={sandbox, messages})
- Command results with a dict update are merged on the sandbox key while
  preserving messages / goto / graph / resume
- Command results with non-dict updates are left untouched to avoid silent
  data loss on unknown update shapes

Tests:
- 7 new unit tests cover lazy-init emit, passthrough, dict-update merge,
  non-dict-update passthrough (sync and async)
- Refresh replay golden write_read_file.ultra.events.json: SSE 'values'
  events now correctly carry the 'sandbox' key in their keys list, which
  is the direct evidence that the fix is effective

Closes #3463

* refactor(sandbox): use dataclasses.replace to preserve Command fields

Address Copilot review on #3464: replace manual field-copy with
dataclasses.replace so any current or future Command fields are
preserved automatically when merging sandbox_update.

Also add a regression test that constructs a Command with non-None
graph/goto/resume to lock this behavior in.
2026-06-11 17:50:36 +08:00
..
blocking_io fix(agents): offload blocking filesystem IO in the custom-agent router off the event loop (#3457) 2026-06-09 22:24:53 +08:00
fixtures/replay fix(sandbox): persist lazily-acquired sandbox state via Command (#3464) 2026-06-11 17:50:36 +08:00
support Add static blocking IO inventory (#3208) 2026-05-26 23:30:24 +08:00
_agent_e2e_helpers.py fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867) 2026-05-12 23:18:54 +08:00
_replay_fixture.py fix(replay-e2e): key fixtures by caller and conversation (#3453) 2026-06-09 21:58:31 +08:00
_router_auth_helpers.py fix the lint error in backend 2026-04-26 15:09:25 +08:00
_run_message_pagination_helpers.py fix: load paginated run history messages (#3305) 2026-06-01 15:50:39 +08:00
conftest.py Add static blocking IO inventory (#3208) 2026-05-26 23:30:24 +08:00
replay_provider.py fix(replay-e2e): key fixtures by caller and conversation (#3453) 2026-06-09 21:58:31 +08:00
seed_runs_router.py test(e2e): deterministic record/replay front-back contract verification (#3365) 2026-06-08 12:35:03 +08:00
test_acp_config.py feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447) 2026-03-27 20:03:30 +08:00
test_aio_sandbox_local_backend.py [security] fix(sandbox): bind local Docker ports to loopback (#2633) 2026-04-30 11:40:28 +08:00
test_aio_sandbox_provider.py fix(sandbox): close AioSandbox HTTP client during provider teardown (#2872) (#3245) 2026-06-02 22:55:59 +08:00
test_aio_sandbox_readiness.py fix(sandbox): avoid blocking sandbox readiness polling (#2822) 2026-05-21 14:44:34 +08:00
test_aio_sandbox.py fix(sandbox): close AioSandbox HTTP client during provider teardown (#2872) (#3245) 2026-06-02 22:55:59 +08:00
test_app_config_reload.py fix(config): coerce null config.yaml list sections to empty list (#3434) 2026-06-09 15:45:28 +08:00
test_artifacts_router.py fix(gateway): cap skill artifact preview size (#2963) 2026-05-15 22:15:58 +08:00
test_assistant_payload_replay.py refactor(provider): share assistant payload replay matching (#3307) 2026-05-29 23:05:59 +08:00
test_auth_config.py fix(auth): persist auto-generated JWT secret to survive restarts (#2933) 2026-05-16 09:24:40 +08:00
test_auth_errors.py feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008) 2026-04-26 11:08:11 +08:00
test_auth_middleware.py fix: align auth-disabled mode and mock history loading (#3471) 2026-06-10 16:11:00 +08:00
test_auth_type_system.py feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008) 2026-04-26 11:08:11 +08:00
test_auth.py fix(security): harden auth system and fix run journal logic bug (#2593) 2026-04-28 11:34:07 +08:00
test_cancel_run_idempotent.py fix(runtime): make RunManager.cancel() idempotent for already-interrupted runs (#3055) (#3058) 2026-05-20 16:37:36 +08:00
test_channel_file_attachments.py [security] fix(upload): reject symlinked upload destinations (#2623) 2026-05-02 15:19:28 +08:00
test_channels.py fix(skills): harden slash skill activation across chat channels (#3466) 2026-06-09 23:07:17 +08:00
test_check_script.py fix(check): windows pnpm version detection in check script (#2189) 2026-04-14 10:29:44 +08:00
test_checkpointer_none_fix.py feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134) 2026-04-26 11:09:55 +08:00
test_checkpointer.py refactor(lead-agent): make build_middlewares public to drop the last cross-module private import (#3458) 2026-06-09 11:56:28 +08:00
test_clarification_middleware.py fix(backend): make clarification messages idempotent (#2350) (#2351) 2026-04-19 22:00:58 +08:00
test_claude_provider_oauth_billing.py fix(oauth): Harden Claude OAuth cache-control handling (#1583) 2026-03-30 07:41:18 +08:00
test_claude_provider_prompt_caching.py fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (#2449) 2026-04-25 19:40:06 +08:00
test_cli_auth_providers.py fix(provider): preserve streamed Codex output when response.completed.output is empty (#1928) 2026-04-07 18:21:22 +08:00
test_client_e2e.py refactor(lead-agent): make build_middlewares public to drop the last cross-module private import (#3458) 2026-06-09 11:56:28 +08:00
test_client_langfuse_metadata.py fix(tracing): propagate session_id and user_id into Langfuse traces (#2944) 2026-05-21 16:49:31 +08:00
test_client_live.py [Security] Address critical host-shell escape in LocalSandboxProvider (#1547) 2026-03-29 21:03:58 +08:00
test_client_message_serialization.py feat: refine token usage display modes (#2329) 2026-05-04 09:56:16 +08:00
test_client.py feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429) (#3465) 2026-06-10 23:26:15 +08:00
test_codex_provider.py chroe(2585): keep polishing the code of codex token usage (#2689) 2026-05-02 15:04:11 +08:00
test_compose_default_workers.py fix(docker): default Gateway to a single worker to prevent multi-worker breakage (#3475) 2026-06-10 21:36:25 +08:00
test_config_version.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_converters.py feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930) 2026-04-26 11:05:47 +08:00
test_create_deerflow_agent_live.py feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203) 2026-03-29 15:31:18 +08:00
test_create_deerflow_agent.py feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711) 2026-05-07 16:15:15 +08:00
test_credential_loader.py feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711) 2026-05-07 16:15:15 +08:00
test_csrf_middleware.py feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801) 2026-05-09 09:27:02 +08:00
test_custom_agent.py fix(agents): require config.yaml in resolve_agent_dir to skip memory-only directories (#3390) (#3481) 2026-06-10 23:57:17 +08:00
test_dangling_tool_call_middleware.py fix(runtime): guide malformed write_file recovery (#3040) 2026-05-29 17:46:24 +08:00
test_ddg_search_tools.py fix(search): fix DDGS Wikipedia region handling (#3423) 2026-06-08 07:59:50 +08:00
test_deferred_catalog.py refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370) 2026-06-05 15:21:41 +08:00
test_deferred_filter_middleware.py fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342) 2026-06-02 22:43:22 +08:00
test_deferred_promotion_integration.py refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370) 2026-06-05 15:21:41 +08:00
test_deferred_setup.py refactor(tool-search): consolidate MCP metadata tag and harden deferred-tool setup (#3370) 2026-06-05 15:21:41 +08:00
test_deferred_tool_crosscontext.py feat(subagents): extend deferred MCP tool loading to subagents (#3432) 2026-06-08 23:17:22 +08:00
test_deferred_tool_promotion_real_llm.py fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342) 2026-06-02 22:43:22 +08:00
test_detect_blocking_io_static.py Add static blocking IO inventory (#3208) 2026-05-26 23:30:24 +08:00
test_detect_thread_boundaries.py chore(dev): add async/thread boundary detector (#2936) 2026-05-20 10:00:17 +08:00
test_detect_uv_extras.py fix(scripts): preserve uv extras across make dev restarts (#2754) (#2767) 2026-05-10 22:28:29 +08:00
test_dev_entrypoint.py fix(dev): create backend/sandbox before uvicorn reload-exclude (#3459) (#3460) 2026-06-09 15:29:40 +08:00
test_dingtalk_channel.py feat(channels): add DingTalk channel integration (#2628) 2026-04-30 11:25:33 +08:00
test_discord_channel.py fix(skills): harden slash skill activation across chat channels (#3466) 2026-06-09 23:07:17 +08:00
test_docker_sandbox_mode_detection.py fix Windows Docker sandbox path mounting (#1634) 2026-03-31 22:19:27 +08:00
test_doctor.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_dynamic_context_middleware.py fix(harness): preserve dynamic context across summarization (#2823) 2026-05-09 19:39:36 +08:00
test_ensure_admin.py refactor: Remove init_token handling from admin initialization logic and related tests 2026-04-26 11:09:56 +08:00
test_exa_tools.py feat(community): add Exa search as community tool provider (#1357) 2026-04-08 17:13:39 +08:00
test_feedback.py feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134) 2026-04-26 11:09:55 +08:00
test_feishu_parser.py fix(channels): preserve Feishu clarification thread continuity (#3285) 2026-05-31 22:43:07 +08:00
test_file_conversion.py [security] fix(uploads): require explicit opt-in for host-side document conversion (#2332) 2026-04-18 22:47:42 +08:00
test_firecrawl_tools.py feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
test_gateway_config_freshness.py fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107) (#3131) 2026-05-21 21:18:10 +08:00
test_gateway_docs_toggle.py fix(nginx): defer CORS to gateway allowlist (#2861) 2026-05-11 17:38:37 +08:00
test_gateway_lifespan_shutdown.py fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107) (#3131) 2026-05-21 21:18:10 +08:00
test_gateway_run_drain_shutdown.py fix(gateway): drain in-flight runs before closing checkpointer on shutdown (#3381) 2026-06-07 11:24:30 +08:00
test_gateway_run_recovery.py fix(gateway): drain in-flight runs before closing checkpointer on shutdown (#3381) 2026-06-07 11:24:30 +08:00
test_gateway_runtime_cleanup.py fix(dev): create backend/sandbox before uvicorn reload-exclude (#3459) (#3460) 2026-06-09 15:29:40 +08:00
test_gateway_services.py fix(mcp): add auth interceptor with channel user_id and keep header propagation to mcp tools (#3294) 2026-06-03 15:48:19 +08:00
test_guardrail_middleware.py feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240) 2026-03-23 18:07:33 +08:00
test_harness_boundary.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_infoquest_client.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_initialize_admin.py fix(auth): replace setup-status 429 rate limit with cached response (#2915) 2026-05-18 22:07:01 +08:00
test_internal_auth.py fix(auth): share internal gateway token across workers (#3184) 2026-05-26 23:19:57 +08:00
test_invoke_acp_agent_tool.py fix(harness): wrap all async-only tools for sync clients (#2935) 2026-05-19 22:11:46 +08:00
test_jina_client.py fix(web_fetch): support proxy for Jina reader in restricted networks (#3418) (#3430) 2026-06-08 23:25:29 +08:00
test_jsonl_event_store_async_io.py fix(runtime): harden JSONL async I/O and DB put_batch thread validation (#3084) 2026-05-29 09:27:53 +08:00
test_langgraph_auth.py fix: align auth-disabled mode and mock history loading (#3471) 2026-06-10 16:11:00 +08:00
test_lead_agent_model_resolution.py refactor(lead-agent): make build_middlewares public to drop the last cross-module private import (#3458) 2026-06-09 11:56:28 +08:00
test_lead_agent_prompt.py feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429) (#3465) 2026-06-10 23:26:15 +08:00
test_lead_agent_skills.py fix(skills): harden slash skill activation across chat channels (#3466) 2026-06-09 23:07:17 +08:00
test_llm_error_handling_middleware.py fix(#3189): prevent write_file streaming timeout on long reports (#3195) 2026-06-07 17:47:11 +08:00
test_local_bash_tool_loading.py fix(sandbox): improve sandbox security and preserve multimodal content (#2114) 2026-04-11 16:52:10 +08:00
test_local_sandbox_encoding.py fix(sandbox): disable msys path conversion (#2766) 2026-05-08 10:13:11 +08:00
test_local_sandbox_provider_mounts.py fix(sandbox): make missing sandbox.mounts host_path a loud ERROR (#3244) (#3250) 2026-06-09 23:16:14 +08:00
test_local_sandbox_virtual_path_contract.py fix(sandbox): uphold /mnt/user-data contract at Sandbox API boundary (#2873) (#2881) 2026-05-17 08:26:04 +08:00
test_local_skill_storage_write.py Fix custom skill install permissions (#3241) 2026-05-28 15:48:32 +08:00
test_logging_level_from_config.py fix(config): unify log_level from config.yaml across Gateway and debug entry points (#2601) 2026-04-30 22:27:14 +08:00
test_loop_detection_config.py feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711) 2026-05-07 16:15:15 +08:00
test_loop_detection_middleware.py feat(loop-detection): defer warning injection (#2752) 2026-05-21 14:36:07 +08:00
test_mcp_client_config.py fix(mcp): accept transport field as alias for type (#3238) (#3243) 2026-06-03 18:11:38 +08:00
test_mcp_config_secrets.py fix(security): harden MCP config endpoint (#3425) 2026-06-08 12:21:02 +08:00
test_mcp_custom_interceptors.py feat(mcp): support custom tool interceptors via extensions_config.json (#2451) 2026-04-25 09:18:13 +08:00
test_mcp_oauth.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_mcp_session_pool.py fix(mcp): close stdio sessions on their owning loop to avoid cross-task cancel-scope error (#3379) (#3392) 2026-06-07 21:37:30 +08:00
test_mcp_sync_wrapper.py fix(harness): wrap all async-only tools for sync clients (#2935) 2026-05-19 22:11:46 +08:00
test_memory_prompt_injection.py feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429) (#3465) 2026-06-10 23:26:15 +08:00
test_memory_queue_user_isolation.py fix(memory): isolate queued memory updates by agent (#2941) 2026-05-15 10:26:35 +08:00
test_memory_queue.py fix(memory): isolate queued memory updates by agent (#2941) 2026-05-15 10:26:35 +08:00
test_memory_router.py feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153) 2026-04-26 11:13:01 +08:00
test_memory_storage_user_isolation.py fix the lint error in backend 2026-04-26 15:09:25 +08:00
test_memory_storage.py fix: Memory update system has cache corruption, data loss, and thread-safety bugs (#2251) 2026-04-17 12:00:31 +08:00
test_memory_thread_meta_isolation.py feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134) 2026-04-26 11:09:55 +08:00
test_memory_updater_user_isolation.py fix the lint error in backend 2026-04-26 15:09:25 +08:00
test_memory_updater.py fix(memory): parse wrapped memory update json responses (#3252) 2026-05-28 07:46:44 +08:00
test_memory_upload_filtering.py feat: flush memory before summarization (#2176) 2026-04-14 15:01:06 +08:00
test_migration_user_isolation.py feat(agent): add custom-agent self-updates with user isolation (#2713) 2026-05-05 23:17:42 +08:00
test_mindie_provider.py feat(channels): enhance Discord with mention-only mode, thread routing, and typing indicators (#2842) 2026-05-15 22:30:05 +08:00
test_model_config.py feat(codex): support explicit OpenAI Responses API config (#1235) 2026-03-22 20:39:26 +08:00
test_model_factory.py feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437) 2026-06-08 22:04:38 +08:00
test_openapi_operation_ids.py fix(gateway): split stream_existing_run into per-method routes for unique OpenAPI operationIds (#3228) 2026-05-28 08:20:52 +08:00
test_owner_isolation.py feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134) 2026-04-26 11:09:55 +08:00
test_patched_deepseek.py fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025) 2026-04-09 16:07:16 +08:00
test_patched_mimo.py feat(provider) Add patched MiMo reasoning content support (#3298) 2026-05-28 18:24:32 +08:00
test_patched_minimax.py feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437) 2026-06-08 22:04:38 +08:00
test_patched_openai.py fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180) (#1205) 2026-03-26 15:07:05 +08:00
test_patched_stepfun.py feat(models): add StepFun reasoning model adapter (#3461) 2026-06-09 18:01:43 +08:00
test_paths_user_isolation.py fix(mcp): add auth interceptor with channel user_id and keep header propagation to mcp tools (#3294) 2026-06-03 15:48:19 +08:00
test_persistence_scaffold.py fix(packaging): add postgres extra for store/checkpointer supportFix postgres extra install guidance (#2584) 2026-05-09 09:49:08 +08:00
test_persistence_timezone.py fix(persistence): emit tz-aware timestamps from SQLite-backed stores (#3130) 2026-05-21 16:22:09 +08:00
test_present_file_tool_core_logic.py feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153) 2026-04-26 11:13:01 +08:00
test_provisioner_kubeconfig.py feat(provisioner): add optional PVC support for sandbox volumes (#2020) 2026-04-10 20:40:30 +08:00
test_provisioner_pvc_volumes.py fix(sandbox): scope provisioner PVC data by user (#2973) 2026-05-17 15:23:42 +08:00
test_readability.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_reflection_resolvers.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_reload_boundary.py fix(config): make the reload boundary discoverable from code (#3144) (#3153) 2026-06-07 21:27:14 +08:00
test_remote_sandbox_backend.py fix(sandbox): avoid blocking sandbox readiness polling (#2822) 2026-05-21 14:44:34 +08:00
test_replay_golden.py fix(replay-e2e): match by conversation, not the living system prompt (#3436) 2026-06-08 17:32:41 +08:00
test_replay_provider.py fix(replay-e2e): key fixtures by caller and conversation (#3453) 2026-06-09 21:58:31 +08:00
test_run_event_store_pagination.py fix the lint error in backend 2026-04-26 15:09:25 +08:00
test_run_event_store.py fix(runtime): avoid postgres aggregate row lock (#2962) 2026-05-15 10:32:09 +08:00
test_run_journal.py fix runtime journal run lifecycle events (#3470) 2026-06-10 08:33:29 +08:00
test_run_manager.py fix(runtime): make run creation persistence atomic (#3152) 2026-05-23 22:43:34 +08:00
test_run_naming.py feat(trace):LangGraph -> lead_agent and set custom agent_name to run_name (#3101) 2026-05-21 14:48:28 +08:00
test_run_repository.py fix: harden run finalization persistence (#3155) 2026-05-23 00:09:06 +08:00
test_run_worker_rollback.py fix(middleware): fix LLM fallback run status (#3321) 2026-05-31 22:42:13 +08:00
test_runs_api_endpoints.py fix: load paginated run history messages (#3305) 2026-06-01 15:50:39 +08:00
test_runtime_lifecycle_e2e.py fix(gateway): split stream_existing_run into per-method routes for unique OpenAPI operationIds (#3228) 2026-05-28 08:20:52 +08:00
test_runtime_paths.py fix(harness): restore legacy skills path fallback (#2694) (#2696) 2026-05-03 23:40:59 +08:00
test_safety_finish_reason_graph_integration.py fix(runtime): suppress tool execution when provider safety-terminates with tool_calls (#3035) 2026-05-22 21:20:28 +08:00
test_safety_finish_reason_middleware.py fix(runtime): suppress tool execution when provider safety-terminates with tool_calls (#3035) 2026-05-22 21:20:28 +08:00
test_safety_termination_detectors.py fix(runtime): suppress tool execution when provider safety-terminates with tool_calls (#3035) 2026-05-22 21:20:28 +08:00
test_sandbox_audit_middleware.py feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881) 2026-04-07 17:15:24 +08:00
test_sandbox_memory_profile_script.py chore: add sandbox memory profiling tools (#3249) 2026-06-03 22:02:27 +08:00
test_sandbox_middleware.py fix(sandbox): persist lazily-acquired sandbox state via Command (#3464) 2026-06-11 17:50:36 +08:00
test_sandbox_orphan_reconciliation_e2e.py fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976) 2026-04-09 17:21:23 +08:00
test_sandbox_orphan_reconciliation.py fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976) 2026-04-09 17:21:23 +08:00
test_sandbox_search_tools.py fix(sandbox): add missing path masking in ls_tool output (#2317) 2026-04-18 08:46:59 +08:00
test_sandbox_tools_security.py fix(runtime): bound write_file execution-failure observations (#3133) 2026-05-21 20:35:46 +08:00
test_security_scanner.py fix(skills): make security scanner JSON parsing robust for LLM output variations (#2987) 2026-05-17 08:59:42 +08:00
test_serialization.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_serialize_message_content.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_serper_tools.py feat(community): add Serper web search provider (#2630) 2026-05-02 16:22:35 +08:00
test_setup_agent_e2e_user_isolation.py fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867) 2026-05-12 23:18:54 +08:00
test_setup_agent_http_e2e_real_server.py fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867) 2026-05-12 23:18:54 +08:00
test_setup_agent_tool.py fix: keep new agent bootstrap in user scope (#2784) 2026-05-09 19:43:50 +08:00
test_setup_wizard.py feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437) 2026-06-08 22:04:38 +08:00
test_skill_manage_tool.py refactor(skills): Unified skill storage capability (#2613) 2026-05-01 13:23:26 +08:00
test_skill_permissions.py Fix custom skill install permissions (#3241) 2026-05-28 15:48:32 +08:00
test_skills_archive_root.py refactor: extract shared skill installer and upload manager to harness (#1202) 2026-03-25 16:28:33 +08:00
test_skills_bundled.py fix(skills): validate bundled SKILL.md front-matter in CI (fixes #2443) (#2457) 2026-04-23 14:06:14 +08:00
test_skills_custom_router.py Fix custom skill install permissions (#3241) 2026-05-28 15:48:32 +08:00
test_skills_installer.py Fix custom skill install permissions (#3241) 2026-05-28 15:48:32 +08:00
test_skills_loader.py fix(harness): restore legacy skills path fallback (#2694) (#2696) 2026-05-03 23:40:59 +08:00
test_skills_parser.py fix(skills): surface offending line and quoting hint on SKILL.md YAML… (#3335) 2026-06-03 21:53:52 +08:00
test_skills_validation.py fix(skills): enforce allowed-tools metadata (#2626) 2026-05-07 08:34:43 +08:00
test_slash_skills.py fix(skills): harden slash skill activation across chat channels (#3466) 2026-06-09 23:07:17 +08:00
test_sse_format.py feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403) 2026-03-30 16:02:23 +08:00
test_stateless_runs_owner_isolation.py fix(gateway): enforce thread ownership on stateless run endpoints (#3473) 2026-06-10 23:03:39 +08:00
test_stream_bridge.py Fix(#1702): stream resume run (#1858) 2026-04-06 14:51:10 +08:00
test_subagent_deferred_promotion_integration.py feat(subagents): extend deferred MCP tool loading to subagents (#3432) 2026-06-08 23:17:22 +08:00
test_subagent_executor.py feat(subagents): extend deferred MCP tool loading to subagents (#3432) 2026-06-08 23:17:22 +08:00
test_subagent_limit_middleware.py fix(middleware): sync raw tool call metadata (#2757) 2026-05-08 10:08:53 +08:00
test_subagent_prompt_security.py feat(subagents): support per-subagent skill loading and custom subagent types (#2253) 2026-04-23 23:59:47 +08:00
test_subagent_skills_config.py refactor: thread app_config through lead and subagent task path (#2666) 2026-05-02 06:37:49 +08:00
test_subagent_status_contract.py fix(subagent): structured subagent_status field over text parsing (#3146) (#3154) 2026-06-07 22:49:55 +08:00
test_subagent_timeout_config.py feat(subagents): allow model override per subagent in config.yaml (#2064) 2026-04-12 16:40:21 +08:00
test_subagent_token_collector.py fix: bucket subagent token usage into parent run totals (#2838) 2026-05-10 22:47:30 +08:00
test_suggestions_router.py fix(suggestions): strip inline <think> reasoning before parsing follow-up questions (#3435) 2026-06-08 15:48:00 +08:00
test_summarization_middleware.py fix(summarization): tag summary LLM calls nostream to stop phantom stream messages (#2503) (#3378) 2026-06-07 17:55:04 +08:00
test_task_tool_core_logic.py fix(task-tool): cancel and schedule deferred cleanup on polling safety timeout (#3097) 2026-05-21 07:47:19 +08:00
test_task_tool_usage_recorder.py fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107) (#3131) 2026-05-21 21:18:10 +08:00
test_thread_data_middleware.py Fix Windows backend test compatibility (#1384) 2026-03-26 17:39:16 +08:00
test_thread_meta_repo.py perf(harness): push thread metadata filters into SQL (#2865) 2026-05-12 23:21:22 +08:00
test_thread_run_messages_pagination.py fix: load paginated run history messages (#3305) 2026-06-01 15:50:39 +08:00
test_thread_state_promoted.py fix(tool-search): reliably hide deferred MCP schemas by removing the ContextVar (closures + graph state) (#3342) 2026-06-02 22:43:22 +08:00
test_thread_state_reducers.py fix(agents): preserve todos state across node updates (#3180) 2026-05-23 23:25:38 +08:00
test_thread_token_usage.py fix(runs): expose active progress counters (#3148) 2026-05-22 21:42:14 +08:00
test_threads_router.py fix(threads): assign new checkpoint ID in update_thread_state (#2391) 2026-06-08 23:12:25 +08:00
test_tiktoken_cache_and_count_tokens.py feat(memory): add memory.token_counting config to avoid tiktoken network dependency (#3429) (#3465) 2026-06-10 23:26:15 +08:00
test_title_generation.py refactor: split backend into harness (deerflow.*) and app (app.*) (#1131) 2026-03-14 22:55:52 +08:00
test_title_middleware_core_logic.py fix(tracing): propagate session_id and user_id into Langfuse traces (#2944) 2026-05-21 16:49:31 +08:00
test_todo_middleware.py fix(todo): reuse thread state schema (#3206) 2026-05-26 23:58:08 +08:00
test_token_usage_config.py enable token usage by default (#2841) 2026-05-10 22:00:57 +08:00
test_token_usage_middleware.py feat: stream subagent token usage to header via terminal task events (#2882) 2026-05-13 23:52:19 +08:00
test_token_usage.py feat(harness): integration ACP agent tool (#1344) 2026-03-26 14:20:18 +08:00
test_tool_args_schema_no_pydantic_warning.py fix(tools): make write_file append discoverable in model-facing schema (#2843) 2026-05-10 23:09:03 +08:00
test_tool_deduplication.py fix(harness): wrap all async-only tools for sync clients (#2935) 2026-05-19 22:11:46 +08:00
test_tool_error_handling_middleware.py feat(subagents): extend deferred MCP tool loading to subagents (#3432) 2026-06-08 23:17:22 +08:00
test_tool_error_handling_subagent_stamp.py fix(subagent): structured subagent_status field over text parsing (#3146) (#3154) 2026-06-07 22:49:55 +08:00
test_tool_output_budget_middleware.py fix(middleware): externalize oversized tool output into sandbox for non-mounted sandboxes (#3417) 2026-06-08 12:24:48 +08:00
test_tool_output_truncation.py fix: add output truncation to ls_tool to prevent context window overflow (#1896) 2026-04-06 15:09:57 +08:00
test_tool_search.py feat(subagents): extend deferred MCP tool loading to subagents (#3432) 2026-06-08 23:17:22 +08:00
test_tracing_config.py fix(tracing): propagate session_id and user_id into Langfuse traces (#2944) 2026-05-21 16:49:31 +08:00
test_tracing_factory.py fix(tracing): propagate session_id and user_id into Langfuse traces (#2944) 2026-05-21 16:49:31 +08:00
test_tracing_metadata.py fix(tracing): propagate session_id and user_id into Langfuse traces (#2944) 2026-05-21 16:49:31 +08:00
test_update_agent_e2e_user_isolation.py fix(agents): make update_agent honor runtime.context user_id like setup_agent (#2867) 2026-05-12 23:18:54 +08:00
test_update_agent_tool.py fix(agents): harden update_agent null-like args (#3237) 2026-06-04 07:10:59 +08:00
test_uploads_manager.py fix(uploads): add Windows support for safe symlink-protected uploads (#2794) 2026-05-09 18:21:54 +08:00
test_uploads_middleware_core_logic.py fix(skills): harden slash skill activation across chat channels (#3466) 2026-06-09 23:07:17 +08:00
test_uploads_router.py fix upload file size contract (#3408) 2026-06-06 15:12:17 +08:00
test_user_context.py fix the lint error in backend 2026-04-26 15:09:25 +08:00
test_utils_time.py fix(gateway): return ISO 8601 timestamps from threads endpoints (#2599) 2026-05-02 15:16:16 +08:00
test_uvicorn_reload_exclude.py fix(dev): create backend/sandbox before uvicorn reload-exclude (#3459) (#3460) 2026-06-09 15:29:40 +08:00
test_view_image_middleware.py feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437) 2026-06-08 22:04:38 +08:00
test_view_image_tool.py fix(harness): constrain view_image to thread data paths (#2557) 2026-04-28 11:13:17 +08:00
test_vllm_provider.py feat(models): add vLLM provider support (#1860) 2026-04-06 15:18:34 +08:00
test_wait_disconnect_handling.py fix(gateway): honour on_disconnect on /wait endpoints (#3267) 2026-05-28 07:22:39 +08:00
test_wechat_channel.py feat: add WeChat channel integration (#1869) 2026-04-10 20:49:28 +08:00
test_worker_langfuse_metadata.py fix(tracing): propagate session_id and user_id into Langfuse traces (#2944) 2026-05-21 16:49:31 +08:00
test_write_file_tool_size_guard.py fix(#3189): prevent write_file streaming timeout on long reports (#3195) 2026-06-07 17:47:11 +08:00