deerflow2/backend/packages/harness/deerflow/agents/middlewares
DanielWalnut c1b7f1d189
feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801)
* feat(middleware): inject dynamic context via DynamicContextMiddleware

Move memory and current date out of the system prompt and into a
dedicated <system-reminder> HumanMessage injected once per session
(frozen-snapshot pattern) via a new DynamicContextMiddleware.

This keeps the system prompt byte-exact across all users and sessions,
enabling maximum Anthropic/Bedrock prefix-cache reuse.

Key design decisions:
- ID-swap technique: reminder takes the first HumanMessage's ID
  (replacing it in-place via add_messages), original content gets a
  derived `{id}__user` ID (appended after). Preserves correct ordering.
- hide_from_ui: True on reminder messages so frontend filters them out.
- Midnight crossing: date-update reminder injected before the current
  turn's HumanMessage when the conversation spans midnight.
- INFO-level logging for production diagnostics.

Also adds prompt-caching breakpoint budget enforcement tests and
updates ClaudeChatModel docs to reference the new pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(token-usage): log input/output token detail breakdown in middleware

Extend the LLM token usage log line to include input_token_details and
output_token_details (cache_creation, cache_read, reasoning, audio, etc.)
when present. Adds tests covering Anthropic cache detail logging from
both usage_metadata and response_metadata.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: fix nginx

* fix(middleware): always inject date; gate memory on injection_enabled

Date injection is now unconditional — it is part of the static system
prompt replacement and should always be present. Memory injection
remains gated by `memory.injection_enabled` in the app config.

Previously the entire DynamicContextMiddleware was skipped when
injection_enabled was False, which also suppressed the date.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): format files and correct test assertions for token usage middleware

- ruff format dynamic_context_middleware.py and test_claude_provider_prompt_caching.py
- Remove unused pytest import from test_dynamic_context_middleware.py
- Fix two tests that asserted response_metadata fallback logic that
  doesn't exist: replace with tests that match actual middleware behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(middleware): address Copilot review comments on DynamicContextMiddleware

- Use additional_kwargs flag for reminder detection instead of content
  substring matching, so user messages containing '<system-reminder>'
  are not mistakenly treated as injected reminders
- Generate stable UUID when original HumanMessage.id is None to prevent
  ambiguous 'None__user' derived IDs and message collisions
- Downgrade per-turn no-op log to DEBUG; keep actual injection events at INFO
- Add two new tests: missing-id UUID fallback and user-text false-positive

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 09:27:02 +08:00
..
__init__.py feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203) 2026-03-29 15:31:18 +08:00
clarification_middleware.py fix(backend): make clarification messages idempotent (#2350) (#2351) 2026-04-19 22:00:58 +08:00
dangling_tool_call_middleware.py fix(middleware): repair dangling tool-call history after loop interru… (#2035) 2026-04-12 19:11:22 +08:00
deferred_tool_filter_middleware.py fix: gate deferred MCP tool execution (#2513) 2026-04-24 22:45:41 +08:00
dynamic_context_middleware.py feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801) 2026-05-09 09:27:02 +08:00
llm_error_handling_middleware.py refactor: thread app_config through middleware factories (#2652) 2026-04-30 12:41:09 +08:00
loop_detection_middleware.py feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711) 2026-05-07 16:15:15 +08:00
memory_middleware.py refactor: thread app_config through lead and subagent task path (#2666) 2026-05-02 06:37:49 +08:00
sandbox_audit_middleware.py feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881) 2026-04-07 17:15:24 +08:00
subagent_limit_middleware.py fix(middleware): sync raw tool call metadata (#2757) 2026-05-08 10:08:53 +08:00
summarization_middleware.py fix(middleware): sync raw tool call metadata (#2757) 2026-05-08 10:08:53 +08:00
thread_data_middleware.py feat: enhance chat history loading with new hooks and UI components (#2338) 2026-04-26 11:20:17 +08:00
title_middleware.py refactor: thread app_config through lead and subagent task path (#2666) 2026-05-02 06:37:49 +08:00
todo_middleware.py fix(todo-middleware): prevent premature agent exit with incomplete todos (#2135) 2026-04-14 11:11:26 +08:00
token_usage_middleware.py feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801) 2026-05-09 09:27:02 +08:00
tool_call_metadata.py fix(middleware): sync raw tool call metadata (#2757) 2026-05-08 10:08:53 +08:00
tool_error_handling_middleware.py fix(subagents): use model override for tools and middleware (#2641) 2026-05-01 22:21:10 +08:00
uploads_middleware.py feat: enhance chat history loading with new hooks and UI components (#2338) 2026-04-26 11:20:17 +08:00
view_image_middleware.py fix(backend): preserve viewed image reducer metadata (#1900) 2026-04-06 16:47:19 +08:00