40 Commits

Author SHA1 Message Date
Willem Jiang e09f46601b fix(frontend): render all tool calls in the frontend #796 (#797) 2026-01-05 22:24:52 +08:00
Willem Jiang 99a44f1350 feat(eval): add report quality evaluation module and UI integration (#776)
* feat(eval): add report quality evaluation module

Addresses issue #773 - How to evaluate generated report quality objectively.

This module provides two evaluation approaches:
1. Automated metrics (no LLM required):
   - Citation count and source diversity
   - Word count compliance per report style
   - Section structure validation
   - Image inclusion tracking

2. LLM-as-Judge evaluation:
   - Factual accuracy scoring
   - Completeness assessment
   - Coherence evaluation
   - Relevance and citation quality checks

The combined evaluator provides a final score (1-10) and letter grade (A+ to F).

Files added:
- src/eval/__init__.py
- src/eval/metrics.py
- src/eval/llm_judge.py
- src/eval/evaluator.py
- tests/unit/eval/test_metrics.py
- tests/unit/eval/test_evaluator.py

* feat(eval): integrate report evaluation with web UI

This commit adds the web UI integration for the evaluation module:

Backend:
- Add EvaluateReportRequest/Response models in src/server/eval_request.py
- Add /api/report/evaluate endpoint to src/server/app.py

Frontend:
- Add evaluateReport API function in web/src/core/api/evaluate.ts
- Create EvaluationDialog component with grade badge, metrics display,
  and optional LLM deep evaluation
- Add evaluation button (graduation cap icon) to research-block.tsx toolbar
- Add i18n translations for English and Chinese

The evaluation UI allows users to:
1. View quick metrics-only evaluation (instant)
2. Optionally run deep LLM-based evaluation for detailed analysis
3. See grade (A+ to F), score (1-10), and metric breakdown

* feat(eval): improve evaluation reliability and add LLM judge tests

- Extract MAX_REPORT_LENGTH constant in llm_judge.py for maintainability
- Add comprehensive unit tests for LLMJudge class (parse_response,
  calculate_weighted_score, evaluate with mocked LLM)
- Pass reportStyle prop to EvaluationDialog for accurate evaluation criteria
- Add researchQueries store map to reliably associate queries with research
- Add getResearchQuery helper to retrieve query by researchId
- Remove unused imports in test_metrics.py

* fix(eval): use resolveServiceURL for evaluate API endpoint

The evaluateReport function was using a relative URL '/api/report/evaluate'
which sent requests to the Next.js server instead of the FastAPI backend.
Changed to use resolveServiceURL() consistent with other API functions.

* fix: improve type accuracy and React hooks in evaluation components

- Fix get_word_count_target return type from Optional[Dict] to Dict since it always returns a value via default fallback
- Fix useEffect dependency issue in EvaluationDialog using useRef to prevent unwanted re-evaluations
- Add aria-label to GradeBadge for screen reader accessibility
2025-12-25 21:55:48 +08:00
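The automated (no-LLM) metrics described in commit 99a44f1350 above can be illustrated with a small sketch. The commit's own implementation lives in src/eval/metrics.py and is not reproduced in this log, so everything below is an assumption made for illustration: the names `computeMetrics`, `ReportMetrics`, and `WordRange`, the choice to count markdown links as citations, and the word-count range passed in by the caller.

```typescript
// Hypothetical sketch of metrics-only report checks; not the project's code.
interface ReportMetrics {
  citationCount: number; // markdown links with an http(s) target
  uniqueDomains: number; // distinct hosts among those links (source diversity)
  wordCount: number; // whitespace-separated tokens
  wordCountInRange: boolean; // compliance with the caller-supplied target
}

// Assumed shape: the real per-style word targets are not given in the log.
interface WordRange {
  min: number;
  max: number;
}

function computeMetrics(report: string, range: WordRange): ReportMetrics {
  // Matches [label](http://...) or [label](https://...).
  const linkPattern = /\[[^\]]*\]\((https?:\/\/[^\s)]+)\)/g;
  const domains = new Set<string>();
  let citationCount = 0;

  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(report)) !== null) {
    citationCount += 1;
    const url = match[1] ?? "";
    // Extract the host part without relying on the URL global.
    const host = /^https?:\/\/([^\/?#]+)/.exec(url);
    if (host !== null) {
      domains.add((host[1] ?? "").toLowerCase());
    }
  }

  const wordCount = report.split(/\s+/).filter((w) => w.length > 0).length;

  return {
    citationCount,
    uniqueDomains: domains.size,
    wordCount,
    wordCountInRange: wordCount >= range.min && wordCount <= range.max,
  };
}
```

A combined evaluator like the one the commit describes would presumably fold such metrics into the final 1-10 score; the weighting is not stated in the log, so it is left out here.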
Jiahe Wu d1ce339090 feat(web): add multi-format report export (Markdown, HTML, PDF, Word,… (#756)
* feat(web): add multi-format report export (Markdown, HTML, PDF, Word, Image)

* fix: correct import order for docx (lint error)

* fix(web): address Copilot review comments for multi-format export

- Add i18n support for dropdown menu items (en/zh)
- Add DOMPurify for HTML sanitization (XSS protection)
- Fix async handling for canvas.toBlob with Promise wrapper
- Add toast notifications for export errors
- Fix Tooltip + DropdownMenuTrigger nesting (accessibility)
- Ensure container cleanup in finally block

* fix(web): enhance markdown parsing for PDF and Word export

- Add list support (bullet and numbered) for PDF export
- Add parseInlineMarkdown helper for Word export to handle bold, italic, code, links
- Add list support for Word export (bullet and numbered)
- Address Copilot review comments from PR #756

* fix(web): address PR review feedback for multi-format export

- Extract PDF formatting magic numbers into PDF_CONSTANTS
- Add Tooltip wrapper for download dropdown button
- Reduce triggerDownload cleanup timeout from 1000ms to 100ms
- Use marked.Lexer.lexInline for robust markdown parsing
- Add console.warn for image export cleanup errors
- Add numbering config for Word document ordered lists
- Fix CSS class typo: px-5pb-20 -> px-5 pb-20
- Remove unreachable dead code in parseInlineMarkdown

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-12-16 09:06:24 +08:00
agoudbg a5f6bde26f refactor: Welcome layout and conditional rendering (#690)
* refactor: Welcome layout and conditional rendering

Improves flex layout and spacing in ConversationStarter, and updates MessagesBlock to conditionally render ConversationStarter or MessageListView based on chat state. This streamlines the UI and removes redundant rendering logic.

* fix: replay mode

* fix: Remove unnecessary inset-0

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-11-21 09:27:14 +08:00
Willem Jiang 452ecd77b9 fix: prevent DOM error when removing temporary download link (#675) (#676)
Add defensive checks before removeChild to prevent 'Failed to execute removeChild' error when the element has already been removed from DOM. Wrap URL.revokeObjectURL in finally block to ensure proper resource cleanup.
2025-10-31 22:30:34 +08:00
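The defensive cleanup described in commit 452ecd77b9 above might look roughly like the sketch below. The actual patch is not shown in this log; `cleanupDownload`, `NodeLike`, and `ParentLike` are hypothetical names, and the structural types stand in for DOM nodes only so the sketch can run outside a browser.

```typescript
// Minimal structural stand-ins for DOM nodes (assumption, for portability).
interface ParentLike {
  removeChild(child: NodeLike): void;
}
interface NodeLike {
  parentNode: ParentLike | null;
}

// Removes a temporary download link if it is still attached, and always
// releases the object URL, even if removal throws.
function cleanupDownload(
  link: NodeLike,
  objectUrl: string,
  revoke: (url: string) => void,
): void {
  try {
    // Defensive check: the element may already have been removed.
    if (link.parentNode !== null) {
      link.parentNode.removeChild(link);
    }
  } finally {
    revoke(objectUrl);
  }
}
```

In a browser, `revoke` would typically be `(u) => URL.revokeObjectURL(u)` and `link` the temporary anchor element.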
Willem Jiang 6686a531bd fix: improve config loading resilience for non-localhost access (#510) (#658)
* fix: improve config loading resilience for non-localhost access (#510)

- Add DEFAULT_CONFIG fallback to always return valid config even if fetch fails
- Implement retry logic with exponential backoff (max 2 retries) to handle transient failures
- Add 5-second fetch timeout to prevent hanging on unreachable backends
- Improve error logging with clear messages about config fetch status
- Always return DeerFlowConfig (never null) to prevent UI rendering issues
- Add safety checks in input-box component to verify reasoning models before access
- Improve type safety: verify array length before accessing array indices
- Add comprehensive documentation in .env.example with examples for different deployment scenarios
- Document NEXT_PUBLIC_API_URL variable behavior and fallback mechanism

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: add nullish coalescing to prevent TypeScript error in input-box

- Add ?? operator to handle potential undefined value when accessing reasoning[0]
- Fixes TS2322 error: Type 'string | undefined' is not assignable to type 'string | number | Date'

---------

Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-26 07:34:12 +08:00
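As a rough illustration of the behaviour listed in commit 6686a531bd above (retries with exponential backoff, a timeout, and a default fallback so the loader never returns null), here is a minimal sketch. It is not the project's code: the `AppConfig` shape, the 500 ms base delay, and the helper names are assumptions, and the timeout is implemented by racing a timer rather than aborting the request.

```typescript
// Hypothetical config shape; the real DeerFlowConfig is not shown in the log.
interface AppConfig {
  models: { reasoning: string[] };
}

const DEFAULT_CONFIG: AppConfig = { models: { reasoning: [] } };

// Delay before retry `attempt` (0-based): baseMs * 2^attempt.
function backoffDelayMs(attempt: number, baseMs = 500): number {
  return baseMs * 2 ** attempt;
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Rejects if `p` does not settle within `ms` (does not cancel `p` itself).
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Always resolves to a usable config: fetched if possible, default otherwise.
async function loadConfig(
  fetchOnce: () => Promise<AppConfig>,
  maxRetries = 2,
  timeoutMs = 5000,
): Promise<AppConfig> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await withTimeout(fetchOnce(), timeoutMs);
    } catch (error) {
      console.warn(`config fetch attempt ${attempt + 1} failed`, error);
      if (attempt < maxRetries) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  return DEFAULT_CONFIG;
}

// Nullish coalescing guards against an empty reasoning-model list.
function firstReasoningModel(config: AppConfig): string {
  return config.models.reasoning[0] ?? "";
}
```

Passing the fetch step in as `fetchOnce` keeps the retry logic independent of how the request is made.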
Willem Jiang 3994949328 fix: parsed json with extra tokens issue (#656)
Fixes #598 

* fix: parsed json with extra tokens issue

* Added unit test for json.ts

* fix the json unit test running issue

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update the code with code review suggestion

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>
2025-10-26 07:24:25 +08:00
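Commit 3994949328 above does not say how the extra tokens after a JSON payload are handled. One possible approach, offered purely as an assumption, is to scan for the first balanced object or array and parse only that span; `parseLeadingJSON` is a hypothetical name and not necessarily what json.ts does.

```typescript
// Parses the first complete JSON object or array found in `text`,
// ignoring anything that follows it (for example stray trailing tokens).
function parseLeadingJSON(text: string): unknown {
  const start = text.search(/[{[]/);
  if (start === -1) {
    throw new SyntaxError("no JSON object or array found");
  }

  let depth = 0;
  let inString = false;
  let escaped = false;

  for (let i = start; i < text.length; i++) {
    const ch = text.charAt(i);

    if (inString) {
      // Inside a string, brackets are data; only track escapes and the closing quote.
      if (escaped) {
        escaped = false;
      } else if (ch === "\\") {
        escaped = true;
      } else if (ch === '"') {
        inString = false;
      }
      continue;
    }

    if (ch === '"') {
      inString = true;
    } else if (ch === "{" || ch === "[") {
      depth += 1;
    } else if (ch === "}" || ch === "]") {
      depth -= 1;
      if (depth === 0) {
        // Balanced span found; JSON.parse still validates its contents.
        return JSON.parse(text.slice(start, i + 1));
      }
    }
  }

  throw new SyntaxError("unterminated JSON value");
}
```

The scan only decides where the value ends; malformed content inside the span still raises from `JSON.parse`.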
Willem Jiang d0d1573707 fix: react key warnings from duplicate message IDs + establish jest testing framework (#655)
* fix: resolve issue #588 - react key warnings from duplicate message IDs + establish jest testing framework

* Update the makefile and workflow with the js test

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-25 20:46:43 +08:00
Willem Jiang 19a90d59eb fix: handle non-string tool results to fix #631 (#633)
- Backend: Convert non-string content (lists, dicts) to JSON strings in _create_event_stream_message to ensure frontend always receives string content
- Frontend: Add type guard before calling startsWith() on toolCall.result for defensive programming

This fixes the TypeError: toolCall.result.startsWith is not a function when tools return complex objects.
2025-10-20 23:10:58 +08:00
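The two halves of the fix in commit 19a90d59eb above (serialize non-string content, and guard before calling `startsWith`) can be sketched as follows. The backend change is in Python in the project; this TypeScript version only illustrates the idea, and both function names and the `"Error"` prefix are hypothetical.

```typescript
// Ensures the consumer always receives a string: lists and objects are
// serialized to JSON, strings pass through unchanged.
function toStringContent(content: unknown): string {
  if (typeof content === "string") {
    return content;
  }
  // JSON.stringify returns undefined at runtime for inputs such as undefined.
  return JSON.stringify(content) ?? "";
}

// Type guard before startsWith, so a non-string result cannot raise
// "startsWith is not a function". The prefix checked here is an assumption.
function looksLikeError(result: unknown): boolean {
  return typeof result === "string" && result.startsWith("Error");
}
```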
Willem Jiang a0a0057c86 fix: optimize animations to prevent browser freeze with many research steps (#630)
Fixes #570 where browser freezes when research plan has 8+ steps.

Performance optimizations:
- Add animation throttling: only animate first 10 activity items
- Reduce animation durations (0.4s → 0.3s for activities, 0.2s → 0.15s for results)
- Remove scale animations (GPU-intensive) from search results
- Limit displayed results (20 pages, 10 images max)
- Add conditional animations based on item index
- Cap animation delays to prevent excessive staggering
- Add React.memo to ActivityMessage and ActivityListItem components

These changes significantly improve performance when rendering multiple
research steps while maintaining visual appeal for smaller lists.
2025-10-19 19:24:57 +08:00
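The throttling rules listed in commit a0a0057c86 above can be expressed as a small pure function. The limit of 10 animated items, the 0.3 s duration, and the caps of 20 pages and 10 images come from the commit message; the stagger step, the delay cap, and all names are assumptions, since the log does not give them.

```typescript
const MAX_ANIMATED_ITEMS = 10; // from the commit message
const ACTIVITY_DURATION_S = 0.3; // from the commit message
const STAGGER_STEP_S = 0.05; // assumption: per-item stagger
const MAX_DELAY_S = 0.3; // assumption: cap on the staggered delay
const MAX_PAGES = 20; // from the commit message
const MAX_IMAGES = 10; // from the commit message

interface AnimationPlan {
  animate: boolean;
  durationS: number;
  delayS: number;
}

// Items past the limit render statically; earlier items get a capped delay.
function planAnimation(index: number): AnimationPlan {
  if (index >= MAX_ANIMATED_ITEMS) {
    return { animate: false, durationS: 0, delayS: 0 };
  }
  return {
    animate: true,
    durationS: ACTIVITY_DURATION_S,
    delayS: Math.min(index * STAGGER_STEP_S, MAX_DELAY_S),
  };
}

// Caps how many results are rendered at all.
function limitResults<T>(items: T[], max: number): T[] {
  return items.slice(0, max);
}
```

A list component could call `planAnimation(index)` per item and `limitResults(pages, MAX_PAGES)` or `limitResults(images, MAX_IMAGES)` before mapping.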
Willem Jiang a6451251d8 fix: prevent repeated content animation during thinking streaming (#614) (#623)
* fix: prevent repeated content animation during thinking streaming (#614)

- Implement chunked rendering using reasoningContentChunks
- Static content (previous chunks) renders without animation
- Only current streaming chunk animates
- Disable animation on plan content (title, thought, steps) during streaming
- Animation applies after content finishes streaming (when complete)
- Prevents visual duplication of repeated sentences in thinking process
2025-10-16 19:48:05 +08:00
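The chunked rendering idea in commit a6451251d8 above, where earlier chunks are static and only the chunk currently streaming animates, might be modelled like this. `planChunks` and `RenderChunk` are hypothetical names; only `reasoningContentChunks` appears in the commit message, and the component code itself is not shown in the log.

```typescript
interface RenderChunk {
  text: string;
  animated: boolean;
}

// Marks only the last chunk as animated, and only while streaming.
// All earlier chunks are rendered statically so they never re-animate.
function planChunks(reasoningContentChunks: string[], streaming: boolean): RenderChunk[] {
  const lastIndex = reasoningContentChunks.length - 1;
  return reasoningContentChunks.map((text, index) => ({
    text,
    animated: streaming && index === lastIndex,
  }));
}
```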
Willem Jiang fd244d8bf6 fix: add unique key prop to conversation starter list items (#619)
- Changed key from question text to combination of index and question text
- Ensures unique keys even if translation has duplicate questions
- Resolves React warning: 'Each child in a list should have a unique key prop'
2025-10-16 18:24:36 +08:00
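The key change described in commit fd244d8bf6 above is small enough to show directly. This is a sketch of the idea rather than the component code; `starterKeys` is a hypothetical helper and the exact key format in the project may differ.

```typescript
// Combines index and text so keys stay unique even when two
// (possibly translated) questions are identical.
function starterKeys(questions: string[]): string[] {
  return questions.map((question, index) => `${index}-${question}`);
}
```

With text-only keys, two identical questions would collide and trigger the React warning quoted in the commit; adding the index avoids the collision.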
CHANGXUBO 84c2eec410 feat: Implement Milvus retriever for RAG (#516)
* feat: Implement MilvusRetriever with embedding model and resource management

* chore: Update configuration and loader files for consistency

* chore: Clean up test_milvus.py for improved readability and organization

* feat: Add tests for DashscopeEmbeddings query and document embedding methods

* feat: Add tests for embedding model initialization and example file loading in MilvusProvider

* chore: Remove unused imports and clean up test_milvus.py for better readability

* fix: replace print statements with logging in recursion limit function

* Implement feature X to enhance user experience and optimize performance

* refactor: clean up unused imports and comments in AboutTab component

* Implement feature X to enhance user experience and fix bug Y in module Z

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-09-12 22:20:55 +08:00
Willem Jiang f4d70caf2c Fix: build of front end of #466 (#530) 2025-08-21 23:25:52 +08:00
道心坚定韩道友 9c6fa30667 FIX/Adapt message box to handle long text in frontend (#466)
* fix:ui

* fix:ui bug

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-08-21 10:31:54 +08:00
johnny0120 5e1b2dfb22 feat: add i18n support and add Chinese (#372)
* feat: add i18n support and add Chinese

* fix: resolve conflicts

* Update en.json with cancel settings

* Update zh.json with settings cancel

---------

Co-authored-by: johnny0120 <15564476+johnny0120@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>
2025-07-12 15:18:28 +08:00
JeffJiang e4411d80d7 fix: next server fetch error (#374) 2025-06-27 14:23:04 +08:00
DanielWalnut b6edc0fc8a feat: add deep think feature (#311)
* feat: implement backend logic

* feat: implement api/config endpoint

* rename the symbol

* feat: re-implement configuration at client-side

* feat: add client-side of deep thinking

* fix backend bug

* feat: add reasoning block

* docs: update readme

* fix: translate into English

* fix: change icon to lightbulb

* feat: ignore more bad cases

* feat: adjust thinking layout, and implement auto scrolling

* docs: add comments

---------

Co-authored-by: Henry Li <henry1943@163.com>
2025-06-14 13:12:43 +08:00
Muharrem Okutan 446901ec0b feat: added report download button (#78) 2025-06-11 09:50:48 +08:00
DanielWalnut ffa6604297 feat: implement enhance prompt (#294)
* feat: implement enhance prompt

* add unit test

* fix prompt

* fix: fix eslint and compiling issues

* feat: add border-beam animation

* fix: fix importing issues

---------

Co-authored-by: Henry Li <henry1943@163.com>
2025-06-08 19:41:59 +08:00
DanielWalnut 01b1c21044 feat: support to adjust writing style (#290)
* feat: implement backend for adjusting report style

* feat: add web part

* fix test cases

* fix: fix typing

---------

Co-authored-by: Henry Li <henry1943@163.com>
2025-06-07 20:48:39 +08:00
SToneX 3dbf0e36ae feat(chat): add animated deer to response indicator (#269) 2025-05-31 19:13:13 +08:00
JeffJiang 15cdfe631e feat: rag retrieving tool call result display (#263)
* feat: local search tool call result display

* chore: add file copyright

* fix: missing edit plan interrupt feedback

* feat: disable pasting html into input box
2025-05-29 19:52:34 +08:00
JeffJiang d2ffcc5dfb fix: editing plan style (#261) 2025-05-29 10:46:05 +08:00
JeffJiang 0210729ad3 fix: message block width (#257) 2025-05-28 19:11:20 +08:00
JeffJiang 96fd196baa feat: RAG Integration (#238)
* feat: add rag provider and retriever

* feat: retriever tool

* feat: add retriever tool to the researcher node

* feat: add rag http apis

* feat: new message input supports resource mentions

* feat: new message input component support resource mentions

* refactor: need_web_search to need_search

* chore: RAG integration docs

* chore: change example api host

* fix: user message color in dark mode

* fix: mentions style

* feat: add local_search_tool to researcher prompt

* chore: research prompt

* fix: ragflow page size and reporter with

* docs: ragflow integration and add acknowledgment projects

* chore: format
2025-05-28 14:13:46 +08:00
Leo Hui fac1376df0 feat: refactor crawler trust link style (#166)
* feat: refactor crawler trust link style

* feat: enhance link credibility checks in Markdown and related components
2025-05-15 17:17:10 +08:00
JeffJiang c99513d604 fix: report editor styles (#163)
* fix: report editor styles
2025-05-15 15:18:01 +08:00
JeffJiang 0f9ac7cd23 Check whether output links are AI hallucinations (#139)
* feat: check whether output links are AI hallucinations
2025-05-15 10:39:53 +08:00
Henry Li 239098bb20 feat: add python result and error handling (#141) 2025-05-14 03:47:28 -07:00
Henry Li de014db3a5 fix: fix compiling issues 2025-05-13 08:57:09 +08:00
Henry Li be5159a4c1 feat: use number ticker to display star count (#89) 2025-05-12 23:15:43 +08:00
Henry Li fda3b53317 fix: add error handling for podcast generation (#59)
Co-authored-by: Jiang Feng <jiangfeng.11@bytedance.com>
2025-05-12 20:56:38 +08:00
JeffJiang 560ecad165 perf: message render performance (#81)
* fix: message card always unmount when messages change

* perf: add useShallow for complex store selector
2025-05-12 20:21:54 +08:00
Leo Hui 54ee700bb8 feat: show repo star on site-header (#27)
* feat: show repo star on site-header

* feat: add GITHUB_OAUTH_TOKEN to environment configuration

* feat: remove comment

* feat: show star counter only in website
2025-05-12 14:36:50 +08:00
Henry Li fdf5208dc5 refactor: rename to `animated` 2025-05-12 11:59:24 +08:00
Henry Li 8109de3b73 fix: allow the first activity to be reporting (#8) 2025-05-09 10:32:49 +08:00
Nonoroazoro 5bbaa30b42 fix: auto-scrolling to the bottom occasionally fails when toggling research (#7) 2025-05-08 19:49:56 +08:00
Li Xin 03bf6f1f07 feat: enhance replay mode in static website 2025-05-08 09:53:09 +08:00
Li Xin fdfc607747 refactor: extract `components` folder 2025-05-02 10:43:14 +08:00