Commit Graph

35 Commits

Author SHA1 Message Date
Willem Jiang 99a44f1350 feat(eval): add report quality evaluation module and UI integration (#776)
* feat(eval): add report quality evaluation module

Addresses issue #773 - How to evaluate generated report quality objectively.

This module provides two evaluation approaches:
1. Automated metrics (no LLM required):
   - Citation count and source diversity
   - Word count compliance per report style
   - Section structure validation
   - Image inclusion tracking

2. LLM-as-Judge evaluation:
   - Factual accuracy scoring
   - Completeness assessment
   - Coherence evaluation
   - Relevance and citation quality checks

The combined evaluator provides a final score (1-10) and letter grade (A+ to F).

Files added:
- src/eval/__init__.py
- src/eval/metrics.py
- src/eval/llm_judge.py
- src/eval/evaluator.py
- tests/unit/eval/test_metrics.py
- tests/unit/eval/test_evaluator.py

* feat(eval): integrate report evaluation with web UI

This commit adds the web UI integration for the evaluation module:

Backend:
- Add EvaluateReportRequest/Response models in src/server/eval_request.py
- Add /api/report/evaluate endpoint to src/server/app.py

Frontend:
- Add evaluateReport API function in web/src/core/api/evaluate.ts
- Create EvaluationDialog component with grade badge, metrics display,
  and optional LLM deep evaluation
- Add evaluation button (graduation cap icon) to research-block.tsx toolbar
- Add i18n translations for English and Chinese

The evaluation UI allows users to:
1. View quick metrics-only evaluation (instant)
2. Optionally run deep LLM-based evaluation for detailed analysis
3. See grade (A+ to F), score (1-10), and metric breakdown

* feat(eval): improve evaluation reliability and add LLM judge tests

- Extract MAX_REPORT_LENGTH constant in llm_judge.py for maintainability
- Add comprehensive unit tests for LLMJudge class (parse_response,
  calculate_weighted_score, evaluate with mocked LLM)
- Pass reportStyle prop to EvaluationDialog for accurate evaluation criteria
- Add researchQueries store map to reliably associate queries with research
- Add getResearchQuery helper to retrieve query by researchId
- Remove unused imports in test_metrics.py

* fix(eval): use resolveServiceURL for evaluate API endpoint

The evaluateReport function was using a relative URL '/api/report/evaluate'
which sent requests to the Next.js server instead of the FastAPI backend.
Changed to use resolveServiceURL() consistent with other API functions.

* fix: improve type accuracy and React hooks in evaluation components

- Fix get_word_count_target return type from Optional[Dict] to Dict since it always returns a value via default fallback
- Fix useEffect dependency issue in EvaluationDialog using useRef to prevent unwanted re-evaluations
- Add aria-label to GradeBadge for screen reader accessibility
2025-12-25 21:55:48 +08:00
Jiahe Wu 7f9563bbca feat(web): add enable_web_search frontend UI (#681) (#766)
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-12-17 23:36:32 +08:00
Willem Jiang d0d1573707 fix: react key warnings from duplicate message IDs + establish jest testing framework (#655)
* fix: resolve issue #588 - react key warnings from duplicate message IDs + establish jest testing framework

* Update the makefile and workflow with the js test

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-25 20:46:43 +08:00
Qiyuan Jiao 2aa07f93c2 fix: Optimize the performance of stream data processing and add anti-… (#642)
* fix: Optimize the performance of stream data processing and add anti-shake and batch update mechanisms

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* 修复消息批量更新重复问题

- 将 pendingUpdates 从数组改为 Map,使用 message.id 作为键
- 避免在16ms窗口内多次更新同一消息导致的重复处理
- 优化了批量更新性能,减少冗余的映射操作

* fix lint error

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-10-22 23:08:18 +08:00
jimmyuconn1982 1ecf7859e7 fix: add max_clarification_rounds parameter passing from frontend to backend (#616)
Bug Fix
This PR fixes the issue where max_clarification_rounds parameter was not being passed from the frontend to the backend, causing a TypeError: '<' not supported between instances of 'int' and 'NoneType' error.

Technical Details
The issue was that the frontend was not passing the max_clarification_rounds parameter to the backend API, causing the backend to receive None values and fail during comparison operations. This fix ensures the parameter is properly typed and passed through the entire request chain.
2025-10-14 17:56:20 +08:00
jimmyuconn1982 fabec4abb6 feat: Add intelligent clarification feature in coordinate step for research queries (#613)
* fix: support local models by making thought field optional in Plan model

- Make thought field optional in Plan model to fix Pydantic validation errors with local models
- Add Ollama configuration example to conf.yaml.example
- Update documentation to include local model support
- Improve planner prompt with better JSON format requirements

Fixes local model integration issues where models like qwen3:14b would fail
due to missing thought field in JSON output.

* feat: Add intelligent clarification feature for research queries

- Add multi-turn clarification process to refine vague research questions
- Implement three-dimension clarification standard (Tech/App, Focus, Scope)
- Add clarification state management in coordinator node
- Update coordinator prompt with detailed clarification guidelines
- Add UI settings to enable/disable clarification feature (disabled by default)
- Update workflow to handle clarification rounds recursively
- Add comprehensive test coverage for clarification functionality
- Update documentation with clarification feature usage guide

Key components:
- src/graph/nodes.py: Core clarification logic and state management
- src/prompts/coordinator.md: Detailed clarification guidelines
- src/workflow.py: Recursive clarification handling
- web/: UI settings integration
- tests/: Comprehensive test coverage
- docs/: Updated configuration guide

* fix: Improve clarification conversation continuity

- Add comprehensive conversation history to clarification context
- Include previous exchanges summary in system messages
- Add explicit guidelines for continuing rounds in coordinator prompt
- Prevent LLM from starting new topics during clarification
- Ensure topic continuity across clarification rounds

Fixes issue where LLM would restart clarification instead of building upon previous exchanges.

* fix: Add conversation history to clarification context

* fix: resolve clarification feature message to planer, prompt, test issues

- Optimize coordinator.md prompt template for better clarification flow
- Simplify final message sent to planner after clarification
- Fix API key assertion issues in test_search.py

* fix: Add configurable max_clarification_rounds and comprehensive tests

- Add max_clarification_rounds parameter for external configuration
- Add comprehensive test cases for clarification feature in test_app.py
- Fixes issues found during interactive mode testing where:
  - Recursive call failed due to missing initial_state parameter
  - Clarification exited prematurely at max rounds
  - Incorrect logging of max rounds reached

* Move clarification tests to test_nodes.py and add max_clarification_rounds to zh.json
2025-10-14 13:35:57 +08:00
HagonChan 4de5724389 feat: add strategic_investment report style (#595)
* add strategic_investment mode

* make format

* make lint

* fix: repair
lint-frontend
2025-09-24 09:50:36 +08:00
Anoyer-lzh 660a93338a fix: env parameters exception when configuring SSE or HTTP MCP server (#513)
* fix: _create_streamable_http_session() got an unexpected keyword argument 'env'

fix unit error

* update md

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2025-08-20 17:23:57 +08:00
Willem Jiang 20ec3f3bc4 fix: build of the web (#492) 2025-07-31 11:54:37 +08:00
Willem Jiang fca0ae8fcb fix:try to fix the docker build of front-end (#487) 2025-07-30 09:52:53 +08:00
DanielWalnut b6edc0fc8a feat: add deep think feature (#311)
* feat: implement backend logic

* feat: implement api/config endpoint

* rename the symbol

* feat: re-implement configuration at client-side

* feat: add client-side of deep thinking

* fix backend bug

* feat: add reasoning block

* docs: update readme

* fix: translate into English

* fix: change icon to lightbulb

* feat: ignore more bad cases

* feat: adjust thinking layout, and implement auto scrolling

* docs: add comments

---------

Co-authored-by: Henry Li <henry1943@163.com>
2025-06-14 13:12:43 +08:00
DanielWalnut 01b1c21044 feat: support to adjust writing style (#290)
* feat: implment backend for adjust report style

* feat: add web part

* fix test cases

* fix: fix typing

---------

Co-authored-by: Henry Li <henry1943@163.com>
2025-06-07 20:48:39 +08:00
JeffJiang 96fd196baa feat: RAG Integration (#238)
* feat: add rag provider and retriever

* feat: retriever tool

* feat: add retriever tool to the researcher node

* feat: add rag http apis

* feat: new message input supports resource mentions

* feat: new message input component support resource mentions

* refactor: need_web_search to need_search

* chore: RAG integration docs

* chore: change example api host

* fix: user message color in dark mode

* fix: mentions style

* feat: add local_search_tool to researcher prompt

* chore: research prompt

* fix: ragflow page size and reporter with

* docs: ragflow integration and add acknowledgment projects

* chore: format
2025-05-28 14:13:46 +08:00
DanielWalnut bc95f4fca6 feat: config max_search_results for search engine (#192)
* feat: implement UI

* feat: config max_search_results for search engine via api

---------

Co-authored-by: Henry Li <henry1943@163.com>
2025-05-18 13:23:52 +08:00
JeffJiang 0f9ac7cd23 Check the output links are hallucinations from AI (#139)
* feat: check output links if a hallucination from AI
2025-05-15 10:39:53 +08:00
Henry Li fda3b53317 fix: add error handling for podcast generation (#59)
Co-authored-by: Jiang Feng <jiangfeng.11@bytedance.com>
2025-05-12 20:56:38 +08:00
JeffJiang 560ecad165 pref: message render performence (#81)
* fix: message card always unmount when messages change

* pref: add useShallow for complex store selector
2025-05-12 20:21:54 +08:00
Henry Li 8109de3b73 fix: allow the first activity to be reporting (#8) 2025-05-09 10:32:49 +08:00
Li Xin f841df834f feat: add error handling 2025-04-30 21:09:14 +08:00
Li Xin 0a0f705aca feat: enable investigation mode 2025-04-30 19:53:14 +08:00
Yu Chao 2dad50a95f feat: add openResearch and closeResearch 2025-04-28 15:26:26 +08:00
Yu Chao fa7d38db4b feat: move getChatStreamSettings to settings store 2025-04-28 14:59:46 +08:00
Yu Chao 26b123dd03 feat: handle exception 2025-04-28 14:16:31 +08:00
Yu Chao 1610c2ff4e feat: extract getChatStreamSettings 2025-04-28 12:11:37 +08:00
Yu Chao 834d5221ca feat: clear ongoingResearchId on failure 2025-04-28 11:38:40 +08:00
Li Xin cb4b7b7495 feat: add `Replay Mode` 2025-04-24 21:51:08 +08:00
Li Xin f5cfd6b9d5 feat: change the default value 2025-04-24 17:27:37 +08:00
Li Xin 1755128209 feat: add `autoAcceptedPlan` to options 2025-04-24 17:24:58 +08:00
Li Xin cd6afcdd98 fix: fix import order 2025-04-24 17:17:27 +08:00
Li Xin 10b1d63834 feat: implement MCP UIs 2025-04-24 15:41:33 +08:00
Li Xin 1e7036ed90 chore: add settings store 2025-04-23 21:18:33 +08:00
Li Xin 5ee1632489 feat: extract parseJSON() 2025-04-22 11:02:18 +08:00
Li Xin 2f06f0c433 feat: enable podcast 2025-04-19 22:11:57 +08:00
He Tao fd85115f6f chore: add license header for web 2025-04-17 14:26:41 +08:00
Li Xin fd7a803753 chore: merge with web UI project 2025-04-17 12:02:23 +08:00