deerflow2

KKK 654354c624 test(skills): add evaluation + trigger analysis for systematic-literature-review (#2061)	2026-04-10 18:02:45 +08:00

* test(skills): add trigger eval set for systematic-literature-review skill

  20 eval queries (10 should-trigger, 10 should-not-trigger) for use with skill-creator's run_eval.py. Includes real-world SLR queries contributed by @VANDRANKI (issue #1862 author) and edge cases for routing disambiguation with academic-paper-review.

* test(skills): add grader expectations for SLR skill evaluation

  5 eval cases with 39 expectations covering:
  - Standard SLR flow (APA/BibTeX/IEEE format selection)
  - Keyword extraction and search behavior
  - Subagent dispatch for metadata extraction
  - Report structure (themes, convergences, gaps, per-paper annotations)
  - Negative case: single-paper routing to academic-paper-review
  - Edge case: implicit SLR without explicit keywords

* refactor(skills): shorten SLR description for better trigger rate

  Reduce description from 833 to 344 chars. Key changes:
  - Lead with "systematic literature review" as primary trigger phrase
  - Strengthen single-paper exclusion: "Not for single-paper tasks"
  - Remove verbose example patterns that didn't improve routing

  Tested with run_eval.py (10 runs/query):
  - False positive "best paper on RL": 67% → 20% (improved)
  - True positive explicit SLR query: ~30% (unchanged)

  Low recall is a routing-layer limitation, not a description issue; see PR description for full analysis.

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
academic-paper-review	feat(skills): add academic-paper-review, code-documentation, and newsletter-generation skills (#1861)	2026-04-05 10:19:35 +08:00
bootstrap	feat(agent):Supports custom agent and chat experience with refactoring (#957)	2026-03-03 21:32:01 +08:00
chart-visualization	feat(agent):Supports custom agent and chat experience with refactoring (#957)	2026-03-03 21:32:01 +08:00
claude-to-deerflow	feat: add claude-to-deerflow skill for DeerFlow API integration (#1024)	2026-03-08 22:06:24 +08:00
code-documentation	feat(skills): add academic-paper-review, code-documentation, and newsletter-generation skills (#1861)	2026-04-05 10:19:35 +08:00
consulting-analysis	fix(skill): enhance data authenticity protocols and clarify reporting guidelines (#905)	2026-02-25 22:25:23 +08:00
data-analysis	fix: use subprocess instead of os.system in analyze.py (#1289)	2026-03-24 20:42:03 +08:00
deep-research	feat(agent):Supports custom agent and chat experience with refactoring (#957)	2026-03-03 21:32:01 +08:00
find-skills	feat: add find-skills skill for discovering agent skills	2026-02-01 23:54:08 +08:00
frontend-design	refactor: refine skills	2026-01-21 21:22:56 +08:00
github-deep-research	feat: Support gitHub PAT configuration for higher github API accessing rate. (#1374)	2026-03-27 09:54:14 +08:00
image-generation	fix: issue 1138 windows encoding (#1139)	2026-03-16 16:53:12 +08:00
newsletter-generation	feat(skills): add academic-paper-review, code-documentation, and newsletter-generation skills (#1861)	2026-04-05 10:19:35 +08:00
podcast-generation	fix: add error handling for podcast generation failures (#1257)	2026-03-24 00:20:12 +08:00
ppt-generation	fix: issue 1138 windows encoding (#1139)	2026-03-16 16:53:12 +08:00
skill-creator	fix: issue 1138 windows encoding (#1139)	2026-03-16 16:53:12 +08:00
surprise-me	docs: update description for surprise-me skill to enhance clarity	2026-02-07 10:51:43 +08:00
systematic-literature-review	test(skills): add evaluation + trigger analysis for systematic-literature-review (#2061)	2026-04-10 18:02:45 +08:00
vercel-deploy-claimable	feat: use list of links	2026-02-02 13:25:21 +08:00
video-generation	fix: issue 1138 windows encoding (#1139)	2026-03-16 16:53:12 +08:00
web-design-guidelines	fix: fix skill md path	2026-01-20 21:10:05 +08:00