# External Integrations Audit (Tech Focus)
**Analysis date:** 2026-04-07
## APIs and External Services
**LLM providers (switched dynamically via configuration):**
- OpenAI / Anthropic / Gemini / DeepSeek / MiniMax / OpenRouter (examples under `models` in `config.example.yaml`)
- SDK/adapter layer: `langchain_openai`, `langchain_anthropic`, `langchain_google_genai`, `langchain_deepseek` (`backend/packages/harness/pyproject.toml`)
- Authentication: model fields in `config.yaml` support `$ENV_VAR` injection (`backend/packages/harness/deerflow/config/app_config.py`)
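
The `$ENV_VAR` injection mentioned above can be pictured with a minimal sketch. It assumes that only whole-string values of the form `$NAME` are treated as placeholders; the function name `resolve_env_placeholders` and the sample model entry are hypothetical, and the actual logic in `app_config.py` may differ.

```python
import os
from typing import Any, Mapping


def resolve_env_placeholders(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace string values of the form "$NAME" with env["NAME"].

    Assumption: only whole-value placeholders are handled, not "$NAME"
    embedded inside a longer string.
    """
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        if name not in env:
            raise KeyError(f"environment variable {name!r} is not set")
        return env[name]
    if isinstance(value, dict):
        return {key: resolve_env_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_placeholders(item, env) for item in value]
    return value


# Hypothetical model entry, shaped like a parsed YAML mapping.
model_entry = {"name": "example-model", "api_key": "$OPENAI_API_KEY", "max_tokens": 4096}
resolved = resolve_env_placeholders(model_entry, {"OPENAI_API_KEY": "sk-test"})
```

Failing fast on a missing variable is a design choice of this sketch: a misconfigured deployment stops at load time instead of sending a literal `$OPENAI_API_KEY` string to a provider.
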
**MCP (Model Context Protocol) services:**
- Three transports are supported: `stdio` / `sse` / `http` (`backend/packages/harness/deerflow/mcp/client.py`)
- Management endpoint: `GET/PUT /api/mcp/config` (`backend/app/gateway/routers/mcp.py`)
- Configuration file: `extensions_config.json` (`backend/packages/harness/deerflow/config/extensions_config.py`)
- OAuth: HTTP/SSE MCP servers can enable automatic token refresh (`backend/packages/harness/deerflow/mcp/oauth.py`)
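
To make the three transports concrete, here is a small validation sketch for a single MCP server entry. The field names `type`, `command`, and `url` are assumptions made for illustration and are not taken from `extensions_config.py`; only the transport names come from the audit.

```python
from typing import Any

VALID_TRANSPORTS = {"stdio", "sse", "http"}


def validate_mcp_server(name: str, entry: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the entry looks usable."""
    problems: list[str] = []
    transport = entry.get("type", "stdio")  # assumed default, not confirmed by the source
    if transport not in VALID_TRANSPORTS:
        problems.append(f"{name}: unknown transport {transport!r}")
    elif transport == "stdio" and not entry.get("command"):
        # A stdio server is launched as a local subprocess.
        problems.append(f"{name}: stdio transport needs a 'command'")
    elif transport in {"sse", "http"} and not entry.get("url"):
        # Remote transports need an endpoint to connect to.
        problems.append(f"{name}: {transport} transport needs a 'url'")
    return problems
```

A check like this could run before a `PUT /api/mcp/config` payload is persisted, so that an invalid entry is rejected instead of failing later at connection time.
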
**Web search and scraping:**
- DuckDuckGo: `ddgs`, no API key required (`backend/packages/harness/deerflow/community/ddg_search/tools.py`)
- Jina Reader: `https://r.jina.ai/` (optional `JINA_API_KEY`; `backend/packages/harness/deerflow/community/jina_ai/jina_client.py`)
- Tavily: configurable `api_key` (`backend/packages/harness/deerflow/community/tavily/tools.py`)
- Firecrawl: configurable `api_key` (`backend/packages/harness/deerflow/community/firecrawl/tools.py`)
- InfoQuest: `INFOQUEST_API_KEY` (`backend/packages/harness/deerflow/community/infoquest/infoquest_client.py`)
**IM channels:**
- Feishu/Lark, Slack, Telegram, WeCom (`backend/app/channels/*.py`)
- Feishu: `app_id`/`app_secret`
- Slack: `bot_token`/`app_token`
- Telegram: `bot_token`
- WeCom: `bot_id`/`bot_secret`
**Frontend-to-backend interfaces:**
- The frontend calls the gateway REST API directly: `/api/models`, `/api/memory`, `/api/skills`, `/api/mcp/config`, `/api/threads/*/uploads` (`frontend/src/core/*/api.ts`)
- The frontend calls the LangGraph API through `@langchain/langgraph-sdk` (`frontend/src/core/api/api-client.ts`)
**Conclusion:**
- Integrations are mainly configuration-driven and decoupled through an adapter layer; new third-party services should be wired in through `config.yaml` / `extensions_config.json` first, avoiding hard-coding.
## Data Storage
**Session state and persistence:**
- Checkpointer backends: `memory` / `sqlite` / `postgres` (`backend/packages/harness/deerflow/config/checkpointer_config.py`)
- The default example uses SQLite (the `checkpointer` section of `config.example.yaml`)
- The synchronous Store uses the same backend type as the checkpointer (`backend/packages/harness/deerflow/runtime/store/provider.py`)
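
The relationship between the checkpointer type and the Store type can be sketched as follows. `CheckpointerSettings`, its `connection_string` field, and `store_type_for` are hypothetical names; the sketch only models the rule that the Store mirrors the checkpointer backend and does not reproduce the real `checkpointer_config.py`.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORTED_BACKENDS = ("memory", "sqlite", "postgres")


@dataclass(frozen=True)
class CheckpointerSettings:
    type: str = "sqlite"                      # matches the default shown in the example config
    connection_string: Optional[str] = None   # assumed field name

    def validate(self) -> None:
        if self.type not in SUPPORTED_BACKENDS:
            raise ValueError(f"unsupported checkpointer type: {self.type!r}")
        if self.type == "postgres" and not self.connection_string:
            raise ValueError("postgres checkpointer requires a connection string")


def store_type_for(settings: CheckpointerSettings) -> str:
    """The Store follows the checkpointer: same backend type, no separate switch."""
    settings.validate()
    return settings.type
```

Deriving the Store backend from one setting avoids a configuration where checkpoints go to Postgres while the Store still writes to a local SQLite file.
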
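**File and artifact storage:**
- Uploads and artifacts are stored under local filesystem paths (`backend/app/gateway/routers/uploads.py`, `backend/app/gateway/routers/artifacts.py`, `backend/packages/harness/deerflow/uploads/manager.py`)
**Caching:**
- No standalone cache service such as Redis or Memcached was detected; the code relies mainly on in-process caches and singletons (for example the config cache and the client cache; see `backend/packages/harness/deerflow/config/*.py`, `frontend/src/core/api/api-client.ts`)
**Conclusion:**
- The default setup runs on a single machine (SQLite + local files); for multi-instance deployments, switch to the Postgres checkpointer/store first and move file storage to an external service.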
## Authentication and Authorization
**Frontend authentication:**
- `better-auth` (`frontend/src/server/better-auth/config.ts`, `frontend/src/app/api/auth/[...all]/route.ts`)
- The current configuration enables `emailAndPassword`; the GitHub-related variables are optional (`frontend/src/env.js`)
**MCP authorization:**
- OAuth for HTTP/SSE MCP servers supports the `client_credentials` and `refresh_token` grants (`backend/packages/harness/deerflow/mcp/oauth.py`)
- `headers`/`env`/`oauth` can be configured per MCP server (`backend/packages/harness/deerflow/config/extensions_config.py`)
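
The two OAuth grants named above differ only in the form fields sent to the token endpoint. The sketch below builds those fields and decides when a cached token should be refreshed. `CachedToken`, `needs_refresh`, `build_token_request`, and the 60-second skew are illustrative assumptions, not the contents of `oauth.py`; sending client credentials in the request body is one of the options allowed by RFC 6749, with HTTP Basic authentication being the other.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedToken:
    access_token: str
    expires_at: float                      # absolute UNIX time
    refresh_token: Optional[str] = None


def needs_refresh(token: Optional[CachedToken], now: float, skew_seconds: float = 60.0) -> bool:
    """Refresh slightly early so a request never carries a token that expires in flight."""
    return token is None or now >= token.expires_at - skew_seconds


def build_token_request(grant_type: str, client_id: str, client_secret: str,
                        refresh_token: Optional[str] = None) -> dict[str, str]:
    """Form fields for the token endpoint (application/x-www-form-urlencoded)."""
    fields = {"grant_type": grant_type, "client_id": client_id, "client_secret": client_secret}
    if grant_type == "client_credentials":
        return fields
    if grant_type == "refresh_token":
        if not refresh_token:
            raise ValueError("refresh_token grant requires a refresh token")
        fields["refresh_token"] = refresh_token
        return fields
    raise ValueError(f"unsupported grant type: {grant_type!r}")
```

Keeping the clock as an explicit `now` argument makes the expiry logic testable without patching `time.time()`.
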
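**Conclusion:**
- Authentication splits into two tracks, frontend session authentication and backend integration credentials; plan them separately and avoid sharing one secret domain between them.
## Monitoring and Observability
**Tracing:**
- Both LangSmith (`LANGSMITH_*` / `LANGCHAIN_*`) and Langfuse (`LANGFUSE_*`) are supported (`backend/packages/harness/deerflow/config/tracing_config.py`)
- Callbacks are attached when the model is created (`backend/packages/harness/deerflow/tracing/factory.py`, `backend/packages/harness/deerflow/models/factory.py`)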
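
A rough sketch of how the dual-provider switch could be derived from the environment is shown below. The variable names are the ones this audit lists; the set of truthy strings and the rule that a provider also needs its keys are assumptions, and `enabled_tracing_providers` is a hypothetical helper, not the API of `tracing_config.py`.

```python
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}  # assumed set of accepted truthy strings


def _flag(env: Mapping[str, str], *names: str) -> bool:
    """True if any of the named variables holds a truthy string."""
    return any(env.get(name, "").strip().lower() in TRUTHY for name in names)


def enabled_tracing_providers(env: Mapping[str, str]) -> list[str]:
    providers: list[str] = []
    langsmith_key = env.get("LANGSMITH_API_KEY") or env.get("LANGCHAIN_API_KEY")
    if _flag(env, "LANGSMITH_TRACING", "LANGCHAIN_TRACING_V2", "LANGCHAIN_TRACING") and langsmith_key:
        providers.append("langsmith")
    if (_flag(env, "LANGFUSE_TRACING")
            and env.get("LANGFUSE_PUBLIC_KEY") and env.get("LANGFUSE_SECRET_KEY")):
        providers.append("langfuse")
    return providers
```

A staging start-up check could call this function with `os.environ` and abort when the returned list is empty.
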
**Logging:**
- The gateway uses Python `logging` and honors `LOG_LEVEL` (`backend/app/gateway/app.py`)
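
One plausible way to honor `LOG_LEVEL` with the standard library is sketched here. The `INFO` fallback, the format string, and the `configure_logging` name are assumptions; the gateway's actual handling in `app.py` is not shown in this audit.

```python
import logging
import os
from typing import Mapping


def configure_logging(env: Mapping[str, str] = os.environ) -> int:
    """Configure the root logger from LOG_LEVEL and return the numeric level used."""
    level_name = env.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        # Unknown names fall back to INFO instead of crashing at start-up.
        level = logging.INFO
    # basicConfig is a no-op if the root logger already has handlers.
    logging.basicConfig(level=level, format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    return level
```
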
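**Conclusion:**
- Tracing can already be toggled per environment; enforce at least one provider in staging to reduce the cost of tracing production issues.
## CI/CD and Deployment Integration
**CI:**
- GitHub Actions:
  - Backend unit tests (`.github/workflows/backend-unit-tests.yml`)
  - Frontend and backend lint/type/build (`.github/workflows/lint-check.yml`)
**Deployment:**
- Unified entry points: `make dev` / `make up` (root `Makefile`)
- Nginx reverse-proxies the frontend, LangGraph, and the Gateway behind one entry (`backend/README.md`, `docker/nginx/nginx.local.conf`, `docker/nginx/nginx.conf`)
- Docker Compose files are present: `docker/docker-compose.yaml`, `docker/docker-compose-dev.yaml`
**Conclusion:**
- Local development and container deployment paths are both in place; the next improvement is to make e2e (Playwright) tests a default CI gate.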
## Environment Variables (Key List)
**Frontend (`frontend/src/env.js`):**
- `BETTER_AUTH_SECRET`
- `BETTER_AUTH_GITHUB_CLIENT_ID`
- `BETTER_AUTH_GITHUB_CLIENT_SECRET`
- `GITHUB_OAUTH_TOKEN`
- `NEXT_PUBLIC_BACKEND_BASE_URL`
- `NEXT_PUBLIC_LANGGRAPH_BASE_URL`
- `NEXT_PUBLIC_STATIC_WEBSITE_ONLY`
- `SKIP_ENV_VALIDATION`
**Backend gateway (`backend/app/gateway/config.py`, `backend/app/gateway/app.py`):**
- `GATEWAY_HOST`
- `GATEWAY_PORT`
- `CORS_ORIGINS`
- `SKILL_CONTENT_API_URL`
- `LOG_LEVEL`
**Backend main config resolution (`backend/packages/harness/deerflow/config/app_config.py`):**
- `DEER_FLOW_CONFIG_PATH`
- `DEER_FLOW_EXTENSIONS_CONFIG_PATH`
- Plus every `$VAR` placeholder in `config.yaml` / `extensions_config.json`
**Tracing (`backend/packages/harness/deerflow/config/tracing_config.py`):**
- `LANGSMITH_TRACING` / `LANGCHAIN_TRACING_V2` / `LANGCHAIN_TRACING`
- `LANGSMITH_API_KEY` / `LANGCHAIN_API_KEY`
- `LANGSMITH_PROJECT` / `LANGCHAIN_PROJECT`
- `LANGSMITH_ENDPOINT` / `LANGCHAIN_ENDPOINT`
- `LANGFUSE_TRACING`
- `LANGFUSE_PUBLIC_KEY`
- `LANGFUSE_SECRET_KEY`
- `LANGFUSE_BASE_URL`
**Channels (`config.example.yaml`, `backend/app/channels/service.py`):**
- `FEISHU_APP_ID`, `FEISHU_APP_SECRET`
- `SLACK_BOT_TOKEN`, `SLACK_APP_TOKEN`
- `TELEGRAM_BOT_TOKEN`
- `WECOM_BOT_ID`, `WECOM_BOT_SECRET`
- `DEER_FLOW_CHANNELS_LANGGRAPH_URL`
- `DEER_FLOW_CHANNELS_GATEWAY_URL`
**Community tools and credentials:**
- `JINA_API_KEY` (`backend/packages/harness/deerflow/community/jina_ai/jina_client.py`)
- `INFOQUEST_API_KEY` (`backend/packages/harness/deerflow/community/infoquest/infoquest_client.py`)
- Claude/Codex credential-related variables (`backend/packages/harness/deerflow/models/credential_loader.py`)
**Conclusion:**
- Environment variables are scattered across the frontend env schema, the backend config loader, and the tool clients; maintain a separate "env contract" for deployment validation.
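
The "env contract" suggested above could start as a small declarative table checked at deploy time. The sketch below is one possible shape: `EnvRule`, `CONTRACT`, and `check_env` are hypothetical names, and treating `BETTER_AUTH_SECRET` as required is an assumption made for the example, not a finding of this audit. The pairings reuse variables from the lists above.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnvRule:
    name: str
    required: bool = False
    requires_with: tuple[str, ...] = ()   # variables that must accompany this one


# Illustrative subset; a real contract would cover every variable listed above.
CONTRACT = (
    EnvRule("BETTER_AUTH_SECRET", required=True),   # assumed to be mandatory
    EnvRule("SLACK_BOT_TOKEN", requires_with=("SLACK_APP_TOKEN",)),
    EnvRule("FEISHU_APP_ID", requires_with=("FEISHU_APP_SECRET",)),
    EnvRule("LANGFUSE_PUBLIC_KEY", requires_with=("LANGFUSE_SECRET_KEY",)),
)


def check_env(env: Mapping[str, str], contract: tuple[EnvRule, ...] = CONTRACT) -> list[str]:
    """Return one message per violated rule; an empty list means the contract holds."""
    errors: list[str] = []
    for rule in contract:
        present = bool(env.get(rule.name))
        if rule.required and not present:
            errors.append(f"{rule.name} is required")
        if present:
            for other in rule.requires_with:
                if not env.get(other):
                    errors.append(f"{rule.name} is set but {other} is missing")
    return errors
```

Running `check_env(os.environ)` in a container entrypoint or CI job and failing on a non-empty result would turn the scattered variable lists into a single enforceable check.
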
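## Webhooks and Callbacks
**Incoming:**
- No typical public webhook receiver was detected; IM channels mainly connect outbound via WebSocket or polling (`backend/app/channels/*.py`)
**Outgoing:**
- MCP OAuth token endpoint (requested dynamically per server configuration; `backend/packages/harness/deerflow/mcp/oauth.py`)
- Remote skill content fetch endpoint (`SKILL_CONTENT_API_URL`; `backend/app/gateway/config.py`)
- Third-party search/scraping APIs (Jina, InfoQuest, Tavily, Firecrawl)
**Conclusion:**
- External interaction is currently dominated by outbound calls, so the public exposure surface is small; if a webhook is added, add signature verification and replay protection along with it.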
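
Since no webhook receiver exists yet, the following is only a generic sketch of the two safeguards recommended above: an HMAC-SHA256 signature over a timestamp plus the body, and a freshness window with a seen-signature set against replays. The `timestamp.body` message layout, the 300-second window, and the `ReplayGuard` class are assumptions; a real provider dictates its own signing scheme.

```python
import hashlib
import hmac


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Hex HMAC-SHA256 over "<timestamp>.<body>" (assumed message layout)."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class ReplayGuard:
    """Rejects stale, forged, or previously seen webhook deliveries."""

    def __init__(self, max_age_seconds: int = 300) -> None:
        self.max_age = max_age_seconds
        self._seen: dict[str, int] = {}   # signature -> timestamp

    def verify(self, secret: bytes, timestamp: int, body: bytes,
               signature: str, now: int) -> bool:
        if abs(now - timestamp) > self.max_age:
            return False                   # outside the freshness window
        expected = sign(secret, timestamp, body)
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return False
        # Forget entries that the freshness check would reject anyway.
        self._seen = {s: t for s, t in self._seen.items() if now - t <= self.max_age}
        if signature in self._seen:
            return False                   # same delivery replayed
        self._seen[signature] = timestamp
        return True
```

The in-memory set only protects a single process; a multi-instance deployment would need a shared store for seen signatures.
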
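## Summary (Planning-Oriented)
- DeerFlow's integration stack already forms a complete loop of multiple models, tools, and channels plus optional tracing, and the key integration points are all configuration-driven.
- Suggested priorities for follow-up planning:
  1. Unify the environment variable contract and deployment validation (lowers the configuration error rate)
  2. Upgrade persistence and file storage for multi-instance deployments (Postgres + external object storage)
  3. Run an external integration regression suite (MCP OAuth, IM channels, search tools) continuously in CI

---
*Integration audit completed on 2026-04-07 (static audit based on `backend`, `frontend`, `config.example.yaml`, the CI workflows, and the gateway/tool implementations)*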