# External Integration Audit (Tech Focus)

**Analysis date:** 2026-04-07

## APIs and External Services

**LLM providers (switched dynamically via configuration):**

- OpenAI / Anthropic / Gemini / DeepSeek / MiniMax / OpenRouter (examples under `models` in `config.example.yaml`)
- SDK/adapter layer: `langchain_openai`, `langchain_anthropic`, `langchain_google_genai`, `langchain_deepseek` (`backend/packages/harness/pyproject.toml`)
- Authentication: model fields in `config.yaml` support `$ENV_VAR` injection (`backend/packages/harness/deerflow/config/app_config.py`)

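The `$ENV_VAR` injection noted above can be pictured with a minimal sketch. It assumes that a placeholder is a whole string starting with `$`; the function name `resolve_env_placeholders` and the sample model entry are hypothetical and are not taken from `app_config.py`, whose actual rules may differ.

```python
import os
from typing import Any


def resolve_env_placeholders(value: Any) -> Any:
    """Recursively replace "$NAME" strings with the value of env var NAME.

    Assumption: a placeholder is a whole string that starts with "$".
    """
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]
    if isinstance(value, dict):
        return {k: resolve_env_placeholders(v) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_placeholders(v) for v in value]
    return value


# Hypothetical model entry, shaped loosely like an item under `models`.
os.environ["OPENAI_API_KEY"] = "sk-demo"  # set here only for the demo
model_cfg = {"name": "demo-model", "api_key": "$OPENAI_API_KEY", "max_tokens": 1024}
resolved = resolve_env_placeholders(model_cfg)
```

Failing loudly on an unset variable, rather than substituting an empty string, is one possible design choice; it surfaces a missing secret at load time instead of at the first API call.
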
**MCP (Model Context Protocol) services:**

- Three transports are supported: `stdio` / `sse` / `http` (`backend/packages/harness/deerflow/mcp/client.py`)
- Management endpoint: `GET/PUT /api/mcp/config` (`backend/app/gateway/routers/mcp.py`)
- Configuration file: `extensions_config.json` (`backend/packages/harness/deerflow/config/extensions_config.py`)
- OAuth: HTTP/SSE MCP servers can enable automatic token refresh (`backend/packages/harness/deerflow/mcp/oauth.py`)

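As an illustration of what a per-transport check on an MCP server entry might look like, here is a small sketch. The key names `type`, `command`, and `url` are assumptions made for the example, not the confirmed schema of `extensions_config.json`, and `validate_mcp_server` is a hypothetical helper.

```python
from typing import Any, List, Mapping

TRANSPORTS = ("stdio", "sse", "http")  # the three transports listed in the audit


def validate_mcp_server(name: str, cfg: Mapping[str, Any]) -> List[str]:
    """Return the problems found in one server entry (empty list if none).

    Assumed keys (hypothetical): "type", plus "command" for stdio
    and "url" for sse/http.
    """
    problems: List[str] = []
    transport = cfg.get("type")
    if transport not in TRANSPORTS:
        problems.append(f"{name}: unknown transport {transport!r}")
    elif transport == "stdio" and not cfg.get("command"):
        problems.append(f"{name}: stdio transport needs a command")
    elif transport in ("sse", "http") and not cfg.get("url"):
        problems.append(f"{name}: {transport} transport needs a url")
    return problems
```

A check of this kind could run before a `PUT /api/mcp/config` payload is persisted, so that a malformed entry is rejected rather than discovered when the client first connects.
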
**Web search and scraping:**

- DuckDuckGo (`ddgs`, no API key required): `backend/packages/harness/deerflow/community/ddg_search/tools.py`
- Jina Reader: `https://r.jina.ai/` (optional `JINA_API_KEY`; `backend/packages/harness/deerflow/community/jina_ai/jina_client.py`)
- Tavily (configurable `api_key`): `backend/packages/harness/deerflow/community/tavily/tools.py`
- Firecrawl (configurable `api_key`): `backend/packages/harness/deerflow/community/firecrawl/tools.py`
- InfoQuest (`INFOQUEST_API_KEY`): `backend/packages/harness/deerflow/community/infoquest/infoquest_client.py`

**IM channels:**

- Feishu/Lark, Slack, Telegram, WeCom (`backend/app/channels/*.py`)
- Feishu: `app_id` / `app_secret`
- Slack: `bot_token` / `app_token`
- Telegram: `bot_token`
- WeCom: `bot_id` / `bot_secret`

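**Frontend-to-backend interfaces:**

- The frontend calls the gateway REST endpoints directly: `/api/models`, `/api/memory`, `/api/skills`, `/api/mcp/config`, `/api/threads/*/uploads` (`frontend/src/core/*/api.ts`)
- The frontend calls the LangGraph API through `@langchain/langgraph-sdk` (`frontend/src/core/api/api-client.ts`)

**Conclusion:**

- Integrations are mainly configuration-driven and decoupled through an adapter layer; new third-party services should go through `config.yaml` / `extensions_config.json` first rather than being hard-coded.
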
## Data Storage

**Session state and persistence:**

- Supported checkpointers: `memory` / `sqlite` / `postgres` (`backend/packages/harness/deerflow/config/checkpointer_config.py`)
- The default example uses SQLite (the `checkpointer` section of `config.example.yaml`)
- The synchronous Store follows the checkpointer type (`backend/packages/harness/deerflow/runtime/store/provider.py`)

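**File and artifact storage:**

- Uploads and artifacts are stored on local filesystem paths (`backend/app/gateway/routers/uploads.py`, `backend/app/gateway/routers/artifacts.py`, `backend/packages/harness/deerflow/uploads/manager.py`)

**Caching:**

- No standalone cache service such as Redis or Memcached was detected; caching is mainly in-process (caches and singletons, such as the config cache and the client cache; see `backend/packages/harness/deerflow/config/*.py` and `frontend/src/core/api/api-client.ts`)

**Conclusion:**

- The current defaults work for a single-machine deployment (SQLite + local files); for multi-instance deployment, switch to the Postgres checkpointer/store first and move file storage to an external service.
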
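The storage guidance above could be turned into a simple pre-deployment check. The sketch below is an illustration only: `CheckpointerConfig`, its fields, and `deployment_warnings` are hypothetical names, and the real config class in `checkpointer_config.py` may be shaped differently.

```python
from dataclasses import dataclass
from typing import List, Optional

SUPPORTED_TYPES = ("memory", "sqlite", "postgres")


@dataclass(frozen=True)
class CheckpointerConfig:
    """Hypothetical config shape; the real fields may differ."""

    type: str = "sqlite"
    connection_string: Optional[str] = None


def deployment_warnings(cfg: CheckpointerConfig, replicas: int) -> List[str]:
    """Flag checkpointer settings that do not fit the planned replica count."""
    if cfg.type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported checkpointer type: {cfg.type!r}")
    warnings: List[str] = []
    if cfg.type == "postgres" and not cfg.connection_string:
        warnings.append("postgres checkpointer has no connection string")
    if replicas > 1 and cfg.type != "postgres":
        # memory and sqlite state is local to one process or one host
        warnings.append(f"{cfg.type} checkpointer is not shared across {replicas} replicas")
    return warnings
```

With the default SQLite settings and one replica, the function returns an empty list; raising the replica count is what triggers the warning.
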
## Authentication and Authorization

**Frontend authentication:**

- `better-auth` (`frontend/src/server/better-auth/config.ts`, `frontend/src/app/api/auth/[...all]/route.ts`)
- The current configuration enables `emailAndPassword`; the GitHub-related variables are optional (`frontend/src/env.js`)

**MCP authorization:**

- MCP HTTP/SSE OAuth supports `client_credentials` and `refresh_token` (`backend/packages/harness/deerflow/mcp/oauth.py`)
- headers/env/oauth can be configured per MCP server (`backend/packages/harness/deerflow/config/extensions_config.py`)

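Automatic token refresh for an OAuth-protected MCP server usually comes down to caching the access token and fetching a new one shortly before it expires. The sketch below shows that pattern under stated assumptions: `TokenCache` and the injected `fetch` callable are hypothetical, and the code does not reproduce what `oauth.py` actually does.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Token:
    access_token: str
    expires_at: float  # absolute time in seconds, on the same clock as `now`


class TokenCache:
    """Cache one access token and refetch it shortly before expiry.

    `fetch` is a hypothetical callable that performs the grant
    (for example client_credentials) and returns
    (access_token, expires_in_seconds).
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        skew: float = 30.0,
        now: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._skew = skew  # refresh this many seconds before expiry
        self._now = now
        self._token: Optional[Token] = None

    def get(self) -> str:
        token = self._token
        if token is None or self._now() >= token.expires_at - self._skew:
            access_token, expires_in = self._fetch()
            token = Token(access_token, self._now() + expires_in)
            self._token = token
        return token.access_token
```

Injecting both the fetch function and the clock keeps the cache independent of any HTTP client and makes the expiry logic easy to test; a multi-threaded caller would additionally need a lock around `get`.
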
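**Conclusion:**

- The authentication surface splits into two tracks: frontend session authentication and backend integration credentials. Plan for them separately and avoid sharing one secret domain between them.
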
## Monitoring and Observability

**Tracing:**

- Both LangSmith (`LANGSMITH_*` / `LANGCHAIN_*`) and Langfuse (`LANGFUSE_*`) are supported (`backend/packages/harness/deerflow/config/tracing_config.py`)
- Callbacks are attached at model creation time (`backend/packages/harness/deerflow/tracing/factory.py`, `backend/packages/harness/deerflow/models/factory.py`)

**Logging:**

- The gateway uses Python `logging` and honors `LOG_LEVEL` (`backend/app/gateway/app.py`)

**Conclusion:**

- Tracing can already be toggled per environment; consider requiring at least one provider in staging to reduce the cost of tracking down production issues.

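The per-environment tracing switch, and the suggested staging guard, might look roughly like the following. The variable names come from the list in this audit; the set of accepted truthy spellings and the helper names are assumptions, and `tracing_config.py` may interpret the flags differently.

```python
from typing import List, Mapping

_TRUTHY = {"1", "true", "yes", "on"}  # assumption: accepted truthy spellings


def _flag(env: Mapping[str, str], *names: str) -> bool:
    """Return True if any of the named variables holds a truthy value."""
    return any(env.get(n, "").strip().lower() in _TRUTHY for n in names)


def enabled_tracing_providers(env: Mapping[str, str]) -> List[str]:
    """List the tracing providers switched on in `env`."""
    providers: List[str] = []
    if _flag(env, "LANGSMITH_TRACING", "LANGCHAIN_TRACING_V2", "LANGCHAIN_TRACING"):
        providers.append("langsmith")
    if _flag(env, "LANGFUSE_TRACING"):
        providers.append("langfuse")
    return providers


def require_tracing(env: Mapping[str, str]) -> None:
    """Staging guard: fail fast when no tracing provider is switched on."""
    if not enabled_tracing_providers(env):
        raise RuntimeError("no tracing provider enabled")
```

Calling `require_tracing(os.environ)` at startup in staging would enforce the recommendation without touching the production path.
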
## CI/CD and Deployment Integration

**CI:**

- GitHub Actions:
  - Backend unit tests (`.github/workflows/backend-unit-tests.yml`)
  - Frontend and backend lint/type/build checks (`.github/workflows/lint-check.yml`)

**Deployment:**

- Unified entry points: `make dev` / `make up` (root `Makefile`)
- Nginx reverse-proxies the frontend, LangGraph, and the gateway behind a single entry point (`backend/README.md`, `docker/nginx/nginx.local.conf`, `docker/nginx/nginx.conf`)
- Docker Compose files exist: `docker/docker-compose.yaml`, `docker/docker-compose-dev.yaml`

**Conclusion:**

- Local development and container deployment are both in place; the next improvement is to make e2e tests (Playwright) a default CI gate.

## Environment Variables (Key List)

**Frontend (`frontend/src/env.js`):**

- `BETTER_AUTH_SECRET`
- `BETTER_AUTH_GITHUB_CLIENT_ID`
- `BETTER_AUTH_GITHUB_CLIENT_SECRET`
- `GITHUB_OAUTH_TOKEN`
- `NEXT_PUBLIC_BACKEND_BASE_URL`
- `NEXT_PUBLIC_LANGGRAPH_BASE_URL`
- `NEXT_PUBLIC_STATIC_WEBSITE_ONLY`
- `SKIP_ENV_VALIDATION`

**Backend gateway (`backend/app/gateway/config.py`, `backend/app/gateway/app.py`):**

- `GATEWAY_HOST`
- `GATEWAY_PORT`
- `CORS_ORIGINS`
- `SKILL_CONTENT_API_URL`
- `LOG_LEVEL`

**Backend main config parsing (`backend/packages/harness/deerflow/config/app_config.py`):**

- `DEER_FLOW_CONFIG_PATH`
- `DEER_FLOW_EXTENSIONS_CONFIG_PATH`
- Plus every `$VAR` placeholder used in `config.yaml` / `extensions_config.json`

**Tracing (`backend/packages/harness/deerflow/config/tracing_config.py`):**

- `LANGSMITH_TRACING` / `LANGCHAIN_TRACING_V2` / `LANGCHAIN_TRACING`
- `LANGSMITH_API_KEY` / `LANGCHAIN_API_KEY`
- `LANGSMITH_PROJECT` / `LANGCHAIN_PROJECT`
- `LANGSMITH_ENDPOINT` / `LANGCHAIN_ENDPOINT`
- `LANGFUSE_TRACING`
- `LANGFUSE_PUBLIC_KEY`
- `LANGFUSE_SECRET_KEY`
- `LANGFUSE_BASE_URL`

**Channels (`config.example.yaml`, `backend/app/channels/service.py`):**

- `FEISHU_APP_ID`, `FEISHU_APP_SECRET`
- `SLACK_BOT_TOKEN`, `SLACK_APP_TOKEN`
- `TELEGRAM_BOT_TOKEN`
- `WECOM_BOT_ID`, `WECOM_BOT_SECRET`
- `DEER_FLOW_CHANNELS_LANGGRAPH_URL`
- `DEER_FLOW_CHANNELS_GATEWAY_URL`

**Community tools and credentials:**

- `JINA_API_KEY` (`backend/packages/harness/deerflow/community/jina_ai/jina_client.py`)
- `INFOQUEST_API_KEY` (`backend/packages/harness/deerflow/community/infoquest/infoquest_client.py`)
- Claude/Codex credential-related variables (`backend/packages/harness/deerflow/models/credential_loader.py`)

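**Conclusion:**

- Environment variables are spread across the frontend env schema, the backend config loaders, and the tool clients; a separate "env contract" should be maintained for deployment validation.
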
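An "env contract" of the kind suggested above can be as small as a declarative list plus one check function. The sketch below uses variable names from this audit, but which of them are marked required is an assumption made for illustration, and `EnvVar`, `CONTRACT`, and `missing_required` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Mapping, Sequence


@dataclass(frozen=True)
class EnvVar:
    name: str
    required: bool = True


# Hypothetical contract; the required/optional split is an assumption.
CONTRACT: Sequence[EnvVar] = (
    EnvVar("BETTER_AUTH_SECRET"),
    EnvVar("NEXT_PUBLIC_BACKEND_BASE_URL", required=False),
    EnvVar("GATEWAY_PORT", required=False),
    EnvVar("DEER_FLOW_CONFIG_PATH", required=False),
)


def missing_required(env: Mapping[str, str], contract: Sequence[EnvVar] = CONTRACT) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [v.name for v in contract if v.required and not env.get(v.name)]
```

A deployment script could call `missing_required(os.environ)` and abort with the returned names, which keeps the contract in one reviewable place instead of three loaders.
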
## Webhooks and Callbacks

**Incoming:**

- No typical public webhook receiver was detected; IM channels mainly open outbound connections via WebSocket or polling (`backend/app/channels/*.py`)

**Outgoing:**

- MCP OAuth token endpoints (requested dynamically according to each server's configuration; `backend/packages/harness/deerflow/mcp/oauth.py`)
- Remote skill content fetch endpoint (`SKILL_CONTENT_API_URL`; `backend/app/gateway/config.py`)
- Third-party search/scraping APIs (Jina, InfoQuest, Tavily, Firecrawl)

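**Conclusion:**

- External interaction is currently dominated by outbound calls, so the public exposure surface is small; if webhooks are added, signature verification and replay protection should be added at the same time.
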
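If an incoming webhook is added later, signature verification and a replay window can be sketched with the standard library alone. The signing scheme below (HMAC-SHA256 over `<timestamp>.<body>`) and the function names are assumptions for illustration; a real provider dictates its own header names and signing format.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """HMAC-SHA256 over "<timestamp>.<body>", hex-encoded."""
    message = timestamp.encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_webhook(
    secret: bytes,
    timestamp: str,
    body: bytes,
    signature: str,
    max_age_seconds: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Reject bad signatures and requests outside the freshness window."""
    current = time.time() if now is None else now
    try:
        sent_at = float(timestamp)
    except ValueError:
        return False
    if abs(current - sent_at) > max_age_seconds:
        return False  # too old or too far in the future
    expected = sign(secret, timestamp, body)
    # constant-time comparison to avoid leaking match position via timing
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The timestamp window only narrows the replay opportunity; rejecting every repeated delivery would additionally require remembering recently seen request IDs or nonces.
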
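## Summary (Planning-Oriented)

- DeerFlow's integration stack already forms a complete loop of multi-model, multi-tool, multi-channel, and optional tracing support, and the key integration points are all configuration-driven.
- Suggested priorities for follow-up planning:
  1. Unify the environment variable contract and deployment validation (to lower the configuration error rate)
  2. Upgrade persistence and file storage for multi-instance scenarios (Postgres + external object storage)
  3. Run an external integration regression suite (MCP OAuth, IM channels, search tools) continuously in CI

---

*Integration audit completed on 2026-04-07 (static audit of `backend`, `frontend`, `config.example.yaml`, the CI workflows, and the gateway/tool implementations)*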