# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

DeerFlow is an open-source **AI super agent harness** built on LangGraph/LangChain. It's a full-stack application that orchestrates sub-agents, memory, sandboxes, and extensible skills to perform complex multi-step tasks.

**Architecture**:
- **LangGraph Server** (port 2024): Agent runtime and workflow execution
- **Gateway API** (port 8001): FastAPI REST API for models, MCP, skills, memory, artifacts, uploads
- **Frontend** (port 3000): Next.js 16 web interface
- **Nginx** (port 2026): Unified reverse proxy entry point
- **Provisioner** (port 8002, optional): Started only when sandbox is configured for provisioner/Kubernetes mode

**Key Technologies**:
- **Backend**: Python 3.12+, LangGraph/LangChain, FastAPI, uv package manager
- **Frontend**: Next.js 16, React 19, TypeScript 5.8, Tailwind CSS 4, pnpm
- **AI/ML**: Model-agnostic (any OpenAI-compatible API), multi-model support with thinking/vision capabilities

## Commands

### Root Directory (Full Application)
```bash
make check          # Check system requirements (Node.js 22+, pnpm, uv, nginx)
make config         # Generate local configuration files from templates
make install        # Install all dependencies (frontend + backend)
make setup-sandbox  # Pre-pull sandbox container image (recommended for Docker sandbox)
make dev            # Start all services (LangGraph + Gateway + Frontend + Nginx) on localhost:2026
make stop           # Stop all running services
make clean          # Clean up processes and temporary files

# Docker Development (Recommended)
make docker-init    # Build custom k3s image with pre-cached sandbox image
make docker-start   # Start Docker services (mode-aware from config.yaml)
make docker-stop    # Stop Docker development services
make docker-logs    # View Docker development logs
```

### Backend Directory (`/backend/`)
```bash
make install    # Install backend dependencies with uv
make dev        # Run LangGraph server only (port 2024)
make gateway    # Run Gateway API only (port 8001)
make lint       # Lint with ruff
make format     # Format code with ruff
uv run pytest   # Run backend tests
```

### Frontend Directory (`/frontend/`)
```bash
pnpm dev        # Dev server with Turbopack (http://localhost:3000)
pnpm build      # Production build
pnpm check      # Lint + type check (run before committing)
pnpm lint       # ESLint only
pnpm lint:fix   # ESLint with auto-fix
pnpm typecheck  # TypeScript type check (tsc --noEmit)
pnpm start      # Start production server
```

## Architecture

### Microservices Architecture
```
         ┌──────────────────────────────────────┐
         │          Nginx (Port 2026)           │
         │        Unified reverse proxy         │
         └───────┬──────────────────┬───────────┘
                 │                  │
/api/langgraph/* │                  │ /api/* (other)
                 ▼                  ▼
      ┌────────────────────┐  ┌────────────────────────┐
      │  LangGraph Server  │  │   Gateway API (8001)   │
      │    (Port 2024)     │  │      FastAPI REST      │
      │                    │  │                        │
      │ ┌────────────────┐ │  │  Models, MCP, Skills,  │
      │ │   Lead Agent   │ │  │    Memory, Uploads,    │
      │ │  ┌──────────┐  │ │  │       Artifacts        │
      │ │  │Middleware│  │ │  └────────────────────────┘
      │ │  │  Chain   │  │ │
      │ │  └──────────┘  │ │
      │ │  ┌──────────┐  │ │
      │ │  │  Tools   │  │ │
      │ │  └──────────┘  │ │
      │ │  ┌──────────┐  │ │
      │ │  │Subagents │  │ │
      │ │  └──────────┘  │ │
      │ └────────────────┘ │
      └────────────────────┘
```

### Core Components

**Lead Agent** (`src/agents/lead_agent/agent.py`):
- Entry point: `make_lead_agent(config: RunnableConfig)` registered in `langgraph.json`
- Dynamic model selection via `create_chat_model()` with thinking/vision support
- Tools loaded via `get_available_tools()` - combines sandbox, built-in, MCP, community, and subagent tools

**Middleware Chain** (11 middlewares in strict order):
1. **ThreadDataMiddleware** - Creates per-thread isolated directories
2. **UploadsMiddleware** - Tracks and injects newly uploaded files
3. **SandboxMiddleware** - Acquires sandbox, stores `sandbox_id` in state
4. **DanglingToolCallMiddleware** - Injects placeholder ToolMessages for interrupted tool calls
5. **SummarizationMiddleware** - Context reduction when approaching token limits
6. **TodoListMiddleware** - Task tracking with `write_todos` tool (plan mode)
7. **TitleMiddleware** - Auto-generates thread title
8. **MemoryMiddleware** - Queues conversations for async memory update
9. **ViewImageMiddleware** - Injects base64 image data before LLM call (vision support)
10. **SubagentLimitMiddleware** - Truncates excess `task` tool calls to enforce concurrency limits
11. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)`

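
The truncation described for `SubagentLimitMiddleware` can be illustrated with a small sketch. This is not the project's implementation: the function name `truncate_task_calls` and the plain-dict tool-call shape are assumptions made for illustration, and the default limit of 3 is taken from the `MAX_CONCURRENT_SUBAGENTS` value noted under the Subagent System.

```python
from typing import Any

MAX_CONCURRENT_SUBAGENTS = 3  # value quoted in the Subagent System notes


def truncate_task_calls(
    tool_calls: list[dict[str, Any]],
    limit: int = MAX_CONCURRENT_SUBAGENTS,
) -> list[dict[str, Any]]:
    """Keep every non-`task` call, but only the first `limit` `task` calls."""
    kept: list[dict[str, Any]] = []
    task_count = 0
    for call in tool_calls:
        if call.get("name") == "task":
            task_count += 1
            if task_count > limit:
                continue  # drop the excess delegation request
        kept.append(call)
    return kept
```

Calls to other tools pass through untouched, so only subagent delegation is capped.
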
**Sandbox System** (`src/sandbox/`):
- **Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
- **Provider Pattern**: `SandboxProvider` with `acquire`, `get`, `release` lifecycle
- **Implementations**: `LocalSandboxProvider` (local filesystem), `AioSandboxProvider` (Docker-based isolation)
- **Virtual Path System**: Agent sees `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills` mapped to physical paths

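
To make the virtual path idea concrete, here is a minimal sketch of translating an agent-visible path under `/mnt/user-data` into a path inside a per-thread directory. The helper name `to_physical` and the example thread directory are hypothetical; the real mapping in `src/sandbox/` may differ.

```python
from pathlib import PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")


def to_physical(virtual_path: str, thread_dir: str) -> str:
    """Map a path under /mnt/user-data to a path under `thread_dir`."""
    path = PurePosixPath(virtual_path)
    try:
        relative = path.relative_to(VIRTUAL_ROOT)
    except ValueError:
        raise ValueError(f"{virtual_path!r} is outside {VIRTUAL_ROOT}") from None
    if ".." in relative.parts:
        # Refuse paths that try to climb out of the thread directory.
        raise ValueError("parent-directory segments are not allowed")
    return str(PurePosixPath(thread_dir) / relative)
```

Rejecting `..` segments keeps a virtual path from escaping the directory it is mapped into.
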
**Subagent System** (`src/subagents/`):
- **Built-in Agents**: `general-purpose` (all tools except `task`) and `bash` (command specialist)
- **Concurrency**: `MAX_CONCURRENT_SUBAGENTS = 3` enforced by `SubagentLimitMiddleware`
- **Execution**: Dual thread pool - `_scheduler_pool` (3 workers) + `_execution_pool` (3 workers)
- **Flow**: `task()` tool → `SubagentExecutor` → background thread → poll 5s → SSE events → result

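
One plausible reading of the dual thread pool is that a scheduler thread supervises each job running on an execution thread, for example to enforce a timeout. The sketch below shows that pattern with the standard library only; `run_in_background` and its `timeout_s` parameter are hypothetical, and the division of work between the two pools in `SubagentExecutor` is an assumption.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

_scheduler_pool = ThreadPoolExecutor(max_workers=3, thread_name_prefix="scheduler")
_execution_pool = ThreadPoolExecutor(max_workers=3, thread_name_prefix="execution")


def run_in_background(work: Callable[[], str], timeout_s: float = 60.0) -> Future:
    """Schedule `work`; a scheduler thread waits on the execution thread."""

    def supervise() -> str:
        inner = _execution_pool.submit(work)
        # Raises concurrent.futures.TimeoutError if the work overruns.
        return inner.result(timeout=timeout_s)

    return _scheduler_pool.submit(supervise)
```

The caller gets a `Future` back immediately and can poll it with `done()` instead of blocking.
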
**Memory System** (`src/agents/memory/`):
- **Components**: `updater.py` (LLM-based memory updates), `queue.py` (debounced update queue), `prompt.py` (templates)
- **Data Structure**: Stored in `backend/.deer-flow/memory.json` with user context, history, and facts
- **Workflow**: MemoryMiddleware filters messages → queue debounces (30s) → background LLM extracts updates → atomic file I/O

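
The "atomic file I/O" step is commonly implemented by writing to a temporary file and renaming it over the target. The following is a generic sketch of that pattern, not the code in `updater.py`; the function name `write_json_atomically` is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, data: dict) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, ensure_ascii=False, indent=2)
        os.replace(tmp_name, path)  # rename is atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp_name)  # do not leave a partial temp file behind
        raise
```

Readers of the file see either the old contents or the new contents, never a half-written file.
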
**Tool System** (`src/tools/`):
- **Config-defined tools**: Resolved from `config.yaml` via `resolve_variable()`
- **MCP tools**: From enabled MCP servers (lazy initialized, cached with mtime invalidation)
- **Built-in tools**: `present_files`, `ask_clarification`, `view_image` (vision support)
- **Subagent tool**: `task` (delegate to subagent)
- **Community tools**: `tavily/` (web search/fetch), `jina_ai/` (web fetch), `firecrawl/` (scraping), `image_search/` (DuckDuckGo)

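
Resolving a tool from a configured variable path typically comes down to a dynamic import. The sketch below assumes a `module.path:attribute` string format, which this file does not confirm; `resolve_variable_sketch` is a hypothetical stand-in, not the project's `resolve_variable()`.

```python
from importlib import import_module
from typing import Any


def resolve_variable_sketch(path: str) -> Any:
    """Import `module` and return `attribute` from a 'module:attribute' string."""
    module_name, sep, attr = path.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:attribute', got {path!r}")
    return getattr(import_module(module_name), attr)
```

For example, `resolve_variable_sketch("math:sqrt")` returns the standard library's `math.sqrt` function.
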
**Skills System** (`src/skills/`):
- **Location**: `deer-flow/skills/{public,custom}/`
- **Format**: Directory with `SKILL.md` (YAML frontmatter: name, description, license, allowed-tools)
- **Loading**: `load_skills()` scans directories, parses SKILL.md, reads enabled state from extensions_config.json
- **Installation**: `POST /api/skills/install` extracts .skill ZIP archive to custom/ directory

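
As a rough illustration of the `SKILL.md` format, the sketch below splits a file into its frontmatter and body. It handles only flat `key: value` lines and is not a YAML parser; the real loader presumably uses one. The function name `parse_frontmatter` is hypothetical.

```python
def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split '---' delimited frontmatter (flat key: value lines) from the body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter at all
    meta: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[index + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter without a closing one
```
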
### Configuration System

**Main Configuration** (`config.yaml` in project root):
- Setup: Copy `config.example.yaml` to `config.yaml`
- Configuration priority: explicit `config_path` → `DEER_FLOW_CONFIG_PATH` env → `config.yaml` in current dir → `config.yaml` in parent dir
- Config values starting with `$` are resolved as environment variables (e.g., `$OPENAI_API_KEY`)

**Extensions Configuration** (`extensions_config.json` in project root):
- Contains MCP servers and skills configuration
- Configuration priority: explicit `config_path` → `DEER_FLOW_EXTENSIONS_CONFIG_PATH` env → `extensions_config.json` in current dir → `extensions_config.json` in parent dir

**Key Config Sections**:
- `models[]` - LLM configs with `use` class path, `supports_thinking`, `supports_vision`
- `tools[]` - Tool configs with `use` variable path and `group`
- `tool_groups[]` - Logical groupings for tools
- `sandbox.use` - Sandbox provider class path
- `skills.path` / `skills.container_path` - Host and container paths to skills directory
- `title` - Auto-title generation
- `summarization` - Context summarization
- `subagents.enabled` - Master switch for subagent delegation
- `memory` - Memory system configuration

### Gateway API (`src/gateway/`)

FastAPI application on port 8001 with health check at `GET /health`.

**Routers**:
- **Models** (`/api/models`): `GET /` (list models), `GET /{name}` (model details)
- **MCP** (`/api/mcp`): `GET /config` (get config), `PUT /config` (update config)
- **Skills** (`/api/skills`): `GET /` (list), `GET /{name}` (details), `PUT /{name}` (update enabled), `POST /install` (install from .skill archive)
- **Memory** (`/api/memory`): `GET /` (memory data), `POST /reload` (force reload), `GET /config` (config), `GET /status` (config + data)
- **Uploads** (`/api/threads/{id}/uploads`): `POST /` (upload files with auto-conversion), `GET /list` (list), `DELETE /{filename}` (delete)
- **Artifacts** (`/api/threads/{id}/artifacts`): `GET /{path}` (serve artifacts), `?download=true` for file download

### Frontend Architecture (`/frontend/`)

**Source Layout** (`src/`):
- **`app/`** - Next.js App Router: `/` (landing), `/workspace/chats/[thread_id]` (chat)
- **`components/`** - React components: `ui/` (Shadcn UI), `ai-elements/` (Vercel AI SDK), `workspace/` (chat), `landing/` (landing)
- **`core/`** - Business logic: `threads/` (state management), `api/` (LangGraph client), `artifacts/` (loading/caching), `i18n/`, `settings/`, `memory/`, `skills/`, `messages/`, `mcp/`, `models/`
- **`hooks/`** - Shared React hooks
- **`lib/`** - Utilities (`cn()` from clsx + tailwind-merge)
- **`server/`** - Server-side code (better-auth, not yet active)
- **`styles/`** - Global CSS with Tailwind v4 `@import` syntax

**Data Flow**:
1. User input → thread hooks (`core/threads/hooks.ts`) → LangGraph SDK streaming
2. Stream events update thread state (messages, artifacts, todos)
3. TanStack Query manages server state; localStorage stores user settings
4. Components subscribe to thread state and render updates

## Development Workflow

### Running the Full Application
From the **project root** directory:
```bash
make dev
```
This starts all services and makes the application available at `http://localhost:2026`.

**Nginx routing**:
- `/api/langgraph/*` → LangGraph Server (2024)
- `/api/*` (other) → Gateway API (8001)
- `/` (non-API) → Frontend (3000)

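
The routing rules amount to a most-specific-prefix match. The sketch below expresses them as a plain Python function for reference; it is not the Nginx configuration, the name `upstream_for` is hypothetical, and it says nothing about whether Nginx rewrites the path before proxying.

```python
def upstream_for(path: str) -> str:
    """Return the local service that handles `path` behind port 2026."""
    if path.startswith("/api/langgraph/"):
        return "http://localhost:2024"  # LangGraph Server
    if path.startswith("/api/"):
        return "http://localhost:8001"  # Gateway API
    return "http://localhost:3000"  # Frontend
```

The LangGraph prefix has to be checked first, because it also matches the broader `/api/` rule.
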
### Running Services Separately
From the **backend** directory:
```bash
# Terminal 1: LangGraph server
make dev

# Terminal 2: Gateway API
make gateway
```

Direct access (without nginx):
- LangGraph: `http://localhost:2024`
- Gateway: `http://localhost:8001`

### Docker Development (Recommended)
```bash
make docker-init   # Build images and install dependencies (first time)
make docker-start  # Start all services with hot-reload
```
Access at `http://localhost:2026`. Services automatically restart on code changes.

## Key Features

### File Upload
- Endpoint: `POST /api/threads/{thread_id}/uploads`
- Supports: PDF, PPT, Excel, Word documents (auto-converted via `markitdown`)
- Files stored in thread-isolated directories
- Agent receives uploaded file list via `UploadsMiddleware`

### Plan Mode
- Controlled via runtime config: `config.configurable.is_plan_mode = True`
- Provides `write_todos` tool for task tracking
- Only one task is `in_progress` at a time; updates appear in real time

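
The single-task rule can be expressed as a small check. This is a sketch only: the todo shape (a dict with a `status` key) and the function name `check_single_in_progress` are assumptions, not the schema used by `write_todos`.

```python
def check_single_in_progress(todos: list[dict[str, str]]) -> None:
    """Raise if more than one todo is marked in_progress."""
    active = [todo for todo in todos if todo.get("status") == "in_progress"]
    if len(active) > 1:
        raise ValueError(f"{len(active)} todos are in_progress; expected at most 1")
```
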
### Context Summarization
- Automatic conversation summarization when approaching token limits
- Configured in `config.yaml` under `summarization` key
- Trigger types: tokens, messages, or fraction of max input
- Keeps recent messages while summarizing older ones

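
The three trigger types can be read as three threshold checks. The sketch below is illustrative: the `Trigger` class, its field names, and `should_summarize` are hypothetical and do not reflect the actual keys under `summarization` in `config.yaml`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    kind: str  # "tokens", "messages", or "fraction"
    value: float


def should_summarize(
    trigger: Trigger,
    token_count: int,
    message_count: int,
    max_input_tokens: int,
) -> bool:
    """Return True once the conversation reaches the configured threshold."""
    if trigger.kind == "tokens":
        return token_count >= trigger.value
    if trigger.kind == "messages":
        return message_count >= trigger.value
    if trigger.kind == "fraction":
        return token_count >= trigger.value * max_input_tokens
    raise ValueError(f"unknown trigger kind: {trigger.kind!r}")
```

A fraction trigger scales with the model's input limit, so the same setting works across models with different context sizes.
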
### Vision Support
- For models with `supports_vision: true`
- `ViewImageMiddleware` processes images in conversation
- `view_image_tool` added to agent's toolset
- Images automatically converted to base64 and injected into state

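
Converting an image to base64 usually means building a data URL that a vision-capable model can accept. The helper below is a generic sketch using only the standard library; `image_to_data_url` is a hypothetical name, and the exact payload shape that `ViewImageMiddleware` injects is not shown in this file.

```python
import base64
import mimetypes


def image_to_data_url(filename: str, data: bytes) -> str:
    """Encode raw image bytes as a base64 data URL."""
    mime, _ = mimetypes.guess_type(filename)
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"
```
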
## Code Style

### Backend (Python)
- Uses `ruff` for linting and formatting
- Line length: 240 characters
- Python 3.12+ with type hints
- Double quotes, space indentation

### Frontend (TypeScript)
- **Imports**: Enforced ordering (builtin → external → internal → parent → sibling), alphabetized
- **Unused variables**: Prefix with `_`
- **Class names**: Use `cn()` from `@/lib/utils` for conditional Tailwind classes
- **Path alias**: `@/*` maps to `src/*`
- **Components**: `ui/` and `ai-elements/` are generated from registries - don't manually edit these

## Testing

### Backend Tests
```bash
cd backend
uv run pytest
```
**Regression tests** (run in CI for every PR):
- `tests/test_docker_sandbox_mode_detection.py` - mode detection from `config.yaml`
- `tests/test_provisioner_kubeconfig.py` - kubeconfig file/directory handling

### Frontend Tests
No test framework is currently configured.

## Documentation Update Policy

**CRITICAL: Always update README.md and CLAUDE.md after every code change**

When making code changes, you MUST update the relevant documentation:
- Update `README.md` for user-facing changes (features, setup, usage instructions)
- Update `CLAUDE.md` for development changes (architecture, commands, workflows, internal systems)
- Keep documentation synchronized with the codebase at all times
- Ensure accuracy and timeliness of all documentation

## Project Structure

```
deer-flow/
├── Makefile                    # Root commands (check, install, dev, stop)
├── config.yaml                 # Main application configuration
├── extensions_config.json      # MCP servers and skills configuration
├── backend/                    # Backend application
│   ├── Makefile                # Backend-only commands (dev, gateway, lint)
│   ├── langgraph.json          # LangGraph server configuration
│   ├── src/
│   │   ├── agents/             # LangGraph agent system
│   │   │   ├── lead_agent/     # Main agent (factory + system prompt)
│   │   │   ├── middlewares/    # 11 middleware components
│   │   │   ├── memory/         # Memory extraction, queue, prompts
│   │   │   └── thread_state.py # ThreadState schema
│   │   ├── gateway/            # FastAPI Gateway API
│   │   │   ├── app.py          # FastAPI application
│   │   │   └── routers/        # 6 route modules
│   │   ├── sandbox/            # Sandbox execution system
│   │   │   ├── local/          # Local filesystem provider
│   │   │   ├── sandbox.py      # Abstract Sandbox interface
│   │   │   ├── tools.py        # bash, ls, read/write/str_replace
│   │   │   └── middleware.py   # Sandbox lifecycle management
│   │   ├── subagents/          # Subagent delegation system
│   │   │   ├── builtins/       # general-purpose, bash agents
│   │   │   ├── executor.py     # Background execution engine
│   │   │   └── registry.py     # Agent registry
│   │   ├── tools/builtins/     # Built-in tools (present_files, ask_clarification, view_image)
│   │   ├── mcp/                # MCP integration (tools, cache, client)
│   │   ├── models/             # Model factory with thinking/vision support
│   │   ├── skills/             # Skills discovery, loading, parsing
│   │   ├── config/             # Configuration system (app, model, sandbox, tool, etc.)
│   │   ├── community/          # Community tools (tavily, jina_ai, firecrawl, image_search, aio_sandbox)
│   │   ├── reflection/         # Dynamic module loading (resolve_variable, resolve_class)
│   │   └── utils/              # Utilities (network, readability)
│   ├── tests/                  # Test suite
│   └── docs/                   # Documentation
├── frontend/                   # Next.js frontend application
│   ├── src/
│   │   ├── app/                # Next.js App Router
│   │   ├── components/         # React components
│   │   ├── core/               # Business logic
│   │   ├── hooks/              # Shared React hooks
│   │   ├── lib/                # Utilities
│   │   ├── server/             # Server-side code
│   │   └── styles/             # Global CSS with Tailwind v4
│   └── package.json            # Next.js 16, React 19, TypeScript 5.8, pnpm
└── skills/                     # Agent skills directory
    ├── public/                 # Public skills (committed)
    └── custom/                 # Custom skills (gitignored)
```