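# Paper Burner X - Claude Code Guide

An AI tool for recognizing, translating, reading, and intelligently analyzing academic literature. An AI workstation that runs in the browser, ready to use as soon as you open it.
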
## Project Overview

Paper Burner X is a browser-based AI-powered document processing tool for academic literature. It supports PDF/DOCX/PPTX/EPUB/Markdown formats with OCR, translation, and intelligent analysis capabilities.

**Key Features:**
- Frontend Agentic RAG system for document Q&A
- High-performance batch processing with concurrent OCR and translation
- Terminology glossary support (tens of thousands of entries)
- Dual deployment modes: pure frontend (Vercel) or backend (Docker/self-hosted)

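The concurrent batch processing in the feature list can be pictured as a bounded worker pool. The sketch below is an illustration under that assumption, not the project's actual scheduler; `mapWithConcurrency` and its parameters are hypothetical names that do not come from the codebase.

```javascript
// Hypothetical helper: runs `worker` over `items` with at most `limit` tasks in flight.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each lane repeatedly claims the next unprocessed item until none remain.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results; // same order as `items`, regardless of completion order
}
```

A call such as `mapWithConcurrency(pages, 4, ocrPage)` would keep at most four OCR requests active at once, where `pages` and `ocrPage` are again placeholder names.
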
## Architecture

```
paper-burner-x/
├── js/                          # Frontend JavaScript (Vanilla JS, no framework)
│   ├── app.js                   # Main entry point and event coordinator
│   ├── index.js                 # UI initialization and module management
│   ├── storage/
│   │   └── storage-adapter.js   # Dual-mode storage abstraction
│   ├── chatbot/
│   │   ├── react/               # ReAct framework for document Q&A
│   │   ├── core/                # Chatbot core logic
│   │   ├── agents/              # Vector search, semantic tools
│   │   └── ui/                  # Chatbot UI components
│   ├── process/                 # Document processing, OCR, translation
│   ├── history/                 # Export functionality (PDF, DOCX)
│   └── annotations/             # Annotation and highlighting
├── server/                      # Backend: Node.js/Express + Prisma + PostgreSQL
│   ├── src/
│   │   ├── index.js             # Backend entry point
│   │   ├── routes/              # API route handlers
│   │   ├── middleware/          # Auth, error handling, rate limiting
│   │   └── utils/               # Prisma client, logger, etc.
│   └── package.json
├── local-proxy/                 # Local Node.js proxy server (OCR + academic search)
│   └── server.js
├── workers/                     # Cloudflare Workers
│   ├── pb-ocr-proxy/            # OCR proxy worker
│   └── academic-search-proxy/
├── admin/                       # Admin panel
├── css/                         # Stylesheets (modular structure)
├── tests/                       # Frontend test HTML files (open in browser)
└── docs/                        # Documentation
```

## Dual Deployment Modes

### Frontend Mode (Vercel/Static)
- Pure static hosting
- Uses localStorage + IndexedDB for data persistence
- Storage adapter auto-detects mode via `storage-adapter.js`
- No backend required

### Backend Mode (Docker/Self-hosted)
- Full PostgreSQL database via Prisma
- Backend API at `/api/*`
- User authentication and multi-user support
- Storage adapter switches to API calls automatically

**Mode Detection:** `js/storage/storage-adapter.js` auto-detects backend availability by checking the `/api/health` endpoint.

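
A health-check probe of this kind might look roughly like the following. This is a minimal sketch, not the code in `storage-adapter.js`: the function name `detectDeploymentMode`, the 2-second timeout, and the `'backend'`/`'frontend'` return values are all assumptions made for illustration.

```javascript
// Hypothetical helper: probes /api/health and reports which mode to use.
// `fetchImpl` is injectable so the logic can be exercised without a network.
async function detectDeploymentMode(fetchImpl = globalThis.fetch, timeoutMs = 2000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl('/api/health', { signal: controller.signal });
    // Any 2xx response is treated as "a backend is present".
    return response.ok ? 'backend' : 'frontend';
  } catch (error) {
    // Network failure, timeout, or no backend at all: fall back to browser storage.
    return 'frontend';
  } finally {
    clearTimeout(timer);
  }
}
```

One caveat with this approach: a static host that rewrites unknown paths to `index.html` also answers with a 200, so a real check may want to inspect the response body as well as the status.
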
## Development Commands

```bash
# Start both frontend (port 8080) and backend (port 3456)
./start.sh

# Frontend development
npm run dev:fe       # Vite dev server on port 5173
npm run build:fe     # Build frontend to dist/
npm run preview:fe   # Preview built frontend

# Backend development (in server/ directory, starting from the repo root)
cd server
npm run dev              # Nodemon dev server
npm test                 # Run Jest tests
npm run prisma:migrate   # Run database migrations
npm run prisma:studio    # Open Prisma Studio GUI

# Local proxy (in local-proxy/ directory, starting from the repo root)
cd local-proxy
npm start   # Start proxy server
```

## Key Modules

### Frontend Core
- `js/app.js` - Main entry, file handling, translation coordination
- `js/index.js` - UI module registration (`window.ui.registerModule`)
- `js/storage/storage-adapter.js` - Storage abstraction layer

### ReAct Engine (Document Q&A)
- `js/chatbot/react/index.js` - Main ReAct module entry
- `js/chatbot/react/tool-registry.js` - 10 retrieval tools (grep, vector search, fetch, etc.)
- `js/chatbot/agents/` - Vector store, semantic search, BM25

### Document Processing
- `js/process/ocr.js` - OCR processing (supports mineru/doc2x adapters)
- `js/process/document.js` - Document parsing and chunking
- `js/process/glossary-*.js` - Terminology matching

### Backend Routes
- `server/src/routes/auth.js` - Authentication
- `server/src/routes/document.js` - Document management
- `server/src/routes/translation.js` - Translation API
- `server/src/routes/chat.js` - Chatbot API

## Important Patterns

### Module Registration (Frontend)
```javascript
// UI modules register themselves via window.ui.registerModule
window.ui.registerModule('moduleName', {
  func1, func2, // ...plus any other functions the module exposes
});

// Check module status
window.ui.moduleStatus['moduleName'] // true/false
```

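To make the registration pattern concrete, here is a small stand-in registry. It is a sketch only: `createUiRegistry` and `isReady` are hypothetical names, and the real `window.ui` object in `js/index.js` may store modules differently.

```javascript
// Hypothetical stand-in for window.ui; only registerModule and moduleStatus
// are named in this guide, everything else here is an assumption.
function createUiRegistry() {
  const modules = {};
  const moduleStatus = {};
  return {
    modules,
    moduleStatus,
    registerModule(name, api) {
      if (typeof name !== 'string' || name === '') {
        throw new TypeError('module name must be a non-empty string');
      }
      modules[name] = Object.freeze({ ...api }); // shallow copy so later edits do not leak in
      moduleStatus[name] = true;
    },
    isReady(name) {
      return moduleStatus[name] === true;
    },
  };
}

const ui = createUiRegistry();
ui.registerModule('toast', { show: (message) => `toast: ${message}` });
```

Checking `ui.moduleStatus['toast']` before calling into a module mirrors the status check shown above and avoids errors when scripts load in a different order.
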
### Storage Adapter Usage
```javascript
// Automatically uses correct storage based on deployment mode
const settings = await window.storageAdapter.loadSettings();
await window.storageAdapter.saveSettings(settings);
```

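The reason callers never branch on deployment mode is that both storage backends can expose the same async interface. The sketch below illustrates that idea under stated assumptions: the `/api/settings` route, the `createStorageAdapter` factory, and the in-memory `Map` standing in for localStorage are all hypothetical and not taken from the project.

```javascript
// Hypothetical browser-side backend; a Map stands in for localStorage.
function createLocalBackend(store = new Map()) {
  return {
    async loadSettings() {
      const raw = store.get('settings');
      return raw ? JSON.parse(raw) : {};
    },
    async saveSettings(settings) {
      store.set('settings', JSON.stringify(settings));
    },
  };
}

// Hypothetical API-backed backend; the /api/settings route is an assumption.
function createApiBackend(fetchImpl, baseUrl = '/api') {
  return {
    async loadSettings() {
      const res = await fetchImpl(`${baseUrl}/settings`);
      if (!res.ok) throw new Error(`load failed: ${res.status}`);
      return res.json();
    },
    async saveSettings(settings) {
      const res = await fetchImpl(`${baseUrl}/settings`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(settings),
      });
      if (!res.ok) throw new Error(`save failed: ${res.status}`);
    },
  };
}

// Picks one backend; callers only ever see loadSettings/saveSettings.
function createStorageAdapter(mode, deps = {}) {
  return mode === 'backend'
    ? createApiBackend(deps.fetchImpl ?? globalThis.fetch)
    : createLocalBackend(deps.store);
}
```

Because both objects share one shape, calling code like the snippet above stays identical in either mode.
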
### ReAct Engine Usage
```javascript
const engine = new window.ReActEngine({
  maxIterations: 5,
  tokenBudget: { totalBudget: 32000, contextTokens: 18000 },
  llmConfig: { /* ... */ }
});

const generator = engine.run(userQuestion, docContent, systemPrompt, chatHistory);
for await (const event of generator) {
  // Handle streaming events
}
```

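Since `engine.run` is consumed with `for await`, it behaves like an async generator, and the handler loop can be sketched against a stub. The event shapes below (`thought`, `tool_call`, `answer_chunk`) are invented for illustration; the real event types emitted by the ReAct engine are not documented here and may differ.

```javascript
// Hypothetical stand-in for engine.run(); event types are illustrative only.
async function* fakeRun(question) {
  yield { type: 'thought', text: `Need to look up: ${question}` };
  yield { type: 'tool_call', tool: 'grep', args: { pattern: question } };
  yield { type: 'answer_chunk', text: 'Partial ' };
  yield { type: 'answer_chunk', text: 'answer.' };
}

// Accumulates answer text and forwards every other event to `onStep`.
async function collectAnswer(generator, onStep = () => {}) {
  let answer = '';
  for await (const event of generator) {
    if (event.type === 'answer_chunk') {
      answer += event.text;
    } else {
      onStep(event);
    }
  }
  return answer;
}
```

Separating answer chunks from intermediate steps lets a UI stream the reply while showing reasoning and tool activity in a separate panel.
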
## Testing

- **Backend tests:** `cd server && npm test` (Jest)
- **Frontend tests:** Open HTML files in `tests/` directory directly in browser

## Environment Variables

Backend (`server/.env`):
```
DATABASE_URL=postgresql://...
JWT_SECRET=...
PORT=3456
```

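Validating these values at startup tends to surface misconfiguration earlier than a failed database connection would. The following is a sketch of such a check, assuming that `DATABASE_URL` and `JWT_SECRET` are required and that `PORT` falls back to 3456; `loadServerConfig` is a hypothetical helper, not a function from `server/src/`.

```javascript
// Hypothetical startup check for the backend variables listed above.
function loadServerConfig(env = process.env) {
  const missing = ['DATABASE_URL', 'JWT_SECRET'].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  // Assumed default: 3456, the port used in the example .env above.
  const port = Number.parseInt(env.PORT ?? '3456', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return { databaseUrl: env.DATABASE_URL, jwtSecret: env.JWT_SECRET, port };
}
```

Passing `env` as a parameter keeps the function easy to test without mutating `process.env`.
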
Local proxy (`local-proxy/.env`):
```
OCR_API_KEY=...
ACADEMIC_SEARCH_API_KEY=...
```

## Notes

- Frontend uses Vanilla JavaScript (no React/Vue framework)
- CSS is organized in `css/history_detail/` with a modular structure
- Workers are deployed to Cloudflare for OCR and academic search proxies
- The project supports both Chinese and English interfaces