Architecture
Analysis Date: 2026-04-17
Pattern Overview
Overall: Docker Compose service orchestration with no compiled application source code. All services are pre-built images configured via environment variables and volume mounts.
Key Characteristics:
- Fully local-first, zero cloud dependency
- All container-to-container communication stays inside the Docker network
- One service (Ollama) runs on the host machine, accessed via host.docker.internal
- PostgreSQL replaces the default SQLite backend for production-grade persistence
- AI metadata enrichment is a satellite layer: it augments Paperless-ngx via REST API without modifying the core storage system
Services
broker (Redis 8):
- Purpose: Message broker and task queue for Paperless-ngx's document processing pipeline
- Image: docker.io/library/redis:8
- Port: 6379 (internal only, not exposed to host)
- Persistent volume: redisdata:/data
- Depended on by: webserver
db (PostgreSQL 18):
- Purpose: Primary relational database for all Paperless-ngx application data (documents, tags, correspondents, document types, users)
- Image: docker.io/library/postgres:18
- Port: 5432 (internal only, not exposed to host)
- Persistent volume: pgdata:/var/lib/postgresql
- Credentials: set via POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD (all default to paperless in docker-compose.yml)
- Depended on by: webserver
webserver (Paperless-ngx):
- Purpose: Core document management system that handles file ingestion, OCR, full-text search, storage, and the REST API
- Image: ghcr.io/paperless-ngx/paperless-ngx:latest
- Port: 8000:8000 (exposed to host)
- Volumes:
  - data:/usr/src/paperless/data (application data and search index)
  - media:/usr/src/paperless/media (stored document files)
  - ./export:/usr/src/paperless/export (export directory; host-mounted, gitignored)
  - ./consume:/usr/src/paperless/consume (document inbox for file-drop ingestion; host-mounted, gitignored)
- Configuration: docker-compose.env (not in git, must be created manually) + inline environment (PAPERLESS_REDIS, PAPERLESS_DBHOST)
- Depends on: db, broker
- Depended on by: paperless-ai
paperless-ai:
- Purpose: AI enrichment satellite that polls Paperless-ngx via REST API, sends document text to Ollama for analysis, and writes the generated metadata (title, tags, correspondent, document type, date) back via the API
- Image: clusterzx/paperless-ai
- Container name: paperless-ai (explicit)
- Port: 3000:3000 (exposed to host, configurable via PAPERLESS_AI_PORT env var)
- Persistent volume: paperless-ai_data:/app/data, which stores its own config, including the .env file with the API token and Ollama settings
- Security hardening: cap_drop: ALL, no-new-privileges: true
- RAG integration: RAG_SERVICE_URL=http://webserver:8000, RAG_SERVICE_ENABLED=true
- Depends on: webserver
- Reaches Ollama via: http://host.docker.internal:11434 (host machine, not containerized)
Ollama (host process, not containerized):
- Purpose: Local LLM inference; runs llama3.2 (and llama2) for document analysis
- Runs on: host machine (GPU access)
- Port: 11434 on host
- Started with: ollama serve (must be running before docker compose up)
- Reached from containers via: http://host.docker.internal:11434
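The "Depends on" entries above imply a start order for the four containers. A minimal sketch of that order, assuming only the dependencies listed in this section (Ollama is left out because it runs on the host and is not managed by Compose):

```python
from graphlib import TopologicalSorter

# Service -> services it depends on, copied from the "Depends on" entries above.
DEPENDS_ON = {
    "broker": set(),
    "db": set(),
    "webserver": {"db", "broker"},
    "paperless-ai": {"webserver"},
}


def startup_order(depends_on):
    """Return the services so that each one appears after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def startup_waves(depends_on):
    """Group the services into waves whose members could start in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(startup_waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

This only models ordering. It says nothing about readiness: a container being started does not mean the service inside it is accepting connections yet.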
Data Flow
Document Ingestion:
- User drops a file into the ./consume/ directory or uploads it via the web UI at http://localhost:8000
- webserver detects the file, runs OCR (Tesseract), and extracts text and metadata
- Document is stored in the media volume; a record is written to PostgreSQL via db
- Task is queued through broker (Redis) for async processing steps
- Full-text index is updated in the data volume
AI Enrichment (every 30 minutes):
- paperless-ai cron job fires (SCAN_INTERVAL=*/30 * * * *)
- Fetches unprocessed documents from http://webserver:8000/api using PAPERLESS_API_TOKEN
- Sends document text to Ollama at http://host.docker.internal:11434 with model llama3.2
- Ollama returns structured metadata: title, tags, correspondent, document type, date
- paperless-ai writes the metadata back to Paperless-ngx via REST API
- ChromaDB/RAGZ creates vector embeddings (SentenceTransformer) stored in the paperless-ai_data volume
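The Ollama step of this loop can be sketched as follows. This is not the code paperless-ai runs; it is an illustration that assumes Ollama's non-streaming /api/generate endpoint with JSON output. The prompt wording, the METADATA_FIELDS key names, and the function names are all hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://host.docker.internal:11434"  # as reached from a container
MODEL = "llama3.2"
# Hypothetical key names for the metadata listed in the steps above.
METADATA_FIELDS = ("title", "tags", "correspondent", "document_type", "date")


def build_ollama_request(text, model=MODEL, base_url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request for one document."""
    prompt = (
        "Return a JSON object with the keys "
        + ", ".join(METADATA_FIELDS)
        + " describing this document:\n\n"
        + text
    )
    body = {"model": model, "prompt": prompt, "stream": False, "format": "json"}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_metadata(ollama_reply):
    """Keep only the expected fields from the model's JSON answer, if it is valid."""
    try:
        candidate = json.loads(ollama_reply.get("response") or "")
    except json.JSONDecodeError:
        return {}
    if not isinstance(candidate, dict):
        return {}
    return {key: candidate[key] for key in METADATA_FIELDS if key in candidate}


def analyse(text, timeout=120):
    """Send one document to Ollama and return the parsed metadata (needs a running Ollama)."""
    request = build_ollama_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_metadata(json.load(response))
```

Returning an empty dict on malformed model output lets the caller skip the write-back for that document instead of failing the whole scan.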
State Management:
- Document records and metadata: PostgreSQL (pgdata volume)
- Task queue state: Redis (redisdata volume)
- Raw document files: Docker media volume
- Application/search index data: Docker data volume
- AI config and vector embeddings: Docker paperless-ai_data volume
Entry Points
Document upload (human):
- Web UI: http://localhost:8000
- File drop: ./consume/ directory (host-mounted)
AI dashboard:
- Web UI: http://localhost:3000
- Initial setup: http://localhost:3000/setup
Admin / API:
- Paperless-ngx REST API: http://localhost:8000/api
- Django shell: docker exec -it paperless-webserver-1 python3 manage.py shell
- Token generation: docker exec paperless-webserver-1 python3 manage.py shell -c "..."
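Once a token exists, the REST API can be queried from the host. The sketch below assumes Paperless-ngx's token authentication header (Authorization: Token ...) and its /api/documents/ listing endpoint; the helper names and the page_size value are illustrative.

```python
import json
import urllib.parse
import urllib.request


def build_api_request(base_url, token, path="documents/", params=None):
    """Build an authenticated GET request for a Paperless-ngx API path."""
    url = f"{base_url.rstrip('/')}/api/{path.lstrip('/')}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Token {token}",
            "Accept": "application/json",
        },
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body (needs a running webserver)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder token; replace it with the one generated via manage.py.
    request = build_api_request(
        "http://localhost:8000", "YOUR_TOKEN", params={"page_size": 5}
    )
    print(request.full_url)
```

From inside the Compose network the base URL would be http://webserver:8000 instead of http://localhost:8000.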
Network Topology
All four containerized services share the default Docker Compose bridge network. Service-to-service communication uses Docker DNS names (webserver, db, broker).
Ollama is the only component outside the Docker network. It is reached from paperless-ai using host.docker.internal:11434.
Critical networking constraint: network_mode: bridge must NOT be set on paperless-ai. Doing so isolates it from the Compose network and breaks host.docker.internal resolution, silently preventing Ollama access.
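Because this failure is silent, an explicit reachability probe can help. The following is a minimal sketch that only tests whether a TCP connection can be opened; it does not confirm that the listener is really Ollama. The function name is hypothetical.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name resolution.
        return False


if __name__ == "__main__":
    # From the host use localhost; from a container use host.docker.internal.
    for host in ("localhost", "host.docker.internal"):
        print(host, can_connect(host, 11434))
```

A False result for host.docker.internal inside the container, combined with True for localhost on the host, points at the networking setup rather than at Ollama itself.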
Error Handling
Silent failure modes (documented in .claude/memory/project_coldstart_bug.md):
- Missing docker-compose.env: services start but Paperless-ngx is misconfigured; no error visible on port 8000
- Ollama not running on host: paperless-ai starts successfully but AI enrichment silently fails at each scan interval
- Setting network_mode: bridge on paperless-ai: the container starts and the web UI works, but all Ollama calls fail
Pre-flight checklist (must verify before docker compose up):
1. docker-compose.env exists with ADMIN_USER, ADMIN_PASSWORD, SECRET_KEY
2. Ollama is running: ollama serve
3. network_mode: bridge is NOT present in docker-compose.yml for paperless-ai
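Checks 1 and 3 can be automated with a small script; check 2 needs a live probe of port 11434 and is not covered here. This is a sketch under stated assumptions: the key names are taken from the checklist as written, a PAPERLESS_ prefixed variant is also accepted in case the real file uses one, and the compose check is a line-based scan rather than a YAML parse, so it flags network_mode: bridge on any service.

```python
import re
from pathlib import Path

# Key names as given in the checklist above.
REQUIRED_KEYS = ("ADMIN_USER", "ADMIN_PASSWORD", "SECRET_KEY")


def missing_env_keys(env_text, required=REQUIRED_KEYS):
    """Return required keys that have no non-empty KEY=value line in the env text."""
    defined = set()
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value.strip():
            defined.add(key.strip())
    return [
        key
        for key in required
        if key not in defined and f"PAPERLESS_{key}" not in defined
    ]


def has_bridge_network_mode(compose_text):
    """Return True if any uncommented line sets network_mode to bridge."""
    pattern = re.compile(r"^\s*network_mode:\s*[\"']?bridge[\"']?\s*(#.*)?$")
    return any(pattern.match(line) for line in compose_text.splitlines())


def preflight(env_path="docker-compose.env", compose_path="docker-compose.yml"):
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    env_file = Path(env_path)
    if not env_file.is_file():
        problems.append(f"{env_path} is missing")
    else:
        missing = missing_env_keys(env_file.read_text())
        if missing:
            problems.append(f"{env_path} lacks: {', '.join(missing)}")
    compose_file = Path(compose_path)
    if compose_file.is_file() and has_bridge_network_mode(compose_file.read_text()):
        problems.append(f"{compose_path} sets network_mode: bridge")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("FAIL:", problem)
```

Running it from the directory that holds docker-compose.yml before docker compose up turns two of the silent failure modes into visible messages.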
Planned Extensions
Nullfeld-Integration: FPGA x SoC star maps layer (not yet implemented)
Vector-Bridge to Eule/Qdrant: Cross-system RAG connecting this stack to the crumbforest.org Qdrant instance (not yet implemented)
Custom Fields: Paperless-ngx custom field configuration (not yet done)