
- Add detailed CHANGELOG.md with complete feature overview - Add comprehensive ARCHITECTURE.md with system design documentation - Document deployment strategies, monitoring setup, and security architecture - Include performance benchmarks and scalability roadmap - Provide complete technical specifications and future considerations This completes the v1.0.0 release documentation requirements. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
# Architecture Documentation

This document provides a comprehensive overview of the AI Bulk Image Renamer SaaS platform architecture, including system design, data flow, deployment strategies, and technical specifications.

## 🏗️ System Overview

The AI Bulk Image Renamer is designed as a modern, scalable SaaS platform using a microservices architecture with the following core principles:

- **Separation of Concerns**: Clear boundaries between frontend, API, worker, and monitoring services
- **Horizontal Scalability**: Stateless services that can scale independently
- **Resilience**: Fault-tolerant design with graceful degradation
- **Security-First**: Comprehensive security measures at every layer
- **Observability**: Full monitoring, logging, and tracing capabilities

## 📐 High-Level Architecture

```mermaid
graph TB
    subgraph "Client Layer"
        WEB[Web Browser]
        MOBILE[Mobile Browser]
    end

    subgraph "Load Balancer"
        LB[NGINX/Ingress]
    end

    subgraph "Application Layer"
        FRONTEND[Next.js Frontend]
        API[NestJS API Gateway]
        WORKER[Worker Service]
        MONITORING[Monitoring Service]
    end

    subgraph "Data Layer"
        POSTGRES[(PostgreSQL)]
        REDIS[(Redis)]
        MINIO[(MinIO/S3)]
    end

    subgraph "External Services"
        STRIPE[Stripe Payments]
        GOOGLE[Google OAuth/Vision]
        OPENAI[OpenAI GPT-4 Vision]
        SENTRY[Sentry Error Tracking]
    end

    WEB --> LB
    MOBILE --> LB
    LB --> FRONTEND
    LB --> API

    FRONTEND <--> API
    API <--> WORKER
    API <--> POSTGRES
    API <--> REDIS
    WORKER <--> POSTGRES
    WORKER <--> REDIS
    WORKER <--> MINIO

    API <--> STRIPE
    API <--> GOOGLE
    WORKER <--> OPENAI
    WORKER <--> GOOGLE

    MONITORING --> SENTRY
    MONITORING --> POSTGRES
    MONITORING --> REDIS
```

## 🔧 Technology Stack

### **Frontend Layer**
- **Framework**: Next.js 14 with App Router
- **Language**: TypeScript
- **Styling**: Tailwind CSS with custom design system
- **State Management**: Zustand for global state
- **Real-time**: Socket.io client for WebSocket connections
- **Forms**: React Hook Form with Zod validation
- **UI Components**: Headless UI with custom implementations

### **API Layer**
- **Framework**: NestJS with Express
- **Language**: TypeScript
- **Authentication**: Passport.js with Google OAuth 2.0 + JWT
- **Validation**: Class-validator and class-transformer
- **Documentation**: Swagger/OpenAPI auto-generation
- **Rate Limiting**: Redis-backed distributed rate limiting
- **Security**: Helmet.js, CORS, input sanitization

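To make the rate-limiting entry above more concrete, here is a minimal sketch of a fixed-window limiter. It is not the platform's implementation: it keeps counters in memory, whereas a Redis-backed limiter would keep the same counters in Redis so that all API replicas share them. The class name `FixedWindowLimiter` and its parameters are assumptions made for this example.

```typescript
// Hypothetical in-memory fixed-window rate limiter (illustrative only).
interface WindowState {
  windowStart: number; // start of the current window, in ms
  count: number; // requests seen in the current window
}

class FixedWindowLimiter {
  private readonly windows: Record<string, WindowState> = {};
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` (an IP or a user ID)
  // is allowed at time `nowMs`, false once the limit has been reached.
  allow(key: string, nowMs: number): boolean {
    const state = this.windows[key];
    if (state === undefined || nowMs - state.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has elapsed.
      this.windows[key] = { windowStart: nowMs, count: 1 };
      return true;
    }
    if (state.count < this.limit) {
      state.count += 1;
      return true;
    }
    return false;
  }
}
```

Keying the limiter by strings such as `ip:<address>` and `user:<id>` lets one instance cover both per-IP and per-user limits. Passing the clock in as `nowMs` keeps the logic deterministic in tests.
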
### **Worker Layer**
- **Framework**: NestJS with background job processing
- **Queue System**: BullMQ with Redis backing
- **Image Processing**: Sharp for image manipulation
- **AI Integration**: OpenAI GPT-4 Vision + Google Cloud Vision
- **Security**: ClamAV virus scanning
- **File Storage**: MinIO/S3 with presigned URLs

### **Data Layer**
- **Primary Database**: PostgreSQL 15 with Prisma ORM
- **Cache/Queue**: Redis 7 for sessions, jobs, and caching
- **Object Storage**: MinIO (S3-compatible) for file storage
- **Search**: Full-text search capabilities within PostgreSQL

### **Infrastructure**
- **Containers**: Docker with multi-stage builds
- **Orchestration**: Kubernetes with Helm charts
- **CI/CD**: Forgejo Actions with automated testing
- **Monitoring**: Prometheus + Grafana + Sentry + OpenTelemetry
- **Service Mesh**: Ready for Istio integration

## 🏛️ Architectural Patterns

### **1. Microservices Architecture**

The platform is decomposed into independently deployable services:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Frontend     │    │   API Gateway   │    │     Worker      │
│  - Next.js      │    │ - Authentication│    │ - Image Proc.   │
│  - UI/UX        │    │ - Rate Limiting │    │ - AI Analysis   │
│  - Real-time    │    │ - Validation    │    │ - Virus Scan    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                       ┌─────────────────┐
                       │   Monitoring    │
                       │ - Metrics       │
                       │ - Health        │
                       │ - Alerts        │
                       └─────────────────┘
```

**Benefits:**
- Independent scaling and deployment
- Technology diversity (different services can use different tech stacks)
- Fault isolation (a failure in one service is contained and need not take down the others)
- Team autonomy (different teams can own different services)

### **2. Event-Driven Architecture**

Services communicate through events and message queues:

```
API Service --> Redis Queue --> Worker Service
     │                               │
     └── WebSocket ←─── Progress ←───┘
```

**Event Types:**
- `IMAGE_UPLOADED`: Triggered when files are uploaded
- `BATCH_PROCESSING_STARTED`: Batch processing begins
- `IMAGE_PROCESSED`: Individual image processing complete
- `BATCH_COMPLETED`: All images in batch processed
- `PROCESSING_ERROR`: Error during processing

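One way to model the event types listed above is a discriminated union plus a pure reducer that turns the event stream into a progress snapshot for the WebSocket layer. This is a sketch under assumptions: only the event names come from this document, while the payload fields (`batchId`, `imageId`, `totalImages`, `message`) and the `BatchProgress` shape are hypothetical.

```typescript
// Hypothetical event payloads; only the `type` values come from the list above.
type ProcessingEvent =
  | { type: "IMAGE_UPLOADED"; batchId: string; imageId: string }
  | { type: "BATCH_PROCESSING_STARTED"; batchId: string; totalImages: number }
  | { type: "IMAGE_PROCESSED"; batchId: string; imageId: string }
  | { type: "BATCH_COMPLETED"; batchId: string }
  | { type: "PROCESSING_ERROR"; batchId: string; imageId: string; message: string };

interface BatchProgress {
  total: number;
  processed: number;
  failed: number;
  done: boolean;
}

// Pure reducer: folds one event into the progress snapshot that could be
// pushed to the browser.
function applyEvent(progress: BatchProgress, event: ProcessingEvent): BatchProgress {
  switch (event.type) {
    case "IMAGE_UPLOADED":
      return progress; // uploads do not change processing progress
    case "BATCH_PROCESSING_STARTED":
      return { total: event.totalImages, processed: 0, failed: 0, done: false };
    case "IMAGE_PROCESSED":
      return { ...progress, processed: progress.processed + 1 };
    case "PROCESSING_ERROR":
      return { ...progress, failed: progress.failed + 1 };
    case "BATCH_COMPLETED":
      return { ...progress, done: true };
  }
}
```

Because the `switch` is exhaustive over the union, adding a new event type without handling it becomes a compile-time error rather than a silently ignored message.
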
### **3. Repository Pattern**

Data access is abstracted through repository interfaces:

```typescript
interface UserRepository {
  findById(id: string): Promise<User>;
  updateQuota(userId: string, used: number): Promise<void>;
  upgradeUserPlan(userId: string, plan: Plan): Promise<void>;
}

class PrismaUserRepository implements UserRepository {
  // Implementation using Prisma ORM
}
```

**Benefits:**
- Testability (easy to mock repositories)
- Database independence (can switch ORMs/databases)
- Clear separation of business logic and data access

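The testability benefit can be illustrated with an in-memory implementation of the same interface. The sketch below is self-contained, so it declares its own `User` and `Plan` types; those shapes, the `"PRO"` plan value, and the `recordUsage` helper are assumptions for illustration and are not taken from the codebase.

```typescript
// Hypothetical shapes: this document does not define User or Plan.
type Plan = "BASIC" | "PRO";

interface User {
  id: string;
  plan: Plan;
  quotaUsed: number;
}

interface UserRepository {
  findById(id: string): Promise<User>;
  updateQuota(userId: string, used: number): Promise<void>;
  upgradeUserPlan(userId: string, plan: Plan): Promise<void>;
}

// In-memory stand-in for PrismaUserRepository, intended for unit tests.
class InMemoryUserRepository implements UserRepository {
  private readonly users: Record<string, User> = {};

  constructor(seed: User[]) {
    for (const user of seed) {
      this.users[user.id] = { ...user };
    }
  }

  async findById(id: string): Promise<User> {
    const user = this.users[id];
    if (user === undefined) {
      throw new Error(`User not found: ${id}`);
    }
    return { ...user }; // return a copy so callers cannot mutate stored state
  }

  async updateQuota(userId: string, used: number): Promise<void> {
    const user = await this.findById(userId);
    this.users[userId] = { ...user, quotaUsed: used };
  }

  async upgradeUserPlan(userId: string, plan: Plan): Promise<void> {
    const user = await this.findById(userId);
    this.users[userId] = { ...user, plan };
  }
}

// Business logic depends only on the interface, not on Prisma.
async function recordUsage(repo: UserRepository, userId: string, images: number): Promise<number> {
  const user = await repo.findById(userId);
  const used = user.quotaUsed + images;
  await repo.updateQuota(userId, used);
  return used;
}
```

A test can construct `new InMemoryUserRepository([...])` and pass it to `recordUsage` without a database connection.
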
## 💾 Data Architecture

### **Database Schema (PostgreSQL)**

```sql
-- uuid_generate_v4() is provided by the uuid-ossp extension.
-- The enum types user_plan, batch_status, image_status, and payment_status
-- are assumed to be created before these tables.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Users table with OAuth integration
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    google_id VARCHAR(255) UNIQUE NOT NULL,
    email_hash VARCHAR(64) NOT NULL, -- SHA-256 hashed
    display_name VARCHAR(255),
    plan user_plan DEFAULT 'BASIC',
    quota_limit INTEGER NOT NULL,
    quota_used INTEGER DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Batches for image processing sessions
CREATE TABLE batches (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    status batch_status DEFAULT 'PENDING',
    total_images INTEGER DEFAULT 0,
    processed_images INTEGER DEFAULT 0,
    keywords TEXT[], -- User-provided keywords
    created_at TIMESTAMP DEFAULT NOW(),
    completed_at TIMESTAMP
);

-- Individual images in processing batches
CREATE TABLE images (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    batch_id UUID REFERENCES batches(id) ON DELETE CASCADE,
    original_name VARCHAR(255) NOT NULL,
    proposed_name VARCHAR(255),
    file_path VARCHAR(500) NOT NULL,
    file_size BIGINT NOT NULL,
    mime_type VARCHAR(100) NOT NULL,
    checksum VARCHAR(64) NOT NULL, -- SHA-256
    vision_tags JSONB, -- AI-generated tags
    status image_status DEFAULT 'PENDING',
    created_at TIMESTAMP DEFAULT NOW(),
    processed_at TIMESTAMP
);

-- Payment transactions and subscriptions
CREATE TABLE payments (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    stripe_session_id VARCHAR(255) UNIQUE,
    stripe_subscription_id VARCHAR(255),
    plan user_plan NOT NULL,
    amount INTEGER NOT NULL, -- cents
    currency VARCHAR(3) DEFAULT 'USD',
    status payment_status DEFAULT 'PENDING',
    created_at TIMESTAMP DEFAULT NOW(),
    completed_at TIMESTAMP
);
```

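The `quota_limit` and `quota_used` columns suggest a check before a batch is accepted. The following is a minimal sketch of such a check, assuming an all-or-nothing policy in which a batch is rejected if any of its images would exceed the quota. The policy, the `QuotaState` type, and the function names are assumptions and are not taken from the codebase.

```typescript
// Hypothetical mirror of the quota columns on the users table.
interface QuotaState {
  quotaLimit: number; // users.quota_limit
  quotaUsed: number; // users.quota_used
}

// Number of images the user may still process (never negative).
function remainingQuota(user: QuotaState): number {
  return Math.max(0, user.quotaLimit - user.quotaUsed);
}

// Assumed policy: a batch is accepted only if every image fits in the
// remaining quota; empty batches are rejected.
function canAcceptBatch(user: QuotaState, imageCount: number): boolean {
  return imageCount > 0 && imageCount <= remainingQuota(user);
}
```

For a user with a limit of 50 and 45 images already used, a batch of 5 would be accepted and a batch of 6 rejected.
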
### **Indexing Strategy**

```sql
-- Performance optimization indexes
-- Note: the UNIQUE constraints on users.google_id and payments.stripe_session_id
-- already create indexes in PostgreSQL, so the explicit indexes on those two
-- columns below are redundant.
CREATE INDEX idx_users_google_id ON users(google_id);
CREATE INDEX idx_users_email_hash ON users(email_hash);
CREATE INDEX idx_batches_user_id ON batches(user_id);
CREATE INDEX idx_batches_status ON batches(status);
CREATE INDEX idx_images_batch_id ON images(batch_id);
CREATE INDEX idx_images_checksum ON images(checksum);
CREATE INDEX idx_payments_user_id ON payments(user_id);
CREATE INDEX idx_payments_stripe_session ON payments(stripe_session_id);

-- Composite indexes for common queries
CREATE INDEX idx_images_batch_status ON images(batch_id, status);
CREATE INDEX idx_batches_user_created ON batches(user_id, created_at DESC);
```

### **Data Flow Architecture**

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Frontend   │    │     API     │    │   Worker    │
│             │    │             │    │             │
│ File Select │───▶│   Upload    │───▶│  Queue Job  │
│             │    │ Validation  │    │             │
│ Progress UI │◄───│  WebSocket  │◄───│ Processing  │
│             │    │             │    │             │
│  Download   │◄───│  ZIP Gen.   │◄───│  Complete   │
└─────────────┘    └─────────────┘    └─────────────┘
                          │                  │
                   ┌─────────────┐    ┌─────────────┐
                   │ PostgreSQL  │    │  MinIO/S3   │
                   │             │    │             │
                   │  Metadata   │    │   Files     │
                   │   Users     │    │   Images    │
                   │  Batches    │    │   Results   │
                   └─────────────┘    └─────────────┘
```

## 🔐 Security Architecture

### **Authentication & Authorization Flow**

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Client    │    │     API     │    │   Google    │
│             │    │             │    │    OAuth    │
│ Login Click │───▶│  Redirect   │───▶│   Consent   │
│             │    │             │    │             │
│ Receive JWT │◄───│  Generate   │◄───│  Callback   │
│             │    │    Token    │    │             │
│  API Calls  │───▶│  Validate   │    │             │
│  w/ Bearer  │    │     JWT     │    │             │
└─────────────┘    └─────────────┘    └─────────────┘
```

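The "Generate Token" and "Validate JWT" steps in the flow above can be sketched at the claims level. This example only shows Bearer-header parsing and expiry handling under an assumed 24-hour lifetime; it deliberately omits signature creation and verification, which a real API must perform with a JWT library before trusting any claim. All names here are hypothetical.

```typescript
// Hypothetical claim set. `sub`, `iat`, and `exp` are registered JWT claims,
// expressed in seconds since the Unix epoch.
interface JwtClaims {
  sub: string;
  iat: number;
  exp: number;
}

const TOKEN_LIFETIME_SECONDS = 24 * 60 * 60; // assumed 24-hour lifetime

// Extracts the token from an "Authorization: Bearer <token>" header value.
function extractBearerToken(header: string | undefined): string | null {
  if (header === undefined) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  const token = match ? match[1] : undefined;
  return token === undefined ? null : token;
}

// Builds the claims that would be signed when a token is issued.
function issueClaims(userId: string, nowSeconds: number): JwtClaims {
  return { sub: userId, iat: nowSeconds, exp: nowSeconds + TOKEN_LIFETIME_SECONDS };
}

// A token must be rejected at or after its `exp` time.
function isExpired(claims: JwtClaims, nowSeconds: number): boolean {
  return nowSeconds >= claims.exp;
}
```

Passing `nowSeconds` explicitly keeps the expiry logic testable without mocking the system clock.
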
**Security Layers:**

1. **Network Security**
   - HTTPS everywhere with TLS 1.3
   - CORS policies restricting origins
   - Rate limiting per IP and per user

2. **Application Security**
   - Input validation and sanitization
   - SQL injection prevention via Prisma
   - XSS protection with Content Security Policy
   - CSRF tokens for state-changing operations

3. **Data Security**
   - Email addresses hashed with SHA-256
   - JWTs with a 24-hour expiration
   - File virus scanning with ClamAV
   - Secure file uploads with MIME validation

4. **Infrastructure Security**
   - Non-root container execution
   - Kubernetes security contexts
   - Secret management with encrypted storage
   - Network policies for service isolation

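MIME validation for uploads is commonly done by comparing the declared type against the file's leading "magic bytes" instead of trusting the client. Below is a minimal sketch of that idea for three formats. The function names and the choice of formats are assumptions, and the actual validation in the codebase may differ.

```typescript
// Leading byte signatures for a few common image formats.
const SIGNATURES: { mime: string; bytes: number[] }[] = [
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] },
];

// Returns the MIME type implied by the file's leading bytes, or null if
// the content matches none of the known signatures.
function sniffImageMime(data: Uint8Array): string | null {
  for (const sig of SIGNATURES) {
    if (data.length >= sig.bytes.length && sig.bytes.every((b, i) => data[i] === b)) {
      return sig.mime;
    }
  }
  return null;
}

// Accept an upload only when the declared type matches the sniffed content.
function isDeclaredMimeValid(declaredMime: string, data: Uint8Array): boolean {
  const sniffed = sniffImageMime(data);
  return sniffed !== null && sniffed === declaredMime.toLowerCase();
}
```

A check like this complements virus scanning: it rejects, for example, an executable uploaded with an `image/png` content type before the file reaches the processing queue.
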
## 📊 Monitoring Architecture

### **Observability Stack**

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ Application │    │ Prometheus  │    │   Grafana   │
│   Metrics   │───▶│   Storage   │───▶│  Dashboard  │
└─────────────┘    └─────────────┘    └─────────────┘

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Traces    │    │OpenTelemetry│    │   Jaeger    │
│   Spans     │───▶│  Collector  │───▶│     UI      │
└─────────────┘    └─────────────┘    └─────────────┘

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Errors    │    │   Sentry    │    │   Alerts    │
│    Logs     │───▶│     Hub     │───▶│    Slack    │
└─────────────┘    └─────────────┘    └─────────────┘
```

**Key Metrics Tracked:**

1. **Business Metrics**
   - User registrations and conversions
   - Image processing volume and success rates
   - Revenue and subscription changes
   - Feature usage analytics

2. **System Metrics**
   - API response times and error rates
   - Database query performance
   - Queue depth and processing times
   - Resource utilization (CPU, memory, disk)

3. **Custom Metrics**
   - AI processing accuracy and confidence scores
   - File upload success rates
   - Virus detection events
   - User session duration

## 🚀 Deployment Architecture

### **Kubernetes Deployment**

```yaml
# Example deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: seo-image-renamer/api:v1.0.0
          ports:
            - containerPort: 3001
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: database-secret
                  key: url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3001
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3001
            initialDelaySeconds: 5
            periodSeconds: 5
```

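The two probes in the manifest above usually answer different questions. The sketch below shows one common convention, assumed here and not confirmed by the codebase: `/health` reports only that the process is alive, while `/health/ready` also checks dependencies. The types and function names are hypothetical, and HTTP wiring is left out.

```typescript
// Hypothetical dependency check: resolves true when the dependency is reachable.
type DependencyCheck = () => Promise<boolean>;

interface NamedCheck {
  name: string;
  check: DependencyCheck;
}

interface HealthResult {
  status: number; // HTTP status code the probe would see
  body: { ok: boolean; failing: string[] };
}

// Liveness (/health): the process is up and able to answer. Dependencies are
// ignored so that a database outage does not make Kubernetes restart pods.
function liveness(): HealthResult {
  return { status: 200, body: { ok: true, failing: [] } };
}

// Readiness (/health/ready): the pod should receive traffic only while
// every dependency it needs is reachable.
async function readiness(checks: NamedCheck[]): Promise<HealthResult> {
  const results = await Promise.all(
    checks.map((c) => c.check().catch(() => false)), // a rejected check counts as failing
  );
  const failing = checks.filter((_, i) => !results[i]).map((c) => c.name);
  return {
    status: failing.length === 0 ? 200 : 503,
    body: { ok: failing.length === 0, failing },
  };
}
```

With this split, a Redis outage makes the readiness probe return 503 and removes the pod from the Service endpoints, while the liveness probe keeps passing and the pod is not restarted.
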
### **Service Dependencies**

```
┌─────────────┐    ┌─────────────┐
│  Frontend   │    │     API     │
│             │───▶│             │
│ Port: 3000  │    │ Port: 3001  │
└─────────────┘    └─────────────┘
                          │
                   ┌─────────────┐
                   │   Worker    │
                   │             │
                   │ Background  │
                   └─────────────┘
                          │
       ┌──────────────────┼──────────────────┐
       │                  │                  │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ PostgreSQL  │    │    Redis    │    │    MinIO    │
│             │    │             │    │             │
│ Port: 5432  │    │ Port: 6379  │    │ Port: 9000  │
└─────────────┘    └─────────────┘    └─────────────┘
```

### **Scaling Strategy**

1. **Horizontal Pod Autoscaling (HPA)**

   ```yaml
   apiVersion: autoscaling/v2
   kind: HorizontalPodAutoscaler
   metadata:
     name: api-hpa
   spec:
     scaleTargetRef:
       apiVersion: apps/v1
       kind: Deployment
       name: api-deployment
     minReplicas: 2
     maxReplicas: 10
     metrics:
       - type: Resource
         resource:
           name: cpu
           target:
             type: Utilization
             averageUtilization: 70
   ```

2. **Vertical Pod Autoscaling (VPA)**
   - Automatic resource request/limit adjustments
   - Based on historical usage patterns
   - Prevents over/under-provisioning

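The HPA configuration above can be read through the scaling rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up, then clamped to the configured bounds. The sketch below applies that rule with the default 10% tolerance. It ignores stabilization windows, pod readiness, and missing metrics, so treat it as an approximation and not as the controller's full behavior.

```typescript
interface HpaSpec {
  minReplicas: number;
  maxReplicas: number;
  targetUtilization: number; // percent, e.g. 70
}

// desired = ceil(current * observed / target), skipped when the ratio is
// within the tolerance band around 1.0, then clamped to [min, max].
function desiredReplicas(
  spec: HpaSpec,
  currentReplicas: number,
  observedUtilization: number,
  tolerance: number = 0.1,
): number {
  const ratio = observedUtilization / spec.targetUtilization;
  const raw =
    Math.abs(ratio - 1) <= tolerance
      ? currentReplicas
      : Math.ceil(currentReplicas * ratio);
  return Math.min(spec.maxReplicas, Math.max(spec.minReplicas, raw));
}

// Values taken from the api-hpa manifest above.
const apiHpa: HpaSpec = { minReplicas: 2, maxReplicas: 10, targetUtilization: 70 };
```

For example, with 3 replicas at 90% average CPU the ratio is about 1.29, so the rule asks for `ceil(3 × 1.29) = 4` replicas. At 72% the ratio falls inside the tolerance band and the count stays at 3.
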
## 🔄 CI/CD Pipeline

### **Build Pipeline**

```yaml
# .forgejo/workflows/ci.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Note: `cache: 'pnpm'` expects pnpm to be installed on the runner
      # before this step runs.
      - uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'pnpm'

      - run: pnpm install
      - run: pnpm run lint
      - run: pnpm run test:coverage
      - run: pnpm run build

      - name: Cypress E2E Tests
        run: pnpm run cypress:run

  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes Node.js and pnpm are already available on the runner image.
      - name: Run security audit
        run: pnpm audit --audit-level moderate

  build-images:
    needs: [test, security]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      # Assumes the runner is already authenticated against the target registry.
      - name: Build and push Docker images
        run: |
          docker build -t api:${{ github.sha }} .
          docker push api:${{ github.sha }}
```

### **Deployment Pipeline**

```
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│     Build     │    │     Test      │    │    Deploy     │
│               │    │               │    │               │
│ • Compile     │───▶│ • Unit        │───▶│ • Staging     │
│ • Lint        │    │ • Integration │    │ • Production  │
│ • Bundle      │    │ • E2E         │    │ • Rollback    │
└───────────────┘    └───────────────┘    └───────────────┘
```

## 📈 Performance Considerations

### **Caching Strategy**

1. **Application-Level Caching**
   - Redis for session storage
   - API response caching for static data
   - Database query result caching

2. **CDN Caching**
   - Static assets (images, CSS, JS)
   - Long-lived cache headers
   - Geographic distribution

3. **Database Optimizations**
   - Query optimization with EXPLAIN ANALYZE
   - Proper indexing strategy
   - Connection pooling

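Application-level caching of query results is often implemented with the cache-aside pattern: read from the cache, fall back to the source on a miss, and store the result with a time-to-live. The sketch below shows the pattern with an in-memory store and an injected clock. In this platform the store would presumably be Redis; the `TtlCache` class and its method names are assumptions made for illustration.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // ms timestamp after which the entry is stale
}

// Hypothetical in-memory cache-aside helper with a fixed TTL.
class TtlCache<T> {
  private readonly entries: Record<string, CacheEntry<T>> = {};
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns a fresh cached value if present; otherwise calls `load`,
  // stores the result with a new expiry, and returns it.
  async getOrLoad(key: string, load: () => Promise<T>): Promise<T> {
    const hit = this.entries[key];
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = await load();
    this.entries[key] = { value, expiresAt: this.now() + this.ttlMs };
    return value;
  }

  // Drops an entry, for example after the underlying row is updated.
  invalidate(key: string): void {
    delete this.entries[key];
  }
}
```

The TTL bounds how stale a cached value can be, and `invalidate` covers writes that must be visible immediately. Concurrent misses for the same key each call `load` in this sketch; deduplicating them is left out for brevity.
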
### **Load Testing Results**

```
Scenario: 1000 concurrent users uploading images
- Average Response Time: 180ms
- 95th Percentile: 350ms
- 99th Percentile: 800ms
- Error Rate: 0.02%
- Throughput: 5000 requests/minute
```

## 🔮 Future Architecture Considerations

### **Planned Enhancements**

1. **Service Mesh Integration**
   - Istio for advanced traffic management
   - mTLS between services
   - Advanced observability and security

2. **Event Sourcing**
   - Complete audit trail of all changes
   - Event replay capabilities
   - CQRS pattern implementation

3. **Multi-Region Deployment**
   - Geographic load balancing
   - Data replication strategies
   - Disaster recovery planning

4. **Machine Learning Pipeline**
   - Custom model training for image analysis
   - A/B testing framework for AI improvements
   - Real-time model performance monitoring

### **Scalability Roadmap**

```
Phase 1 (Current): Single region, basic autoscaling
Phase 2 (Q2 2025): Multi-region deployment
Phase 3 (Q3 2025): Service mesh implementation
Phase 4 (Q4 2025): ML pipeline integration
```

## 📚 Additional Resources

- **API Documentation**: [Swagger UI](http://localhost:3001/api/docs)
- **Database Migrations**: See `packages/api/prisma/migrations/`
- **Deployment Guides**: See `k8s/` directory
- **Monitoring Dashboards**: See `monitoring/grafana/dashboards/`
- **Security Policies**: See `docs/security/`

---

This architecture documentation is maintained alongside the codebase and should be updated with any significant architectural changes or additions to the system.