🚀 Deployment Guide - Phase 1 Multi-Tenant System
Version: 1.0.0 | Date: January 29, 2026 | Phase: 1 (Multi-Tenant Foundation) | Status: ✅ PRODUCTION READY
📋 Table of Contents
- Pre-Deployment Checklist
- Environment Setup
- Database Migration
- Deployment Steps
- Verification & Smoke Tests
- Rollback Procedures
- Post-Deployment Monitoring
- Troubleshooting
✅ Pre-Deployment Checklist
Critical Requirements
1. Environment Variables (CRITICAL)
- JWT_SECRET - Generate cryptographically secure 64-character random string
# Generate JWT_SECRET
openssl rand -hex 32
# OR
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))" - JWT_EXPIRY - Set token expiration (default:
24h, recommended:8hfor production) - PASSWORD_SALT_ROUNDS - Set bcrypt salt rounds (default:
10, min:10) - MONGODB_URL - Production MongoDB connection string with authentication
- RABBITMQ_URL - Production RabbitMQ connection string
- REDIS_URL - Production Redis connection string
- GEMINI_API_KEY - Google Gemini API key for AI analysis
- ALLOWED_ORIGINS - Comma-separated list of allowed CORS origins
- PUBLIC_API_URL - Public-facing API URL (for report links)
2. Security Configuration
- SSL/TLS Certificates - Ensure HTTPS is configured for API and Dashboard
- CORS Origins - Restrict to production domains only (no * or true)
- Firewall Rules - Only expose necessary ports (80, 443)
- Database Security - Enable authentication, restrict network access
- Secret Rotation - Plan for periodic secret rotation (quarterly)
3. Infrastructure Readiness
- Docker Engine - Version 20.10+ installed on all nodes
- Docker Compose - Version 2.0+ installed
- MongoDB - Version 6.0+ with replica set configured
- RabbitMQ - Version 3.12+ with management plugin enabled
- Redis - Version 7.0+ with persistence enabled
- Persistent Storage - Volumes configured for reports, database, logs
- Resource Limits - CPU/Memory limits configured for containers
4. Code & Dependencies
- Git Repository - All changes committed and pushed
- Build Success - All Docker images build without errors
- Tests Passing - Integration tests pass (8/8)
- Dependencies Audited - No critical vulnerabilities in npm packages
- Documentation Updated - README, .env.example, API docs current
5. Backup & Recovery
- Database Backup - Pre-deployment snapshot created
- Configuration Backup - Current .env and docker-compose.yml saved
- Rollback Plan - Documented and tested
- Recovery Time Objective (RTO) - Defined (recommended: < 1 hour)
- Recovery Point Objective (RPO) - Defined (recommended: < 5 minutes)
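For the snapshot and RPO items above, a minimal helper script can standardise pre-deployment backups (a sketch only; paths match the ones used later in this guide). Note that periodic mongodump alone will not achieve a sub-5-minute RPO - that typically requires replica-set oplog backups or a managed continuous-backup service.
#!/usr/bin/env bash
# backup-db.sh - pre-deployment snapshot helper (sketch; adjust paths to your setup)
set -euo pipefail
ENV_FILE=/opt/agnostic-automation-center/.env
# Read only the connection string instead of sourcing the whole .env
MONGODB_URL="$(grep -E '^MONGODB_URL=' "$ENV_FILE" | cut -d= -f2-)"
DEST="/backup/pre-deploy-$(date +%Y%m%d-%H%M%S)"
mongodump --uri="$MONGODB_URL" --gzip --out="$DEST"
echo "✅ Backup written to $DEST"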
🔧 Environment Setup
1. Create Production Environment File
Location: Server - /opt/agnostic-automation-center/.env
# 1. SSH into production server
ssh user@production-server
# 2. Create directory
sudo mkdir -p /opt/agnostic-automation-center
cd /opt/agnostic-automation-center
# 3. Copy .env.example
sudo cp .env.example .env
# 4. Edit with production values
sudo nano .env
Production .env Template:
# ================================
# PRODUCTION ENVIRONMENT VARIABLES
# ================================
# JWT Authentication (CRITICAL - CHANGE THESE!)
JWT_SECRET=<64-character-random-hex-string> # Generate: openssl rand -hex 32
JWT_EXPIRY=8h
PASSWORD_SALT_ROUNDS=12
# Database (MongoDB)
MONGODB_URL=mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin
# Message Queue (RabbitMQ)
RABBITMQ_URL=amqp://admin:<password>@rabbitmq-prod:5672
# Cache (Redis)
REDIS_URL=redis://:<password>@redis-prod:6379
# AI Analysis (Google Gemini)
GEMINI_API_KEY=<your-gemini-api-key>
# API Configuration
PUBLIC_API_URL=https://api.yourdomain.com
PRODUCER_URL=http://producer:3000
# CORS Configuration
ALLOWED_ORIGINS=https://app.yourdomain.com,https://dashboard.yourdomain.com
# Dashboard Configuration
VITE_API_URL=https://api.yourdomain.com
DASHBOARD_URL=https://app.yourdomain.com
# Reports & Storage
REPORTS_DIR=/app/reports
# Docker & Runtime
NODE_ENV=production
RUNNING_IN_DOCKER=true
# Optional: Inject Environment Variables to Test Containers
INJECT_ENV_VARS=API_USER,API_PASSWORD,SECRET_KEY
# Logging (Optional)
LOG_LEVEL=info
LOG_AUTH=false # Set to true for debugging auth issues
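Because this file holds every production secret, restrict its permissions once it is populated (the owner shown is an assumption - use whichever account runs docker-compose):
# Restrict .env to its owner only
sudo chown root:root /opt/agnostic-automation-center/.env
sudo chmod 600 /opt/agnostic-automation-center/.env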
2. Generate Secrets
# Generate JWT_SECRET (64 characters)
JWT_SECRET=$(openssl rand -hex 32)
echo "JWT_SECRET=$JWT_SECRET"
# Generate MongoDB password
MONGO_PASSWORD=$(openssl rand -base64 32)
echo "MONGO_PASSWORD=$MONGO_PASSWORD"
# Generate RabbitMQ password
RABBIT_PASSWORD=$(openssl rand -base64 32)
echo "RABBIT_PASSWORD=$RABBIT_PASSWORD"
# Generate Redis password
REDIS_PASSWORD=$(openssl rand -base64 32)
echo "REDIS_PASSWORD=$REDIS_PASSWORD"
3. Verify Environment Variables
# Check all critical variables are set
cat .env | grep -E "JWT_SECRET|MONGODB_URL|RABBITMQ_URL|REDIS_URL|GEMINI_API_KEY"
# Verify JWT_SECRET is not default
if grep -q "dev-secret-CHANGE-IN-PRODUCTION" .env; then
echo "❌ ERROR: JWT_SECRET is still using default value!"
else
echo "✅ JWT_SECRET is set correctly"
fi
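A slightly stricter check loops over the critical variables and flags any that are missing or empty (the variable list here is an assumption - extend it to match your .env):
# Warn about any critical variable that is missing or empty
for var in JWT_SECRET MONGODB_URL RABBITMQ_URL REDIS_URL GEMINI_API_KEY ALLOWED_ORIGINS PUBLIC_API_URL; do
grep -q "^${var}=." .env || echo "❌ Missing or empty: $var"
done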
4. Database Connection (MONGO_URI)
[!IMPORTANT] The docker-compose.yml uses ${MONGO_URI} from your .env file to determine the database connection. This is no longer hardcoded.
Configure based on your environment:
| Mode | MONGO_URI Value | Use Case |
|---|---|---|
| Cloud | mongodb+srv://user:pass@cluster.mongodb.net/automation_platform | Production/Staging with MongoDB Atlas |
| Local | mongodb://automation-mongodb:27017/automation_platform | Local development with Docker container |
Setting Cloud Connection:
# In .env file
MONGO_URI=mongodb+srv://your-user:your-password@cluster.mongodb.net/automation_platform?retryWrites=true&w=majority
Setting Local Connection:
# In .env file (uses Docker MongoDB container)
MONGO_URI=mongodb://automation-mongodb:27017/automation_platform
[!WARNING] When switching between Cloud and Local:
- Each database has separate user accounts
- You must create a new account in the new environment
- Existing JWT tokens will be invalid
📊 Database Migration
Pre-Migration Steps
# 1. Backup existing database (CRITICAL)
mongodump --uri="mongodb://localhost:27017/automation_platform" --out=/backup/pre-migration-$(date +%Y%m%d-%H%M%S)
# 2. Verify backup
ls -lh /backup/
# 3. Test backup restoration (on staging first!)
mongorestore --uri="mongodb://staging-server:27017/automation_platform_test" /backup/pre-migration-20260129-120000
Run Migration
# 1. Clone repository (skip if already cloned - git clone refuses a non-empty directory)
cd /opt/agnostic-automation-center
git clone https://github.com/yourusername/agnostic-automation-center.git .
# 2. Install dependencies (for migration script)
npm install
# 3. Run migration script
export MONGODB_URI="mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin"
node migrations/001-add-organization-to-existing-data.ts
# Expected output:
# ✅ Migration started
# ✅ Created default organization
# ✅ Created default admin user
# ✅ Migrated 29 executions
# ✅ Created 15 indexes
# ✅ Migration completed successfully
Post-Migration Verification
# 1. Connect to MongoDB
mongosh "mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin"
# 2. Verify collections and data
use automation_platform
# Check organizations
db.organizations.countDocuments() // Should be >= 1
# Check users
db.users.countDocuments() // Should be >= 1
# Check executions have organizationId
db.executions.findOne() // Should have organizationId field
# Check indexes
db.executions.getIndexes()
db.users.getIndexes()
db.organizations.getIndexes()
# Exit
exit
🚀 Deployment Steps
Deployment Flow
1. Pull latest code
2. Build Docker images
3. Stop old containers
4. Start new containers
5. Verify services
6. Run smoke tests
7. Monitor logs
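The individual commands for each step appear in the staging and production walkthroughs below; for convenience, the same flow can be wrapped into one script (a sketch only, assuming the install path used throughout this guide and the default compose file):
#!/usr/bin/env bash
# deploy.sh - wraps the flow above for a standard (brief-downtime) deployment
set -euo pipefail
cd /opt/agnostic-automation-center
git fetch origin && git checkout main && git pull origin main   # 1. Pull latest code
docker-compose build                                            # 2. Build images
docker-compose down                                             # 3. Stop old containers
docker-compose up -d                                            # 4. Start new containers
docker-compose ps                                               # 5. Verify services
# 6. Run the smoke tests from the Verification section manually
docker-compose logs --tail=50                                   # 7. Spot-check logs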
Staging Deployment (Recommended First)
# 1. SSH to staging server
ssh user@staging-server
cd /opt/agnostic-automation-center
# 2. Pull latest code
git fetch origin
git checkout main
git pull origin main
# 3. Build images
docker-compose -f docker-compose.yml build
# 4. Stop existing containers
docker-compose down
# 5. Start new containers
docker-compose up -d
# 6. Verify all services started
docker-compose ps
# Expected output:
# NAME STATUS
# producer-service Up 30 seconds
# worker-service Up 30 seconds
# dashboard-client Up 30 seconds
# automation-mongodb Up 30 seconds
# automation-rabbitmq Up 30 seconds
# automation-redis Up 30 seconds
# 7. Check logs for errors
docker-compose logs --tail=50 producer
docker-compose logs --tail=50 worker
docker-compose logs --tail=50 dashboard
# 8. Run smoke tests (see Verification section below)
Production Deployment
IMPORTANT: Only deploy to production after successful staging deployment and smoke tests.
# 1. SSH to production server
ssh user@production-server
cd /opt/agnostic-automation-center
# 2. Pull latest code
git fetch origin
git checkout main
git pull origin main
# 3. Verify commit hash matches staging
STAGING_COMMIT="<staging-commit-hash>"
CURRENT_COMMIT=$(git rev-parse HEAD)
if [ "$STAGING_COMMIT" == "$CURRENT_COMMIT" ]; then
echo "✅ Commit matches staging"
else
echo "❌ WARNING: Commit mismatch! Staging: $STAGING_COMMIT, Production: $CURRENT_COMMIT"
exit 1
fi
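# 3b. Preserve the currently running images before rebuilding, so the ":previous" tags
#     referenced in Rollback Procedures actually exist (image names assumed to match those used there)
docker tag agnostic-automation-center-producer:latest agnostic-automation-center-producer:previous
docker tag agnostic-automation-center-worker:latest agnostic-automation-center-worker:previous
docker tag agnostic-automation-center-dashboard:latest agnostic-automation-center-dashboard:previous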
# 4. Build images (production optimized)
docker-compose -f docker-compose.prod.yml build --no-cache
# 5. Create backup of current deployment
docker-compose ps > /backup/deployment-$(date +%Y%m%d-%H%M%S).txt
docker images > /backup/images-$(date +%Y%m%d-%H%M%S).txt
# 6. Rolling deployment (zero downtime)
# Option A: Blue-Green deployment
# Start new containers on different ports, switch load balancer, stop old containers
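# A minimal blue-green sketch (assumes a hypothetical docker-compose.green.yml override
# that remaps host ports, plus a reverse proxy you can repoint; adapt to your infrastructure):
#   docker-compose -f docker-compose.prod.yml -f docker-compose.green.yml -p green up -d
#   curl -f http://localhost:3100/ && echo "green stack healthy"   # assumed alternate port
#   # repoint the load balancer to the green ports, then retire the old stack:
#   docker-compose -f docker-compose.prod.yml -p blue down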
# Option B: Standard deployment (brief downtime)
docker-compose -f docker-compose.prod.yml down
docker-compose -f docker-compose.prod.yml up -d
# 7. Verify deployment
docker-compose -f docker-compose.prod.yml ps
docker-compose -f docker-compose.prod.yml logs --tail=100
# 8. Health check
curl -f https://api.yourdomain.com/ || echo "❌ API health check failed"
curl -f https://app.yourdomain.com/ || echo "❌ Dashboard health check failed"
Post-Deployment Cleanup
# Remove unused images
docker image prune -a -f
# Remove unused volumes (CAREFUL - only if absolutely sure)
# docker volume prune -f
# Check disk usage
docker system df
✅ Verification & Smoke Tests
1. Service Health Checks
# Producer Service
curl -f http://localhost:3000/
# Expected: {"message":"Agnostic Producer Service is running!"}
# Dashboard Client
curl -f http://localhost:8080/
# Expected: HTML response
# MongoDB
docker exec -it automation-mongodb mongosh --eval "db.adminCommand('ping')"
# Expected: { ok: 1 }
# RabbitMQ
docker exec -it automation-rabbitmq rabbitmqctl status
# Expected: Status information
# Redis
docker exec -it automation-redis redis-cli ping
# Expected: PONG
2. Authentication Tests
# Test signup endpoint
curl -X POST http://localhost:3000/api/auth/signup \
-H "Content-Type: application/json" \
-d '{
"email": "testuser@example.com",
"password": "Test123!@#",
"name": "Test User",
"organizationName": "Test Organization"
}'
# Expected:
# {
# "success": true,
# "token": "eyJhbGc...",
# "user": { ... }
# }
# Test login endpoint
curl -X POST http://localhost:3000/api/auth/login \
-H "Content-Type: application/json" \
-d '{
"email": "testuser@example.com",
"password": "Test123!@#"
}'
# Expected:
# {
# "success": true,
# "token": "eyJhbGc...",
# "user": { ... }
# }
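A negative case is also worth checking - bad credentials should be rejected (the 401 status is assumed from the auth behaviour described in Troubleshooting):
# Test login with a wrong password (negative case)
curl -i -X POST http://localhost:3000/api/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"testuser@example.com","password":"wrong-password"}'
# Expected: 401 Unauthorized (exact error body may vary)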
3. Protected Endpoint Tests
# Get executions (requires auth)
TOKEN="<token-from-login>"
curl -X GET http://localhost:3000/api/executions \
-H "Authorization: Bearer $TOKEN"
# Expected:
# {
# "success": true,
# "data": []
# }
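And the corresponding negative case - the same request without a token should be rejected:
# Request without Authorization header (should be rejected)
curl -i -X GET http://localhost:3000/api/executions
# Expected: 401 Unauthorized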
4. Multi-Tenant Isolation Test
# Create two organizations
ORG1_TOKEN=$(curl -X POST http://localhost:3000/api/auth/signup \
-H "Content-Type: application/json" \
-d '{"email":"org1@test.com","password":"Test123!@#","name":"User 1","organizationName":"Org 1"}' \
| jq -r '.token')
ORG2_TOKEN=$(curl -X POST http://localhost:3000/api/auth/signup \
-H "Content-Type: application/json" \
-d '{"email":"org2@test.com","password":"Test123!@#","name":"User 2","organizationName":"Org 2"}' \
| jq -r '.token')
# Verify isolation
# Org 1 should see 0 executions
curl -X GET http://localhost:3000/api/executions \
-H "Authorization: Bearer $ORG1_TOKEN" \
| jq '.data | length'
# Expected: 0
# Org 2 should see 0 executions
curl -X GET http://localhost:3000/api/executions \
-H "Authorization: Bearer $ORG2_TOKEN" \
| jq '.data | length'
# Expected: 0
5. Socket.io Connection Test
# Test Socket.io connection (requires wscat or similar tool)
npm install -g wscat
# Connect to Socket.io (replace TOKEN with actual JWT)
wscat -c "ws://localhost:3000/socket.io/?EIO=4&transport=websocket" \
--header "Authorization: Bearer $TOKEN"
# Send authentication
42["auth",{"token":"YOUR_JWT_TOKEN"}]
# Expected:
# 42["auth-success",{"message":"Connected to organization channel",...}]
6. Dashboard UI Test
# Open dashboard in browser
open http://localhost:8080
# Manual checks:
# ✅ Login page loads
# ✅ Can signup new user
# ✅ Can login with credentials
# ✅ Dashboard displays after login
# ✅ No console errors in browser DevTools
# ✅ Socket.io connection successful (check Network tab → WS)
7. Integration Test Suite
# Run automated integration tests
cd tests
npm install
npm run test:integration
# Expected:
# ✅ 8/8 tests passed
🔙 Rollback Procedures
Quick Rollback (Revert to Previous Docker Images)
# 1. Stop current containers
docker-compose down
# 2. List available images
docker images | grep agnostic-automation-center
# 3. Tag previous version
docker tag agnostic-automation-center-producer:previous agnostic-automation-center-producer:latest
docker tag agnostic-automation-center-worker:previous agnostic-automation-center-worker:latest
docker tag agnostic-automation-center-dashboard:previous agnostic-automation-center-dashboard:latest
# 4. Start previous version
docker-compose up -d
# 5. Verify rollback
docker-compose ps
docker-compose logs --tail=100
Database Rollback (If Migration Failed)
# 1. Stop all services
docker-compose down
# 2. Drop current database
mongosh "mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin" \
--eval "db.dropDatabase()"
# 3. Restore from backup
mongorestore --uri="mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin" \
/backup/pre-migration-20260129-120000
# 4. Restart services with old code
git checkout <previous-commit-hash>
docker-compose up -d
Full System Rollback
# 1. Stop all containers
docker-compose down
# 2. Revert code to previous version
git log --oneline -n 10 # Find previous stable commit
git checkout <previous-stable-commit>
# 3. Restore database from backup
mongorestore --uri="mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin" \
/backup/pre-migration-20260129-120000
# 4. Rebuild and restart
docker-compose build
docker-compose up -d
# 5. Verify rollback
curl http://localhost:3000/
curl http://localhost:8080/
📊 Post-Deployment Monitoring
1. Real-Time Monitoring
# Monitor all service logs
docker-compose logs -f
# Monitor specific service
docker-compose logs -f producer
docker-compose logs -f worker
docker-compose logs -f dashboard
# Monitor resource usage
docker stats
# Expected:
# CONTAINER CPU % MEM USAGE / LIMIT NET I/O
# producer-service 2.5% 150MiB / 512MiB 1.2kB / 890B
# worker-service 1.2% 200MiB / 1GiB 450B / 320B
# dashboard-client 0.5% 50MiB / 256MiB 2.1kB / 1.5kB
2. Application Metrics
# Check MongoDB connections
docker exec -it automation-mongodb mongosh --eval "db.serverStatus().connections"
# Expected: { current: <number>, available: <number> }
# Check RabbitMQ queues
docker exec -it automation-rabbitmq rabbitmqctl list_queues
# Expected: test_queue with 0 messages (idle)
# Check Redis memory usage
docker exec -it automation-redis redis-cli INFO memory
# Expected: Memory usage statistics
3. Error Monitoring (First 24 Hours)
# Monitor for errors in producer service
docker-compose logs producer | grep -i error
# Monitor for failed authentications
docker-compose logs producer | grep -i "401\|403"
# Monitor for database errors
docker-compose logs producer | grep -i "mongo\|database"
# Set up alerts (example using cron)
# Check errors every 5 minutes for first day
*/5 * * * * cd /opt/agnostic-automation-center && docker-compose logs --since 5m producer | grep -i error && echo "Errors detected in producer service!" | mail -s "Production Alert" admin@example.com
4. Performance Monitoring
# API response time test
time curl http://localhost:3000/api/executions -H "Authorization: Bearer $TOKEN"
# Expected: < 200ms
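# Optional: sample several requests and inspect the slowest timings
# (20 samples is an arbitrary choice; uses the $TOKEN from the login test)
for i in $(seq 1 20); do
curl -s -o /dev/null -w '%{time_total}\n' http://localhost:3000/api/executions -H "Authorization: Bearer $TOKEN"
done | sort -n | tail -3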
# Database query performance
docker exec -it automation-mongodb mongosh --eval "db.executions.find({}).limit(10).explain('executionStats')"
# Check: executionTimeMillis should be < 50ms
# Socket.io connection time
# Use browser DevTools Network tab
# Expected: < 500ms for connection establishment
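# Optional: time the Engine.IO handshake over HTTP polling as a rough proxy
# (the EIO=4 handshake URL is standard for Socket.io v4; adjust if your version differs)
curl -s -o /dev/null -w 'handshake: %{time_total}s\n' "http://localhost:3000/socket.io/?EIO=4&transport=polling"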
5. Security Monitoring
# Monitor failed login attempts
docker-compose logs producer | grep -i "Invalid credentials" | wc -l
# Monitor JWT verification failures
docker-compose logs producer | grep -i "Invalid token" | wc -l
# Monitor suspicious activity
docker-compose logs producer | grep -i "403\|429"
🔧 Troubleshooting
Common Issues & Solutions
Issue 1: Producer Service Won't Start
Symptoms:
producer-service exited with code 1
Error: Failed to connect to MongoDB
Solution:
# Check MongoDB is running
docker-compose ps automation-mongodb
# Check MongoDB logs
docker-compose logs automation-mongodb
# Verify MONGODB_URL in .env
cat .env | grep MONGODB_URL
# Test MongoDB connection
docker exec -it automation-mongodb mongosh --eval "db.adminCommand('ping')"
# Restart MongoDB
docker-compose restart automation-mongodb
Issue 2: Dashboard Shows "Network Error"
Symptoms:
- Dashboard loads but shows connection errors
- Cannot login
- Console shows CORS errors
Solution:
# Check VITE_API_URL is correct
cat .env | grep VITE_API_URL
# Verify CORS configuration in producer service
docker-compose logs producer | grep -i cors
# Check producer service is accessible
curl http://localhost:3000/
# Verify ALLOWED_ORIGINS includes dashboard URL
cat .env | grep ALLOWED_ORIGINS
Issue 3: Socket.io Connection Fails
Symptoms:
WebSocket connection to 'ws://localhost:3000/socket.io/' failed
Solution:
# Check Socket.io registration in producer service
docker-compose logs producer | grep -i "socket"
# Verify JWT token is being sent
# (Check browser DevTools → Network → WS → Headers)
# Test Socket.io endpoint
curl http://localhost:3000/socket.io/
# Restart producer service
docker-compose restart producer
Issue 4: Tests Fail After Migration
Symptoms:
executions collection not found
organizationId is null
Solution:
# Verify migration completed
mongosh "mongodb://localhost:27017/automation_platform" --eval "db.organizations.countDocuments()"
# Check if executions have organizationId
mongosh "mongodb://localhost:27017/automation_platform" --eval "db.executions.findOne()"
# Re-run migration if needed
node migrations/001-add-organization-to-existing-data.ts
Issue 5: "Invalid Token" Errors
Symptoms:
401 Unauthorized - Invalid token
JWT verification failed
Solution:
# Verify JWT_SECRET matches between services
cat .env | grep JWT_SECRET
# Check JWT_SECRET is not default
if grep -q "dev-secret-CHANGE-IN-PRODUCTION" .env; then
echo "❌ ERROR: JWT_SECRET is default!"
fi
# Generate a new JWT_SECRET if needed (note: this invalidates all existing tokens)
JWT_SECRET=$(openssl rand -hex 32)
sed -i '/^JWT_SECRET=/d' .env   # remove the old entry so it is not duplicated
echo "JWT_SECRET=$JWT_SECRET" >> .env
# Restart all services
docker-compose down && docker-compose up -d
Issue 6: High CPU/Memory Usage
Symptoms:
docker stats shows >80% CPU or memory usage
Containers restarting frequently
Solution:
# Check resource limits
docker-compose config | grep -A 5 "resources:"
# Increase limits in docker-compose.yml
# services:
# producer:
# deploy:
# resources:
# limits:
# memory: 1G
# cpus: '2.0'
# Check for memory leaks
docker-compose logs producer | grep -i "out of memory"
# Restart services
docker-compose down && docker-compose up -d
Issue 7: Duplicate Executions
Symptoms:
- Same taskId appears twice in database
- One execution stuck PENDING, one completes
Solution:
# Verify organizationId is STRING in both services
docker-compose logs worker | grep -i "organizationId"
# Check worker service code
# Ensure NO ObjectId conversion:
# ✅ Correct: { taskId, organizationId }
# ❌ Wrong: { taskId, organizationId: new ObjectId(organizationId) }
# Clean up duplicates manually
mongosh "mongodb://localhost:27017/automation_platform" --eval "
db.executions.aggregate([
{ \$group: { _id: '\$taskId', count: { \$sum: 1 }, docs: { \$push: '\$_id' } } },
{ \$match: { count: { \$gt: 1 } } }
]).forEach(doc => {
db.executions.deleteMany({ _id: { \$in: doc.docs.slice(1) } });
});
"
📞 Support Contacts
Emergency Contacts
- DevOps Lead: [Contact Info]
- Backend Developer: [Contact Info]
- Frontend Developer: [Contact Info]
- Database Admin: [Contact Info]
Escalation Path
- Level 1: Check troubleshooting guide above
- Level 2: Review logs and error messages
- Level 3: Contact DevOps Lead
- Level 4: Initiate rollback procedures
- Level 5: Contact all stakeholders
📚 Additional Resources
- Architecture Documentation: docs/architecture.md
- API Documentation: docs/api-reference.md
- Security Audit: docs/SECURITY-AUDIT-PHASE-1.md
- Integration Testing: INTEGRATION-TESTING-COMPLETE.md
- Bug Fixes: MANUAL-TESTING-FIXES-ROUND-2.md
✅ Deployment Sign-Off
Staging Deployment:
- Date: __________
- Deployed By: __________
- Commit Hash: __________
- Smoke Tests Passed: [ ] Yes [ ] No
- Approved By: __________
Production Deployment:
- Date: __________
- Deployed By: __________
- Commit Hash: __________
- Smoke Tests Passed: [ ] Yes [ ] No
- Approved By: __________
- Monitoring Active: [ ] Yes [ ] No
- Rollback Plan Ready: [ ] Yes [ ] No
Document Version: 1.0.0 | Last Updated: January 29, 2026 | Next Review: After Phase 2 implementation