
🚀 Deployment Guide - Phase 1 Multi-Tenant System

Version: 1.0.0 Date: January 29, 2026 Phase: 1 (Multi-Tenant Foundation) Status: PRODUCTION READY


📋 Table of Contents

  1. Pre-Deployment Checklist
  2. Environment Setup
  3. Database Migration
  4. Deployment Steps
  5. Verification & Smoke Tests
  6. Rollback Procedures
  7. Post-Deployment Monitoring
  8. Troubleshooting

✅ Pre-Deployment Checklist

Critical Requirements

1. Environment Variables (CRITICAL)

  • JWT_SECRET - Generate cryptographically secure 64-character random string
    # Generate JWT_SECRET
    openssl rand -hex 32
    # OR
    node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
  • JWT_EXPIRY - Set token expiration (default: 24h, recommended: 8h for production)
  • PASSWORD_SALT_ROUNDS - Set bcrypt salt rounds (default: 10, min: 10)
  • MONGODB_URL - Production MongoDB connection string with authentication
  • RABBITMQ_URL - Production RabbitMQ connection string
  • REDIS_URL - Production Redis connection string
  • GEMINI_API_KEY - Google Gemini API key for AI analysis
  • ALLOWED_ORIGINS - Comma-separated list of allowed CORS origins
  • PUBLIC_API_URL - Public-facing API URL (for report links)

2. Security Configuration

  • SSL/TLS Certificates - Ensure HTTPS is configured for API and Dashboard
  • CORS Origins - Restrict to production domains only (no * or true)
  • Firewall Rules - Only expose necessary ports (80, 443)
  • Database Security - Enable authentication, restrict network access
  • Secret Rotation - Plan for periodic secret rotation (quarterly)
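
A quick pre-flight spot-check of the CORS and firewall items above (a minimal sketch; assumes the production .env is in the current directory and that only ports 80/443 should be published):

# Fail if CORS is left wide open (the list should name explicit domains, never * or true).
if grep -E "^ALLOWED_ORIGINS=" .env | grep -qE '\*|true'; then
  echo "❌ ALLOWED_ORIGINS must list explicit production domains"
fi

# Review which container ports are actually published on the host.
docker ps --format '{{.Names}}\t{{.Ports}}'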

3. Infrastructure Readiness

  • Docker Engine - Version 20.10+ installed on all nodes
  • Docker Compose - Version 2.0+ installed
  • MongoDB - Version 6.0+ with replica set configured
  • RabbitMQ - Version 3.12+ with management plugin enabled
  • Redis - Version 7.0+ with persistence enabled
  • Persistent Storage - Volumes configured for reports, database, logs
  • Resource Limits - CPU/Memory limits configured for containers
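
The engine and Compose minimums above can be confirmed directly on each node before deploying (a minimal sketch; the datastore versions are easier to confirm once the stack is running, see the verification section):

# Confirm engine and Compose versions meet the minimums listed above.
docker --version            # expect 20.10 or newer
docker-compose version      # expect 2.0 or newer

# Confirm there is room for images, volumes, and logs (what counts as "enough" is your call).
df -h /var/lib/docker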

4. Code & Dependencies

  • Git Repository - All changes committed and pushed
  • Build Success - All Docker images build without errors
  • Tests Passing - Integration tests pass (8/8)
  • Dependencies Audited - No critical vulnerabilities in npm packages
  • Documentation Updated - README, .env.example, API docs current
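
The dependency audit item above can be gated mechanically (a sketch; run it wherever the service package.json files live):

# Exits non-zero when critical advisories exist, so this can gate a CI step or a deploy script.
npm audit --audit-level=critical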

5. Backup & Recovery

  • Database Backup - Pre-deployment snapshot created
  • Configuration Backup - Current .env and docker-compose.yml saved
  • Rollback Plan - Documented and tested
  • Recovery Time Objective (RTO) - Defined (recommended: < 1 hour)
  • Recovery Point Objective (RPO) - Defined (recommended: < 5 minutes)
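
Note that a plain scheduled mongodump only bounds RPO to the dump interval; an RPO under 5 minutes generally requires oplog-based or managed point-in-time backups. A baseline crontab sketch (assumes mongodump is installed on the host and /backup exists; the hourly cadence is an assumption):

# Hourly gzip'd dump as a floor; tighten the cadence or add oplog tailing to approach the RPO target.
0 * * * * mongodump --uri="mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin" --gzip --out=/backup/hourly-$(date +\%Y\%m\%d-\%H\%M)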

🔧 Environment Setup

1. Create Production Environment File

Location: Server - /opt/agnostic-automation-center/.env

# 1. SSH into production server
ssh user@production-server

# 2. Create directory
sudo mkdir -p /opt/agnostic-automation-center
cd /opt/agnostic-automation-center

# 3. Copy .env.example
sudo cp .env.example .env

# 4. Edit with production values
sudo nano .env

Production .env Template:

# ================================
# PRODUCTION ENVIRONMENT VARIABLES
# ================================

# JWT Authentication (CRITICAL - CHANGE THESE!)
JWT_SECRET=<64-character-random-hex-string> # Generate: openssl rand -hex 32
JWT_EXPIRY=8h
PASSWORD_SALT_ROUNDS=12

# Database (MongoDB)
MONGODB_URL=mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin

# Message Queue (RabbitMQ)
RABBITMQ_URL=amqp://admin:<password>@rabbitmq-prod:5672

# Cache (Redis)
REDIS_URL=redis://:<password>@redis-prod:6379

# AI Analysis (Google Gemini)
GEMINI_API_KEY=<your-gemini-api-key>

# API Configuration
PUBLIC_API_URL=https://api.yourdomain.com
PRODUCER_URL=http://producer:3000

# CORS Configuration
ALLOWED_ORIGINS=https://app.yourdomain.com,https://dashboard.yourdomain.com

# Dashboard Configuration
VITE_API_URL=https://api.yourdomain.com
DASHBOARD_URL=https://app.yourdomain.com

# Reports & Storage
REPORTS_DIR=/app/reports

# Docker & Runtime
NODE_ENV=production
RUNNING_IN_DOCKER=true

# Optional: Inject Environment Variables to Test Containers
INJECT_ENV_VARS=API_USER,API_PASSWORD,SECRET_KEY

# Logging (Optional)
LOG_LEVEL=info
LOG_AUTH=false # Set to true for debugging auth issues

2. Generate Secrets

# Generate JWT_SECRET (64 characters)
JWT_SECRET=$(openssl rand -hex 32)
echo "JWT_SECRET=$JWT_SECRET"

# Generate MongoDB password
MONGO_PASSWORD=$(openssl rand -base64 32)
echo "MONGO_PASSWORD=$MONGO_PASSWORD"

# Generate RabbitMQ password
RABBIT_PASSWORD=$(openssl rand -base64 32)
echo "RABBIT_PASSWORD=$RABBIT_PASSWORD"

# Generate Redis password
REDIS_PASSWORD=$(openssl rand -base64 32)
echo "REDIS_PASSWORD=$REDIS_PASSWORD"

3. Verify Environment Variables

# Check all critical variables are set
cat .env | grep -E "JWT_SECRET|MONGODB_URL|RABBITMQ_URL|REDIS_URL|GEMINI_API_KEY"

# Verify JWT_SECRET is not default
if grep -q "dev-secret-CHANGE-IN-PRODUCTION" .env; then
echo "❌ ERROR: JWT_SECRET is still using default value!"
else
echo "✅ JWT_SECRET is set correctly"
fi
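
A more exhaustive presence check can loop over every required key instead of grepping a few (a sketch; it only confirms each key has a non-empty value, not that the value is correct):

# Verify every required key is present and non-empty in .env.
for var in JWT_SECRET JWT_EXPIRY MONGODB_URL RABBITMQ_URL REDIS_URL GEMINI_API_KEY ALLOWED_ORIGINS PUBLIC_API_URL; do
  if grep -qE "^${var}=.+" .env; then
    echo "✅ $var is set"
  else
    echo "❌ $var is missing or empty"
  fi
done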

4. Database Connection (MONGO_URI)

[!IMPORTANT] The docker-compose.yml uses ${MONGO_URI} from your .env file to determine the database connection. This is no longer hardcoded.

Configure based on your environment:

  • Cloud: MONGO_URI=mongodb+srv://user:pass@cluster.mongodb.net/automation_platform (Production/Staging with MongoDB Atlas)
  • Local: MONGO_URI=mongodb://automation-mongodb:27017/automation_platform (Local development with the Docker container)

Setting Cloud Connection:

# In .env file
MONGO_URI=mongodb+srv://your-user:your-password@cluster.mongodb.net/automation_platform?retryWrites=true&w=majority

Setting Local Connection:

# In .env file (uses Docker MongoDB container)
MONGO_URI=mongodb://automation-mongodb:27017/automation_platform

[!WARNING] When switching between Cloud and Local:

  • Each database has separate user accounts
  • You must create a new account in the new environment
  • Existing JWT tokens will be invalid

📊 Database Migration

Pre-Migration Steps

# 1. Backup existing database (CRITICAL)
mongodump --uri="mongodb://localhost:27017/automation_platform" --out=/backup/pre-migration-$(date +%Y%m%d-%H%M%S)

# 2. Verify backup
ls -lh /backup/

# 3. Test backup restoration (on staging first!)
mongorestore --uri="mongodb://staging-server:27017/automation_platform_test" /backup/pre-migration-20260129-120000

Run Migration

# 1. Clone repository
cd /opt/agnostic-automation-center
git clone https://github.com/yourusername/agnostic-automation-center.git .

# 2. Install dependencies (for migration script)
npm install

# 3. Run migration script
export MONGODB_URI="mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin"
node migrations/001-add-organization-to-existing-data.ts

# Expected output:
# ✅ Migration started
# ✅ Created default organization
# ✅ Created default admin user
# ✅ Migrated 29 executions
# ✅ Created 15 indexes
# ✅ Migration completed successfully

Post-Migration Verification

# 1. Connect to MongoDB
mongosh "mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin"

# 2. Verify collections and data
use automation_platform

# Check organizations
db.organizations.countDocuments() // Should be >= 1

# Check users
db.users.countDocuments() // Should be >= 1

# Check executions have organizationId
db.executions.findOne() // Should have organizationId field

# Check indexes
db.executions.getIndexes()
db.users.getIndexes()
db.organizations.getIndexes()

# Exit
exit

🚀 Deployment Steps

Deployment Flow

1. Pull latest code
2. Build Docker images
3. Stop old containers
4. Start new containers
5. Verify services
6. Run smoke tests
7. Monitor logs

Staging Deployment

# 1. SSH to staging server
ssh user@staging-server
cd /opt/agnostic-automation-center

# 2. Pull latest code
git fetch origin
git checkout main
git pull origin main

# 3. Build images
docker-compose -f docker-compose.yml build

# 4. Stop existing containers
docker-compose down

# 5. Start new containers
docker-compose up -d

# 6. Verify all services started
docker-compose ps

# Expected output:
# NAME STATUS
# producer-service Up 30 seconds
# worker-service Up 30 seconds
# dashboard-client Up 30 seconds
# automation-mongodb Up 30 seconds
# automation-rabbitmq Up 30 seconds
# automation-redis Up 30 seconds

# 7. Check logs for errors
docker-compose logs --tail=50 producer
docker-compose logs --tail=50 worker
docker-compose logs --tail=50 dashboard

# 8. Run smoke tests (see Verification section below)

Production Deployment

IMPORTANT: Only deploy to production after successful staging deployment and smoke tests.

# 1. SSH to production server
ssh user@production-server
cd /opt/agnostic-automation-center

# 2. Pull latest code
git fetch origin
git checkout main
git pull origin main

# 3. Verify commit hash matches staging
STAGING_COMMIT="<staging-commit-hash>"
CURRENT_COMMIT=$(git rev-parse HEAD)

if [ "$STAGING_COMMIT" == "$CURRENT_COMMIT" ]; then
echo "✅ Commit matches staging"
else
echo "❌ WARNING: Commit mismatch! Staging: $STAGING_COMMIT, Production: $CURRENT_COMMIT"
exit 1
fi

# 4. Build images (production optimized)
docker-compose -f docker-compose.prod.yml build --no-cache

# 5. Create backup of current deployment
docker-compose ps > /backup/deployment-$(date +%Y%m%d-%H%M%S).txt
docker images > /backup/images-$(date +%Y%m%d-%H%M%S).txt

# 6. Rolling deployment (zero downtime)
# Option A: Blue-Green deployment
# Start new containers on different ports, switch load balancer, stop old containers

# Option B: Standard deployment (brief downtime)
docker-compose -f docker-compose.prod.yml down
docker-compose -f docker-compose.prod.yml up -d

# 7. Verify deployment
docker-compose -f docker-compose.prod.yml ps
docker-compose -f docker-compose.prod.yml logs --tail=100

# 8. Health check
curl -f https://api.yourdomain.com/ || echo "❌ API health check failed"
curl -f https://app.yourdomain.com/ || echo "❌ Dashboard health check failed"
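
Option A above is only named in a comment; one way to realize it with Compose project names is sketched below. The PRODUCER_PORT variable and the external reverse proxy are assumptions about your setup, not something this repository defines:

# Bring up a second ("green") stack alongside the running one, verify it, switch traffic, retire "blue".
PRODUCER_PORT=3001 docker-compose -f docker-compose.prod.yml -p aac-green up -d
curl -f http://localhost:3001/ || { echo "❌ green stack unhealthy, keeping blue"; exit 1; }
# Repoint the reverse proxy upstream from the blue port to 3001, then retire the old stack:
docker-compose -f docker-compose.prod.yml -p aac-blue down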

Post-Deployment Cleanup

# Remove unused images
docker image prune -a -f

# Remove unused volumes (CAREFUL - only if absolutely sure)
# docker volume prune -f

# Check disk usage
docker system df

✅ Verification & Smoke Tests

1. Service Health Checks

# Producer Service
curl -f http://localhost:3000/
# Expected: {"message":"Agnostic Producer Service is running!"}

# Dashboard Client
curl -f http://localhost:8080/
# Expected: HTML response

# MongoDB
docker exec -it automation-mongodb mongosh --eval "db.adminCommand('ping')"
# Expected: { ok: 1 }

# RabbitMQ
docker exec -it automation-rabbitmq rabbitmqctl status
# Expected: Status information

# Redis
docker exec -it automation-redis redis-cli ping
# Expected: PONG
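
The individual checks above can be wrapped into a single script that exits non-zero on the first failure, which is convenient for CI or a deploy pipeline (a minimal sketch):

# Aggregate health check: any failing service aborts the script with a non-zero exit code.
set -e
curl -fsS http://localhost:3000/ > /dev/null
curl -fsS http://localhost:8080/ > /dev/null
docker exec automation-mongodb mongosh --quiet --eval "db.adminCommand('ping').ok" | grep -q 1
docker exec automation-rabbitmq rabbitmqctl status > /dev/null
docker exec automation-redis redis-cli ping | grep -q PONG
echo "✅ All core services healthy"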

2. Authentication Tests

# Test signup endpoint
curl -X POST http://localhost:3000/api/auth/signup \
-H "Content-Type: application/json" \
-d '{
"email": "testuser@example.com",
"password": "Test123!@#",
"name": "Test User",
"organizationName": "Test Organization"
}'

# Expected:
# {
# "success": true,
# "token": "eyJhbGc...",
# "user": { ... }
# }

# Test login endpoint
curl -X POST http://localhost:3000/api/auth/login \
-H "Content-Type: application/json" \
-d '{
"email": "testuser@example.com",
"password": "Test123!@#"
}'

# Expected:
# {
# "success": true,
# "token": "eyJhbGc...",
# "user": { ... }
# }

3. Protected Endpoint Tests

# Get executions (requires auth)
TOKEN="<token-from-login>"

curl -X GET http://localhost:3000/api/executions \
-H "Authorization: Bearer $TOKEN"

# Expected:
# {
# "success": true,
# "data": []
# }

4. Multi-Tenant Isolation Test

# Create two organizations
ORG1_TOKEN=$(curl -X POST http://localhost:3000/api/auth/signup \
-H "Content-Type: application/json" \
-d '{"email":"org1@test.com","password":"Test123!@#","name":"User 1","organizationName":"Org 1"}' \
| jq -r '.token')

ORG2_TOKEN=$(curl -X POST http://localhost:3000/api/auth/signup \
-H "Content-Type: application/json" \
-d '{"email":"org2@test.com","password":"Test123!@#","name":"User 2","organizationName":"Org 2"}' \
| jq -r '.token')

# Verify isolation
# Org 1 should see 0 executions
curl -X GET http://localhost:3000/api/executions \
-H "Authorization: Bearer $ORG1_TOKEN" \
| jq '.data | length'
# Expected: 0

# Org 2 should see 0 executions
curl -X GET http://localhost:3000/api/executions \
-H "Authorization: Bearer $ORG2_TOKEN" \
| jq '.data | length'
# Expected: 0
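
Beyond checking that new organizations start empty, confirm that one tenant's token cannot read another tenant's records. The execution-detail route below is an assumption about the API shape; adjust the path to whatever detail endpoint exists:

# Cross-tenant denial check: Org 2's token must not be able to read an Org 1 execution.
EXEC_ID="<execution-id-created-under-org-1>"
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $ORG2_TOKEN" \
  http://localhost:3000/api/executions/$EXEC_ID
# Expected: 403 or 404 (never 200)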

5. Socket.io Connection Test

# Test Socket.io connection (requires wscat or similar tool)
npm install -g wscat

# Connect to Socket.io (replace TOKEN with actual JWT)
wscat -c "ws://localhost:3000/socket.io/?EIO=4&transport=websocket" \
--header "Authorization: Bearer $TOKEN"

# Send authentication
42["auth",{"token":"YOUR_JWT_TOKEN"}]

# Expected:
# 42["auth-success",{"message":"Connected to organization channel",...}]

6. Dashboard UI Test

# Open dashboard in browser
open http://localhost:8080

# Manual checks:
# ✅ Login page loads
# ✅ Can signup new user
# ✅ Can login with credentials
# ✅ Dashboard displays after login
# ✅ No console errors in browser DevTools
# ✅ Socket.io connection successful (check Network tab → WS)

7. Integration Test Suite

# Run automated integration tests
cd tests
npm install
npm run test:integration

# Expected:
# ✅ 8/8 tests passed

🔙 Rollback Procedures

Quick Rollback (Revert to Previous Docker Images)

# 1. Stop current containers
docker-compose down

# 2. List available images
docker images | grep agnostic-automation-center

# 3. Tag previous version
docker tag agnostic-automation-center-producer:previous agnostic-automation-center-producer:latest
docker tag agnostic-automation-center-worker:previous agnostic-automation-center-worker:latest
docker tag agnostic-automation-center-dashboard:previous agnostic-automation-center-dashboard:latest

# 4. Start previous version
docker-compose up -d

# 5. Verify rollback
docker-compose ps
docker-compose logs --tail=100
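
Step 3 assumes a :previous tag already exists. Creating it as part of every deployment, before new images are built, keeps this rollback path available (a minimal sketch):

# Run before "docker-compose build" so the currently running images can be rolled back to.
for svc in producer worker dashboard; do
  docker tag agnostic-automation-center-$svc:latest agnostic-automation-center-$svc:previous
done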

Database Rollback (If Migration Failed)

# 1. Stop all services
docker-compose down

# 2. Drop current database
mongosh "mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin" \
--eval "db.dropDatabase()"

# 3. Restore from backup
mongorestore --uri="mongodb://admin:<password>@mongodb-prod:27017/automation_platform?authSource=admin" \
/backup/pre-migration-20260129-120000

# 4. Restart services with old code
git checkout <previous-commit-hash>
docker-compose up -d

Full System Rollback

# 1. Stop all containers
docker-compose down

# 2. Revert code to previous version
git log --oneline -n 10 # Find previous stable commit
git checkout <previous-stable-commit>

# 3. Restore database from backup
mongorestore --uri="mongodb://admin:<password>@mongodb-prod:27017/automation_platform" \
/backup/pre-migration-20260129-120000

# 4. Rebuild and restart
docker-compose build
docker-compose up -d

# 5. Verify rollback
curl http://localhost:3000/
curl http://localhost:8080/

📊 Post-Deployment Monitoring

1. Real-Time Monitoring

# Monitor all service logs
docker-compose logs -f

# Monitor specific service
docker-compose logs -f producer
docker-compose logs -f worker
docker-compose logs -f dashboard

# Monitor resource usage
docker stats

# Expected:
# CONTAINER CPU % MEM USAGE / LIMIT NET I/O
# producer-service 2.5% 150MiB / 512MiB 1.2kB / 890B
# worker-service 1.2% 200MiB / 1GiB 450B / 320B
# dashboard-client 0.5% 50MiB / 256MiB 2.1kB / 1.5kB

2. Application Metrics

# Check MongoDB connections
docker exec -it automation-mongodb mongosh --eval "db.serverStatus().connections"
# Expected: { current: <number>, available: <number> }

# Check RabbitMQ queues
docker exec -it automation-rabbitmq rabbitmqctl list_queues
# Expected: test_queue with 0 messages (idle)

# Check Redis memory usage
docker exec -it automation-redis redis-cli INFO memory
# Expected: Memory usage statistics

3. Error Monitoring (First 24 Hours)

# Monitor for errors in producer service
docker-compose logs producer | grep -i error

# Monitor for failed authentications
docker-compose logs producer | grep -i "401\|403"

# Monitor for database errors
docker-compose logs producer | grep -i "mongo\|database"

# Set up alerts (example using cron)
# Check errors every 5 minutes for first day
*/5 * * * * docker-compose logs --since 5m producer | grep -i error && echo "Errors detected in producer service!" | mail -s "Production Alert" admin@example.com

4. Performance Monitoring

# API response time test
time curl http://localhost:3000/api/executions -H "Authorization: Bearer $TOKEN"
# Expected: < 200ms

# Database query performance
docker exec -it automation-mongodb mongosh --eval "db.executions.find({}).limit(10).explain('executionStats')"
# Check: executionTimeMillis should be < 50ms

# Socket.io connection time
# Use browser DevTools Network tab
# Expected: < 500ms for connection establishment
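
For a slightly better picture than a single timed request, a short loop gives a rough latency distribution (a sketch; the 20-request sample size and the endpoint choice are arbitrary):

# Sample 20 sequential requests and print each total time; compare the spread against the 200ms target.
for i in $(seq 1 20); do
  curl -s -o /dev/null -w "%{time_total}s\n" \
    -H "Authorization: Bearer $TOKEN" \
    http://localhost:3000/api/executions
done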

5. Security Monitoring

# Monitor failed login attempts
docker-compose logs producer | grep -i "Invalid credentials" | wc -l

# Monitor JWT verification failures
docker-compose logs producer | grep -i "Invalid token" | wc -l

# Monitor suspicious activity
docker-compose logs producer | grep -i "403\|429"

🔧 Troubleshooting

Common Issues & Solutions

Issue 1: Producer Service Won't Start

Symptoms:

producer-service exited with code 1
Error: Failed to connect to MongoDB

Solution:

# Check MongoDB is running
docker-compose ps automation-mongodb

# Check MongoDB logs
docker-compose logs automation-mongodb

# Verify MONGODB_URL in .env
cat .env | grep MONGODB_URL

# Test MongoDB connection
docker exec -it automation-mongodb mongosh --eval "db.adminCommand('ping')"

# Restart MongoDB
docker-compose restart automation-mongodb

Issue 2: Dashboard Shows "Network Error"

Symptoms:

  • Dashboard loads but shows connection errors
  • Cannot login
  • Console shows CORS errors

Solution:

# Check VITE_API_URL is correct
cat .env | grep VITE_API_URL

# Verify CORS configuration in producer service
docker-compose logs producer | grep -i cors

# Check producer service is accessible
curl http://localhost:3000/

# Verify ALLOWED_ORIGINS includes dashboard URL
cat .env | grep ALLOWED_ORIGINS

Issue 3: Socket.io Connection Fails

Symptoms:

WebSocket connection to 'ws://localhost:3000/socket.io/' failed

Solution:

# Check Socket.io registration in producer service
docker-compose logs producer | grep -i "socket"

# Verify JWT token is being sent
# (Check browser DevTools → Network → WS → Headers)

# Test Socket.io endpoint
curl http://localhost:3000/socket.io/

# Restart producer service
docker-compose restart producer

Issue 4: Tests Fail After Migration

Symptoms:

executions collection not found
organizationId is null

Solution:

# Verify migration completed
mongosh "mongodb://localhost:27017/automation_platform" --eval "db.organizations.countDocuments()"

# Check if executions have organizationId
mongosh "mongodb://localhost:27017/automation_platform" --eval "db.executions.findOne()"

# Re-run migration if needed
node migrations/001-add-organization-to-existing-data.ts

Issue 5: "Invalid Token" Errors

Symptoms:

401 Unauthorized - Invalid token
JWT verification failed

Solution:

# Verify JWT_SECRET matches between services
cat .env | grep JWT_SECRET

# Check JWT_SECRET is not default
if grep -q "dev-secret-CHANGE-IN-PRODUCTION" .env; then
echo "❌ ERROR: JWT_SECRET is default!"
fi

# Generate new JWT_SECRET if needed
JWT_SECRET=$(openssl rand -hex 32)
echo "JWT_SECRET=$JWT_SECRET" >> .env

# Restart all services
docker-compose down && docker-compose up -d

Issue 6: High CPU/Memory Usage

Symptoms:

docker stats shows >80% CPU or memory usage
Containers restarting frequently

Solution:

# Check resource limits
docker-compose config | grep -A 5 "resources:"

# Increase limits in docker-compose.yml
#   services:
#     producer:
#       deploy:
#         resources:
#           limits:
#             memory: 1G
#             cpus: '2.0'

# Check for memory leaks
docker-compose logs producer | grep -i "out of memory"

# Restart services
docker-compose down && docker-compose up -d

Issue 7: Duplicate Executions

Symptoms:

  • Same taskId appears twice in database
  • One execution stuck PENDING, one completes

Solution:

# Verify organizationId is STRING in both services
docker-compose logs worker | grep -i "organizationId"

# Check worker service code
# Ensure NO ObjectId conversion:
# ✅ Correct: { taskId, organizationId }
# ❌ Wrong: { taskId, organizationId: new ObjectId(organizationId) }

# Clean up duplicates manually
mongosh "mongodb://localhost:27017/automation_platform" --eval "
db.executions.aggregate([
{ \$group: { _id: '\$taskId', count: { \$sum: 1 }, docs: { \$push: '\$_id' } } },
{ \$match: { count: { \$gt: 1 } } }
]).forEach(doc => {
db.executions.deleteMany({ _id: { \$in: doc.docs.slice(1) } });
});
"

📞 Support Contacts

Emergency Contacts

  • DevOps Lead: [Contact Info]
  • Backend Developer: [Contact Info]
  • Frontend Developer: [Contact Info]
  • Database Admin: [Contact Info]

Escalation Path

  1. Level 1: Check troubleshooting guide above
  2. Level 2: Review logs and error messages
  3. Level 3: Contact DevOps Lead
  4. Level 4: Initiate rollback procedures
  5. Level 5: Contact all stakeholders

📚 Additional Resources

  • Architecture Documentation: docs/architecture.md
  • API Documentation: docs/api-reference.md
  • Security Audit: docs/SECURITY-AUDIT-PHASE-1.md
  • Integration Testing: INTEGRATION-TESTING-COMPLETE.md
  • Bug Fixes: MANUAL-TESTING-FIXES-ROUND-2.md

✅ Deployment Sign-Off

Staging Deployment:

  • Date: __________
  • Deployed By: __________
  • Commit Hash: __________
  • Smoke Tests Passed: [ ] Yes [ ] No
  • Approved By: __________

Production Deployment:

  • Date: __________
  • Deployed By: __________
  • Commit Hash: __________
  • Smoke Tests Passed: [ ] Yes [ ] No
  • Approved By: __________
  • Monitoring Active: [ ] Yes [ ] No
  • Rollback Plan Ready: [ ] Yes [ ] No

Document Version: 1.0.0 Last Updated: January 29, 2026 Next Review: After Phase 2 implementation