Deploying a NestJS backend to production with true zero downtime is more involved than most guides suggest. This is a complete, battle-tested walkthrough — built from real production experience — covering everything from Droplet setup to automated GitLab pipelines that deploy without dropping a single request.
Stack:
- Server: DigitalOcean Droplet (Ubuntu 24.04)
- Runtime: Node.js v18
- Process Manager: PM2 (cluster mode)
- Reverse Proxy: Nginx
- CI/CD: GitLab CI/CD
- Deploy user: deployer (non-root, for security)
Part 1 — Droplet Initial Setup
SSH into your Droplet as root and run the following steps once.
Install Node.js v18
⚠️ Warning
Do NOT use nodesource setup scripts: they may install v20 regardless of the version you specify. Install directly from the official Node.js binary to guarantee the version.
# Remove any existing Node.js installation
apt-get purge -y nodejs npm
apt-get autoremove -y
apt-get autoclean
rm -f /etc/apt/sources.list.d/nodesource.list
rm -f /etc/apt/sources.list.d/node*.list
rm -f /etc/apt/keyrings/nodesource.gpg
rm -f /usr/bin/node /usr/bin/nodejs /usr/bin/npm /usr/bin/npx
apt-get update
# Download and install Node.js v18 directly from nodejs.org
cd /tmp
curl -fsSL https://nodejs.org/dist/v18.20.8/node-v18.20.8-linux-x64.tar.xz -o node18.tar.xz
tar -xJf node18.tar.xz
cp -r node-v18.20.8-linux-x64/bin/* /usr/local/bin/
cp -r node-v18.20.8-linux-x64/lib/* /usr/local/lib/
cp -r node-v18.20.8-linux-x64/include/* /usr/local/include/
# Verify
node -v # v18.20.8
npm -v
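Because the tarball is downloaded over the network and unpacked as root, it may be worth checking it against the SHASUMS256.txt file that nodejs.org publishes alongside each release, ideally before the tar step. A minimal sketch, assuming GNU sha256sum and awk are available; verify_sha256 is a hypothetical helper name, not part of any tool:

```shell
# verify_sha256 FILE EXPECTED_HEX: succeed only if EXPECTED_HEX is non-empty
# and equals the SHA-256 digest of FILE.
verify_sha256() {
  actual=$(sha256sum "$1" 2>/dev/null | awk '{print $1}')
  [ -n "$2" ] && [ "$actual" = "$2" ]
}

# Possible usage from /tmp, right after the curl download (not run here):
#   expected=$(curl -fsSL https://nodejs.org/dist/v18.20.8/SHASUMS256.txt \
#     | awk '$2 == "node-v18.20.8-linux-x64.tar.xz" {print $1}')
#   verify_sha256 node18.tar.xz "$expected" || { echo "checksum mismatch"; exit 1; }
```

The empty-string guard matters: if the checksum lookup fails and the file is missing, both values would otherwise be empty and compare as equal.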
Install PM2 and Nginx
npm install -g pm2
apt-get install -y nginx
Create a non-root deploy user
Never deploy as root. Create a dedicated deployer user:
useradd -m -s /bin/bash deployer
# Create app directory
mkdir -p /var/www/nestapp
chown -R deployer:deployer /var/www/nestapp
# Create PM2 log directory
mkdir -p /var/log/pm2
chown -R deployer:deployer /var/log/pm2
Set up SSH for the deployer user
mkdir -p /home/deployer/.ssh
chmod 700 /home/deployer/.ssh
touch /home/deployer/.ssh/authorized_keys
chmod 600 /home/deployer/.ssh/authorized_keys
chown -R deployer:deployer /home/deployer/.ssh
Configure the firewall
ufw allow OpenSSH
ufw allow 'Nginx Full'
ufw --force enable
Part 2 — Nginx Configuration
Create the Nginx site config
nano /etc/nginx/sites-available/nestapp
Paste the following, replacing your-domain.com with your actual domain or Droplet IP:
upstream nestapp {
    server 127.0.0.1:3000;
    keepalive 64;
}
server {
    listen 80;
    listen [::]:80;
    server_name your-domain.com;
    location / {
        proxy_pass http://nestapp;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_cache_bypass $http_upgrade;
        proxy_read_timeout 240s;
        proxy_connect_timeout 10s;
    }
    location /health {
        proxy_pass http://nestapp;
        access_log off;
    }
}
Enable the site
# Symlink to enable
ln -s /etc/nginx/sites-available/nestapp /etc/nginx/sites-enabled/nestapp
# Remove default site
rm /etc/nginx/sites-enabled/default
# Fix file permissions (important — Nginx can't read files owned only by root)
chmod 644 /etc/nginx/sites-available/nestapp
# Test config
nginx -t
# Reload Nginx
systemctl reload nginx
SSL with Certbot
Once your domain DNS is pointing to the Droplet:
apt install certbot python3-certbot-nginx -y
certbot --nginx -d your-domain.com
ℹ️ Note
Make sure dig your-domain.com +short returns your Droplet IP before running certbot, otherwise the ACME challenge will fail.
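The DNS check can be scripted so certbot only runs once the record resolves to the Droplet. A small sketch, assuming dig is installed; ip_in_list is a hypothetical helper, and 203.0.113.10 is a placeholder address standing in for your Droplet IP:

```shell
# ip_in_list EXPECTED_IP RESOLVED: succeed if EXPECTED_IP appears as a whole
# line in RESOLVED (dig +short may print several lines, e.g. a CNAME first).
ip_in_list() {
  [ -n "$1" ] && printf '%s\n' "$2" | grep -Fxq -- "$1"
}

# Possible usage on the Droplet (not run here):
#   if ip_in_list "203.0.113.10" "$(dig your-domain.com +short)"; then
#     certbot --nginx -d your-domain.com
#   else
#     echo "DNS does not point at the Droplet yet"
#   fi
```

The match is literal and whole-line, so 203.0.113.1 does not accidentally match 203.0.113.10.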
Part 3 — Project File Setup
ecosystem.config.js
Add this to your project root and commit it to GitLab:
module.exports = {
  apps: [
    {
      name: 'nestapp',
      script: 'dist/main.js',
      instances: 'max',
      exec_mode: 'cluster',
      node_args: '--security-revert=CVE-2023-46809', // remove if not needed
      autorestart: true,
      watch: false,
      max_memory_restart: '512M',
      env_production: {
        NODE_ENV: 'production',
        PORT: 3000,
      },
      out_file: '/var/log/pm2/nestapp-out.log',
      error_file: '/var/log/pm2/nestapp-error.log',
      log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
      merge_logs: true,
      kill_timeout: 5000,
      listen_timeout: 8000,
      wait_ready: true,
    },
  ],
};
Update src/main.ts
Signal PM2 when each worker is ready — this is what makes zero-downtime reload actually work:
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule, { bufferLogs: true });
  await app.listen(process.env.PORT || 3000);
  // Tell PM2 this worker is ready to receive traffic
  if (process.send) {
    process.send('ready');
  }
}
bootstrap();
Update package.json — disable Husky in production
Without this, npm ci on the server will fail because Husky tries to install git hooks:
"scripts": {
"prepare": "husky install || true"
}
.gitlab-ci.yml
Add this to your project root:
image: node:18-alpine
stages:
  - install
  - lint
  - test
  - build
  - deploy
cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/
  policy: pull-push
workflow:
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
    - if: $CI_MERGE_REQUEST_IID
variables:
  NODE_ENV: test
install:
  stage: install
  script:
    - npm ci --prefer-offline
  artifacts:
    paths:
      - node_modules/
    expire_in: 1 hour
lint:
  stage: lint
  needs: [install]
  script:
    - npm run lint
test:
  stage: test
  needs: [install]
  script:
    - npm run test -- --passWithNoTests
build:
  stage: build
  needs: [lint, test]
  script:
    - npm run build
    - mkdir -p release
    - cp -r dist package.json package-lock.json ecosystem.config.js release/
  artifacts:
    paths:
      - release/
    expire_in: 1 hour
deploy:
  stage: deploy
  image: alpine:latest
  needs: [build]
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  before_script:
    - apk add --no-cache openssh-client rsync bash curl
    - eval $(ssh-agent -s)
    # Strip Windows carriage returns from the key before loading it
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
    - ssh-keyscan -H "$DROPLET_HOST" >> ~/.ssh/known_hosts 2>/dev/null
  script:
    # 1. Sync build artifacts; .env and node_modules on the server are preserved
    - |
      rsync -avz --delete \
        --exclude='.env' \
        --exclude='node_modules' \
        release/ \
        "$DROPLET_USER@$DROPLET_HOST:/var/www/nestapp/"
    # 2. SSH in, install deps, reload with zero downtime
    - |
      ssh "$DROPLET_USER@$DROPLET_HOST" << ENDSSH
      set -e
      cd /var/www/nestapp
      echo "==> Installing production dependencies..."
      HUSKY=0 npm ci --omit=dev --prefer-offline
      echo "==> Reloading app (zero downtime)..."
      pm2 reload ecosystem.config.js --env production --update-env ||
      pm2 start ecosystem.config.js --env production
      echo "==> Saving PM2 process list..."
      pm2 save
      pm2 list
      ENDSSH
    # 3. Health check
    - |
      echo "==> Running health check..."
      sleep 10
      STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://$DROPLET_HOST/ || echo "000")
      if [ "$STATUS" = "200" ] || [ "$STATUS" = "201" ]; then
        echo "Deploy successful! (HTTP $STATUS)"
      else
        echo "Health check failed (HTTP $STATUS); check pm2 logs"
        exit 1
      fi
  environment:
    name: production
    url: http://$DROPLET_HOST
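The fixed sleep 10 in the health check step can be too short on a slow boot and wasted time on a fast one. One alternative is to poll until the app answers. This is only a sketch, assuming curl is present in the job image (the deploy job's before_script installs it); retry and http_ok are hypothetical helper names:

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY_SECONDS between tries; fail after ATTEMPTS failed runs.
retry() {
  _r_attempts=$1
  _r_delay=$2
  shift 2
  _r_n=1
  while [ "$_r_n" -le "$_r_attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$_r_n" -lt "$_r_attempts" ]; then
      sleep "$_r_delay"
    fi
    _r_n=$((_r_n + 1))
  done
  return 1
}

# http_ok URL: succeed if URL answers within 5 seconds with a status below 400.
http_ok() {
  curl -fsS -o /dev/null --max-time 5 "$1"
}

# Possible replacement for the sleep-based check (not run here):
#   retry 10 3 http_ok "http://$DROPLET_HOST/" || { echo "Health check failed"; exit 1; }
```

With 10 attempts and a 3 second delay the job gives up after roughly half a minute, but passes as soon as the first request succeeds.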
Part 4 — SSH Key Setup (GitLab ↔ Droplet)
Generate an SSH key pair on your local machine
ssh-keygen -t ed25519 -C "gitlab-deployer" -f ~/.ssh/gitlab_deployer
# Leave passphrase empty — CI/CD can't type a passphrase
This creates two files:
- ~/.ssh/gitlab_deployer: private key (goes into GitLab)
- ~/.ssh/gitlab_deployer.pub: public key (goes onto the Droplet)
Add the public key to the Droplet
As root on the Droplet:
> /home/deployer/.ssh/authorized_keys
echo "ssh-ed25519 AAAA...your-public-key... gitlab-deployer" >> /home/deployer/.ssh/authorized_keys
# Permissions are critical — wrong permissions = Permission denied every time
chmod 700 /home/deployer/.ssh
chmod 600 /home/deployer/.ssh/authorized_keys
chown -R deployer:deployer /home/deployer/.ssh
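Since sshd silently rejects keys when these modes are wrong, a quick check after the chmod calls can save a debugging session. A sketch, assuming GNU stat as shipped on Ubuntu; has_mode is a hypothetical helper:

```shell
# has_mode PATH OCTAL: succeed if PATH exists and its permission bits
# equal OCTAL (for example 700 or 600).
has_mode() {
  [ "$(stat -c '%a' "$1" 2>/dev/null)" = "$2" ]
}

# Possible usage as root on the Droplet (not run here):
#   has_mode /home/deployer/.ssh 700 || echo ".ssh must be 700"
#   has_mode /home/deployer/.ssh/authorized_keys 600 || echo "authorized_keys must be 600"
```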
Test it
ssh -i ~/.ssh/gitlab_deployer deployer@your-droplet-ip
Should log in without a password prompt.
Add variables to GitLab
Go to GitLab → Settings → CI/CD → Variables:
| Key | Value | Protected | Masked |
|---|---|---|---|
| SSH_PRIVATE_KEY | Contents of ~/.ssh/gitlab_deployer | ✅ | ❌ |
| DROPLET_HOST | Your Droplet IP | ✅ | ❌ |
| DROPLET_USER | deployer | ✅ | ❌ |
| DROPLET_PATH | /var/www/nestapp | ✅ | ❌ |
⚠️ Warning
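SSH_PRIVATE_KEY cannot be masked: GitLab doesn't support masking multiline values. Set it as Protected instead so it's only available on protected branches, and make sure main is itself a protected branch, otherwise the deploy job will not receive the variable.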
Part 5 — First-Time Manual Deploy
Before CI/CD can take over, get the app running on the Droplet manually once.
Build and upload from your local machine
npm run build
scp -r dist package.json package-lock.json ecosystem.config.js root@your-droplet-ip:/var/www/nestapp/
ssh root@your-droplet-ip "chown -R deployer:deployer /var/www/nestapp"
Create the .env file on the Droplet
# As deployer
nano /var/www/nestapp/.env
NODE_ENV=production
PORT=3000
# all your other environment variables...
🚨 Important
Never commit .env to GitLab. This file lives only on the Droplet. The rsync command in the pipeline excludes it so it persists safely across every deployment.
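Because the file is maintained by hand, a missing variable only shows up as a crash loop after deploy. A small pre-start check can catch that earlier. This is a sketch; missing_env_keys is a hypothetical helper and DATABASE_URL is only an example key, not something this guide defines:

```shell
# missing_env_keys FILE KEY...: print every KEY that has no "KEY=" line in FILE.
missing_env_keys() {
  env_file=$1
  shift
  for key in "$@"; do
    grep -q "^${key}=" "$env_file" || echo "$key"
  done
}

# Possible usage as deployer before starting PM2 (not run here):
#   missing=$(missing_env_keys /var/www/nestapp/.env NODE_ENV PORT DATABASE_URL)
#   [ -z "$missing" ] || { echo "Missing in .env: $missing"; exit 1; }
```

It is also reasonable to run chmod 600 /var/www/nestapp/.env so only the deployer user can read the secrets.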
Start PM2
cd /var/www/nestapp
HUSKY=0 npm ci --omit=dev
pm2 start ecosystem.config.js --env production
pm2 save
Configure PM2 to survive reboots
# As deployer
pm2 startup
# Copy and run the generated command as root, then:
pm2 save
Part 6 — Trigger the First Automated Pipeline
git add .gitlab-ci.yml ecosystem.config.js package.json
git commit -m "chore: add CI/CD pipeline and PM2 config"
git push origin main
Go to GitLab → CI/CD → Pipelines and watch all 5 stages:
install → lint → test → build → deploy
Every push to main from this point forward triggers a full pipeline and zero-downtime deployment automatically.
Part 7 — How Zero-Downtime Actually Works
PM2 cluster reload works like this:
- GitLab CI builds the new dist/ and rsyncs it to the Droplet
- pm2 reload sends a graceful signal to worker 0
- Worker 0 finishes all in-flight requests then shuts down
- A new worker 0 boots, loads the new code, sends the ready signal
- PM2 moves to the next worker and repeats
- Nginx keeps routing traffic to all other online workers throughout
No requests are dropped. No downtime.
The wait_ready: true and listen_timeout: 8000 in ecosystem.config.js are what make this reliable — PM2 waits for the process.send('ready') signal from main.ts before it considers the new worker healthy.
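One way to check the "no requests dropped" claim on your own setup is to send a steady stream of requests while a reload runs and count how many fail. A rough sketch, assuming curl is installed; count_failures is a hypothetical helper, and the /health path assumes your app actually serves that route:

```shell
# count_failures RUNS COMMAND...: run COMMAND RUNS times and print how many
# of those runs exited with a non-zero status.
count_failures() {
  _cf_runs=$1
  shift
  _cf_failed=0
  _cf_i=0
  while [ "$_cf_i" -lt "$_cf_runs" ]; do
    "$@" >/dev/null 2>&1 || _cf_failed=$((_cf_failed + 1))
    _cf_i=$((_cf_i + 1))
  done
  echo "$_cf_failed"
}

# Possible usage: start this in one terminal, then run
# `pm2 reload nestapp --update-env` in another (not run here):
#   count_failures 500 curl -fsS --max-time 2 http://localhost:3000/health
```

A result of 0 is consistent with a clean rolling reload. Requests here are sequential, so this is a smoke test, not a load test.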
Part 8 — Useful Commands
# Check running processes
pm2 list
# View live logs
pm2 logs nestapp
# Monitor CPU and memory
pm2 monit
# Zero-downtime reload (always use this for deployments)
pm2 reload nestapp --update-env
# Hard restart (brief downtime — use only if reload fails)
pm2 restart nestapp --update-env
# Reset restart counter
pm2 reset nestapp
# View last 50 log lines
pm2 logs nestapp --lines 50
# Test Nginx config before reloading
nginx -t
# Reload Nginx without downtime
systemctl reload nginx
# Test the app is responding on the Droplet
curl http://localhost:3000
Common Issues and Fixes
| Issue | Cause | Fix |
|---|---|---|
| ecosystem.config.js not found | File not on Droplet yet | Upload manually via scp before first CI run |
| Permission denied (publickey) | Wrong SSH folder permissions | chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys |
| EACCES: permission denied mkdir /node_modules | SSH heredoc not expanding variables | Use unquoted ENDSSH (not 'ENDSSH') |
| .env deleted on every deploy | rsync --delete removes files not in source | Add --exclude='.env' and sync from a single release/ folder |
| husky: not found | npm ci runs Husky's prepare script on the server | Prefix with HUSKY=0 → HUSKY=0 npm ci --omit=dev |
| App crash-looping | Missing .env variables | Run pm2 logs nestapp --lines 50 and recreate .env |
| Cannot find module dist/main.js | rsync uploaded folder contents not the folder | Use dist without trailing slash in rsync |
| Node version wrong after install | nodesource installs wrong version | Install Node directly from nodejs.org binary |
| PM2 still using old Node version | PM2 daemon started with old binary | pm2 kill then pm2 start ecosystem.config.js --env production |
| Nginx Permission denied reading config | File permissions too restrictive | chmod 644 /etc/nginx/sites-available/nestapp |
| Certbot ACME challenge fails | DNS not pointed to Droplet yet | Run dig your-domain.com +short; it must return the Droplet IP first |
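The ENDSSH row in the table comes down to how the shell treats heredoc delimiters: with an unquoted delimiter the local shell expands variables before the text is sent, with a quoted one the text goes through literally. A self-contained illustration, where APP_DIR is a hypothetical variable used only for this demo:

```shell
APP_DIR=/var/www/nestapp

# Unquoted delimiter: $APP_DIR is expanded by the local shell.
unquoted=$(cat << EOF
cd $APP_DIR
EOF
)

# Quoted delimiter: the body is passed through untouched.
quoted=$(cat << 'EOF'
cd $APP_DIR
EOF
)

echo "$unquoted"   # prints: cd /var/www/nestapp
echo "$quoted"     # prints: cd $APP_DIR
```

In the quoted case the remote shell would see an unset $APP_DIR, so a path such as $APP_DIR/node_modules collapses to /node_modules, which matches the EACCES error in the table.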
Built from a real production deployment of a NestJS Fintech backend. Every issue in the troubleshooting table was hit and solved during the actual setup.