🚀 How We Reduced API Response Time by 60% in Node.js Using Caching, Query Optimization & Performance Profiling

Performance issues in backend systems rarely come from one obvious bug.

In our production system, a high-traffic API showed:

  • 📉 P95 latency ~ 820ms
  • 📈 Frequent database CPU spikes
  • ⛔ Increased timeout rates during traffic bursts

After a structured optimization process, we achieved:

  • ⚡ P95 latency: 820ms → 310ms
  • 🧠 Database CPU usage: down ~40%
  • 📊 Timeout rate: down ~70%

Here’s the exact breakdown of what we did.

1️⃣ Step One: Measure Before Optimizing

Before touching any code, we collected:

  • P50 / P95 / P99 latency
  • Slow query logs
  • DB execution plans
  • CPU & memory metrics
  • Event loop lag

We added a lightweight response-time logger:

// Log per-request latency using a monotonic clock
app.use((req, res, next) => {
  const start = process.hrtime.bigint();

  res.on("finish", () => {
    // nanoseconds → milliseconds
    const duration =
      Number(process.hrtime.bigint() - start) / 1_000_000;

    console.log(`${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
  });

  next();
});
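
We also tracked event loop lag with Node's built-in perf_hooks. A minimal sketch (the sampling resolution and logging interval here are illustrative, not our exact production values):

import { monitorEventLoopDelay } from "perf_hooks";

// Sample event loop delay; values are reported in nanoseconds
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

// Log the p99 lag every 10 seconds, then reset the window
setInterval(() => {
  const p99Ms = histogram.percentile(99) / 1e6;
  console.log(`event loop lag p99: ${p99Ms.toFixed(2)}ms`);
  histogram.reset();
}, 10_000);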

🔎 Observation

  • ~68% of latency was database time
  • ~20% came from repeated identical queries
  • The remainder was serialization and network overhead

Conclusion:
🧩 Primary bottleneck = Database layer.

2️⃣ Query Optimization (Biggest Impact 🔥)

❌ Problem 1: Missing Composite Index

Original query:

SELECT * 
FROM orders
WHERE user_id = $1
AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;

Execution plan showed:

  • Sequential scan
  • High disk I/O
  • Large row filtering
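
(The plan itself came from EXPLAIN ANALYZE; PostgreSQL syntax, consistent with the $1 placeholders used throughout. A literal value stands in for $1 because EXPLAIN ANALYZE actually executes the query.)

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE user_id = 123   -- literal substituted for $1
  AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;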

✅ Fix: Composite Index

CREATE INDEX idx_orders_user_status_created
ON orders(user_id, status, created_at DESC);

📊 Result

  • Query time: 180ms → 35ms
  • Removed full table scan
  • Reduced disk reads significantly

Indexing alone reduced endpoint latency by ~25%.

❌ Problem 2: Over-Fetching Data

Original:

SELECT * FROM users WHERE id = $1;

But the API only needed:

  • name
  • profile_picture

✅ Optimized Query

SELECT name, profile_picture
FROM users
WHERE id = $1;

🎯 Impact

  • Reduced payload size
  • Lower memory allocation
  • Faster JSON serialization

Small change. Measurable gain.

❌ Problem 3: N+1 Query Pattern

Original logic:

const orders = await getOrders(userId);

for (const order of orders) {
  order.items = await getItems(order.id);
}

With 20 orders, that's 21 queries (1 for the orders + 20 for their items).

✅ Optimized Join Query

SELECT o.id, o.created_at, i.product_id, i.quantity
FROM orders o
LEFT JOIN order_items i ON i.order_id = o.id
WHERE o.user_id = $1;
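
Since the join returns one flat row per order item, the application groups rows back into orders. A minimal sketch of that step (groupOrderRows is an illustrative helper, not from our codebase):

// Group flat join rows (one row per order item) back into orders
function groupOrderRows(rows) {
  const ordersById = new Map();

  for (const row of rows) {
    let order = ordersById.get(row.id);
    if (!order) {
      order = { id: row.id, created_at: row.created_at, items: [] };
      ordersById.set(row.id, order);
    }
    // LEFT JOIN yields null item columns for orders with no items
    if (row.product_id !== null) {
      order.items.push({ product_id: row.product_id, quantity: row.quantity });
    }
  }

  return [...ordersById.values()];
}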

📈 Result

  • Drastically reduced DB round trips
  • Improved P95 latency stability

3️⃣ Introducing Multi-Layer Caching 🧠

Important principle:

⚠️ Do NOT cache blindly.
Cache only:

  • Read-heavy endpoints
  • Expensive aggregations
  • Low-volatility data

🟢 Layer 1: In-Memory Cache (Short TTL)

Used for ultra-hot endpoints.

import NodeCache from "node-cache";

// 30-second TTL: hot data stays fresh without long-lived staleness
const cache = new NodeCache({ stdTTL: 30 });

async function getUserProfile(userId) {
  const key = `user:${userId}`;

  // Explicit undefined check so falsy-but-valid values still count as hits
  const cached = cache.get(key);
  if (cached !== undefined) return cached;

  const data = await fetchFromDB(userId);
  cache.set(key, data);

  return data;
}

Use case:

  • Frequently accessed profile endpoints
  • Dashboard metadata

Latency improvement: ~10–15%

🟡 Layer 2: Redis Distributed Cache

Used for:

  • Aggregated stats
  • Leaderboards
  • Expensive computations

import Redis from "ioredis"; // assuming ioredis, which accepts the ("EX", seconds) form used below

const redis = new Redis();

async function getDashboardStats(userId) {
  const key = `dashboard:${userId}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const data = await computeStats(userId);

  // Cache the computed stats for 120 seconds
  await redis.set(key, JSON.stringify(data), "EX", 120);

  return data;
}

📊 Impact

  • Significant DB load reduction
  • Improved P95 & P99 latency
  • Reduced spike sensitivity

4️⃣ Cache Invalidation Strategy ⚠️

Caching without invalidation creates stale data problems.

We used:

  • Event-based invalidation after writes
  • Short TTL for volatile data
  • Versioned keys when necessary

Example:

await redis.del(`dashboard:${userId}`);

Triggered immediately after order creation.
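
For versioned keys, writes bump a version counter instead of deleting, and stale entries simply age out via TTL. A minimal sketch (the key scheme and function names are illustrative):

// Reads resolve the current version, then fetch the versioned entry
async function getVersionedDashboard(userId) {
  const version = (await redis.get(`dashboard:ver:${userId}`)) ?? "0";
  const key = `dashboard:${userId}:v${version}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const data = await computeStats(userId);
  await redis.set(key, JSON.stringify(data), "EX", 120);
  return data;
}

// Writes just bump the version; old entries expire on their own
async function invalidateDashboard(userId) {
  await redis.incr(`dashboard:ver:${userId}`);
}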

We avoided:

  • Long-lived static caches
  • Global flushes
  • Blind TTL-only approaches

5️⃣ Connection Pool Optimization 🔌

We observed:

  • DB pool exhaustion during spikes
  • Increased wait time for connections

Original pool size: 10
Optimized to: 25 (after validating DB capacity)
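
With node-postgres, for example, this is a one-line pool change. A sketch assuming the pg driver (the timeout values are illustrative, not our exact settings):

import pg from "pg";

const pool = new pg.Pool({
  max: 25,                        // raised from 10 after validating DB capacity
  idleTimeoutMillis: 30_000,      // release idle connections
  connectionTimeoutMillis: 2_000, // fail fast instead of queuing indefinitely
});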

📈 Result

  • Reduced queuing delay
  • Stabilized latency under burst traffic

6️⃣ JSON Serialization Optimization ⚡

Large nested objects increased serialization cost.

Instead of returning deeply populated objects, we:

  • Reduced unnecessary fields
  • Flattened response structure
  • Avoided over-population
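
In practice this meant mapping rows to a lean response shape before res.json. A minimal sketch (field names and the getOrder helper are illustrative):

// Map a populated order to a flat, lean response shape
function toOrderResponse(order) {
  return {
    id: order.id,
    createdAt: order.created_at,
    total: order.total,
    // flatten the nested user object to the two fields clients actually use
    userName: order.user?.name,
    userAvatar: order.user?.profile_picture,
  };
}

app.get("/orders/:id", async (req, res) => {
  const order = await getOrder(req.params.id); // hypothetical fetch helper
  res.json(toOrderResponse(order));
});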

Serialization overhead dropped ~8–10%.

7️⃣ Final Optimized Flow 🏗️

Client
  ↓
Load Balancer
  ↓
Node.js (Clustered)
  ↓
In-Memory Cache
  ↓
Redis
  ↓
Optimized Indexed Queries
  ↓
Database
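
Putting the layers together, a read follows a cache-aside chain. A condensed sketch of the lookup order (getWithLayers is illustrative; cache and redis reuse the clients from the examples above):

async function getWithLayers(key, loadFromDB) {
  // Layer 1: in-process cache (short TTL)
  const local = cache.get(key);
  if (local !== undefined) return local;

  // Layer 2: shared Redis cache
  const shared = await redis.get(key);
  if (shared) {
    const data = JSON.parse(shared);
    cache.set(key, data);
    return data;
  }

  // Layer 3: optimized, indexed query against the database
  const data = await loadFromDB();
  await redis.set(key, JSON.stringify(data), "EX", 120);
  cache.set(key, data);
  return data;
}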

Thanks for reading :)
