Why I Stopped Writing “Just Another CSV Script” for Every Project

why-i-stopped-writing-“just-another-csv-script”-for-every-project

Every project starts the same way:

  1. – Client sends a messy CSV file
  2. – I write a quick script to clean it
  3. – A week later… they send another file, slightly different
  4. – I tweak the script again
  5. – Repeat until I’m buried in tiny, fragile one-off scripts

Sound familiar?

In the past, I treated CSV cleaning like it was a minor task—just whip up some Node.js, make the necessary fixes, and then get on with my day.

The Problem With One-Off Scripts

One-time scripts are fast to write and easy to forget. But they come back to haunt you when:

  • A client changes the column order or headers
  • You forget which script handles which format
  • Someone else needs to run it—and it only works on your machine
  • You end up repeating the same logic across 10 files

I was solving the same problems repeatedly:

  • Normalize inconsistent column names
  • Convert date formats
  • Drop blank or duplicate rows
  • Handle different encodings (UTF-8 with BOMs… hello darkness)
  • Export the cleaned result

I didn’t need more scripts. I needed structure.

What I Do Now Instead

These days, when I come across a messy new file, I don’t just dive in from the beginning.

I’ve developed a handy approach that breaks things down into small, testable parts.

  • input parsers (CSV, Excel, JSON)
  • a normalization layer (headers, encodings)
  • a transformation layer (date formatting, filters, maps)
  • an output formatter (CSV, JSON, preview)
    This isn’t a framework. It’s just a mindset:
    Write it once → reuse it forever.

Example: Simple Modular Cleanup in Node.js

Instead of one giant script, I use small utilities like these:
parser.js

const fs = require("fs");
const csv = require("csv-parser");

function parseCSV(filePath) {
  return new Promise((resolve, reject) => {
    const results = [];
    fs.createReadStream(filePath)
      .pipe(csv())
      .on("data", (row) => results.push(row))
      .on("end", () => resolve(results))
      .on("error", reject);
  });
}

module.exports = { parseCSV };

cleaner.js

function cleanRows(data) {
  return data
    .filter(row => Object.values(row).some(val => val !== ""))
    .map(row => ({
      ...row,
      date: new Date(row.date).toISOString().split("T")[0], // Normalize date
      name: row.name?.trim(), // Clean string
    }));
}

module.exports = { cleanRows };

exporter.js

const { writeFileSync } = require("fs");

function exportCSV(data, path) {
  const header = Object.keys(data[0]).join(",");
  const rows = data.map(obj => Object.values(obj).join(",")).join("n");
  writeFileSync(path, `${header}n${rows}`, "utf8");
}

module.exports = { exportCSV };

main.js

const { parseCSV } = require("./parser");
const { cleanRows } = require("./cleaner");
const { exportCSV } = require("./exporter");

async function runCleanup() {
  const raw = await parseCSV("dirty.csv");
  const cleaned = cleanRows(raw);
  exportCSV(cleaned, "cleaned.csv");
}

runCleanup();

Now, whenever I receive a new file, I simply adjust my cleaner.js logic—no need to start from square one anymore.

Benefits of Moving Away From “Just Scripts”

  • Less copy-paste, more confidence
  • Easier to onboard clients or teammates
  • Faster debugging (you know where the logic lives)
  • Fewer edge-case surprises
  • Scales from a 100-row file to 1 million+ rows
    Now when I get a weird file with 12 columns, 3 date formats, and 2 “LOL” rows… I know my workflow can handle it.

Takeaways for Devs Handling Messy Data

  • Your first script should solve the problem
  • Your second should solve the pattern
  • Your third should become a system

If you’re still writing one-off scripts for every client file:
no shame — we’ve all done it
but long term, it’s pain on repeat

If you’ve already moved to a modular, testable data-cleaning setup, I’d love to hear how you approached it

Total
0
Shares
Leave a Reply

Your email address will not be published. Required fields are marked *

Previous Post
what-is-positioning?-market,-product,-and-brand-positioning-explained

What is positioning? Market, product, and brand positioning explained

Next Post
gm-to-invest-$4b-in-its-us.-manufacturing-plants

GM to Invest $4B in Its U.S. Manufacturing Plants

Related Posts