Every project starts the same way:
- Client sends a messy CSV file
- I write a quick script to clean it
- A week later… they send another file, slightly different
- I tweak the script again
- Repeat until I’m buried in tiny, fragile one-off scripts
Sound familiar?
In the past, I treated CSV cleaning like it was a minor task—just whip up some Node.js, make the necessary fixes, and then get on with my day.
The Problem With One-Off Scripts
One-off scripts are fast to write and easy to forget. But they come back to haunt you when:
- A client changes the column order or headers
- You forget which script handles which format
- Someone else needs to run it—and it only works on your machine
- You end up repeating the same logic across 10 files
I was solving the same problems repeatedly:
- Normalize inconsistent column names
- Convert date formats
- Drop blank or duplicate rows
- Handle different encodings (UTF-8 with BOMs… hello darkness)
- Export the cleaned result
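Just to make that concrete, here's roughly the helper I kept rewriting for the first couple of bullets. The name normalizeHeaders and the exact replacements are my own sketch, not from any library:

// normalize-headers.js (illustrative sketch)
// Strips a stray UTF-8 BOM and normalizes header keys like "First Name " / "first_name"
// down to one canonical spelling, so downstream code can rely on a single form.
function normalizeHeaders(rows) {
  return rows.map(row => {
    const cleaned = {};
    for (const [key, value] of Object.entries(row)) {
      const normalizedKey = key
        .replace(/^\uFEFF/, "") // drop a leading BOM if the parser let it through
        .trim()
        .toLowerCase()
        .replace(/\s+/g, "_"); // "First Name" -> "first_name"
      cleaned[normalizedKey] = value;
    }
    return cleaned;
  });
}

module.exports = { normalizeHeaders };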
I didn’t need more scripts. I needed structure.
What I Do Now Instead
These days, when a messy new file lands in my inbox, I don’t start from scratch.
I’ve settled on a lightweight approach that breaks the work into small, testable parts:
- input parsers (CSV, Excel, JSON)
- a normalization layer (headers, encodings)
- a transformation layer (date formatting, filters, maps)
- an output formatter (CSV, JSON, preview)
This isn’t a framework. It’s just a mindset:
Write it once → reuse it forever.
Example: Simple Modular Cleanup in Node.js
Instead of one giant script, I use small utilities like these:
parser.js

const fs = require("fs");
const csv = require("csv-parser");

function parseCSV(filePath) {
  return new Promise((resolve, reject) => {
    const results = [];
    fs.createReadStream(filePath)
      .pipe(csv())
      .on("data", (row) => results.push(row))
      .on("end", () => resolve(results))
      .on("error", reject);
  });
}

module.exports = { parseCSV };
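Excel and JSON inputs get their own parser modules next to this one. A JSON version might look like the sketch below (the filename and function name are hypothetical); the point is that it resolves to the same array-of-rows shape, so everything downstream stays identical:

// parser-json.js (hypothetical sibling to parser.js, same output shape)
const { readFile } = require("fs/promises");

// Resolves to an array of plain row objects, just like parseCSV does.
async function parseJSON(filePath) {
  const raw = await readFile(filePath, "utf8");
  const data = JSON.parse(raw);
  return Array.isArray(data) ? data : [data];
}

module.exports = { parseJSON };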
cleaner.js

function cleanRows(data) {
  return data
    .filter(row => Object.values(row).some(val => val !== "")) // drop blank rows
    .map(row => {
      const parsedDate = new Date(row.date);
      return {
        ...row,
        // Normalize date to YYYY-MM-DD; leave it untouched if it doesn't parse
        date: isNaN(parsedDate) ? row.date : parsedDate.toISOString().split("T")[0],
        name: row.name?.trim(), // Clean string
      };
    });
}

module.exports = { cleanRows };
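Note that cleanRows only drops blank rows. When a client file also contains duplicates, I add a separate pass like this (a sketch of my own, not part of the snippets above; it keys each row on its JSON string, which assumes csv-parser gives every row the same column order):

// dedupe.js (illustrative sketch)
// Removes exact duplicate rows by serializing each row to a string key.
function dedupeRows(data) {
  const seen = new Set();
  return data.filter(row => {
    const key = JSON.stringify(row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

module.exports = { dedupeRows };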
exporter.js

const { writeFileSync } = require("fs");

function exportCSV(data, path) {
  const header = Object.keys(data[0]).join(",");
  const rows = data.map(obj => Object.values(obj).join(",")).join("\n");
  writeFileSync(path, `${header}\n${rows}`, "utf8");
}

module.exports = { exportCSV };
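One caveat on exportCSV: joining values with commas breaks as soon as a field itself contains a comma, a quote, or a newline. When that matters, I swap in a tiny quoting step. This is a sketch following the usual CSV quoting convention (RFC 4180 style), not any particular library:

// Wraps a value in quotes and doubles embedded quotes when needed.
function quoteField(value) {
  const str = String(value ?? "");
  return /[",\n]/.test(str) ? `"${str.replace(/"/g, '""')}"` : str;
}

// Drop-in replacement for the row-building line in exportCSV:
// const rows = data.map(obj => Object.values(obj).map(quoteField).join(",")).join("\n");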
main.js

const { parseCSV } = require("./parser");
const { cleanRows } = require("./cleaner");
const { exportCSV } = require("./exporter");

async function runCleanup() {
  const raw = await parseCSV("dirty.csv");
  const cleaned = cleanRows(raw);
  exportCSV(cleaned, "cleaned.csv");
}

runCleanup();
Now, whenever I receive a new file, I simply adjust my cleaner.js logic—no need to start from square one anymore.
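In practice, “adjusting the cleaner logic” usually means passing a small per-client config rather than editing the function body. A configurable sketch (the option names here are just my own convention, not from the code above):

// cleaner.js, configurable variant (sketch)
function cleanRows(data, { dateColumn = "date", trimColumns = ["name"] } = {}) {
  return data
    .filter(row => Object.values(row).some(val => val !== ""))
    .map(row => {
      const out = { ...row };
      if (out[dateColumn]) {
        const parsed = new Date(out[dateColumn]);
        if (!isNaN(parsed)) out[dateColumn] = parsed.toISOString().split("T")[0];
      }
      for (const col of trimColumns) {
        if (typeof out[col] === "string") out[col] = out[col].trim();
      }
      return out;
    });
}

// main.js then only changes the options per client:
// const cleaned = cleanRows(raw, { dateColumn: "order_date", trimColumns: ["name", "email"] });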
Benefits of Moving Away From “Just Scripts”
- Less copy-paste, more confidence
- Easier to onboard clients or teammates
- Faster debugging (you know where the logic lives)
- Fewer edge-case surprises
- Scales from a 100-row file to 1 million+ rows
Now when I get a weird file with 12 columns, 3 date formats, and 2 “LOL” rows… I know my workflow can handle it.
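One honest caveat on the “million-plus rows” point: parseCSV above buffers every row in memory. Once files get that big, I switch the same pieces over to a streaming pass, roughly like the sketch below. It reuses the same csv-parser dependency; cleanRow is assumed to be a per-row version of the cleanRows logic:

const fs = require("fs");
const csv = require("csv-parser");

// Streams rows from input to output without holding the whole file in memory.
function streamCleanup(inputPath, outputPath, cleanRow) {
  return new Promise((resolve, reject) => {
    const out = fs.createWriteStream(outputPath, "utf8");
    let headerWritten = false;

    fs.createReadStream(inputPath)
      .pipe(csv())
      .on("data", (row) => {
        const cleaned = cleanRow(row);
        if (!headerWritten) {
          out.write(Object.keys(cleaned).join(",") + "\n");
          headerWritten = true;
        }
        out.write(Object.values(cleaned).join(",") + "\n");
      })
      .on("end", () => {
        out.end();
        resolve();
      })
      .on("error", reject);
  });
}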
Takeaways for Devs Handling Messy Data
- Your first script should solve the problem
- Your second should solve the pattern
- Your third should become a system
If you’re still writing one-off scripts for every client file: no shame — we’ve all done it. But long term, it’s pain on repeat.

If you’ve already moved to a modular, testable data-cleaning setup, I’d love to hear how you approached it.