HashiCorp built an MCP server for writing Terraform. I built one for reviewing it.
A few weeks ago HashiCorp shipped terraform-mcp-server. It’s an official MCP server that lets a model lean on the Terraform Registry: search providers, pull module docs, manage HCP Terraform workspaces. The shape is “help the model author IaC.” That’s a genuinely useful tool and a clear signal that Terraform-via-MCP is now a category, not a curiosity.
But it doesn’t help with the part of Terraform I spend the most time stressed about: reviewing somebody else’s plan.
The output of terraform plan is long. Real production plans run into the thousands of lines. The risky changes (an IAM grant widening to *, a security group opening to 0.0.0.0/0, an RDS instance being replaced) hide among a hundred routine attribute updates. Code review tools don’t help because the danger isn’t in the HCL diff, it’s in the planned actions.
So I built tf-review-mcp. It’s an MCP server scoped to one job: parse terraform show -json output and surface what a human reviewer actually cares about, structured for an LLM to quote and reason about.
What it does
Two tools:
- review_plan(plan_json_path) returns a structured summary: action counts, high-blast-radius resource changes, stateful destroys, and diff-aware public-exposure findings.
- suggest_review_comments(plan_json_path) returns a list of {address, severity, comment} objects ready to drop into a PR review.
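As a rough illustration of the "action counts" part, here is a minimal sketch of how they could be derived from terraform show -json output. It relies on the documented plan layout, where each entry in resource_changes carries a change.actions list and a replace appears as a delete paired with a create. The helper names classify and count_actions are hypothetical and not taken from the repo.

```python
from collections import Counter


def classify(actions):
    """Collapse a Terraform change.actions list into a single label.

    Terraform encodes a replace as ["delete", "create"] or
    ["create", "delete"], so both orders map to "replace".
    """
    if "delete" in actions and "create" in actions:
        return "replace"
    return actions[0] if actions else "no-op"


def count_actions(plan):
    """Count planned actions per label, ignoring no-op entries."""
    counts = Counter(
        classify(rc["change"]["actions"])
        for rc in plan.get("resource_changes", [])
    )
    counts.pop("no-op", None)
    return dict(counts)
```

Treating both orderings as a replace matters for review: create_before_destroy resources show up as ["create", "delete"] and would otherwise be counted as plain creates.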
The server flags three things:
High-risk resource types (warn). A conservative built-in list across AWS, GCP, and Azure: IAM, KMS, RDS, security groups, S3, EKS, GKE, Cloud SQL, GCS, Cloud DNS, GCE firewalls, Key Vault. Easy to extend in source.
Stateful destroys (blocker). When a stateful resource like aws_db_instance, google_sql_database_instance, or google_compute_instance is scheduled for delete or replace. The GCE case matters more than people realize: a replace on a Compute Engine VM with local SSDs nukes both the boot disk and any local-SSD attachments. Silent data loss if a reviewer misses it.
Public exposure changes (blocker). Diff-aware. The server compares before and after on every google_compute_firewall change and fires when source_ranges newly contains 0.0.0.0/0 or ::/0. That single check has caught more dumb mistakes for me than any static analyzer.
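The two blocker checks can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not the repo's code: the function names and the STATEFUL_TYPES set (seeded only with the three types named above) are hypothetical, and the plan dict is assumed to follow the terraform show -json layout with before and after objects on each change.

```python
# Assumed seed list: only the resource types named in the text above.
STATEFUL_TYPES = {
    "aws_db_instance",
    "google_sql_database_instance",
    "google_compute_instance",
}
PUBLIC_CIDRS = {"0.0.0.0/0", "::/0"}


def stateful_destroys(plan):
    """Addresses of stateful resources whose actions include a delete.

    A replace also contains "delete", so it is caught by the same test.
    """
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if rc["type"] in STATEFUL_TYPES and "delete" in rc["change"]["actions"]
    ]


def newly_public_firewalls(plan):
    """Firewalls whose source_ranges gain a public CIDR in this plan."""
    findings = []
    for rc in plan.get("resource_changes", []):
        if rc["type"] != "google_compute_firewall":
            continue
        change = rc["change"]
        # "before" is null on create and "after" is null on delete.
        before = set((change.get("before") or {}).get("source_ranges") or [])
        after = set((change.get("after") or {}).get("source_ranges") or [])
        for cidr in sorted((after - before) & PUBLIC_CIDRS):
            findings.append({
                "address": rc["address"],
                "finding": f"source_ranges now includes {cidr} (public exposure)",
            })
    return findings
```

Comparing sets of before and after values is what makes the check diff-aware: a firewall that was already open to 0.0.0.0/0 and only changes its description produces no finding, while a newly created public firewall does.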
What it looks like
Given a plan that deletes a Cloud SQL instance, replaces a GCE VM, and widens a firewall to the public internet, the server returns:
{
  "counts": {"create": 1, "update": 3, "replace": 1, "delete": 1},
  "stateful_destroys": [
    {"address": "google_compute_instance.op_geth", ...},
    {"address": "google_sql_database_instance.indexer", ...}
  ],
  "public_exposure_changes": [
    {"address": "google_compute_firewall.rpc_public",
     "finding": "source_ranges now includes 0.0.0.0/0 (public exposure)"}
  ],
  "notes": [
    "2 stateful resource(s) scheduled for destroy/replace. Verify backups and migration plan before applying.",
    "1 firewall change(s) widen public exposure. Confirm intent before applying."
  ]
}
Asked to summarize, Claude reads this and produces:
Blocker-severity items (3)
- google_compute_instance.op_geth: Delete/recreate of a stateful compute instance. Confirm backup, migration, and rollback plan before merging.
- google_sql_database_instance.indexer: Cloud SQL instance is being deleted. Confirm backup, migration, and rollback plan before merging.
- google_compute_firewall.rpc_public: source_ranges now includes 0.0.0.0/0 (public exposure). Confirm this is intentional and matches firewall policy.
That’s a PR review comment I’d actually post.
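To make the {address, severity, comment} shape concrete, here is a hedged sketch of how a summary like the one shown earlier might be flattened into review comments. The function name suggest_comments and the comment templates are assumptions for illustration; the real tool's wording and logic may differ.

```python
def suggest_comments(summary):
    """Turn a review summary dict into {address, severity, comment} items."""
    comments = []
    for item in summary.get("stateful_destroys", []):
        comments.append({
            "address": item["address"],
            "severity": "blocker",
            "comment": (
                "Stateful resource is scheduled for delete or replace. "
                "Confirm backup, migration, and rollback plan before merging."
            ),
        })
    for item in summary.get("public_exposure_changes", []):
        comments.append({
            "address": item["address"],
            "severity": "blocker",
            "comment": (
                f"{item['finding']}. "
                "Confirm this is intentional and matches firewall policy."
            ),
        })
    return comments
```

Keeping the output as plain data leaves the phrasing of the final PR comment to the model while the facts (address and severity) stay fixed.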
Why MCP, not a CLI
The same parser could be a CLI. The reason MCP matters is the conversational shape. When I’m reviewing a plan, I want to ask follow-up questions: “is the SQL delete a replace or a hard destroy?”, “what changed about that firewall?”, “draft me a PR comment for the IAM grant.” The model can pull the right tool, narrow on the right resource, and write something useful, because the underlying tool returns data, not prose. Lower hallucination surface. Higher signal per token.
Most community Terraform-MCP experiments wrap the CLI (“run terraform plan for me”). That’s the wrong abstraction for review. You don’t want the model running plan; you want it reasoning about a plan that already ran.
Architecture and deployment
The parser is in review.py as pure functions with no MCP imports: plan JSON in, ReviewSummary out. The MCP wrapper is in server.py, about a hundred lines. The split means the parser can be unit-tested, dropped into CI, or audited without touching the transport.
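A minimal sketch of that split, under assumptions: the ReviewSummary fields shown here and the helper name review are hypothetical, and the MCP tool registration is left out so the example runs without any MCP dependency. The idea is only that the parser takes a dict and returns data, while a thin wrapper handles the file path.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ReviewSummary:
    # Assumed fields for illustration; the real class likely has more.
    counts: dict = field(default_factory=dict)
    notes: list = field(default_factory=list)


def review(plan: dict) -> ReviewSummary:
    """Pure parser: plan JSON (as a dict) in, ReviewSummary out. No I/O."""
    summary = ReviewSummary()
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if not actions or actions == ["no-op"]:
            continue
        is_replace = {"create", "delete"} <= set(actions)
        label = "replace" if is_replace else actions[0]
        summary.counts[label] = summary.counts.get(label, 0) + 1
    return summary


def review_plan(plan_json_path: str) -> dict:
    """Transport-facing wrapper: read the file, call the parser, return plain data.

    In an MCP server this is the function that would be registered as a tool.
    """
    plan = json.loads(Path(plan_json_path).read_text())
    return asdict(review(plan))
```

Because review never touches the filesystem or the transport, a unit test or a CI job can call it directly with a dict loaded from any source.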
Deployment is local stdio. Always. The MCP client launches the server as a child process. Plan JSON never leaves your laptop. No auth, no shared state, no network listener.
Plan JSON contains IAM relationships, account IDs, security group rules, and resource counts. Hosted “send us your plan” services are a recon goldmine waiting to happen. The architecture should make leakage impossible by default. A self-hosted HTTP variant for CI is a v2 idea, and only with auth and a path allowlist in place.
Why this is differentiated
HashiCorp’s official server covers Registry lookup and HCP workspace management. Helpful for writing.
This one covers plan review. Helpful for reviewing.
Both ship as MCP servers. Both can be installed in the same client. The model picks the right tool per task. They’re complementary by design, not competitive.
I haven’t found another MCP server scoped specifically to plan review with LLM-friendly structured output, as of writing. If one exists, I want to know.
What’s next
- estimate_cost_delta wrapping Infracost.
- check_policy wrapping Conftest, so teams can bring their own Rego.
- Configurable HIGH_RISK_TYPES via a YAML file so each team can codify its own blast-radius rules.
- More diff-aware checks: aws_security_group ingress widening, IAM * grants, google_storage_bucket force_destroy toggles.
The repo lives here: github.com/sanjeevkkansal/tf-review-mcp. It’s MIT-licensed, v0.2, and very much open to PRs, especially if you have a war story about a plan that should have been caught.
Sanjeev Kumar is an infrastructure engineer working on platform tooling, AI-assisted ops, and infra that doesn’t wake people up at 3am.