The Pitfalls of Test Coverage: Introducing Mutation Testing with Stryker and Cosmic Ray

Overview

  • Goal: Overcome the limitations of code coverage metrics by introducing ‘Mutation Testing’ to verify whether test code actually catches errors in business logic.
  • Scope: Core modules of the enterprise orchestrator project (Ochestrator) in both Frontend (TypeScript) and Backend (Python).
  • Expected Results: Improve code stability and test reliability by securing a ‘Mutation Score’ beyond simple line coverage.

We often believe that high test coverage means safe code. However, it’s difficult to answer the question: “Who tests the tests?” Tests that simply execute code without proper assertions still contribute to coverage metrics. To solve this ‘coverage trap’, we introduced mutation testing.
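
To make the trap concrete, here is a minimal Python sketch (the clamp function and both tests are hypothetical, not from the project). The first test executes every line, so line coverage reports 100%, yet it asserts nothing and would still pass if the logic were broken; only the second test can kill mutants.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Clamp value into the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value

def test_clamp_coverage_only():
    # Hits every line -> 100% line coverage...
    clamp(-1, 0, 10)
    clamp(99, 0, 10)
    clamp(5, 0, 10)
    # ...but asserts nothing: a mutant flipping '<' to '<=' still passes.

def test_clamp_actually_verifies():
    # Assertions pin the behaviour down, so such mutants are killed.
    assert clamp(-1, 0, 10) == 0
    assert clamp(99, 0, 10) == 10
    assert clamp(5, 0, 10) == 5
```

Both tests report identical coverage; only a mutation-testing tool exposes that the first one verifies nothing.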

Mutation Testing Flow
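
Conceptually the flow is: generate a mutant of the production code, rerun the test suite, and record the outcome. A killed mutant (some test fails) is good; a survived mutant (all tests still pass) exposes a verification gap. The loop can be sketched in a few lines (a simplified illustration, not how Stryker or Cosmic Ray are actually implemented):

```python
from typing import Callable, List

def original(a: int, b: int) -> bool:
    return a < b

# Two typical mutants of the comparison above.
MUTANTS: List[Callable[[int, int], bool]] = [
    lambda a, b: a <= b,   # boundary mutant: '<' -> '<='
    lambda a, b: False,    # condition replaced with a constant
]

def suite(fn: Callable[[int, int], bool]) -> bool:
    """Return True if all tests pass for the given implementation."""
    return fn(1, 2) is True and fn(2, 1) is False and fn(2, 2) is False

assert suite(original)  # the real implementation must pass first

killed = sum(1 for mutant in MUTANTS if not suite(mutant))
score = killed / len(MUTANTS) * 100
print(f"mutation score: {score:.0f}%")  # prints "mutation score: 100%"
```

Note that the suite only kills both mutants because it includes the boundary case fn(2, 2); drop that assertion and the first mutant survives.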

Implementation

1. TypeScript Environment: Introducing Stryker Mutator

For the TypeScript environment, including frontend and common utilities, we chose Stryker. It integrates well with Vitest and is easy to configure.

  • Tech Stack: TypeScript, Vitest, Stryker Mutator
  • Key Configuration (stryker.config.json):
  {
    "testRunner": "vitest",
    "reporters": ["html", "clear-text", "progress"],
    "concurrency": 4,
    "incremental": true,
    "mutate": [
      "src/utils/**/*.ts",
      "src/services/**/*.ts"
    ]
  }

We enabled the incremental option so that repeat runs only re-test mutants affected by changed files, rather than mutating the whole codebase every time.

2. Python Environment: Introducing Cosmic Ray

For the backend environment, we introduced Cosmic Ray. It generates powerful mutations by manipulating the AST (Abstract Syntax Tree) of the Python source before each test run.
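
The core idea of AST-based mutation can be sketched with the standard-library ast module (an illustration of the technique, not Cosmic Ray's internal code; needs_more_memory is a hypothetical function):

```python
import ast

SOURCE = """
def needs_more_memory(available, required):
    return available < required
"""

class FlipLt(ast.NodeTransformer):
    """Mutation operator: replace '<' with '<=' to create a boundary mutant."""
    def visit_Compare(self, node):
        node.ops = [ast.LtE() if isinstance(op, ast.Lt) else op
                    for op in node.ops]
        return node

# Parse -> mutate the tree -> recompile -> extract the mutant function.
tree = FlipLt().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, "<mutant>", "exec"), namespace)
mutant = namespace["needs_more_memory"]

# Boundary input: the original returns False, the mutant returns True.
print(mutant(4, 4))  # prints True
```

A test suite without an exact-boundary case cannot tell the mutant from the original, which is precisely the gap mutation testing reveals.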

  • Tech Stack: Python, Pytest, Cosmic Ray, Docker
  • Execution Architecture: Since mutation testing consumes significant computational resources, we configured it to run in parallel across multiple workers using Docker.
  # Partial docker-compose.test.yaml
  cosmic-worker-1:
    command: uv run cosmic-ray worker cosmic.sqlite
  cosmic-runner:
    depends_on: [cosmic-worker-1, cosmic-worker-2]
    command: >
      sh -c "uv run cosmic-ray init cosmic-ray.toml cosmic.sqlite &&
             uv run cosmic-ray exec cosmic-ray.toml cosmic.sqlite"
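
For reference, the session configuration might look roughly like this (a sketch following Cosmic Ray's documented TOML layout; the paths, timeout, and test command are placeholders, and the distributor section must match however the workers are wired up):

```toml
# cosmic-ray.toml (illustrative)
[cosmic-ray]
module-path = "src/"
timeout = 30.0
excluded-modules = []
test-command = "uv run pytest -x tests/"

[cosmic-ray.distributor]
name = "local"
```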

Debugging/Challenges

Real-world Case: Survived Mutants in videoSplitter.ts

The most interesting case was videoSplitter.ts, which handles video splitting. This file had over 95% line coverage, but Stryker revealed shocking results.

  • Problem Statement:
    A large number of mutants survived in the logic that checks available memory.
  // Original Code
  if (availableMemory < requiredMemory) {
    throw new Error("Insufficient memory.");
  }

Even when Stryker mutated this condition to if (false) or if (availableMemory <= requiredMemory), every existing test still passed.

  • Root Cause Analysis:
    Existing tests focused only on "whether an error occurs," missing boundary value tests for exactly which conditions trigger the error. In other words, coverage was high, but the actual logic wasn't being thoroughly verified.

  • Solution:
    To 'kill' the surviving mutants, we reinforced the test cases with boundary value analysis.

  test('Boundary value verification for memory', () => {
    // checkAvailableMemory is a hypothetical wrapper around the guard above
    const requiredMemory = 1024;
    // Exactly equal: the strict '<' check must NOT throw
    expect(() => checkAvailableMemory(1024, requiredMemory)).not.toThrow();
    // Slightly less than required: must throw
    expect(() => checkAvailableMemory(1023, requiredMemory)).toThrow('Insufficient memory.');
  });

Results

  • Achievements:

    • Discovered 12 survived mutants in core utility modules and killed them with reinforced tests.
    • Elevated test code from simply 'executing' code to truly 'verifying' it.
  • Key Metrics:

    • Mutation Score: Improved from an initial 62% to 88%.
    • Reliability: Prevented potential regression bugs by running test:mutation scripts before deployment.
  • User Feedback: Positive reactions from team members: "I can now refactor with confidence, trusting our tests."

Key Takeaways

  • Coverage is just the beginning: Line coverage only tells you 'what is not tested,' not the 'quality of what is tested.'
  • Mutation testing is expensive but worth it: Although it takes time (up to tens of minutes for full execution), it's essential for core business logic or complex utilities.
  • Incremental Adoption: Rather than applying it to all code at once, it's important to build success stories by starting with core modules like videoSplitter.ts.
