Your AI agent did not fail because the model was weak.
It failed because it made a decision no one had authorized it to make.
Maybe it skipped an escalation.
Maybe it treated a missing requirement as obvious.
Maybe it chose one tradeoff over another because a threshold told it to.
The dangerous part is not that the AI made a mistake.
The dangerous part is that the system allowed the decision to happen invisibly.
This is not a tooling problem. It is a definition problem.
What AI Actually Is
Before designing any harness, we need to agree on what we are harnessing.
My working definition:
AI is a machine that executes the work of structuring information according to a given purpose.
Two constraints follow immediately from this definition:
- Purpose is supplied externally. AI does not generate its own goals. A car does not decide where to go. AI does not decide what to optimize for.
- Structuring information is not the same as making judgments. A car can move faster than a human. That does not mean it decides the route.
This is not a limitation to be engineered around. It is the definition itself.
Where Harness Engineering Goes Wrong
The harness engineering movement — which crystallized in early 2026 — defines the harness as everything except the model: tools, memory, guardrails, feedback loops, retry mechanisms, confidence thresholds.
The formula is clean: Agent = Model + Harness.
But there is a category error embedded in it.
When AI agents were not yet capable of chaining actions, humans performed the orchestration manually. They connected outputs, prioritized next steps, and filled in the gaps when something was unclear. That human orchestration contained two things mixed together:
- Execution work — connecting outputs, sequencing steps, formatting results
- Judgment work — resolving tradeoffs, filling in unknowns, deciding priorities
Harness engineering took this human orchestration and delegated it to the harness — without separating execution from judgment first.
The result: the harness now contains judgment calls that were never made explicit. They are buried in threshold values, fallback rules, and priority weights that someone configured without realizing they were making decisions on behalf of the system.
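To make that concrete, here is what such a configuration often looks like. The field names below are hypothetical, not taken from any particular framework, but every value in it is a decision someone made on behalf of the system:

```python
# Hypothetical harness configuration; the field names are illustrative,
# not drawn from any real framework.
HARNESS_CONFIG = {
    # Below 0.8 confidence, retry instead of escalating.
    # Who decided that 0.8 is "confident enough"? That is a judgment.
    "confidence_threshold": 0.8,

    # If no source answers the question, fall back to the model's best guess.
    # That is a decision to tolerate possible fabrication.
    "unknown_fallback": "model_best_guess",

    # When latency and accuracy conflict, latency wins.
    # That is a tradeoff, resolved once, silently, for every future task.
    "priority_weights": {"latency": 0.7, "accuracy": 0.3},
}
```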
If the definition is wrong, refining the methodology only embeds the error deeper.
You cannot harness your way out of a category mistake.
The Two Points That Belong to Humans
Information structuring work always contains two types of unresolvable moments:
1. Tradeoffs — situations where two valid paths exist and the choice depends on values, priorities, or context that the AI was not given.
2. Unknowns — gaps in information that cannot be filled by inference without risk of fabrication.
These are not edge cases. They are structurally guaranteed to appear in any non-trivial task. Project managers have known this for decades. Every project begins with a risk register. Unknowns are logged on day one, not discovered in production.
The design question is not whether these moments will occur. It is where control goes when they do.
Confidence thresholds and risk scores do not answer this question. They are themselves tradeoff decisions — and tradeoff decisions belong to humans by definition, not by preference.
The threshold is not a parameter. It is a judgment.
And judgments, by definition, belong to humans.
The Same Principle Already Exists Everywhere
This is not a new idea. We have solved it before, in two adjacent domains.
Software engineering: well-designed systems do not suppress exceptions. They surface them to the caller. A try-catch that swallows every error and continues execution is not robust engineering — it is a liability. Harness engineering that handles every unknown internally, without escalating to a human, is structurally identical.
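The contrast is easy to see in code. This is a minimal Python sketch; `run` and `KnownTransientError` are placeholders, not real library APIs:

```python
class KnownTransientError(Exception):
    """The one failure this step is authorized to handle on its own."""

def run(task):
    ...  # placeholder for the actual unit of work

def fragile_step(task):
    # Swallows every error and keeps going. The failure becomes invisible
    # to the caller, exactly like a harness that resolves every unknown internally.
    try:
        return run(task)
    except Exception:
        return None

def robust_step(task):
    # Handles only what it is authorized to handle; anything else
    # propagates to the caller, who owns that decision.
    try:
        return run(task)
    except KnownTransientError:
        return run(task)  # a single retry is within this step's scope of authority
```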
Organizational design: every role in a functioning organization operates within a defined scope of authority. When a situation exceeds that scope, it escalates. Not because the person is incapable, but because the decision belongs to a different level of authority. This is not failure. It is the system working as designed.
AI organization design needs the same structure. The escalation path is not a fallback. It is a first-class design element.
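One way to treat escalation as a first-class element is to encode it in the return type of every unit of work, so that a tradeoff or an unknown is a normal, typed outcome rather than an error path. The shape below is a sketch under that assumption, not a prescribed interface:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Completed:
    """The AI finished the structuring work within its authority."""
    output: str

@dataclass
class Tradeoff:
    """Two valid paths exist; the choice depends on values the AI was not given."""
    options: list[str]
    context: str

@dataclass
class Unknown:
    """A gap that cannot be filled without risking fabrication."""
    missing: str
    owner: str  # the human role that owns this decision

# Every unit of work returns one of these three. Escalation is not an
# exception path; it is a normal, typed outcome the caller must handle.
Outcome = Union[Completed, Tradeoff, Unknown]
```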
My Harness
Everything except tradeoffs and unknowns belongs in the AI. Those two points belong to humans — by definition.
My harness enforces exactly two constraints:
No speculation. When the AI encounters an unknown, it does not infer, guess, or fill the gap. It surfaces the unknown to the human who owns the decision. This forces the escalation path to activate rather than allowing silent fabrication.
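A minimal sketch of the constraint in practice, assuming a hypothetical task where a ticket arrives without a target version (the `Unknown` type mirrors the earlier sketch, redefined here so the snippet stands alone):

```python
from dataclasses import dataclass

@dataclass
class Unknown:
    """What is missing, and which human role owns the decision."""
    missing: str
    owner: str

def draft_release_notes(ticket: dict):
    # Hypothetical task: the ticket never states the target version.
    # The no-speculation rule forbids inferring it from recent releases.
    if "target_version" not in ticket:
        return Unknown(missing="target_version", owner="release_manager")
    return f"Release notes for {ticket['target_version']}: ..."
```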
Separate the executor from the checker. The AI that performs a task does not verify its own output. A separate agent — with a different role, different context, different prompt — checks the work. This is not redundancy. It is the same principle behind code review, audit functions, and quality control in any mature organization. A single agent checking its own work is equivalent to a developer reviewing their own pull request the moment after writing it.
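A sketch of that separation, with `call_model` standing in for whichever model client you actually use; the prompts are illustrative, not prescribed wording:

```python
def call_model(system_prompt: str, user_input: str) -> str:
    ...  # placeholder for whichever model client you actually use

EXECUTOR_PROMPT = (
    "Structure the input according to the stated purpose. "
    "Do not evaluate your own output."
)
CHECKER_PROMPT = (
    "You did not produce this output. Verify it against the requirements "
    "and list every discrepancy you find."
)

def execute_and_check(task: str) -> tuple[str, str]:
    # Different role, different context, different prompt: the checker never
    # sees the executor's reasoning, only the requirements and the work product.
    output = call_model(EXECUTOR_PROMPT, task)
    review = call_model(
        CHECKER_PROMPT,
        f"Requirements:\n{task}\n\nOutput to verify:\n{output}",
    )
    return output, review
```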
These two constraints did not come from observing AI failures and patching them. They came from asking what an AI organization needs to look like, given what AI is by definition.
The harness is not a cage built around an unpredictable system. It is an org chart built around a well-defined one.
The Design Sequence
Most teams build in this order:
- Deploy the agent
- Observe failures
- Add guardrails to prevent recurrence
This embeds the failure mode into the design. Each guardrail is a patch over an undefined boundary.
The sequence should be:
- Define what the AI is (information structuring machine, externally purposed)
- Define what it cannot do (resolve tradeoffs, fill unknowns)
- Design the escalation path for those two cases
- Deploy the agent within that structure
The intelligence layer comes after the organizational layer. Not before.
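In code form, the ordering might look like this; every name below is illustrative and `deploy_agent` is a placeholder for your own wiring:

```python
def deploy_agent(purpose, escalation_owners):
    ...  # placeholder: wire the model, tools, and checker inside this structure

def build_ai_organization():
    # 1. Define what the AI is: an information-structuring machine whose
    #    purpose is supplied from outside.
    purpose = "summarize incident reports for the weekly on-call review"

    # 2. Define what it cannot do: resolve tradeoffs or fill unknowns.
    # 3. Design the escalation path: each reserved case has a named human
    #    owner before any agent runs.
    escalation_owners = {"tradeoff": "product_lead", "unknown": "domain_expert"}

    # 4. Only then deploy the agent within that structure.
    return deploy_agent(purpose, escalation_owners)
```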
Conclusion
Harness engineering asks: how do we make AI agents reliable?
That is the right question with the wrong starting point.
AI agents become reliable when the organization around them is designed with the same rigor we apply to software systems and human teams — not when we add enough guardrails to constrain an undefined thing into acceptable behavior.
You do not put guardrails on a car to prevent it from flying. The definition already draws that boundary.
Design the organization first. The harness follows from that.
The organizational structure described in this article — explicit role boundaries, judgment delegation, and cross-reference traceability between work units — is implemented in XRefKit: