THE DISCIPLINE

Defensive engineering.

The deliberate practice of building generative systems with built-in, antifragile resistance to their own characteristic failure modes at scale. Not robust, which only survives the stress, but antifragile, which improves with every failure it catches. The resistance comes from checks whose rules you can read and whose results you can reproduce; you can do neither with a model grading a model.

Why it is necessary

Generation is an offensive process: fluent, cheap, and optimized for local plausibility. At scale it produces onslop: adequate, invisible, structurally correct but low-signal output. Human review cannot keep up with the volume. Internal LLM judges inherit the same continuation biases and blind spots. The check has to come from outside the loop: external, partially deterministic, visually auditable, and compounding.

This is not a temporary problem of bad prompts. It is the steady state of cheap generation without external constraint. The same forces that make output abundant make high-fidelity review scarce. Defensive engineering accepts the asymmetry and designs the countermeasures accordingly.

This holds even for the evidence behind the thesis itself. Every named statistic below was traced to its primary source before it went on this page, because a claim you cannot replay is not one you can ship. That is the same discipline the checks enforce.

The problem is real, and measured

Not our numbers. Named reports, each one traced to its primary source.

21% of a new user's feed

On a fresh YouTube account, 104 of the first 500 recommended videos were AI-generated, fed automatically with no human in the loop.

Kapwing AI Slop Report →

3,006 AI news-farm sites

Tracked one by one by an independent watchdog, up from 49 in mid-2023, across 16 languages. Human fact-checking cannot keep pace.

NewsGuard AI Tracking Center →

~146,900 fake citations

Hallucinated references found across arXiv, bioRxiv, SSRN and PubMed Central for 2025 alone, in an audit of 111 million citations. Fabrication is slipping past peer review.

Audit, via Phys.org →

$9M a year, per 10,000 staff

Lost to workslop: plausible, low-value AI output that costs its recipients nearly two hours per instance to undo. 40% of the workers surveyed received some in the last month.

HBR, BetterUp + Stanford →

And the research is unambiguous: a model cannot check its own work

Peer-reviewed. Every arXiv ID below was fetched and confirmed.

Models favor their own output

LLM judges recognize their own generations and score them higher than human raters do. The generator-as-judge has a built-in blind spot, which is exactly why the check has to be external.

Panickssery et al., NeurIPS 2024 →

They cannot self-correct reasoning

Without external feedback, intrinsic self-correction fails and often degrades the answer. Self-review runs the same engine with the same errors.

Huang et al., ICLR 2024 →

Hallucination is inevitable

A formal result: an LLM cannot learn every computable function, so it will inevitably hallucinate as a general solver. A limit, not a bug to be prompted away.

Xu et al., 2024 →

External verifiers win

A separate verifier scales better with data than fine-tuning the generator to self-improve, and lifts accuracy well past what self-review reaches.

Cobbe et al., GSM8K →

Core principles

1. External vantage first

The generator, whether a model or an agent, cannot reliably police its own global properties. The defense has to sit outside the continuation loop.

2. Deterministic where it can be had

Rules, clustering, provenance, finite scores, replayable verdicts wherever possible. Honest containment (bounded model layers plus escalation gates) everywhere else.

3. Visual or cryptographic provenance

A human or auditor must be able to re-see the source, or replay the exact path from input to verdict. No trusting the generated summary alone.

4. Antifragile: the retained memory

Per-tenant memory will hold the codebase rules you add for the failure modes that recur for you, so once a rule is in, the same failure is caught on every later run. This memory layer is in build.

5. Recursion: defend the defense

The resistance layer itself must resist drift, overclaim, and capture. Meta-checks, honest status, public rules, no infinite model-judge loops.

6. One machine, many modalities

The same loop and memory wrap different specialized resistance mechanisms, one per kind of output. The architecture stays coherent while the checks specialize.
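
Principles 2 through 4 can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not doloop's implementation: `Rule`, `RuleMemory`, and `check` are hypothetical names, and a regular-expression rule stands in for whatever deterministic checks a real system would run. The point it shows is that the same input and the same retained rules always produce the same verdict record, identified by a SHA-256 digest that an auditor can recompute.

```python
import hashlib
import json
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    rule_id: str   # hypothetical identifier, chosen by whoever adds the rule
    pattern: str   # regular expression the output must NOT match


@dataclass
class RuleMemory:
    """Hypothetical per-tenant store: rules accumulate and are never dropped."""
    rules: list = field(default_factory=list)

    def add(self, rule: Rule) -> None:
        # Ignore a rule whose id is already retained.
        if all(r.rule_id != rule.rule_id for r in self.rules):
            self.rules.append(rule)


def check(text: str, memory: RuleMemory) -> dict:
    """Run every retained rule and return a replayable verdict record."""
    hits = sorted(r.rule_id for r in memory.rules if re.search(r.pattern, text))
    record = {
        "input_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "rules": sorted(r.rule_id for r in memory.rules),
        "hits": hits,
        "passed": not hits,
    }
    # Digest over canonical JSON: same input and same rules give the same id.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["verdict_sha256"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return record


# Usage: a failure seen once becomes a rule, and is caught on every later run.
memory = RuleMemory()
memory.add(Rule("no-leftover-todo", r"\bTODO\b"))
first = check("Totals reconciled. TODO: confirm Q3.", memory)
replay = check("Totals reconciled. TODO: confirm Q3.", memory)
clean = check("Totals reconciled.", memory)
```

Here `first` and `replay` are identical records, which is what makes the verdict auditable. Nothing in the sketch calls a model, so there is no judge loop to drift.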

Concrete examples

Extraction (WYSIWYD)

Word clustering into rows and columns, a visual grid overlaid on the source PDF, retained templates that turn recurring layouts into free, byte-identical, higher-certainty extractions. Saved templates ship today; the broader per-tenant memory is in build.
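
One way to picture the row-and-column clustering is the sketch below. It is an assumption-laden illustration rather than the WYSIWYD algorithm: the `Word` type, the tolerance values, and the choice to cluster on baseline and left edge are all hypothetical. It groups words into rows by vertical position, derives column anchors from left edges, and fills a grid, with no model involved and the same output for the same input.

```python
from bisect import bisect_right
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    x: float  # left edge, in page units (assumed coordinate system)
    y: float  # baseline, in page units


def cluster_rows(words, y_tol=2.0):
    """Group words whose baseline is within y_tol of the row's first word."""
    rows = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        if rows and abs(w.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    return [sorted(row, key=lambda w: w.x) for row in rows]


def column_starts(words, x_tol=5.0):
    """Cluster left edges; each cluster's smallest x becomes a column anchor."""
    starts = []
    for x in sorted(w.x for w in words):
        if not starts or x - starts[-1] > x_tol:
            starts.append(x)
    return starts


def to_grid(words, y_tol=2.0, x_tol=5.0):
    """Place each word in the cell given by its row and nearest anchor to the left."""
    starts = column_starts(words, x_tol)
    grid = []
    for row in cluster_rows(words, y_tol):
        cells = [""] * len(starts)
        for w in row:
            col = bisect_right(starts, w.x) - 1
            cells[col] = (cells[col] + " " + w.text).strip()
        grid.append(cells)
    return grid


# Usage with slightly jittered coordinates, as a PDF text layer often has.
words = [
    Word("Item", 10.0, 100.0), Word("Qty", 80.0, 100.5),
    Word("Bolt", 10.4, 112.0), Word("4", 81.0, 111.6),
]
grid = to_grid(words)  # [['Item', 'Qty'], ['Bolt', '4']]
```

Because every cell traces back to word coordinates, the same anchors can be drawn over the source page as the visual grid the paragraph above describes.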

Writing

Mechanistic detection of the tics of text that manages itself (self-pointing, enumeration, reader-gaze, move-announcement, diction-grading), cadence and voice rules, an optional cheap reader for new jargon, plus the loop that hands findings back until the text holds up.
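
A mechanistic detector of this kind can be as plain as a table of patterns. The sketch below is illustrative only: the category names come from the list above, but the specific regular expressions and the `find_tics` function are assumptions invented for the example, not the rules doloop ships. Each finding carries its character offset, so a reader can go straight to the span in the source text.

```python
import re

# Hypothetical patterns for three of the categories named above.
TIC_PATTERNS = {
    "self-pointing": re.compile(r"\bthis (?:article|section|post|essay)\b", re.I),
    "move-announcement": re.compile(r"\b(?:let's dive in|in what follows|we will now)\b", re.I),
    "reader-gaze": re.compile(r"\b(?:as you can see|you might be wondering)\b", re.I),
}


def find_tics(text):
    """Return (offset, category, matched text) tuples, ordered by offset."""
    findings = []
    for category, pattern in TIC_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((match.start(), category, match.group(0)))
    return sorted(findings)


# Usage: the findings are what gets handed back to the writer or generator.
sample = "In this section, we will now explain. As you can see, it works."
findings = find_tics(sample)
```

The loop described above would rerun `find_tics` on each revision and stop when the list comes back empty.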

Conversations

Inter-actio mode classification, the three failure modes (Mirror / Wall / Line), shape detection (rescue spirals, self-loops, narrowing), all mechanistic and external to the dialogue itself.

Presentations (Inkwell)

Substance first (real data, right form, finding visible in five seconds, title matches the data), then a finite Tufte-style style score, with a hard human-escalation gate after repeated style-only rejections. A bounded model layer with explicit limits.
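
The gating order described above can be sketched as a small decision function. The threshold, the rejection limit, the 0 to 10 scale, and the names `Review` and `decide` are all assumptions made for illustration; the source states only that substance comes first, the style score is finite, and repeated style-only rejections force escalation to a human.

```python
from dataclasses import dataclass

STYLE_PASS_SCORE = 8            # assumed threshold on an assumed 0..10 scale
MAX_STYLE_ONLY_REJECTIONS = 2   # assumed limit before a human must decide


@dataclass(frozen=True)
class Review:
    substance_ok: bool  # real data, right form, finding visible, title matches
    style_score: int    # finite score, 0..10 in this sketch

    def __post_init__(self):
        if not 0 <= self.style_score <= 10:
            raise ValueError("style_score must be between 0 and 10")


def decide(history):
    """Return 'accept', 'revise', or 'escalate' for the latest review."""
    if not history:
        raise ValueError("history must contain at least one review")
    streak = 0  # consecutive rejections where only style failed
    verdict = "revise"
    for review in history:
        if not review.substance_ok:
            # Substance failures are checked first and reset the style streak.
            streak = 0
            verdict = "revise"
        elif review.style_score >= STYLE_PASS_SCORE:
            streak = 0
            verdict = "accept"
        else:
            streak += 1
            verdict = "escalate" if streak >= MAX_STYLE_ONLY_REJECTIONS else "revise"
    return verdict


# Usage: two style-only rejections in a row stop the loop and call a human.
outcome = decide([Review(True, 5), Review(True, 6)])  # 'escalate'
```

The hard limit is what keeps the model layer bounded: the loop cannot keep rejecting on taste alone without a person seeing it.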

The mark as visual theorem

The doloop mark performs the idea. Binary primitives (ring = 0, stick = 1) make the name legible as code. The single embedded infinity, the only one in the mark, sits in the central "00" so the name literally contains the loop. Mirror the middle and 010100 resolves to 10, the same numeral in base 10 and binary. The design is a compact statement that generation needs external, self-referential resistance.

See the mark explorations →

Adopting it

Start with one high-pain modality and instrument it for replay and visual audit. Build the memory layer early; even a thin retained memory compounds faster than you expect. Treat the defensive layer as first-class infrastructure, not a compliance afterthought. Be explicit about what is byte-deterministic and what is bounded, and publish the boundaries. Run the same lens on your own prompts, docs, decks, and agent traces, because the blindness problem is recursive.

Teams that internalize this stop asking how to make the model better at reviewing itself, and start asking what external resistance to run the model against.

doloop is one concrete, multi-modal implementation of this discipline. The same loop and memory wrap different specialized checks. Not another set of checks, but infrastructure for an emerging engineering practice.
