THE AI CODE BAROMETER · LIVE

What code fails on,
and what it keeps. Measured.

We run the deterministic code donkey across the most-used repositories on GitHub and count what fails. Same code in, same verdict out. It also measures what code does consistently: the conventions almost everyone keeps. AI-written code is held to the same standard. Trained on this same public corpus, it reproduces the same failures at speed. The numbers below are live and grow with every scan.

What the donkey flags in public code

The mistakes the code donkey finds, ranked by how often each turns up. A common one, such as a hardcoded secret, is a habit worth breaking; a rarer one, such as eval or a shell command run on an unchecked value, is the kind that turns into a remote-code-execution bug. Every count comes from a named rule firing on a cited line, so anyone can reproduce it.
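
As a rough illustration of what "a named rule on a cited line" can look like, here is a minimal sketch for Python source built on the standard ast module. It is not the donkey's implementation: the rule names (hardcoded-secret, eval-on-value, shell-on-value), the secret-name pattern, and the scan function are all assumptions made up for this example.

```python
import ast
import re

# Hypothetical pattern for "secret-looking" variable names (an assumption).
SECRET_NAME = re.compile(r"(secret|password|passwd|token|api_?key)", re.IGNORECASE)


def scan(source: str) -> list[tuple[str, int]]:
    """Return (rule, line) findings for one Python source string."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            # Rule: a non-empty string literal assigned to a secret-looking name.
            value = node.value
            if isinstance(value, ast.Constant) and isinstance(value.value, str) and value.value:
                for target in node.targets:
                    if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                        findings.append(("hardcoded-secret", node.lineno))
        elif isinstance(node, ast.Call):
            # True when the first positional argument is not a literal.
            on_value = bool(node.args) and not isinstance(node.args[0], ast.Constant)
            # Rule: eval() called on a non-literal argument.
            if isinstance(node.func, ast.Name) and node.func.id == "eval" and on_value:
                findings.append(("eval-on-value", node.lineno))
            # Rule: shell=True together with a non-literal command.
            for kw in node.keywords:
                is_true = isinstance(kw.value, ast.Constant) and kw.value.value is True
                if kw.arg == "shell" and is_true and on_value:
                    findings.append(("shell-on-value", node.lineno))
    # Sort by line, then rule name, so the same input always gives the same list.
    return sorted(findings, key=lambda f: (f[1], f[0]))


SAMPLE = '''API_KEY = "abc123"
import subprocess
def run(cmd, expr):
    subprocess.run(cmd, shell=True)
    return eval(expr)
'''

for rule, line in scan(SAMPLE):
    print(f"line {line}: {rule}")
```

Because the checks are purely syntactic, running the sketch twice on the same file gives the same findings, which is the property the counts rely on.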

How often code keeps its own conventions

The same scan measures three basics across every file: when code catches an error, does it act on it or swallow it? When it opens a file or connection, does it close it? When it logs, does it use a real logger or a raw print? The rate is the signal: a convention kept almost always is a near-law, and the few breaks stand out; one kept half the time is a coin flip the gate can only warn on.

Measured across the live corpus, growing as we scan more.

Open-and-close is counted only where acquisition and release share a function; some languages keep that convention at object lifetime instead, so those sites fall outside this count.

The same convention, opposite in two houses

A convention that holds almost everywhere is a law the gate can block on. A convention that splits house by house is the opposite, and the right verdict follows the house. Typing is the clearest case: some projects type nearly every signature, others type almost none, and each is settled, just in opposite directions. A single global rule would be wrong in half of them. The amber cells fall below the threshold at which either direction counts as settled, and there the gate stays silent.

Six conventions across widely used Python projects, each the project's own rate, measured deterministically. Green holds the convention, slate holds the opposite. Silence shows two ways: amber where sites are enough but no direction is settled, and a dot where too few sites exist to read at all.
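
The reading above can be sketched as a small decision function. The thresholds are placeholders: the page does not state how many sites are "enough" or what rate counts as "settled", so MIN_SITES and SETTLED_PCT below are assumptions, as are the names Cell, read_cell, and verdict.

```python
from enum import Enum


class Cell(Enum):
    HOLDS = "green"       # the project keeps the convention
    OPPOSITE = "slate"    # the project is settled on the opposite
    UNSETTLED = "amber"   # enough sites, no settled direction
    TOO_FEW = "dot"       # too few sites to read at all


MIN_SITES = 20     # assumed minimum number of sites
SETTLED_PCT = 90   # assumed percentage at which a direction is settled


def read_cell(kept: int, sites: int) -> Cell:
    """Classify one project's rate for one convention."""
    if sites < MIN_SITES:
        return Cell.TOO_FEW
    # Integer arithmetic keeps the comparison exact and repeatable.
    if kept * 100 >= SETTLED_PCT * sites:
        return Cell.HOLDS
    if kept * 100 <= (100 - SETTLED_PCT) * sites:
        return Cell.OPPOSITE
    return Cell.UNSETTLED


def verdict(cell: Cell, site_keeps_convention: bool) -> str:
    """Flag a site only when it goes against its own project's settled habit."""
    if cell is Cell.HOLDS and not site_keeps_convention:
        return "flag"
    if cell is Cell.OPPOSITE and site_keeps_convention:
        return "flag"
    return "silent"


# Two projects settled in opposite directions on typed signatures.
typed_house = read_cell(kept=97, sites=100)
untyped_house = read_cell(kept=3, sites=100)
print(typed_house.value, verdict(typed_house, site_keeps_convention=False))
print(untyped_house.value, verdict(untyped_house, site_keeps_convention=False))
```

An untyped signature is flagged in the first project and passes silently in the second, which is why a single global rule would be wrong in half of them.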

How to read it

Every count traces to a named rule in the open canon, and every finding cites the rule and the line, so the number is reproducible, not an estimate. Run the same donkey on your own code at code.doloop.io, or gate it in CI so the bug never reaches the barometer in the first place.