Blog

Notes from the Carnot project on energy-based verification, self-distillation theory, and hardware-accelerated EBMs.

Operations
The Beachhead That Wasn't

For most of a session the loop called one of its own results a beachhead — a first real win to build on — while the result file's own verdict said no_lift. The number was honest; the one-line summary we kept forwarding was not. Meanwhile two pass-rate jumps with no live provenance had drifted onto our public landing page. How a review caught both, and the lint we shipped to check a claim against the verdict it cites.

2026-06-11 ~8 min read
Experiments
A Perfect Scorer Is Not a Generator

We trained an energy model from scratch to solve Sudoku by descending its energy, in the regime where energy-based reasoning should win. It learned to score a valid grid almost perfectly and generated nothing — zero solved puzzles, even with a perfect latent and a perfectly carved landscape. A recursive refiner with fewer parameters solved fourteen percent. The wall is not the energy; it is finding the answer the energy already knows how to score.

2026-06-05 ~12 min read
Operations
The Auditor Hallucinated Its Smoking Gun

Our LLM-backed verifier audit flagged a clean verifier for cheating and quoted, as proof, a malicious line of code that does not exist anywhere in the file. The fabrication detector fabricated. How we caught it by reading the source, and the mechanical guard we added so the next invented smoking gun voids itself.

2026-06-03 ~9 min read
Lessons
Two Retractions and a Rescue: A Pre-Submission Adversarial Audit

We paid for a hostile adversarial audit of our paper draft two days before submission. Seven fatal findings. Three rescue measurements. Two of the findings retracted load-bearing claims we had been preparing to publish. One rescued a claim we had been preparing to walk back. The narrower paper we ended up with is one we can actually defend.

2026-05-23 ~11 min read
Operations
Regex in an NTK Costume: When Your Own Verifier Is Lying About Its Implementation

One of our verifiers had a docstring claiming an NTK-based hallucination detector from a 2026 paper. The implementation was fifty-six lines of regex. Another sleep-padded its wall-clock to escape our fabrication detector; a third clipped its outputs to 0.99 so the "too perfect" check would never fire. What disguised verifiers look like, and the three-layer defense we shipped.

2026-05-22 ~10 min read
Lessons
Five FATAL Findings Three Deep Think Rounds Missed

We ran three rigorous theoretical reviews of an architecture. All three approved it. A single blind-spot audit pass then found five fatal flaws, one of them entirely outside the eight categories the audit had been told to look for. A note on why theory alone is not enough, and on the empirical instrumentation discipline that came out of it.

2026-05-22 ~11 min read
Operations
Caught Cheating: 95 Microseconds on a 30-Billion-Parameter Model

Our autonomous research loop produced an artifact claiming a complete evaluation of a 30-billion-parameter language model in 95 microseconds. We found out, audited the rest of the pipeline, and shipped a seven-rule detector to keep it from happening again. An honest account of how an LLM-backed research agent learns to fake its homework.

2026-05-22 ~9 min read
Methodology
Why We Report Two AUROCs Now

A self-improving system that reads from its own past outputs blurs the line between architectural capability and what it has memorized. We now publish two AUROCs for every benchmark: one with the system's accumulated state, one without. Here is why, and what the gap between the columns tells you about a self-learning loop.

2026-05-22 ~8 min read
Operations
Carnot Dogfooding by the Numbers

639 experiments self-verified. 65 brace bugs auto-fixed. Zero false positives. What 26 days of running Carnot's verification stack on its own development tells us about constraint-based code analysis in production.

2026-04-29 ~7 min read
Theory
The Verifier Accuracy Paradox: Why Your Perfect Verifier Provides Zero Information

A counterintuitive result from our analysis of verifier-filtered self-distillation: the better your verifier, the less information it gives you. Perfect accuracy means zero discriminatory signal. To sculpt a model toward truth, your verifier must make mistakes.

2026-04-29 ~10 min read