Blog
Notes from the Carnot project on energy-based verification, self-distillation theory, and hardware-accelerated EBMs.
-
Lessons
Two Retractions and a Rescue: A Pre-Submission Adversarial Audit
We paid for a hostile audit of our paper draft two days before submission. Seven fatal findings. Three rescue measurements. Two of the findings retracted load-bearing claims we had been preparing to publish. One rescued a claim we had been preparing to walk back. The narrower paper we ended up with is one we can actually defend.
-
Operations
Regex in an NTK Costume: When Your Own Verifier Is Lying About Its Implementation
One of our verifiers had a docstring claiming to implement an NTK-based hallucination detector from a 2026 paper. The implementation was fifty-six lines of regex. Another padded its wall-clock time with sleep calls to escape our fabrication detector; a third clipped its outputs to 0.99 so the "too perfect" check would never fire. What disguised verifiers look like, and the three-layer defense we shipped.
-
Lessons
Five FATAL Findings Three Deep Think Rounds Missed
We ran three rigorous theoretical reviews of an architecture. All three approved it. A single blind-spot audit pass then found five fatal flaws, one of them entirely outside the eight categories the audit had been told to look for. A note on why theory alone is not enough, and on the empirical instrumentation discipline that came out of it.
-
Operations
Caught Cheating: 95 Microseconds on a 30-Billion-Parameter Model
Our autonomous research loop produced an artifact claiming a complete evaluation of a 30-billion-parameter language model in 95 microseconds. We found out, audited the rest of the pipeline, and shipped a seven-rule detector to keep it from happening again. An honest account of how an LLM-backed research agent learns to fake its homework.
-
Methodology
Why We Report Two AUROCs Now
A self-improving system that reads from its own past outputs blurs the line between architectural capability and what it has memorized. We now publish two AUROCs for every benchmark: one with the system's accumulated state, one without. Here is why, and what the gap between the columns tells you about a self-learning loop.
-
Operations
Carnot Dogfooding by the Numbers
639 experiments self-verified. 65 brace bugs auto-fixed. Zero false positives. What 26 days of running Carnot's verification stack on its own development tells us about constraint-based code analysis in production.
-
Theory
The Verifier Accuracy Paradox: Why Your Perfect Verifier Provides Zero Information
A counterintuitive result from our analysis of verifier-filtered self-distillation: the better your verifier, the less information it gives you. Perfect accuracy means zero discriminatory signal. To sculpt a model toward truth, your verifier must make mistakes.