Signal Over Noise · AI & Machine Learning
The machine that learns — rendered as the flora it mirrors: branching, adaptive, alive.
An anonymous repository commit surfaced internal evaluation scores suggesting GPT-5 achieves 94.2 on MMLU-Pro and 89.7 on HumanEval+. Within six hours, Anthropic, Mistral, and Cohere had each quietly updated their internal roadmaps. The implications for teams betting on current frontier capabilities are immediate.
Our analysis: the benchmark gap is real, but the deployment gap may be wider. We tracked the downstream effects across twelve open-source projects that cited GPT-4 capability ceilings as architectural constraints. Seven now have active PRs reconsidering those assumptions.
MoE architecture with 141B total params, 39B active — sets new SOTA on BigBench Hard.
Led by DCVC. Valuation at $1.1B. The pitch: LLMs need formal reasoning scaffolds, not bigger context windows.
Nature paper. Expands from proteins to full molecular interactions. Drug discovery timelines compress again.
The models that matter aren't the ones with the highest benchmark scores. They're the ones that ship.
Three things that changed the field while you slept.
The Digest lands before your first coffee. Not a thread. Not a newsletter that recaps a newsletter. A single, annotated briefing written by people who read the papers so you can make decisions, not just form opinions.
GPT-5 leaks · Mixtral 8×22B · AlphaFold 3 RNA

Annotated. Explained. Opinionated.
Every major model release comes with our annotated architecture breakdown — not just the abstract, but what the design choices actually imply for your stack. We mark the load-bearing decisions so you can skip to what matters.
Mixture-of-Experts · Sparse Attention · KV Cache Compression

One link. Fifteen replies. Two decisions made.
The Digest is the link your senior engineer drops in Slack before standup. It surfaces the funding rounds that shift your competitive landscape, the open-source drops that make your roadmap assumptions stale, and the research that becomes product in 18 months.
Shared by 3 teammates · 47 reactions · 2 decisions
When the week's noise settles, the signal remains.
Friday evening brings the Weekend Essay: a roughly 3,000-word piece on a single idea that matters. Last week: why sparse architectures will outlast the scaling hypothesis. This week: the geopolitics of GPU allocation and what it means for open-source.
Weekend Essay · 3,200 words · With references
The 141B parameter count is marketing. The number that matters is the 39B active per forward pass. We ran the cost math across four deployment configurations.
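To make the total-versus-active distinction concrete, here is a minimal back-of-envelope sketch. It is not the cost math referenced above: the precision options are illustrative assumptions, the `MoEModel` class and both helper functions are hypothetical names, and the "about 2 FLOPs per active parameter per token" figure is a common rule of thumb that ignores attention over long contexts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEModel:
    """Hypothetical container for the two headline numbers."""
    total_params: float   # every expert; all of it must be held in memory
    active_params: float  # parameters actually used for each token


def weight_memory_gb(model: MoEModel, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return model.total_params * bytes_per_param / 1e9


def forward_flops_per_token(model: MoEModel) -> float:
    """Rule-of-thumb forward-pass compute: ~2 FLOPs per active parameter."""
    return 2 * model.active_params


# Parameter counts taken from the text: 141B total, 39B active.
model = MoEModel(total_params=141e9, active_params=39e9)

# Illustrative precisions only (assumed, not the four configurations above).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        print(f"{name}: {weight_memory_gb(model, nbytes):.1f} GB of weights")
    print(f"compute: {forward_flops_per_token(model):.2e} FLOPs per token")
    print(f"active share: {model.active_params / model.total_params:.1%}")
```

Under these assumptions the two numbers pull in different directions: memory scales with the full 141B, while per-token compute scales with the 39B that actually runs.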
The protein folding problem was solved. The RNA-protein interaction problem just got significantly easier. Here's what this means for the 847 active drug discovery programs using computational methods.
They're not building another LLM. They're betting that the path to AGI runs through formal reasoning scaffolds, not scale. The investors writing those checks disagree with most of the field.
Meta's latest open release benchmarks within 3 points of GPT-4 on most tasks. The API-as-moat strategy just got harder to defend.
"I cancelled four other newsletters after my first week with Digest. It's the only one that treats me like I can handle nuance."
"We use Digest in our Monday morning VC calls as a shared baseline. It's replaced three separate information sources."
"The architecture annotations alone are worth the subscription. I've forwarded the Mistral deep-dive to my entire team twice."
41,000 ML engineers, founders, and investors read Digest before their first standup. Join them, or spend another morning triaging arXiv alone.
Most newsletters tell you what happened. Digest tells you what it means.