Formalized from the Dwarkesh Podcast interview (Feb 13, 2026).
https://www.youtube.com/watch?v=n1E9IZfvGMA
Eight frameworks extracted, interpreted, and expressed as mathematical models.
Methodology: Each model is derived from Dario's direct quotes and stated reasoning. Where he provides specific numbers (revenue figures, probability estimates, growth rates), those are used as calibration points. Where he describes qualitative dynamics ("log-linear", "diminishing returns", "Cournot"), the standard mathematical formalism is applied. These are interpretive models — they represent how a quant analyst would formalize the CEO's stated mental framework.
Log-Linear Scaling Law (The Core Thesis)
"All the cleverness, all the techniques... that doesn't matter very much. There are only a few things that matter."
Dario's foundational belief since 2017. AI capability is a function of a small set of scaling inputs, and performance improves log-linearly with each. This is the "Big Blob of Compute Hypothesis" — the same idea as Sutton's "Bitter Lesson." He explicitly names 7 factors and claims this holds for both pre-training AND RL.
C(t) = α · log(Compute) + β · log(|Data|) + γ · H(Data) + δ · log(T_train) + f(Objective) + ε
C(t) = capability at time t (e.g. benchmark score, task success rate)
Compute = raw FLOP budget
|Data| = quantity of training data (tokens)
H(Data) = Shannon entropy / breadth of the data distribution
T_train = training duration (steps × batch size)
f(Objective) = quality of the objective function (pre-training loss, RL reward signal)
ε = numerical stability / normalization term (laminar flow of gradients)
α, β, γ, δ = scaling exponents (empirically fit)
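To make the functional form concrete, here is a minimal Python sketch that evaluates the log-linear model and fits the coefficients α, β, γ, δ (plus ε) by ordinary least squares against observed capability scores. The function names, the fitting procedure, and any calibration data are illustrative assumptions, not details from the interview.

```python
import numpy as np

def capability(compute, n_tokens, data_entropy, train_steps,
               objective_quality, coefs):
    """Evaluate the log-linear capability model.

    coefs = (alpha, beta, gamma, delta, eps): scaling coefficients plus
    the stability/normalization term. Names are illustrative.
    """
    alpha, beta, gamma, delta, eps = coefs
    return (alpha * np.log(compute)
            + beta * np.log(n_tokens)
            + gamma * data_entropy        # H(Data): breadth of the distribution
            + delta * np.log(train_steps)
            + objective_quality           # f(Objective): treated as an additive input
            + eps)

def fit_exponents(observations):
    """Fit (alpha, beta, gamma, delta, eps) by least squares.

    observations: list of (compute, tokens, entropy, steps, f_obj, score)
    tuples, e.g. hypothetical calibration points from training runs.
    """
    rows, targets = [], []
    for c, n, h, s, f_obj, score in observations:
        rows.append([np.log(c), np.log(n), h, np.log(s), 1.0])
        targets.append(score - f_obj)  # move the known f(Objective) term to the target
    coefs, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return tuple(coefs)
```

Because the model is linear in the logged inputs, fitting reduces to a standard linear regression; in the equivalent power-law form C ∝ Compute^α · |Data|^β · ..., the same coefficients appear as exponents.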