Formalized from the Dwarkesh Podcast interview (Feb 13, 2026).
https://www.youtube.com/watch?v=n1E9IZfvGMA
Eight frameworks extracted, interpreted, and expressed as mathematical models.
Methodology: Each model is derived from Dario's direct quotes and stated reasoning. Where he provides specific numbers (revenue figures, probability estimates, growth rates), those are used as calibration points. Where he describes qualitative dynamics ("log-linear", "diminishing returns", "Cournot"), the standard mathematical formalism is applied. These are interpretive models — they represent how a quant analyst would formalize the CEO's stated mental framework.
Log-Linear Scaling Law (The Core Thesis)
"All the cleverness, all the techniques... that doesn't matter very much. There are only a few things that matter."
Dario's foundational belief since 2017. AI capability is a function of a small set of scaling inputs, and performance improves log-linearly with each. This is the "Big Blob of Compute Hypothesis" — the same idea as Sutton's "Bitter Lesson." He explicitly names 7 factors and claims this holds for both pre-training AND RL.
C(t) = α · log(Compute) + β · log(|Data|) + γ · H(Data) + δ · log(T_train) + f(Objective) + ε
C(t) = capability at time t (e.g. benchmark score, task success rate)
Compute = raw FLOP budget
|Data| = quantity of training data (tokens)
H(Data) = Shannon entropy / breadth of the data distribution
T_train = training duration (steps × batch size)
f(Objective) = quality of the objective function (pre-training loss, RL reward signal)
ε = numerical stability / normalization term (laminar flow of gradients)
α, β, γ, δ = scaling exponents (empirically fit)
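To make the functional form concrete, here is a minimal Python sketch that evaluates the log-linear model and fits the coefficients α, β, γ, δ (plus ε) by ordinary least squares against observed capability scores. The function names, the fitting procedure, and any calibration data are illustrative assumptions, not details from the interview.

```python
import numpy as np

def capability(compute, n_tokens, data_entropy, train_steps,
               objective_quality, coefs):
    """Evaluate the log-linear capability model.

    coefs = (alpha, beta, gamma, delta, eps): scaling coefficients plus
    the stability/normalization term. Names are illustrative.
    """
    alpha, beta, gamma, delta, eps = coefs
    return (alpha * np.log(compute)
            + beta * np.log(n_tokens)
            + gamma * data_entropy        # H(Data): breadth of the distribution
            + delta * np.log(train_steps)
            + objective_quality           # f(Objective): treated as an additive input
            + eps)

def fit_exponents(observations):
    """Fit (alpha, beta, gamma, delta, eps) by least squares.

    observations: list of (compute, tokens, entropy, steps, f_obj, score)
    tuples, e.g. hypothetical calibration points from training runs.
    """
    rows, targets = [], []
    for c, n, h, s, f_obj, score in observations:
        rows.append([np.log(c), np.log(n), h, np.log(s), 1.0])
        targets.append(score - f_obj)  # move the known f(Objective) term to the target
    coefs, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return tuple(coefs)
```

Because the model is linear in the logged inputs, fitting reduces to a standard linear regression; in the equivalent power-law form C ∝ Compute^α · |Data|^β · ..., the same coefficients appear as exponents.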