A curated knowledge base covering the mathematical foundations of causal reasoning, experiment design, reinforcement learning, evolutionary methods, and online learning.
Curated corpus
A General Approach to Causal Mediation Analysis
Kosuke Imai, Luke Keele, Dustin Tingley · 2010 · Psychological Methods · 3,587 citations
Treatment Effect Heterogeneity · Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
Stefan Wager, Susan Athey · 2017 · Journal of the American Statistical Association · 2,737 citations
Causal Inference · Causal inference in statistics: An overview
Judea Pearl · 2009 · Statistics Surveys · 2,309 citations
Treatment Effect Heterogeneity · Metalearners for Estimating Heterogeneous Treatment Effects using Machine Learning
Soren R. Kunzel, Jasjeet S. Sekhon, Peter J. Bickel +1 more · 2019 · Proceedings of the National Academy of Sciences · 1,243 citations
Causal Inference · On Causal Inference in the Presence of Interference
Eric J. Tchetgen Tchetgen, Tyler J. VanderWeele · 2012 · Statistical Methods in Medical Research · 500 citations
Experiment Design · Online controlled experiments at large scale
Ron Kohavi, Alex Deng, Brian Frasca +3 more · 2013 · Knowledge Discovery and Data Mining · 424 citations
Experiment Design · Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology
Nicholas Larsen, Alex Deng, Jiheng Zhang +2 more · 2024 · The American Statistician · 70 citations
Sequential Decisions · Designing Optimal Dynamic Treatment Regimes: A Causal Reinforcement Learning Approach
Junzhe Zhang · 2020 · International Conference on Machine Learning · 16 citations
Experiment Design · Controlled experiments on the web: survey and practical guide
Ron Kohavi, Roger Longbotham, Dan Sommerfield +1 more · 2009 · Data Mining and Knowledge Discovery
Causal Inference · Causal Inference: What If
Miguel A. Hernan, James M. Robins · 2020 · Chapman & Hall/CRC
Causal Estimation · Double/Debiased Machine Learning for Treatment and Causal Parameters
Victor Chernozhukov, Denis Chetverikov, Mert Demirer +4 more · 2016
Causal Estimation · Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
Stefan Wager, Susan Athey · 2015
Research areas
We study the conditions under which causal quantities are identifiable from observational and interventional data. Our work extends the do-calculus (Pearl, 2000) to settings with partial compliance, unmeasured confounders, and time-varying treatments. We are particularly interested in the gap between identification theory — which establishes what is estimable in principle — and efficient estimation in finite samples using doubly-robust and semiparametric methods.
Classical optimal design theory (Kiefer & Wolfowitz, 1960; Wald, 1947) characterizes designs that minimize the variance of estimators under fixed sample budgets. We extend this framework to sequential and adaptive settings where the allocation of experimental conditions can respond to accumulating data. Questions of interest include: optimal stopping rules, the tradeoff between exploration and exploitation in multi-armed designs, and the design of experiments robust to model misspecification.
We investigate sequential decision problems where an agent must act under uncertainty about both the environment state and the causal structure of outcomes. This spans Bayesian bandit algorithms (Thompson sampling, information-directed sampling), model-based reinforcement learning in partially observed environments, and offline RL from fixed datasets — where distributional shift between behavior and evaluation policies creates fundamental challenges for policy evaluation and improvement.
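As a concrete illustration of the Bayesian bandit algorithms mentioned above, the following is a minimal sketch of Thompson sampling for a Bernoulli bandit with independent Beta(1, 1) priors. The function name thompson_sampling, the arm means, and the horizon are hypothetical choices made for this example only; it is a sketch under those assumptions, not a reference implementation.

```python
import random


def thompson_sampling(true_means, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns per-arm pull counts and total reward."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # number of reward-1 observations per arm
    failures = [0] * k   # number of reward-0 observations per arm
    total_reward = 0
    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta(1 + successes, 1 + failures) posterior.
        samples = [rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=lambda a: samples[a])
        # Simulated environment (assumption): Bernoulli reward with the arm's true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    pulls = [successes[a] + failures[a] for a in range(k)]
    return pulls, total_reward


# Hypothetical three-armed problem; the last arm is the best one.
pulls, total_reward = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000)
```

Sampling from the posterior, rather than acting on the posterior mean, is what produces exploration: arms with few observations have wide posteriors and are occasionally sampled high enough to be played.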
When treatment effects vary across individuals or contexts, population-average estimates are insufficient for decision-making. We study nonparametric and semiparametric methods for estimating conditional average treatment effects (CATEs), including meta-learner frameworks (T-, S-, X-, R-learners), honest causal forests, and doubly-robust AIPW estimators. A central concern is valid, assumption-light confidence intervals for effect heterogeneity in moderate-dimensional covariate spaces.
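To make the doubly-robust AIPW construction concrete, here is a minimal sketch on simulated data with a single binary covariate. Everything in it is an assumption chosen for illustration: the data-generating process, the known propensity scores, and the use of cell means as outcome models. It also skips the cross-fitting that a careful analysis would normally use.

```python
import random
from statistics import mean


def simulate(n, seed=0):
    """Simulated data (assumption): binary covariate x, known propensity e(x)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        e = 0.5 if x == 0 else 0.7   # known propensity P(T = 1 | x)
        t = 1 if rng.random() < e else 0
        tau = 1.0 + x                # true CATE: 1 when x = 0, 2 when x = 1
        y = 1.0 + 2.0 * x + tau * t + rng.gauss(0.0, 1.0)
        rows.append((x, t, y, e))
    return rows


def aipw_scores(rows):
    """Return (x, score) pairs, where score is the AIPW pseudo-outcome."""
    # Outcome models mu_t(x): sample means within each (x, t) cell.
    mu = {}
    for x in (0, 1):
        for t in (0, 1):
            mu[(x, t)] = mean(y for (xi, ti, y, _) in rows if xi == x and ti == t)
    scores = []
    for x, t, y, e in rows:
        m1, m0 = mu[(x, 1)], mu[(x, 0)]
        # Model-based contrast plus inverse-propensity-weighted residual corrections.
        score = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
        scores.append((x, score))
    return scores


rows = simulate(20000)
scores = aipw_scores(rows)
ate_hat = mean(s for _, s in scores)  # estimates the ATE (1.5 in this simulation)
cate_hat = {x: mean(s for xi, s in scores if xi == x) for x in (0, 1)}
```

Averaging the scores over the whole sample estimates the average treatment effect, while averaging within a covariate stratum estimates the CATE for that stratum. With continuous covariates the stratum averages would be replaced by a regression of the scores on the covariates.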
Individual-level experiments suffer from low power; pooling across units risks masking heterogeneity. We study Bayesian hierarchical models and empirical Bayes procedures that borrow strength across experimental units without imposing homogeneity. This includes Stein-type shrinkage estimators, posterior predictive checks for exchangeability, and computationally tractable approximate inference methods for large-scale hierarchical structures.
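A minimal sketch of Stein-type shrinkage, assuming each unit's effect estimate has the same known sampling variance: the positive-part James-Stein estimator that pulls each estimate toward the grand mean. The function name james_stein and the example numbers are hypothetical.

```python
def james_stein(estimates, sigma2):
    """Positive-part James-Stein shrinkage toward the grand mean.

    Assumes each estimate is independent with the same known variance sigma2.
    """
    k = len(estimates)
    if k < 4:
        raise ValueError("shrinkage toward the grand mean needs at least 4 units")
    grand_mean = sum(estimates) / k
    spread = sum((y - grand_mean) ** 2 for y in estimates)
    if spread == 0:
        return list(estimates)
    # Shrinkage factor in [0, 1): closer to 0 when the units look alike relative to the noise.
    shrink = max(0.0, 1.0 - (k - 3) * sigma2 / spread)
    return [grand_mean + shrink * (y - grand_mean) for y in estimates]


# Hypothetical per-unit effect estimates with unit sampling variance.
shrunk = james_stein([0.0, 1.0, 2.0, 3.0, 4.0], sigma2=1.0)
# approximately [0.4, 1.2, 2.0, 2.8, 3.6] (shrinkage factor 0.8)
```

The amount of pooling is driven by the data: when the observed spread across units is large relative to the sampling variance, the factor approaches 1 and the unit-level estimates are left almost unchanged.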
Structure learning — recovering a directed acyclic graph (DAG) from observational or interventional data — is computationally hard in general and statistically challenging under latent confounding. We are interested in score-based and constraint-based discovery algorithms, identifiability conditions for linear non-Gaussian and nonlinear models, and the emerging intersection of causal structure learning with deep generative models.
When the objective is non-differentiable, noisy, or available only through a black-box simulator, gradient-based methods are often inapplicable or unreliable. Evolution strategies, particularly CMA-ES and natural evolution strategies, provide principled gradient-free alternatives, with convergence guarantees established for restricted problem classes. We are interested in the intersection of evolutionary methods with neural architecture search, quality-diversity algorithms that maintain behavioral repertoires, and population-based training as a hyperparameter optimization primitive.
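For intuition about gradient-free search, the sketch below implements a (1+1) evolution strategy with a one-fifth success rule for step-size control. This is a deliberately simpler relative of CMA-ES, not CMA-ES itself, since it adapts a single step size rather than a full covariance matrix. The objective l1_distance, its target point, and the adaptation constants are assumptions made for this example.

```python
import random


def one_plus_one_es(objective, x0, sigma=1.0, n_iters=2000, seed=0):
    """(1+1)-ES with a one-fifth success rule; minimizes objective without gradients."""
    rng = random.Random(seed)
    x, fx = list(x0), objective(x0)
    for _ in range(n_iters):
        # Propose a Gaussian perturbation of the current point.
        candidate = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        fc = objective(candidate)
        if fc <= fx:
            # Success: accept the candidate and enlarge the step size.
            x, fx = candidate, fc
            sigma *= 1.5
        else:
            # Failure: shrink the step size; the exponents balance at a 1/5 success rate.
            sigma *= 1.5 ** -0.25
    return x, fx


def l1_distance(x):
    """Non-differentiable test objective: L1 distance to a hypothetical target point."""
    target = (3.0, -2.0, 0.5)
    return sum(abs(xi - ti) for xi, ti in zip(x, target))


best_x, best_f = one_plus_one_es(l1_distance, [0.0, 0.0, 0.0])
```

The method only ever compares objective values, so it applies unchanged to objectives with kinks, plateaus, or simulator-defined outputs.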
Online learning formalizes decision-making as a sequential game: at each round an agent selects an action, observes a loss, and updates its policy. The central quantity is regret: the gap between the agent's cumulative loss and that of the best fixed strategy in hindsight. We study Follow-the-Regularized-Leader and mirror descent as unified frameworks for deriving optimal algorithms, Hedge and EXP3 for adversarial full-information and bandit settings respectively, and adaptive gradient methods (AdaGrad, Adam) viewed as online algorithms with per-coordinate learning rates, connected to stochastic optimization through online-to-batch conversion.
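The regret definition can be made concrete with a short sketch of Hedge (exponential weights) in the full-information setting, where the losses of all experts are revealed each round. The loss sequence below is randomly generated and purely hypothetical. The learning rate is the standard choice eta = sqrt(8 ln K / T) for losses in [0, 1], under which the regret of the expected loss is at most sqrt(T ln K / 2).

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run Hedge on a list of per-round loss vectors; return regret of the expected loss."""
    k = len(loss_rounds[0])
    cum_losses = [0.0] * k
    learner_loss = 0.0
    for losses in loss_rounds:
        # Weights proportional to exp(-eta * cumulative loss); subtracting the
        # minimum leaves the distribution unchanged and avoids underflow.
        base = min(cum_losses)
        weights = [math.exp(-eta * (c - base)) for c in cum_losses]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of playing an expert drawn from probs.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        cum_losses = [c + l for c, l in zip(cum_losses, losses)]
    # Regret: learner's cumulative loss minus that of the best fixed expert in hindsight.
    return learner_loss - min(cum_losses)


rng = random.Random(0)
n_rounds, n_experts = 1000, 5
# Hypothetical losses in [0, 1]; expert 0 is slightly better on average.
loss_rounds = [
    [rng.random() * (0.8 if i == 0 else 1.0) for i in range(n_experts)]
    for _ in range(n_rounds)
]
eta = math.sqrt(8 * math.log(n_experts) / n_rounds)
regret = hedge(loss_rounds, eta)
bound = math.sqrt(n_rounds * math.log(n_experts) / 2)  # worst-case guarantee
```

The bound holds for any loss sequence in [0, 1], including adversarially chosen ones, which is why the comparison regret <= bound does not depend on how the losses above were generated.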
Open problems
Under what conditions is a CATE estimator asymptotically efficient, and does honest sample-splitting pay a meaningful price in finite samples?
Can Thompson sampling be extended to non-stationary reward processes with unknown drift, while maintaining sub-linear regret guarantees?
What is the minimax-optimal adaptive design for a crossover experiment when washout duration is unknown and carryover effects are plausible?
Is there a general semiparametric efficiency bound for causal quantities defined by the do-calculus in models with instrumental variables and unmeasured confounders?
When does offline RL with pessimistic value estimates yield policies that are safe to deploy, and how tight are existing coverage assumptions?
Can CMA-ES or natural evolution strategies achieve competitive sample efficiency with policy gradient methods on problems where the reward is non-differentiable or only partially observed?
Is there a unified regret bound that smoothly interpolates between the stochastic and adversarial bandit settings, without requiring prior knowledge of which regime applies?
Corpus
Each topic page includes canonical foundational papers with structured wiki summaries, a full browsable paper list, and key findings extracted from the literature.
Educational materials
Ten interactive chapters on experimental design, causal graphs, counterfactual reasoning, Bayesian inference, bandits, and reinforcement learning — accessible to practitioners and researchers.