Notes on Causal Inference, Panel Data, and AI
I decided to start writing notes on causal inference, panel data, and possibly research with and on AI. Substack felt like the right place to put them. This post is a short explanation of why.
The band, the books, the gap
When I was a kid I liked a band called Michael Learns to Rock. The name has stayed with me because it captures something about my relationship to statistics, econometrics, and causal inference: I came to this field relatively late, and I am still learning to rock it.
The mismatch between how confident I sound in classrooms and how confused I often feel reading new papers is not a flaw to hide. It is part of the reason to write things down.
For years I have wanted to write lecture notes on panel data methods aimed at practitioners, because I am one. Existing materials are excellent, but the field has moved quickly enough that the gap between recent arXiv papers and what people actually do in 2026 has widened.
This Substack is partly a working space for those notes. Eventually they may become something more permanent. For now, this is where the thinking happens out loud.
There is a lot of writing I am learning from: Scott Cunningham’s Mixtape, Andrew Gelman’s blog, Cyrus Samii’s posts, and Bia’s DiD Digest, among others. Each is excellent. But the broader space of informal, public writing on causal inference with panel data still feels surprisingly sparse given how active the research area is. More people writing openly in this space would help, not hurt.
Why now
Three things changed for me recently.
First, I got tenure. I genuinely feel safer making mistakes in public. That matters.
Second, AI has lowered two long-standing barriers for me. As a non-native English speaker, copyediting used to be a real bottleneck. It is much less so now. Building small teaching simulations to illustrate a point used to take an afternoon; now it can take minutes. The marginal cost of writing a post has fallen enough that the marginal post is now worth writing.
Third, I have felt a stronger urge to keep thinking through writing. I am fortunate to be surrounded by very smart people doing serious work on causal inference, and much of what I write here will be an attempt to turn what I observe and learn into something more durable. Writing is thinking, and public writing is one of the better ways I know to think carefully.
What to expect
Most posts will run 1,000 to 1,500 words, each hopefully with some substance. My imagined audience is people who have taken the first-year quantitative methods or econometrics sequence in the social sciences, though I hope a wider audience will find the posts useful.
Some posts will discuss my own work. I will talk about what we did and why.
Some will be learning notes on other people’s work that I have benefited from.
Others will involve replications or simulations that I think can shed light on practice.
Many posts will be half-baked. There are ideas worth writing down that are not worth a full arXiv paper, and Twitter is not the right place for anything requiring more than two paragraphs of context. Substack sits nicely in that middle ground.
When a post includes simulations or reproduces published results, the code and data will live in a companion GitHub repo. One folder per post, with scripts, data, and figures. If you find a bug, please file an issue and reference the post date. I will fix it.
A note on AI. I will use it cautiously, especially in the writing itself. There is a real risk of AI slop colonizing this medium, and the only honest response is transparency and human custody. Public code and data are part of that.
Next post
The first real post will be about visualizing panel data with panelView. I made this package with Licheng and Hongyu, and I find it useful almost every time I work with a new panel dataset.
I am also planning a revamp of the package with a more modern look soon. The post will discuss what the package does and why visualizing the data before running any regressions is underused.
See you then.



