5.2 — Difference-in-Differences

ECON 480 • Econometrics • Fall 2022

Dr. Ryan Safner
Associate Professor of Economics

safner@hood.edu
ryansafner/metricsF22
metricsF22.classes.ryansafner.com

Contents

Difference-in-Differences Models

Example I: HOPE in Georgia

Generalizing DND Models

Example II: “The” Card-Krueger Minimum Wage Study

Clever Research Designs Identify Causality

Again, this toolkit of research designs to identify causal effects is the economist’s comparative advantage that firms and governments want!

Difference-in-Differences Models

Natural Experiments

Difference-in-Differences Models I

  • Often, we want to examine the consequences of a change, such as a law or policy intervention

Example

  • How do States that implement policy \(X\) see changes in \(Y\)?
    • Treatment: States that implement \(X\)
    • Control: States that did not implement \(X\)
  • If we have panel data with observations for all states before and after the change…
    • Find the difference between treatment & control groups in their differences before and after the treatment period

Difference-in-Differences Models II

  • The difference-in-differences (aka “diff-in-diff” or “DND”) estimator identifies the treatment effect by differencing the pre- vs. post-treatment differences in \(Y\) between the treatment and control groups

\[\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}\]

  • \(\text{Treated}_i= \begin{cases}1 \text{ if } i \text{ is in treatment group}\\ 0 \text{ if } i \text{ is not in treatment group}\end{cases} \quad \text{After}_t= \begin{cases}1 \text{ if } t \text{ is after treatment period}\\ 0 \text{ if } t \text{ is before treatment period}\end{cases}\)

| | Control | Treatment | Group Diff \((\Delta Y_i)\) |
|---|---|---|---|
| Before | \(\beta_0\) | \(\beta_0+\beta_1\) | \(\beta_1\) |
| After | \(\beta_0+\beta_2\) | \(\beta_0+\beta_1+\beta_2+\beta_3\) | \(\beta_1+\beta_3\) |
| Time Diff \((\Delta Y_t)\) | \(\beta_2\) | \(\beta_2+\beta_3\) | Diff-in-diff \((\Delta_i \Delta_t)\): \(\beta_3\) |
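
The table can be checked numerically. The following is a minimal sketch on simulated data: the data frame `sim`, the sample size, and the "true" coefficients (10, 2, 1, 3) are all assumptions made up for illustration, not anything from the lecture's datasets. It estimates the interaction regression and separately computes the difference-in-differences from the four cell means.

```r
# Hypothetical simulated data: none of these numbers come from the lecture
set.seed(480)
n <- 2000
sim <- data.frame(
  Treated = rbinom(n, 1, 0.5),  # 1 = treatment group
  After   = rbinom(n, 1, 0.5)   # 1 = post-treatment period
)
# Assumed true parameters: beta_0 = 10, beta_1 = 2, beta_2 = 1, beta_3 = 3
sim$Y <- 10 + 2 * sim$Treated + 1 * sim$After +
  3 * sim$Treated * sim$After + rnorm(n)

# Regression version: the interaction coefficient is the diff-in-diff
reg <- lm(Y ~ Treated * After, data = sim)
beta_3_hat <- unname(coef(reg)["Treated:After"])

# Group-means version: difference the two before/after differences
cell_means <- tapply(sim$Y, list(sim$Treated, sim$After), mean)
did_means <- (cell_means["1", "1"] - cell_means["1", "0"]) -
  (cell_means["0", "1"] - cell_means["0", "0"])

# The regression is saturated (one parameter per cell), so it reproduces
# the four cell means and the two estimates agree up to rounding error
all.equal(beta_3_hat, unname(did_means))
```

Because the two dummies and their interaction give one parameter per cell, the regression is just a convenient way to compute (and get standard errors for) the comparison of group means.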

Example: Hot Dogs

  • Is there a discount when you get cheese and chili?
library(tidyverse) # for %>%
library(broom)     # for tidy()
lm(price ~ cheese + chili + cheese*chili,
   data = hotdogs) %>%
  tidy()
  • Diff-n-diff is just a model with an interaction term between two dummies!
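
As a small illustration of reading the interaction term, here is a sketch with a made-up four-item menu. The data frame `hotdogs_example` and its prices are hypothetical and are not the `hotdogs` data used in the slide.

```r
# Hypothetical menu prices, invented for illustration (not the lecture's data)
hotdogs_example <- data.frame(
  cheese = c(0, 1, 0, 1),
  chili  = c(0, 0, 1, 1),
  price  = c(2.00, 2.35, 2.50, 2.75)
)

# Four prices and four coefficients: the model fits each price exactly
menu_reg <- lm(price ~ cheese * chili, data = hotdogs_example)
round(coef(menu_reg), 2)
```

With these assumed prices, cheese adds $0.35 and chili adds $0.50 to a plain hot dog, but both together add only $0.75, so the interaction coefficient is -0.10: a 10-cent discount for getting both.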

Visualizing Diff-in-Diff

\[\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}\]

  • Control group \((\text{Treated}_i = 0)\)
    • \(\hat{\beta_0}\): value of \(Y\) for control group before treatment
    • \(\hat{\beta_2}\): time difference (for control group)
  • Treatment group \((\text{Treated}_i = 1)\)
    • \(\hat{\beta_1}\): difference between groups before treatment
    • \(\hat{\beta_3}\): difference-in-differences (treatment effect)

Visualizing Diff-in-Diff II

\[\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}\]

  • \(\bar{Y_i}\) for Control group before: \(\hat{\beta_0}\)

  • \(\bar{Y_i}\) for Control group after: \(\hat{\beta_0}+\hat{\beta_2}\)

  • \(\bar{Y_i}\) for Treatment group before: \(\hat{\beta_0}+\hat{\beta_1}\)

  • \(\bar{Y_i}\) for Treatment group after: \(\hat{\beta_0}+\hat{\beta_1}+\hat{\beta_2}+\hat{\beta_3}\)

  • Group Difference (before): \(\hat{\beta_1}\)

  • Time Difference: \(\hat{\beta_2}\)

  • Difference-in-differences: \(\hat{\beta_3}\) (treatment effect)

Comparing Group Means (Again)

\[\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}\]

| | Control | Treatment | Group Diff \((\Delta Y_i)\) |
|---|---|---|---|
| Before | \(\beta_0\) | \(\beta_0+\beta_1\) | \(\beta_1\) |
| After | \(\beta_0+\beta_2\) | \(\beta_0+\beta_1+\beta_2+\beta_3\) | \(\beta_1+\beta_3\) |
| Time Diff \((\Delta Y_t)\) | \(\beta_2\) | \(\beta_2+\beta_3\) | Diff-in-diff \(\Delta_i \Delta_t: \beta_3\) |
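
Written out step by step, differencing the two time differences in the bottom row leaves only the interaction coefficient:

\[\begin{align*} \Delta_i \Delta_t Y &= (\text{Treatment}_{after}-\text{Treatment}_{before})-(\text{Control}_{after}-\text{Control}_{before})\\ &=[(\beta_0+\beta_1+\beta_2+\beta_3)-(\beta_0+\beta_1)]-[(\beta_0+\beta_2)-\beta_0]\\ &=(\beta_2+\beta_3)-\beta_2\\ &=\beta_3\\ \end{align*}\]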

Key Assumption: Counterfactual

\[\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}\]

  • Key assumption for DND: time trends (for treatment and control) are parallel

  • Treatment and control groups assumed to be identical over time on average, except for treatment

  • Counterfactual: if the treatment group had not received treatment, it would have changed over time exactly as the control group did \((\hat{\beta_2})\)

  • If the time trends would have been different, \(\hat{\beta_3}\) is a biased measure of the treatment effect!
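
A short sketch of where the bias comes from. The symbols here are assumptions introduced only for this illustration and are not part of the lecture's notation: \(\tau\) is the true treatment effect, and \(\delta_T\) and \(\delta_C\) are the changes over time that the treatment and control groups would have experienced without any treatment.

\[\begin{align*} \hat{\beta_3} &\approx (\text{Treatment}_{after}-\text{Treatment}_{before})-(\text{Control}_{after}-\text{Control}_{before})\\ &=(\delta_T+\tau)-\delta_C\\ &=\tau+(\delta_T-\delta_C)\\ \end{align*}\]

Under parallel trends \(\delta_T=\delta_C\), so the second term vanishes and \(\hat{\beta_3}\) measures \(\tau\). Otherwise the gap in trends \((\delta_T-\delta_C)\) is counted as if it were part of the treatment effect.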

Example I: HOPE in Georgia

Diff-in-Diff Example I

Example

In 1993 Georgia initiated a HOPE scholarship program to let state residents with at least a B average in high school attend public college in Georgia for free. Did it increase college enrollment?

  • Micro-level data on 4,291 young individuals
  • \(\text{InCollege}_{it}=\begin{cases}1 \text{ if } i \text{ is in college during year }t\\ 0 \text{ if } i \text{ is not in college during year }t\\ \end{cases}\)
  • \(\text{Georgia}_i=\begin{cases}1 \text{ if } i \text{ is a Georgia resident}\\ 0 \text{ if } i \text{ is not a Georgia resident}\\ \end{cases}\)
  • \(\text{After}_t=\begin{cases}1 \text{ if } t \text{ is after 1992}\\ 0 \text{ if } t \text{ is 1992 or earlier}\\ \end{cases}\)

Dynarski, Susan, 2000, “Hope for Whom? Financial Aid for the Middle Class and its Impact on College Attendance,” National Tax Journal 53(3): 629-661

Diff-in-Diff Example II

  • We can use a DND model to measure the effect of HOPE scholarship on enrollments

  • If not for HOPE, enrollment in Georgia and in nearby States should have changed identically over time

  • Treatment period: after 1992
  • Treatment: Georgia
  • Difference-in-differences: \[\Delta_i \Delta_t Enrolled = (\text{GA}_{after}-\text{GA}_{before})-(\text{neighbors}_{after}-\text{neighbors}_{before})\]
  • Regression equation:

\[\widehat{\text{Enrolled}_{it}} = \beta_0+\beta_1 \, \text{Georgia}_{i}+\beta_2 \, \text{After}_{t}+\beta_3 \, (\text{Georgia}_{i} \times \text{After}_{t})\]

Example: Data

hope

Example: Regression

# the Georgia:After interaction coefficient is the diff-in-diff estimate
DND_reg <- lm(InCollege ~ Georgia + After + Georgia*After, data = hope)
DND_reg %>% tidy()

\[\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})\]

Example: Interpreting the Regression

\[\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})\]

  • \(\beta_0\): Before HOPE (through 1992), a non-Georgian had a 40.6% probability of being a college student
  • \(\beta_1\): Before HOPE, Georgians were 10.5 percentage points less likely to be college students than residents of neighboring states
  • \(\beta_2\): After 1992, non-Georgians were 0.4 percentage points less likely to be college students than before
  • \(\beta_3\): After 1992, Georgians' enrollment probability rose by 8.9 percentage points more than that of residents of neighboring states

  • Treatment effect: HOPE increased the probability of enrollment by 8.9 percentage points

Example: Comparing Group Means

\[\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})\]

  • A group mean for a dummy \(Y\) is \(\mathbb{E}[Y]=P(Y=1)\), i.e. the probability a student is enrolled:

  • Non-Georgian enrollment probability pre-1992: \(\beta_0=0.406\)

  • Georgian enrollment probability pre-1992: \(\beta_0+\beta_1=0.406-0.105=0.301\)
  • Non-Georgian enrollment probability post-1992: \(\beta_0+\beta_2=0.406-0.004=0.402\)
  • Georgian enrollment probability post-1992: \(\beta_0+\beta_1+\beta_2+\beta_3=0.406-0.105-0.004+0.089=0.386\)
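
Since \(Y\) is a dummy, these coefficients are differences in probabilities, so the estimated effect of 0.089 is 8.9 percentage points. As a rough back-of-the-envelope illustration (not a figure taken from Dynarski's paper), compare it to Georgia's pre-HOPE enrollment probability:

\[\frac{0.089}{0.301} \approx 0.296\]

That is roughly a 30% increase relative to the pre-HOPE Georgian baseline, which is a much larger number than "8.9%" read as a relative change would suggest.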

Example: Comparing Group Means in R

# group mean for non-Georgian before 1992
hope %>%
  filter(Georgia == 0,
         After == 0) %>%
  summarize(prob = mean(InCollege))
# group mean for non-Georgian AFTER 1992
hope %>%
  filter(Georgia == 0,
         After == 1) %>%
  summarize(prob = mean(InCollege))

Example: Comparing Group Means in R

# group mean for Georgian before 1992
hope %>%
  filter(Georgia == 1,
         After == 0) %>%
  summarize(prob = mean(InCollege))
# group mean for Georgian AFTER 1992
hope %>%
  filter(Georgia == 1,
         After == 1) %>%
  summarize(prob = mean(InCollege))

Example: Diff-in-Diff Summary

\[\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})\]

| | Neighbors | Georgia | Group Diff \((\Delta Y_i)\) |
|---|---|---|---|
| Before | \(0.406\) | \(0.301\) | \(-0.105\) |
| After | \(0.402\) | \(0.386\) | \(-0.016\) |
| Time Diff \((\Delta Y_t)\) | \(-0.004\) | \(0.085\) | Diff-in-diff: \(0.089\) |

\[\begin{align*} \Delta_i \Delta_t Enrolled &= (\text{GA}_{after}-\text{GA}_{before})-(\text{neighbors}_{after}-\text{neighbors}_{before})\\ &=(0.386-0.301)-(0.402-0.406)\\ &=(0.085)-(-0.004)\\ &=0.089\\ \end{align*}\]

Diff-in-Diff Summary & Data

Dynarski, Susan, 2000, “Hope for Whom? Financial Aid for the Middle Class and its Impact on College Attendance,” National Tax Journal 53(3): 629-661

Example: Diff-in-Diff Graph

Generalizing DND Models

Generalizing DND Models

  • DND can be generalized with a two-way fixed effects model:

\[\hat{Y}_{it}=\beta_1 \, (\text{Treated}_i \times \text{After}_{t})+\alpha_i+\theta_t+\nu_{it}\]

  • \(\alpha_i\): group fixed effects (treatment/control groups)
  • \(\theta_t\): time fixed effects (pre/post treatment)
  • \(\beta_1\): diff-in-diff (interaction effect, \(\beta_3\) from before)
  • Flexibility: many periods (not just before/after), many different treatment(s)/groups, and treatment(s) can occur at different times to different units (so long as some do not get treated)
  • Can also add control variables that vary within units and over time

\[\hat{Y}_{it}=\beta_1 \, (\text{Treated}_i \times \text{After}_{t})+\beta_2 X_{it}+\cdots + \alpha_i+\theta_t+\nu_{it}\]
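
A minimal sketch of the two-way fixed effects setup with many periods and staggered treatment timing, on simulated data. Everything here is assumed for illustration: the panel dimensions, the adoption dates, the unit and time effects, and a constant true effect of 2. The indicator `D` plays the role of \(\text{Treated}_i \times \text{After}_t\), switching on for each treated unit once its own treatment begins.

```r
# Hypothetical simulated panel: 6 units observed over 8 periods
set.seed(480)
panel <- expand.grid(unit = 1:6, period = 1:8)

# Assumed adoption dates: units 1-2 treated from period 4, units 3-4 from
# period 6, units 5-6 never treated (0 marks "never")
start <- c(4, 4, 6, 6, 0, 0)[panel$unit]
panel$D <- as.integer(start > 0 & panel$period >= start)

unit_effect <- c(5, 3, 8, 1, 6, 2)[panel$unit]  # alpha_i
time_effect <- 0.5 * panel$period               # theta_t (common to all units)
true_effect <- 2                                # assumed constant effect
panel$Y <- unit_effect + time_effect + true_effect * panel$D +
  rnorm(nrow(panel), sd = 0.1)

# LSDV: unit and period dummies absorb alpha_i and theta_t
twfe <- lm(Y ~ D + factor(unit) + factor(period), data = panel)
coef(twfe)["D"]
```

The coefficient on `D` should land close to the assumed effect of 2, even though units differ in level and everyone shares a time trend. The never-treated units 5 and 6 are what pin down the time effects in the later periods.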

Our Example, Generalized I

\[\widehat{\text{Enrolled}_{it}} = \beta_1 \, (\text{Georgia}_{i} \times \text{After}_{t}) + \alpha_i+\theta_t\]

  • StateCode is a variable for the State \(\implies\) create State fixed effect \((\alpha_i)\)

  • Year is a variable for the year \(\implies\) create year fixed effect \((\theta_t)\)

Our Example, Generalized II

Using LSDV method:

DND_fe <- lm(InCollege ~ Georgia*After + factor(StateCode) + factor(Year),
           data = hope)
DND_fe %>% tidy()

Our Example, Generalized II

Using fixest

library(fixest)
# variables after | are absorbed as fixed effects (no factor() needed)
DND_fe_2 <- feols(InCollege ~ Georgia*After | StateCode + Year,
           data = hope)
DND_fe_2 %>% tidy()

\[\widehat{\text{InCollege}_{it}}=0.091 \, (\text{Georgia}_i \times \text{After}_{t})+\alpha_i+\theta_t\]

Our Example, Generalized, with Controls II

Using LSDV Method

DND_fe_controls <- lm(InCollege ~ Georgia*After + factor(StateCode) + factor(Year) + Black + LowIncome,
           data = hope)
DND_fe_controls %>% tidy()

Our Example, Generalized, with Controls II

Using fixest

# controls go before |, absorbed fixed effects after it
DND_fe_controls_2 <- feols(InCollege ~ Georgia*After + Black + LowIncome | StateCode + Year,
           data = hope)
DND_fe_controls_2 %>% tidy()

\[\widehat{\text{InCollege}_{it}}=0.023 \, (\text{Georgia}_i \times \text{After}_{t})-0.094 \, \text{Black}_{it}-0.302 \,\text{LowIncome}_{it}+\alpha_i+\theta_t\]

Our Example, Generalized, with Controls III

| | No FE | TWFE | TWFE |
|---|---|---|---|
| Constant | 0.40578*** (0.01092) | | |
| Georgia | −0.10524*** (0.03778) | | |
| After | −0.00446 (0.01585) | | |
| Georgia x After | 0.08933* (0.04889) | 0.09142*** (0.00564) | 0.02344 (0.01282) |
| Black | | | −0.09399*** (0.01273) |
| LowIncome | | | −0.30172*** (0.03066) |
| n | 4291 | 4291 | 2967 |
| Adj. R2 | 0.00 | | |
| SER | 0.49 | 0.49 | 0.47 |

Standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01

The Findings

Dynarski, Susan, 2000, “Hope for Whom? Financial Aid for the Middle Class and its Impact on College Attendance,” National Tax Journal 53(3): 629-661

Intuition behind DND

  • Diff-in-diff models are the quintessential example of exploiting natural experiments

  • A major change at a point in time (change in law, a natural disaster, political crisis) separates groups where one is affected and another is not—identifies the effect of the change (treatment)

  • One of the cleanest and clearest causal identification strategies

Example II: “The” Card-Krueger Minimum Wage Study

Example: “The” Card-Krueger Minimum Wage Study I

Example

The controversial minimum wage study, Card & Krueger (1994), is a quintessential (and clever) diff-in-diff.

Card, David, Krueger, Alan B, (1994), “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania,” American Economic Review 84 (4): 772–793

Card & Krueger (1994): Background I

  • Card & Krueger (1994) compare employment in fast food restaurants on the New Jersey and Pennsylvania sides of the border between February and November 1992.

  • Pennsylvania & New Jersey both had a minimum wage of $4.25 before April 1992

  • On April 1, 1992, New Jersey raised its minimum wage from $4.25 to $5.05

Card & Krueger (1994): Background II

  • If we look only at New Jersey before & after change:
    • Omitted variable bias: macroeconomic variables (there’s a recession!), weather, etc.
      • Including PA as a control will control for these time-varying effects if they are national trends
  • Surveyed 410 fast food restaurants in New Jersey and eastern Pennsylvania, before & after the minimum wage increase
    • Key assumption: Pennsylvania and New Jersey follow parallel trends
      • Counterfactual: if not for the minimum wage increase, NJ employment would have changed similarly to PA employment

Card & Krueger (1994): Comparisons

Card & Krueger (1994): Summary I

Card & Krueger (1994): Summary II

Card & Krueger (1994): Model

\[\widehat{\text{Employment}_{i t}}=\beta_0+\beta_1 \, \text{NJ}_{i}+\beta_2 \, \text{After}_{t}+\beta_3 \, (\text{NJ}_i \times \text{After}_t)\]

  • PA Before: \(\beta_0\)

  • PA After: \(\beta_0+\beta_2\)

  • NJ Before: \(\beta_0+\beta_1\)

  • NJ After: \(\beta_0+\beta_1+\beta_2+\beta_3\)

  • Diff-in-diff: \((\text{NJ}_{after}-\text{NJ}_{before})-(\text{PA}_{after}-\text{PA}_{before})\)

| | PA | NJ | Group Diff \((\Delta Y_i)\) |
|---|---|---|---|
| Before | \(\beta_0\) | \(\beta_0+\beta_1\) | \(\beta_1\) |
| After | \(\beta_0+\beta_2\) | \(\beta_0+\beta_1+\beta_2+\beta_3\) | \(\beta_1+\beta_3\) |
| Time Diff \((\Delta Y_t)\) | \(\beta_2\) | \(\beta_2+\beta_3\) | Diff-in-diff \(\Delta_i \Delta_t: \beta_3\) |

Card & Krueger (1994): Results