5.2 — Difference-in-Differences

ECON 480 • Econometrics • Fall 2022

Dr. Ryan Safner
Associate Professor of Economics

## Clever Research Designs Identify Causality

Again, this toolkit of research designs to identify causal effects is the economist’s comparative advantage that firms and governments want!

# Difference-in-Differences Models

## Difference-in-Differences Models I

• Often, we want to examine the consequences of a change, such as a law or policy intervention

Example

• How do States that implement policy $X$ see changes in $Y$?
• Treatment: States that implement $X$
• Control: States that did not implement $X$
• If we have panel data with observations for all states before and after the change…

• Find the difference between treatment & control groups in their differences before and after the treatment period

## Difference-in-Differences Models II

• The difference-in-differences (aka “diff-in-diff” or “DND”) estimator identifies the treatment effect by differencing the pre- to post-treatment change in $Y$ between the treatment and control groups

$\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}$

• $\text{Treated}_i= \begin{cases}1 \text{ if } i \text{ is in treatment group}\\ 0 \text{ if } i \text{ is not in treatment group}\end{cases} \quad \text{After}_t= \begin{cases}1 \text{ if } t \text{ is after treatment period}\\ 0 \text{ if } t \text{ is before treatment period}\end{cases}$

| | Control | Treatment | Group Diff $(\Delta Y_i)$ |
|---|---|---|---|
| Before | $\beta_0$ | $\beta_0+\beta_1$ | $\beta_1$ |
| After | $\beta_0+\beta_2$ | $\beta_0+\beta_1+\beta_2+\beta_3$ | $\beta_1+\beta_3$ |
| Time Diff $(\Delta Y_t)$ | $\beta_2$ | $\beta_2+\beta_3$ | Diff-in-diff $(\Delta_i \Delta_t)$: $\beta_3$ |
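
To see why $\beta_3$ is the diff-in-diff, the four cells of the table above can be differenced directly. This is only a short derivation using the model's own symbols, with "Treatment" and "Control" standing for the group means in each period:

\begin{align*} \Delta_i \Delta_t Y &= (\text{Treatment}_{after}-\text{Treatment}_{before})-(\text{Control}_{after}-\text{Control}_{before})\\ &= \big[(\beta_0+\beta_1+\beta_2+\beta_3)-(\beta_0+\beta_1)\big]-\big[(\beta_0+\beta_2)-\beta_0\big]\\ &= (\beta_2+\beta_3)-\beta_2\\ &= \beta_3 \end{align*}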

## Example: Hot Dogs

• Is there a discount when you get cheese and chili?

```r
library(tidyverse) # provides %>%
library(broom)     # provides tidy()
lm(price ~ cheese + chili + cheese*chili,
   data = hotdogs) %>%
  tidy()
```

• Diff-in-diff is just a model with an interaction term between two dummies!
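
As a minimal sketch of the hot dog idea, the block below builds a tiny menu with made-up prices (the `hotdogs_demo` data frame and all four prices are assumptions for illustration, not the data used in the slides) and shows that the interaction coefficient equals the difference-in-differences of the four prices.

```r
# Hypothetical menu prices (illustrative only, not from the slides):
# plain = $2.00, cheese adds $0.35, chili adds $0.50,
# and ordering both comes with a $0.10 discount.
hotdogs_demo <- data.frame(
  cheese = c(0, 1, 0, 1),
  chili  = c(0, 0, 1, 1),
  price  = c(2.00, 2.35, 2.50, 2.75)
)

# cheese * chili expands to cheese + chili + cheese:chili
fit <- lm(price ~ cheese * chili, data = hotdogs_demo)
round(coef(fit), 2)

# The same interaction, computed by hand as a difference of differences:
# (effect of cheese when chili = 1) - (effect of cheese when chili = 0)
did <- (2.75 - 2.50) - (2.35 - 2.00)
did
```

A negative interaction here means the combination is cheaper than the sum of the two add-ons, which is exactly the "discount" the slide asks about.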

## Visualizing Diff-in-Diff

$\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}$

• Control group $(\text{Treated}_i = 0)$

• $\hat{\beta_0}$: value of $Y$ for control group before treatment

• $\hat{\beta_2}$: time difference (for control group)

• Treatment group $(\text{Treated}_i = 1)$

• $\hat{\beta_1}$: difference between groups before treatment

• $\hat{\beta_3}$: difference-in-differences (treatment effect)

## Visualizing Diff-in-Diff II

$\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}$

• $\bar{Y_i}$ for Control group before: $\hat{\beta_0}$

• $\bar{Y_i}$ for Control group after: $\hat{\beta_0}+\hat{\beta_2}$

• $\bar{Y_i}$ for Treatment group before: $\hat{\beta_0}+\hat{\beta_1}$

• $\bar{Y_i}$ for Treatment group after: $\hat{\beta_0}+\hat{\beta_1}+\hat{\beta_2}+\hat{\beta_3}$

• Group Difference (before): $\hat{\beta_1}$

• Time Difference: $\hat{\beta_2}$

• Difference-in-differences: $\hat{\beta_3}$ (treatment effect)

## Comparing Group Means (Again)

$\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}$

| | Control | Treatment | Group Diff $(\Delta Y_i)$ |
|---|---|---|---|
| Before | $\beta_0$ | $\beta_0+\beta_1$ | $\beta_1$ |
| After | $\beta_0+\beta_2$ | $\beta_0+\beta_1+\beta_2+\beta_3$ | $\beta_1+\beta_3$ |
| Time Diff $(\Delta Y_t)$ | $\beta_2$ | $\beta_2+\beta_3$ | Diff-in-diff $\Delta_i \Delta_t: \beta_3$ |
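
The equivalence between the regression and the table of group means can be checked on simulated data. The sketch below assumes made-up parameter values ($\beta_0=10$, $\beta_1=2$, $\beta_2=1$, $\beta_3=3$) and a simulated data frame `sim`; none of these come from the slides.

```r
set.seed(480)  # arbitrary seed, for reproducibility only
n <- 2000
sim <- data.frame(
  Treated = rbinom(n, 1, 0.5),
  After   = rbinom(n, 1, 0.5)
)
# Assumed true parameters: beta0 = 10, beta1 = 2, beta2 = 1, beta3 = 3
sim$Y <- 10 + 2 * sim$Treated + 1 * sim$After +
  3 * sim$Treated * sim$After + rnorm(n)

# Regression estimate of the interaction term (beta3-hat)
fit <- lm(Y ~ Treated * After, data = sim)
b3 <- coef(fit)[["Treated:After"]]

# The same number from the four group means (rows = Treated, cols = After)
m <- tapply(sim$Y, list(sim$Treated, sim$After), mean)
did_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

all.equal(b3, did_means)
```

Because the model with both dummies and their interaction is saturated, the fitted values are the four group means, so the two calculations agree up to floating-point error.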

## Key Assumption: Counterfactual

$\hat{Y}_{it}=\beta_0+\beta_1 \, \text{Treated}_i +\beta_2 \, \text{After}_{t}+\beta_3 \,(\text{Treated}_i \times \text{After}_{t})+u_{it}$

• Key assumption for DND: time trends (for treatment and control) are parallel

• Treatment and control groups may differ in levels (by $\hat{\beta_1}$), but are assumed to change identically over time on average, except for the treatment

• Counterfactual: if the treatment group had not received treatment, it would have changed over time by the same amount as the control group $(\hat{\beta_2})$

• If the time trends would have differed, $\hat{\beta_3}$ is a biased measure of the treatment effect!
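
A small simulation can illustrate this bias. The sketch below assumes a true treatment effect of 2 and gives the treatment group an extra time trend of 1.5 that has nothing to do with treatment; both values, and the data frame `sim`, are invented for illustration.

```r
set.seed(480)  # arbitrary seed, for reproducibility only
n <- 4000
sim <- data.frame(
  Treated = rbinom(n, 1, 0.5),
  After   = rbinom(n, 1, 0.5)
)
true_effect <- 2     # assumed true treatment effect
extra_trend <- 1.5   # assumed extra time trend in the treatment group only

# Even without treatment, the treated group would have grown by 1 + extra_trend,
# so the parallel trends assumption fails by construction.
sim$Y <- 5 + 1 * sim$Treated + 1 * sim$After +
  (true_effect + extra_trend) * sim$Treated * sim$After + rnorm(n)

b3 <- coef(lm(Y ~ Treated * After, data = sim))[["Treated:After"]]
b3  # lands near true_effect + extra_trend, not near true_effect
```

The regression cannot separate the treatment effect from the group-specific trend: both load onto the interaction term.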

# Example I: HOPE in Georgia

## Diff-in-Diff Example I

Example

In 1993 Georgia initiated a HOPE scholarship program to let state residents with at least a B average in high school attend public college in Georgia for free. Did it increase college enrollment?

• Micro-level data on 4,291 young individuals
• $\text{InCollege}_{it}=\begin{cases}1 \text{ if } i \text{ is in college during year }t\\ 0 \text{ if } i \text{ is not in college during year }t\\ \end{cases}$
• $\text{Georgia}_i=\begin{cases}1 \text{ if } i \text{ is a Georgia resident}\\ 0 \text{ if } i \text{ is not a Georgia resident}\\ \end{cases}$
• $\text{After}_t=\begin{cases}1 \text{ if } t \text{ is after 1992}\\ 0 \text{ if } t \text{ is not after 1992}\\ \end{cases}$

Dynarski, Susan, 1999, “Hope for Whom? Financial Aid for the Middle Class and its Impact on College Attendance,” National Tax Journal 53(3): 629-661

## Diff-in-Diff Example II

• We can use a DND model to measure the effect of HOPE scholarship on enrollments

• Parallel trends assumption: if not for HOPE, enrollment in Georgia and nearby States would have changed the same way over time

• Treatment period: after 1992
• Treatment: Georgia
• Difference-in-differences: $\Delta_i \Delta_t \text{Enrolled} = (\text{GA}_{after}-\text{GA}_{before})-(\text{neighbors}_{after}-\text{neighbors}_{before})$
• Regression equation:

$\widehat{\text{Enrolled}_{it}} = \beta_0+\beta_1 \, \text{Georgia}_{i}+\beta_2 \, \text{After}_{t}+\beta_3 \, (\text{Georgia}_{i} \times \text{After}_{t})$

## Example: Data

```r
hope # print the data frame
```

## Example: Regression

```r
# Georgia*After would expand to the same three terms
DND_reg <- lm(InCollege ~ Georgia + After + Georgia*After, data = hope)
DND_reg %>% tidy()
```

$\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})$

## Example: Interpreting the Regression

$\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})$

• $\beta_0$: A non-Georgian before 1992 had a 40.6% probability of being a college student
• $\beta_1$: Georgians before 1992 were 10.5 percentage points less likely to be college students than residents of neighboring states
• $\beta_2$: After 1992, non-Georgians are 0.4 percentage points less likely to be college students than before
• $\beta_3$: After 1992, the enrollment probability of Georgians rose by 8.9 percentage points more than that of residents of neighboring states

• Treatment effect: HOPE increased the probability of enrollment by 8.9 percentage points

## Example: Comparing Group Means

$\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})$

• A group mean for a dummy $Y$ is $\mathbb{E}[Y]=\Pr(Y=1)$, i.e. the probability a student is enrolled:

• Non-Georgian enrollment probability pre-1992: $\beta_0=0.406$

• Georgian enrollment probability pre-1992: $\beta_0+\beta_1=0.406-0.105=0.301$
• Non-Georgian enrollment probability post-1992: $\beta_0+\beta_2=0.406-0.004=0.402$
• Georgian enrollment probability post-1992: $\beta_0+\beta_1+\beta_2+\beta_3=0.406-0.105-0.004+0.089=0.386$
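
The arithmetic above can be reproduced in a few lines. This sketch simply plugs in the rounded coefficients from the estimated equation; the object names (`b0`, `probs`, `did`) are illustrative.

```r
# Rounded coefficients as reported in the estimated regression above
b0 <- 0.406; b1 <- -0.105; b2 <- -0.004; b3 <- 0.089

# Predicted enrollment probability for each group and period
probs <- c(
  neighbors_before = b0,
  georgia_before   = b0 + b1,
  neighbors_after  = b0 + b2,
  georgia_after    = b0 + b1 + b2 + b3
)
probs

# Difference-in-differences from the four probabilities
did <- unname((probs["georgia_after"] - probs["georgia_before"]) -
              (probs["neighbors_after"] - probs["neighbors_before"]))
did  # equals b3
```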

## Example: Comparing Group Means in R

```r
# group mean for non-Georgian before 1992
hope %>%
  filter(Georgia == 0,
         After == 0) %>%
  summarize(prob = mean(InCollege))
# group mean for non-Georgian AFTER 1992
hope %>%
  filter(Georgia == 0,
         After == 1) %>%
  summarize(prob = mean(InCollege))
```

## Example: Comparing Group Means in R

```r
# group mean for Georgian before 1992
hope %>%
  filter(Georgia == 1,
         After == 0) %>%
  summarize(prob = mean(InCollege))
# group mean for Georgian AFTER 1992
hope %>%
  filter(Georgia == 1,
         After == 1) %>%
  summarize(prob = mean(InCollege))
```

## Example: Diff-in-Diff Summary

$\widehat{\text{Enrolled}}_{it}=0.406-0.105 \, \text{Georgia}_{i}-0.004 \, \text{After}_{t}+0.089 \, (\text{Georgia}_{i} \times \text{After}_{t})$

| | Neighbors | Georgia | Group Diff $(\Delta Y_i)$ |
|---|---|---|---|
| Before | $0.406$ | $0.301$ | $-0.105$ |
| After | $0.402$ | $0.386$ | $-0.016$ |
| Time Diff $(\Delta Y_t)$ | $-0.004$ | $0.085$ | Diff-in-diff: $0.089$ |

\begin{align*} \Delta_i \Delta_t \text{Enrolled} &= (\text{GA}_{after}-\text{GA}_{before})-(\text{neighbors}_{after}-\text{neighbors}_{before})\\ &=(0.386-0.301)-(0.402-0.406)\\ &=(0.085)-(-0.004)\\ &=0.089 \end{align*}

# Generalizing DND Models

## Generalizing DND Models

• DND can be generalized with a two-way fixed effects model:

$\hat{Y}_{it}=\beta_1 \, (\text{Treated}_i \times \text{After}_{t})+\alpha_i+\theta_t+\nu_{it}$

• $\alpha_i$: group fixed effects (treatments/control groups)
• $\theta_t$: time fixed effects (pre/post treatment)
• $\beta_1$: diff-in-diff (interaction effect, $\beta_3$ from before)
• Flexibility: many periods (not just before/after), many different treatment(s)/groups, and treatment(s) can occur at different times to different units (so long as some do not get treated)
• Can also add control variables that vary within units and over time

$\hat{Y}_{it}=\beta_1 \, (\text{Treated}_i \times \text{After}_{t})+\beta_2 X_{it}+\cdots + \alpha_i+\theta_t+\nu_{it}$
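
As a minimal sketch of the two-way fixed effects setup with many periods, the block below simulates a small panel and recovers the treatment effect with `lm()` and dummy variables. Everything about the panel is an assumption made for illustration: 6 states, 8 years, states 1 and 2 treated from year 5 onward, a true effect of 2, and arbitrary state and year effects.

```r
set.seed(480)  # arbitrary seed, for reproducibility only
# Hypothetical panel: 6 states observed over 8 years
panel <- expand.grid(state = 1:6, year = 1:8)
panel$Treated <- as.numeric(panel$state <= 2)  # states 1-2 are treated
panel$After   <- as.numeric(panel$year >= 5)   # treatment starts in year 5

state_fe <- c(3, 1, 4, 1, 5, 9)[panel$state]   # arbitrary state effects (alpha_i)
year_fe  <- 0.5 * panel$year                   # common time effects (theta_t)
panel$Y  <- 2 * panel$Treated * panel$After +  # assumed true effect = 2
  state_fe + year_fe + rnorm(nrow(panel), sd = 0.05)

# LSDV: dummies for each state and each year absorb alpha_i and theta_t,
# leaving only the interaction to be estimated
twfe <- lm(Y ~ Treated:After + factor(state) + factor(year), data = panel)
b_did <- coef(twfe)[["Treated:After"]]
b_did  # close to the assumed true effect of 2
```

Only the interaction enters the formula here, since the separate `Treated` and `After` dummies would be perfectly collinear with the state and year dummies.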

## Our Example, Generalized I

$\widehat{\text{Enrolled}_{it}} = \beta_1 \, (\text{Georgia}_{i} \times \text{After}_{t}) + \alpha_i+\theta_t+\nu_{it}$

• StateCode is a variable for the State $\implies$ create State fixed effect $(\alpha_i)$

• Year is a variable for the year $\implies$ create year fixed effect $(\theta_t)$

## Our Example, Generalized II

Using LSDV method:

```r
# factor() creates one dummy per state and per year
DND_fe <- lm(InCollege ~ Georgia*After + factor(StateCode) + factor(Year),
             data = hope)
DND_fe %>% tidy()
```

## Our Example, Generalized II

Using fixest

```r
library(fixest)
# variables after | are treated as fixed effects, so factor() is not needed
DND_fe_2 <- feols(InCollege ~ Georgia*After | StateCode + Year,
                  data = hope)
DND_fe_2 %>% tidy()
```

$\widehat{\text{InCollege}_{it}}=0.091 \, (\text{Georgia}_i \times \text{After}_{t})+\alpha_i+\theta_t$

## Our Example, Generalized, with Controls II

Using LSDV Method

```r
# same LSDV model, adding the Black and LowIncome controls
DND_fe_controls <- lm(InCollege ~ Georgia*After + factor(StateCode) + factor(Year) + Black + LowIncome,
                      data = hope)
DND_fe_controls %>% tidy()
```

## Our Example, Generalized, with Controls II

Using fixest

```r
# controls go before |, fixed effects after it
DND_fe_controls_2 <- feols(InCollege ~ Georgia*After + Black + LowIncome | StateCode + Year,
                           data = hope)
DND_fe_controls_2 %>% tidy()
```

$\widehat{\text{InCollege}_{it}}=0.023 \, (\text{Georgia}_i \times \text{After}_{t})-0.094 \, \text{Black}_{it}-0.302 \,\text{LowIncome}_{it}+\alpha_i+\theta_t$

## Our Example, Generalized, with Controls III

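| | No FE | TWFE | TWFE + Controls |
|---|---|---|---|
| Constant | 0.40578*** | | |
| | (0.01092) | | |
| Georgia | −0.10524*** | | |
| | (0.03778) | | |
| After | −0.00446 | | |
| | (0.01585) | | |
| Georgia x After | 0.08933* | 0.09142*** | 0.02344 |
| | (0.04889) | (0.00564) | (0.01282) |
| Black | | | −0.09399*** |
| | | | (0.01273) |
| LowIncome | | | −0.30172*** |
| | | | (0.03066) |
| n | 4291 | 4291 | 2967 |
| SER | 0.49 | 0.49 | 0.47 |

\* p < 0.1, \*\* p < 0.05, \*\*\* p < 0.01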

## Intuition behind DND

• Diff-in-diff models are the quintessential example of exploiting natural experiments

• A major change at a point in time (change in law, a natural disaster, political crisis) separates groups where one is affected and another is not—identifies the effect of the change (treatment)

• One of the cleanest and clearest causal identification strategies

# Example II: “The” Card-Krueger Minimum Wage Study

## Example: “The” Card-Krueger Minimum Wage Study I

Example

The controversial minimum wage study, Card & Krueger (1994), is a quintessential (and clever) diff-in-diff.

Card, David, Krueger, Alan B, (1994), “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania,” American Economic Review 84 (4): 772–793

## Card & Krueger (1994): Background I

• Card & Krueger (1994) compare employment in fast food restaurants on the New Jersey and Pennsylvania sides of the border between February and November 1992.

• Pennsylvania & New Jersey both had a minimum wage of \$4.25 before April 1992
• On April 1, 1992, New Jersey raised its minimum wage from \$4.25 to \$5.05

## Card & Krueger (1994): Background II

• If we look only at New Jersey before & after the change:
• Omitted variable bias: macroeconomic variables (there’s a recession!), weather, etc.
• Including PA as a control group accounts for these time-varying effects if they are common (e.g. national) trends
• Surveyed about 400 fast food restaurants in New Jersey and eastern Pennsylvania, before & after the minimum wage increase
• Key assumption: Pennsylvania and New Jersey follow parallel trends
• Counterfactual: if not for the minimum wage increase, NJ employment would have changed similarly to PA employment

## Card & Krueger (1994): Model

$\widehat{\text{Employment}_{i t}}=\beta_0+\beta_1 \, \text{NJ}_{i}+\beta_2 \, \text{After}_{t}+\beta_3 \, (\text{NJ}_i \times \text{After}_t)$

• PA Before: $\beta_0$

• PA After: $\beta_0+\beta_2$

• NJ Before: $\beta_0+\beta_1$

• NJ After: $\beta_0+\beta_1+\beta_2+\beta_3$

• Diff-in-diff: $(\text{NJ}_{after}-\text{NJ}_{before})-(\text{PA}_{after}-\text{PA}_{before})$

| | PA | NJ | Group Diff $(\Delta Y_i)$ |
|---|---|---|---|
| Before | $\beta_0$ | $\beta_0+\beta_1$ | $\beta_1$ |
| After | $\beta_0+\beta_2$ | $\beta_0+\beta_1+\beta_2+\beta_3$ | $\beta_1+\beta_3$ |
| Time Diff $(\Delta Y_t)$ | $\beta_2$ | $\beta_2+\beta_3$ | Diff-in-diff $\Delta_i \Delta_t: \beta_3$ |
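
The counterfactual logic behind this table can be sketched with a few lines of arithmetic. The employment figures below are hypothetical round numbers chosen for illustration; they are not the estimates reported by Card & Krueger.

```r
# Hypothetical average employment per restaurant (illustrative numbers only,
# NOT the estimates from Card & Krueger)
pa_before <- 23; pa_after <- 21
nj_before <- 20; nj_after <- 21

# Counterfactual under parallel trends: NJ would have changed as PA did
nj_counterfactual <- nj_before + (pa_after - pa_before)

# Treatment effect = actual NJ after - counterfactual NJ after
effect <- nj_after - nj_counterfactual

# Same number from the diff-in-diff formula
did <- (nj_after - nj_before) - (pa_after - pa_before)

c(counterfactual = nj_counterfactual, effect = effect, did = did)
```

In this made-up case, employment falls in PA and rises slightly in NJ, so the estimated effect is positive even though the simple before-and-after change in NJ is small.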