Final Review

ECON 480 • Econometrics • Fall 2022

Dr. Ryan Safner

Associate Professor of Economics

safner@hood.edu

ryansafner/metricsF22

metricsF22.classes.ryansafner.com

- Causality
  - Fundamental problem of causal inference, potential outcomes
  - DAGs, front-doors/back-doors, controlling

- Multivariate OLS
  - Omitted Variable Bias
  - Variance/Multicollinearity

- Categorical data
  - Interpreting dummies, group means
  - Using categorical variables as dummies
  - dummy variable trap
  - interaction effects

- Nonlinear Models & Transforming Variables
  - quadratic model
  - higher-order polynomials
  - logs
  - standardizing variables
  - joint hypothesis (F-tests)

- Panel data
  - pooled regression & problems
  - fixed effects

- Difference-in-differences

- Instrumental variables

What are the two conditions for a variable \(Z\) to cause **omitted variable bias** if it is left out of the regression?
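As a rough illustration of why both conditions matter (\(Z\) must affect \(Y\), and \(Z\) must be correlated with \(X\)), here is a minimal Python sketch. The data and the "true" slopes of 2 and 3 are made up for illustration. It checks the in-sample identity that the slope from the regression omitting \(Z\) equals the true slope on \(X\) plus (effect of \(Z\) on \(Y\)) times (slope of \(Z\) on \(X\)).

```python
def slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    return cov_xy / var_x

# Hypothetical data: Z is correlated with X, and Z affects Y
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1 + 2 * xi + 3 * zi for xi, zi in zip(x, z)]  # assumed true slopes: 2 on X, 3 on Z

short_slope = slope(x, y)   # regression of Y on X that omits Z
delta = slope(x, z)         # how Z moves with X
implied = 2 + 3 * delta     # true slope plus the bias term

print(short_slope, implied)
```

If either piece of the bias term is zero (\(Z\) does not affect \(Y\), or \(Z\) is uncorrelated with \(X\)), the bias term vanishes and the short regression recovers the true slope.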

\[Wages_i=\beta_0+\beta_1 \, Education_i + \beta_2 \, Age_i + \beta_3 \, Experience_i + u_i\]

Suppose \(Education_i\) and \(Age_i\) are highly correlated

- Does this bias \(\hat{\beta_1}\) and \(\hat{\beta_2}\)?

- What will happen to the variance of \(\hat{\beta_1}\) and \(\hat{\beta_2}\)?
- How can we measure this?
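One common measure is the variance inflation factor, \(VIF_j = \frac{1}{1-R_j^2}\), where \(R_j^2\) comes from an auxiliary regression of \(X_j\) on the other regressors. A minimal Python sketch, using assumed (not estimated) auxiliary \(R^2\) values:

```python
def vif(r2_aux):
    """Variance inflation factor given the R^2 of the auxiliary regression
    of one regressor on all the other regressors."""
    if not 0 <= r2_aux < 1:
        raise ValueError("auxiliary R^2 must be in [0, 1)")
    return 1 / (1 - r2_aux)

# Hypothetical auxiliary R^2 values (assumed, not from real data)
for r2 in (0.0, 0.5, 0.9):
    print(r2, vif(r2))
```

The closer the auxiliary \(R^2\) is to 1, the more the variance of that coefficient's estimate is inflated. The estimate itself stays unbiased.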

\[Cholesterol_i=\beta_0+\beta_1 \, Treated_i+u_i\]

- \(Treated_i\) is a dummy variable \(= \begin{cases} 1 & \text{if person received treatment}\\ 0 & \text{if person did not receive treatment}\\ \end{cases}\)

- What is \(\hat{\beta_0}\)?

- What is \(\hat{\beta_1}\)?

- What is the average cholesterol level for someone who received treatment?
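A small Python sketch of the link between a dummy-variable regression and group means. The cholesterol readings are invented for illustration. It runs the simple OLS formulas on a 0/1 regressor and compares the result to the two group averages.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical cholesterol readings (made up for illustration)
control = [210.0, 220.0, 230.0]   # Treated = 0
treated = [180.0, 190.0, 200.0]   # Treated = 1

# Stack the data and apply the simple OLS formulas
d = [0.0] * len(control) + [1.0] * len(treated)
y = control + treated
d_bar, y_bar = mean(d), mean(y)
beta1_hat = (sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
             / sum((di - d_bar) ** 2 for di in d))
beta0_hat = y_bar - beta1_hat * d_bar

print(beta0_hat, mean(control))              # intercept vs. untreated mean
print(beta1_hat, mean(treated) - mean(control))  # slope vs. difference in means
print(beta0_hat + beta1_hat, mean(treated))  # sum vs. treated mean
```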

\[Y_i=\beta_0+\beta_1 \, Red_i+\beta_2 \, Orange_i+\beta_3 \, Yellow_i+\beta_4 \, Green_i+\beta_5 \, Blue_i\]

Suppose the **color** of observation \(i\) can be either \(\{\)Red, Orange, Yellow, Green, Blue, Purple \(\}\)

- What is \(\hat{\beta_0}\)?

- What is \(\hat{\beta_1}\)?

- What is the average value of \(Y_i\) for \(Green\) observations?

- Why can’t we add \(\beta_6 \, Purple_i\)?
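A sketch of the algebra, assuming Purple is the omitted reference category and the error term has conditional mean zero:

\[\begin{aligned} E[Y_i | \text{Purple}] &= \beta_0 \\ E[Y_i | \text{Green}] &= \beta_0+\beta_4 \\ Red_i+Orange_i+Yellow_i+Green_i+Blue_i+Purple_i &= 1 \text{ for every } i \end{aligned}\]

The last line is the dummy variable trap: the six dummies always sum to the constant, so including all of them alongside \(\beta_0\) creates perfect multicollinearity.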

\[\widehat{Utility}_i=\beta_0+\beta_1 \, Eggs_i+\beta_2 \, Breakfast_i+\beta_3 (Eggs_i \times Breakfast_i)\]

\(Breakfast_i\) is a dummy variable \(= \begin{cases} 1 & \text{if meal i is breakfast}\\ 0 & \text{if meal i is not breakfast}\\ \end{cases}\)

- What is \(\hat{\beta_1}\)?

- What is \(\hat{\beta_2}\)?

- What is \(\hat{\beta_3}\)?

- We have two regressions (one for Breakfast; one for Not Breakfast)
  - how can we determine if the intercepts are different?
  - how can we determine if the slopes are different?
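Plugging each value of the dummy into the interaction model gives the two regressions (a sketch of the algebra, using the same symbols as the equation above):

\[\begin{aligned} \text{Not breakfast } (Breakfast_i=0): \quad \widehat{Utility}_i &= \beta_0+\beta_1 \, Eggs_i \\ \text{Breakfast } (Breakfast_i=1): \quad \widehat{Utility}_i &= (\beta_0+\beta_2)+(\beta_1+\beta_3) \, Eggs_i \end{aligned}\]

The intercepts differ by \(\beta_2\) and the slopes differ by \(\beta_3\), so a \(t\)-test on each of those coefficients addresses the two questions.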

\[\widehat{Utility}_i=2+4\text{ Ice Cream Cones}_i-1\text{ Ice Cream Cones}_i^2\]

- What is the marginal effect of eating 1 more Ice Cream Cone?

- What if we *start* with 1 Ice Cream Cone?

- What if we *start* with 4 Ice Cream Cones?

- What amount of ice cream cones will *maximize* utility?

- How would we know if we should add \(\text{Ice Cream Cones}_i^3\)?
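For a quadratic model, the marginal effect depends on the starting value: it is the derivative \(\hat{\beta_1}+2\hat{\beta_2}X\). A minimal Python sketch using the coefficients from the equation above (the function names are illustrative):

```python
# Coefficients taken from the estimated equation above
beta1 = 4.0    # coefficient on Ice Cream Cones
beta2 = -1.0   # coefficient on Ice Cream Cones squared

def marginal_effect(cones):
    """Derivative of predicted utility with respect to cones: beta1 + 2*beta2*cones."""
    return beta1 + 2 * beta2 * cones

def maximizing_cones():
    """Set the derivative to zero: cones* = -beta1 / (2*beta2)."""
    return -beta1 / (2 * beta2)

print(marginal_effect(1))   # 2.0
print(marginal_effect(4))   # -4.0
print(maximizing_cones())   # 2.0
```

Because \(\hat{\beta_2}<0\), the turning point at 2 cones is a maximum rather than a minimum.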

\[\ln(GDP_i)=10+2\text{ population (in billions)}_i\]

- Interpret \(\hat{\beta_1}\) in context.

\[\ln(GDP_i)=10+0.1 \, \ln(\text{population}_i)\]

- Interpret \(\hat{\beta_1}\) in context.
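A short Python sketch contrasting the two log models above. It shows the usual rule-of-thumb reading of each coefficient next to the exact percent change the model implies. The variable names are illustrative.

```python
import math

# Log-linear model: ln(GDP) = 10 + 2 * population (in billions)
b_loglin = 2.0
approx_pct = 100 * b_loglin                 # rule of thumb: 100 * beta1 percent per 1-unit change
exact_pct = 100 * (math.exp(b_loglin) - 1)  # exact percent change implied by the model

# Log-log model: ln(GDP) = 10 + 0.1 * ln(population)
b_loglog = 0.1                                   # elasticity: 1% more population, about 0.1% more GDP
exact_loglog_pct = 100 * (1.01 ** b_loglog - 1)  # exact percent change in GDP for a 1% change

print(approx_pct, exact_pct, exact_loglog_pct)
```

The rule of thumb is an approximation that works well for small coefficients. With a coefficient as large as 2, the exact figure is much larger than the rule-of-thumb 200%.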

- Explain *what* an \(F\)-test is used for.

- Explain *how* an \(F\)-statistic is estimated (roughly).
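Roughly, the \(F\)-statistic compares how much better the unrestricted model fits than the restricted model, per restriction, relative to the unexplained variation that remains. A minimal Python sketch of the homoskedasticity-only version, with hypothetical \(R^2\) values and sample size:

```python
def f_statistic(r2_unrestricted, r2_restricted, q, n, k):
    """Homoskedasticity-only F-statistic for q joint restrictions.

    n = number of observations, k = number of regressors in the
    unrestricted model (not counting the intercept).
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k - 1)
    return numerator / denominator

# Hypothetical values, chosen only for illustration
f_stat = f_statistic(r2_unrestricted=0.50, r2_restricted=0.40, q=2, n=103, k=2)
print(f_stat)
```

A large value suggests the restricted model fits significantly worse, so the joint null hypothesis would be rejected.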

Consider a two-way fixed effects model:

\[\text{Divorce Rate}_{it}=\beta_1 \text{Divorce Law}_{it}+\alpha_i+\theta_t+\epsilon_{it}\]

for State \(i\) at time \(t\)

- Why do we need \(\alpha_i\) and \(\theta_t\)?

- What sorts of things are in \(\alpha_i\)?

- What sorts of things are in \(\theta_t\)?
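To see what the two sets of fixed effects absorb, here is a minimal Python sketch on a made-up balanced panel. The units, periods, effects, and the "true" slope of 2 are all assumptions for illustration. Subtracting unit means and period means (and adding back the grand mean) removes anything constant within a unit or common to a period.

```python
# Hypothetical balanced panel: 3 units observed in 3 periods (made-up numbers)
units = ["A", "B", "C"]
periods = [0, 1, 2]
alpha = {"A": 1.0, "B": 5.0, "C": 10.0}   # unit effects: constant over time
theta = {0: 0.0, 1: 3.0, 2: -2.0}         # time effects: common to all units
x = {("A", 0): 1.0, ("A", 1): 2.0, ("A", 2): 4.0,
     ("B", 0): 0.0, ("B", 1): 3.0, ("B", 2): 1.0,
     ("C", 0): 2.0, ("C", 1): 2.0, ("C", 2): 5.0}
true_beta = 2.0  # assumed for illustration
y = {(i, t): true_beta * x[i, t] + alpha[i] + theta[t]
     for i in units for t in periods}

def two_way_demean(v):
    """Subtract unit means and period means, then add back the grand mean."""
    unit_mean = {i: sum(v[i, t] for t in periods) / len(periods) for i in units}
    time_mean = {t: sum(v[i, t] for i in units) / len(units) for t in periods}
    grand = sum(v.values()) / len(v)
    return {(i, t): v[i, t] - unit_mean[i] - time_mean[t] + grand
            for i in units for t in periods}

x_dm, y_dm = two_way_demean(x), two_way_demean(y)
beta_hat = (sum(x_dm[key] * y_dm[key] for key in x_dm)
            / sum(x_dm[key] ** 2 for key in x_dm))
print(beta_hat)
```

This simple demeaning shortcut relies on the panel being balanced. With missing unit-period observations, the fixed effects are usually estimated with dummies or a dedicated routine instead.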

Suppose Maryland passes a law (and other States do not) that affects crime rates. Consider the following model:

\[\text{Crime Rate}_{it}=\beta_0+\beta_1 \, \text{Maryland}_{i}+\beta_2 \, \text{After}_t+\beta_3 \, (\text{Maryland}_i \times \text{After}_t)\]

for State \(i\) at time \(t\)

- What must we assume about Maryland over time?

- What is the average crime rate for other states before the law?

- What is the average crime rate for Maryland after the law?

- What is the *causal effect* of passing the law?
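Each coefficient in the difference-in-differences model maps to a combination of four group averages. A minimal Python sketch with invented crime rates (none of these numbers come from real data):

```python
# Hypothetical average crime rates (made-up numbers, for illustration only)
other_before = 50.0      # beta0
maryland_before = 60.0   # beta0 + beta1
other_after = 45.0       # beta0 + beta2
maryland_after = 48.0    # beta0 + beta1 + beta2 + beta3

beta0 = other_before
beta1 = maryland_before - other_before
beta2 = other_after - other_before
# Difference-in-differences: Maryland's change minus the other states' change
beta3 = (maryland_after - maryland_before) - (other_after - other_before)

print(beta0, beta1, beta2, beta3)
```

Reading \(\hat{\beta_3}\) as causal rests on the parallel trends assumption: absent the law, Maryland's crime rate would have changed by the same amount as the other states'.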

- What are the two conditions required for an instrument to be **valid**?
  - How is this different from the conditions for omitted variable bias?

- How can we test each condition?

- How do we run a two-stage least squares regression?
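The two stages can be sketched by hand for the simplest case of one instrument and one endogenous regressor. This is a minimal Python illustration with made-up data. The helper `ols` and the variable names are assumptions, not part of any library.

```python
def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

# Hypothetical data: z is the instrument, x the endogenous regressor, y the outcome
z = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
x = [1.0, 3.0, 2.0, 5.0, 1.5, 4.0]
y = [2.0, 5.0, 4.5, 9.0, 3.0, 7.0]

# Stage 1: regress x on z and keep the fitted values
g0, g1 = ols(z, x)
x_hat = [g0 + g1 * zi for zi in z]

# Stage 2: regress y on the fitted values from stage 1
b0_2sls, b1_2sls = ols(x_hat, y)

# With one instrument, the same slope is (slope of y on z) / (slope of x on z)
_, reduced_form = ols(z, y)
print(b1_2sls, reduced_form / g1)
```

Running the second stage by hand reproduces the point estimate, but its standard errors are not correct because they ignore that the fitted values were themselves estimated. In practice a dedicated IV routine handles this.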