4.3 — Nonlinearity & Transformations

ECON 480 • Econometrics • Fall 2022

Dr. Ryan Safner
Associate Professor of Economics

safner@hood.edu
ryansafner/metricsF22
metricsF22.classes.ryansafner.com

Contents

Nonlinear Effects

Polynomial Models

Quadratic Model

Logarithmic Models

Linear-Log Model

Log-Linear Model

Log-Log Model

Standardizing & Comparing Across Units

Joint Hypothesis Testing

Nonlinear Effects

Linear Regression

  • OLS is commonly known as “linear regression” as it fits a straight line to data points

  • Often, data and relationships between variables may not be linear

Linear Regression

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP}_i$

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP}_i + \hat{\beta}_2 \text{GDP}_i^2$

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \ln \text{GDP}_i$

Sources of Nonlinearities

  • Effect of $X_1 \rightarrow Y$ might be nonlinear if:
  1. $X_1 \rightarrow Y$ is different for different levels of $X_1$
    • e.g. diminishing returns: $\uparrow X_1$ increases $Y$ at a decreasing rate
    • e.g. increasing returns: $\uparrow X_1$ increases $Y$ at an increasing rate
  2. $X_1 \rightarrow Y$ is different for different levels of $X_2$
    • e.g. interaction effects (last lesson)

Nonlinearities Alter Marginal Effects

  • Linear:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$

  • marginal effect (slope), $\hat{\beta}_1 = \frac{\Delta Y}{\Delta X}$, is constant for all $X$

Nonlinearities Alter Marginal Effects

  • Polynomial:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2$

  • Marginal effect, “slope” ($\neq \hat{\beta}_1$), depends on the value of $X$!

Nonlinearities Alter Marginal Effects

  • Interaction Effect:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_1 \times X_2$

  • Marginal effect, “slope” depends on the value of $X_2$!

  • Easy example: if $X_2$ is a dummy variable:

    • $X_2 = 0$ (control) vs. $X_2 = 1$ (treatment)

Polynomial Models

Polynomial Functions of X I

  • Linear

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$

  • Quadratic

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2$

  • Cubic

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2 + \hat{\beta}_3 X^3$

  • Quartic

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\beta}_2 X^2 + \hat{\beta}_3 X^3 + \hat{\beta}_4 X^4$

Polynomial Functions of X II

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_r X_i^r + u_i$

  • Where $r$ is the highest power $X_i$ is raised to
    • quadratic: $r = 2$
    • cubic: $r = 3$
  • The graph of an $r$th-degree polynomial function has $(r-1)$ bends
  • Just another multivariate OLS regression model!

Quadratic Model

Quadratic Model

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2$

  • Quadratic model has $X$ and $X^2$ variables in it (yes, need both!)
  • How to interpret coefficients (betas)?
    • $\beta_0$ as “intercept” and $\beta_1$ as “slope” makes no sense 🧐
    • $\beta_1$ as the effect of $X_i \rightarrow Y_i$ holding $X_i^2$ constant??¹
  • Estimate marginal effects by calculating predicted $\hat{Y}_i$ for different levels of $X_i$
  1. Note: this is not a perfect multicollinearity problem! Correlation only measures linear relationships!

Quadratic Model: Calculating Marginal Effects

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2$

  • What is the marginal effect of $\Delta X_i \rightarrow \Delta Y_i$?
  • Take the derivative of $Y_i$ with respect to $X_i$:

$\frac{\partial Y_i}{\partial X_i} = \hat{\beta}_1 + 2\hat{\beta}_2 X_i$

  • Marginal effect of a 1 unit change in $X_i$ is a $(\hat{\beta}_1 + 2\hat{\beta}_2 X_i)$ unit change in $Y$

Quadratic Model: Example I

Example

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP per capita}_i + \hat{\beta}_2 \text{GDP per capita}_i^2$

  • Use gapminder package and data
library(gapminder)

Quadratic Model: Example II

  • These coefficients will be very large, so let’s transform gdpPercap to be in $1,000’s
gapminder <- gapminder %>%
  mutate(GDP_t = gdpPercap/1000)

gapminder %>% head() # look at it
country     continent year lifeExp pop
Afghanistan Asia      1952 28.801  8425333
Afghanistan Asia      1957 30.332  9240934
Afghanistan Asia      1962 31.997  10267083
Afghanistan Asia      1967 34.020  11537966
Afghanistan Asia      1972 36.088  13079460
Afghanistan Asia      1977 38.438  14880372
6 rows | 1-5 of 7 columns

Quadratic Model: Example III

  • Let’s also create a squared term, GDP_sq
gapminder <- gapminder %>%
  mutate(GDP_sq = GDP_t^2)

gapminder %>% head() # look at it
country     continent year lifeExp pop
Afghanistan Asia      1952 28.801  8425333
Afghanistan Asia      1957 30.332  9240934
Afghanistan Asia      1962 31.997  10267083
Afghanistan Asia      1967 34.020  11537966
Afghanistan Asia      1972 36.088  13079460
Afghanistan Asia      1977 38.438  14880372
6 rows | 1-5 of 8 columns

Quadratic Model: Example IV

  • Can “manually” run a multivariate regression with GDP_t and GDP_sq
library(broom)
reg1 <- lm(lifeExp ~ GDP_t + GDP_sq, data = gapminder)

reg1 %>% tidy()
term        estimate     std.error    statistic p.value
(Intercept) 50.52400578  0.2978134673 169.64984 0.000000e+00
GDP_t       1.55099112   0.0373734945 41.49976  1.292863e-260
GDP_sq      -0.01501927  0.0005794139 -25.92149 3.935809e-125
3 rows

Quadratic Model: Example V

  • OR use GDP_t and add the I() operator to transform the variable in the regression, I(GDP_t^2)¹
reg1_alt <- lm(lifeExp ~ GDP_t + I(GDP_t^2), data = gapminder)

reg1_alt %>% tidy()
term        estimate     std.error    statistic p.value
(Intercept) 50.52400578  0.2978134673 169.64984 0.000000e+00
GDP_t       1.55099112   0.0373734945 41.49976  1.292863e-260
I(GDP_t^2)  -0.01501927  0.0005794139 -25.92149 3.935809e-125
3 rows
  1. Here is a decent explanation of what I() does. An alternative is to use poly(GDP_t, 2) to make the squared term, but this has some issues.

Quadratic Model: Example VI


$\widehat{\text{Life Expectancy}}_i = 50.52 + 1.55\,\text{GDP}_i - 0.02\,\text{GDP}_i^2$

  • Positive effect ($\hat{\beta}_1 > 0$), with diminishing returns ($\hat{\beta}_2 < 0$)

  • Marginal effect of GDP on Life Expectancy depends on initial value of GDP!

Quadratic Model: Example VII

  • Marginal effect of GDP on Life Expectancy:

$\frac{\partial Y}{\partial X} = \hat{\beta}_1 + 2\hat{\beta}_2 X_i$

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} \approx 1.55 + 2(-0.02)\,\text{GDP} \approx 1.55 - 0.04\,\text{GDP}$

Quadratic Model: Example VIII

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04\,\text{GDP}$

Marginal effect of GDP if GDP $= 5$ ($ thousand):

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04(5) = 1.55 - 0.20 = 1.35$

  • i.e. for every additional $1 (thousand) in GDP per capita, average life expectancy increases by 1.35 years

Quadratic Model: Example IX

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04\,\text{GDP}$

Marginal effect of GDP if GDP $= 25$ ($ thousand):

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04(25) = 1.55 - 1.00 = 0.55$

  • i.e. for every additional $1 (thousand) in GDP per capita, average life expectancy increases by 0.55 years

Quadratic Model: Example X

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04\,\text{GDP}$

Marginal effect of GDP if GDP $= 50$ ($ thousand):

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04(50) = 1.55 - 2.00 = -0.45$

  • i.e. for every additional $1 (thousand) in GDP per capita, average life expectancy decreases by 0.45 years

Quadratic Model: Example XI

$\widehat{\text{Life Expectancy}}_i = 50.52 + 1.55\,\text{GDP per capita}_i - 0.02\,\text{GDP per capita}_i^2$

$\frac{\partial \text{Life Expectancy}}{\partial \text{GDP}} = 1.55 - 0.04\,\text{GDP}$

Initial GDP per capita   Marginal Effect¹
$5,000                   1.35 years
$25,000                  0.55 years
$50,000                  −0.45 years
  1. Of +$1,000 GDP/capita on Life Expectancy.
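As a check, a minimal sketch computing these margins in R from reg1’s unrounded coefficients (the table above uses the rounded values 1.55 and −0.02, so the exact numbers differ somewhat; b1, b2, and marginal_effect are names introduced here):

b1 <- coef(reg1)["GDP_t"]   # ≈ 1.551
b2 <- coef(reg1)["GDP_sq"]  # ≈ -0.015
marginal_effect <- function(gdp) b1 + 2 * b2 * gdp # dY/dX, in years per $1,000
marginal_effect(c(5, 25, 50)) # ≈ 1.40, 0.80, 0.05 with the unrounded estimates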

Quadratic Model: Example XII

Code
ggplot(data = gapminder)+
  aes(x = GDP_t,
      y = lifeExp)+
  geom_point(color = "blue", alpha=0.5)+
  stat_smooth(method = "lm",
              formula = y ~ x + I(x^2),
              color = "green")+ 
  geom_vline(xintercept = c(5,25,50),
             linetype = "dashed",
             color = "red", size = 1)+
  scale_x_continuous(labels = scales::dollar,
                     breaks = seq(0,120,10))+
  scale_y_continuous(breaks = seq(0,100,10),
                     limits = c(0,100))+
  labs(x = "GDP per Capita (in Thousands)",
       y = "Life Expectancy (Years)")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size=16)

Quadratic Model: Maxima and Minima I

  • For a polynomial model, we can also find the predicted maximum or minimum of $\hat{Y}_i$
  • A quadratic model has a single global maximum or minimum (1 bend)
  • By calculus, a minimum or maximum occurs where:

$\frac{\partial Y_i}{\partial X_i} = 0$

$\beta_1 + 2\beta_2 X_i = 0$

$2\beta_2 X_i = -\beta_1$

$X_i^* = -\frac{\beta_1}{2\beta_2}$

Quadratic Model: Maxima and Minima II

Using the estimates from the quadratic regression above:

$\text{GDP}_i^* = -\frac{\beta_1}{2\beta_2} = -\frac{(1.55)}{2(-0.015)} \approx 51.67$
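The same calculation in R, directly from reg1’s coefficients:

-coef(reg1)["GDP_t"] / (2 * coef(reg1)["GDP_sq"]) # ≈ 51.63 (the rounded values above give 51.67)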

Quadratic Model: Maxima and Minima III

Code
ggplot(data = gapminder)+
  aes(x = GDP_t,
      y = lifeExp)+
  geom_point(color = "blue", alpha=0.5)+
  stat_smooth(method = "lm",
              formula = y ~ x + I(x^2),
              color = "green")+
  geom_vline(xintercept=51.67, linetype="dashed", color="red", size = 1)+
  geom_label(x=51.67, y=90, label="$51.67", color="red")+
  scale_x_continuous(labels = scales::dollar,
                     breaks = seq(0,120,10))+
  scale_y_continuous(breaks = seq(0,100,10),
                     limits = c(0,100))+
  labs(x = "GDP per Capita (in Thousands)",
       y = "Life Expectancy (Years)")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size=16)

Determining If Polynomials Are Necessary I

  • Is the quadratic term necessary?
  • Determine if $\hat{\beta}_2$ (on $X_i^2$) is statistically significant:
    • $H_0: \hat{\beta}_2 = 0$
    • $H_a: \hat{\beta}_2 \neq 0$
  • In the quadratic regression above, GDP_sq has $t \approx -25.9$ ($p < 0.001$): highly significant
  • Statistically significant $\implies$ we should keep the quadratic model
    • If we only ran a linear model, it would be incorrect!

Determining If Polynomials Are Necessary II

  • Should we keep going up in polynomials?

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP}_i + \hat{\beta}_2 \text{GDP}_i^2 + \hat{\beta}_3 \text{GDP}_i^3$

Determining If Polynomials Are Necessary III

  • In general, you should have a compelling theoretical reason why data or relationships should “change direction” multiple times

  • Or clear data patterns that have multiple “bends”

  • Recall, we care more about accurately measuring the causal effect of $X \rightarrow Y$ than about getting the most accurate prediction possible for $\hat{Y}$

Determining If Polynomials Are Necessary IV

term        estimate       std.error     statistic p.value
(Intercept) 47.4755069510  0.3227816126  147.08244 0.000000e+00
GDP_t       2.7226370698   0.0742733639  36.65698  2.816848e-217
I(GDP_t^2)  -0.0681545071  0.0030340927  -22.46290 4.574160e-98
I(GDP_t^3)  0.0004093149   0.0000230101  17.78849  4.713120e-65
4 rows
  • $\hat{\beta}_3$ is statistically significant…
  • …but can we really think of a good reason to complicate the model?

If You Kept Going…

term        estimate       std.error     statistic p.value
(Intercept) 4.003294e+01   5.846282e-01  68.475901 0.000000e+00
GDP_t       8.722968e+00   5.290582e-01  16.487728 9.307692e-57
I(GDP_t^2)  -1.081312e+00  1.294759e-01  -8.351460 1.383586e-16
I(GDP_t^3)  7.190930e-02   1.334295e-02  5.389309  8.066753e-08
I(GDP_t^4)  -2.705563e-03  7.010624e-04  -3.859233 1.180104e-04
I(GDP_t^5)  6.063170e-05   2.056983e-05  2.947604  3.246284e-03
I(GDP_t^6)  -8.254873e-07  3.495442e-07  -2.361610 1.830836e-02
I(GDP_t^7)  6.685309e-09   3.408241e-09  1.961513  4.998276e-02
I(GDP_t^8)  -2.956581e-11  1.766287e-11  -1.673896 9.433565e-02
I(GDP_t^9)  5.490732e-14   3.765889e-14  1.458017  1.450211e-01
10 rows
  • It takes until a 9th-degree polynomial for one of the terms to become insignificant…

  • …but does this make the model better? more interpretable?

  • A famous problem of overfitting

If You Kept Going…Visually

[Plots: a 4th-degree polynomial, a 9th-degree polynomial, and a 14th-degree polynomial fit to the same data]

Strategy for Polynomial Model Specification

  1. Are there good theoretical reasons for relationships changing (e.g. increasing/decreasing returns)?
  2. Plot your data: does a straight line fit well enough?
  3. Specify a polynomial function of a higher power (start with 2) and estimate OLS regression
  4. Use a t-test to determine if the higher-power term is significant
  5. Interpret the effect of a change in $X$ on $Y$
  6. Repeat steps 3-5 as necessary (if there are good theoretical reasons)

Logarithmic Models

Linear Regression

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP}_i$

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{GDP}_i + \hat{\beta}_2 \text{GDP}_i^2$

$\widehat{\text{Life Expectancy}}_i = \hat{\beta}_0 + \hat{\beta}_1 \ln \text{GDP}_i$

Logarithmic Models

  • Another useful model for nonlinear data is the logarithmic model¹
    • We transform either $X$, $Y$, or both by taking the (natural) logarithm
  • The logarithmic model has two additional advantages
    1. We can easily interpret coefficients as percentage changes or elasticities
    2. Useful economic shape: diminishing returns (production functions, utility functions, etc.)

  1. Don’t confuse this with a logistic (logit) model for dependent dummy variables.

The Natural Logarithm

  • The exponential function, $Y = e^X$ or $Y = \exp(X)$, where the base $e = 2.71828...$

  • The natural logarithm is its inverse, $Y = \ln(X)$

The Natural Logarithm: Review I

  • Exponents are defined as

$b^n = \underbrace{b \times b \times \cdots \times b}_{n \text{ times}}$

  • where base $b$ is multiplied by itself $n$ times
  • Example: $2^3 = \underbrace{2 \times 2 \times 2}_{n=3} = 8$
  • Logarithms are the inverse, defined as the exponents in the expressions above

If $b^n = y$, then $\log_b(y) = n$

  • $n$ is the number you must raise $b$ to in order to get $y$
  • Example: $\log_2(8) = 3$

The Natural Logarithm: Review II

  • Logarithms can have any base, but it is common to use the natural logarithm ($\ln$) with base $e = 2.71828...$

If $e^n = y$, then $\ln(y) = n$

The Natural Logarithm: Properties

  • Natural logs have a lot of useful properties:
    1. $\ln\left(\frac{1}{x}\right) = -\ln(x)$
    2. $\ln(ab) = \ln(a) + \ln(b)$
    3. $\ln\left(\frac{x}{a}\right) = \ln(x) - \ln(a)$
    4. $\ln(x^a) = a\ln(x)$
    5. $\frac{d \ln x}{dx} = \frac{1}{x}$

The Natural Logarithm: Example

  • Most useful property: for a small change in $x$, $\Delta x$:

$\underbrace{\ln(x + \Delta x) - \ln(x)}_{\text{Difference in logs}} \approx \underbrace{\frac{\Delta x}{x}}_{\text{Relative change}}$

Example

Let $x = 100$ and $\Delta x = 1$; the relative change is:

$\frac{\Delta x}{x} = \frac{(101 - 100)}{100} = 0.01 \text{ or } 1\%$

  • The logged difference:

$\ln(101) - \ln(100) = 0.00995 \approx 1\%$

  • This allows us to very easily interpret coefficients as percent changes or elasticities
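A quick check of the example above in R:

log(101) - log(100) # difference in logs: 0.00995
(101 - 100) / 100   # relative change: 0.01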

Elasticity

  • An elasticity between any two variables, $\epsilon_{Y,X}$, describes the responsiveness (in %) of one variable ($Y$) to a change in another ($X$)

$\epsilon_{Y,X} = \frac{\%\Delta Y}{\%\Delta X} = \frac{\left(\frac{\Delta Y}{Y}\right)}{\left(\frac{\Delta X}{X}\right)}$

  • Numerator is the relative change in $Y$; denominator is the relative change in $X$
  • Interpretation: a 1% change in $X$ will cause a $\epsilon_{Y,X}$% change in $Y$

Math FYI: Cobb Douglas Functions and Logs

  • One of the (many) reasons why economists love Cobb-Douglas functions:

$Y = AL^\alpha K^\beta$

  • Taking logs, the relationship becomes linear:

$\ln(Y) = \ln(A) + \alpha\ln(L) + \beta\ln(K)$

  • With data on $(Y, L, K)$ and linear regression, we can estimate $\alpha$ and $\beta$ (see the sketch below)
    • $\alpha$: elasticity of $Y$ with respect to $L$
      • A 1% change in $L$ will lead to an $\alpha$% change in $Y$
    • $\beta$: elasticity of $Y$ with respect to $K$
      • A 1% change in $K$ will lead to a $\beta$% change in $Y$
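To see this in action, a hedged sketch with simulated (not course) data; all names here (n, L, K, Y, cobb_reg) are hypothetical:

set.seed(42)
n <- 500
L <- runif(n, 1, 100)                            # labor input
K <- runif(n, 1, 100)                            # capital input
Y <- 2 * L^0.75 * K^0.25 * exp(rnorm(n, 0, 0.1)) # Cobb-Douglas with noise
cobb_reg <- lm(log(Y) ~ log(L) + log(K))         # linear in logs
tidy(cobb_reg) # slope estimates should be close to alpha = 0.75, beta = 0.25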

Math FYI: Cobb Douglas Functions and Logs

Example

$Y = 2L^{0.75}K^{0.25}$

  • Taking logs:

$\ln Y = \ln 2 + 0.75\ln L + 0.25\ln K$

  • A 1% change in L will yield a 0.75% change in output Y

  • A 1% change in K will yield a 0.25% change in output Y

Logarithms in R I

  • The log() function can easily take the logarithm
gapminder <- gapminder %>%
  mutate(loggdp = log(gdpPercap)) # log GDP per capita

gapminder %>% head() # look at it
country     continent year lifeExp pop
Afghanistan Asia      1952 28.801  8425333
Afghanistan Asia      1957 30.332  9240934
Afghanistan Asia      1962 31.997  10267083
Afghanistan Asia      1967 34.020  11537966
Afghanistan Asia      1972 36.088  13079460
Afghanistan Asia      1977 38.438  14880372
6 rows | 1-5 of 9 columns

Logarithms in R II

  • Note, log() by default is the natural logarithm ln(), i.e. base e
    • Can change base with e.g. log(x, base = 5)
    • Some common built-in logs: log10, log2
log10(100)
[1] 2
log2(16)
[1] 4
log(19683, base=3)
[1] 9

Logarithms in R III

  • Note when running a regression, you can pre-transform the data into logs (as I did above), or just add log() around a variable in the regression
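For instance, a minimal sketch of both approaches (the output below comes from the pre-transformed version):

lm(lifeExp ~ loggdp, data = gapminder) %>% tidy()         # pre-transformed variable
lm(lifeExp ~ log(gdpPercap), data = gapminder) %>% tidy() # log() inside the formula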
term        estimate  std.error statistic p.value
(Intercept) -9.100889 1.227674  -7.413117 1.934812e-13
loggdp      8.405085  0.148762  56.500206 0.000000e+00
2 rows

Types of Logarithmic Models

  • Three types of log regression models, depending on which variables we log
  1. Linear-log model: $Y_i = \beta_0 + \beta_1 \ln X_i$
  2. Log-linear model: $\ln Y_i = \beta_0 + \beta_1 X_i$
  3. Log-log model: $\ln Y_i = \beta_0 + \beta_1 \ln X_i$

Linear-Log Model

Linear-Log Model: Interpretation

  • Linear-log model has an independent variable ($X$) that is logged

$Y = \beta_0 + \beta_1 \ln X_i$

$\beta_1 = \frac{\Delta Y}{\left(\frac{\Delta X}{X}\right)}$

  • Marginal effect of $X \rightarrow Y$: a 1% change in $X \rightarrow$ a $\frac{\beta_1}{100}$ unit change in $Y$

Linear-Log Model in R
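A sketch of the regression likely behind this output; lin_log_reg is the name used in the modelsummary() comparison later in this lecture:

lin_log_reg <- lm(lifeExp ~ loggdp, data = gapminder)
lin_log_reg %>% tidy()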

term        estimate  std.error statistic p.value
(Intercept) -9.100889 1.227674  -7.413117 1.934812e-13
loggdp      8.405085  0.148762  56.500206 0.000000e+00
2 rows

$\widehat{\text{Life Expectancy}}_i = -9.10 + 8.41\,\ln\text{GDP}_i$

  • A 1% change in GDP → a $\frac{8.41}{100} = 0.0841$ year increase in Life Expectancy
  • A 25% fall in GDP → a $(25 \times 0.0841) = 2.1025$ year decrease in Life Expectancy
  • A 100% rise in GDP → a $(100 \times 0.0841) = 8.41$ year increase in Life Expectancy

Linear-Log Model Graph (Linear X-Axis)

Code
ggplot(data = gapminder)+
  aes(x = gdpPercap,
      y = lifeExp)+
  geom_point(color = "blue", alpha = 0.5)+
  geom_smooth(method = "lm",
              formula = y ~ log(x),
              color = "orange")+ 
  scale_x_continuous(labels = scales::dollar,
                     breaks = seq(0,120000,20000))+
  scale_y_continuous(breaks = seq(0,100,10),
                     limits = c(0,100))+
  labs(x = "GDP per Capita",
       y = "Life Expectancy (Years)")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size = 16)

Linear-Log Model Graph (Log X-Axis)

Code
ggplot(data = gapminder)+
  aes(x = loggdp,
      y = lifeExp)+
  geom_point(color = "blue", alpha = 0.5)+
  geom_smooth(method = "lm",
              formula = y ~ log(x),
              color = "orange")+ 
  scale_y_continuous(breaks = seq(0,100,10),
                     limits = c(0,100))+
  labs(x = "Log GDP per Capita",
       y = "Life Expectancy (Years)")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size = 16)

Log-Linear Model

Log-Linear Model: Interpretation

  • Log-linear model has the dependent variable ($Y$) logged

$\ln Y_i = \beta_0 + \beta_1 X$

$\beta_1 = \frac{\left(\frac{\Delta Y}{Y}\right)}{\Delta X}$

  • Marginal effect of $X \rightarrow Y$: a 1 unit change in $X \rightarrow$ a $\beta_1 \times 100$% change in $Y$

Log-Linear Model in R (Preliminaries)

  • We will again have very large/small coefficients if we deal with GDP directly, so let’s again transform gdpPercap into $1,000s, called gdp_t

  • Then log lifeExp

gapminder <- gapminder %>%
  mutate(gdp_t = gdpPercap/1000, # first make GDP/capita in $1000s
         loglife = log(lifeExp)) # take the log of LifeExp
gapminder %>% head() # look at it
country     continent year lifeExp pop
Afghanistan Asia      1952 28.801  8425333
Afghanistan Asia      1957 30.332  9240934
Afghanistan Asia      1962 31.997  10267083
Afghanistan Asia      1967 34.020  11537966
Afghanistan Asia      1972 36.088  13079460
Afghanistan Asia      1977 38.438  14880372
6 rows | 1-5 of 11 columns

Log-Linear Model in R
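A sketch of the regression likely behind this output; log_lin_reg is the name used in the modelsummary() comparison later:

log_lin_reg <- lm(loglife ~ gdp_t, data = gapminder)
log_lin_reg %>% tidy()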

term        estimate  std.error    statistic p.value
(Intercept) 3.966639  0.0058345501 679.85339 0.000000e+00
gdp_t       0.012917  0.0004777072 27.03958  2.920378e-134
2 rows

$\widehat{\ln \text{Life Expectancy}}_i = 3.967 + 0.013\,\text{GDP}_i$

  • A $1 (thousand) change in GDP → a $0.013 \times 100\% = 1.3\%$ increase in Life Expectancy
  • A $25 (thousand) fall in GDP → a $(25 \times 1.3\%) = 32.5\%$ decrease in Life Expectancy
  • A $100 (thousand) rise in GDP → a $(100 \times 1.3\%) = 130\%$ increase in Life Expectancy

Log-Linear Model Graph

Code
ggplot(data = gapminder)+
  aes(x = gdp_t,
      y = loglife)+ 
  geom_point(color = "blue", alpha = 0.5)+
  geom_smooth(method = "lm", color = "orange")+
  scale_x_continuous(labels = scales::dollar,
                     breaks = seq(0,120,20))+
  labs(x = "GDP per Capita ($ Thousands)",
       y = "Log Life Expectancy")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size = 16)

Log-Log Model

Log-Log Model

  • Log-log model has both variables ($X$ and $Y$) logged

$\ln Y_i = \beta_0 + \beta_1 \ln X_i$

$\beta_1 = \frac{\left(\frac{\Delta Y}{Y}\right)}{\left(\frac{\Delta X}{X}\right)}$

  • Marginal effect of $X \rightarrow Y$: a 1% change in $X \rightarrow$ a $\beta_1$% change in $Y$

  • $\beta_1$ is the elasticity of $Y$ with respect to $X$!

Log-Log Model in R
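A sketch of the regression likely behind this output; log_log_reg is the name used in the modelsummary() comparison later:

log_log_reg <- lm(loglife ~ loggdp, data = gapminder)
log_log_reg %>% tidy()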

term        estimate  std.error  statistic p.value
(Intercept) 2.864177  0.02328274 123.01718 0
loggdp      0.146549  0.00282126 51.94452  0
2 rows

$\widehat{\ln \text{Life Expectancy}}_i = 2.864 + 0.147\,\ln\text{GDP}_i$

  • A 1% change in GDP → a 0.147% increase in Life Expectancy
  • A 25% fall in GDP → a $(25 \times 0.147\%) = 3.675\%$ decrease in Life Expectancy
  • A 100% rise in GDP → a $(100 \times 0.147\%) = 14.7\%$ increase in Life Expectancy

Log-Log Model Graph

Code
ggplot(data = gapminder)+
  aes(x = loggdp,
      y = loglife)+ 
  geom_point(color = "blue", alpha = 0.5)+
  geom_smooth(method = "lm", color = "orange")+
  labs(x = "Log GDP per Capita",
       y = "Log Life Expectancy")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size = 16)

Comparing Log Models I

Model       Equation                            Interpretation
Linear-Log  $Y = \beta_0 + \beta_1 \ln X$       1% change in $X$ → $\frac{\hat{\beta}_1}{100}$ unit change in $Y$
Log-Linear  $\ln Y = \beta_0 + \beta_1 X$       1 unit change in $X$ → $\hat{\beta}_1 \times 100$% change in $Y$
Log-Log     $\ln Y = \beta_0 + \beta_1 \ln X$   1% change in $X$ → $\hat{\beta}_1$% change in $Y$
  • Hint: the variable that gets logged changes in percent terms; the linear variable (not logged) changes in unit terms
    • Going from units → percent: multiply by 100
    • Going from percent → units: divide by 100

Comparing Models II

Code
library(modelsummary)
modelsummary(models = list("Life Exp." = lin_log_reg,
                           "Log Life Exp." = log_lin_reg,
                           "Log Life Exp." = log_log_reg),
             fmt = 2, # round to 2 decimals
             output = "html",
             coef_rename = c("(Intercept)" = "Constant",
                             "gdp_t" = "GDP per capita ($1,000s)",
                             "loggdp" = "Log GDP per Capita"),
             gof_map = list(
               list("raw" = "nobs", "clean" = "n", "fmt" = 0),
               #list("raw" = "r.squared", "clean" = "R<sup>2</sup>", "fmt" = 2),
               list("raw" = "adj.r.squared", "clean" = "Adj. R<sup>2</sup>", "fmt" = 2),
               list("raw" = "rmse", "clean" = "SER", "fmt" = 2)
             ),
             escape = FALSE,
             stars = c('*' = .1, '**' = .05, '***' = 0.01)
)
                          Life Exp.  Log Life Exp.  Log Life Exp.
Constant                  −9.10***   3.97***        2.86***
                          (1.23)     (0.01)         (0.02)
Log GDP per Capita        8.41***                   0.15***
                          (0.15)                    (0.00)
GDP per capita ($1,000s)             0.01***
                                     (0.00)
n                         1704       1704           1704
Adj. R²                   0.65       0.30           0.61
SER                       7.62       0.19           0.14
* p < 0.1, ** p < 0.05, *** p < 0.01
  • The models are in very different units; how to choose?
    1. Compare intuition
    2. Compare $R^2$’s
    3. Compare graphs

Comparing Models III

Model       Equation                                              $R^2$
Linear-Log  $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 \ln X_i$   0.65
Log-Linear  $\ln Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$         0.30
Log-Log     $\ln Y_i = \hat{\beta}_0 + \hat{\beta}_1 \ln X_i$     0.61

When to Log?

  • In practice, the following types of variables are usually logged:
    • Variables that must always be positive (prices, sales, market values)
    • Very large numbers (population, GDP)
    • Variables we want to talk about as percentage changes or growth rates (money supply, population, GDP)
    • Variables that have diminishing returns (output, utility)
    • Variables that have nonlinear scatterplots
  • Avoid logs for:
    • Variables that are less than one, decimals, 0, or negative
    • Categorical variables (season, gender, political party)
    • Time variables (year, week, day)

Standardizing & Comparing Across Units

Comparing Coefficients of Different Units I

$\hat{Y}_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

  • We often want to compare coefficients to see which variable, $X_1$ or $X_2$, has a bigger effect on $Y$

  • What if $X_1$ and $X_2$ are in different units?

Example

$\widehat{\text{Salary}}_i = \beta_0 + \beta_1 \text{Batting average}_i + \beta_2 \text{Home runs}_i$

$\widehat{\text{Salary}}_i = -2{,}869{,}439.40 + 12{,}417{,}629.72\,\text{Batting average}_i + 129{,}627.36\,\text{Home runs}_i$

Comparing Coefficients of Different Units II

  • An easy way is to standardize¹ the variables (i.e. take the Z-score)

$X^Z = \frac{X_i - \bar{X}}{sd(X)}$

  • Note doing this will make the constant 0, as both distributions of $X$ and $Y$ are now centered at 0.
  1. Also called “centering” or “scaling.”
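A minimal sketch of standardizing by hand in R (the scale() function shown below does the same; z_score is a name introduced here):

z_score <- function(x) (x - mean(x)) / sd(x) # subtract the mean, divide by the std. dev.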

Comparing Coefficients of Different Units: Example

Variable Mean Std. Dev.
Salary $2,024,616 $2,764,512
Batting Average 0.267 0.031
Home Runs 12.11 10.31

$\widehat{\text{Salary}}_i = -2{,}869{,}439.40 + 12{,}417{,}629.72\,\text{Batting average}_i + 129{,}627.36\,\text{Home runs}_i$

$\widehat{\text{Salary}}^Z = 0.00 + 0.14\,\text{Batting average}^Z + 0.48\,\text{Home runs}^Z$

  • Marginal effects on $Y$ (in standard deviations of $Y$) from a 1 standard deviation change in $X$:
  • $\hat{\beta}_1$: a 1 standard deviation increase in Batting Average increases Salary by 0.14 standard deviations

$0.14 \times \$2{,}764{,}512 = \$387{,}032$

  • $\hat{\beta}_2$: a 1 standard deviation increase in Home Runs increases Salary by 0.48 standard deviations

$0.48 \times \$2{,}764{,}512 = \$1{,}326{,}966$

Standardizing in R

Variable Mean SD
LifeExp 59.47 12.92
gdpPercap $7215.32 $9857.46
  • Use the scale() command inside mutate() function to standardize a variable
Code
gapminder <- gapminder %>%
  mutate(life_Z = scale(lifeExp),
         gdp_Z = scale(gdpPercap))

std_reg <- lm(life_Z ~ gdp_Z, data = gapminder)
tidy(std_reg)
term        estimate      std.error  statistic    p.value
(Intercept) 1.095650e-16  0.01967569 5.568547e-15 1.000000e+00
gdp_Z       5.837062e-01  0.01968147 2.965766e+01 3.565724e-156
2 rows
  • A 1 standard deviation increase in gdpPercap will increase lifeExp by 0.584 standard deviations ($0.584 \times 12.92 = 7.55$ years)

Rescaling: Visually

Code
ggplot(data = gapminder)+
  aes(x = gdpPercap,
      y = lifeExp)+
  geom_point(color = "blue", alpha = 0.5)+
  labs(x = "GDP per Capita",
       y = "Life Expectancy (Years)")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size = 16)

Rescaling: Visually

Code
ggplot(data = gapminder)+
  aes(x = gdp_Z,
      y = life_Z)+
  geom_point(color = "blue", alpha = 0.5)+
  geom_hline(yintercept = 0)+
  geom_vline(xintercept = 0)+
    labs(x = "GDP per Capita (Standardized)",
       y = "Life Expectancy (Standardized)")+
  theme_bw(base_family = "Fira Sans Condensed",
           base_size = 16)

Rescaling: Visually

  • Both X and Y now have means of 0 and sd of 1
Code
gapminder %>%
  summarize(mean_gdp = mean(gdp_Z), sd_gdp = sd(gdp_Z), mean_life = mean(life_Z), sd_life = sd(life_Z)) %>%
  round(1)
mean_gdp sd_gdp mean_life sd_life
0        1      0         1
1 row

Joint Hypothesis Testing

Joint Hypothesis Testing I

Example

Return again to:

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i + \hat{\beta}_2 \text{Northeast}_i + \hat{\beta}_3 \text{Midwest}_i + \hat{\beta}_4 \text{South}_i$

  • Maybe region doesn’t affect wages at all?
  • $H_0: \beta_2 = 0, \; \beta_3 = 0, \; \beta_4 = 0$
  • This is a joint hypothesis (of multiple parameters) to test

Joint Hypothesis Testing II

  • A joint hypothesis tests against the null hypothesis of a value for multiple parameters:

$H_0: \beta_1 = \beta_2 = 0$

the hypotheses that multiple regressors are equal to zero (have no causal effect on the outcome)

  • Our alternative hypothesis is that:

$H_1$: either $\beta_1 \neq 0$ or $\beta_2 \neq 0$ or both

or simply, that $H_0$ is not true

Types of Joint Hypothesis Tests

  1. $H_0: \beta_1 = \beta_2 = 0$
    • Testing against the claim that multiple variables don’t matter
    • Useful under high multicollinearity between variables
    • $H_a$: at least one parameter $\neq 0$
  2. $H_0: \beta_1 = \beta_2$
    • Testing whether two variables matter the same
    • Variables must be in the same units
    • $H_a: \beta_1 \; (\neq, <, \text{ or } >) \; \beta_2$
  3. $H_0$: ALL $\beta$’s $= 0$
    • The “Overall F-test”
    • Testing against the claim that the regression model explains NO variation in $Y$
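In R, the first two types can be run with car::linearHypothesis() (introduced below); a hypothetical sketch, where my_reg, x1, and x2 are placeholder names:

library(car)
linearHypothesis(my_reg, c("x1 = 0", "x2 = 0")) # Type 1: H0: beta1 = beta2 = 0
linearHypothesis(my_reg, "x1 = x2")             # Type 2: H0: beta1 = beta2
# Type 3, the overall F-test, is reported automatically at the bottom of summary(my_reg)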

Joint Hypothesis Tests: F-statistic

  • The F-statistic is the test-statistic used to test joint hypotheses about regression coefficients with an F-test
  • This involves comparing two models:
    1. Unrestricted model: regression with all coefficients
    2. Restricted model: regression under null hypothesis (coefficients equal hypothesized values)
  • F is an analysis of variance (ANOVA)
    • essentially tests whether $R^2$ increases statistically significantly as we go from the restricted model to the unrestricted model
  • F has its own distribution, with two sets of degrees of freedom

Joint Hypothesis F-test: Example I

Example

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i + \hat{\beta}_2 \text{Northeast}_i + \hat{\beta}_3 \text{Midwest}_i + \hat{\beta}_4 \text{South}_i$

  • $H_0: \beta_2 = \beta_3 = \beta_4 = 0$
  • $H_a$: $H_0$ is not true (at least one $\beta_i \neq 0$)

Joint Hypothesis F-test: Example II

Example

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i + \hat{\beta}_2 \text{Northeast}_i + \hat{\beta}_3 \text{Midwest}_i + \hat{\beta}_4 \text{South}_i$

  • Unrestricted model:

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i + \hat{\beta}_2 \text{Northeast}_i + \hat{\beta}_3 \text{Midwest}_i + \hat{\beta}_4 \text{South}_i$

  • Restricted model:

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i$

  • F-test: does going from the restricted to the unrestricted model statistically significantly improve $R^2$?

Calculating the F-statistic

$F_{q,(n-k-1)} = \frac{\left(\frac{R^2_u - R^2_r}{q}\right)}{\left(\frac{1 - R^2_u}{n-k-1}\right)}$

  • $R^2_u$: the $R^2$ from the unrestricted model (all variables)

  • $R^2_r$: the $R^2$ from the restricted model (null hypothesis)

  • $q$: number of restrictions (number of $\beta$’s $= 0$ under null hypothesis)

  • $k$: number of $X$ variables in the unrestricted model (all variables)

  • $F$ has two sets of degrees of freedom:

    • $q$ for the numerator, $(n-k-1)$ for the denominator

  • Key takeaway: the bigger the difference $(R^2_u - R^2_r)$, the greater the improvement in fit from adding variables, and the larger the $F$!

  • This formula is (believe it or not) actually a simplified version (assuming homoskedasticity)

    • I give you this formula to build your intuition of what $F$ is measuring

F-test Example I

  • We’ll use the wooldridge package’s wage1 data again
# load in data from wooldridge package
library(wooldridge)
wages <- wage1

# run regressions
unrestricted_reg <- lm(wage ~ female + northcen + west + south, data = wages)
restricted_reg <- lm(wage ~ female, data = wages)

F-test Example II

  • Unrestricted model:

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i + \hat{\beta}_2 \text{Northeast}_i + \hat{\beta}_3 \text{Midwest}_i + \hat{\beta}_4 \text{South}_i$

  • Restricted model:

$\widehat{\text{Wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \text{Male}_i$

  • $H_0: \beta_2 = \beta_3 = \beta_4 = 0$

  • $q = 3$ restrictions (F numerator df)

  • $n - k - 1 = 526 - 4 - 1 = 521$ (F denominator df)

F-test Example III

  • We can use the car package’s linearHypothesis() command to run an F-test:
    • first argument: name of the (unrestricted) regression
    • second argument: vector of variable names (in quotes) you are testing
# load car package for additional regression tools
library(car) 
# F-test
linearHypothesis(unrestricted_reg, c("northcen", "west", "south")) 
  Res.Df RSS      Df Sum of Sq
1 524    6332.194 NA NA
2 521    6174.831 3  157.3625
2 rows | 1-5 of 7 columns
  • p-value on F-test $< 0.05$, so we can reject $H_0$
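As a check, a minimal sketch computing $F$ by hand with the (homoskedasticity-assuming) formula from above:

r2_u <- glance(unrestricted_reg)$r.squared  # unrestricted R^2
r2_r <- glance(restricted_reg)$r.squared    # restricted R^2
q <- 3                                      # number of restrictions
df_denom <- 526 - 4 - 1                     # n - k - 1
F_stat <- ((r2_u - r2_r) / q) / ((1 - r2_u) / df_denom)
F_stat                                      # compare with linearHypothesis()'s F
pf(F_stat, q, df_denom, lower.tail = FALSE) # p-value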

All F-test I


Call:
lm(formula = wage ~ female + northcen + west + south, data = wages)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.3269 -2.0105 -0.7871  1.1898 17.4146 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   7.5654     0.3466  21.827   <2e-16 ***
female       -2.5652     0.3011  -8.520   <2e-16 ***
northcen     -0.5918     0.4362  -1.357   0.1755    
west          0.4315     0.4838   0.892   0.3729    
south        -1.0262     0.4048  -2.535   0.0115 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.443 on 521 degrees of freedom
Multiple R-squared:  0.1376,    Adjusted R-squared:  0.131 
F-statistic: 20.79 on 4 and 521 DF,  p-value: 6.501e-16
  • The last line of regression output from summary() is an All F-test
    • $H_0$: all $\beta$’s $= 0$
      • the regression explains no variation in $Y$
    • Calculates an F-statistic that, if high enough, is significant (p-value $< 0.05$) enough to reject $H_0$

All F-test II

  • Alternatively, if you use broom instead of summary():
    • glance() command makes table of regression summary statistics
    • tidy() only shows coefficients
glance(unrestricted_reg)
r.squared adj.r.squared sigma    statistic p.value
0.1376433 0.1310225     3.442656 20.78959  6.500683e-16
1 row | 1-5 of 12 columns
  • statistic is the All F-test; p.value next to it is the p-value from the F-test
