4.3 — Categorical Data

ECON 480 • Econometrics • Fall 2022

Dr. Ryan Safner
Associate Professor of Economics

safner@hood.edu
ryansafner/metricsF22
metricsF22.classes.ryansafner.com

id <dbl>	rank <chr>	grade <dbl>
1	Freshman	76
2	Junior	82
3	Sophomore	73
4	Sophomore	95
5	Senior	74

id <dbl>	rank <fct>	grade <dbl>
1	Freshman	76
2	Junior	82
3	Sophomore	73
4	Sophomore	95
5	Senior	74

rank <fct>	n <int>
Freshman	4
Junior	1
Senior	3
Sophomore	2

id <dbl>	rank <ord>	grade <dbl>
1	Freshman	76
2	Junior	82
3	Sophomore	73
4	Sophomore	95
5	Senior	74

rank <ord>	n <int>
Freshman	4
Sophomore	2
Junior	1
Senior	3
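
The two count tables above contain the same counts and differ only in row order: the plain factor lists its levels alphabetically, while the ordered factor follows class standing. A minimal sketch of that difference, written in Python purely for illustration (the slides themselves work in R); the `levels` list is an assumed ordering inferred from the second table:

```python
# Counts copied from the tables above.
counts = {"Freshman": 4, "Junior": 1, "Senior": 3, "Sophomore": 2}

# Unordered factor: levels default to alphabetical order.
alphabetical = sorted(counts)

# Ordered factor: levels follow an explicit, meaningful ranking
# (assumed here from the row order of the <ord> table).
levels = ["Freshman", "Sophomore", "Junior", "Senior"]
ordered = sorted(counts, key=levels.index)

for name in ordered:
    print(name, counts[name])
```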

	wage <dbl>	gender <fct>	educ <int>	exper <int>
1	3.10	Female	11	2
2	3.24	Female	12	22
3	3.00	Male	11	2
4	6.00	Male	8	44
5	5.30	Male	12	7
6	8.75	Male	16	9
7	11.25	Male	18	15
8	5.00	Female	12	5
9	3.60	Female	12	26
10	18.18	Male	17	22

mean <dbl>	sd <dbl>
7.099489	4.160858

mean <dbl>	sd <dbl>
4.587659	2.529363

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	4.587659	0.2189834	20.949802	3.012371e-71
genderMale	2.511830	0.3034092	8.278688	1.041764e-15

	wage <dbl>	female <dbl>	educ <int>	exper <int>
1	3.10	1	11	2
2	3.24	1	12	22
3	3.00	0	11	2
4	6.00	0	8	44
5	5.30	0	12	7
6	8.75	0	16	9
7	11.25	0	18	15
8	5.00	1	12	5
9	3.60	1	12	26
10	18.18	0	17	22

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	7.099489	0.2100082	33.805777	8.971839e-134
female	-2.511830	0.3034092	-8.278688	1.041764e-15
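
With a single dummy regressor, the OLS intercept is the mean of the group coded 0 and the slope is the difference in group means. The sketch below checks this on the ten preview rows printed in the data table above, so its numbers will not match the full-sample (n = 526) estimates. It is written in Python for illustration only; `ols_simple` and `group_mean` are hypothetical helper names, not functions from the slides.

```python
# Ten preview rows from the data table (NOT the full n = 526 sample).
wage = [3.10, 3.24, 3.00, 6.00, 5.30, 8.75, 11.25, 5.00, 3.60, 18.18]
female = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]


def ols_simple(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def group_mean(y, dummy, value):
    """Mean of y over observations where the dummy equals `value`."""
    selected = [yi for yi, di in zip(y, dummy) if di == value]
    return sum(selected) / len(selected)


intercept, slope = ols_simple(female, wage)
mean_men = group_mean(wage, female, 0)
mean_women = group_mean(wage, female, 1)

print(f"intercept = {intercept:.4f}, mean for men = {mean_men:.4f}")
print(f"slope = {slope:.4f}, women minus men = {mean_women - mean_men:.4f}")
```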

	wage <dbl>	female <dbl>	male <dbl>
1	3.10	1	0
2	3.24	1	0
3	3.00	0	1
4	6.00	0	1
5	5.30	0	1

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	4.587659	0.2189834	20.949802	3.012371e-71
male	2.511830	0.3034092	8.278688	1.041764e-15
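
The `female` and `male` codings describe the same two group means; only the reference group changes, so the slope flips sign and the intercept moves to the other group's mean. A short arithmetic check using the estimates printed above (Python for illustration only; `predict` is a hypothetical helper):

```python
# Estimates copied from the two regression outputs above.
female_coding = {"intercept": 7.099489, "slope": -2.511830}  # reference: men
male_coding = {"intercept": 4.587659, "slope": 2.511830}     # reference: women


def predict(coding, dummy):
    """Predicted average wage for dummy = 0 or 1 under a given coding."""
    return coding["intercept"] + coding["slope"] * dummy


mean_women = predict(female_coding, 1)  # same group as male = 0
mean_men = predict(male_coding, 1)      # same group as female = 0

# The t-statistic is the estimate divided by its standard error.
t_stat = male_coding["slope"] / 0.3034092

print(f"women: {mean_women:.6f}  men: {mean_men:.6f}  t = {t_stat:.4f}")
```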

	Wage	Wage
Constant	7.10***	4.59***
	(0.21)	(0.22)
female	−2.51***
	(0.30)
male		2.51***
		(0.30)
n	526	526
Adj. R²	0.11	0.11
SER	3.47	3.47
* p < 0.1, ** p < 0.05, *** p < 0.01

	No Northeast	No Midwest	No South	No West
Constant	6.37***	5.71***	5.39***	6.61***
	(0.34)	(0.32)	(0.27)	(0.39)
midwest	−0.66		0.32	−0.90*
	(0.47)		(0.42)	(0.50)
south	−0.98**	−0.32		−1.23***
	(0.43)	(0.42)		(0.47)
west	0.24	0.90*	1.23***
	(0.52)	(0.50)	(0.47)
northeast		0.66	0.98**	−0.24
		(0.47)	(0.43)	(0.52)
n	526	526	526	526
R²	0.02	0.02	0.02	0.02
Adj. R²	0.01	0.01	0.01	0.01
SER	3.66	3.66	3.66	3.66
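
Each column above omits a different reference region, but all four describe the same group means: the constant is the omitted region's average wage and every coefficient is a difference from it. The sketch below rebuilds the "No Midwest" column from the "No Northeast" column. It uses the rounded table values, so other columns can differ in the last digit (south relative to west comes out as −1.22 here against −1.23 in the table). Python for illustration only; `relative_to` is a hypothetical helper.

```python
# "No Northeast" column: northeast is the reference category.
constant = 6.37
coefs = {"midwest": -0.66, "south": -0.98, "west": 0.24}

# Average wage by region implied by that column.
means = {"northeast": constant}
for region, coef in coefs.items():
    means[region] = round(constant + coef, 2)


def relative_to(reference):
    """Coefficients implied when `reference` is the omitted category."""
    return {
        region: round(mean - means[reference], 2)
        for region, mean in means.items()
        if region != reference
    }


print(means)
print(relative_to("midwest"))
```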
* p < 0.1, ** p < 0.05, *** p < 0.01

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	6.15827549	0.34167408	18.023830	7.998534e-57
exper	0.05360476	0.01543716	3.472450	5.585255e-04
female	-1.54654677	0.48186030	-3.209534	1.411253e-03
exper:female	-0.05506989	0.02217496	-2.483427	1.332533e-02

	Wage
Constant	6.16***
	(0.34)
exper	0.05***
	(0.02)
female	−1.55***
	(0.48)
exper:female	−0.06**
	(0.02)
n	526
Adj. R²	0.13
SER	3.43
* p < 0.1, ** p < 0.05, *** p < 0.01
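
The dummy-continuous interaction can be read as two separate regression lines, one per group: `female` shifts the intercept and `exper:female` shifts the slope. A minimal sketch using the unrounded estimates from the output above (Python for illustration only; `line_for` is a hypothetical helper):

```python
# Estimates from the regression of wage on exper, female, and exper:female.
b_intercept = 6.15827549
b_exper = 0.05360476
b_female = -1.54654677
b_interaction = -0.05506989


def line_for(female):
    """Return (intercept, slope on exper) for female = 0 or 1."""
    return (b_intercept + b_female * female,
            b_exper + b_interaction * female)


men_intercept, men_slope = line_for(0)
women_intercept, women_slope = line_for(1)

print(f"men:   wage = {men_intercept:.2f} + {men_slope:.4f} * exper")
print(f"women: wage = {women_intercept:.2f} + ({women_slope:.4f}) * exper")
```

Under these estimates the line for women starts lower and is essentially flat in experience, while the line for men rises by about $0.05 per year of experience.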

mean <dbl>
5.168023

mean <dbl>
7.983032

mean <dbl>
4.611583

mean <dbl>
4.565909

	Men	Women
Unmarried	$5.17	$4.61
Married	$7.98	$4.57

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	5.1680233	0.3614348	14.298631	2.255740e-39
female	-0.5564399	0.4735578	-1.175020	2.405224e-01
married	2.8150086	0.4363413	6.451391	2.531401e-10
female:married	-2.8606829	0.6075577	-4.708496	3.202330e-06

	Wage
Constant	5.17***
	(0.36)
female	−0.56
	(0.47)
married	2.82***
	(0.44)
female:married	−2.86***
	(0.61)
n	526
Adj. R²	0.18
SER	3.34
* p < 0.1, ** p < 0.05, *** p < 0.01

	Men	Women	Diff
Unmarried	$5.17	$4.61	$0.56
Married	$7.98	$4.57	$3.41
Diff	$2.81	−$0.04	$2.85
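
The coefficients of the two-dummy interaction regression can be recovered from the four conditional group means. The sketch below uses the unrounded means printed earlier; which mean belongs to which group is inferred from the rounded Men/Women table. Python for illustration only, and the variable names are hypothetical.

```python
# Conditional mean wages (unrounded values from the output above).
unmarried_men = 5.168023
married_men = 7.983032
unmarried_women = 4.611583
married_women = 4.565909

# Intercept: the group with female = 0 and married = 0.
intercept = unmarried_men
# female: women minus men, among the unmarried.
b_female = unmarried_women - unmarried_men
# married: married minus unmarried, among men.
b_married = married_men - unmarried_men
# female:married: the marriage difference for women minus that for men.
b_interaction = (married_women - unmarried_women) - (married_men - unmarried_men)

print(f"{intercept:.4f} {b_female:.4f} {b_married:.4f} {b_interaction:.4f}")
```

The interaction term is a difference in differences: the wage gap associated with marriage is about $2.86 smaller for women than for men.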

Title Slide
Contents
Categorical Variables
Working with factor Variables in R
Factors in R I
Factors in R II
Factors in R III
Ordered Factors in R I
Ordered Factors in R II
Example Research Question with Categorical Data
A Difference in Group Means
Plotting factors in R
Regression with Dummy Variables
Comparing Groups with Regression
Comparing Groups in Regression: Scatterplot
Comparing Groups in Regression: Scatterplot
Dummy Variables as Group Means
Dummy Variables as Group Means: Our Example
Comparing Groups in Regression: Scatterplot
Comparing Groups in Regression: Scatterplot
The Data
Conditional Group Means
Visualize Differences
The Regression (factor variables)
The Regression: Dummy Variables
The Regression (Dummy variables)
Dummy Regression vs. Group Means
Recoding Dummy Variables
Recoding Dummy Variables
Recoding Dummies in the Data
Scatterplot with Male
Dummy Variables as Group Means: With Male
Scatterplot & Regression Line with Male
The Regression with Male
The Dummy Regression: Male or Female
Categorical Variables (More than 2 Categories)
Categorical Variables with More than 2 Categories
Using Categorical Variables in Regression I
Using Categorical Variables in Regression II
Using Categorical Variables in Regression III
The Dummy Variable Trap
The Reference Category
The Reference Category: Example
Regression in R with Categorical Variable
Regression in R with Dummies (& Dummy Variable Trap)
Using Different Reference Categories in R
Dummy Dependent (Y) Variables
Interaction Effects
Sliders and Switches
Interaction Effects
Three Types of Interactions
Interactions Between a Dummy and Continuous Variable
Interactions: A Dummy & Continuous Variable
Interactions: A Dummy & Continuous Variable I
Dummy-Continuous Interaction Effects as Two Regressions I
Dummy-Continuous Interaction Effects as Two Regressions II
Interpreting Coefficients I
Interpreting Coefficients II
Interpreting Coefficients III
Interactions in Our Example
Interactions in Our Example: Scatterplot
Interactions in Our Example: Scatterplot
Interactions in Our Example: Scatterplot
Interactions in Our Example: Regression in R
Interactions in Our Example: Regression
Interactions in Our Example: Interpreting Coefficients
Interactions in Our Example: As Two Regressions I
Interactions in Our Example: As Two Regressions I
Interactions in Our Example: Hypothesis Testing
Interactions Between Two Dummy Variables
Interactions Between Two Dummy Variables
Interactions Between Two Dummy Variables
2 Dummy Interaction: Interpreting Coefficients
Interactions Between 2 Dummy Variables: Example
Conditional Group Means in the Data
Two Dummies Interaction: Group Means
Two Dummies Interaction: Regression in R I
Two Dummies Interaction: Regression in R II
Two Dummies Interaction: Interpreting Coefficients I
Two Dummies Interaction: Interpreting Coefficients II
Interactions Between Two Continuous Variables
Interactions Between Two Continuous Variables
Interactions Between Two Continuous Variables
Continuous Variables Interaction: Example
Continuous Variables Interaction: In R I
Continuous Variables Interaction: In R II
Continuous Variables Interaction: Marginal Effects
Continuous Variables Interaction: Marginal Effects

term <chr>	estimate <dbl>	std.error <dbl>	statistic <dbl>	p.value <dbl>
(Intercept)	-2.859915627	1.181079647	-2.4214418	1.579891e-02
educ	0.601735470	0.089899977	6.6933885	5.640482e-11
exper	0.045768911	0.042613758	1.0740407	2.833007e-01
educ:exper	0.002062345	0.003490614	0.5908258	5.548929e-01

Experience
5 years
10 years
15 years

Education
5 years
10 years
15 years
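
The values for these two tables did not survive extraction. With an interaction between two continuous variables, the marginal effect of one variable depends on the level of the other, so the entries can be computed from the estimates printed above. A minimal sketch in Python (for illustration only; function names are hypothetical):

```python
# Estimates from the regression of wage on educ, exper, and educ:exper.
b_educ = 0.601735470
b_exper = 0.045768911
b_interaction = 0.002062345


def marginal_effect_of_educ(exper):
    """Change in wage per extra year of education at a given experience."""
    return b_educ + b_interaction * exper


def marginal_effect_of_exper(educ):
    """Change in wage per extra year of experience at a given education."""
    return b_exper + b_interaction * educ


for years in (5, 10, 15):
    print(f"exper = {years:2d}: effect of educ  = "
          f"{marginal_effect_of_educ(years):.4f}")
for years in (5, 10, 15):
    print(f"educ  = {years:2d}: effect of exper = "
          f"{marginal_effect_of_exper(years):.4f}")
```

Because the interaction estimate is small and not statistically significant (p ≈ 0.55), these marginal effects barely change across the levels shown.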

	Wage
Constant	−2.86**
	(1.18)
educ	0.60***
	(0.09)
exper	0.05
	(0.04)
educ:exper	0.00
	(0.00)
n	526
Adj. R²	0.22
SER	3.25
* p < 0.1, ** p < 0.05, *** p < 0.01

Gender	Avg. Wage	Std. Dev.
Female	4.59	2.53
Male	7.10	4.16
Difference	2.51