4.3 — Categorical Data & Interactions — Class Content
Overview
This week, we look at how to use categorical data, i.e. variables that indicate an observation’s membership in a particular group or category. We introduce them into regression models as dummy variables that equal either 0 or 1, where 1 indicates membership in a category and 0 indicates non-membership.
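As a minimal sketch of the idea in R, the example below builds a dummy variable from a text column and uses it in a regression. The data frame `wages` and its columns `wage` and `sex` are made up for illustration and are not from the course data.

```r
# Hypothetical toy data (not from the course): hourly wage and sex
wages <- data.frame(
  wage = c(12, 15, 14, 20, 22, 18),
  sex  = c("female", "female", "female", "male", "male", "male")
)

# Dummy variable: 1 = member of the category (female), 0 = non-member
wages$female <- ifelse(wages$sex == "female", 1, 0)

# Regress wage on the dummy
fit <- lm(wage ~ female, data = wages)
coef(fit)
# (Intercept): mean wage of the group with female == 0 (here 20)
# female:      difference in mean wage between the two groups (here about -6.33)
```

With a single dummy as the only regressor, the intercept is the mean of the group coded 0 and the dummy's coefficient is the difference in group means.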
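We also look at what happens when categorical variables have more than two values. For regression, we introduce a dummy variable for each possible category, but we must leave out one category as the reference category to avoid the dummy variable trap: if every category had its own dummy, the dummies would sum to 1 for every observation and be perfectly collinear with the intercept.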
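A minimal sketch of how this can look in R, assuming a made-up data frame `regions` with a three-category variable `region` (names and numbers are hypothetical). When a variable is stored as a factor, `lm()` creates the dummies and omits the first level as the reference category by default; `relevel()` is one way to pick a different reference.

```r
# Hypothetical toy data (not from the course): income by region
regions <- data.frame(
  income = c(30, 34, 40, 44, 50, 54),
  region = c("North", "North", "South", "South", "West", "West")
)
regions$region <- factor(regions$region)

# Inspect the dummies R builds: one column per non-reference level.
# "North" (the first level) is omitted as the reference category.
model.matrix(~ region, data = regions)

fit <- lm(income ~ region, data = regions)
coef(fit)
# (Intercept): mean income in North
# regionSouth, regionWest: differences in mean income relative to North

# Choose a different reference category
regions$region <- relevel(regions$region, ref = "West")
fit_west <- lm(income ~ region, data = regions)
coef(fit_west)
# Coefficients are now differences relative to West
```

Changing the reference category changes what each coefficient is compared against, but not the fitted values of the model.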
We then continue by examining how to use categorical data in regression, focusing in particular on interactions between variables. We look at three types of interaction effects:
1. Interaction between a continuous variable & a dummy variable
2. Interaction between two dummy variables
3. Interaction between two continuous variables
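As a sketch, the three cases can be written as the following regression models. The notation is an assumption for illustration and may differ from the slides: $Y_i$ is the outcome, $X$ variables are continuous, $D$ variables are dummies, and $u_i$ is the error term.

```latex
% 1. Continuous variable X and dummy D: D can shift both intercept and slope
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_i + \beta_2 D_i + \beta_3 (X_i \times D_i) + u_i \\
\text{if } D_i = 0:\quad \hat{Y}_i &= \hat{\beta}_0 + \hat{\beta}_1 X_i \\
\text{if } D_i = 1:\quad \hat{Y}_i &= (\hat{\beta}_0 + \hat{\beta}_2) + (\hat{\beta}_1 + \hat{\beta}_3) X_i
\end{align*}

% 2. Two dummies D_1 and D_2: four groups, each with its own predicted mean
\begin{align*}
Y_i &= \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 (D_{1i} \times D_{2i}) + u_i \\
D_{1i} = 0,\ D_{2i} = 0:\quad \hat{Y}_i &= \hat{\beta}_0 \\
D_{1i} = 1,\ D_{2i} = 0:\quad \hat{Y}_i &= \hat{\beta}_0 + \hat{\beta}_1 \\
D_{1i} = 0,\ D_{2i} = 1:\quad \hat{Y}_i &= \hat{\beta}_0 + \hat{\beta}_2 \\
D_{1i} = 1,\ D_{2i} = 1:\quad \hat{Y}_i &= \hat{\beta}_0 + \hat{\beta}_1 + \hat{\beta}_2 + \hat{\beta}_3
\end{align*}

% 3. Two continuous variables X_1 and X_2: the effect of one depends on the other
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 (X_{1i} \times X_{2i}) + u_i \\
\frac{\Delta Y_i}{\Delta X_{1i}} &= \beta_1 + \beta_3 X_{2i}
\end{align*}
```

In each case $\beta_3$ is the coefficient on the interaction term: it measures how the effect of one variable on $Y$ changes with the value of the other.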
We will also be working on practice problems in R.
Readings
- Ch. 6.1–6.2 in Bailey, Real Econometrics
R Practice
This week you will be working on R practice problems on categorical data. Answers will be posted later on that page.
Assignments
Problem Set 4 Due Fri Nov 11
Problem Set 4 is due by the end of the day on Friday, November 11.
Slides
Below, you can find the slides in two formats. Clicking the image will bring you to the HTML version of the slides in a new tab. The lower button will allow you to download a PDF version of the slides.
I suggest printing the slides beforehand and using them to take additional notes in class (not everything is in the slides)!