2.3 — Simple Linear Regression — Class Content
Overview
Today we start looking at associations between variables, which we first attempt to quantify with measures like covariance and correlation. Then we turn to fitting a line to data via linear regression. We go over the basic regression model, its parameters and how they are derived, and see how to work with regressions in R with lm and the tidyverse package broom.
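As a rough preview of the workflow described above, here is a minimal sketch in R. The data are simulated purely for illustration (the variables x and y and the numbers used to generate them are assumptions, not the class size data from today's example). It computes the covariance and correlation, fits a line with lm, and checks that the fitted slope matches the ratio of the covariance to the variance of x. The broom step only runs if that package is installed.

```r
# Simulated data for illustration only: x and y are hypothetical variables,
# not the class size and test score data from today's example.
set.seed(42)
x <- rnorm(100, mean = 20, sd = 2)
y <- 700 - 2 * x + rnorm(100, sd = 10)
sim <- data.frame(x = x, y = y)

# Measures of association between x and y
cov_xy <- cov(sim$x, sim$y)
cor_xy <- cor(sim$x, sim$y)

# Fit the line y = b0 + b1 * x by ordinary least squares
reg <- lm(y ~ x, data = sim)
b <- coef(reg)

# The OLS slope is cov(x, y) / var(x), and the intercept is
# mean(y) - slope * mean(x)
slope_by_hand <- cov_xy / var(sim$x)
intercept_by_hand <- mean(sim$y) - slope_by_hand * mean(sim$x)

# broom turns regression output into tidy data frames
if (requireNamespace("broom", quietly = TRUE)) {
  print(broom::tidy(reg))    # one row per coefficient
  print(broom::glance(reg))  # one row of model-level statistics
}
```

Comparing b with slope_by_hand and intercept_by_hand is a quick way to convince yourself that lm is doing nothing more mysterious than the formulas we derive in class.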
We consider an extended example about class sizes and test scores, which comes from a (Stata) dataset from a textbook that I used to use, Stock and Watson (2007). Download and follow along with the data from today’s example:1
I have also made an RStudio Cloud project documenting all of the things we have been doing with this data, which may help you when you start working with regressions (next class):
Readings
- Ch. 3.1, Math and Probability Background Appendix A in Bailey
Now that we are returning to statistics, we will do a minimal review of basic statistics and distributions. Review all of Bailey’s appendices.
Chapter 2 is optional, but will give you a good overview of using data.
Appendix
See the online appendix for today’s content:
Slides
Below, you can find the slides in two formats. Clicking the image will bring you to the html version of the slides in a new tab. The lower button will allow you to download a PDF version of the slides.
I suggest printing the slides beforehand and using them to take additional notes in class (not everything is in the slides)!
Footnotes
1. Note this is a .dta Stata file. You will need to (install and) load the package haven to read_dta() Stata files into a dataframe.
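To make the footnote concrete, here is a small sketch of reading a Stata file with haven. Because the actual file name of today's dataset is not given here, the example first writes a tiny made-up data frame (hypothetical columns id and score) to a temporary .dta file and then reads it back; with the real download you would pass its path to read_dta() instead.

```r
# Assumes the haven package is installed: install.packages("haven")
if (requireNamespace("haven", quietly = TRUE)) {
  # Hypothetical stand-in for the downloaded file: a tiny made-up
  # data frame written to a temporary .dta file so the example runs alone.
  toy <- data.frame(id = 1:3, score = c(650.5, 661.2, 643.7))
  path <- tempfile(fileext = ".dta")
  haven::write_dta(toy, path)

  # read_dta() reads a Stata file into a tibble (a tidyverse data frame)
  dat <- haven::read_dta(path)
  print(dat)
}
```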