2.4 — Goodness of Fit and Bias — Class Content
Overview
Today we continue looking at basic OLS regression. We will cover how to measure whether a regression line is a good fit (using \(R^2\) and the standard error of the regression, SER or \(\hat{\sigma}_u\)), and whether the OLS estimators are biased. These depend on four critical assumptions about the error term \(u\).
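As a sketch of the two fit measures named above, assume the simple regression \(Y_i = \beta_0 + \beta_1 X_i + u_i\), with fitted values \(\hat{Y}_i\) and residuals \(\hat{u}_i = Y_i - \hat{Y}_i\). This notation is an assumption made here for illustration, and the \(n-2\) degrees-of-freedom correction is the usual convention with one regressor; the slides and textbook may label these sums differently.

\[
R^2 = 1 - \frac{\sum_{i=1}^{n} \hat{u}_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2},
\qquad
SER = \hat{\sigma}_u = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2}
\]

Read this way, \(R^2\) is the fraction of the variation in \(Y\) that the regression accounts for (between 0 and 1 when the model includes an intercept), while SER is roughly the size of a typical residual, measured in the units of \(Y\).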
In doing so, we begin an ongoing exploration into inferential statistics, which will finally become clear in another week. The most confusing part is recognizing that each OLS estimator has its own sampling distribution. Today we measure the center of that sampling distribution, to see whether the estimator is biased. Next class we will measure the spread of that distribution.
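To make "the center of the sampling distribution" concrete, here is the standard definition, written for the slope estimator \(\hat{\beta}_1\) and the true parameter \(\beta_1\) (notation assumed here). The estimator is unbiased if its expected value across repeated samples equals the true parameter:

\[
E[\hat{\beta}_1] = \beta_1,
\qquad
\text{Bias}(\hat{\beta}_1) = E[\hat{\beta}_1] - \beta_1
\]

The estimate from any single sample can still miss \(\beta_1\). Unbiasedness is a statement about the average of the estimates over many samples, not about any one of them.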
We continue our extended example about class sizes and test scores, which uses a (Stata) dataset from a textbook I used to teach from, Stock and Watson (2007). Download and follow along with the data from today’s example:[1]
I have also made an RStudio Cloud project documenting everything we have been doing with this data, which may help you when you start working with regressions (next class):
Readings
- Ch. 3.2-3.4, 3.7-3.8 in Bailey, Real Econometrics
Appendix
See the online appendix for today’s content:
Slides
Below, you can find the slides in two formats. Clicking the image will take you to the HTML version of the slides in a new tab. The lower button will allow you to download a PDF version of the slides.
I suggest printing the slides beforehand and using them to take additional notes in class (not everything is in the slides)!
Footnotes
[1] Note this is a .dta Stata file. You will need to (install and) load the package haven to read_dta() Stata files into a dataframe.