# 2.4 — Goodness of Fit and Bias — Class Content

## Overview

Today we continue looking at basic OLS regression. We will cover how to measure whether a regression line is a good fit (using \(R^2\) and the standard error of the regression, SER, which estimates \(\sigma_u\)), and whether the OLS estimators are *biased*. These will depend on four critical *assumptions about \(u\)*.
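For reference, the two fit measures named above are commonly defined as shown here. The notation is an assumption and may differ from the slides: \(\hat{u}_i\) is the residual for observation \(i\), \(\bar{Y}\) is the sample mean of \(Y\), and \(n - 2\) is the degrees of freedom for a regression with one regressor and an intercept.

\[
R^2 = 1 - \frac{\sum_{i=1}^{n} \hat{u}_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2},
\qquad
SER = \hat{\sigma}_u = \sqrt{\frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - 2}}
\]

An estimator such as \(\hat{\beta}_1\) is called unbiased when \(E[\hat{\beta}_1] = \beta_1\).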

In doing so, we begin an ongoing exploration into inferential statistics, which will finally become clear in another week. The most confusing part is recognizing that there is a *sampling distribution of each OLS estimator*. We want to measure the center of that sampling distribution, to see if the estimator is *biased*. Next class we will measure the spread of that distribution.
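One way to make the idea of a sampling distribution concrete is a small simulation. The sketch below is not from the slides: the true parameters (`beta_0`, `beta_1`), the sample size, and the distributions of `x` and `u` are all hypothetical choices made for illustration. It draws many samples from the same population, fits OLS on each, and collects the slope estimates.

```r
set.seed(480)  # arbitrary seed, used only so the run is reproducible

beta_0 <- 2        # hypothetical true intercept
beta_1 <- 0.5      # hypothetical true slope
n <- 100           # observations in each sample
n_samples <- 2000  # number of samples to draw

# Draw one sample from the population, fit OLS, and return the slope estimate
draw_slope <- function() {
  x <- rnorm(n, mean = 10, sd = 2)
  u <- rnorm(n, mean = 0, sd = 1)  # errors drawn independently of x
  y <- beta_0 + beta_1 * x + u
  unname(coef(lm(y ~ x))[2])
}

# Each element of slopes is the OLS slope from a different sample
slopes <- replicate(n_samples, draw_slope())

mean(slopes)  # center of the simulated sampling distribution
sd(slopes)    # spread of the simulated sampling distribution
```

Because the errors here are generated independently of `x`, the mean of `slopes` should land close to `beta_1`. Passing `slopes` to `hist()` gives a picture of the distribution.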

We continue our extended example about class sizes and test scores, which comes from a (Stata) dataset from an old textbook that I used to use, Stock and Watson, 2007. Download and follow along with the data from today’s example:¹
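Since the data come as a Stata `.dta` file, here is a minimal sketch of reading one with `haven`. So that it runs without the course file, it first writes a tiny stand-in dataset to a temporary `.dta` file. The column names and values in `toy` are made up for illustration and are not the course data.

```r
# Assumes the haven package is installed: install.packages("haven")
library(haven)

# Stand-in data (made-up values) so the sketch runs without the course file
toy <- data.frame(
  class_size = c(18.5, 20.1, 22.3),
  test_score = c(670.2, 655.8, 648.1)
)
toy_path <- tempfile(fileext = ".dta")
write_dta(toy, toy_path)

# In practice, replace toy_path with the path to the downloaded .dta file
scores <- read_dta(toy_path)
head(scores)
```

`read_dta()` returns a tibble, which behaves like a regular dataframe in most functions, including `lm()`.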

I have also made an RStudio Cloud project documenting everything we have been doing with this data, which may help you when you start working with regressions (next class):

## Readings

- Ch. 3.2-3.4, 3.7-3.8 in Bailey, *Real Econometrics*

## Appendix

See the online appendix for today’s content:

## Slides

Below, you can find the slides in two formats. Clicking the image will open the HTML version of the slides in a new tab. The lower button will let you download a PDF version of the slides.

I suggest printing the slides beforehand and using them to take additional notes in class (*not everything* is in the slides)!

## Footnotes

1. Note that this is a `.dta` Stata file. You will need to (install and) load the `haven` package and use `read_dta()` to read Stata files into a dataframe.