2.6 — Inference for Regression — Class Content
Overview
We begin with some more time for you to work on the R Practice from last class.
This week is about inferential statistics: using statistics calculated from a sample of data to infer the true (and unmeasurable) parameters that describe a population. In doing so, we can run hypothesis tests on our sample to assess whether a hypothesized value of a parameter is consistent with the data, or construct a confidence interval from our sample to give a range of plausible values for the true parameter.
These are standard principles of statistics that you have hopefully learned before. If it has been a while since your last statistics class (or you have never taken one), be aware that this is one of the hardest concepts to understand at first glance. I recommend Khan Academy (the sections from sampling distributions through significance tests, though the whole class is helpful!) or a Google search for these concepts, as every statistics class covers them in the standard way.
That being said, I do not cover inferential statistics in the standard way (see today's appendix below for an overview of the standard way). I think it will be more intuitive if I show you where these concepts come from by simulating a sampling distribution, as opposed to reciting the theoretical sampling distributions.
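To make the idea of simulating a sampling distribution concrete, here is a minimal sketch in base R. It is not necessarily the approach used in the slides: the population, its true slope of 0.5, the sample size of 50, and the helper name `sample_slope` are all made up for illustration.

```r
# Hypothetical population (an assumption for illustration):
# y = 2 + 0.5 * x + noise, so the true slope is 0.5
set.seed(42)
N <- 100000
pop_x <- rnorm(N, mean = 10, sd = 2)
pop_y <- 2 + 0.5 * pop_x + rnorm(N, mean = 0, sd = 3)
population <- data.frame(x = pop_x, y = pop_y)

# Draw one random sample of size n and return its estimated OLS slope
# (sample_slope is a hypothetical helper, not part of any package)
sample_slope <- function(n) {
  rows <- sample(nrow(population), size = n)
  fit <- lm(y ~ x, data = population[rows, ])
  unname(coef(fit)["x"])
}

# Repeat the sampling many times: the collection of slopes
# approximates the sampling distribution of the slope estimator
slopes <- replicate(1000, sample_slope(50))

mean(slopes)                       # center: should be near the true slope
sd(slopes)                         # spread: a simulated standard error
quantile(slopes, c(0.025, 0.975))  # middle 95% of the simulated slopes
```

Calling `hist(slopes)` afterwards plots the simulated sampling distribution. In practice we only ever see one sample, which is why the theoretical distributions (or resampling methods) are needed to approximate this picture.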
Readings
- Ch. 4 in Bailey, Real Econometrics
- Ch. 10 (optionally 8-9) in Modern Dive
- Visualizing Probability and Inference
Bailey teaches inferential statistics in the classical way (with reference to theoretical \(Z\) and \(t\) distributions, and \(Z\) and \(t\) tests). This is all valid. Again, you may wish to brush up with Khan Academy (the sections from sampling distributions through significance tests, though the whole class is helpful!).
The Modern Dive “book” (also free online, like R4DS) uses the infer package to run simulations for inferential statistics. Chapter 10 focuses on regression, but I also recommend the chapters leading up to it, which cover inferential statistics more broadly using this method.
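As a rough illustration of what simulation-based inference with infer looks like for a regression slope, here is a short sketch. It assumes the infer package is installed, and the choice of the built-in `mtcars` data (regressing `mpg` on `wt`) is mine for illustration rather than an example taken from the readings.

```r
library(infer)  # assumes infer is installed; it also provides the %>% pipe

set.seed(42)

# Observed slope from regressing mpg on wt (mtcars is an illustrative choice)
obs_slope <- mtcars %>%
  specify(mpg ~ wt) %>%
  calculate(stat = "slope")

# Bootstrap: resample the data with replacement and re-estimate the slope
boot_slopes <- mtcars %>%
  specify(mpg ~ wt) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "slope")

# 95% percentile confidence interval from the bootstrap distribution
ci <- get_confidence_interval(boot_slopes, level = 0.95, type = "percentile")

# Null distribution: permute the data to break any mpg-wt relationship
null_slopes <- mtcars %>%
  specify(mpg ~ wt) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "slope")

# Share of simulated slopes at least as extreme as the observed one
p_value <- get_p_value(null_slopes, obs_stat = obs_slope,
                       direction = "two-sided")

ci
p_value
```

The same `specify()`, `generate()`, `calculate()` pipeline produces both the confidence interval and the hypothesis test; only the `hypothesize()` step and the `type` of resampling change.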
The final link is a great website for visualizing basic statistical concepts like probability, distributions, confidence intervals, hypothesis tests, the central limit theorem, and regression.
R Practice
Today we will finish working on practice problems. Answers will be posted on that page later.
Appendix
See the online appendix for today’s content:
Slides
Below, you can find the slides in two formats. Clicking the image will bring you to the html version of the slides in a new tab. The lower button will allow you to download a PDF version of the slides.
I suggest printing the slides beforehand and using them to take additional notes in class (not everything is in the slides)!