2.6 — Inference for Regression — Class Content

Meeting Dates

Monday, October 3, 2022

Overview

We begin with some more time for you to work on the R Practice from last class.

This week is about inferential statistics: using statistics calculated from a sample of data to infer the true (and unmeasurable) parameters that describe a population. To do so, we can use our sample to compute a point estimate of a parameter, run a hypothesis test about the parameter's value, or construct a confidence interval that gives a range of plausible values for the true parameter.

These are standard principles of statistics, which you have hopefully learned before. If it has been a while since your last statistics class (or you have never taken one), be aware that this is one of the hardest concepts to understand at first glance. I recommend Khan Academy (from sampling distributions through significance tests, though the whole class is helpful!) or Google for these concepts, as every statistics class covers them in the standard way.

That being said, I do not cover inferential statistics in the standard way (see today's appendix below for an overview of the standard way). I think it will be more intuitive if I show you where these concepts come from by simulating a sampling distribution, rather than reciting the theoretical sampling distributions.
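To make "simulating a sampling distribution" concrete, here is a minimal sketch. It is written in plain Python rather than the R we use in class, only to keep it dependency-free. The population (true slope of 2, intercept of 1, noise with standard deviation 3), the sample size, and the function names are all made up for illustration. The idea: draw many samples from a population whose slope we know, estimate the regression slope in each one, and look at how those estimates are spread out.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(true_slope=2.0, n=50, reps=2000, seed=1):
    """Draw `reps` samples of size `n` from a hypothetical population
    and return the estimated slope from each sample."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        # Hypothetical population: y = 1 + true_slope * x + noise
        x = [rng.uniform(0, 10) for _ in range(n)]
        y = [1.0 + true_slope * xi + rng.gauss(0, 3) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


slopes = simulate_slopes()

# The center of the sampling distribution should sit near the true slope,
# and its standard deviation is what we call the standard error.
print("mean of sample slopes:", round(statistics.fmean(slopes), 3))
print("std. dev. of sample slopes:", round(statistics.stdev(slopes), 3))
```

A histogram of `slopes` would show the familiar bell shape centered on the true slope. In practice we only ever get one sample, which is why we need either theory or resampling to approximate this distribution.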

Readings

Bailey teaches inferential statistics in the classical way (with reference to theoretical Z and t distributions, and Z and t tests). This is all valid. Again, you may wish to brush up with Khan Academy (from sampling distributions through significance tests, though the whole class is helpful!).

The latter “book” (also free online, like R4DS) uses the infer package to run simulations for inferential statistics. Chapter 10 is focused on regression (but I also recommend the chapters leading up to it, which are on inferential statistics broadly, using this method).
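As a rough illustration of what simulation-based inference for a regression slope involves, the sketch below builds a bootstrap confidence interval and runs a permutation test by hand. It does not use the infer package or its functions. It is plain Python, chosen only to keep it dependency-free, and the data, seed, and variable names are invented for the example.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(42)

# Hypothetical sample (in practice, this would be your actual data)
n = 60
x = [rng.uniform(0, 10) for _ in range(n)]
y = [1.0 + 0.5 * xi + rng.gauss(0, 2) for xi in x]
observed = ols_slope(x, y)

reps = 2000

# Bootstrap: resample (x, y) pairs with replacement and re-estimate the slope
pairs = list(zip(x, y))
boot = []
for _ in range(reps):
    bx, by = zip(*rng.choices(pairs, k=n))
    boot.append(ols_slope(bx, by))

# Percentile interval: cut points at 2.5% and 97.5% of the bootstrap slopes
cuts = statistics.quantiles(boot, n=40)
ci_low, ci_high = cuts[0], cuts[-1]

# Permutation test of H0: slope = 0. Shuffling y breaks any link to x,
# so the shuffled slopes show what we would see if H0 were true.
perm = []
y_shuffled = list(y)
for _ in range(reps):
    rng.shuffle(y_shuffled)
    perm.append(ols_slope(x, y_shuffled))

# Two-sided p-value: share of shuffled slopes at least as extreme as ours
p_value = sum(abs(s) >= abs(observed) for s in perm) / reps

print("observed slope:", round(observed, 3))
print("95% bootstrap CI:", (round(ci_low, 3), round(ci_high, 3)))
print("permutation p-value:", p_value)
```

The bootstrap approximates the sampling distribution around our estimate, while the permutation step approximates the distribution we would expect under the null hypothesis. Both replace a theoretical t distribution with one generated from the data.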

The final link is a great website for visualizing basic statistical concepts like probability, distributions, confidence intervals, hypothesis tests, the central limit theorem, and regression.

R Practice

Today we will finish working on practice problems. Answers will be posted on that page later.

Appendix

See the online appendix for today’s content:

Slides

Below, you can find the slides in two formats. Clicking the image will bring you to the html version of the slides in a new tab. The lower button will allow you to download a PDF version of the slides.

I suggest printing the slides beforehand and using them to take additional notes in class (not everything is in the slides)!

2.6-slides

Download as PDF