2.7 — Hypothesis Testing (Regression)

ECON 480 • Econometrics • Fall 2022

Dr. Ryan Safner
Associate Professor of Economics

safner@hood.edu
ryansafner/metricsF22
metricsF22.classes.ryansafner.com

Hypothesis Testing

Digression: p-Values and the Philosophy of Science

Hypothesis Testing by Simulation with infer

Theory-Based Hypothesis Testing (What R Calculates)

The Use and Abuse of p-Values

Hypothesis Testing

Estimation and Hypothesis Testing I

We want to test whether our sample estimates are statistically significant, i.e. whether they tell us something about the population
- this is the “bread and butter” of using inferential statistics

Examples

Does reducing class size improve test scores?
Do more years of education increase your wages?
Is the gender wage gap between men and women 23%?

All modern science is built upon statistical hypothesis testing, so understand it well

Estimation and Hypothesis Testing II

Note, we can test a lot of hypotheses about a lot of population parameters, e.g.
- A population mean $\mu$
  - Example: average height of adults
- A population proportion $p$
  - Example: percent of voters who voted for Biden
- A difference in population means $\mu_A-\mu_B$
  - Example: difference in average wages of men vs. women
- A difference in population proportions $p_A-p_B$
  - Example: difference in percent of patients reporting symptoms of drug A vs B
We will focus on hypotheses about the population regression slope $(\beta_1)$, i.e. the causal effect of $X$ on $Y$

Null and Alternative Hypotheses I

All scientific inquiries begin with a null hypothesis $(H_0)$ that proposes a specific value of a population parameter
- Notation: add a subscript 0: $\beta_{1,0}$ (or $\mu_0$, $p_0$, etc)

We suggest an alternative hypothesis $(H_a)$, often the one we hope to verify
- Note, can be multiple alternative hypotheses: $H_1, H_2, \ldots , H_n$

Ask: “Does our data (sample) give us sufficient evidence to reject $H_0$ in favor of $H_a$?”
- Note: the test is always about $\mathbf{H_0}$!
- See if we have sufficient evidence to reject the status quo

Null and Alternative Hypotheses II

Null hypothesis assigns a value (or a range) to a population parameter
- e.g. $\beta_1=2$ or $\beta_1 \leq 20$
- Most common is $\beta_1=0$ $\implies$ $X$ has no effect on $Y$ (no slope for a line)
- Note: always contains an equality ($=$, $\leq$, or $\geq$)!

Alternative hypothesis must mathematically contradict the null hypothesis
- e.g. $\beta_1 \neq 2$ or $\beta_1 > 20$ or $\beta_1 \neq 0$
- Note: always an inequality!

Alternative hypotheses come in two forms:
1. One-sided alternative: $\beta_1 > \beta_{1,0}$ or $\beta_1 < \beta_{1,0}$
2. Two-sided alternative: $\beta_1 \neq \beta_{1,0}$
  - Note this means either $\beta_1 < \beta_{1,0}$ or $\beta_1 > \beta_{1,0}$

Components of a Valid Hypothesis Test

All statistical hypothesis tests have the following components:

A null hypothesis, $H_0$

An alternative hypothesis, $H_a$

A test statistic to determine if we reject $H_0$ when the statistic reaches a “critical value”
- Beyond the critical value is the “rejection region”, sufficient evidence to reject $H_0$

A conclusion whether or not to reject $H_0$ in favor of $H_a$

Type I and Type II Errors I

Sample statistic $(\hat{\beta_1})$ will rarely be exactly equal to the hypothesized parameter value $(\beta_{1,0})$
Difference between observed statistic and true parameter could be because:

Parameter is not the hypothesized value
- $H_0$ is false

Parameter truly is the hypothesized value, but sampling variability gave us a different estimate
- $H_0$ is true

We cannot distinguish between these two possibilities with any certainty
So, we can interpret our estimates probabilistically as committing one of two types of error

Type I and Type II Errors II

Type I error (false positive): rejecting $H_0$ when it is in fact true
- Believing we found an important result when there is truly no relationship

Type II error (false negative): failing to reject $H_0$ when it is in fact false
- Believing we found nothing when there was truly a relationship to find

Type I and Type II Errors III

Depending on context, committing one type of error may be more serious than the other

Type I and Type II Errors IV

Anglo-American common law presumes defendant is innocent: $H_0$

Jury judges whether the evidence presented against the defendant is plausible assuming the defendant were in fact innocent

If highly improbable (beyond a “reasonable doubt”): sufficient evidence to reject $H_0$ and convict

Type I and Type II Errors V

William Blackstone

(1723-1780)

“It is better that ten guilty persons escape than that one innocent suffer.”

Type I error is worse than a Type II error in law!

Blackstone, William, 1765-1770, Commentaries on the Laws of England

Type I and Type II Errors VI

Type I and Type II Errors VII

Significance Level, $\alpha$, and Confidence Level $1-\alpha$

The significance level, $\alpha$, is the probability of a Type I error

\[\alpha=P(\text{Reject } H_0 | H_0 \text{ is true})\]

The confidence level is defined as $(1-\alpha)$
- Specify in advance an $\alpha$-level (0.10, 0.05, 0.01) with associated confidence level (90%, 95%, 99%)

The probability of a Type II error is defined as $\beta$:

\[\beta=P(\text{Don't reject } H_0 | H_0 \text{ is false})\]

$\alpha$ and $\beta$

Power and p-values

The statistical power of the test is $(1-\beta)$: the probability of correctly rejecting $H_0$ when $H_0$ is in fact false (e.g. convicting a guilty person)

\[\text{Power} = 1- \beta = P(\text{Reject }H_0|H_0 \text{ is false})\]
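The error rates defined above can be approximated by simulation. The following is a minimal sketch in base R, not part of the original slides: the sample size, the true slope of 0.5 in the "alternative" world, and the helper name `sim_p_value()` are all illustrative assumptions.

```r
set.seed(480) # make the simulation reproducible

# Hypothetical helper: simulate one sample and return the p-value on the slope.
# true_slope = 0 means H0: beta_1 = 0 is true in the simulated population.
sim_p_value <- function(true_slope, n = 50) {
  x <- rnorm(n)
  y <- 1 + true_slope * x + rnorm(n)
  coef(summary(lm(y ~ x)))["x", "Pr(>|t|)"]
}

alpha  <- 0.05
n_sims <- 2000

p_null <- replicate(n_sims, sim_p_value(true_slope = 0))   # worlds where H0 is true
p_alt  <- replicate(n_sims, sim_p_value(true_slope = 0.5)) # worlds where H0 is false

type_1_rate <- mean(p_null < alpha)  # false positives: should be close to alpha
type_2_rate <- mean(p_alt >= alpha)  # false negatives: an estimate of beta
power       <- 1 - type_2_rate       # correct rejections

c(type_1 = type_1_rate, type_2 = type_2_rate, power = power)
```

The Type I rate is pinned down by our choice of $\alpha$, while the Type II rate (and so the power) depends on the sample size and on how far the true slope is from the null value.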

The p-value or significance probability is the probability that, if the null hypothesis were true, the test statistic from any sample will be at least as extreme as the test statistic from our sample

\[p(\delta \geq \delta_i|H_0 \text{ is true})\]

where $\delta$ represents some test statistic
$\delta_i$ is the test statistic we observe in our sample
More on this in a bit

p-values and Statistical Significance

After running our test, we need to make a decision between the competing hypotheses
Compare $p$-value with pre-determined $\alpha$ (commonly, $\alpha=0.05$, 95% confidence level)
If $p<\alpha$: statistically significant evidence sufficient to reject $H_0$ in favor of $H_a$
- Note this does not mean $H_a$ is true! We merely have rejected $H_0$!
If $p \geq \alpha$: insufficient evidence to reject $H_0$
- Note this does not mean $H_0$ is true! We merely have failed to reject $H_0$!

Digression: p-Values and the Philosophy of Science

Hypothesis Testing and the Philosophy of Science I

Sir Ronald A. Fisher

(1890-1962)

“The null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis.”

Fisher, R.A., 1935, The Design of Experiments

Hypothesis Testing and the Philosophy of Science II

Modern philosophy of science is largely based on hypothesis testing and falsifiability, which form the “Scientific Method”
For something to be “scientific”, it must be falsifiable, or at least testable in principle
Hypotheses can be corroborated with evidence, but remain tentative until they are falsified by data that suggest an alternative hypothesis
“All swans are white” is a hypothesis rejected upon discovery of a single black swan

Hypothesis Testing and p-Values

Hypothesis testing, confidence intervals, and p-values are probably the hardest things to understand in statistics

Fivethirtyeight: Not Even Scientists Can Easily Explain P-values

Hypothesis Testing: Which Test? I

Rigorous course on statistics (ECMG 212 or MATH 112) will spend weeks going through different types of tests:
- Sample mean; difference of means
- Proportion; difference of proportions
- Z-test vs t-test
- 1 sample vs. 2 samples
- $\chi^2$ test

Hypothesis Testing: Which Test? II

There is Only One Test!

Fortunately, some clever statisticians realized “there is only one test” and some built a nice R package called infer

Calculate a statistic, $\delta_i$, from a sample of data
Simulate a world where $\delta$ is null $(H_0)$
Examine the distribution of $\delta$ across the null world
Calculate the probability that $\delta_i$ could exist in the null world
Decide if $\delta_i$ is statistically significant
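The five steps above can be sketched by hand in base R before turning to any package. This is only an illustration under stated assumptions: the data are simulated, and the names `slope()`, `delta_i`, and `null_slopes` are made up for this example.

```r
set.seed(480) # make the shuffles reproducible

# Hypothetical data: y depends weakly on x
n <- 100
x <- rnorm(n)
y <- 2 + 0.3 * x + rnorm(n)

# Step 1: calculate a statistic from the sample (here, the OLS slope)
slope   <- function(x, y) cov(x, y) / var(x)
delta_i <- slope(x, y)

# Step 2: simulate a null world by shuffling y, which breaks any link to x
null_slopes <- replicate(1000, slope(x, sample(y)))

# Step 3: examine the distribution of the statistic across the null world
summary(null_slopes)

# Step 4: probability of a slope at least as extreme as ours in the null world
p_value <- mean(abs(null_slopes) >= abs(delta_i))

# Step 5: decide if our statistic is statistically significant
alpha <- 0.05
p_value < alpha
```

Shuffling `y` keeps both variables' distributions intact while making them independent by construction, which is exactly the null world we want to compare against.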

Elements of a Hypothesis Test

Allen Downey: “There is still only one test”

Hypothesis Testing with the infer Package I

R naturally runs the following hypothesis test on any regression as part of lm():

\[\begin{align*} H_0: \; & \beta_1=0\\ H_1: \; & \beta_1 \neq 0 \end{align*}\]

infer allows you to run through these steps manually to understand the process:

specify() a model

hypothesize() the null

generate() simulations of the null world

calculate() the $p$-value

visualize() with a histogram (optional)

Hypothesis Testing with the infer Package II

Theory-Based Inference: Critical Values of Test Statistic

Test statistic $\delta$: measures how far what we observed in our sample $(\hat{\beta_1})$ is from what we would expect if the null hypothesis were true $(\beta_1=0)$
- Calculated from a sampling distribution of the estimator (i.e. $\hat{\beta_1})$
- In econometrics, we use $t$-distributions which have $n-k-1$ degrees of freedom
Rejection region: if the test statistic reaches a “critical value” of $\delta$, then we reject the null hypothesis

Theory-Based Inference: Critical Values of Test Statistic

Hypothesis Testing by Simulation, with `infer`

Imagine a Null World, where $H_0$ is True

Our world, and a world where $\beta_1=0$ by assumption.

Comparing the Worlds I

From that null world where $H_0: \, \beta_1=0$ is true, we simulate another sample and calculate OLS estimators again

Comparing the Worlds II

From that null world where $H_0: \, \beta_1=0$ is true, let’s simulate 1,000 samples and calculate slope $(\hat{\beta_1})$ for each

Prepping the `infer` Pipeline

Before I show you how to do this, let’s first save our estimated slope from our actual sample
- We’ll want this later!

# save as our_slope
our_slope <- school_reg %>% 
  tidy() %>%
  filter(term == "str") %>%
  pull(estimate)

# look at it
our_slope

[1] -2.279808

The `infer` Pipeline: `specify()`

data %>%

specify(y ~ x)

Take our data and pipe it into the specify() function, which is essentially an lm() function for regression (for our purposes)

ca_school %>%
  specify(testscr ~ str)

The `infer` Pipeline: `hypothesize()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence")

Describe what the null hypothesis is here
In infer’s language, str and testscr are independent $(\beta_1=0)$

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence")

The `infer` Pipeline: `generate()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence") %>%

generate(reps = n, type = "permute")

Now the magic starts, as we run a number of simulated samples
Set the number of reps and set the type equal to "permute" (not bootstrap)
- Permutation randomly matches $X$-values and $Y$-values from the data so that there is no relationship between $X$ and $Y$

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute")

The `infer` Pipeline: `calculate()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence") %>%

generate(reps = n, type = "permute") %>%

calculate(stat = "slope")

We calculate sample statistics for each of the 1,000 replicate samples
In our case, calculate the slope $(\hat{\beta}_1)$ for each replicate

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute") %>%
  calculate(stat = "slope")

The `infer` Pipeline: `get_p_value()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence") %>%

generate(reps = n, type = "permute") %>%

calculate(stat = "slope") %>%

get_p_value(obs_stat = "", direction = "both")

We can calculate the p-value
- the probability of seeing a value at least as extreme as our_slope (-2.28) in our simulated null distribution
For the two-sided alternative $H_a: \beta_1 \neq 0$, we double the raw $p$-value

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute") %>%
  calculate(stat = "slope") %>%
  get_p_value(obs_stat = our_slope,
              direction = "both")
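For readers without the `ca_school` data at hand, here is a self-contained sketch of the same pipeline on R's built-in `mtcars` data, which is a stand-in and not the data from these slides. The choice of `mpg ~ qsec` and the object names are assumptions for illustration; the simulated p-value is compared with the theory-based one that `lm()` reports.

```r
library(infer) # provides specify(), hypothesize(), generate(), calculate(), get_p_value()

set.seed(480) # make the permutations reproducible

# Observed slope and theory-based p-value from an ordinary regression
fit       <- lm(mpg ~ qsec, data = mtcars)
obs_slope <- coef(fit)[["qsec"]]
p_theory  <- coef(summary(fit))["qsec", "Pr(>|t|)"]

# Simulated null distribution of slopes
null_dist <- mtcars %>%
  specify(mpg ~ qsec) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "slope")

# Two-sided simulation-based p-value (a tibble with one column, p_value)
p_sim <- null_dist %>%
  get_p_value(obs_stat = obs_slope, direction = "both")

c(simulated = p_sim$p_value, theory = p_theory)
```

The two p-values will not match exactly, since one comes from 1,000 random shuffles and the other from a $t$-distribution, but they should lead to the same conclusion.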

The `infer` Pipeline: `visualize()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence") %>%

generate(reps = n, type = "permute") %>%

calculate(stat = "slope") %>%

visualize()

Make a histogram of our null distribution of $\hat{\beta_1}$
- Note it is centered at 0 because $H_0$ is $\beta_1=0$!

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute") %>%
  calculate(stat = "slope") %>%
  visualize()

The `infer` Pipeline: `visualize()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence") %>%

generate(reps = n, type = "permute") %>%

calculate(stat = "slope") %>%

visualize()

Add our our_slope to show our finding on the null distr.

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute") %>%
  calculate(stat = "slope") %>%
  visualize() +
  shade_p_value(obs_stat = our_slope, direction = NULL) # draw the line only, no shading

The `infer` Pipeline: `visualize()`

data %>%

specify(y ~ x) %>%

hypothesize(null = "independence") %>%

generate(reps = n, type = "permute") %>%

calculate(stat = "slope") %>%

visualize() + shade_p_value()

Add shade_p_value() to see what $p$ is

ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute") %>%
  calculate(stat = "slope") %>%
  visualize() +
  shade_p_value(obs_stat = our_slope,
                direction = "two_sided")

`visualize()` is Just a Wrapper for `ggplot`

# infer
ca_school %>%
  specify(testscr ~ str) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000,
           type = "permute") %>%
  calculate(stat = "slope") %>%
  # pipe into ggplot
  ggplot(data = .)+
  aes(x = stat)+
  geom_histogram(color="white", fill="#e64173")+
  geom_vline(xintercept = our_slope,
             color = "blue",
             size = 2,
             linetype = "dashed")+
  annotate(geom = "label",
           x = -2.28,
           y = 100,
           label = expression(paste("Our ", hat(beta[1]))),
           color = "blue")+
  scale_y_continuous(lim=c(0,130),
                     expand = c(0,0))+
  labs(x = expression(paste("Sampling distribution of ", hat(beta)[1], " under ", H[0], ":  ", beta[1]==0)),
       y = "Samples")+
    theme_classic(base_family = "Fira Sans Condensed",
           base_size=20)

Theory-Based Hypothesis Testing (What R Calculates)

What R Does: Theory-Based Statistical Inference I

R does things the old-fashioned way, using a theoretical null distribution instead of simulating one
A t-distribution with $n-k-1$ df
Calculate a $t$-statistic for $\hat{\beta_1}$:

\[\text{test statistic} = \frac{\text{estimate} - \text{null hypothesis}}{\text{standard error of estimate}}\]

What R Does: Theory-Based Statistical Inference II

\[\text{test statistic} = \frac{\text{estimate} - \text{null hypothesis}}{\text{standard error of estimate}}\]

$t$ has the same interpretation as $Z$: the number of std. dev. away from the sampling distribution’s expected value $E[\hat{\beta_1}]$ (if $H_0$ were true)
Compare to a critical value $t^*$ (pre-determined by the $\alpha$-level & $n-k-1$ df)
- For 95% confidence, $\alpha=0.05$, $t^* \approx 2$
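The rule of thumb $t^* \approx 2$ can be checked with `qt()`, base R's quantile function for the $t$-distribution. A small sketch, assuming the 418 degrees of freedom from our class-size regression and a two-sided test:

```r
df    <- 418                 # n - k - 1 for our regression
alpha <- c(0.10, 0.05, 0.01) # common significance levels

# Two-sided test: put alpha / 2 in each tail
t_crit <- qt(1 - alpha / 2, df = df)

data.frame(alpha = alpha, confidence = 1 - alpha, t_crit = round(t_crit, 3))
```

With this many degrees of freedom the critical value for $\alpha=0.05$ is about 1.97, which is why 2 works as a quick approximation.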

What R Does: Theory-Based Statistical Inference III

\[\begin{align*} t &= \frac{\hat{\beta_1}-\beta_{1,0}}{se(\hat{\beta_1})}\\ t &= \frac{-2.28-0}{0.48}\\ t &= -4.75\\ \end{align*}\]

Our sample slope $\hat{\beta_1}$ is 4.75 standard deviations below the expected value $E[\hat{\beta_1}]$ (i.e. 0) if $H_0$ were true

What R Does: Theory-Based Statistical Inference IV

\[\begin{align*} t &= \frac{\hat{\beta_1}-\beta_{1,0}}{se(\hat{\beta_1})}\\ t &= \frac{-2.28-0}{0.48}\\ t &= -4.75\\ \end{align*}\]

p-value: prob. of a test statistic at least as large (in magnitude) as ours if the null hypothesis were true
- Continuous distribution implies we need probability of area beyond our value
- p-value is 2-sided for $H_a: \beta_1 \neq 0$
$2 \times p(t_{418}> \vert -4.75\vert)=0.0000028$

One-Sided Tests & p-Values

$H_a: \beta_1<0$

p-value: $p(t \leq t_i)$

$H_a: \beta_1>0$

p-value: $p(t \geq t_i)$
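The two one-sided p-values can be computed with `pt()`. A minimal sketch, assuming the $t$-statistic of $-4.75$ and 418 degrees of freedom from our regression:

```r
t_i <- -4.75 # our observed t-statistic
df  <- 418   # degrees of freedom

# H_a: beta_1 < 0, area to the LEFT of our t
p_left  <- pt(t_i, df = df, lower.tail = TRUE)

# H_a: beta_1 > 0, area to the RIGHT of our t
p_right <- pt(t_i, df = df, lower.tail = FALSE)

c(left = p_left, right = p_right)
```

Because our $t$ is far into the left tail, the left-sided p-value is tiny while the right-sided one is nearly 1: the direction of the alternative matters.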

Two-Sided Tests and p-Values

$H_a: \beta_1 \neq 0$

p-value: $2 \times p(t \geq |t_i|)$

Calculating p-Values in `R`

pt() calculates probabilities on a t distribution with arguments:
- the t-score
- df = the degrees of freedom
- lower.tail =
  - TRUE if looking at area to LEFT of value
  - FALSE if looking at area to RIGHT of value

2 * pt(4.75, # I'll double the right tail
       df = 418,
       lower.tail = F) # right tail

[1] 2.800692e-06

$2 \times p(t_{418}> \vert -4.75\vert)=0.0000028$

Hypothesis Tests in Regression Output I

school_reg %>% summary()


Call:
lm(formula = testscr ~ str, data = ca_school)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.727 -14.251   0.483  12.822  48.540 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 698.9330     9.4675  73.825  < 2e-16 ***
str          -2.2798     0.4798  -4.751 2.78e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.58 on 418 degrees of freedom
Multiple R-squared:  0.05124,   Adjusted R-squared:  0.04897 
F-statistic: 22.58 on 1 and 418 DF,  p-value: 2.783e-06

Hypothesis Tests in Regression Output II

In broom’s tidy() (with confidence intervals)

tidy(school_reg, conf.int=TRUE)

p-value on str is 0.00000278.
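The numbers in a regression table can also be pulled out and rebuilt by hand. The sketch below uses R's built-in `mtcars` data as a stand-in (an assumption, since `ca_school` is not bundled with R) and checks that the reported $t$ value and p-value follow from the estimate and its standard error.

```r
# Stand-in regression on built-in data
model <- lm(mpg ~ qsec, data = mtcars)
coefs <- coef(summary(model)) # matrix: Estimate, Std. Error, t value, Pr(>|t|)

est <- coefs["qsec", "Estimate"]
se  <- coefs["qsec", "Std. Error"]

# Rebuild the test of H0: beta_1 = 0 by hand
t_stat <- (est - 0) / se
p_val  <- 2 * pt(abs(t_stat), df = df.residual(model), lower.tail = FALSE)

# Both comparisons should return TRUE
all.equal(t_stat, coefs["qsec", "t value"])
all.equal(p_val,  coefs["qsec", "Pr(>|t|)"])
```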

Conclusions

\[\begin{align*} H_0: \,& \beta_1=0\\ H_a: \, & \beta_1 \neq 0\\ \end{align*}\]

Because the hypothesis test’s $p$-value $<$ $\alpha$ (0.05)…
We have sufficient evidence to reject $H_0$ in favor of our alternative hypothesis. Our sample suggests that there is a relationship between class size and test scores.

Using the confidence intervals:
We are 95% confident that, from similarly constructed samples, the true marginal effect of class size on test scores is between -3.22 and -1.34.

Hypothesis Testing vs. Confidence Intervals

Confidence intervals are all two-sided by nature

\[CI_{0.95}=\left(\left[\hat{\beta_1}-\underbrace{2 \times se(\hat{\beta_1})}_{MOE}\right], \, \left[\hat{\beta_1}+\underbrace{2 \times se(\hat{\beta_1})}_{MOE}\right] \right)\]

Hypothesis test ($t$-test) of $H_0: \, \beta_1=0$ computes a $t$-value which, since our null hypothesis is that $\beta_{1,0}=0$, simplifies to this neat fraction:

\[t=\frac{\hat{\beta_1}}{se(\hat{\beta_1})}\]

and $p<0.05$ when $|t| \geq 2$ (approximately)

If our confidence interval contains the $H_0$ value (i.e. $0$, for our test), then we fail to reject $H_0$.
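The link between the two approaches can be verified with the estimate and standard error from our regression output. This sketch uses the exact critical value from `qt()` rather than the rule-of-thumb 2; the variable names are illustrative.

```r
beta_hat <- -2.2798 # estimated slope on str
se       <- 0.4798  # its standard error
df       <- 418     # degrees of freedom

# 95% confidence interval: estimate +/- critical value * standard error
t_star <- qt(0.975, df = df)
ci     <- c(lower = beta_hat - t_star * se,
            upper = beta_hat + t_star * se)

# Two-sided t-test of H0: beta_1 = 0
t_stat  <- beta_hat / se
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

round(ci, 2)                               # interval reported on the Conclusions slide
contains_null <- ci[["lower"]] <= 0 && 0 <= ci[["upper"]]
c(contains_null = contains_null, reject_H0 = p_value < 0.05)
```

The interval excludes 0 and the test rejects $H_0$ at $\alpha=0.05$: the two statements are the same finding seen from different sides.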

The Use and Abuse of $p$-values

p-Hacking

Consider what 95% confident or $\alpha=0.05$ means
If we repeat a procedure 20 times when $H_0$ is true, we should expect about $\frac{1}{20}$ (5%) of the results to be a fluke “significant” result!
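A quick sketch of why repeated testing is dangerous. It assumes 20 independent tests in which $H_0$ is true every time, and uses the fact that p-values are uniformly distributed on $[0, 1]$ under a true null (for a continuous test statistic).

```r
alpha <- 0.05
k     <- 20 # number of independent tests, all with H0 true

expected_flukes <- alpha * k           # expected number of false positives
p_at_least_one  <- 1 - (1 - alpha)^k   # chance of one or more false positives

# Check by simulation: draw k null p-values, count how many fall below alpha
set.seed(480)
flukes <- replicate(5000, sum(runif(k) < alpha))

c(expected_flukes = expected_flukes,
  p_at_least_one  = p_at_least_one,
  simulated_share = mean(flukes >= 1))
```

Even with nothing to find, the chance of at least one "significant" result across 20 tries is roughly 64%, which is what makes selective reporting so misleading.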

Image source: Seeing Theory

Abusing p-values and “Science”

Source: Washington Post

Abusing p-Values and “Science” I

Source: SMBC

Abusing p-Values and “Science” II

“The widespread use of ‘statistical significance’ (generally interpreted as $p \leq 0.05$) as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process.”

Wasserstein, Ronald L. and Nicole A. Lazar, (2016), “The ASA’s Statement on p-Values: Context, Process, and Purpose,” The American Statistician 70(2): 129-133

Abusing p-Values and “Science” III

“No economist has achieved scientific success as a result of a statistically significant coefficient. Massed observations, clever common sense, elegant theorems, new policies, sagacious economic reasoning, historical perspective, relevant accounting, these have all led to scientific success. Statistical significance has not,” (p.112).

McCloskey, Deirdre N and Stephen Ziliak, 1996, The Cult of Statistical Significance

Common Misconceptions About p-Values

❌ $p$ is the probability that the alternative hypothesis is false - We can never prove an alternative hypothesis, only tentatively reject a null hypothesis

❌ $p$ is the probability that the null hypothesis is true - We’re not proving the $H_0$ is false, only saying that it’s very unlikely that if $H_0$ were true, we’d obtain a slope as rare as our sample’s slope

❌ $p$ is the probability that our observed effects were produced purely by random chance - $p$ is computed under a specific model (think about our null world) that assumes $H_0$ is true

❌ $p$ tells us how significant our finding is - $p$ tells us nothing about the size or the real world significance of any effect deemed “statistically significant” - it only tells us that the slope is statistically significantly different from 0 (if $H_0$ is $\beta_1=0)$

p-Values: Restatement

Again, p-value is the probability that, if the null hypothesis were true, we obtain (by pure random chance) a test statistic at least as extreme as the one we estimated for our sample
A low p-value means either (and we can’t distinguish which):
1. $H_0$ is true and a highly improbable event has occurred OR
2. $H_0$ is false

Statistical Significance In Regression Tables

	Test Score
Constant	698.93***
	(9.47)
STR	−2.28***
	(0.48)
n	420
R²	0.05
SER	18.54
* p < 0.1, ** p < 0.05, *** p < 0.01

Statistical significance is shown by asterisks, common (but not always!) standard:
- 1 asterisk: significant at $\alpha=0.10$
- 2 asterisks: significant at $\alpha=0.05$
- 3 asterisks: significant at $\alpha=0.01$
Rare, but sometimes regression tables include $p$-values for estimates
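The asterisk convention above is easy to express as a function. This is a sketch with a made-up helper name, `signif_stars()`, using the thresholds listed on this slide; note that R's own `summary()` output uses a different legend (`***` for 0.001, `**` for 0.01, `*` for 0.05), so always check the table note.

```r
# Hypothetical helper: map p-values to asterisks using this slide's convention
signif_stars <- function(p) {
  ifelse(p < 0.01, "***",
  ifelse(p < 0.05, "**",
  ifelse(p < 0.10, "*", "")))
}

# p-value on str from our regression, plus a few made-up values
signif_stars(c(2.78e-06, 0.03, 0.07, 0.5))
# returns "***" "**" "*" ""
```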
