2.4 — Goodness of Fit and Bias — Appendix
Deriving the OLS Estimators
The population linear regression model is:

$$Y_i=\beta_0+\beta_1 X_i+u_i \tag{1}$$

The errors $u_i$ represent the difference between each observed $Y_i$ and the population regression line.

Recall our goal is to find the estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the sum of squared residuals. So our minimization problem is:

$$\min_{\hat{\beta}_0,\hat{\beta}_1} \sum_{i=1}^n \left(Y_i-\hat{\beta}_0-\hat{\beta}_1 X_i\right)^2$$

Using calculus, we take the partial derivative with respect to each estimator and set each equal to 0 to find a minimum. The first order conditions are:

$$\frac{\partial}{\partial \hat{\beta}_0}: \quad -2\sum_{i=1}^n \left(Y_i-\hat{\beta}_0-\hat{\beta}_1 X_i\right)=0$$

$$\frac{\partial}{\partial \hat{\beta}_1}: \quad -2\sum_{i=1}^n X_i\left(Y_i-\hat{\beta}_0-\hat{\beta}_1 X_i\right)=0$$
Finding $\hat{\beta}_0$

Working with the first FOC, divide both sides by $-2$:

$$\sum_{i=1}^n \left(Y_i-\hat{\beta}_0-\hat{\beta}_1 X_i\right)=0$$

Then expand the summation across all terms and divide by $n$:

$$\frac{1}{n}\sum_{i=1}^n Y_i-\frac{1}{n}\sum_{i=1}^n \hat{\beta}_0-\frac{1}{n}\sum_{i=1}^n \hat{\beta}_1 X_i=0$$

Note the first term is $\bar{Y}$, the second is simply the constant $\hat{\beta}_0$, and the third is $\hat{\beta}_1\bar{X}$. So we can rewrite as:

$$\bar{Y}-\hat{\beta}_0-\hat{\beta}_1\bar{X}=0$$

Rearranging:

$$\hat{\beta}_0=\bar{Y}-\hat{\beta}_1\bar{X} \tag{2}$$
Finding $\hat{\beta}_1$

To find $\hat{\beta}_1$, start with the second FOC and again divide both sides by $-2$:

$$\sum_{i=1}^n X_i\left(Y_i-\hat{\beta}_0-\hat{\beta}_1 X_i\right)=0$$

From the formula for $\hat{\beta}_0$ in equation (2), substitute in $\bar{Y}-\hat{\beta}_1\bar{X}$ for $\hat{\beta}_0$:

$$\sum_{i=1}^n X_i\left(Y_i-(\bar{Y}-\hat{\beta}_1\bar{X})-\hat{\beta}_1 X_i\right)=0$$

Combining similar terms:

$$\sum_{i=1}^n X_i\left((Y_i-\bar{Y})-\hat{\beta}_1(X_i-\bar{X})\right)=0$$

Distribute $X_i$ and split the summation:

$$\sum_{i=1}^n X_i(Y_i-\bar{Y})-\hat{\beta}_1\sum_{i=1}^n X_i(X_i-\bar{X})=0$$

Move the second term to the righthand side:

$$\sum_{i=1}^n X_i(Y_i-\bar{Y})=\hat{\beta}_1\sum_{i=1}^n X_i(X_i-\bar{X})$$

Divide to keep just $\hat{\beta}_1$:

$$\hat{\beta}_1=\frac{\sum_{i=1}^n X_i(Y_i-\bar{Y})}{\sum_{i=1}^n X_i(X_i-\bar{X})}$$

Note that from the rules about summation operators:

$$\sum_{i=1}^n X_i(Y_i-\bar{Y})=\sum_{i=1}^n (X_i-\bar{X})(Y_i-\bar{Y})$$

and:

$$\sum_{i=1}^n X_i(X_i-\bar{X})=\sum_{i=1}^n (X_i-\bar{X})^2$$

Plug in these two facts:

$$\hat{\beta}_1=\frac{\sum_{i=1}^n (X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^n (X_i-\bar{X})^2} \tag{3}$$
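Equations (2) and (3) can be verified numerically. Below is a minimal sketch in Python with NumPy (the data is simulated purely for illustration), comparing the deviation formulas against NumPy's own least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative only): Y = 2 + 3X + noise
X = rng.uniform(0, 10, size=100)
Y = 2 + 3 * X + rng.normal(0, 1, size=100)

# Equation (3): sum of cross-deviations over sum of squared deviations
beta1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)

# Equation (2): intercept from the sample means and the slope
beta0_hat = Y.mean() - beta1_hat * X.mean()

# Cross-check against NumPy's least-squares polynomial fit
slope, intercept = np.polyfit(X, Y, deg=1)
print(np.isclose(beta1_hat, slope), np.isclose(beta0_hat, intercept))  # True True
```

Both approaches minimize the same sum of squared residuals, so they agree up to floating-point error.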
Algebraic Properties of OLS Estimators
The OLS residuals $\hat{u}_i$ and predicted values $\hat{Y}_i$ have some useful algebraic properties:

- The expected value (average) of the residuals is 0:

$$\frac{1}{n}\sum_{i=1}^n \hat{u}_i=0$$

- The covariance between $X$ and the residuals is 0:

$$\sum_{i=1}^n X_i \hat{u}_i=0$$

Note the first two properties imply strict exogeneity. That is, this is only a valid model if $X$ and $u$ are not correlated.

- The expected predicted value of $Y$ is equal to the expected value of $Y$:

$$\bar{\hat{Y}}=\frac{1}{n}\sum_{i=1}^n \hat{Y}_i=\bar{Y}$$

- Total sum of squares is equal to the explained sum of squares plus the sum of squared errors:

$$\sum_{i=1}^n (Y_i-\bar{Y})^2=\sum_{i=1}^n (\hat{Y}_i-\bar{Y})^2+\sum_{i=1}^n \hat{u}_i^2$$

Recall $R^2=\frac{ESS}{TSS}=1-\frac{SSE}{TSS}$.

- The regression line passes through the point $(\bar{X},\bar{Y})$, i.e. the mean of $X$ and the mean of $Y$.
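These properties hold mechanically for any OLS fit, which makes them easy to check in code. A quick sketch in Python (simulated data, values arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=200)
Y = 1 + 2 * X + rng.normal(0, 2, size=200)  # arbitrary illustrative model

# OLS via the formulas derived above
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X
u_hat = Y - Y_hat

TSS = np.sum((Y - Y.mean()) ** 2)
ESS = np.sum((Y_hat - Y.mean()) ** 2)
SSE = np.sum(u_hat ** 2)

assert np.isclose(u_hat.mean(), 0)               # residuals average to zero
assert np.isclose(np.sum(X * u_hat), 0)          # residuals uncorrelated with X
assert np.isclose(Y_hat.mean(), Y.mean())        # mean prediction equals mean of Y
assert np.isclose(TSS, ESS + SSE)                # TSS = ESS + SSE
assert np.isclose(b0 + b1 * X.mean(), Y.mean())  # line passes through the means
```

Each assertion corresponds to one bullet above; all of them follow directly from the first order conditions, not from any assumption about the data.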
Bias in $\hat{\beta}_1$

Begin with the formula we derived for $\hat{\beta}_1$:

$$\hat{\beta}_1=\frac{\sum_{i=1}^n (X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^n (X_i-\bar{X})^2}$$

Recall from Rule 6 of summations, we can rewrite the numerator as $\sum_{i=1}^n (X_i-\bar{X})Y_i$:

$$\hat{\beta}_1=\frac{\sum_{i=1}^n (X_i-\bar{X})Y_i}{\sum_{i=1}^n (X_i-\bar{X})^2}$$

We know the true population relationship is expressed as:

$$Y_i=\beta_0+\beta_1 X_i+u_i$$

Substituting this in for $Y_i$ in the numerator:

$$\hat{\beta}_1=\frac{\sum_{i=1}^n (X_i-\bar{X})(\beta_0+\beta_1 X_i+u_i)}{\sum_{i=1}^n (X_i-\bar{X})^2}$$

Breaking apart the sums in the numerator:

$$\hat{\beta}_1=\frac{\beta_0\sum_{i=1}^n (X_i-\bar{X})+\beta_1\sum_{i=1}^n (X_i-\bar{X})X_i+\sum_{i=1}^n (X_i-\bar{X})u_i}{\sum_{i=1}^n (X_i-\bar{X})^2} \tag{4}$$

We can simplify equation (4) using Rules 4 and 5 of summations:

- The first term in the numerator, $\beta_0\sum_{i=1}^n (X_i-\bar{X})$, has the constant $\beta_0$, which can be pulled out of the summation. This leaves the summation of deviations, which adds up to 0 as per Rule 4:

$$\beta_0\sum_{i=1}^n (X_i-\bar{X})=0$$

- The second term in the numerator, $\beta_1\sum_{i=1}^n (X_i-\bar{X})X_i$, has the constant $\beta_1$, which can be pulled out of the summation. Additionally, Rule 5 tells us $\sum_{i=1}^n (X_i-\bar{X})X_i=\sum_{i=1}^n (X_i-\bar{X})^2$. When placed back in the context of being the numerator of a fraction, we can see this term simplifies to just $\beta_1$:

$$\frac{\beta_1\sum_{i=1}^n (X_i-\bar{X})^2}{\sum_{i=1}^n (X_i-\bar{X})^2}=\beta_1$$

Thus, we are left with:

$$\hat{\beta}_1=\beta_1+\frac{\sum_{i=1}^n (X_i-\bar{X})u_i}{\sum_{i=1}^n (X_i-\bar{X})^2}$$
Now, take the expectation of both sides:

$$E[\hat{\beta}_1]=E\left[\beta_1+\frac{\sum_{i=1}^n (X_i-\bar{X})u_i}{\sum_{i=1}^n (X_i-\bar{X})^2}\right]$$

We can break this up, using properties of expectations. First, recall the expectation of a sum is the sum of expectations. Second, the true population value of $\beta_1$ is a constant, so $E[\beta_1]=\beta_1$. Third, since we assume the $X_i$ are fixed (nonrandom), the deviations $(X_i-\bar{X})$ can be treated as constants.

Thus, the properties of the equation are primarily driven by the expectation of the second term:

$$E[\hat{\beta}_1]=\beta_1+E\left[\frac{\sum_{i=1}^n (X_i-\bar{X})u_i}{\sum_{i=1}^n (X_i-\bar{X})^2}\right]$$

Now divide the numerator and denominator of the second term by $n$. Using the rules about summation operators, the numerator becomes the sample covariance of $X$ and $u$, and the denominator becomes the sample variance of $X$:

$$E[\hat{\beta}_1]=\beta_1+\frac{cov(X,u)}{var(X)}$$

By the Zero Conditional Mean assumption of OLS, $cov(X,u)=0$, so the second term drops out and $E[\hat{\beta}_1]=\beta_1$: the estimator is unbiased. If $X$ and $u$ are correlated, the bias is $\frac{cov(X,u)}{var(X)}$.
Alternatively, we can express the bias in terms of correlation instead of covariance. From the definition of correlation:

$$corr(X,u)=\frac{cov(X,u)}{s_X s_u} \implies cov(X,u)=corr(X,u)\,s_X s_u$$

Plugging this in:

$$E[\hat{\beta}_1]=\beta_1+corr(X,u)\frac{s_u}{s_X}$$
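To see the bias formula in action, here is a small Monte Carlo sketch in Python (all parameter values are made up for illustration). The error term is deliberately built to be correlated with $X$, so the average slope estimate should be off by roughly $\frac{cov(X,u)}{var(X)}$:

```python
import numpy as np

rng = np.random.default_rng(42)
beta0, beta1 = 2.0, 3.0  # true (hypothetical) population parameters
n, reps = 1000, 2000

estimates = []
for _ in range(reps):
    X = rng.normal(5, 2, size=n)
    # Violate exogeneity on purpose: u = 0.5*(X - 5) + noise,
    # so cov(X, u) = 0.5 * var(X) and the bias should be about 0.5
    u = 0.5 * (X - 5) + rng.normal(0, 1, size=n)
    Y = beta0 + beta1 * X + u
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    estimates.append(b1)

print(np.mean(estimates))  # centered near beta1 + 0.5 = 3.5, not 3.0
```

No amount of extra data fixes this: raising `n` tightens the estimates around the *biased* value, which is why the Zero Conditional Mean assumption matters.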
Proof of the Unbiasedness of $\hat{\beta}_0$

Begin with equation (2):

$$\hat{\beta}_0=\bar{Y}-\hat{\beta}_1\bar{X}$$

Substitute for $\bar{Y}$ using the population model:

$$\hat{\beta}_0=\frac{1}{n}\sum_{i=1}^n \left(\beta_0+\beta_1 X_i+u_i\right)-\hat{\beta}_1\bar{X}$$

Distribute $\frac{1}{n}$ and separate the sum into additive pieces:

$$\hat{\beta}_0=\frac{1}{n}\sum_{i=1}^n \beta_0+\beta_1\frac{1}{n}\sum_{i=1}^n X_i+\frac{1}{n}\sum_{i=1}^n u_i-\hat{\beta}_1\bar{X}$$

Simplifying the first term (the average of the constant $\beta_0$ is just $\beta_0$) and collecting the $\bar{X}$ terms, we are left with:

$$\hat{\beta}_0=\beta_0+(\beta_1-\hat{\beta}_1)\bar{X}+\bar{u}$$

Now if we take expectations of both sides:

$$E[\hat{\beta}_0]=\beta_0+E\left[\beta_1-\hat{\beta}_1\right]\bar{X}+E[\bar{u}]$$

Using the properties of expectations, we can pull out the nonrandom $\bar{X}$, and the expectation of the constant $\beta_0$ is itself. Again using the properties of expectations, we can put the expectation inside the summation operator (the expectation of a sum is the sum of expectations), so:

$$E[\bar{u}]=\frac{1}{n}\sum_{i=1}^n E[u_i]=0$$

Under the exogeneity condition, $X$ and $u$ are uncorrelated, so $E[\hat{\beta}_1]=\beta_1$ and the middle term vanishes. Thus:

$$E[\hat{\beta}_0]=\beta_0$$

and $\hat{\beta}_0$ is unbiased.
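The same simulation approach illustrates this result. A sketch in Python (illustrative values): with exogenous errors, the average of $\hat{\beta}_0$ across many samples lands on the true $\beta_0$:

```python
import numpy as np

rng = np.random.default_rng(7)
beta0, beta1 = 2.0, 3.0  # true (hypothetical) population parameters
n, reps = 500, 2000

b0_estimates = []
for _ in range(reps):
    X = rng.uniform(0, 10, size=n)
    u = rng.normal(0, 1, size=n)  # exogenous: u drawn independently of X
    Y = beta0 + beta1 * X + u
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0_estimates.append(Y.mean() - b1 * X.mean())

print(np.mean(b0_estimates))  # centered near the true beta0 = 2.0
```

Any single sample's $\hat{\beta}_0$ misses the mark, but the sampling distribution is centered on $\beta_0$, which is exactly what unbiasedness claims.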
Footnotes

From the rules about summation operators, we define the mean of a random variable $X$ as $\bar{X}=\frac{1}{n}\sum_{i=1}^n X_i$. The mean of a constant, like $\beta_0$ or $\beta_1$, is itself.↩︎

Admittedly, this is a simplified version, but there is no loss of generality in the results.↩︎