Bessel’s Correction

For an estimator to be good we usually want it to be unbiased. This means the expected value of the estimator equals the population value it estimates (i.e. $E[\hat{\theta}] = \theta$). Why is the naïve estimator of the population variance, $\tilde{S}_x = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$, biased? If $E[\bar{x}] = \mu$, how could we possibly do better?

Underestimation

While it is true that $E[\bar{x}] = \mu$, or equivalently $E[\bar{x} - \mu] = 0$, it is not true that $E[(\bar{x} - \mu)^2] = 0$. That quantity is by definition $\mathrm{Var}[\bar{x}]$. Therefore, on average, a squared deviation from the sample mean $(x_i - \bar{x})^2$ underestimates the squared deviation from the population mean $(x_i - \mu)^2$ by $\mathrm{Var}[\bar{x}]$. Taking this shortfall into account, we arrive at the unbiased estimator $S_x = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$.
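
The step from "underestimates by the variance of the sample mean" to the factor $n-1$ can be sketched as follows. This assumes independent observations with a common mean $\mu$ and variance $\sigma^2$; the symbol $\sigma^2$ is introduced here and does not appear elsewhere in this post.

$$
\begin{aligned}
\sum_{i=1}^{n}(x_i - \mu)^2 &= \sum_{i=1}^{n}(x_i - \bar{x})^2 + n(\bar{x} - \mu)^2 \\
E\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] &= n\sigma^2 - n\,\mathrm{Var}[\bar{x}] = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2
\end{aligned}
$$

The cross term in the first line vanishes because $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$. Dividing the second line by $n$ gives an expectation of $\frac{n-1}{n}\sigma^2$, which is too small, while dividing by $n-1$ gives exactly $\sigma^2$.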

Full details of this proof are available on Wikipedia.
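
For a numerical check rather than a proof, the bias can be seen in a small simulation. The sketch below is only an illustration: the function names, the normal distribution, the sample size of 5, and the true variance of 4 are all assumptions chosen for the example, not anything taken from the proof.

```python
import random


def naive_variance(sample):
    """Mean squared deviation from the sample mean, dividing by n."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


def corrected_variance(sample):
    """Same sum of squared deviations, dividing by n - 1 (Bessel's correction)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


def average_estimates(n=5, trials=100_000, mu=0.0, sigma=2.0, seed=0):
    """Average both estimators over many samples drawn from Normal(mu, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    naive_total = 0.0
    corrected_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        naive_total += naive_variance(sample)
        corrected_total += corrected_variance(sample)
    return naive_total / trials, corrected_total / trials


naive_avg, corrected_avg = average_estimates()
print(f"true variance:        {2.0 ** 2:.3f}")
print(f"naive average:        {naive_avg:.3f}")
print(f"corrected average:    {corrected_avg:.3f}")
```

With these settings the naïve average should settle near $\frac{n-1}{n} \cdot 4 = 3.2$, while the corrected average should settle near the true variance of 4.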

Degrees of freedom

Another less concrete perspective on Bessel’s correction employs the notion of degrees of freedom. I dislike this explanation because it’s quite hand-wavy, but I will provide it for completeness.

Imagine a sample of data with a missing value: $5, 13, 15, 10, 2, x$. Let’s say we know the mean of this sample is $\bar{x} = 12$. Can we figure out what the missing data point is? Of course, it’s simple algebra.
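
Written out, the algebra for the six-value sample with mean 12 is:

$$
\frac{5 + 13 + 15 + 10 + 2 + x}{6} = 12 \quad\Longrightarrow\quad 45 + x = 72 \quad\Longrightarrow\quad x = 27
$$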

When we say a value is free to vary we’re saying that given certain constraints the value is not determined. In this case x is determined by the system. Given the mean we know exactly what the missing data point has to be. The variable x is not free to vary, which means this system has no degrees of freedom.

What about two missing data points: $5, 13, 15, 10, x, y$ with $\bar{x} = 12$? Without an additional equation relating x to y, we cannot find an exact value for either variable. But how many degrees of freedom are there? Well, if we set x to a value then y cannot vary. So we say that this system has one degree of freedom, because we’re only free to vary one variable.
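
The single degree of freedom can be made concrete with a short sketch. The names `KNOWN`, `MEAN`, `N`, and `y_given_x` are hypothetical, introduced only for this illustration; the numbers are the ones from the example above.

```python
KNOWN = [5, 13, 15, 10]  # the four observed values
MEAN = 12                # the known sample mean
N = 6                    # total sample size, including x and y


def y_given_x(x):
    """Return the only y consistent with the sample mean once x is chosen."""
    # The six values must sum to N * MEAN, so y is whatever remains.
    return N * MEAN - sum(KNOWN) - x


# x can be anything we like, but each choice pins y down completely.
for x in (0, 10, 29):
    y = y_given_x(x)
    sample = KNOWN + [x, y]
    print(f"x = {x:>2}, y = {y:>2}, mean = {sum(sample) / N}")
```

Whatever value is passed for x, the returned y satisfies $x + y = 29$, so only one of the two variables is ever free.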

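How many degrees of freedom are there in the estimator $\tilde{S}_x = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$? Ostensibly there are $n$, but this is not true. We can derive a constraint: because $\bar{x}$ is computed from the same sample, the deviations must sum to zero, $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$.

Given the sample mean and $n-1$ of the values $x_i$, the last value $x_n$ is determined, and so is its deviation from the mean. This means there are only $n-1$ measurements of deviation, because only these measurements are free to vary. Once again we arrive at Bessel’s correction.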
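
The constraint is easy to verify numerically. The sketch below reuses the earlier six-value sample, with 27 filled in as the value the mean of 12 forces; the helper name `deviations` is an assumption made for this example.

```python
def deviations(sample):
    """Return each value's deviation from the sample mean."""
    mean = sum(sample) / len(sample)
    return [x - mean for x in sample]


sample = [5, 13, 15, 10, 2, 27]
devs = deviations(sample)

# The deviations from the sample mean always sum to zero...
total = sum(devs)

# ...so the last deviation can be recovered from the other n - 1.
last_from_rest = -sum(devs[:-1])

print("deviations:", devs)
print("sum of deviations:", total)
print("last deviation, recovered from the rest:", last_from_rest)
```

Only the first $n-1$ deviations carry independent information; the last one is forced by the others.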