Yesterday, I looked at how the mean minimizes the variance and how this follows from defining the variance as the mean of the squared differences (also known as the $L^2$ norm). In particular, replacing the square with higher powers does not lead to a simple minimization problem. But what if we define the variance as the mean of the absolute value of the differences (also known as the $L^1$ norm)?
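Keep in mind that the derivative of $|x|$ is $1$ when $x > 0$ and $-1$ when $x < 0$. Writing the data as $x_1, \dots, x_n$ and the candidate center as $c$, setting the derivative of $\frac{1}{n}\sum_{i=1}^{n} |x_i - c|$ with respect to $c$ to zero gives

$$\sum_{i=1}^{n} \operatorname{sgn}(x_i - c) = 0.$$

The $c$ that satisfies the above expression is exactly the median, since each term contributes $+1$ when $x_i > c$ and $-1$ when $x_i < c$, and the sum of these must be $0$: as many data points lie above $c$ as below it. As an example, compute the $L^1$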
variance for the numbers
. We see that the mean is
which leads to a variance of
and median is
which leads to a variance of
.
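The comparison between the mean and the median can be sketched in a few lines of Python. This is only an illustration: the function name `l1_variance` and the data set are assumptions made up for this sketch, not the numbers used in the example above. It computes the mean absolute difference about the mean and about the median, then scans a grid of candidate centers to check that none does better than the median.

```python
from statistics import mean, median


def l1_variance(xs, c):
    """Mean absolute difference between the values in xs and the center c."""
    return sum(abs(x - c) for x in xs) / len(xs)


# Hypothetical data chosen for illustration; not the numbers from the post.
data = [1, 2, 3, 4, 10]

m = mean(data)      # 4
med = median(data)  # 3

about_mean = l1_variance(data, m)      # (3 + 2 + 1 + 0 + 6) / 5 = 2.4
about_median = l1_variance(data, med)  # (2 + 1 + 0 + 1 + 7) / 5 = 2.2

# Scan candidate centers from 0.00 to 11.00 in steps of 0.01
# and keep the one with the smallest L^1 variance.
grid = [i / 100 for i in range(0, 1101)]
best = min(grid, key=lambda c: l1_variance(data, c))

print(about_mean, about_median, best)
```

With an odd number of distinct points the minimizer is unique, so the grid search lands on the median. With an even number of points, every $c$ between the two middle values gives the same minimum, and the usual median is just one of them.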