Yesterday, I looked at how the mean minimizes the variance, and how this follows from defining the variance as the mean of the squared differences (also known as the $L^2$ norm). In particular, replacing the square with higher powers does not lead to a simple minimization problem. But what if we define the variance as the mean of the absolute value of the differences (also known as the $L^1$ norm)?

Keep in mind that the derivative of $|x|$ is $1$ when $x > 0$ and $-1$ when $x < 0$.

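Writing $x_1, \dots, x_n$ for the data and $c$ for the candidate center, setting the derivative of $\frac{1}{n}\sum_{i=1}^{n} |x_i - c|$ with respect to $c$ to zero gives $\sum_{i=1}^{n} \operatorname{sgn}(x_i - c) = 0$. The $c$ that satisfies this expression is exactly the median: the sum of $+1$ over the points with $x_i > c$ and $-1$ over the points with $x_i < c$ must be $0$, so as many points lie above $c$ as below it. As an example, compute the variance for the numbers . We see that the mean is which leads to a variance of and the median is which leads to a variance of .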
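To check the claim numerically, here is a minimal sketch in Python. The data set is made up for illustration and is not taken from this post, and `mean_abs_deviation` is a hypothetical helper name. It compares the mean absolute deviation about the mean with the one about the median, then does a brute-force search over a grid of candidate centers.

```python
from statistics import mean, median


def mean_abs_deviation(data, center):
    """Mean of |x - center| over the data (the absolute-value 'variance')."""
    return sum(abs(x - center) for x in data) / len(data)


# Hypothetical data chosen for illustration only.
data = [1, 2, 3, 4, 10]

mu = mean(data)
med = median(data)

dev_about_mean = mean_abs_deviation(data, mu)
dev_about_median = mean_abs_deviation(data, med)

# Brute-force check: try every center from 0.00 to 11.00 in steps of 0.01
# and keep the one with the smallest mean absolute deviation.
candidates = [i / 100 for i in range(0, 1101)]
best = min(candidates, key=lambda c: mean_abs_deviation(data, c))

print("mean:", mu, "deviation about mean:", dev_about_mean)
print("median:", med, "deviation about median:", dev_about_median)
print("best center on the grid:", best)
```

For this data the deviation about the median comes out smaller than the deviation about the mean, and the grid search lands on the median. With an even number of points, any center between the two middle values would do equally well, so a grid search may return a value other than the one `median` reports.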