What happens if we use absolute value? (3/365)

Yesterday, I looked at how the mean minimizes the variance and how this follows from defining the variance as the mean of the squares of the differences (also known as the $L_2$ norm). In particular, replacing the square with higher powers does not lead to a simple minimization problem. But what if we define the variance as the mean of the absolute values of the differences (also known as the $L_1$ norm)?

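Keep in mind that the derivative of $|x|$ is $1$ when $x > 0$ and $-1$ when $x < 0$. It is undefined at $x = 0$, but for a continuous density that single point does not affect the integrals.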

$$
\begin{aligned}
&\underset{\mu}{\arg \min} \int_{x \in X} | x-\mu | f(x) \, dx \\
0 &= \int_{x - \mu < 0} f(x) \, dx - \int_{x - \mu \ge 0} f(x) \, dx \quad \text{after taking the derivative with respect to } \mu \text{ and setting it to } 0
\end{aligned}
$$
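
To see where the second line comes from, here is one way to fill in the step. This is a sketch that assumes $f$ is a continuous density on the real line; $F$ denotes its cumulative distribution function, a symbol I am introducing here. Split the integral at $\mu$ so the absolute value disappears, then differentiate:

$$
\begin{aligned}
\int | x-\mu | f(x) \, dx &= \int_{-\infty}^{\mu} (\mu - x) f(x) \, dx + \int_{\mu}^{\infty} (x - \mu) f(x) \, dx \\
\frac{d}{d\mu} \int | x-\mu | f(x) \, dx &= \int_{-\infty}^{\mu} f(x) \, dx - \int_{\mu}^{\infty} f(x) \, dx = F(\mu) - (1 - F(\mu)) = 2F(\mu) - 1
\end{aligned}
$$

The boundary terms from differentiating the limits of integration vanish because both integrands are zero at $x = \mu$. Setting $2F(\mu) - 1 = 0$ gives $F(\mu) = 1/2$.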

The $\mu$ that satisfies the above expression is exactly the median: the two integrals must be equal and together they sum to $1$, so the integral of $f(x)$ over $x - \mu \ge 0$ must be $0.5$. As an example, compute the $L_1$ variance for the numbers $1, 2, 12$. The mean is $15/3 = 5$, which gives a variance of $(4+3+7)/3 = 14/3$, while the median is $2$, which gives a smaller variance of $(1+0+10)/3 = 11/3$.
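
The example can also be checked numerically. The snippet below is a minimal sketch using only the Python standard library; the helper name `l1_spread` and the grid of candidate centers are my own choices for illustration. It computes the mean absolute deviation about the mean and about the median, then brute-forces the minimizing center over a grid.

```python
from statistics import mean, median


def l1_spread(data, center):
    """Mean absolute deviation of `data` about `center` (hypothetical helper)."""
    return sum(abs(x - center) for x in data) / len(data)


data = [1, 2, 12]

spread_at_mean = l1_spread(data, mean(data))      # (4 + 3 + 7) / 3
spread_at_median = l1_spread(data, median(data))  # (1 + 0 + 10) / 3

# Brute-force search over candidate centers 0.00, 0.01, ..., 13.00
candidates = [i / 100 for i in range(0, 1301)]
best = min(candidates, key=lambda c: l1_spread(data, c))

print(spread_at_mean, spread_at_median, best)
```

The grid search lands on the median, $2$, rather than the mean, $5$. Note that with an even number of data points any center between the two middle values would do equally well, so the minimizer is not always unique.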
