The easiest method of model fitting is linear least squares, which minimizes the sum of squares of the residuals (L2). Noise often contains huge bursts, however, and then it is much safer to minimize the sum of absolute values of the residuals (L1). (The problem with L0 is that it has multiple minima, so the gradient is not a sensible guide toward the deepest one.)
Specialized techniques exist for handling multivariate L1 fitting problems; they should work better than the simple iterative reweighting outlined here.
A penalty function that ranges from L2 to L1, depending on the constant $\bar{r}$, is

$$ H(r) \;=\; \sqrt{1 + r^2/\bar{r}^2} \;-\; 1 \tag{9} $$

For small residuals it behaves like a scaled L2 penalty,

$$ H(r) \;\approx\; \frac{r^2}{2\bar{r}^2} \qquad (|r| \ll \bar{r}) \tag{10} $$

while for large residuals it behaves like a scaled L1 penalty,

$$ H(r) \;\approx\; \frac{|r|}{\bar{r}} \qquad (|r| \gg \bar{r}) \tag{11} $$

Its derivative with respect to the residual is

$$ \frac{dH}{dr} \;=\; \frac{r/\bar{r}^2}{\sqrt{1 + r^2/\bar{r}^2}} \tag{12} $$

Comparing this with the gradient $w^2 r$ of a weighted L2 penalty $(wr)^2/2$ shows that minimizing $H$ amounts to least squares with the residual-dependent weight

$$ w^2 \;=\; \frac{1}{\bar{r}^2\,\sqrt{1 + r^2/\bar{r}^2}} \tag{13} $$
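The reweighting can be applied as an iteratively reweighted least-squares (IRLS) loop: solve a weighted L2 problem, recompute the weights $w^2 \propto 1/\sqrt{1 + r^2/\bar{r}^2}$ from the new residuals, and repeat. The sketch below is my own minimal illustration on a toy regression with one noise burst; the function and variable names are assumptions, not from the text.

```python
import numpy as np

def irls_hybrid(F, d, rbar, niter=20):
    """Iteratively reweighted least squares for the hybrid (L2-to-L1) penalty.

    Approximately minimizes sum_i H(r_i), H(r) = sqrt(1 + r^2/rbar^2) - 1,
    where r = F m - d, by repeatedly solving a weighted L2 problem.
    """
    m = np.linalg.lstsq(F, d, rcond=None)[0]      # plain L2 starting guess
    for _ in range(niter):
        r = F @ m - d
        # weight implied by the hybrid penalty: w^2 proportional to
        # 1/sqrt(1 + r^2/rbar^2); small residuals get full weight,
        # huge residuals are strongly downweighted
        w = (1.0 + (r / rbar) ** 2) ** -0.25
        m = np.linalg.lstsq(w[:, None] * F, w * d, rcond=None)[0]
    return m

# Fit a straight line through data contaminated by one huge noise burst.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
F = np.column_stack([np.ones_like(x), x])         # columns: intercept, slope
d = 1.0 + 2.0 * x + 0.01 * rng.standard_normal(50)
d[10] += 100.0                                    # the burst
rbar = np.median(np.abs(d - np.median(d)))        # scale from a percentile
m_l2 = np.linalg.lstsq(F, d, rcond=None)[0]       # pulled far off by the burst
m_h = irls_hybrid(F, d, rbar)                     # stays close to (1, 2)
print(m_l2, m_h)
```

Note that each pass reuses the same operator F; only the diagonal weights change between iterations.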
Continuing, we notice that the new weighting of residuals
has nothing to do with the linear relation between model perturbation
and residual perturbation;
that is,
we retain the familiar relations
$\Delta r = \mathbf{F}\,\Delta m$
and
$\Delta m = \mathbf{F}'\,\Delta r$.
In practice we have the question of how to choose $\bar{r}$. I suggest that $\bar{r}$
be proportional to the median of the absolute residuals $|r_i|$,
or some other percentile.
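A percentile-based scale is robust precisely because percentiles barely notice the bursts, whereas a mean is dragged far off by them. A small sketch of my own (the numbers are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
r = rng.standard_normal(1000)          # residuals from a first fitting pass
r[:20] += 50.0                         # 2% of the points hit by huge bursts

# A percentile of |r| is nearly unchanged by the bursts;
# the mean of |r| is roughly doubled by them.
rbar = np.percentile(np.abs(r), 50)    # the median; other percentiles work too
print(rbar, np.mean(np.abs(r)))
```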