Least absolute deviations
From Wikipedia, the free encyclopedia
Least absolute deviations (LAD), also known as Least Absolute Errors (LAE), Least Absolute Value (LAV), or the L1 norm problem, is a mathematical optimization technique similar to the popular least squares technique in that it attempts to find a function which closely approximates a set of data. In the simple case of a set of (x,y) data, the approximation function is a simple "trend line" in 2D Cartesian coordinates. The method minimizes the sum of absolute errors (SAE), or sum of "residuals" between points generated by the function and corresponding points in the data.
Contents |
[edit] Formulation of the problem
Suppose that the data set consists of the points (xi, yi) with i = 1, 2, ..., n. We want to find a function f such that
To attain this goal, we suppose that the function f is of a particular form containing some parameters which need to be determined. For instance, suppose that it is quadratic, meaning that f(x) = ax2 + bx + c, where a, b and c are not yet known. We now seek the values of a, b and c that minimize the sum of the absolute values of the residuals:
[edit] Contrasting Least Squares with Least Absolute Deviations
The following is a table contrasting some properties of the method of least absolute deviations with those of the method of least squares.
Least Squares Regression | Least Absolute Deviations Regression | |
---|---|---|
Not very robust | Robust | |
Stable solution | Unstable solution | |
Always one solution | Possibly multiple solutions |
The method of least absolute deviation finds applications in many areas, due to its robustness over the least squares method. Least absolute deviations is robust in that it is resistant to outliers in the data. This may be helpful in studies where outliers may be safely and effectively ignored. If it were important to pay attention to any and all outliers, the method of least squares is a better choice.
The unstable property of the method of least absolute deviations means that, for any small horizontal adjustment of a datum, the regression line may jump a large amount. The method has continuous solutions, however by moving the a datum a small amount, one could "jump past" a configuration which has multiple solutions that span a region. After passing this region of solutions, the least absolute deviations line has a slope differing greatly from the previous line. In contrast, the least squares solutions is stable in that, for any small horizontal adjustment of a data point, the regression line will always move only slightly, or continuously.
Lastly, for a given data set, the method of least absolute deviations may produce multiple solutions, whereas the method of least squares always produces only one solution (the regression line is unique).
For a set of applets that demonstrate these differences, see the following site: http://www.math.wpi.edu/Course_Materials/SAS/lablets/7.3/73_choices.html
[edit] Other properties
There exist other unique properties of the least absolute deviations line. In the case of a set of (x,y) data, the least absolute deviations line will always pass through at least two of the data points, unless there are multiple solutions. If multiple solutions exist, then the region of valid least absolute deviations solutions line will be bounded by at least two lines, each of which pass through at least two data points. This "latching" of the line to the data points can help to understand the "unstable" property: if the line always latches to at least two points, then the line will jump between different sets of points as the data points are altered. The "latching" also helps to understand the "robustness" property: if there exists an outlier, and a least absolute deviations line must latch onto two data points, the outlier will most likely not be one of those two points because that will not minimize the sum of absolute deviations in most cases.
One known case in which multiple solutions exist is a set of points symmetric about a horizontal line, as shown in figure A below.
To understand why there are multiple solutions in the case shown in Figure A, consider the pink line in the green region. Its sum of absolute errors is some value S. If one were to tilt the line upward slightly, while still keeping it within the green region, the sum of errors will still be S. It will not change because the distance from each point to the line grows on one side of the line, while the distance to each point on the opposite side of the line diminishes by the exact same amount. Thus the sum of absolute errors remains the same. Also, since one can tilt the line in infinitely small increments, this also shows that if there is more than one solution, there are infinitely many solutions.
[edit] Variations, extensions, specializations
The least absolute deviation problem may be extended to include constraints and regularization, e.g., a linear model with linear constraints:[1]
- minimize
- subject to, e.g.,
Regularization with LASSO may also be combined with LAD.[2]
[edit] Solving Methods
Though the idea of least absolute deviations regression is just as straightforward as that of least squares regression, the least absolute deviations line is not as simple to compute efficiently. Unlike least squares regression, least absolute deviations regression does not have an analytical solving method. Therefore, an iterative approach is required. The following is an enumeration of some least absolute deviations solving methods.
- Simplex-based methods (such as the Barrodale-Roberts algorithm[3])
- Iteratively re-weighted least squares[4]
- Wesolowsky’s direct descent method[5]
- Li-Arce’s maximum likelihood approach[6]
- Check all combinations of point-to-point lines for minimum sum of errors
Simplex-based methods are the “preferred” way to solve the least absolute deviations problem. A Simplex method is a method for solving a problem in linear programming. The most popular algorithm is the Barrodale-Roberts modified Simplex algorithm. The algorithms for IRLS, Wesolowsky's Method, and Li's Method can be found in Appendix A of this document,[7] among other methods. Checking all combinations of lines traversing any two (x,y) data points is another method of finding the least absolute deviations line. Since it is known that at least one least absolute deviations line traverses at least two data points, this method will find a line by comparing the SAE of each line, and choosing the line with the smallest SAE. In addition, if multiple lines have the same, smallest SAE, then the lines outline the region of multiple solutions. Though simple, this final method is inefficient for large sets of data.
[edit] See also
- Absolute deviation
- Robust regression
- Linear regression
- Regression analysis
- Weighted least squares
- Least squares
[edit] External links
- ^ Mingren Shi & Mark A. Lukas (March 2002). "An L1 estimation algorithm with degeneracy and linear constraints". Computational Statistics & Data analysis 39 (1): 35–55. doi: .
- ^ Li Wang, Michael D. Gordon & Ji Zhu (December 2006). "Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning". Proceedings of the Sixth International Conference on Data Mining: 690–700. doi:10.1109/ICDM.2006.134.
- ^ I. Barrodale & F. D. K. Roberts (1973). "Animproved algorithm for discrete I1 linear approximation". SIAM Journal on Numerical Analysis 10: 839–848. doi: .
- ^ E. J. Schlossmacher (December 1973). "An Iterative Technique for Absolute Deviations Curve Fitting". Journal of the American Statistical Association 68 (344): 857–859.
- ^ G. O. Wesolowsky (1981). "A new descent algorithm for the least absolute value regression problem". Communications in Statistics, Simulation and Computation B10 (5): 479–491. doi: .
- ^ Yinbo Li and Gonzalo R. Arce (2004). "A Maximum Likelihood Approach to Least Absolute Deviation Regression". EURASIP Journal on Applied Signal Processing 2004 (12): 1762–1769. doi: .
- ^ William A. Pfeil, Statistical Teaching Aids, Bachelor of Science thesis, Worcester Polytechnic Institute, 2006
[edit] Others
- Peter Bloomfield and William Steiger (1980). "Least Absolute Deviations Curve-Fitting". SIAM Journal on Scientific Computing 1 (2): pages 290–301. doi: .
- Subhash C. Narula and John F. Wellington (1982). "The Minimum Sum of Absolute Errors Regression: A State of the Art Survey". International Statistical Review 50 (3): 317–326.
- Robert F. Phillips (July 2002). "Least absolute deviations estimation via the EM algorithm". Statistics and Computing 12 (3): 281–285. doi: .
- Enno Siemsen & Kenneth A. Bollen (2007). "Least Absolute Deviation Estimation in Structural Equation Modeling". Sociological Methods & Research 36 (2): 227–265. doi: .
- http://www.math.wpi.edu/Course_Materials/SAS/lablets/7.3/73_choices.html