Stress majorization

From Wikipedia, the free encyclopedia

Stress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of n, m-dimensional data items, a configuration X of n points in r(<<m)-dimensional space is sought that minimises the so called stress function σ(X). Usually r is 2 or 3, i.e. the r\times n matrix X lists points in 2- or 3-dimensional Euclidean space so that the result may be visualised (i.e. an MDS plot). The function σ is a loss or cost function that measures the squared differences between ideal (m-dimensional) distances and actual distances in r-dimensional space. It is defined as:

\sigma(X)=\sum_{i<j\le n}w_{ij}(d_{ij}(X)-\delta_{ij})^2

where w_{ij}\ge 0 is a weight for the measurement between a pair of points (i,j), dij(X) is the euclidean distance between i and j and δij is the ideal distance between the points (their separation) in the m-dimensional data space. Note that wij can be used to specify a degree of confidence in the similarity between points (e.g. 0 can be specified if there is no information for a particular pair).

A configuration X which minimises σ(X) gives a plot in which points that are close together correspond to points that are also close together in the original m-dimensional data space.

There are many ways that σ(X) could be minimised. For example, Kruskal[1] recommended an iterative steepest descent approach. However, a significantly better (in terms of guarantees on, and rate of, convergence) method for minimising stress was introduced by Jan de Leeuw [2]. De Leeuw's iterative majorization method at each step minimises a simple convex function which both bounds σ from above and touches the surface of σ at a point Z, called the supporting point. In convex analysis such a function is called a majorizing function. This iterative majorization process is also referred to as the SMACOF algorithm ("Scaling by majorizing a complicated function").

[edit] The SMACOF algorithm

The stress function σ can be expanded as follows:

\sigma(X)=\sum_{i<j\le n}w_{ij}(d_{ij}(X)-\delta_{ij})^2 =\sum_{i<j}w_{ij}\delta_{ij}^2 + \sum_{i<j}w_{ij}d_{ij}^2(X)-2\sum_{i<j}w_{ij}\delta_{ij}d_{ij}(X)

Note that the first term is a constant C and the second term is quadratic in X (i.e. for the Hessian matrix V the second term is equivalent to trX'VX) and therefore relatively easily solved. The third term is bounded by:

\sum_{i<j}w_{ij}\delta_{ij}d_{ij}(X)=\,\operatorname{tr}\, X'B(X)X \ge \,\operatorname{tr}\, X'B(Z)Z

where B(Z) has:

b_{ij}=-\frac{w_{ij}\delta_{ij}}{d_{ij}(Z)} for d_{ij}(Z)\ne 0, i \ne j

and bij = 0 for d_{ij}(Z)=0, i\ne j

and b_{ii}=-\sum_{j=1,j\ne i}^n b_{ij}.

Proof of this inequality is by the Cauchy-Schwartz inequality, see [3] (pp. 152--153).

Thus, we have a simple quadratic function τ(X,Z) that majorizes stress:

\sigma(X)=C+\,\operatorname{tr}\, X'VX - 2 \,\operatorname{tr}\, X'B(X)X\le C+\,\operatorname{tr}\, X' V X - 2 \,\operatorname{tr}\, X'B(Z)Z = \tau(X,Z)

The iterative minimization procedure is then:

  • at the kth step we set Z\leftarrow X^{k-1}
  • X^k\leftarrow \min_X \tau(X,Z)
  • stop if σ(Xk − 1) − σ(Xk) < ε otherwise repeat.

This algorithm has been shown to decrease stress monotonically (see [2]).

[edit] Use in graph drawing

Stress majorization and algorithms similar to SMACOF also have application in the field of graph drawing[4][5]. That is, one can find a reasonably aesthetically-appealing layout for a network or graph by minimizing a stress function over the positions of the nodes in the graph. In this case, the δij are usually set to the graph-theoretic distances between nodes i and j and the weights wij are taken to be \delta_{ij}^{-\alpha}. Here, α is chosen as a trade-off between preserving long- or short-range ideal distances. Good results have been shown for α = 2[6].

[edit] References

  1. ^ Kruskal, J. B.: "Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis", Psychometrika 29, 1–27., 1964
  2. ^ a b de Leeuw, J.: "Applications of convex analysis to multidimensional scaling. In J. R. Barra, F. Brodeau, G. Romie,a dn B. van Cutsem (Eds.), Recent developments in statistics (pp. 133-145), 1977
  3. ^ Borg, I. and Groenen, P.: "Modern Multidimensional Scaling: theory and applications", Springer-Verlag New York, 1997
  4. ^ Michailidis, G. and de Leeuw, J. "Data visualization through graph drawing" Computation Stat 2001;16(3):435--50.
  5. ^ E. Gansner, Y. Koren and S. North, "Graph Drawing by Stress Majorization", Proceedings of 12th Int. Symp. Graph Drawing (GD'04), Lecture Notes in Computer Science, Vol. 3383, Springer Verlag, pp. 239--250, 2004.
  6. ^ Cohen, J. "Drawing graphs to convey proximity: an incremental arrangement method", ACM Transactions on Computer-Human Interaction, Vol. 4, No. 3, September 1997, Pages 197–229.