Talk:Singular value decomposition

From Wikipedia, the free encyclopedia

WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: B+ Class, Mid Priority. Field: Algebra
One of the 500 most frequently viewed mathematics articles.


This article may be too technical for a general audience.
Please help improve this article by providing more context and better explanations of technical details to make it more accessible, without removing technical details.


[edit] Computational complexity

I cannot find anything about the computational complexity. Did I miss it? If I did not, it is quite important, so could someone who knows that stuff include it in the article? Vplagnol 18:13, 8 February 2007 (UTC)

[edit] error in theorem

I think there are errors in the statement of the theorem: first, U should be an m-by-m matrix (not an m-by-n matrix), since a unitary matrix is necessarily a square matrix; and Σ should be m-by-n with diagonal entries (it is neither n-by-n nor a diagonal matrix).

best regards, Sebastien.

You're absolutely right on the dimensions, thanks. It got changed about a month ago, and sadly nobody noticed. The rectangular matrix Σ is sometimes said to be a diagonal matrix, even though it is rectangular, since its off-diagonal entries are zero. However, I saw that Horn & Johnson does not do this, so I also changed that for clarity. -- Jitse Niesen (talk) 22:50, 5 September 2005 (UTC)

I removed this:

An important extension of the singular-value decomposition theorem says that if M is a symmetric square matrix then one may take G = H, and in the case in which n=m=r (the full-rank case) and all of the singular values are different one must take G = H. That is the spectral theorem.

This is incorrect: a symmetric matrix can have negative eigenvalues, and then the spectral theorem decomposition wouldn't be a singular-value decomposition.

Something is also fishy with the term "singular vector": they are not uniquely determined by M; different singular-value decompositions of M can yield different singular vectors. What is the clean formulation of this situation? AxelBoldt 00:17 Dec 13, 2002 (UTC)

For all the textbooks and articles on matrix theory I have ever read, this is the one and only article that uses the term singular vector!!! Wshun


Well, Google gives about 3000 hits for the phrase "singular vector", and most of them seem to refer to the columns/rows of the matrices in the singular value decomposition. So the term is apparently in use, it's just that the definition is not quite clear. I restored some of the removed material about them, and also mentioned the fact that singular value decompositions always exist. AxelBoldt 19:30 Apr 6, 2003 (UTC)

Singular vector is standard terminology. It's good enough for Strang, Golub and for Hansen, so it's good enough for me. Billlion 14:21, 3 Sep 2004 (UTC)

Why redirect the article Singular value to Singular value decomposition? It is as bizarre as redirecting prime number to prime number theorem, or calculus to the fundamental theorem of calculus. It is better to reverse the direction. In my opinion, the two articles should not be combined together! Wshun

I would be happy for the content of s-number to go into this one; I think the article should cover the operator case. Billlion 08:23, 17 Sep 2004 (UTC)

[edit] Response to Wshun

Singular vector is indeed standard terminology. See Gilbert Strang's books. Michael Hardy 14:54, 3 Sep 2004 (UTC)

[edit] Algorithms please

How does one compute the SVD in practice? Lupin 13:19, 5 Feb 2005 (UTC)

See the article:
'The LAPACK users manual (http://www.netlib.org/lapack/lug/node53.html) gives details of subroutines to calculate the SVD.'
The numerical algorithms are explained and referenced there. In Matlab/Octave it is easy: just use the svd command, or in Mathematica, SingularValueDecomposition. Billlion 19:45, 5 Feb 2005 (UTC)
I tried to understand the methods used, but on the very first algorithm they use ("reduced bidiagonal form") I can find very little explanatory material anywhere. Yet I have noticed that most of the source code I have come by tends to use this undocumented technique. --ANONYMOUS COWARD0xC0DE 02:12, 27 November 2006 (UTC)
I'm not sure what "reduced bidiagonal form" is; are you talking about computing the SVD by reducing the matrix to bidiagonal form? One reference for the Golub–Kahan algorithm is Golub and Van Loan, Matrix Computations, Johns Hopkins University Press, Section 8.6, "Computing the SVD". This is a standard reference for numerical linear algebra. -- Jitse Niesen (talk) 03:17, 27 November 2006 (UTC)
Yes that's exactly what I am talking about. Thank you for your reference I just ordered it at my library (ISBN:0801854148). --ANONYMOUS COWARD0xC0DE 05:23, 28 November 2006 (UTC)
Ok, I implemented the house and bidiagonalization functions (algorithms 5.1.1 and 5.4.2 respectively) and these are the results I have obtained:

\operatorname{house}\left(\begin{bmatrix} 1 \\ 4 \\ 7 \\ 10 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ .310 \\ .543 \\ .776 \end{bmatrix}

\operatorname{householder\ bidiagonalization}\left(\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 10 & 11 & 12 \end{bmatrix}\right) = \begin{bmatrix} 11.5 & 9.39 & 0.0967 \\ 0.0264 & -0.510 & 0.995 \\ 0.0461 & 0.681 & 1.05 \\ 0.0659 & 0.404 & -0.216 \end{bmatrix}

I was expecting the bidiagonal matrix B, and my results are definitely not bidiagonal. I have further concerns regarding algorithm 5.4.2: the example results don't seem to satisfy the property U \cdot B \cdot V^\top = A, as the first column of A ends up negative. Could someone verify that these are the correct results. --ANONYMOUS COWARD0xC0DE 00:47, 2 January 2007 (UTC)
Well, it seems I will have to ask how to compute the SVD elsewhere, when I find my answers I will post back here. --ANONYMOUS COWARD0xC0DE 07:07, 5 February 2007 (UTC)
M = \begin{bmatrix} -5 & 3 \\ 0 & 8 \\ -1 & 5 \end{bmatrix}, \quad M^\top M = \begin{bmatrix} 26 & -20 \\ -20 & 98 \end{bmatrix}, \quad M M^\top = \begin{bmatrix} 34 & 24 & 20 \\ 24 & 64 & 40 \\ 20 & 40 & 26 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} 10.16 & 0 \\ 0 & 4.563 \\ 0 & 0 \end{bmatrix}

\operatorname{eigenvectors}(M^\top M) = \begin{bmatrix} -.2508 & .968 \\ .968 & .2508 \end{bmatrix} = V, \quad \operatorname{eigenvectors}(M M^\top) = \begin{bmatrix} .4094 & .8959 & -.1726 \\ .7624 & -.4398 & -.4747 \\ .5012 & -.06272 & .8631 \end{bmatrix} \overset{?}{=} U

First you get the eigenvectors of M^\top M (the smaller matrix) and assign them to V. In this setting U and V are orthogonal, however

U \cdot \Sigma \cdot V^\top = \begin{bmatrix} 2.914 & 5.052 \\ -3.885 & 6.995 \\ -1.554 & 4.857 \end{bmatrix}

is not even close. So instead you calculate U via U = M \cdot V \cdot \Sigma^{-1}, where \Sigma^{-1} is \Sigma^\top with each element on the diagonal inverted, giving

\Sigma^{-1} = \begin{bmatrix} \frac{1}{10.16} & 0 & 0 \\ 0 & \frac{1}{4.563} & 0 \end{bmatrix}

U = \begin{bmatrix} -5 & 3 \\ 0 & 8 \\ -1 & 5 \end{bmatrix} \begin{bmatrix} -.2508 & .968 \\ .968 & .2508 \end{bmatrix} \begin{bmatrix} \frac{1}{10.16} & 0 & 0 \\ 0 & \frac{1}{4.563} & 0 \end{bmatrix} = \begin{bmatrix} .4093 & -.8958 & 0 \\ .7622 & .4397 & 0 \\ .5011 & .06268 & 0 \end{bmatrix}

That's the simplest way to solve an m-by-n matrix where m >= n, and I believe if n >= m you can use V = M \cdot \Sigma^{-1} \cdot U^\top, though I haven't tried it yet. However, U is not orthogonal in my example, and I don't know how to correct that. My method seems to be the same as that used by the $50 .NET matrix library; for example, if you input my 3-by-2 matrix M into their free trial interface, it will cut off the offending columns in U, so the dimensions of U would be 3-by-2 instead of 3-by-3, in violation of the orthogonality property. Thus I am left with just one question: how do I assign the appropriate numbers/values (the exact values in the columns are irrelevant as their eigenvalues are zero) to the remaining columns of the matrix to fulfill the orthogonality property? --ANONYMOUS COWARD0xC0DE 23:09, 27 November 2006 (UTC)
Dear AC, the singular vectors u you get by calculating the eigenvectors of M M* may each be +1 or -1 times the singular vectors u_v corresponding to the eigenvectors v of M* M. (Can you see that this is so?)
More generally, if there are degenerate singular vectors v, i.e. more than one sharing the same singular value, then a singular vector u that you get from M M* may be any unit linear combination of the vectors u_v that correspond to the vectors v that share that singular value. (See the section in the article on whether the SVD is uniquely determined.)
The way round this, if you are calculating the SVD by hand, is to calculate the vectors v, then multiply each by M; then σ_v = the magnitude of M v, and u_v = (1/σ_v) M v. That will ensure you get left and right singular vectors that match. Jheald 09:48, 4 December 2006 (UTC)
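For anyone who wants to try that recipe numerically, here is a minimal sketch in plain NumPy (not from the discussion above; it assumes a real matrix whose singular values are all nonzero, and reuses the 3-by-2 example from this thread):

import numpy as np

M = np.array([[-5., 3.],
              [0., 8.],
              [-1., 5.]])                      # the 3-by-2 example from this thread

lam, V = np.linalg.eigh(M.T @ M)               # eigenpairs of M^T M (eigenvalues come out ascending)
order = np.argsort(lam)[::-1]                  # reorder so the largest eigenvalue comes first
lam, V = lam[order], V[:, order]

sigma = np.sqrt(lam)                           # singular values
U_thin = (M @ V) / sigma                       # u_i = M v_i / sigma_i, so left and right vectors match

print(sigma)                                   # roughly [10.16, 4.563]
print(np.allclose((U_thin * sigma) @ V.T, M))  # True: the thin SVD reconstructs M

The division by sigma is exactly the u = (1/σ)Mv step, so the columns of U automatically match the columns of V.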
For the null space of M you can use any set of orthonormal u vectors that correspond to zero eigenvalues of M M*; you can find v vectors (if any more are needed) by finding the eigenvectors that correspond to zero eigenvalues of M* M. Alternatively, you can find either set of zero singular vectors by orthogonalising random vectors with respect to the vectors you've already got (u or v respectively), until you've successfully found a full basis. Jheald 10:07, 4 December 2006 (UTC)
I have noticed the {+1,-1} pattern. I also tried u = (1 / σ)Mv and it works perfectly and is equivalent to my method of U=M\cdot V \cdot \Sigma^{-1} where \Sigma^{-1} = \Sigma^\top followed by inverting each element on the diagonal (awkward notation, but equivalent). As for "orthogonalising random vectors [...] until you've successfully found a full basis" I was hoping for something that was more definite (in other words purely in complexity class P and not ZPP).--ANONYMOUS COWARD0xC0DE 06:36, 25 December 2006 (UTC)
I found a temporary solution to my problem that I should have seen earlier: the QR decomposition. Take the orthogonal vectors I have, then perform a QR decomposition on the vector matrix and replace the vector matrix with the resulting Q matrix. The QR decomposition will maintain the currently orthogonal vectors and further ensure that the entire vector matrix is orthogonal. But I would still like to see a more "mainstream" way of computing the full SVD. --ANONYMOUS COWARD0xC0DE 07:07, 5 February 2007 (UTC)
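A sketch of that QR trick in NumPy (again not from the thread; the helper name complete_basis is just illustrative, and it assumes the columns you already have are orthonormal):

import numpy as np

def complete_basis(U_thin):
    # Extend orthonormal columns to a full orthogonal matrix: QR of [U_thin | I]
    # keeps the existing columns (up to sign and the rounding of the input)
    # and fills in the rest of an orthonormal basis.
    m = U_thin.shape[0]
    Q, _ = np.linalg.qr(np.hstack([U_thin, np.eye(m)]), mode='complete')
    return Q

U_thin = np.array([[.4093, -.8958],
                   [.7622,  .4397],
                   [.5011,  .06268]])              # the 3-by-2 U from the worked example in this thread
U_full = complete_basis(U_thin)
print(np.allclose(U_full.T @ U_full, np.eye(3)))   # True: U_full is now a full orthogonal matrix

Because the first columns of the input are already orthonormal, the QR step reproduces them and only has to construct the missing ones.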
Also regarding your statement "if you are calculating the SVD by hand"; In my mind I am not capable of doing it any other way but the algorithmic way, when I think "do it by hand", I see no difference from computing it on a computer library package (besides being on paper and far slower). --ANONYMOUS COWARD0xC0DE 07:07, 5 February 2007 (UTC)
Well, I added a bit about "by hand" calculation, but then Mct mht removed it, stating that it's already in the article, so apparently he's seeing something I'm not... Sim 02:51, 30 April 2007 (UTC)
I guess Mct mht is referring to the section Singular value decomposition#Linear algebraic proof. I think the bit you added goes too much into detail given that section, but there should probably be a bit about the method in the computation section. -- Jitse Niesen (talk) 04:45, 30 April 2007 (UTC)
Ah. Okay. Thanks for the heads-up to the two of you. :) Sim 03:13, 2 May 2007 (UTC)

[edit] Singular values, singular vectors, and their relation to the SVD

The first math expression in this section appears in the code but not the page, at least on my screen. I don't know enough about the expression mark-up to fix it. proteus71 17:22, 2 Nov 2005 (UTC)

Could you please check whether it is still not appearing? It appears on my screen. It might have something to do with the servers being slow today (probably because they're overloaded). -- Jitse Niesen (talk) 22:30, 2 November 2005 (UTC)

[edit] Freedom in U and V

I removed the following text:

"If M is real, and the singular values are all distinct, then unique orthogonal matrices U and VT can be obtained. Otherwise U and V can be transformed by any unitary matrix S that commutes with Σ, such that U' = U S and (V')* = S* V*."

It is always possible to replace U and V by −U and −V, respectively. For the second part to be true, we need n = m.

Obviously, there is something to be said about the freedom in U and V. There is some discussion about it further down, but there are probably things to add. -- Jitse Niesen (talk) 00:06, 14 January 2006 (UTC)

[edit] Recent changes

In the intro, I changed 'is analogous' (to eigenvalue decomposition) to 'is similar', and added some text which makes it clearer that the two decompositions are not too similar. Also, I dropped the square characterization of symmetric and Hermitian matrices since such matrices are always square.

In the 'Singular values, singular vectors, and their relation to the SVD' section I changed to "if and only if" since this is a definition of what a singular value is. Also, we must require that u and v are normalized in order to talk about a singular value. Added some text about the relation between the theorem and the singular values and vectors defined here. Added a definition of what a degenerate singular value is. Tried to make it clearer when and in what way the SVD is unique or not.

The discussion on the relation with eigenvalue decomposition is expanded and given a section of its own.

Added a discussion on the relation between SVD and the range and null space of M. Changed singular value = sqrt(eigenvalue) to singular value squared = eigenvalue.

Added a discussion on range and null space in "applications" section.

Moved small section on rank to applications, since it is an application and it follows from the range and null space stuff.

--KYN 20:28, 16 January 2006 (UTC)

Great stuff. I tried to further clarify the uniqueness question; hope that's okay with you. Two further comments: firstly, it is quite common for mathematicians to use "if" in definitions where it should logically be "if and only if" (see, for instance, If and only if#Definitions); secondly, out of curiosity, why must u and v be normalized? I checked a couple of books and they all defined the singular vectors as columns of the matrices U and V, which means that they are indeed of unit length, but I don't see a particular reason to demand that the singular vectors are normalized. -- Jitse Niesen (talk) 22:07, 17 January 2006 (UTC)
You want the singular vectors to be normalised because (i) that's what makes them useful as unit basis vectors; and (ii) it's only if you normalise the singular vectors that you get the singular values coming out as well-defined gain factors in the Σ part of the decomposition. -- Jheald 22:23, 17 January 2006 (UTC).
Could you please explain (ii)? I don't see it. Sorry to be such a nuisance, but I'm just trying to understand why eigenvectors are not normalized while singular vectors are. -- Jitse Niesen (talk) 22:44, 17 January 2006 (UTC)
Eigenvectors don't need to be normalised if the only property that you're interested in is A v = λ v. That is often enough the case that you can work quite happily with un-normalised eigenvectors. But if you're using the eigenvectors for a spectral decomposition, A = V Λ V*, then those column vectors in V of course do need to be normalised.
The difference with SVD is that it is essentially always the property M = U Σ V* that makes SVD practically useful, and for that the singular vectors need to be normalised; so it is essentially only normalised singular vectors that are ever of interest. It is of course true that M (k v) would still equal σ (k u), but this relation is not what makes the SVD useful. -- Jheald 23:27, 17 January 2006 (UTC).
I agree, and would add that my copy of Golub and Van Loan, Matrix Computations (2nd ed., page 71), states that u and v are "unit 2-norm vectors". To make the above point even more clear: assume that M v = σ u and M^* u = σ v for normalized u and v. Now look at the same relations with u replaced by 2u and σ by σ': M v = 2σ' u and M^* 2u = σ' v. Identifying with the first relations gives (1) σ = 2σ' and (2) σ = σ'/2, which implies σ = σ' = 0! Consequently, we need to impose some restrictions on u and v in order to make an interpretation of the two relations. This fact is not obvious from the outset, and maybe worth mentioning somewhere in the article? --KYN 10:08, 18 January 2006 (UTC)
In reply to Jheald, I kind of see your point, but the relation Mv = σu is important for the geometric interpretation of the SVD, which was historically the motivation. However, I'll accept this as the reason for the normalization condition.
In reply to KYN, what you showed is that if you replace u by 2u and σ is not zero, the relations M v = σ u and M^* u = σ v cannot be satisfied, unless you replace v by 2v. This is also what Jheald means with his/her last sentence: you have to scale u and v by the same factor. If you do that, σ does not change. -- Jitse Niesen (talk) 12:23, 18 January 2006 (UTC)

[edit] In "Statement of the theorem" section

it says in the bullet list that U and V "describe" the rows or columns of M. This sounds a bit fuzzy. What exactly do they describe, and in what way? --KYN 22:45, 16 January 2006 (UTC)

[edit] Usage of ∗-symbol

Hi, I have just seen that the &lowast; symbol, especially "<sup>&lowast;</sup>" as used in the paragraph "Relation to eigenvalue decomposition", is not rendered correctly in Opera 8 and IE (because of the font I've chosen?) while being correctly shown in FF. Is there a reason not to just use a simple asterisk instead of the &lowast;? I mean, the asterisk is used everywhere else for this purpose. --(Tobi)134.109.148.28 14:53, 21 March 2006 (UTC)

I think the &lowast; symbol looks a bit nicer, but it's not essential. And because it does not display correctly on some systems, it has to go, so I removed it. Thanks for your comment. -- Jitse Niesen (talk) 05:18, 22 March 2006 (UTC)

[edit] How to compute compact or truncated SVD?

Is there a way to compute SVD eigenvectors in a sequence? That is to say, first compute the eigenvector for the largest eigenvalue, then compute the eigenvector for the second largest eigenvalue, etc. 04 March 2006

Yes, this is possible. The idea is that the singular values of A are related to the eigenvalues of A^*A, and to apply an iterative eigenvalue algorithm like the Lanczos iteration to find the eigenvalues of the symmetric matrix A^*A. I'm afraid I can't tell you the details though. One problem is that you might lose accuracy when forming the product A^*A. I note that Matlab uses the matrix
 \begin{bmatrix} 0 & A \\ A^* & 0 \end{bmatrix}
instead. -- Jitse Niesen (talk) 04:43, 5 April 2006 (UTC)


Q: Thanks for the answer. But as I understand it, the Lanczos iteration can only find the eigenvector corresponding to the largest eigenvalue. After the largest eigenvector has been computed, how is the eigenvector corresponding to the second largest eigenvalue computed? I would appreciate it if you could give a specific reference. Apr 05 2006

Sorry, I forgot that you're after the vectors. It seems that Lanczos can do this, at least this is what the theorem in Section VI.6.3 in Y. Saad, Numerical methods for large eigenvalue problems, indicates. I know only very little about large-scale numerical linear algebra, so I can only recommend you to look in that book (which is online at Saad's web site). Another interesting reference is Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, 1980. I haven't read it, but it's a whole book so it's bound to touch on computing the eigenvectors, though it might be a bit out-dated. -- Jitse Niesen (talk) 02:06, 6 April 2006 (UTC)


One way to get your head around the Lanczos method is to imagine using the Conjugate gradient method to minimise the potential function \frac{1}{2}x^T A x - x^T b for some known positive definite symmetric matrix A, starting from an initial position x = x_0, and then iteratively getting closer to the solution x = A^{-1} b.
As a reminder, at each step of conjugate gradient we minimise the function in a particular search direction; then at the minimum point for that search direction, we calculate the negative of the gradient, which is just b - A x_n. This gives a new direction, orthogonal to the subspace we have just minimised over. However, although the negative of the gradient gives the direction of steepest descent, it can be shown that it is actually suboptimal in its component in the direction we have just come from: the optimal search direction is the conjugate direction, which is a linear combination of the new gradient and the direction we examined immediately previously.
By searching along the conjugate direction, rather than the steepest descent direction, we don't botch up the minimisation w.r.t. the previous search directions, so there is none of the "zig-zagging" that you get using pure gradient descent; instead, by the end of the nth step, the Conjugate gradient algorithm should have accurately minimised the quadratic function in the space spanned by [A x_0, A^2 x_0, A^3 x_0, ..., A^n x_0] added to the initial guess x_0.
The connection with the Lanczos algorithm is this: suppose we want to give an estimate of the matrix A, just from that sequence of gradients we have calculated. The conjugate gradient property is equivalent to saying that the truncation of A in the Krylov space [A x_0, A^2 x_0, ..., A^n x_0] can be represented by U T U^T, where T is a symmetric tridiagonal matrix (i.e. a diagonal and one off-diagonal, repeated above and below), and U is the (normalised) set of (already orthogonal) gradient directions that we have progressively calculated.
It is quick to eigen-decompose T into V Λ V^T as it is (i) not very big (usually we will stop after a number of iterations n which is very much smaller than the full rank of A), and (ii) already tridiagonal.
So this quite readily gives the eigen-decomposition of the truncation of A in this Krylov space.
Note that this is not the same as the first n eigenvalues and eigenvectors of A. However, because of the wonders of exponential increase, it should give you quite a good handle on, say, the first n/4 eigenvalues and eigenvectors of A, even if they are hardly present in your initial seed vector x_0. (The more the largest eigenvalues differ from the rest of the pack, the more quickly more of their eigenvectors will become well approximated.)
For more details on Conjugate gradient, and how it relates to the Lanczos method, see eg Golub and van Loan (3e) section 10.2 (pp. 520-532). A more direct presentation of the Lanczos algorithm by itself is given earlier in chapter 9 (pp.470-490).
Note that in real-world numerics (as opposed to idealised infinite precision numerics), issues arise, essentially because the minimisation can't be done perfectly, so the new gradient vectors aren't automatically perfectly orthogonalised to the foregoing ones, and if left untreated this creates a seed which can allow duplicate appearances of particular eigenvectors to build up. The numerical analysis of how this occurs has been well known since the early 70s; various fixes (of various levels of computational expense) are known, and discussed in the book.
It's well worth playing with simple codes to get your own feel for what's going on; but if you need to apply it really seriously, for that reason it may well be worth searching out pre-canned routines which have been tuned into the ground both for accuracy and for efficiency.
Hope this helps,
All best -- Jheald 20:55, 6 April 2006 (UTC).
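For readers who want to experiment with the idea sketched above, here is a minimal NumPy version of the basic Lanczos iteration (not taken from the thread or from Golub and van Loan; it has no re-orthogonalisation, so it is a toy rather than a production routine):

import numpy as np

def lanczos(A, x0, k):
    # Basic Lanczos iteration for a symmetric matrix A.
    # Returns Q (n-by-k, orthonormal columns in exact arithmetic) and the
    # k-by-k symmetric tridiagonal T = Q^T A Q; the extreme eigenvalues of T
    # (the Ritz values) approximate the extreme eigenvalues of A.
    n = A.shape[0]
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k - 1)
    q = x0 / np.linalg.norm(x0)
    q_prev = np.zeros(n)
    b = 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - b * q_prev                 # three-term recurrence
        alpha[j] = q @ w
        w = w - alpha[j] * q
        if j < k - 1:
            b = np.linalg.norm(w)              # breakdown (b == 0) is ignored in this sketch
            beta[j] = b
            q_prev, q = q, w / b
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T

rng = np.random.default_rng(0)
B = rng.standard_normal((200, 200))
A = B @ B.T                                    # symmetric test matrix
Q, T = lanczos(A, rng.standard_normal(200), 30)
ritz = np.sort(np.linalg.eigvalsh(T))[::-1]
exact = np.sort(np.linalg.eigvalsh(A))[::-1]
print(ritz[:3])                                # largest Ritz values ...
print(exact[:3])                               # ... closely match the largest eigenvalues of A

The Ritz values of the small tridiagonal T approximate the extreme eigenvalues of A after relatively few iterations, which is exactly the behaviour described above; the loss-of-orthogonality issues mentioned are what the "pre-canned" routines are careful about.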


BTW: At the moment the WP articles on Lanczos algorithm, Gradient descent, and Conjugate gradient method could all use some work IMO. In particular
  • the Gradient descent article should make much clearer what is wrong with Gradient descent as a method, and why Conjugate gradient fixes it.
  • the CG article isn't so bad so far as it goes, but should make it much clearer that CG is a method for general nonlinear minimisation, not just for solving linear systems; more clearly explain (with a picture?) why it fixes the problems of gradient descent; and explain how the approximation to the curvature (Hessian) matrix can be built up through successive CG minimisations.
  • the Lanczos method article could use a lot more expansion, particularly regarding explanation, motivation, some of the numerical issues that arise, and how to address them.
I can't offer any time of my own at the moment to fix them, sadly; but it would be rather good if somebody could have a go. -- Jheald 21:13, 6 April 2006 (UTC).


BTW (2): The Lanczos method is particularly good if either (i) you're only interested in the largest k eigenvectors, where k << n the size of the matrix; and/or (ii) if it is unusually fast to apply the matrix (or operator) A to a vector (e.g. if A is sparse; or if it is (say) a Fourier operator).
In cases where you just want a quicker SVD, e.g. because you know the matrix is very non-square (m >> n), and you're not interested in the null-space singular vectors, then, as the article says, you can save a lot of time by doing a QR decomposition first, to give an m × n orthogonal Q matrix and an n × n triangular matrix R, and then calculating the much smaller SVD of R (an approach called R-SVD). Also note that often (including in R-SVD) it is not worth explicitly calculating the coefficients of the orthogonal matrices: don't calculate them at all if you don't need them; or, if you're only calculating them in order to apply them to a few vectors, apply the elementary rotations directly to those vectors as the rotations get calculated, and never explicitly form the orthogonal matrices at all.
Most "canned" SVD routines should give you some or all of these options. Golub and van Loan (3e) gives an idea of how much the required flop counts can be reduced, depending on how much of the SVD you don't actually need, on their page 254. -- Jheald 21:34, 6 April 2006 (UTC).

[edit] Possible errata in "Relation to eigenvalue decomposition"

I think instead of "the squares of the non-zero singular values of M" it should say "the squares of the modulus of the non-zero singular values of M". Don't you think so?

singular values are by definition non-negative, therefore real. Mct mht 17:40, 20 June 2006 (UTC)

[edit] Dense article

I find this article overly dense. When trying to understand SVD it was unhelpful. For example compare with http://www.uwlax.edu/faculty/will/svd/svd/index.html, where the intuition behind the matrix decomposition is presented using a traditional graphic form.

[edit] Request: Non - mathematician understanding of SVD

Can someone please try to explain in the article, to the non-mathematician, what the SVD is, what it means, and what it is good for? Thanks.

69.107.60.124 21:06, 28 January 2007 (UTC) Dear Anonymous... I share your sentiment. Many articles on Wikipedia, while mathematically brilliant, are usually of less value to the general public using them. An article such as the one found at http://www.kwon3d.com/theory/jkinem/svd.html is much more valuable on the topic to the public at large, IMO. It generally takes the majority of people many, many reads of such articles... I feel these articles on Wikipedia are tailored for the < 5% of the population.

I added an example demonstrating the form and properties of the three matrices. While this does not fully address the request, I think that it helps. --Diomidis Spinellis 16:22, 29 January 2007 (UTC)

[edit] Losing the intent of Wiki

The authors of the math sections on Wiki need to quickly realize that wiki isn't meant just for them but for the actual majority of the public. In writing articles so densely, it doesn't matter how correct the article is; it just comes across as gibberish. These authors would do the wiki community much better if they refrained from writing such useless explanations on topics in the first place. These guys are really doing more harm than helping... If they want to write a PhD thesis, please have them do it elsewhere. —The preceding unsigned comment was added by 69.28.107.2 (talk) 21:58, 19 February 2007 (UTC).

The author of the above needs to realize that vague whinging is not constructive, nor is promoting his/her own favorite article, putting a link at the top of the list and even having the guts to call it a better explanation while it is in fact equally dense and clearly incomplete. Specific suggestions are welcome, attempts to improve the article are even better, and I do hope that the author of the above decides to become more constructive. -- Jitse Niesen (talk) 00:05, 20 February 2007 (UTC)

[edit] Problems with "Linear Algebraic Proof"

Please pardon the LaTeX...

\begin{document}
The ``Linear Algebraic Proof'' does not specify dimensions.  Assuming
consistency -- with what appears previously in the article --
$M$ is $m$-by-$n$, hence $M^{\ast}M$ is $n$-by-$n$ and $U$ is $n$-by-$n$
(this already conflicts with the dimensions of $U$ given in the
``Statement of the theorem'', but let's ignore that for now).  
Consequently,
$$
U^{\ast} M^{\ast} M U = 
\left( \begin{array}{cc} \Sigma & 0 \\ 0 & 0\end{array} \right)
$$
is $n$-by-$n$.  However, $\Sigma$ is supposed to be $m$-by-$n$ which 
means that the partitioned matrix
$$
\left( \begin{array}{cc} \Sigma & 0 \\ 0 & 0\end{array} \right)
$$
is actually 
$$
\left( \begin{array}{c} \Sigma \\ 0 \end{array} \right)
$$
It does not help to begin with $M M^{\ast}$, since the partitioned 
matrix 
$$
\left( \begin{array}{cc} \Sigma & 0 \\ 0 & 0\end{array} \right)
$$
in that case must actually be
$$
\left( \begin{array}{cc} \Sigma & 0 \end{array} \right)
$$
Perhaps it is {\em impossible\/} to present a ``Linear Algebraic Proof''
that is consistent with previous notation... but then alternate
notation should be used (and dimensions should be given explicitly)
so that the ``Proof'' actually makes sense!  Moreover, a real proof
would most likely need to consider cases ($n \le m$ and $n > m$).
\end{document} 

—The preceding unsigned comment was added by 160.36.59.7 (talk) 17:03, 19 January 2007 (UTC).

The matrices U and Σ in "Linear Algebraic Proof" are not the same as the matrices U and Σ in the statement of the theorem. This is rather confusing, so I renamed the variables in the proof. I don't think it's necessary to give the dimensions of the matrices, but I did add the dimensions in one place to be absolutely clear. A quick check didn't show me where the proof needs to distinguish cases. -- Jitse Niesen (talk) 12:28, 26 January 2007 (UTC)
thanks, Jitse. to anon: the argument is pretty standard. the matrix dimensions should be clear by inspection. starting with MM*, the same argument goes through with appropriate (in fact very little) modification. Mct mht 13:07, 26 January 2007 (UTC)

[edit] question about the number of zeroes

What does the number of zeroes show after computing the SVD of a matrix? I mean, comparing two matrices of the same size, what could we understand from comparing the number of zeroes in their SVD matrices? Amirshahi62 15:09, 1 March 2007 (UTC)

if you're talking about the diagonal matrix D where A = UDV, then the number of nonzero elements on the diagonal of D is the rank of A, because D^2 = U A A^* U^* (say) means range D = range U A. Mct mht 18:19, 1 March 2007 (UTC)
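In practice "nonzero" means "above a tolerance", since the computed singular values of a rank-deficient matrix come out tiny rather than exactly zero. A small NumPy illustration (not from the thread; the tolerance choice is an assumption, roughly the default used by numpy.linalg.matrix_rank):

import numpy as np

def rank_from_svd(A, rtol=None):
    # Numerical rank: count the singular values above a tolerance.
    s = np.linalg.svd(A, compute_uv=False)
    tol = (rtol if rtol is not None else max(A.shape) * np.finfo(A.dtype).eps) * s.max()
    return int(np.sum(s > tol))

A = np.array([[1., 2., 3.],
              [2., 4., 6.],                    # = 2 x the first row
              [0., 1., 1.]])
print(rank_from_svd(A))                        # 2
print(np.linalg.matrix_rank(A))                # 2, for comparison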

[edit] add PROPACK SVD implementation

please add a link to PROPACK sparse SVD implementation:

http://soi.stanford.edu/~rmunk/PROPACK/index.html

it's actually the ONLY implementation that I've managed to get working on large sparse matrices. Both SVDPACK and SVDPACKC are terribly written; they barely work on their test matrices (they die with SEGFAULT). —The preceding unsigned comment was added by 195.5.55.14 (talk) 15:31, 2 March 2007 (UTC).

Looks okay, so I added the link. -- Jitse Niesen (talk) 03:27, 3 March 2007 (UTC)

[edit] This page is quite technical

I also agree -- this page could use some examples. I'm just interested in taking the SVD of a matrix for an engineering problem, and all the theorems here, though nice, don't help me any. This page is also frequently visited, and I think most visitors are looking for something simpler than this. Apologies for posting in potentially the wrong place -- first post. 129.170.66.44 (talk) 04:11, 12 March 2008 (UTC)

I agree, at least in principle, with the poster who remarked earlier about this page being too dense and technical. Maybe that's unavoidable in this subject, but I do think it's worth at least looking at this page with a serious eye and asking "Is there any way that the reach of this page can be broadened, is there any way its usefulness can be increased for the average person?". Even just saying how mathematicians and engineers use it in the introduction would help a lot. Maybe moving the history section to the top as well. FrozenPurpleCube 15:21, 14 March 2007 (UTC)

This is a topic that can be explained to people just learning some linear algebra, so it should certainly be made accessible. Convincing college freshmen/sophomores of the utility of something like this is a different story however :-). Anyway, the introduction seems quite weak in terms of context and content. So the tag seems warranted. I'll take a better look some time. --C S (Talk) 12:38, 20 March 2007 (UTC)


More explicitly, if

\{s_1, \ldots, s_p\} are the eigenvalues of M^*M,

then S is an m-by-n matrix with p non-zero diagonal elements,

S = \begin{bmatrix}
\sqrt{s_1} & 0          & \cdots &            &        &        \\
0          & \sqrt{s_2} & 0      & \cdots     &        &        \\
\vdots     & 0          & \ddots &            &        &        \\
           & \vdots     &        & \sqrt{s_p} & 0      & \cdots \\
           &            &        & 0          & 0      &        \\
           &            &        & \vdots     &        & \ddots
\end{bmatrix}

and

V = \begin{bmatrix} v_1 & v_2 & v_3 & \cdots & v_p & v_{p+1} & \cdots & v_n \end{bmatrix}

[edit] The SVD via the eigenvalues of M * M

Here is a version of the section The SVD via the eigenvalues of M * M

The approach of computing the SVD via the eigenvalues of M^*M is correct from a mathematical point of view. But is it really as fast and accurate as dedicated methods such as those presented in Golub & van Loan? --KYN 14:41, 18 July 2007 (UTC)

not a numerical analyst, but i would guess the answer is no. that section is also redundant; perhaps it can be removed altogether. Mct mht 22:43, 18 July 2007 (UTC)

I also propose to move the Application section (now at #13) to an earlier position, perhaps after the formal definition has been presented. Right now the article appears too technical for the reader not familiar with the concept. --KYN 14:46, 18 July 2007 (UTC)

---

The application section has now been moved to position 4. The remaining issue is what to do with the section "The SVD via the eigenvalues of M * M". I propose to remove it entirely because

  • It does not provide a straightforward computational approach since it relies on eigenvalue decompositions, so it remains to answer how to accomplish such a decomposition.
  • It appears to be significantly slower and probably of less numerical accuracy compared to a dedicated SVD algorithm.

--KYN 09:43, 31 July 2007 (UTC)

Since there has been no support for keeping the section on computing the SVD via the eigenvalue decomposition, I have removed it. If someone wants to put it back in, please provide justification in accordance with this discussion.

I am starting to look at the section "Linear algebraic proof" with a similar view. Although formally correct, the "proof" makes a strong reference to the spectral theorem. However, I don't think that proving the spectral theorem is any easier than proving the spectral decomposition using a variational method. Please correct me if I'm wrong. A proper proof must base statements on "simpler" facts rather than move laterally. Proposal: remove also section "Linear algebraic proof". Alternatively, start a new article on the various connections/relations between SVD and EVD which exist, without assuming that EVD is more basic than SVD. --KYN 16:05, 6 August 2007 (UTC)

the proof is a standard one (i doubt you can find a mathematician who disagrees) and it contains essentially the relationship between the eigenvalue decomposition and the SVD. anything else one would wanna say in this regard is a trivial consequence (e.g. the statements in the "Relation to eigenvalue decomposition" and "Singular values, singular vectors, and their relation to the SVD" sections). but i for one am not going to object if you wanna remove it, as the article is getting long-ish.
as an aside, the two proofs in the present version present a nice contrast: one algebraic while the other analytic. we also see how they generalize to bounded Hilbert space operators somewhere below. your comment on what constitutes a "proper" proof aside, which is very wrong (sry, :-) ), the spectral theorem for Hermitian matrices follows easily from the fact that every matrix has an eigenvector. Mct mht 16:56, 6 August 2007 (UTC)


I'm for keeping the algebraic proof. Its reliance on the eigenvalue decomposition is really good, as many more people are familiar with the eigenvector decomposition. One of the fun bits is to take a maximizing problem and solve it with an algebraic expression. So to see the proof from the algebraic perspective is nice.

I'm also for putting back the computing SVD via eigenvalue decomposition section. (I have to own up to being the author) The reasons are

  • we need to give at least one algorithm, not just links to libraries of code, so people can have a crack themselves.
  • It is a simple method to calculate the SVD given an EVD (there are many algorithms for calculating the EVD given in Wikipedia).
  • Its speed and accuracy are tied to the speed and accuracy of the EVD and of the multiplication M \cdot V_1 \cdot D^{-1}.
About the proof: I'm not totally against the so-called algebraic proof, and I agree that it illustrates some interesting relations between SVD and EVD. My objection is simply that it suggests that EVD is simpler, or more general, or more basic than SVD. Of course, this is true for some readers but not necessarily for all. I notice that at least two books on the subject, Golub & van Loan and Numerical Recipes, describe SVD rather early and EVD toward the end, which to me suggests that at least these authors prefer to present it the other way around: SVD is the general approach which works for all matrices, and EVD is a special case which only works for normal matrices. I propose to at least change the heading from "Algebraic proof" to something like "Based on the spectral theorem". This goes more along the lines of the text just before that heading, which refers to the two "proofs" as "arguments".
About the section on computing the SVD via EVD: I must confess that this topic is slightly outside my expertise, but I'm guessing that the relevance of computing the SVD via EVD decreases with the size of the matrix. Please correct me if I'm wrong here. For example, I can imagine that when the round-off errors, which are inevitable when a large matrix is multiplied by its own transpose, are of the same order of magnitude as the smaller singular values, there may be problems. What we could say in the article is that if you want to compute the SVD and don't have a dedicated algorithm for that BUT you do have one for EVD, or are prepared to implement one (instead of implementing a dedicated SVD algorithm), then you can find the SVD along the lines which were presented (and now removed). What we should not do is to suggest that this approach is a standard algorithm for computing the SVD. --KYN 11:42, 10 August 2007 (UTC)
The algebraic proof is useful in that it shows the relationship with the eigenvalue decomposition of M*M. I think that most people are more familiar with eigenvalues than singular values. However, the relationship SVD/EVD is already (though in less detail) explained earlier in the article, and wonder how much the proof adds.
Calculating the SVD via the EVD is not a good algorithm and I added that to the article. The problem is that the EVD may be ill-conditioned even though the SVD is not. For those that nevertheless want to implement the algorithm, the explanation of the relation between SVD and EVD in the article should suffice.
Incidentally, you can compute the SVD stably via the EVD, but you have to use the relation between the SVD of M and the EVD of
 \begin{bmatrix} 0 & M \\ M^* & 0 \end{bmatrix}.
The obvious problem is that this matrix is bigger than M. Apparently, the standard methods for computing the SVD are somehow related to this, but they don't explicitly construct this matrix. However, I don't know the details. -- Jitse Niesen (talk) 13:18, 10 August 2007 (UTC)
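A small NumPy check of that relation (not from the discussion; purely a numerical illustration): the nonzero eigenvalues of the augmented matrix come in plus/minus pairs whose positive halves are exactly the singular values of M.

import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 3))
m, n = M.shape

# Augmented symmetric matrix [[0, M], [M^T, 0]]
H = np.block([[np.zeros((m, m)), M],
              [M.T, np.zeros((n, n))]])

w = np.linalg.eigvalsh(H)                      # eigenvalues of H: plus/minus sigma_i, plus m - n zeros
s = np.linalg.svd(M, compute_uv=False)         # singular values of M, descending

print(np.sort(w[w > 1e-10])[::-1])             # the positive eigenvalues of H ...
print(s)                                       # ... are the singular values of M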

[edit] History

I am not a mathematician, so I do not want to edit the page directly, but it occurred to me that the History section does not seem to be quite correct. It ends thus:

 Practical methods for computing the SVD were unknown until 1965 when Gene Golub and William Kahan published their algorithm.
 In 1970, Golub and Christian Reinsch published a variant of the Golub/Kahan algorithm that is still the one most-used today.

Did not Cornelius Lanczos publish his algorithm for what has later been called SVD in 1950? And is his algorithm not the most-used one today? (For instance, it is the only algorithm implemented in svdlibc.) -Peachris (talk) 23:47, 10 December 2007 (UTC)

P.S. Here is the reference: Lanczos, C. (1951). An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators. Journal of research of the National Bureau of Standards, 45, 255-282. -Peachris (talk) 19:43, 11 December 2007 (UTC)
Hmmm, I don't know. I thought that paper introduced the Lanczos algorithm for finding eigenvalues. Are singular values (possibly under another name) discussed in that paper? Golub and Kahan mention in their paper that the Lanczos algorithm can be used for computing singular values, so perhaps this idea was already floating around. However, the first reference I know of that discusses it in detail is Paige, Bidiagonalization of Matrices and Solution of Linear Equations, SIAM J. Numer. Anal., 1974.
I think that the Lanczos algorithm is not suitable for computing the full SVD. It seems to be very useful for the computation of the largest few singular values of a sparse matrix (which is a fairly common problem in practice), but the smaller singular values converge relatively slowly. See for instance the documentation of SVDPACK, on which svdlibc is based.
I should stress that numerical linear algebra is not my specialization and that I may well be wrong. -- Jitse Niesen (talk) 18:37, 16 December 2007 (UTC)
Thank you Jitse, your comments were very helpful and convincing. As a practitioner, I only need to perform truncated SVD on sparse matrices, which explains why I have not needed anything else than the Lanczos algorithm as implemented in svdpack/svdpackc/svdlibc. Thanks also for the link to the svdpack documentation – it is more helpful than the svdpackc documentation I was aware of. I'm afraid I cannot help explaining Lanczos' 1951 paper – it goes way above my head. It does not seem to mention singular value decomposition, but of course he might use another term for the same thing. -Peachris (talk) 01:54, 19 December 2007 (UTC)

[edit] Dimensions of matrices incorrect in statement of theorem

I think it says m-by-m when it should say m-by-n, and there may be other problems. Currently this article says:

Suppose M is an m-by-n matrix whose entries come from the field K, which is either the field of real numbers or the field of complex numbers. Then there exists a factorization of the form

M = U\Sigma V^*,
where U is an m-by-m unitary matrix over K, the matrix Σ is m-by-n with nonnegative numbers on the diagonal and zeros off the diagonal, and V* denotes the conjugate transpose of V, an n-by-n unitary matrix over K.

That seems definitely wrong to me. For example, the matrix which has nonzero diagonal elements should be a square matrix -- otherwise it makes no sense. Compare the treatment in Press et al., Numerical Recipes in C, p. 59:

Any M x N matrix A whose number of rows M is greater than or equal to its number of columns N, can be written as the product of an M x N column-orthogonal matrix U, an N x N diagonal matrix W with positive or zero elements (the singular values), and the transpose of an N x N orthogonal matrix V.

And it gives the equation as A = U W V^T.

I think I see how to fix it. I think the three matrices are in the same order in the two formulations and M corresponds to m and N corresponds to n, so all I have to do is fix it to say m-by-n for U and n-by-n for sigma, to match up with Press et al.'s formulation. --Coppertwig (talk) 03:35, 25 December 2007 (UTC)

they are essentially the same, with trivial difference in the proofs. i've reverted your change for consistency of article. Mct mht (talk) 04:08, 25 December 2007 (UTC)
You're right. I was wrong. Thanks for watching and fixing it. My edit had a rectangular matrix being described as "unitary" with a link to a page about square matrices, which made no sense. I suppose I vaguely remember now having previously seen such a thing as the diagonal of a rectangular matrix after all; anyway, that's consistent with the proof beneath it, as you say, and with the page diagonal matrix. I inserted a few words to emphasize/clarify that it's talking about the diagonal of a (potentially) non-square matrix. Does this proof work regardless of whether m > n or m < n? Press specifies M > N. (of course also possible m=n). --Coppertwig (talk) 13:21, 25 December 2007 (UTC)
yes, the proofs for M > N, M < N, and M = N are almost verbatim the same; the only difference is whether U is square, V is square, or both are square. for example, suppose Σ has more rows than columns. if you chop off the zero part to make Σ square, then U becomes rectangular (and column-orthonormal) but V remains square. this would be the formulation you brought up above. Mct mht (talk) 05:02, 29 December 2007 (UTC)
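For anyone comparing the two formulations numerically, here is a small NumPy check (not part of the original discussion; the full_matrices flag switches between the article's full form and the reduced form of Press et al.):

import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 3))                 # m = 5 rows, n = 3 columns

# Full form, as in the article: U is m-by-m, Sigma is a rectangular m-by-n "diagonal" matrix
U, s, Vt = np.linalg.svd(M, full_matrices=True)
Sigma = np.zeros(M.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
print(U.shape, Sigma.shape, Vt.shape)           # (5, 5) (5, 3) (3, 3)
print(np.allclose(U @ Sigma @ Vt, M))           # True

# Reduced form, as in Press et al.: U is m-by-n with orthonormal columns, W = diag(s) is n-by-n
U2, s2, Vt2 = np.linalg.svd(M, full_matrices=False)
print(U2.shape, np.diag(s2).shape, Vt2.shape)   # (5, 3) (3, 3) (3, 3)
print(np.allclose(U2 @ np.diag(s2) @ Vt2, M))   # True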

[edit] the spectral theorem

I have removed the following text from the teaser.

"The spectral theorem says that normal matrices, which by definition must be square, can be unitarily diagonalized using a basis of eigenvectors. The SVD can be seen as a generalization of the spectral theorem to arbitrary, not necessarily square, matrices."

It would be very difficult to give a mathematically correct interpretation of the above. The spectral theorem says that a normal matrix can be written as U D U^{-1}. The SVD says that an arbitrary matrix can be written as U D V, where V has absolutely nothing to do with U.

Phrenophobia (talk) 09:06, 10 February 2008 (UTC)

There was actually a mathematically correct version of it later in the article, as long as one is flexible about the meaning of "generalization". For hermitian positive semi-definite matrices the SVD and unitary diagonalization are the same, and in general the eigenvalue decomposition of the HPSD matrix MM^* is very closely related to the SVD of M. However, I think contrasting the decompositions as you do here is also important, so I included that as well ("V is unrelated to U"). JackSchmidt (talk) 16:23, 10 February 2008 (UTC)
Another way to phrase the idea mathematically and correctly: "The spectral theorem says that every normal endomorphism of a (complex with hermitian p.d. inner product) vector space is diagonal up to a rotation of the coordinate system. The SVD shows that every homomorphism between distinct (complex with hermitian p.d. inner product) vector spaces is diagonal up to rotations of their coordinate systems." This is nice because it says that once you have a nice enough geometry on the vector spaces, every nice enough operator is just diagonal, it just scales the coordinates. I believe this is sometimes phrased as every operator takes circles to ellipses. The reason the two ideas of eigenvalue decomposition and SVD are not really related as "generalization", but rather are more "analogous", is that there is a very large difference between an endomorphism of one vector space, and a homomorphism between two vector spaces ("U is unrelated to V"). I'm not sure if such a geometric version should be added, but perhaps the geometric meaning section could do with some improvement (maybe a picture). JackSchmidt (talk) 16:35, 10 February 2008 (UTC)
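A quick NumPy check of that statement (not from the discussion): for a symmetric positive semi-definite matrix the singular values coincide with the eigenvalues, while a symmetric matrix with a negative eigenvalue shows the difference AxelBoldt pointed out near the top of this page.

import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4))
A = B @ B.T                                    # symmetric positive semi-definite

w = np.linalg.eigvalsh(A)                      # eigenvalues, ascending
s = np.linalg.svd(A, compute_uv=False)         # singular values, descending

print(np.allclose(np.sort(w)[::-1], s))        # True: they coincide for an HPSD matrix

# For a symmetric matrix with a negative eigenvalue they do not:
C = np.diag([2.0, -1.0])
print(np.linalg.eigvalsh(C))                   # [-1.  2.]
print(np.linalg.svd(C, compute_uv=False))      # [2. 1.]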
it's not really correct to say U and V are unrelated. Mct mht (talk) 00:12, 11 February 2008 (UTC)
Right. However, I find it hard to find a better formulation. I went for "U and V are not related except through M". I also made some more changes to the text that Jack added.
I'm not so fond of the sentence "The eigenvalue decomposition only applies to square matrices, but gives an extremely tidy decomposition, while the SVD applies to any matrix, but gives a looser decomposition." What do you want to say here? In what sense is the SVD looser?
I agree with Jack I think the article can be improved by combining this with the geometric section, add a picture there, and perhaps even moving it higher up because it gives a nice explanation about what the SVD is and why we're interested in it. I quite like the description in Cleve Moler's "Professor SVD" (which says that this article is "quite good", so well done!). -- Jitse Niesen (talk) 12:02, 11 February 2008 (UTC)
I very much like "not related except through M". They are base changes for different vector spaces, and those vector spaces are only related through M. Thanks very much for the better phrasing.
The new changes introduced a subtle math error. The SVD is not simply a diagonalization, but a positive semi-definite diagonalization, and so the only matrices for which the eigenvalue decomposition and the singular value decomposition are the same are the hermitian positive semi-definite matrices. The new text brings in non-unitarily diagonalizable matrices, which strains the analogy between the two decompositions, but I could not decide if the strain was good for the article or no.
The troublesome sentence can be removed, *especially* if a geometric version is added. In case someone wants to salvage it: The idea is that the eigenvalue decomposition takes n^2 entries and returns n^2+n invariants, but the singular value decomposition of a square matrix takes n^2 entries and returns 2n^2+n invariants, so it has a much larger space to choose from, yet it describes nearly the same set of matrices. I think the assignment of these decompositions is not continuous in the matrix entries (though I could be wrong, I think there is an exercise somewhere to just give the "formula" for the SVD of a small general matrix; but I think [1,0;0,e] as e varies over a neighborhood of 0 shows discontinuity), so viewing it via dimensions is probably flawed. The informal sense in which it is looser, is simply that the SVD needs two base changes to do what the eigenvalue decomposition does with one. I'll read Prof SVD's stuff. JackSchmidt (talk) 17:52, 11 February 2008 (UTC)
Sorry for all the minor edits. The map from SVDs to matrices has a small kernel, the map from eigenvalue decomps to matrices has an even smaller kernel, the topological dimension of the space of all SVDs is much larger (even if one were to penalize it for the kernel) than the dimension of the eigenvalue decompositions, yet their images are both dense in the set of all matrices, so the eigenvalue decomposition does "more with less". I'll stress again the informal sense, since it is so much easier to explain: the SVD needs two base changes to do what the eigenvalue decomposition does with one. JackSchmidt (talk) 18:07, 11 February 2008 (UTC)
Thanks for catching my error. However, I don't agree with the rest of your post. Are you taking into account that the U and V in the SVD have to be unitary? That goes a long way towards resolving the difference. I don't feel like counting dimensions, but it does seem to be roughly similar. -- Jitse Niesen (talk) 18:49, 11 February 2008 (UTC)

(←) I think User:KYN has now fixed up this section nicely (thanks!), but at some point we should add the eigshow picture from the Prof SVD article (also from Trefethen and Bau and just plain old matlab). I am happy letting the "looser" argument go, but I think I did take into account the unitary part, as the article Unitary group says, "The unitary group U(n) is a real Lie group of dimension n^2." However, I may have made a real/complex dimensional error, which would also go a long way towards fixing the problem, so you may be right. The huge problem with the looser argument is that it is comparing apples to oranges (not just real/complex, but which matrices can be represented, which can be represented uniquely after replacing some things with quotient topological spaces, and the fundamental difference: SVD is for homomorphisms and eigendecomposition is for endomorphisms). To be clear, this is why I removed the sentence from the article way back, and only presented the argument to be salvaged. I don't actually hold that it is true, merely that I suspect someone could make a true and interesting statement from it (after finding the right conditions on the two decompositions to make them comparable). JackSchmidt (talk) 22:16, 11 February 2008 (UTC)

Yup real/complex error. I think they have roughly the same dimension. Thanks for catching it, otherwise I'd probably have continued in that mistaken impression for years! If you do happen to salvage the argument somehow, let me know (even on talk page if it is not worth inclusion in an article). JackSchmidt (talk) 22:23, 11 February 2008 (UTC)

[edit] SVD on Scroll Matrix

Is this known to be trivial or should it be cited in this Article?

(* complete dictionary of words, one per row, number of rows is (alpha^word), using words of length "word" and an alphabet of "alpha" number of characters *)

alpha=4;word=3;dict=.;

dict=Partition[Flatten[Tuples[Reverse[IdentityMatrix[alpha]],word]],(alpha*word)]

PseudoInverse[dict]==((Transpose[dict])*((alpha)^-(word -1)))-((word - 1)/(alpha^word)/word)

True

There is nothing special for alpha=4;word=3 - The method works for all cases where word < alpha, but that is a more verbose loop, and the Mathematica PseudoInverse function takes too long to calculate for values of alpha and word that are large, whereas the transpose side of the equation is fast.

100TWdoug (talk) 05:54, 25 March 2008 (UTC)

If there are no objections, I will cite this to http://www.youvan.com as verbose Mathematica code, unless another editor wants to use formal notation. This particular example of alpha = 4; word = 3 happens to cover the biological genetic code, where the alpha(bet) is the 4 nucleotides (A, C, G, T) and the word(length) is the nucleotide triplet (e.g., 3) for the codon.50MWdoug (talk) 00:36, 1 April 2008 (UTC)

I'm not sure I understand it correctly, but it seems you have a formula for the pseudoinverse of a particular matrix (the variable dict in your code). That has little to do with the SVD.
Even if you have a formula for the SVD of your matrix dict, I doubt it belongs in the article. The SVD of many matrices can be calculated explicitly, but I think only important ones should be mentioned. I don't know what the matrix dict looks like (I can't read Mathematica code very well), or how widely used it is. Finally, we have a no-original-research policy. Personal websites are generally not deemed reliable sources, and www.youvan.com does seem like one.
Sorry about this, but Wikipedia is meant to be an encyclopaedia. It's not a collection of all possible facts. -- Jitse Niesen (talk) 11:03, 1 April 2008 (UTC)

"PseudoInverse[dict]" is a Mathematica function that runs SVD on over-determined or under-determined matrices. (Wolfram could just as well labeled this function "SVD"). So, your comment: "The SVD of many matrices can be calculated explicitly" is very interesting if these matrices are nontrivial. How might we determine whether the right side of the equation, involving Tuples, is either trivial are already known? Thank you for looking at this. 50MWdoug (talk) 01:01, 2 April 2008 (UTC)

[edit] Linear Algebraic Proof?

The section "Calculating the SVD" refers to, and provides a dead link to, a "linear algebraic proof." Was this removed? If so, that's unfortunate, because it would be nice to provide a proof that doesn't rely on heavy machinery such as the spectral theorem or Lagrange multipliers. The book by Trefethen and Bau, mentioned in the same section, provides a straightforward inductive linear algebraic proof that requires only a simple metric space compactness argument (not explicitly given in in the book). Jbunniii (talk) 03:51, 5 April 2008 (UTC)

gotta be kidding me. what you call "heavy machinery" is not at all. whatever "simple metric space compactness argument" you are referring to probably resembles the variational proof given in the article: M*M is self-adjoint, so characterize its eigenvalues variationally, using compactness, and you're done.
as for the deadlink, i think someone who didn't know what a proof is changed the section name from "linear algebraic proof".
problem with this particular article is that some people edit/comment without knowing what they're talking about. as a result, we have an article that's not as organized as it can be and repeats the obvious unnecessarily. Mct mht (talk) 09:37, 5 April 2008 (UTC)
Trefethen and Bau treat the SVD (way) before they treat eigenvalues, so in that context the spectral theorem is heavy machinery. Their proof is indeed variational (they start by defining the first singular vector as the one that maximizes ||Mv|| over the unit sphere) but it also avoids Lagrange multipliers. I think it's quite a nice proof, but I don't want to add a third proof which is similar to the proofs that are already in the article. -- Jitse Niesen (talk) 15:27, 7 April 2008 (UTC)

[edit] References to isometries in SVD w/ spectral theorm section

Reading this spectral derivation of the SVD I'm guessing that it would be fairly readable to somebody with first year linear algebra. The exception is the injection of the use of the term isometry:

"We see that this is almost the desired result, except that U1 and V1 are not unitary in general. U1 is a partial isometry (U1U*1 = = I ) while V1  is an isometry (V*1V1 = I ). To finish the argument, one simply has to "fill out" these matrices to obtain unitaries."

Reading the Wikipedia page on partial isometry doesn't clarify things too much. In that article there are references to:

Hilbert spaces, functional analysis, linear map, final and initial subspaces, isometric map.

It's all very abstract, much more so than ought to be required to read this section (I was looking for an explanation of the SVD for nothing more than real matrices, so having to understand all of this stuff first would require a major digression).

I'm still figuring all this stuff out for myself, so I'm not sure how one would rewrite this in a clear fashion (intuition says this would involve only references to projections and not isometries). Perhaps somebody in the know already understands how this could be made clearer, and could do so.

Peeter.joot (talk) 15:42, 13 April 2008 (UTC)

The article just means that U1 and V1 are not square matrices, so are not technically unitary. However, they can be obviously or arbitrarily extended to unitary matrices by adding on some more orthonormal columns. Basically you can ignore the sentence using the word isometry. JackSchmidt (talk) 15:52, 13 April 2008 (UTC)