Talk:Generalized minimal residual method
Why Krylov subspace?
It isn't clear to me why the Krylov subspace makes a good basis for the solution. —Ben FrantzDale 20:43, 19 January 2007 (UTC)
- Are you asking why the Krylov subspace is a good subspace, or why a particular basis for the Krylov subspace is good? -- Jitse Niesen (talk) 06:30, 22 January 2007 (UTC)
- I'm asking why the Krylov subspace is good. When I first heard about GMRES, I immediately thought of the power iteration algorithm for finding eigenvalues. That makes me think the Krylov subspaces will contain the directions that “matter most” for approximating the range of the operator, and since it will approximate eigenvectors it would also contain directions that “matter most” for the domain of the operator. Is that intuition reasonable? On the other hand, the fact that power iterations converge on the principal eigenvector makes it seem like the Krylov subspaces will basically ignore the low-eigenvalue directions. (Perhaps that's a good thing.) —Ben FrantzDale 13:17, 22 January 2007 (UTC)
- Actually, I don't know that much about iterative methods, but I'll give your interesting question a try. Firstly, I think the Krylov subspace is a fairly natural space to consider; it's easy to construct and uses the problem data. The second argument is empirical: the method works. I think this is not a satisfactory answer, so if you ever find something better please let me know.
- You're right that there is a connection with power iteration. Krylov methods indeed concentrate on the extremal eigenvalues (not just the largest eigenvalue, as in power iteration, but also the smallest ones). Compare with the formula for the speed of convergence when the matrix is positive definite: it depends on the smallest and largest eigenvalues. -- Jitse Niesen (talk) 01:41, 23 January 2007 (UTC)
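For reference, the convergence bound alluded to above (stated in the article for matrices whose symmetric part is positive definite) is

\|r_k\|_2 \;\le\; \left(1 - \frac{\lambda_{\min}^2\!\left(\tfrac{1}{2}(A^T + A)\right)}{\lambda_{\max}\!\left(A^T A\right)}\right)^{k/2} \|r_0\|_2,

so the extreme eigenvalues are exactly what enters the estimate.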
- Thanks for the comment. I think I found an answer: If I wanted to construct an approximate solution, x, to Ax = b, I would want to do it with the eigenvectors of A, starting with the eigenvector with the largest eigenvalue, and going on in descending order. It appears that GMRES approximates this in that the Krylov subspace will quickly head off towards those principal eigenvectors and will eventually span the entire space. According to Arnoldi iteration:
- The resulting vectors are a basis of the Krylov subspace, \mathcal{K}_n. We may expect the vectors of this basis to give good approximations of the eigenvectors corresponding to the n largest eigenvalues, for the same reason that A^{n−1}b approximates the dominant eigenvector.
- So that basically answers the question.
- Two things stood out for me as points of confusion. First, the idea of constructing an input to A out of the output of A seemed unnatural. (I'm thinking in terms of mechanics, so the output might be force when the input is position, making it extra counterintuitive.) Second, it seems like the solution will depend on the initial guess. If it is orthogonal to some of the eigenvectors, won't those eigenvectors never be seen in the result? (That is highly unlikely, I guess, but it could happen in theory, right? I suppose it's more likely when the matrix is ill-conditioned.) —Ben FrantzDale 03:45, 23 January 2007 (UTC)
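If it helps to see that intuition numerically, here is a small NumPy sketch (matrix, spectrum and starting vector are all arbitrary choices, not anything from the article): it builds an orthonormal basis of a 20-dimensional Krylov subspace by Arnoldi/Gram-Schmidt and measures how much of each eigenvector of A lies in it. One typically sees the extremal eigendirections captured far better than interior ones, which matches the power-iteration picture.

import numpy as np

# Build a symmetric test matrix whose extreme eigenvalues (0.1 and 100)
# are separated from the bulk of the spectrum (1..10).
rng = np.random.default_rng(0)
n = 200
eigenvalues = np.concatenate(([0.1], np.linspace(1.0, 10.0, n - 2), [100.0]))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigenvalues) @ Q.T
b = rng.standard_normal(n)

# Orthonormal basis of K_m = span{b, Ab, ..., A^(m-1) b} via Arnoldi (modified Gram-Schmidt).
m = 20
V = np.zeros((n, m))
V[:, 0] = b / np.linalg.norm(b)
for j in range(m - 1):
    w = A @ V[:, j]
    for i in range(j + 1):
        w -= (w @ V[:, i]) * V[:, i]
    V[:, j + 1] = w / np.linalg.norm(w)

# How much of each eigenvector lies inside K_m? (norm of its projection, at most 1)
vals, vecs = np.linalg.eigh(A)
for idx in (0, n // 2, n - 1):            # smallest, interior, largest eigenvalue
    frac = np.linalg.norm(V.T @ vecs[:, idx])
    print(f"eigenvalue {vals[idx]:6.1f}: fraction of eigenvector inside K_20 = {frac:.3f}")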
- Suppose A is non-singular. Take the characteristic polynomial c(x) = x^n + c_{n−1}x^{n−1} + … + c_1 x + c_0. By Cayley–Hamilton, c(A) = 0, and c_0 = c(0) = ±det(A) ≠ 0. So it is
-
- A^n + c_{n−1}A^{n−1} + … + c_1 A + c_0 I = 0.
-
- From this, a formula for the inverse matrix, and thus a formula for the solution, follows:
-
- A^{−1} = −(1/c_0)(A^{n−1} + c_{n−1}A^{n−2} + … + c_1 I),  so  x = A^{−1}b = −(1/c_0)(A^{n−1}b + c_{n−1}A^{n−2}b + … + c_1 b).
-
- That is, if there is a solution x to Ax = b, then it is a linear combination of b, Ab, A²b, …, A^{n−1}b. This proves the theoretical correctness of the algorithm and also shows why this particular Krylov space is used.
- For the input–output question one has to assume that some rescaling by an invertible matrix took place so that all entries are of the same unit or, after cancelling it out, are unitless.--LutzL 12:49, 21 September 2007 (UTC)
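To see the Cayley–Hamilton argument numerically, here is a small NumPy sketch (matrix, size and seed are arbitrary choices for illustration): it assembles the solution of Ax = b directly from the Krylov vectors b, Ab, …, A^{n−1}b and the characteristic-polynomial coefficients, and checks it against a direct solve.

import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned, non-singular
b = rng.standard_normal(n)

# Coefficients of c(x) = x^n + c_{n-1} x^{n-1} + ... + c_1 x + c_0; np.poly returns [1, c_{n-1}, ..., c_0].
coeffs = np.poly(A)
c0 = coeffs[-1]                                   # c_0 = ±det(A), non-zero since A is non-singular

# Krylov vectors b, Ab, ..., A^(n-1) b
krylov = [b]
for _ in range(n - 1):
    krylov.append(A @ krylov[-1])

# x = -(1/c_0) * (A^(n-1) b + c_{n-1} A^(n-2) b + ... + c_1 b), as in the derivation above
x = -(krylov[n - 1] + sum(coeffs[i] * krylov[n - 1 - i] for i in range(1, n))) / c0
print(np.allclose(x, np.linalg.solve(A, b)))      # True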
Pseudocode
I'm confused by the pseudocode. First, c and s are never initialized. Second, there seems to be no convergence criterion. So, is this to find the exact x after all n steps? If so, what good is it? JefeDeJefes 20:49, 17 January 2007 (UTC)
- I don't understand your first point. All entries in c and s are initialized before they are used (in the first iteration of the j-loop, the i-loop is skipped because the sequence 1, … j-1 is empty). Your second point is spot on: the algorithm finds the exact x after n steps, and no, this is not very useful. The algorithm needs to be amended and cleaned up. -- Jitse Niesen (talk) 14:50, 30 March 2007 (UTC)
Pseudocode bug?
In the implementation description, the Givens rotation contains the submatrix ((c,s),(−s,c)); in the pseudocode, the Givens rotation contains the submatrix ((c,s),(s,−c)). Which version is correct? 157.161.41.179 21:36, 23 June 2007 (UTC)
- ((c,s),(s,−c)) is not a rotation, so the version in the pseudocode is probably wrong. Together with the point raised in the previous section, this is enough for me to have my doubts about it (I copied it from the German article at de:GMRES-Verfahren without really checking it). So I removed it from the article and copied it below, until somebody can have a good look at it. -- Jitse Niesen (talk) 10:11, 24 June 2007 (UTC)
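A quick numerical check of that point (a small NumPy sketch; the angle is arbitrary): with c² + s² = 1, the first matrix is an orthogonal matrix with determinant +1 (a rotation), while the second is orthogonal with determinant −1 (a reflection).

import numpy as np

theta = 0.3
c, s = np.cos(theta), np.sin(theta)

G1 = np.array([[c, s], [-s, c]])    # ((c,s),(-s,c)): rotation
G2 = np.array([[c, s], [s, -c]])    # ((c,s),(s,-c)): reflection

for G in (G1, G2):
    print(np.allclose(G.T @ G, np.eye(2)), np.linalg.det(G))
# True  1.0
# True -1.0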
- Moved from the article:
=== Pseudocode ===
Inputs: A, b, x_0, tolerance ε ≥ 0
r_0 ← b − A x_0
if r_0 = 0 then
- return x_0
end if
v_1 ← r_0 / ‖r_0‖_2
γ_1 ← ‖r_0‖_2
for j ← 1, …, n do
- q ← A v_j
- for i ← 1, …, j do                                  (orthogonalize A v_j against v_1, …, v_j)
- - h_{i,j} ← ⟨q, v_i⟩
- - q ← q − h_{i,j} v_i
- end for
- h_{j+1,j} ← ‖q‖_2
- for i ← 1, …, j − 1 do                              (apply the previous Givens rotations to column j)
- - (h_{i,j}, h_{i+1,j}) ← (c_i h_{i,j} + s_i h_{i+1,j}, −s_i h_{i,j} + c_i h_{i+1,j})
- end for
- β ← sqrt(h_{j,j}² + h_{j+1,j}²)                     (new Givens rotation eliminating h_{j+1,j})
- c_j ← h_{j,j} / β,  s_j ← h_{j+1,j} / β,  h_{j,j} ← β
- γ_{j+1} ← −s_j γ_j,  γ_j ← c_j γ_j
- if |γ_{j+1}| ≤ ε or j = n then                      (|γ_{j+1}| is the current residual norm)
- - for i ← j, …, 1 do                                (back substitution)
- - - y_i ← (γ_i − Σ_{k=i+1..j} h_{i,k} y_k) / h_{i,i}
- - end for
- - return x ← x_0 + Σ_{i=1..j} y_i v_i
- else
- - v_{j+1} ← q / h_{j+1,j}
- end if
end for
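For what it's worth, here is a rough NumPy translation of the pseudocode above, written as a sketch for checking purposes rather than as the implementation from either article. It uses the ((c,s),(−s,c)) form of the Givens rotation and an explicit residual tolerance, so it also addresses the two complaints raised above; the test matrix at the bottom is an arbitrary well-conditioned example.

import numpy as np

def gmres(A, b, x0=None, tol=1e-10):
    """Full (unrestarted) GMRES with Givens rotations, following the pseudocode above."""
    n = len(b)
    x0 = np.zeros(n) if x0 is None else x0
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    if beta == 0:
        return x0
    V = np.zeros((n, n + 1))          # Arnoldi basis vectors (columns)
    H = np.zeros((n + 1, n))          # Hessenberg matrix, updated in place by the rotations
    c = np.zeros(n)                   # Givens cosines
    s = np.zeros(n)                   # Givens sines
    gamma = np.zeros(n + 1)           # rotated right-hand side; |gamma[j+1]| is the residual norm
    V[:, 0] = r0 / beta
    gamma[0] = beta

    for j in range(n):
        # Arnoldi: orthogonalize A v_j against v_1, ..., v_j
        q = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = q @ V[:, i]
            q -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(q)

        # Apply the previous Givens rotations to the new column of H
        for i in range(j):
            H[i, j], H[i + 1, j] = (c[i] * H[i, j] + s[i] * H[i + 1, j],
                                    -s[i] * H[i, j] + c[i] * H[i + 1, j])

        # New Givens rotation eliminating H[j+1, j]
        beta = np.hypot(H[j, j], H[j + 1, j])
        c[j], s[j] = H[j, j] / beta, H[j + 1, j] / beta
        H[j, j] = beta
        gamma[j + 1] = -s[j] * gamma[j]
        gamma[j] = c[j] * gamma[j]

        if abs(gamma[j + 1]) <= tol or j == n - 1:
            # Back substitution for y, then assemble x = x0 + V y
            y = np.zeros(j + 1)
            for i in range(j, -1, -1):
                y[i] = (gamma[i] - H[i, i + 1:j + 1] @ y[i + 1:]) / H[i, i]
            return x0 + V[:, :j + 1] @ y
        V[:, j + 1] = q / H[j + 1, j]

# Example use with an arbitrary well-conditioned matrix
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8)) + 8 * np.eye(8)
b = rng.standard_normal(8)
x = gmres(A, b)
print(np.allclose(A @ x, b))   # True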