Steffensen's method

In numerical analysis, Steffensen's method is a root-finding method, similar to Newton's method, named after Johan Frederik Steffensen. Steffensen's method also achieves quadratic convergence, but without using derivatives as Newton's method does.

Simple description

The simplest form of the formula for Steffensen's method occurs when it is used to find the zeros, or roots, of a function f ; that is, to find the value x_\star that satisfies f(x_\star)=0 . Near the solution x_\star , the function f is supposed to approximately satisfy -1 < f'(x_\star) < 0 ; this condition makes the value f(x) adequate as a correction to x for finding the solution, although f is not required to be an efficient correction by itself. For some functions, Steffensen's method can work even if this condition is not met, but in such a case the starting value x_0 must be very close to the actual solution x_\star , and convergence to the solution may be slow.

Given an adequate starting value x_0 , a sequence of values x_0,\ x_1,\ x_2,\dots,\ x_n,\dots can be generated using the formula below. When it works, each value in the sequence is much closer to the solution x_\star than the prior value. The value x_n from the current step generates the value x_{n+1} for the next step, via this formula:[1]

x_{n+1} = x_n - \frac{f(x_n)}{g(x_n)}

for n = 0, 1, 2, 3, ... , where the slope function g(x_n) is a composite of the original function f given by the following formula:

g(x_n) = \frac{f(x_n + f(x_n)) - f(x_n)}{f(x_n)}.

The function g is the average slope of the function f between the last sequence point (x,y) = ( x_n,\ f(x_n) ) and the auxiliary point (x,y)=( x_n + h,\ f(x_n + h) ) , with the step h=f(x_n) . It is also called the first-order divided difference of f between those two points.

It is only for the purpose of finding h for this auxiliary point that the value of the function f must be an adequate correction to get closer to its own solution, and for that reason fulfill the requirement that -1 < f'(x_\star) < 0 . For all other parts of the calculation, Steffensen's method only requires the function f to be continuous and to actually have a nearby solution. Several modest modifications of the step h in the slope calculation g exist to accommodate functions f that do not quite meet the requirement.
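The iteration described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function name steffensen_root, the stopping rule on |f(x_n)|, the default tolerance and the test function f(x) = (cos x - x)/2 are all assumptions chosen for the example. The test function was picked because its derivative at the root is roughly -0.84, inside the interval (-1, 0) mentioned above.

```python
import math


def steffensen_root(f, x0, tol=1e-12, max_iter=50):
    """Approximate a root of f using x_{n+1} = x_n - f(x_n) / g(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            # Residual is small enough: accept x as the root (assumed stopping rule).
            return x
        # Divided difference g(x) = (f(x + h) - f(x)) / h with step h = f(x).
        g = (f(x + fx) - fx) / fx
        if g == 0.0:
            raise ZeroDivisionError("slope estimate g(x_n) vanished")
        x = x - fx / g
    raise RuntimeError("no convergence within max_iter steps")


def example_f(x):
    # Hypothetical test function; its root is the solution of cos(x) = x.
    return 0.5 * (math.cos(x) - x)


root = steffensen_root(example_f, 1.0)
print(root, example_f(root))
```

Only the function f itself is passed in; no derivative is needed, at the cost of two evaluations of f per step.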

Advantages and drawbacks

The main advantage of Steffensen's method is that it has quadratic convergence like Newton's method; that is, both methods find roots of a function f just as ‘quickly’. Here, ‘quickly’ means that for both methods the number of correct digits in the answer roughly doubles with each step. But the formula for Newton's method requires a separate function for the derivative; Steffensen's method does not. So Steffensen's method can be programmed for a generic function, as long as that function meets the constraints mentioned above.

The price for the quick convergence is the double function evaluation: both f(x_n) and f(x_n + h) must be calculated, which might be time-consuming if f is a complicated function. For comparison, the secant method needs only one function evaluation per step, so with two function evaluations the secant method can do two steps, and two steps of the secant method increase the number of correct digits by a factor of about 2.6 . The equally time-consuming single step of Steffensen's (or Newton's) method increases the number of correct digits by a factor of 2 , which is slightly less.
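The factor of about 2.6 comes from the order of convergence of the secant method, which is the golden ratio. This is a standard result rather than something derived in this article, so the short calculation below is given only as a worked check:

```latex
\varphi = \frac{1+\sqrt{5}}{2} \approx 1.618,
\qquad
\varphi^{2} = \varphi + 1 \approx 2.618 > 2 .
```

Two secant steps therefore multiply the number of correct digits by about 2.618, against a factor of 2 for one Steffensen step that costs the same two function evaluations.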

Similar to Newton's method and most other quadratically convergent algorithms, the crucial weakness in Steffensen's method is the choice of the starting value x_0 . If the value of x_0 is not close ‘enough’ to the actual solution x_\star , the method may fail and the sequence of values x_0, x_1, x_2, x_3,\dots may either oscillate between two extremes or diverge to infinity (possibly both!).

Derivation using Aitken's delta-squared process

The version of Steffensen's method implemented in the MATLAB code shown below can be derived using Aitken's delta-squared process for accelerating the convergence of a sequence. To compare the following formulae with those in the section above, take the fixed-point iteration p_{n+1} = p_n + f(p_n) , so that p_n = x_n , p_{n+1} = x_n + f(x_n) and p_{n+2} = p_{n+1} + f(p_{n+1}) . The process starts from a linearly convergent sequence and increases its rate of convergence. If the signs of p_n - p,\ p_{n+1} - p,\ p_{n+2} - p agree and p_n is sufficiently close to the desired limit of the sequence p , we can assume the following:

\frac{p_{n+1}-p}{p_n-p}\approx\frac{p_{n+2}-p}{p_{n+1}-p}

then

(p_{n+1}-p)^2\approx(p_{n+2}-p)(p_n-p)

so

p_{n+1}^2-2p_{n+1}p+p^2\approx p_{n+2}p_n-(p_n+p_{n+2})p+p^2

and hence

(p_{n+2}-2p_{n+1}+p_n)p\approx p_{n+2}p_n-p_{n+1}^2 .


Solving for the desired limit of the sequence p gives:

p\approx \frac{p_{n+2}p_n-p_{n+1}^2}{p_{n+2}-2p_{n+1}+p_n}


=\frac{p_{n}^2+p_{n}p_{n+2}+2p_{n}p_{n+1}-2p_{n}p_{n+1}-p_{n}^2-p_{n+1}^2}{p_{n+2}-2p_{n+1}+p_n}


=\frac{(p_{n}^2+p_{n}p_{n+2}-2p_{n}p_{n+1})-(p_{n}^2-2p_{n}p_{n+1}+p_{n+1}^2)}{p_{n+2}-2p_{n+1}+p_n}


=p_n-\frac{(p_{n+1}-p_n)^2}{p_{n+2}-2p_{n+1}+p_n},


which results in the more rapidly convergent sequence:

p\approx p_{n+3}=p_n-\frac{(p_{n+1}-p_n)^2}{p_{n+2}-2p_{n+1}+p_n}.
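The effect of the final formula can be checked numerically on a linearly convergent sequence. The sketch below is an illustration under stated assumptions: the helper name aitken, the example sequence p_{n+1} = cos(p_n) and the reference value used for its limit are chosen for the example and do not come from the derivation itself.

```python
import math


def aitken(p0, p1, p2):
    """Aitken's delta-squared extrapolation from three consecutive terms."""
    denom = p2 - 2.0 * p1 + p0
    if denom == 0.0:
        # Second difference vanished; fall back to the latest term.
        return p2
    return p0 - (p1 - p0) ** 2 / denom


# Assumed example: the linearly convergent sequence p_{n+1} = cos(p_n).
limit = 0.7390851332151607  # reference value of the solution of cos(p) = p
p = [1.0]
for _ in range(6):
    p.append(math.cos(p[-1]))

# Extrapolate from each window of three consecutive terms.
accelerated = [aitken(p[n], p[n + 1], p[n + 2]) for n in range(len(p) - 2)]

for n, q in enumerate(accelerated):
    # Columns: n, error of the plain term p_{n+2}, error of the extrapolated value.
    print(n, abs(p[n + 2] - limit), abs(q - limit))
```

Each extrapolated value uses the same three terms as the plain iterate it is compared with, so the second error column shows what the acceleration gains without any extra function evaluations.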

Implementation in MATLAB

Here is the source for an implementation of Steffensen's method in MATLAB.

function p = Steffensen(f, p0, tol)
% This function takes as inputs: a fixed point iteration function, f,
% an initial guess to the fixed point, p0, and a tolerance, tol.
% The fixed point iteration function is assumed to be input as a
% function handle, for example f = @(x) cos(x).
% This function will calculate and return the fixed point, p,
% that makes the expression f(p) = p true to within the desired
% tolerance, tol.

maxIter = 1000;      % do a large, but finite, number of iterations.
                     % This is so that if the method fails to converge,
                     % we won't be stuck in an infinite loop.
converged = false;   % records whether the tolerance was actually met.

for i = 1:maxIter
    p1 = f(p0);      % calculate the next two guesses for the fixed point.
    p2 = f(p1);
    denom = p2 - 2*p1 + p0;
    if denom == 0    % the second difference vanished, so Aitken's formula
                     % cannot be applied; accept p2 only if the plain
                     % iterates have stopped changing.
        p = p2;
        converged = abs(p2 - p1) < tol;
        break
    end
    p = p0 - (p1 - p0)^2/denom;  % use Aitken's delta squared method to
                                 % find a better approximation than p0.
    if abs(p - p0) < tol  % test to see if we are within tolerance.
        converged = true;
        break             % if we are, stop the iterations, we have our answer.
    end
    p0 = p;               % update p0 for the next iteration.
end
if ~converged             % If we fail to meet the tolerance, we output a
                          % message of failure.
    warning('Steffensen:noConvergence', ...
            'failed to converge within %d iterations.', maxIter);
end
end

Generalization

Steffensen's method can also be used to find an input x = x_\star for a different kind of function F that produces output the same as its input: x_\star = F(x_\star) for the special value x_\star . Solutions like x_\star are called fixed points. Many such functions can be used to find their own solutions by repeatedly recycling the result back as input, but the rate of convergence can be slow, or the function can fail to converge at all, depending on the individual function. Steffensen's method accelerates this convergence, to make it quadratic.
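As a minimal sketch of this fixed-point form, the Python function below applies the Aitken update to x, F(x) and F(F(x)) and restarts from the accelerated value. The name steffensen_fixed_point, the stopping rule, the default tolerance and the example F(x) = cos x are assumptions made for illustration only.

```python
import math


def steffensen_fixed_point(F, x0, tol=1e-12, max_iter=50):
    """Approximate a fixed point x = F(x) by Steffensen's accelerated iteration."""
    x = x0
    for _ in range(max_iter):
        x1 = F(x)    # one plain fixed-point step
        x2 = F(x1)   # a second plain step
        denom = x2 - 2.0 * x1 + x
        if denom == 0.0:
            # Second difference vanished: accept only if the iterates stopped moving.
            if abs(x1 - x) < tol:
                return x1
            raise ZeroDivisionError("second difference vanished before convergence")
        x_new = x - (x1 - x) ** 2 / denom   # Aitken update, then restart from it
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter steps")


# Assumed example: F(x) = cos(x), whose plain iteration converges only linearly.
fixed_point = steffensen_fixed_point(math.cos, 1.0)
print(fixed_point, math.cos(fixed_point) - fixed_point)
```

Plain iteration of cos reduces the error by a roughly constant factor per step, whereas the accelerated version reaches the same tolerance in a handful of steps, each costing two evaluations of F.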

This method for finding fixed points of a real-valued function has been generalised for functions F : X \to X on a Banach space X . The generalised method assumes that a family of bounded linear operators \{L(u,v): u, v \in X\} associated with u and v can be found to satisfy the condition[2]

F(u)- F(v)=L(u,v)\ (u-v).

In the simple form given in the section above, the function f simply takes in and produces real numbers. There, the function g is a divided difference. In the generalized form here, the operator L is the analogue of a divided difference for use in the Banach space. In finite dimensions, the operator L is roughly equivalent to a matrix whose entries are all functions of the vector arguments u and v .

Steffensen's method is then very similar to Newton's method, except that it uses the divided difference L(F(x),x) instead of the derivative F'(x) . It is thus defined by

x_{n+1} = x_n + [I - L(F(x_n), x_n)]^{-1}(F(x_n) - x_n),\

for n = 0,\ 1,\ 2,\ 3,\ \dots , where I is the identity operator.
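As a consistency check, the operator formula reduces to the Aitken form of the earlier section in the scalar case. The worked reduction below assumes X is the real line and F(x_n) \ne x_n , so that L(F(x_n), x_n) is an ordinary divided difference:

```latex
L(F(x_n), x_n) = \frac{F(F(x_n)) - F(x_n)}{F(x_n) - x_n},
\qquad
1 - L(F(x_n), x_n) = -\,\frac{F(F(x_n)) - 2F(x_n) + x_n}{F(x_n) - x_n},

x_{n+1} = x_n + \frac{F(x_n) - x_n}{1 - L(F(x_n), x_n)}
        = x_n - \frac{\bigl(F(x_n) - x_n\bigr)^2}{F(F(x_n)) - 2F(x_n) + x_n}.
```

With p_n = x_n , p_{n+1} = F(x_n) and p_{n+2} = F(F(x_n)) , this is exactly the delta-squared update.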

If the operator L\ satisfies

\|L(u,v) - L(x,y)\| \le K \big( \|u-x\| + \|v-y\| \big)

for some constant K , then the method converges quadratically to a fixed point of F if the initial approximation x_0 is "sufficiently close" to the desired solution x_\star , which satisfies x_\star = F(x_\star) .

References

  1. Germund Dahlquist, Åke Björck, tr. Ned Anderson. (1974). Numerical Methods, pp. 230–231. Englewood Cliffs, NJ: Prentice Hall.
  2. Johnson, L. W. & Scholz, D. R. (1968). On Steffensen's Method. SIAM Journal on Numerical Analysis, 5 (2), 296–302, (June 1968).