3.6 Section 3.2: Newton root finding

  3.6.1 Definition of convex function
  3.6.2 Newton method to solve set of equations

Understanding Newton's method

Start from the equation of a line. Recall that we normally write it as

\(y=c+mx\) where \(c\) is the \(y\)-axis intercept and \(m\) is the slope. But this equation implicitly assumes that the slope and the intercept are both taken at \(x_{0}=0\). Written relative to a general point \(x_{0}\), the above becomes \begin{equation} y=f(x)=f(x_{0})+f^{\prime}(x_{0})(x-x_{0}) \tag{1} \end{equation}

The above equation is the same as \(y=c+mx\) when \(x_{0}=0\)

So from (1)

\begin{align*} f(x) &= f(x_{0})+f^{\prime}(x_{0})(x-x_{0})\\ \frac{f(x)-f(x_{0})}{f^{\prime}(x_{0})} &= x-x_{0} \end{align*}

Now instead of writing \(x\) and \(x_{0}\) we write \(x_{n+1}\) and \(x_{n}\), so the above becomes

\begin {align*} \frac {f\left (x_{n+1}\right ) -f\left (x_{n}\right ) }{f^{\prime }\left ( x_{n}\right ) } & =\left (x_{n+1}-x_{n}\right ) \\ x_{n+1} & =\frac {f\left (x_{n+1}\right ) -f\left (x_{n}\right ) }{f^{\prime }\left (x_{n}\right ) }+x_{n} \end {align*}

That is it. The whole idea is to choose \(x_{n+1}\) so that \(f\left (x_{n+1}\right ) =0\). Replacing \(f\left (x_{n+1}\right ) \) by zero in the above, we obtain

\[ x_{n+1}=x_{n}-\frac {f\left (x_{n}\right ) }{f^{\prime }\left (x_{n}\right ) }\]

which is Newton's method.
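To make the iteration concrete, here is a minimal Python sketch of the scalar method; the names `f`, `fprime`, the tolerance, and the test function are illustrative choices, not from the notes:

    def newton(f, fprime, x0, tol=1e-12, max_iter=50):
        """Scalar Newton iteration: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
        x = x0
        for _ in range(max_iter):
            fx = f(x)
            if abs(fx) < tol:           # stop once f(x) is essentially zero
                return x
            x = x - fx / fprime(x)      # the Newton update derived above
        return x

    # example: root of f(x) = x^3 - 2x - 5 near x0 = 2
    root = newton(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, 2.0)
    print(root)                         # about 2.0945514815423265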

Theorem: Let \(f\) be \(C^{2}\) and let \(r\) be a simple zero of \(f\). Then there is a neighborhood of \(r\) and a constant \(C\) such that if Newton's method is started in that neighborhood, it converges to \(r\) with \(\left \vert x_{n+1}-r\right \vert \leq C\left \vert x_{n}-r\right \vert ^{2}\)

Proof of quadratic convergence order

Figure 3.12: quadratic convergence

From the diagram we see that

\begin {equation} e_{n+1}=r-x_{n+1} \tag {1} \end {equation}

But from the definition of Newton's method

\begin {equation} x_{n+1}=x_{n}-\frac {f\left (x_{n}\right ) }{f^{\prime }\left (x_{n}\right ) } \tag {2} \end {equation}

Substituting (2) into (1) gives

\begin {align} e_{n+1} & =r-\left (x_{n}-\frac {f\left (x_{n}\right ) }{f^{\prime }\left ( x_{n}\right ) }\right ) \nonumber \\ & =\overset {e_{n}}{\overbrace {r-x_{n}}}+\frac {f\left (x_{n}\right ) }{f^{\prime }\left (x_{n}\right ) }\nonumber \\ & =e_{n}+\frac {f\left (x_{n}\right ) }{f^{\prime }\left (x_{n}\right ) }\nonumber \\ & =\frac {e_{n}f^{\prime }\left (x_{n}\right ) +f\left (x_{n}\right ) }{f^{\prime }\left (x_{n}\right ) } \tag {3} \end {align}

Now evaluate \(f(x)\) at \(r\) by expanding it in a Taylor series around \(x_{n}\)

\[ f(r) =f\left (x_{n}\right ) +hf^{\prime }\left (x_{n}\right ) +\frac {h^{2}f^{\prime \prime }\left (\xi \right ) }{2!}\]

But \(h\) is the distance from \(x_{n}\) to \(r\), i.e. \(h=r-x_{n}=e_{n}\), hence the above becomes

\[ f(r) =f\left (x_{n}\right ) +e_{n}f^{\prime }\left ( x_{n}\right ) +\frac {e_{n}^{2}f^{\prime \prime }\left (\xi \right ) }{2!}\]

But \(f(r) =0\) since \(r\) is the root we are seeking. Hence the above becomes

\begin {align} 0 & =f\left (x_{n}\right ) +e_{n}f^{\prime }\left (x_{n}\right ) +\frac {e_{n}^{2}f^{\prime \prime }\left (\xi \right ) }{2!}\nonumber \\ f\left (x_{n}\right ) +e_{n}f^{\prime }\left (x_{n}\right ) & =-\frac {f^{\prime \prime }\left (\xi \right ) }{2!}e_{n}^{2} \tag {4} \end {align}

Substituting (4) into the numerator of (3) gives

\begin {align*} e_{n+1} & =\frac {-\frac {f^{\prime \prime }\left (\xi \right ) }{2!}e_{n}^{2}}{f^{\prime }\left (x_{n}\right ) }\\ & =-\frac {f^{\prime \prime }\left (\xi \right ) }{2f^{\prime }\left ( x_{n}\right ) }e_{n}^{2} \end {align*}

but near the root \(\frac {f^{\prime \prime }\left (\xi \right ) }{f^{\prime }\left ( x_{n}\right ) }\approx \frac {f^{\prime \prime }(r) }{f^{\prime }(r) }=k\) (some constant), hence the above becomes

\[ e_{n+1}=k_{1}e_{n}^{2}\]

where \(k_{1}=-\frac {k}{2}\) is some constant as well.

Hence we can find a constant \(C\) such that \(\left \vert e_{n+1}\right \vert \leq C\left \vert e_{n}\right \vert ^{2}\), where \(C\) is any constant at least as large as \(\left \vert k_{1}\right \vert \)

The above proves that Newton's method converges quadratically.
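To see the quadratic convergence numerically, print \(\left \vert x_{n}-r\right \vert \) at each step: the number of correct digits roughly doubles per iteration. A minimal sketch, using \(f(x)=x^{2}-2\) with root \(r=\sqrt {2}\) as an illustrative choice:

    import math

    r = math.sqrt(2.0)
    x = 1.0                              # starting guess x_0
    for n in range(6):
        print(f"n={n}  error={abs(x - r):.3e}")
        x = x - (x*x - 2.0) / (2.0*x)    # Newton step for f(x) = x^2 - 2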

3.6.1 Definition of convex function

A twice-differentiable function \(f(x)\) is convex if \(f^{\prime \prime }\left ( x\right ) \geq 0\) for all \(x.\)

MathWorld gives this definition:

“A convex function is a continuous function whose value at the midpoint of every interval in its domain does not exceed the arithmetic mean of its values at the ends of the interval.” The diagram below is from MathWorld.

Figure 3.13: MathWorld
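As a quick numerical illustration of the midpoint definition, the sketch below checks \(f\left (\frac {a+b}{2}\right ) \leq \frac {f(a)+f(b)}{2}\) for \(f(x)=x^{2}\) (convex, since \(f^{\prime \prime }(x)=2\geq 0\)) on randomly chosen intervals; the interval range is an arbitrary choice:

    import random

    def midpoint_convex(f, a, b):
        # the MathWorld midpoint condition on the interval [a, b]
        return f((a + b) / 2) <= (f(a) + f(b)) / 2

    f = lambda x: x**2
    print(all(midpoint_convex(f, random.uniform(-10, 10), random.uniform(-10, 10))
              for _ in range(1000)))     # True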

How to use Newton's method to find \(\sqrt {R}\)?

Let \(x=\sqrt {R}\); then \(x^{2}-R=0\), so \(f(x)=x^{2}-R\) is the function to use with Newton's method. This leads to \(x_{n+1}=x_{n}-\frac {x_{n}^{2}-R}{2x_{n}}=\frac {1}{2}\left (x_{n}+\frac {R}{x_{n}}\right ) \)
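A minimal sketch of this square-root iteration (the starting guess and iteration count are arbitrary choices):

    def newton_sqrt(R, n_iter=8):
        # x_{n+1} = (x_n + R/x_n)/2, derived above from f(x) = x^2 - R
        x = max(R, 1.0)                  # any positive starting guess works
        for _ in range(n_iter):
            x = 0.5 * (x + R / x)
        return x

    print(newton_sqrt(2.0))              # 1.4142135623730951, i.e. sqrt(2)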

3.6.2 Newton method to solve set of equations

For a system of two equations \(F_{1}(x,y)=0\) and \(F_{2}(x,y)=0\), the iteration becomes

\begin{align*} \begin{bmatrix} x_{n+1}\\ y_{n+1}\end{bmatrix} &= \begin{bmatrix} x_{n}\\ y_{n}\end{bmatrix} -J^{-1}\left (x_{n},y_{n}\right ) F\left (x_{n},y_{n}\right ) \\ &= \begin{bmatrix} x_{n}\\ y_{n}\end{bmatrix} -\begin{bmatrix} \frac {\partial F_{1}\left (x_{n},y_{n}\right ) }{\partial x} & \frac {\partial F_{1}\left (x_{n},y_{n}\right ) }{\partial y}\\ \frac {\partial F_{2}\left (x_{n},y_{n}\right ) }{\partial x} & \frac {\partial F_{2}\left (x_{n},y_{n}\right ) }{\partial y}\end{bmatrix} ^{-1}\begin{bmatrix} F_{1}\left (x_{n},y_{n}\right ) \\ F_{2}\left (x_{n},y_{n}\right ) \end{bmatrix} \end{align*}
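A minimal sketch of the two-equation case, using the system \(F_{1}=x^{2}+y^{2}-4\), \(F_{2}=xy-1\) as an illustrative example (not from the notes). Rather than forming \(J^{-1}\) explicitly, each step solves the linear system \(J\,\Delta =F\), which is the numerically preferable way to apply the inverse:

    import numpy as np

    def F(v):
        x, y = v
        return np.array([x**2 + y**2 - 4.0,    # F1(x, y)
                         x*y - 1.0])           # F2(x, y)

    def J(v):
        x, y = v
        return np.array([[2.0*x, 2.0*y],       # dF1/dx, dF1/dy
                         [y,     x    ]])      # dF2/dx, dF2/dy

    v = np.array([2.0, 0.5])                   # starting guess (x_0, y_0)
    for _ in range(20):
        v = v - np.linalg.solve(J(v), F(v))    # step = J^{-1} F
    print(v)                                   # a root of both equations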