
Derivation

Given $f(x) = 0$ and an initial guess $x_0$, approximate $f$ by its tangent line at $x_0$:

$$y = f(x_0) + f'(x_0)(x - x_0)$$

Setting $y = 0$ and solving for $x$ gives the next approximation:

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

Repeating this process:

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$

Newton’s Method as Fixed Point Iteration

Newton’s method is a fixed point iteration with:

$$g(x) = x - \frac{f(x)}{f'(x)}$$

If $x_n \to c$, then:

$$c = c - \frac{f(c)}{f'(c)} \implies f(c) = 0$$

So the fixed point is indeed a root of $f$.

The Algorithm
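
A minimal implementation sketch in Python; the function signature, tolerance, and iteration cap below are illustrative choices, not anything prescribed by this section:

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Solve f(x) = 0 via the iteration x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:                 # residual small enough: accept x
            return x
        dfx = fprime(x)
        if dfx == 0:                      # horizontal tangent: no Newton step
            raise ZeroDivisionError("f'(x) = 0; Newton step undefined")
        x -= fx / dfx                     # the Newton update
    raise RuntimeError("no convergence within max_iter iterations")

# Example: the positive root of f(x) = x^2 - 3
print(newton(lambda x: x**2 - 3, lambda x: 2 * x, x0=2.0))  # ~1.7320508075688772
```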

Convergence Analysis

Proof 1

The iteration function is $g(x) = x - \frac{f(x)}{f'(x)}$.

Computing the derivative:

$$g'(x) = 1 - \frac{(f')^2 - f f''}{(f')^2} = \frac{f f''}{(f')^2}$$

At the root:

$$g'(c) = \frac{f(c) f''(c)}{(f'(c))^2} = 0$$

Since $g'(c) = 0$ and $g'$ is continuous, there exists $\delta > 0$ such that $|g'(x)| < 1$ for $x \in (c - \delta, c + \delta)$. By the fixed point convergence theorem, the iteration converges for any starting point $x_0$ in this interval.

Proof 2

This follows from the fixed point order theorem: since $g'(c) = 0$ but generally $g''(c) \neq 0$, the convergence is quadratic.

More precisely, Taylor expansion of $g$ around $c$ gives:

$$x_{n+1} - c = g(x_n) - g(c) = \underbrace{g'(c)}_{=0}(x_n - c) + \frac{g''(\xi)}{2}(x_n - c)^2$$

for some $\xi$ between $x_n$ and $c$.
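
A quick numerical check of this expansion (a sketch; the choice $f(x) = x^2 - 3$ anticipates the example below): the error ratios $e_{n+1}/e_n^2$ should approach $g''(c)/2 = 1/(2\sqrt{3}) \approx 0.2887$, where $g(x) = \frac{1}{2}(x + 3/x)$ and $g''(x) = 3/x^3$.

```python
import math

# Errors e_n = |x_n - sqrt(3)| for Newton's method on f(x) = x^2 - 3.
c = math.sqrt(3)
x = 2.0
errors = []
for _ in range(4):
    errors.append(abs(x - c))
    x = 0.5 * (x + 3 / x)          # one Newton step

for e_n, e_next in zip(errors, errors[1:]):
    print(f"e_n = {e_n:.3e},  e_(n+1) / e_n^2 = {e_next / e_n**2:.4f}")
```

The printed ratios ($0.2500$, $0.2857$, $0.2887$) climb toward the predicted coefficient $1/(2\sqrt{3}) \approx 0.2887$.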

Remark 1 (Newton’s Method as a Local Contraction)

The convergence proof reveals why Newton’s method is “local”: it’s a contraction mapping only near the root!

Recall from the Banach Fixed Point Theorem that contractions converge geometrically. For Newton’s iteration $g(x) = x - f(x)/f'(x)$:

  • At the root: $|g'(c)| = 0 < 1$ ✓ (strongly contractive)

  • Far from the root: $|g'(x)|$ may be large (not a contraction!)

This explains Newton’s behavior:

  • Good initial guess: You’re in the contraction region → rapid convergence

  • Bad initial guess: You’re outside the contraction region → possible divergence

The condition $|g'(x)| < 1$ from fixed-point theory tells us exactly when Newton iterates move closer to the root. Newton’s special feature is that $g'(c) = 0$, giving quadratic rather than just linear convergence.
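
For the concrete case $f(x) = x^2 - 3$ (a sketch anticipating the example below), $g'(x) = f(x)f''(x)/f'(x)^2 = (x^2 - 3)/(2x^2)$, so $|g'(x)| < 1$ exactly when $|x| > 1$; a few sample values make the contraction region visible:

```python
# |g'(x)| for Newton's iteration on f(x) = x^2 - 3:
# g'(x) = f(x) f''(x) / f'(x)^2 = (x**2 - 3) / (2 * x**2).
def g_prime(x):
    return (x**2 - 3) / (2 * x**2)

for x in [0.2, 0.5, 1.0, 1.5, 3**0.5, 5.0]:
    tag = "contraction" if abs(g_prime(x)) < 1 else "not a contraction"
    print(f"x = {x:.4f}:  |g'(x)| = {abs(g_prime(x)):6.3f}  ({tag})")
```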

Example: Computing $\sqrt{3}$

Example 1 (Babylonian Method)

For $f(x) = x^2 - 3$, Newton’s method gives:

$$x_{n+1} = x_n - \frac{x_n^2 - 3}{2x_n} = \frac{x_n^2 + 3}{2x_n} = \frac{1}{2}\left(x_n + \frac{3}{x_n}\right)$$

This is the Babylonian method for square roots! The number of correct digits roughly doubles each iteration.

Starting from $x_0 = 2$:

| $n$ | $x_n$     | Error                |
| --- | --------- | -------------------- |
| 0   | 2.0       | $0.27$               |
| 1   | 1.75      | $0.018$              |
| 2   | 1.732143  | $9 \times 10^{-5}$   |
| 3   | 1.7320508 | $2 \times 10^{-9}$   |
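
The table can be reproduced in a few lines (a sketch, using `math.sqrt(3)` as the reference value):

```python
import math

# Babylonian iteration for sqrt(3), starting from x0 = 2.
x, c = 2.0, math.sqrt(3)
for n in range(4):
    print(f"n = {n}:  x_n = {x:.7f},  error = {abs(x - c):.2e}")
    x = 0.5 * (x + 3 / x)   # x_{n+1} = (x_n + 3/x_n) / 2
```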

Newton’s Method in Higher Dimensions

For a vector-valued function $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^n$, Newton’s method becomes:

$$\mathbf{x}_{n+1} = \mathbf{x}_n - [D\mathbf{f}(\mathbf{x}_n)]^{-1}\mathbf{f}(\mathbf{x}_n)$$

where $D\mathbf{f}$ is the Jacobian matrix.

In 2D, for $\mathbf{f}(x,y) = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \end{pmatrix}$:

$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} x_n \\ y_n \end{pmatrix} - \begin{pmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{pmatrix}^{-1} \begin{pmatrix} f_1(x_n, y_n) \\ f_2(x_n, y_n) \end{pmatrix}$$

Each Newton step requires solving a linear system—this motivates the linear algebra chapters!
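
A sketch of the $n$-dimensional step, solving the linear system $D\mathbf{f}(\mathbf{x}_n)\,\mathbf{s} = \mathbf{f}(\mathbf{x}_n)$ with NumPy rather than forming the inverse; the test system (a circle intersected with the hyperbola $xy = 1$) is an illustrative choice, not from the text:

```python
import numpy as np

def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0,   # f1: circle of radius 2
                     x * y - 1.0])        # f2: hyperbola xy = 1

def DF(v):                                # Jacobian of F
    x, y = v
    return np.array([[2 * x, 2 * y],
                     [    y,     x]])

v = np.array([2.0, 0.5])                  # initial guess
for _ in range(10):
    step = np.linalg.solve(DF(v), F(v))   # each Newton step = one linear solve
    v = v - step
    if np.linalg.norm(step) < 1e-12:
        break
print(v)   # a point satisfying x^2 + y^2 = 4 and xy = 1
```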

Multiple Roots

Remark 2 (Modified Newton for Multiple Roots)

At a root of multiplicity $p > 1$ we also have $f'(c) = 0$, and plain Newton’s method slows to linear convergence. If the multiplicity $p$ is known, the modified iteration:

$$x_{n+1} = x_n - p\,\frac{f(x_n)}{f'(x_n)}$$

restores quadratic convergence.
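
A quick comparison (a sketch; the test polynomial $f(x) = (x-1)^2(x+1)$, with a double root at $x = 1$, is my own choice):

```python
# f(x) = (x - 1)^2 (x + 1) = x^3 - x^2 - x + 1 has a double root (p = 2) at x = 1.
f  = lambda x: (x - 1)**2 * (x + 1)
df = lambda x: 3*x**2 - 2*x - 1          # derivative of x^3 - x^2 - x + 1

for p, label in [(1, "plain Newton   "), (2, "modified, p = 2")]:
    x = 2.0
    for _ in range(6):
        fx = f(x)
        if fx == 0.0:                    # landed exactly on the root
            break
        x -= p * fx / df(x)
    print(f"{label}: error after 6 steps = {abs(x - 1):.1e}")
```

Here plain Newton’s error only roughly halves per step, while the modified iteration reaches the root to machine precision within a few steps.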

Advantages and Disadvantages

Advantages:

  • Quadratic convergence near a simple root: the number of correct digits roughly doubles each iteration

  • Simple to state and implement, and generalizes directly to systems of equations

Disadvantages:

  • Requires the derivative $f'$ (or the Jacobian in higher dimensions)

  • Only locally convergent: outside the contraction region, the iteration may diverge or cycle

  • Degrades when $f'(x_n) \approx 0$, and slows to linear convergence at multiple roots

Variants:

  • Secant method: replaces $f'(x_n)$ by a finite-difference approximation, avoiding derivatives

  • Damped Newton: shortens the step to enlarge the region of convergence

  • Modified Newton for multiple roots, as in Remark 2 above