
Big Idea

Newton’s method approximates a function by its tangent line at each iteration. This simple idea yields quadratic convergence: the number of correct digits roughly doubles each step. It’s the workhorse of nonlinear equation solving.

Derivation

Given $f(x) = 0$ and an initial guess $x_0$, approximate $f$ by its tangent line at $x_0$:

$$y = f(x_0) + f'(x_0)(x - x_0)$$

Setting $y = 0$ and solving for $x$ gives the next approximation:

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

Repeating this process:

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$

Newton’s method in action. At each step, the function $f(x) = x^3 - 2x - 5$ (blue) is approximated by its tangent line (red dashed) at the current iterate $x_n$ (black star). The next iterate $x_{n+1}$ (red square) is the root of the tangent line. The true root is marked in green. Notice how the iterates converge rapidly to the root, with the tangent line becoming an increasingly accurate local approximation.

Newton’s Method as Fixed Point Iteration

Newton’s method is a fixed point iteration with:

$$g(x) = x - \frac{f(x)}{f'(x)}$$

If $x_n \to c$, then:

$$c = c - \frac{f(c)}{f'(c)} \implies f(c) = 0$$

So the fixed point is indeed a root of $f$.

The Algorithm

Algorithm 1 (Newton’s Method)

Input: Functions $f$ and $f'$, initial guess $x_0$, tolerance $\varepsilon$, max iterations $N$

Output: Approximate root $x$

  1. for $n = 0, 1, 2, \ldots, N-1$:

  2. $\qquad x_{n+1} \gets x_n - f(x_n)/f'(x_n)$

  3. $\qquad$ if $|x_{n+1} - x_n| < \varepsilon$: return $x_{n+1}$

  4. return $x_N$ (or indicate failure)
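Algorithm 1 translates directly into a few lines of Python. The following is a minimal sketch (the function names are my own, and the test function $f(x) = x^3 - 2x - 5$ is borrowed from the figure above; the text itself provides no implementation):

```python
def newton(f, fprime, x0, eps=1e-12, max_iter=50):
    """Newton's method: a direct transcription of Algorithm 1."""
    x = x0
    for _ in range(max_iter):
        fp = fprime(x)
        if fp == 0:  # horizontal tangent: the Newton step is undefined
            raise ZeroDivisionError(f"f'(x) = 0 at x = {x}")
        x_new = x - f(x) / fp        # root of the tangent line at x
        if abs(x_new - x) < eps:     # successive iterates agree: done
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# The function from the figure: f(x) = x^3 - 2x - 5, root near 2.0946
root = newton(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, x0=2.0)
```

Stopping on $|x_{n+1} - x_n| < \varepsilon$, as in step 3 of the algorithm, is the usual practical criterion since the true root is unknown.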

Convergence Analysis

Theorem 1 (Local Convergence of Newton’s Method)

Suppose $f \in \mathcal{C}^2([a,b])$, $f(c) = 0$ for some $c \in (a,b)$, and $c$ is a simple root (i.e., $f'(c) \neq 0$). Then for $x_0$ sufficiently close to $c$, Newton’s method converges to $c$.

Proof 1

The iteration function is $g(x) = x - \frac{f(x)}{f'(x)}$.

Computing the derivative:

$$g'(x) = 1 - \frac{(f')^2 - f f''}{(f')^2} = \frac{f f''}{(f')^2}$$

At the root:

$$g'(c) = \frac{f(c)\, f''(c)}{(f'(c))^2} = 0$$

Since $g'(c) = 0$ and $g'$ is continuous, there exists $\delta > 0$ such that $|g'(x)| < 1$ for $x \in (c-\delta, c+\delta)$. By the fixed point convergence theorem, the iteration converges.

The key step in the proof above deserves emphasis: since $g'(c) = 0$ and $g'$ is continuous, $|g'(x)| < 1$ in a neighborhood of $c$. This is a fundamental property of continuous functions: if a continuous function satisfies a strict inequality at a point, the inequality persists in a neighborhood.


Continuity and strict inequalities. If $h$ is continuous and $|h(x^*)| < 1$, then there exists $\delta > 0$ such that $|h(x)| < 1$ for all $x$ in the green region. Left: when $h(x^*)$ is well below 1, the neighborhood is large. Right: when $h(x^*)$ is close to 1, the neighborhood shrinks. The guarantee always holds, but the closer $|h(x^*)|$ is to 1, the smaller $\delta$ may be.
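This persistence is easy to observe numerically for the running example $f(x) = x^3 - 2x - 5$, using the formula $g'(x) = f(x)f''(x)/(f'(x))^2$ from the proof (my own check; the sample points are arbitrary choices):

```python
# Evaluate g'(x) = f(x) f''(x) / f'(x)^2 for f(x) = x^3 - 2x - 5,
# whose simple root is c ≈ 2.0945515.
f   = lambda x: x**3 - 2*x - 5
fp  = lambda x: 3*x**2 - 2
fpp = lambda x: 6*x
gp  = lambda x: f(x) * fpp(x) / fp(x)**2

c = 2.0945514815423265
near = gp(c + 0.01)   # close to the root: |g'| is small (contraction)
far  = gp(0.9)        # near the critical point of f: |g'| is huge
```

Near the root $|g'|$ is tiny, so Newton contracts strongly; near $x \approx 0.9$, where $f'(x) \approx 0$, $|g'|$ blows up and no contraction argument applies.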

Remark 1 (Local Convergence)

The convergence proof above shows that Newton’s method converges locally: it works only when $x_0$ is sufficiently close to the root. Even if $|g'(x)| < 1$ holds only near the root (not on the whole real line), the iteration still converges provided we start close enough.

Specifically: since $g'(c) = 0$ and $g'$ is continuous, there exists $\delta > 0$ such that $|g'(x)| \leq \rho < 1$ for all $x \in (c - \delta, c + \delta)$. For any $x_0$ in this interval, the fixed point convergence theorem guarantees $x_n \to c$.

This is why a good initial guess matters for Newton’s method. In practice, one often uses a few steps of bisection to get close, then switches to Newton for fast convergence.
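The bisection-then-Newton strategy can be sketched as follows (a minimal illustration; the bracketing interval, the step counts, and all names are my own assumptions, not code from the text):

```python
def bisect_then_newton(f, fprime, a, b, n_bisect=5, eps=1e-12):
    """Localize the root with a few bisection steps, then polish with Newton.
    Assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(n_bisect):          # bisection: halve the bracket
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    x = 0.5 * (a + b)                  # midpoint is Newton's starting guess
    for _ in range(50):                # Newton: fast convergence from here
        x_new = x - f(x) / fprime(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    return x

root = bisect_then_newton(lambda x: x**3 - 2*x - 5,
                          lambda x: 3*x**2 - 2, a=0.0, b=3.0)
```

Five bisection steps shrink the bracket $[0, 3]$ to width $3/2^5 < 0.1$, comfortably inside Newton’s contraction region for this function.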

Order of Convergence

We’ve seen that Newton’s method converges when $x_0$ is close enough to the root, but how fast? We need a precise notion of convergence speed.

Definition 1 (Order of Convergence)

A sequence $\{x_n\}$ converging to $\ell$ has order $p$ if:

$$\lim_{n\to\infty} \frac{|x_{n+1} - \ell|}{|x_n - \ell|^p} = C$$

for some constant $C > 0$ (with $C < 1$ required when $p = 1$).

  • $p = 1$: Linear convergence (error shrinks by a constant factor each step)

  • $p = 2$: Quadratic convergence (digits of accuracy double each step)

  • $p = 3$: Cubic convergence (digits of accuracy triple each step)

Remark 2 (What Higher-Order Convergence Means)

The error relation $|e_{n+1}| \approx C|e_n|^p$ has dramatic consequences as $p$ increases.

Linear ($p = 1$): Each step multiplies the error by a fixed factor $\rho < 1$. To gain one digit of accuracy, you always need the same fixed number of iterations. Progress is steady but slow.

Quadratic ($p = 2$): Each step roughly squares the error. If $e_n \approx 10^{-3}$, then $e_{n+1} \approx 10^{-6}$, then $e_{n+2} \approx 10^{-12}$. The number of correct digits doubles at every step. This is characteristic of Newton’s method: the initial iterations show modest improvement, but once the error becomes small, the doubling effect produces convergence to machine precision in just a few steps.

Cubic ($p = 3$): Each step cubes the error; the number of correct digits triples each step, even faster.
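The order $p$ can be estimated numerically from three consecutive errors: $|e_{n+1}| \approx C|e_n|^p$ implies $p \approx \log(e_{n+1}/e_n)/\log(e_n/e_{n-1})$. A sketch (my own code; it uses the Babylonian iteration for $\sqrt{3}$ that appears later in the document):

```python
import math

def estimate_order(e0, e1, e2):
    """Estimate convergence order p from three consecutive errors,
    using |e_{n+1}| ≈ C|e_n|^p  =>  p ≈ log(e2/e1) / log(e1/e0)."""
    return math.log(e2 / e1) / math.log(e1 / e0)

# Errors of Newton's method for f(x) = x^2 - 3, starting from x0 = 2
c = math.sqrt(3.0)
x, errors = 2.0, []
for _ in range(4):
    errors.append(abs(x - c))
    x = 0.5 * (x + 3.0 / x)   # Newton step for x^2 - 3

p = estimate_order(errors[1], errors[2], errors[3])  # ≈ 2
```

The estimate is only meaningful while the errors stay above round-off; once $e_n$ nears machine precision the ratios become noise.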


Quadratic Convergence of Newton’s Method

Theorem 2 (Quadratic Convergence)

Under the same assumptions as Theorem 1, Newton’s method converges with order at least $p = 2$ (quadratic convergence). Specifically:

$$|x_{n+1} - c| \leq M |x_n - c|^2$$

for some constant $M > 0$ depending on $f$.

Proof 2

Newton’s method is a fixed point iteration with $g'(c) = 0$ but generally $g''(c) \neq 0$.

Taylor expansion of $g$ around $c$ gives:

$$x_{n+1} - c = g(x_n) - g(c) = \underbrace{g'(c)}_{=0}(x_n - c) + \frac{g''(\xi)}{2}(x_n - c)^2$$

for some $\xi$ between $x_n$ and $c$. Taking absolute values yields $|x_{n+1} - c| \leq M |x_n - c|^2$ with $M = \tfrac{1}{2}\max_{|x-c|\leq\delta} |g''(x)|$, so the convergence is (at least) quadratic.
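The bound can be checked numerically. For $f(x) = x^2 - 3$ the ratio $|e_{n+1}|/|e_n|^2$ should settle near $|g''(c)|/2 = |f''(c)/(2f'(c))| = 1/(2\sqrt{3}) \approx 0.289$ (my own check, not from the text):

```python
import math

# Track the quadratic error ratio |e_{n+1}| / |e_n|^2 for f(x) = x^2 - 3;
# theory predicts it approaches |f''(c) / (2 f'(c))| = 1/(2*sqrt(3)).
c = math.sqrt(3.0)
x = 2.0
ratios = []
for _ in range(3):
    e_n = abs(x - c)
    x = 0.5 * (x + 3.0 / x)             # Newton step
    ratios.append(abs(x - c) / e_n**2)  # quadratic error ratio
```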

Remark 3 (Newton’s Method as a Local Contraction)

The convergence proof reveals why Newton’s method is “local”: it’s a contraction mapping only near the root!

Recall from the Banach Fixed Point Theorem that contractions converge geometrically. For Newton’s iteration $g(x) = x - f(x)/f'(x)$:

  • At the root: $|g'(c)| = 0 < 1$ (strongly contractive)

  • Far from the root: $|g'(x)|$ may be large (not a contraction!)

This explains Newton’s behavior:

  • Good initial guess: you’re in the contraction region $\rightarrow$ rapid convergence

  • Bad initial guess: you’re outside the contraction region $\rightarrow$ possible divergence

The condition $|g'(x)| < 1$ from fixed-point theory tells us exactly when Newton iterates move closer to the root. Newton’s special feature is that $g'(c) = 0$, giving quadratic rather than just linear convergence.
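The contraction-region picture is easy to see numerically. A classic illustration (my own choice of example, not from the text) is $f(x) = \arctan x$, whose only root is $0$: Newton converges from $x_0 = 1$ but diverges from $x_0 = 2$.

```python
import math

def newton_path(x0, steps=6):
    """Newton iterates for f(x) = arctan(x), f'(x) = 1/(1+x^2)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - math.atan(x) * (1 + x * x))  # Newton step
    return xs

good = newton_path(1.0)   # inside the contraction region: converges to 0
bad  = newton_path(2.0)   # outside: |x_n| grows without bound
```

For this $f$, each iterate overshoots the root; inside the contraction region the overshoot shrinks, while outside it each overshoot is larger than the last.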

See Also

Locality in practice. The exercises in Q3.7: Newton’s Method Basins of Attraction explore this directly: for the polynomial $f(x) = 4x^4 - 6x^2 - 11/4$, some starting points on the real line fail to converge entirely, and extending to the complex plane reveals intricate basins of attraction (Newton fractals) where nearby initial guesses can converge to completely different roots.

The Newton Basins notebook explores Newton’s method in the complex plane further, including deflation techniques and how basins of attraction shift as parameters change.

Example: Computing $\sqrt{3}$

Example 1 (Babylonian Method)

For $f(x) = x^2 - 3$, Newton’s method gives:

$$x_{n+1} = x_n - \frac{x_n^2 - 3}{2x_n} = \frac{x_n^2 + 3}{2x_n} = \frac{1}{2}\left(x_n + \frac{3}{x_n}\right)$$

This is the Babylonian method for square roots! The number of correct digits roughly doubles each iteration.

Starting from $x_0 = 2$:

| $n$ | $x_n$ | Error |
| --- | --- | --- |
| 0 | 2.0 | 0.27 |
| 1 | 1.75 | 0.018 |
| 2 | 1.732143 | $9 \times 10^{-5}$ |
| 3 | 1.7320508 | $2 \times 10^{-9}$ |
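The table above can be reproduced in a few lines (a sketch; the code is mine, not part of the example):

```python
import math

# Babylonian iteration x_{n+1} = (x_n + 3/x_n)/2 from x_0 = 2,
# recording the error |x_n - sqrt(3)| at each step.
x = 2.0
rows = []
for n in range(4):
    rows.append((n, x, abs(x - math.sqrt(3.0))))
    x = 0.5 * (x + 3.0 / x)

for n, xn, err in rows:
    print(f"{n}  {xn:.7f}  {err:.1e}")
```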
See Also

Fixed Point Iteration Demo: Compares different iteration functions for $\sqrt{3}$

Summary

Newton’s method is a fixed point iteration engineered so that $g'(c) = 0$, giving quadratic convergence near simple roots. The cost of this speed is locality: convergence is only guaranteed when $x_0$ is close enough to the root.