
Big Idea

Linearize the vector function at each iterate and solve the resulting linear system. The connection to scalar Newton's method: replace the division $f(x_k)/f'(x_k)$ with the linear solve $J\,\Delta\mathbf{x} = -\mathbf{F}$.
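The scalar method the analogy starts from fits in a few lines. (A minimal sketch; `newton_scalar` is an illustrative name, not from the notes.)

```python
def newton_scalar(f, df, x, n=10):
    """Scalar Newton: x <- x - f(x)/df(x) -- the division the analogy refers to."""
    for _ in range(n):
        x = x - f(x) / df(x)
    return x

# Square root of 2 as the root of f(x) = x^2 - 2:
r = newton_scalar(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
```

For systems, no analogue of this division exists; the matrix equation $J\,\Delta\mathbf{x} = -\mathbf{F}$ takes its place.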

Derivation

For $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$, linearize around the current iterate $\mathbf{x}_k$:

$$\mathbf{F}(\mathbf{x}) \approx \mathbf{F}(\mathbf{x}_k) + D\mathbf{F}(\mathbf{x}_k)(\mathbf{x} - \mathbf{x}_k)$$

Setting the linear approximation to zero and solving for $\mathbf{x}$:

$$\mathbf{0} = \mathbf{F}(\mathbf{x}_k) + D\mathbf{F}(\mathbf{x}_k)(\mathbf{x}_{k+1} - \mathbf{x}_k)$$

This gives the Newton iteration:

$$\mathbf{x}_{k+1} = \mathbf{x}_k - [D\mathbf{F}(\mathbf{x}_k)]^{-1}\mathbf{F}(\mathbf{x}_k)$$

The Jacobian Matrix

Definition 1 (Jacobian)

The Jacobian of $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$ at $\mathbf{x}$ is the $n \times n$ matrix of partial derivatives:

$$D\mathbf{F}(\mathbf{x}) = \begin{pmatrix} \frac{\partial F_1}{\partial x_1} & \frac{\partial F_1}{\partial x_2} & \cdots & \frac{\partial F_1}{\partial x_n} \\ \frac{\partial F_2}{\partial x_1} & \frac{\partial F_2}{\partial x_2} & \cdots & \frac{\partial F_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial F_n}{\partial x_1} & \frac{\partial F_n}{\partial x_2} & \cdots & \frac{\partial F_n}{\partial x_n} \end{pmatrix}$$

Entry $(i, j)$ is $\frac{\partial F_i}{\partial x_j}$: how the $i$-th equation changes with respect to the $j$-th variable.
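A forward-difference approximation is a convenient sanity check on a hand-derived Jacobian: column $j$ comes from perturbing $x_j$. (A sketch; `jacobian_fd`, the step size, and the test function are illustrative choices, not part of the notes.)

```python
import numpy as np

def jacobian_fd(F, x, h=1e-7):
    """Forward-difference Jacobian: column j approximates dF/dx_j."""
    x = np.asarray(x, dtype=float)
    Fx = np.asarray(F(x))
    J = np.empty((Fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h                                 # perturb one coordinate
        J[:, j] = (np.asarray(F(xp)) - Fx) / h     # column j = dF/dx_j
    return J

# Check against an analytic Jacobian for F(x, y) = (x^2 + y, xy):
# analytic DF = [[2x, 1], [y, x]] = [[2, 1], [2, 1]] at (1, 2)
J = jacobian_fd(lambda v: np.array([v[0]**2 + v[1], v[0] * v[1]]), [1.0, 2.0])
```

Agreement to a few digits (the forward difference is only $\mathcal{O}(h)$ accurate) gives good confidence that the analytic entries are right.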

The Algorithm

Algorithm 1 (Newton’s Method for Systems)

Input: $\mathbf{F}$, $D\mathbf{F}$, initial guess $\mathbf{x}_0$, tolerance $\varepsilon$, max iterations $N$

Output: Approximate root $\mathbf{x}$

1. for $k = 0, 1, 2, \ldots, N-1$:
2. $\qquad$ Evaluate $\mathbf{F}_k = \mathbf{F}(\mathbf{x}_k)$
3. $\qquad$ if $\|\mathbf{F}_k\| < \varepsilon$: return $\mathbf{x}_k$
4. $\qquad$ Evaluate $J_k = D\mathbf{F}(\mathbf{x}_k)$
5. $\qquad$ Solve $J_k \Delta\mathbf{x} = -\mathbf{F}_k$ for $\Delta\mathbf{x}$
6. $\qquad$ $\mathbf{x}_{k+1} \gets \mathbf{x}_k + \Delta\mathbf{x}$
7. return $\mathbf{x}_N$ (or indicate failure)

Key point: We solve the linear system (step 5) rather than computing $J^{-1}$ explicitly. Use LU factorization: $\mathcal{O}(n^3)$ once, then $\mathcal{O}(n^2)$ for the solve.
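The steps above translate almost line for line into NumPy/SciPy. This is a minimal sketch (the name `newton_system` is mine); it uses `lu_factor`/`lu_solve` so $J^{-1}$ is never formed:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def newton_system(F, DF, x0, tol=1e-10, max_iter=50):
    """Newton's method for F(x) = 0 in R^n; solves J dx = -F via LU each step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fk = F(x)                          # step 2: residual
        if np.linalg.norm(Fk) < tol:       # step 3: convergence test
            return x
        Jk = DF(x)                         # step 4: Jacobian
        dx = lu_solve(lu_factor(Jk), -Fk)  # step 5: linear solve, not inversion
        x = x + dx                         # step 6: update
    raise RuntimeError("Newton's method did not converge")

# Intersect x^2 + y^2 = 4 with xy = 1 (the 2D example below):
F  = lambda v: np.array([v[0]**2 + v[1]**2 - 4, v[0] * v[1] - 1])
DF = lambda v: np.array([[2 * v[0], 2 * v[1]], [v[1], v[0]]])
root = newton_system(F, DF, [2.0, 1.0])
```

From $(2, 1)$ this converges to the intersection point $(\sqrt{2+\sqrt{3}},\, 1/\sqrt{2+\sqrt{3}})$ in a handful of iterations.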

Example: 2D System

Find the intersection of $x^2 + y^2 = 4$ and $xy = 1$.

Define:

$$\mathbf{F}(x, y) = \begin{pmatrix} x^2 + y^2 - 4 \\ xy - 1 \end{pmatrix}$$

The Jacobian:

$$D\mathbf{F}(x, y) = \begin{pmatrix} 2x & 2y \\ y & x \end{pmatrix}$$

Example 1 (One Newton Iteration)

Starting from $(x_0, y_0) = (2, 1)$:

Step 1: Evaluate $\mathbf{F}$ and $D\mathbf{F}$:

$$\mathbf{F}(2, 1) = \begin{pmatrix} 4 + 1 - 4 \\ 2 - 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

$$D\mathbf{F}(2, 1) = \begin{pmatrix} 4 & 2 \\ 1 & 2 \end{pmatrix}$$

Step 2: Solve $J\,\Delta\mathbf{x} = -\mathbf{F}$:

$$\begin{pmatrix} 4 & 2 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}$$

Using elimination (or Cramer's rule with $\det(J) = 8 - 2 = 6$):

- From row 1: $4\Delta x + 2\Delta y = -1$
- From row 2: $\Delta x + 2\Delta y = -1$
- Subtracting: $3\Delta x = 0$, so $\Delta x = 0$
- Back-substituting into row 2: $2\Delta y = -1$, so $\Delta y = -1/2$

Step 3: Update:

$$(x_1, y_1) = (2, 1) + (0, -1/2) = (2, 1/2)$$

This is already close to the true intersection $(\sqrt{2+\sqrt{3}},\, 1/\sqrt{2+\sqrt{3}}) \approx (1.932, 0.518)$.
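The hand computation is easy to verify numerically. (A small check script, not from the notes; `np.linalg.solve` stands in for the elimination done by hand.)

```python
import numpy as np

F  = lambda v: np.array([v[0]**2 + v[1]**2 - 4, v[0] * v[1] - 1])
DF = lambda v: np.array([[2 * v[0], 2 * v[1]], [v[1], v[0]]])

x0 = np.array([2.0, 1.0])
dx = np.linalg.solve(DF(x0), -F(x0))  # solve J dx = -F
x1 = x0 + dx                          # dx = (0, -1/2), so x1 = (2, 1/2)
```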

Cost Analysis

Each Newton iteration requires:

| Operation | Cost | Notes |
|---|---|---|
| Evaluate $\mathbf{F}(\mathbf{x}_k)$ | $\mathcal{O}(n)$ to $\mathcal{O}(n^2)$ | Depends on $\mathbf{F}$ |
| Evaluate $D\mathbf{F}(\mathbf{x}_k)$ | $\mathcal{O}(n^2)$ | $n^2$ partial derivatives |
| LU factorization of $J_k$ | $\mathcal{O}(n^3)$ | Dominant cost |
| Solve with LU factors | $\mathcal{O}(n^2)$ | Forward/back substitution |

Total per iteration: $\mathcal{O}(n^3)$, dominated by the linear system solve.

Convergence Analysis

Newton’s method for systems inherits the quadratic convergence of scalar Newton, but only locally. Understanding the convergence conditions reveals why globalization is essential.

The Local Convergence Theorem

Theorem 1 (Newton Convergence for Systems)

Let $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable with $\mathbf{F}(\mathbf{x}^*) = \mathbf{0}$. If:

1. $D\mathbf{F}(\mathbf{x}^*)$ is nonsingular

2. $D\mathbf{F}$ is Lipschitz continuous near $\mathbf{x}^*$

Then there exists $\delta > 0$ such that for $\|\mathbf{x}_0 - \mathbf{x}^*\| < \delta$, Newton's method converges quadratically:

$$\|\mathbf{x}_{k+1} - \mathbf{x}^*\| \leq C\|\mathbf{x}_k - \mathbf{x}^*\|^2$$

The proof mirrors the scalar case. Define the error $\mathbf{e}_k = \mathbf{x}_k - \mathbf{x}^*$.

Step 1: From the Newton iteration,

$$\mathbf{e}_{k+1} = \mathbf{e}_k - [D\mathbf{F}(\mathbf{x}_k)]^{-1}\mathbf{F}(\mathbf{x}_k)$$

Step 2: Using $\mathbf{F}(\mathbf{x}^*) = \mathbf{0}$ and Taylor expansion:

$$\mathbf{F}(\mathbf{x}_k) = D\mathbf{F}(\mathbf{x}^*)\mathbf{e}_k + \mathcal{O}(\|\mathbf{e}_k\|^2)$$

Step 3: The Lipschitz condition on $D\mathbf{F}$ bounds the error in the Jacobian:

$$\|D\mathbf{F}(\mathbf{x}_k) - D\mathbf{F}(\mathbf{x}^*)\| \leq L\|\mathbf{e}_k\|$$

Step 4: Combining these bounds shows $\|\mathbf{e}_{k+1}\| \leq C\|\mathbf{e}_k\|^2$.

Convergence Rates

Different methods achieve different convergence rates:

| Method | Rate | Definition | Typical Behavior |
|---|---|---|---|
| Linear | $\|\mathbf{e}_{k+1}\| \leq c\|\mathbf{e}_k\|$ | Constant ratio $c < 1$ | Errors decrease geometrically |
| Superlinear | $\lim_{k \to \infty} \|\mathbf{e}_{k+1}\| / \|\mathbf{e}_k\| = 0$ | Ratio $\to 0$ | Faster and faster |
| Quadratic | $\|\mathbf{e}_{k+1}\| \leq C\|\mathbf{e}_k\|^2$ | Errors squared | Digits double each iteration |

Example 2 (Quadratic vs Linear)

Starting with $\|\mathbf{e}_0\| = 0.1$:

| Iteration | Linear ($c = 0.5$) | Quadratic ($C = 1$) |
|---|---|---|
| 0 | $10^{-1}$ | $10^{-1}$ |
| 1 | $5 \times 10^{-2}$ | $10^{-2}$ |
| 2 | $2.5 \times 10^{-2}$ | $10^{-4}$ |
| 3 | $1.25 \times 10^{-2}$ | $10^{-8}$ |
| 4 | $6.25 \times 10^{-3}$ | $10^{-16}$ (machine precision!) |

Quadratic convergence reaches machine precision in 4 iterations; linear needs about 50.
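The quadratic behavior is visible directly on the earlier 2D example: each error is roughly a constant times the square of the previous one. (A sketch; the known root is used only to measure the error.)

```python
import numpy as np

F  = lambda v: np.array([v[0]**2 + v[1]**2 - 4, v[0] * v[1] - 1])
DF = lambda v: np.array([[2 * v[0], 2 * v[1]], [v[1], v[0]]])
star = np.array([np.sqrt(2 + np.sqrt(3)), 1 / np.sqrt(2 + np.sqrt(3))])  # known root

x = np.array([2.0, 1.0])
errs = []
for _ in range(6):
    errs.append(np.linalg.norm(x - star))       # record error before the step
    x = x + np.linalg.solve(DF(x), -F(x))       # one Newton step
# errs decays roughly like e, e^2, e^4, ... down to machine precision
```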

The Locality Problem

The theorem guarantees convergence only if $\|\mathbf{x}_0 - \mathbf{x}^*\| < \delta$. But:

1. We don't know $\mathbf{x}^*$ -- that's what we're trying to find!

2. We don't know $\delta$ -- it depends on the problem

3. $\delta$ can be very small -- especially for ill-conditioned problems

Example 3 (Small Basin of Attraction)

Consider the system:

$$\mathbf{F}(x, y) = \begin{pmatrix} x^2 + y^2 - 1 \\ (x - 1)^2 + y^2 - 1 \end{pmatrix}$$

These are two unit circles, centered at $(0, 0)$ and $(1, 0)$, that intersect at $(1/2, \pm\sqrt{3}/2)$.

The Jacobian:

$$D\mathbf{F} = \begin{pmatrix} 2x & 2y \\ 2(x - 1) & 2y \end{pmatrix}$$

has determinant $4y$, so it is singular exactly when $y = 0$ (the circles' line of centers). Newton iterations starting near $y = 0$ can fail dramatically. The basin of attraction is limited by this singular line.
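The singular line shows up immediately in floating point. (A sketch using the two unit circles centered at $(0,0)$ and $(1,0)$; variable names are mine.)

```python
import numpy as np

F  = lambda v: np.array([v[0]**2 + v[1]**2 - 1, (v[0] - 1)**2 + v[1]**2 - 1])
DF = lambda v: np.array([[2 * v[0], 2 * v[1]], [2 * (v[0] - 1), 2 * v[1]]])

# det(DF) = 4y, so the Jacobian is singular anywhere on the line y = 0:
d = np.linalg.det(DF(np.array([0.5, 0.0])))  # singular: determinant is 0

# Starting just off the singular line, the Newton step in y blows up:
x = np.array([0.5, 1e-6])
dx = np.linalg.solve(DF(x), -F(x))  # |dx[1]| is on the order of 10^5
```

The nearly singular Jacobian turns a tiny offset in $y$ into an enormous first step, throwing the iterate far outside any reasonable basin of attraction.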

What Can Go Wrong

1. Singular Jacobian. If $\det(D\mathbf{F}(\mathbf{x}_k)) = 0$, the Newton step is undefined. At the root, if $D\mathbf{F}(\mathbf{x}^*)$ is singular, the root is called degenerate. Convergence (if it happens) is typically only linear. Away from the root, this may indicate the iteration is heading in a bad direction.

2. Nearly singular Jacobian. A large condition number $\kappa(D\mathbf{F})$ causes loss of precision in solving $D\mathbf{F}\,\Delta\mathbf{x} = -\mathbf{F}$ and potential divergence due to roundoff.

3. Divergence. The iteration may oscillate ($\mathbf{x}_k \to \mathbf{x}_{k+1} \to \mathbf{x}_k \to \cdots$), escape to infinity ($\|\mathbf{x}_k\| \to \infty$), or converge to the wrong root.

4. Multiple solutions. Newton finds a root, not necessarily the one you want.