
The Unifying Framework

Consider solving the linear system $A\mathbf{x} = \mathbf{b}$. Instead of direct factorization, we can reformulate it as a fixed point problem:

$\mathbf{x} = M\mathbf{x} + \mathbf{c}$

and iterate:

$\mathbf{x}^{(k+1)} = M\mathbf{x}^{(k)} + \mathbf{c}$
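For concreteness, here is a minimal NumPy sketch of this iteration; the particular $M$ and $\mathbf{c}$ are made-up illustrative values, not taken from the text:

```python
import numpy as np

# A made-up contraction: rho(M) < 1, so the iteration should converge.
M = np.array([[0.5, 0.1],
              [0.2, 0.4]])
c = np.array([1.0, 2.0])

x = np.zeros(2)          # arbitrary starting guess x^(0)
for k in range(50):
    x = M @ x + c        # x^(k+1) = M x^(k) + c

# Compare against the exact fixed point x = (I - M)^{-1} c
x_exact = np.linalg.solve(np.eye(2) - M, c)
print(np.allclose(x, x_exact))   # True
```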

When does this converge? Three equivalent perspectives:

Perspective 1: Banach Fixed Point Theorem

Remark 1 (Connection to Scalar Case)

This is the matrix version of the Banach Fixed Point Theorem from nonlinear equations!

| Scalar | Matrix |
| --- | --- |
| $x_{n+1} = g(x_n)$ | $\mathbf{x}^{(k+1)} = M\mathbf{x}^{(k)} + \mathbf{c}$ |
| Converges if $\lvert g'(x^*) \rvert < 1$ | Converges if $\rho(M) < 1$ |
| Rate: $\lvert e_{n+1} \rvert \approx \lvert g'(x^*) \rvert \, \lvert e_n \rvert$ | Rate: $\lVert \mathbf{e}^{(k+1)} \rVert \leq \lVert M \rVert \, \lVert \mathbf{e}^{(k)} \rVert$ |

The unifying principle: contractions converge geometrically.
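To see why, subtract the fixed point equation $\mathbf{x}^* = M\mathbf{x}^* + \mathbf{c}$ from the iteration: the error $\mathbf{e}^{(k)} = \mathbf{x}^{(k)} - \mathbf{x}^*$ satisfies

$\mathbf{e}^{(k+1)} = M\mathbf{e}^{(k)}, \qquad \text{so} \qquad \|\mathbf{e}^{(k)}\| \leq \|M\|^k \|\mathbf{e}^{(0)}\|,$

which decays geometrically whenever $\|M\| < 1$ in some norm.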

Perspective 2: Neumann Series

Rearrange $\mathbf{x} = M\mathbf{x} + \mathbf{c}$ to get $(I - M)\mathbf{x} = \mathbf{c}$, so:

$\mathbf{x} = (I - M)^{-1}\mathbf{c}$

When does this inverse exist? The Neumann series gives the answer:

Proof 1

Let $S_n = I + M + M^2 + \cdots + M^n$. Then:

$(I - M)S_n = I - M^{n+1}$

If $\rho(M) < 1$, then $M^{n+1} \to 0$ as $n \to \infty$, so $S_n \to (I-M)^{-1}$.

This is the matrix analog of the geometric series:

$\frac{1}{1-r} = 1 + r + r^2 + \cdots \quad \text{for } |r| < 1$
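A quick numerical sanity check of the Neumann series; the matrix $M$ is a made-up contraction, not from the text:

```python
import numpy as np

M = np.array([[0.5, 0.1],
              [0.2, 0.4]])      # made-up example with rho(M) < 1

# Partial sums S_n = I + M + M^2 + ... built term by term
S = np.zeros_like(M)
term = np.eye(2)
for n in range(100):
    S += term
    term = term @ M

inv = np.linalg.inv(np.eye(2) - M)
print(np.allclose(S, inv))      # True: S_n -> (I - M)^{-1}
```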

Perspective 3: Iterates Are Neumann Partial Sums

Here’s the beautiful connection: the fixed point iterates compute the Neumann series term by term.

Proof 2

Expand the recurrence $\mathbf{x}^{(k+1)} = M\mathbf{x}^{(k)} + \mathbf{c}$:

$\mathbf{x}^{(1)} = M\mathbf{x}^{(0)} + \mathbf{c}$
$\mathbf{x}^{(2)} = M\mathbf{x}^{(1)} + \mathbf{c} = M^2\mathbf{x}^{(0)} + M\mathbf{c} + \mathbf{c}$
$\mathbf{x}^{(3)} = M\mathbf{x}^{(2)} + \mathbf{c} = M^3\mathbf{x}^{(0)} + M^2\mathbf{c} + M\mathbf{c} + \mathbf{c}$

By induction: $\mathbf{x}^{(k)} = M^k \mathbf{x}^{(0)} + \sum_{j=0}^{k-1} M^j \mathbf{c}$.

As $k \to \infty$ with $\rho(M) < 1$, the term $M^k \mathbf{x}^{(0)}$ vanishes and the sum converges to the Neumann series, so $\mathbf{x}^{(k)} \to (I - M)^{-1}\mathbf{c}$ regardless of the starting guess.

Special case: If $\mathbf{x}^{(0)} = \mathbf{0}$, then $\mathbf{x}^{(k)} = \left(\sum_{j=0}^{k-1} M^j\right)\mathbf{c}$ — the iterates are the Neumann partial sums applied to $\mathbf{c}$.
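This identity is easy to verify numerically; the $M$ and $\mathbf{c}$ below are the same kind of made-up illustrative values used earlier:

```python
import numpy as np

M = np.array([[0.5, 0.1],
              [0.2, 0.4]])   # made-up contraction, rho(M) < 1
c = np.array([1.0, 2.0])

x = np.zeros(2)              # fixed point iterates, starting from x^(0) = 0
partial = np.zeros(2)        # Neumann partial sum applied to c
term = c.copy()              # holds M^j c
for k in range(10):
    x = M @ x + c            # x^(k+1)
    partial += term          # now sum_{j=0}^{k} M^j c
    term = M @ term
    print(np.allclose(x, partial))   # True at every step
```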

Matrix Splittings for $A\mathbf{x} = \mathbf{b}$

To solve $A\mathbf{x} = \mathbf{b}$, split $A = N - P$ where $N$ is easy to invert:

$N\mathbf{x} = P\mathbf{x} + \mathbf{b} \implies \mathbf{x} = N^{-1}P\mathbf{x} + N^{-1}\mathbf{b} = M\mathbf{x} + \mathbf{c}$

where $M = N^{-1}P$ is the iteration matrix and $\mathbf{c} = N^{-1}\mathbf{b}$.

Convergence: $\rho(M) < 1$ ⟺ the Neumann series for $(I - M)^{-1}$ converges.

Jacobi Method

Split $A = D - (L + U)$ where $D$ = diagonal, $L$ = strict lower, $U$ = strict upper.

Key property: All components use values from the previous iteration — parallelizable.
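One way this might look in NumPy; the helper `jacobi` and the small test system are illustrative, not from the text:

```python
import numpy as np

def jacobi(A, b, x0=None, tol=1e-10, max_iter=500):
    """Solve Ax = b by Jacobi iteration: x^(k+1) = D^{-1}((L+U) x^(k) + b)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.diag(A)                      # diagonal of A (the matrix D as a vector)
    R = A - np.diag(d)                  # off-diagonal part, equal to -(L + U)
    x = np.zeros_like(b) if x0 is None else np.array(x0, dtype=float)
    for _ in range(max_iter):
        x_new = (b - R @ x) / d         # every component uses the old x: parallelizable
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])  # strictly diagonally dominant test matrix
b = np.array([1.0, 2.0])
print(np.allclose(jacobi(A, b), np.linalg.solve(A, b)))   # True
```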

Gauss-Seidel Method

Use updated values immediately as they become available.

Key property: Uses new values for $j < i$ — typically faster convergence but not parallelizable.
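A matching Gauss-Seidel sketch; again the helper name and test system are illustrative:

```python
import numpy as np

def gauss_seidel(A, b, x0=None, tol=1e-10, max_iter=500):
    """Solve Ax = b by Gauss-Seidel: sweep through components, using new values immediately."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # components j < i already hold the new iterate; j > i still hold the old one
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
print(np.allclose(gauss_seidel(A, b), np.linalg.solve(A, b)))   # True
```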

Comparison

| Method | Iteration Matrix $M$ | Parallelizable | Typical Convergence |
| --- | --- | --- | --- |
| Jacobi | $D^{-1}(L + U)$ | Yes | Slower |
| Gauss-Seidel | $(D-L)^{-1}U$ | No | Faster |
| SOR | Modified G-S with $\omega$ | No | Tunable |

Convergence Conditions

Spectral Radius

The fundamental criterion: the iteration $\mathbf{x}^{(k+1)} = M\mathbf{x}^{(k)} + \mathbf{c}$ converges for every starting vector $\mathbf{x}^{(0)}$ if and only if $\rho(M) < 1$.

Sufficient Conditions

Computing $\rho(M)$ is expensive. These conditions are easier to check:

- If $A$ is strictly diagonally dominant ($|a_{ii}| > \sum_{j \neq i} |a_{ij}|$ for every row $i$), then the Jacobi and Gauss-Seidel iterations converge.
- If $A$ is symmetric positive definite, Gauss-Seidel converges.

Proof 3

Strict diagonal dominance implies $\|M\|_\infty < 1$ for the Jacobi iteration matrix $M = D^{-1}(L+U)$, and $\rho(M) \leq \|M\|$ for any induced matrix norm.
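A quick numerical illustration of this bound, using a made-up strictly diagonally dominant matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.5],
              [1.0, 5.0, 2.0],
              [0.0, 1.5, 3.0]])        # strictly diagonally dominant (made-up example)

D = np.diag(np.diag(A))
M_jacobi = np.linalg.inv(D) @ (D - A)  # Jacobi iteration matrix D^{-1}(L+U)

norm_inf = np.linalg.norm(M_jacobi, np.inf)
rho = max(abs(np.linalg.eigvals(M_jacobi)))
print(norm_inf, rho)                   # both < 1, and rho <= norm_inf
```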

Connection to ODE Discretization

The same framework appears in time-stepping for ODEs!

For $\mathbf{y}' = A\mathbf{y}$, implicit Euler gives:

$\mathbf{y}_{n+1} = \mathbf{y}_n + h A\mathbf{y}_{n+1}$

Rearranging:

$(I - hA)\mathbf{y}_{n+1} = \mathbf{y}_n \implies \mathbf{y}_{n+1} = (I - hA)^{-1}\mathbf{y}_n$

Stability depends on $(I - hA)^{-1}$ being bounded — the Neumann series tells you when!
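A minimal sketch of the implicit Euler step; in practice one solves $(I - hA)\mathbf{y}_{n+1} = \mathbf{y}_n$ each step rather than forming the inverse. The matrix $A$ and step size $h$ below are made-up illustrative values:

```python
import numpy as np

# Implicit Euler for y' = A y
A = np.array([[-2.0, 1.0],
              [0.0, -3.0]])   # eigenvalues -2, -3: negative real parts
h = 0.1
y = np.array([1.0, 1.0])

for n in range(100):
    # (I - hA) y_{n+1} = y_n  -- solve the system instead of forming (I - hA)^{-1}
    y = np.linalg.solve(np.eye(2) - h * A, y)

print(y)   # decays toward 0, as the true solution does
```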

Remark 2 (The Stability Connection)
| Context | Fixed Point Form | Convergence/Stability |
| --- | --- | --- |
| Iterative solver | $\mathbf{x} = M\mathbf{x} + \mathbf{c}$ | $\rho(M) < 1$ |
| Implicit Euler | $\mathbf{y}_{n+1} = (I-hA)^{-1}\mathbf{y}_n$ | Eigenvalues in stability region |
| ODE equilibrium | $\mathbf{0} = A\mathbf{x}^*$ | $\text{Re}(\lambda_i) < 0$ |

The same spectral analysis underlies all three!

The Deeper Structure: Fredholm Theory

For compact operators $K$ (think: integral equations, discretized PDEs), the equation $(I - K)\mathbf{x} = \mathbf{c}$ has special structure:

When the inverse fails to exist, the Fredholm alternative applies: either a unique solution exists, or the kernel is finite-dimensional and solvability depends on $\mathbf{c}$ being orthogonal to the cokernel.

Beyond Classical Methods

Successive Over-Relaxation (SOR)

Gauss-Seidel can be accelerated with a relaxation parameter $\omega \in (0, 2)$:

$x_i^{(k+1)} = (1-\omega)x_i^{(k)} + \omega \cdot (\text{Gauss-Seidel update})$

With $\omega = 1$ you get standard Gauss-Seidel; $\omega > 1$ (over-relaxation) can dramatically accelerate convergence if $\omega$ is chosen well; $\omega < 1$ (under-relaxation) increases stability. The optimal $\omega$ depends on the spectrum of $M_{GS}$.
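A minimal SOR sketch built on the Gauss-Seidel sweep above; the choice $\omega = 1.1$ is arbitrary for illustration, not an optimized value:

```python
import numpy as np

def sor(A, b, omega=1.5, x0=None, tol=1e-10, max_iter=500):
    """Successive over-relaxation: blend the old value with the Gauss-Seidel update."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            gs_update = (b[i] - s) / A[i, i]              # the Gauss-Seidel value
            x[i] = (1 - omega) * x[i] + omega * gs_update  # relaxed update
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
print(np.allclose(sor(A, b, omega=1.1), np.linalg.solve(A, b)))   # True
```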

Modern Krylov Subspace Methods

Classical methods (Jacobi, Gauss-Seidel, SOR) are rarely used as standalone solvers in modern large-scale computing. Krylov subspace methods dominate:

| Method | For | Key Property |
| --- | --- | --- |
| Conjugate Gradient (CG) | SPD systems | Optimal in energy norm |
| GMRES | General systems | Minimizes residual |
| BiCGSTAB | Non-symmetric | Memory efficient |

These methods search for solutions in the Krylov subspace $\mathcal{K}_k(A, \mathbf{b}) = \text{span}\{\mathbf{b}, A\mathbf{b}, A^2\mathbf{b}, \ldots, A^{k-1}\mathbf{b}\}$ — finding good approximations using only matrix-vector products, never forming $A^{-1}$.
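As a sketch of how this looks in practice with SciPy; the tridiagonal system below is a made-up SPD example (the 1D Laplacian stencil), not from the text:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# A made-up SPD tridiagonal system
n = 100
A = diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# CG needs only matrix-vector products with A; it never forms A^{-1}
x, info = cg(A, b)
print(info, np.linalg.norm(A @ x - b))   # info == 0 on successful convergence
```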

When to Use What

| Situation | Method |
| --- | --- |
| Small/medium dense $A$ | Direct (LU, QR) |
| Large sparse $A$, SPD | Conjugate Gradient |
| Large sparse $A$, general | GMRES, BiCGSTAB |
| Multiple right-hand sides | Direct (reuse factorization) |
| Only $A\mathbf{v}$ available | Krylov methods |

The classical methods matter for understanding the theory — the Neumann series perspective, the role of $\rho(M)$, the connection to fixed points and ODE stability. Modern methods build on these ideas with more sophisticated polynomial approximations in Krylov subspaces.