Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Big Idea

Some linear systems are cheap to solve: diagonal systems cost O(n)\mathcal{O}(n), triangular systems cost O(n2)\mathcal{O}(n^2). Since a general system costs O(n3)\mathcal{O}(n^3), the strategy for solving Ax=bA\mathbf{x} = \mathbf{b} is to factorize AA into triangular (or orthogonal) factors, then solve the resulting easy systems.

Diagonal Systems

For a diagonal matrix D\mathcal{D}:

Dx=(d1dn)(x1xn)=(b1bn)\mathcal{D}\mathbf{x} = \begin{pmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}

The solution is immediate: xi=bi/dix_i = b_i / d_i for all ii.

Cost: nn divisions = O(n)\mathcal{O}(n) flops.

Upper Triangular Systems (Back Substitution)

For an upper triangular matrix U\mathcal{U}:

Ux=(u11u12u1nu22u2nunn)(x1x2xn)=(b1b2bn)\mathcal{U}\mathbf{x} = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ & & & u_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}

The last equation gives xnx_n directly:

unnxn=bn    xn=bnunnu_{nn} x_n = b_n \implies x_n = \frac{b_n}{u_{nn}}

Then work upward:

xi=bij=i+1nuijxjuii,i=n1,n2,,1x_i = \frac{b_i - \sum_{j=i+1}^{n} u_{ij} x_j}{u_{ii}}, \quad i = n-1, n-2, \ldots, 1

Algorithm 1 (Back Substitution)

Input: Upper triangular URn×n\mathcal{U} \in \mathbb{R}^{n \times n}, right-hand side bRn\mathbf{b} \in \mathbb{R}^n

Output: Solution x\mathbf{x} of Ux=b\mathcal{U}\mathbf{x} = \mathbf{b}

  1. xnbn/unnx_n \gets b_n / u_{nn}

  2. for i=n1,n2,,1i = n-1, n-2, \ldots, 1:

  3. xi(bij=i+1nuijxj)/uii\qquad x_i \gets \left(b_i - \sum_{j=i+1}^{n} u_{ij} x_j\right) / u_{ii}

Cost: O(n2)\mathcal{O}(n^2) flops.

Lower Triangular Systems (Forward Substitution)

For a lower triangular matrix L\mathcal{L}, the first equation gives x1x_1:

l11x1=b1    x1=b1l11l_{11} x_1 = b_1 \implies x_1 = \frac{b_1}{l_{11}}

Then work downward:

xi=bij=1i1lijxjlii,i=2,3,,nx_i = \frac{b_i - \sum_{j=1}^{i-1} l_{ij} x_j}{l_{ii}}, \quad i = 2, 3, \ldots, n

Algorithm 2 (Forward Substitution)

Input: Lower triangular LRn×n\mathcal{L} \in \mathbb{R}^{n \times n}, right-hand side bRn\mathbf{b} \in \mathbb{R}^n

Output: Solution x\mathbf{x} of Lx=b\mathcal{L}\mathbf{x} = \mathbf{b}

  1. x1b1/l11x_1 \gets b_1 / l_{11}

  2. for i=2,3,,ni = 2, 3, \ldots, n:

  3. xi(bij=1i1lijxj)/lii\qquad x_i \gets \left(b_i - \sum_{j=1}^{i-1} l_{ij} x_j\right) / l_{ii}

Cost: O(n2)\mathcal{O}(n^2) flops.

Example 1 (Back Substitution)

Solve Ux=b\mathcal{U}\mathbf{x} = \mathbf{b} where:

(213021004)(x1x2x3)=(114)\begin{pmatrix} 2 & 1 & 3 \\ 0 & 2 & 1 \\ 0 & 0 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 4 \end{pmatrix}

Working from the bottom:

  • x3=4/4=1x_3 = 4/4 = 1

  • x2=(111)/2=1x_2 = (-1 - 1 \cdot 1)/2 = -1

  • x1=(11(1)31)/2=1/2x_1 = (1 - 1 \cdot (-1) - 3 \cdot 1)/2 = -1/2

Solution: x=(1/2,1,1)T\mathbf{x} = (-1/2, -1, 1)^T.

Why This Matters: Matrix Factorizations

The key observation: triangular systems cost O(n2)\mathcal{O}(n^2) to solve, while a general system costs O(n3)\mathcal{O}(n^3). This motivates the central strategy of numerical linear algebra: factorize the matrix into simpler pieces.

The naive approach to solving Ax=bA\mathbf{x} = \mathbf{b} is to compute A1A^{-1} and then x=A1b\mathbf{x} = A^{-1}\mathbf{b}. This is a bad idea: computing A1A^{-1} costs O(n3)\mathcal{O}(n^3) operations, is numerically less stable than alternatives, and does not exploit structure. Instead, we write AA as a product of matrices that are easy to solve with.

The Main Factorizations

LU factorization writes A=PLUA = PLU where PP is a permutation matrix (row swaps), LL is lower triangular (with ones on the diagonal), and UU is upper triangular. You have seen this before: it is Gaussian elimination (Math 235) organized as a matrix product. Solving Ax=bA\mathbf{x} = \mathbf{b} becomes PLUx=bPLU \mathbf{x} = \mathbf{b} which reduces to:

  1. Permute: PbP\mathbf{b}

  2. Forward substitution: solve Ly=PbL\mathbf{y} = P\mathbf{b}

  3. Back substitution: solve Ux=yU\mathbf{x} = \mathbf{y}

Steps 2 and 3 are each O(n2)\mathcal{O}(n^2). The expensive part is computing the factorization itself (O(n3)\mathcal{O}(n^3)), but this is done only once. Solving for additional right-hand sides costs only O(n2)\mathcal{O}(n^2) each. See the appendix for details on LU decomposition.

QR factorization writes A=QRA = QR where QQ is orthogonal and RR is upper triangular. This is the Gram-Schmidt process (Math 235) rewritten as a matrix factorization. Then Ax=bA\mathbf{x} = \mathbf{b} becomes Rx=QTbR\mathbf{x} = Q^T\mathbf{b}: a matrix-vector multiply followed by back substitution. QR is the focus of this chapter because it is more numerically stable than LU and extends naturally to rectangular matrices and least squares problems.

SVD (singular value decomposition) writes A=UΣVTA = U\Sigma V^T where UU and VV are orthogonal and Σ\Sigma is diagonal. This is connected to eigenvector diagonalization and the fundamental theorem of linear algebra (Math 545). The SVD is the most powerful factorization for analysis (it reveals rank, conditioning, and the geometry of the linear map), but it is also the most expensive to compute.