Some linear systems are cheap to solve: diagonal systems cost O(n),
triangular systems cost O(n2). Since a general system costs
O(n3), the strategy for solving Ax=b is to
factorizeA into triangular (or orthogonal) factors, then solve the
resulting easy systems.
The key observation: triangular systems cost O(n2) to solve, while
a general system costs O(n3). This motivates the central strategy
of numerical linear algebra: factorize the matrix into simpler pieces.
The naive approach to solving Ax=b is to compute
A−1 and then x=A−1b. This is a bad idea:
computing A−1 costs O(n3) operations, is numerically less
stable than alternatives, and does not exploit structure. Instead, we write A
as a product of matrices that are easy to solve with.
LU factorization writes A=PLU where P is a permutation matrix (row
swaps), L is lower triangular (with ones on the diagonal), and U is upper
triangular. You have seen this before: it is
Gaussian elimination (Math 235)
organized as a matrix product. Solving Ax=b
becomes PLUx=b which reduces to:
Permute: Pb
Forward substitution: solve Ly=Pb
Back substitution: solve Ux=y
Steps 2 and 3 are each O(n2). The expensive part is computing the
factorization itself (O(n3)), but this is done only once. Solving
for additional right-hand sides costs only O(n2) each. See the
appendix for details on LU decomposition.
QR factorization writes A=QR where Q is orthogonal and R is upper
triangular. This is the Gram-Schmidt
process (Math 235) rewritten as a matrix
factorization. Then Ax=b becomes
Rx=QTb: a matrix-vector multiply followed by back
substitution. QR is the focus of this chapter because it is more numerically
stable than LU and extends naturally to rectangular matrices and least squares
problems.
SVD (singular value decomposition)
writes A=UΣVT where U and
V are orthogonal and Σ is diagonal. This is connected to eigenvector
diagonalization and the fundamental theorem of linear algebra
(Math 545). The
SVD is the most powerful factorization for analysis (it reveals rank,
conditioning, and the geometry of the linear map), but it is also the most
expensive to compute.