
Why Orthogonal Matrices Are Perfect

Why Not Gram-Schmidt?

Gram-Schmidt—even the modified version—is not the method of choice for computing QR in practice. The fundamental issue is that it works by subtracting projections, which leads to cancellation errors.

We need a different strategy: instead of orthogonalizing columns by subtraction, we'll triangularize $A$ by multiplication with orthogonal matrices:

$$Q_n \cdots Q_2 Q_1 A = R \quad \Rightarrow \quad A = \underbrace{Q_1^T Q_2^T \cdots Q_n^T}_{Q} R$$

Since orthogonal matrices preserve norms and don’t amplify errors, this approach is backward stable.
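
As a quick numerical illustration of this point (not part of the argument itself; it uses NumPy's built-in QR merely to obtain an orthogonal $Q$):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))  # an orthogonal matrix
x = rng.standard_normal(200)

# Multiplying by Q preserves the 2-norm, so perturbations in x are not amplified.
print(np.linalg.norm(Q @ x) / np.linalg.norm(x))   # ~ 1.0
print(np.linalg.norm(Q.T @ Q - np.eye(200)))       # ~ a modest multiple of machine epsilon
```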

The QR Factorization

Householder Reflections

The stable way to compute QR uses Householder reflections—orthogonal matrices that reflect vectors across hyperplanes.

Key properties: for a unit vector $v$, the reflector $H = I - 2vv^T$ is symmetric ($H^T = H$), orthogonal ($H^T H = I$), and involutory ($H^2 = I$). Geometrically, it reflects vectors across the hyperplane orthogonal to $v$.
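
A minimal NumPy check of these properties (an illustrative sketch; the vector $v$ here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(5)
v /= np.linalg.norm(v)                  # unit normal of the reflection hyperplane

H = np.eye(5) - 2.0 * np.outer(v, v)    # Householder reflector

print(np.allclose(H, H.T))              # symmetric
print(np.allclose(H.T @ H, np.eye(5)))  # orthogonal
print(np.allclose(H @ H, np.eye(5)))    # involutory: reflecting twice is the identity
```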

The Householder QR Algorithm

The idea: introduce zeros below the diagonal, column by column, using Householder reflections.

Step 1: Find a Householder reflector $H_1$ that zeros out $A_{2:m,1}$:

$$H_1 A = \begin{pmatrix} * & * & \cdots & * \\ 0 & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}$$

Step 2: Find $H_2$ that zeros out $(H_1 A)_{3:m,2}$, leaving column 1 unchanged:

$$H_2 H_1 A = \begin{pmatrix} * & * & * & \cdots & * \\ 0 & * & * & \cdots & * \\ 0 & 0 & * & \cdots & * \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & * & \cdots & * \end{pmatrix}$$

Continue until the matrix is upper triangular:

$$H_n \cdots H_2 H_1 A = R$$

Therefore:

$$A = \underbrace{H_1 H_2 \cdots H_n}_{Q} R$$

Since each $H_i$ is orthogonal, their product $Q$ is orthogonal.
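
Putting the steps together, here is a compact NumPy sketch of Householder QR (illustrative only: $Q$ is accumulated explicitly for clarity, and the sign choice in the Householder vector is explained in the next subsection):

```python
import numpy as np

def householder_qr(A):
    # Illustrative Householder QR: returns Q (m x m) and R (m x n) with A = Q R.
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    R = A.copy()
    Q = np.eye(m)
    for k in range(min(n, m - 1)):
        x = R[k:, k]
        # Householder vector; the sign is chosen to avoid cancellation (see below).
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        nv = np.linalg.norm(v)
        if nv == 0.0:                      # column is already zero below the diagonal
            continue
        v /= nv
        # Apply H_k = I - 2 v v^T to the trailing block of R; accumulate Q = H_1 H_2 ...
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:]  -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

A = np.random.default_rng(2).standard_normal((6, 4))
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0))
```

In practice, libraries such as LAPACK do not form $Q$ at all; they store the reflector vectors and apply them on demand.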

Computing the Householder Vector

To zero out everything below $a_1$ in a vector $a$, we need $Ha$ to be a multiple of $e_1$ of the same length: $Ha = \pm\|a\|_2 e_1$.

The Householder vector is:

$$v = a + \text{sign}(a_1) \|a\|_2 e_1$$

then normalize: $v \leftarrow v / \|v\|_2$.

Why the sign? To avoid cancellation when $a$ is nearly parallel to $e_1$.
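
A small experiment (illustrative, with a hand-picked vector) showing what goes wrong without the sign choice when $a$ is nearly parallel to $e_1$:

```python
import numpy as np

a = np.array([1.0, 1e-9, 1e-9])
na = np.linalg.norm(a)

# Naive choice v = a - ||a|| e1: the first component suffers catastrophic cancellation,
# since a_1 and ||a|| agree to far more digits than double precision carries.
v_bad = a.copy()
v_bad[0] -= na

# Recommended choice v = a + sign(a_1) ||a|| e1: adds two numbers of the same sign.
v_good = a.copy()
v_good[0] += np.copysign(na, a[0])

def reflect(v, x):
    v = v / np.linalg.norm(v)
    return x - 2.0 * v * (v @ x)

# In exact arithmetic both map a to (±||a||, 0, 0); in floating point the
# cancelled vector leaves entries of size ~1e-9 where zeros should appear.
print(reflect(v_bad, a))
print(reflect(v_good, a))
```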

Comparison: Gram-Schmidt vs Householder

| Aspect | Gram-Schmidt | Householder |
| --- | --- | --- |
| Approach | Orthogonalize columns | Introduce zeros |
| Computes | $Q$ explicitly | $Q$ implicitly (as product of $H_i$) |
| Stability | Can lose orthogonality | Backward stable |
| $\Vert Q^T Q - I \Vert$ | $O(\kappa(A)\epsilon)$ | $O(\epsilon)$ |
| Cost | $2mn^2$ | $2mn^2 - \frac{2}{3}n^3$ |

Why Householder is Backward Stable

Proof 1

Key insight 1: Orthogonal transformations preserve norms.

Each Householder reflector $H_k$ satisfies $\|H_k\|_2 = 1$. When we compute $H_k x$ in floating point, the error is:

$$\text{fl}(H_k x) = H_k x + \delta_k \quad \text{with} \quad \|\delta_k\| = O(\varepsilon_{\text{mach}}) \|x\|$$

The error is proportional to $\|x\|$, not amplified.
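
One way to see this empirically (a rough illustration, comparing a single-precision computation against a double-precision reference):

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.standard_normal(100)
v /= np.linalg.norm(v)
H = np.eye(100) - 2.0 * np.outer(v, v)
x = 1e6 * rng.standard_normal(100)      # the scale of x is irrelevant to the relative error

ref    = H @ x                                                       # double-precision reference
approx = (H.astype(np.float32) @ x.astype(np.float32)).astype(float)  # single-precision compute

# The error is a small multiple of eps * ||x||; it is not amplified by H.
print(np.linalg.norm(approx - ref) / np.linalg.norm(x), np.finfo(np.float32).eps)
```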

Key insight 2: No cancellation.

Gram-Schmidt computes:

$$v_j = a_j - \sum_{i<j} \langle a_j, q_i \rangle \, q_i \quad \text{(subtraction} \to \text{cancellation risk)}$$

Householder computes:

$$H_k a = \pm\|a\|_2 e_1 \quad \text{(orthogonal transformation} \to \text{no cancellation)}$$

The reflector introduces zeros by rotation/reflection, not by subtracting nearly equal quantities.

Key insight 3: Errors don’t accumulate badly.

After nn Householder transformations:

$$H_n \cdots H_1 A = R + E \quad \text{with} \quad \|E\| = O(n \varepsilon_{\text{mach}}) \|A\|$$

The factor of $n$ (not $2^n$!) comes from the fact that each orthogonal transformation only adds $O(\varepsilon_{\text{mach}})$ relative error.

Contrast with Gram-Schmidt: The loss of orthogonality in Gram-Schmidt scales as $\kappa(A) \cdot \varepsilon_{\text{mach}}$. For ill-conditioned matrices, this can be catastrophic. Householder QR maintains orthogonality to $O(\varepsilon_{\text{mach}})$ regardless of conditioning.
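
The contrast is easy to reproduce numerically. A sketch comparing classical Gram-Schmidt against NumPy's Householder-based QR (`np.linalg.qr` calls LAPACK) on an ill-conditioned Vandermonde matrix:

```python
import numpy as np

def classical_gram_schmidt(A):
    # Classical Gram-Schmidt QR, for comparison only (numerically fragile).
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

# Ill-conditioned test matrix: a Vandermonde matrix of moderately high degree.
A = np.vander(np.linspace(0, 1, 60), 15, increasing=True)
print("cond(A) =", np.linalg.cond(A))

Q_gs, _ = classical_gram_schmidt(A)
Q_hh, _ = np.linalg.qr(A)                      # Householder QR via LAPACK

I = np.eye(A.shape[1])
print("Gram-Schmidt ||Q^T Q - I|| =", np.linalg.norm(Q_gs.T @ Q_gs - I))
print("Householder  ||Q^T Q - I|| =", np.linalg.norm(Q_hh.T @ Q_hh - I))
```

The first residual grows with $\kappa(A)$, while the second stays near machine precision.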

Properties of the QR Factorization

Applications of QR

  1. Solving least squares: More stable than normal equations

  2. Eigenvalue algorithms: QR iteration

  3. Orthonormal basis: For column space of $A$

  4. Rank determination: Via the diagonal of $R$

  5. Linear system solving: When $A$ is square

Cost Analysis

For $A \in \mathbb{R}^{m \times n}$:

| Operation | Flops |
| --- | --- |
| QR factorization | $2mn^2 - \frac{2}{3}n^3$ |
| Compute $Q$ explicitly | Additional $2mn^2$ |
| Apply $Q^T$ to a vector | $4mn - 2n^2$ |

For least squares, we often don't need $Q$ explicitly—just $Q^T b$.
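
For example, a least-squares solve via QR needs only $Q^T b$ and a triangular solve (here $Q$ is formed explicitly for readability; LAPACK-style codes instead apply the stored reflectors to $b$ directly):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((100, 5))
b = rng.standard_normal(100)

# Reduced QR: A = Q R with Q of size 100 x 5 and R upper triangular 5 x 5.
Q, R = np.linalg.qr(A)

# Least squares: argmin ||Ax - b||_2 satisfies R x = Q^T b.
x = np.linalg.solve(R, Q.T @ b)

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))
```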