$r_{ij} = \langle a_j, q_i \rangle$ for $i < j$ (the projection coefficients)
$r_{jj} = \|v_j\|$ (the normalization factor)
Remark 1 (The Parallel: Gaussian Elimination to LU, Gram-Schmidt to QR)
| Algorithm | What it does | Matrix factorization |
|---|---|---|
| Gaussian elimination | Row operations to get upper triangular | $A = LU$ |
| Gram-Schmidt | Orthogonalization to get orthonormal basis | $A = QR$ |
In both cases:

- The algorithm performs a sequence of operations on columns/rows
- Recording these operations as matrices gives the factorization
- $L$ stores the elimination multipliers; $R$ stores the projection coefficients
- $U$ is the reduced form; $Q$ is the orthonormal basis
Making This Concrete:
Gaussian elimination: Each step subtracts a multiple of one row from another. The multipliers $\ell_{ij}$ go into $L$; the result is $U$.
Gram-Schmidt: Each step subtracts projections onto previous vectors. The projection coefficients $r_{ij} = \langle a_j, q_i \rangle$ go into $R$; the orthonormal vectors form $Q$.
The triangular structure of $R$ reflects the sequential nature of the algorithm: $a_j$ is only projected onto $q_1, \dots, q_{j-1}$, so $r_{ij} = 0$ for $i > j$.
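The correspondence above can be sketched directly in code. The following is a minimal classical Gram-Schmidt QR factorization (function name and NumPy usage are my own illustration, not notation from the text): the coefficients $r_{ij} = \langle a_j, q_i \rangle$ fill the upper triangle of $R$, and $r_{jj} = \|v_j\|$ sits on the diagonal.

```python
import numpy as np

def classical_gram_schmidt(A):
    """QR factorization via classical Gram-Schmidt.

    Column a_j is projected onto the previously computed orthonormal
    vectors q_1, ..., q_{j-1}; those coefficients r_ij fill the upper
    triangle of R, and the normalized remainder becomes q_j.
    """
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float)          # start from the original column a_j
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]    # r_ij = <a_j, q_i> (classical: uses a_j)
            v -= R[i, j] * Q[:, i]         # subtract the projection
        R[j, j] = np.linalg.norm(v)        # r_jj = ||v_j||
        Q[:, j] = v / R[j, j]
    return Q, R
```

By construction $R$ is upper triangular ($r_{ij} = 0$ for $i > j$), and stacking the update equations column by column recovers $A = QR$.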
In floating-point arithmetic, however, classical Gram-Schmidt loses orthogonality, and the loss scales with the condition number $\kappa(A)$: for ill-conditioned matrices, the computed "orthonormal" vectors can be far from orthogonal. This defeats the entire purpose of orthogonalization!
In classical Gram-Schmidt, we compute all projections using the original vector $a_j$, then subtract them all at once. Rounding errors in the projections accumulate.
In modified Gram-Schmidt, we subtract each projection immediately, then compute the next projection using the updated vector. Each projection is computed against a vector that is already more orthogonal to the previous $q_i$'s.
Analogy: if you're trying to make a vector orthogonal to $q_1$ and $q_2$, classical GS computes both corrections from the original vector and subtracts them together. Modified GS first makes the vector orthogonal to $q_1$, then computes how to make that result orthogonal to $q_2$.
The second approach is more accurate because the second projection starts from a better approximation.
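In code, the change from classical to modified Gram-Schmidt is a single line: the projection coefficient is computed against the partially orthogonalized vector `v` instead of the original column. A sketch (names are my own):

```python
import numpy as np

def modified_gram_schmidt(A):
    """QR factorization via modified Gram-Schmidt.

    Identical to classical GS except that each coefficient r_ij is
    computed from the *updated* vector v, and each projection is
    subtracted immediately before the next one is computed.
    """
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float)
        for i in range(j):
            R[i, j] = Q[:, i] @ v          # modified: uses v, not A[:, j]
            v -= R[i, j] * Q[:, i]         # subtract immediately
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R
```

In exact arithmetic the two variants produce identical $Q$ and $R$; the difference only shows up through rounding error.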
However: Modified Gram-Schmidt still has orthogonality loss proportional to $\kappa(A) \cdot \varepsilon_{\text{mach}}$. For ill-conditioned matrices, this can still be unacceptable. We need a different approach.
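One way to see the remaining gap (my own illustrative experiment, not from the text) is to compare modified Gram-Schmidt against `np.linalg.qr`, which uses Householder reflections via LAPACK and keeps $\|Q^\top Q - I\|$ near machine epsilon regardless of conditioning:

```python
import numpy as np

def mgs_q(A):
    """Orthonormal factor from modified Gram-Schmidt (row-oriented form)."""
    Q = A.astype(float)
    for j in range(Q.shape[1]):
        Q[:, j] /= np.linalg.norm(Q[:, j])
        # Immediately remove the q_j component from all later columns
        Q[:, j + 1:] -= np.outer(Q[:, j], Q[:, j] @ Q[:, j + 1:])
    return Q

def orth_loss(Q):
    """|| Q^T Q - I ||_F: zero for an exactly orthonormal Q."""
    return np.linalg.norm(Q.T @ Q - np.eye(Q.shape[1]))

n = 12
H = 1.0 / (np.arange(n)[:, None] + np.arange(n) + 1)  # ill-conditioned Hilbert matrix

loss_mgs = orth_loss(mgs_q(H))
loss_householder = orth_loss(np.linalg.qr(H)[0])
# Householder stays near machine epsilon; MGS degrades with kappa(A)
```

This is exactly the motivation for the "different approach": Householder-based QR is backward stable and its $Q$ is orthonormal to machine precision.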