Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Big Idea

A numerical algorithm cannot avoid the sensitivity inherent in the problem it solves. The best it can do is avoid making things worse. Backward stability is the formal statement of this goal: the computed answer is the exact solution of a slightly perturbed problem. Combined with the condition number, this gives the tightest error bound we can hope for.

Forward and Backward Error

When an algorithm computes an approximate solution x^\hat{x} to a problem f(x)=yf(x) = y, there are two ways to measure how good x^\hat{x} is.

Definition 1 (Forward and Backward Error)

For a problem f:XYf: X \to Y with true solution y=f(x)y = f(x) and computed solution y^\hat{y}:

Forward error: how far is y^\hat{y} from the true answer?

forward error=y^f(x)\text{forward error} = \|\hat{y} - f(x)\|

Backward error: what is the smallest perturbation δx\delta x such that y^\hat{y} is the exact answer to the perturbed problem?

backward error=min{δx:f(x+δx)=y^}\text{backward error} = \min \{ \|\delta x\| : f(x + \delta x) = \hat{y} \}

For linear systems, the problem is f(A,b)=A1bf(A, \mathbf{b}) = A^{-1}\mathbf{b}. An algorithm produces x^\hat{\mathbf{x}}. Then:

Forward error is what we care about. Backward error is what we can control.

Theorem 1 (The Golden Rule)

relative forward errorκ×relative backward error\text{relative forward error} \lesssim \kappa \times \text{relative backward error}

where κ\kappa is the condition number of the problem.

This separates concerns cleanly:

Neither can compensate for the other. A stable algorithm applied to an ill-conditioned problem still gives a poor answer. A brilliant problem formulation is wasted on an unstable algorithm.

Sensitivity of Linear Systems

How sensitive is the solution x\mathbf{x} to perturbations in AA and b\mathbf{b}?

A linear system Ax=bA\mathbf{x} = \mathbf{b} has two inputs: the matrix AA and the vector b\mathbf{b}. Both are subject to errors:

So we must understand how errors in both AA and b\mathbf{b} propagate to errors in x\mathbf{x}.

Theorem 2 (Sensitivity of Linear Systems)

For the linear system Ax=bA\mathbf{x} = \mathbf{b}:

δxxκ(A)(δAA+δbb)\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \left(\frac{\|\delta A\|}{\|A\|} + \frac{\|\delta\mathbf{b}\|}{\|\mathbf{b}\|}\right)

The quantity κ(A)=AA1\kappa(A) = \|A\| \|A^{-1}\| is the amplification factor from relative input perturbation to relative output error.

Proof 1

Start simple: perturb only b\mathbf{b}.

If Ax=bA\mathbf{x} = \mathbf{b}, then x=A1b\mathbf{x} = A^{-1}\mathbf{b}. Perturb bb+δb\mathbf{b} \to \mathbf{b} + \delta\mathbf{b}:

x+δx=A1(b+δb)δx=A1δb\mathbf{x} + \delta\mathbf{x} = A^{-1}(\mathbf{b} + \delta\mathbf{b}) \quad \Rightarrow \quad \delta\mathbf{x} = A^{-1}\delta\mathbf{b}

Taking norms: δxA1δb\|\delta\mathbf{x}\| \leq \|A^{-1}\| \|\delta\mathbf{b}\|

For relative error, we want δx/x\|\delta\mathbf{x}\|/\|\mathbf{x}\| in terms of δb/b\|\delta\mathbf{b}\|/\|\mathbf{b}\|.

The trick: use b=AxAx\|\mathbf{b}\| = \|A\mathbf{x}\| \leq \|A\| \|\mathbf{x}\|, so 1/xA/b1/\|\mathbf{x}\| \leq \|A\|/\|\mathbf{b}\|.

δxxA1δbAb=AA1κ(A)δbb\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|} \leq \|A^{-1}\| \|\delta\mathbf{b}\| \cdot \frac{\|A\|}{\|\mathbf{b}\|} = \underbrace{\|A\| \|A^{-1}\|}_{\kappa(A)} \cdot \frac{\|\delta\mathbf{b}\|}{\|\mathbf{b}\|}

The condition number emerges naturally!

Now perturb AA: Suppose (A+δA)(x+δx)=b(A + \delta A)(\mathbf{x} + \delta\mathbf{x}) = \mathbf{b}. Expanding:

Ax+Aδx+δAx+δAδx0=b\cancel{A\mathbf{x}} + A\delta\mathbf{x} + \delta A \cdot \mathbf{x} + \underbrace{\delta A \cdot \delta\mathbf{x}}_{\approx 0} = \cancel{\mathbf{b}}

Solving: δx=A1(δAx)\delta\mathbf{x} = -A^{-1}(\delta A \cdot \mathbf{x})

Taking norms and forming relative errors:

δxxA1AδAA=κ(A)δAA\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|} \leq \|A^{-1}\| \|A\| \cdot \frac{\|\delta A\|}{\|A\|} = \kappa(A) \cdot \frac{\|\delta A\|}{\|A\|}

Numerically Singular Matrices

Recall that the condition number κ(A)=AA1\kappa(A) = \|A\|\|A^{-1}\| measures the ratio of maximum to minimum stretch of a unit vector. The sensitivity theorem tells us what this means in practice.

Rule of thumb: Expect to lose log10κ(A)\log_{10}\kappa(A) digits of accuracy.

Condition NumberDigits Lost
κ10k\kappa \approx 10^k~kk digits
κ1/εmach1016\kappa \gtrsim 1/\varepsilon_{\text{mach}} \approx 10^{16}All digits

See the Condition Numbers and Lost Digits notebook for concrete demonstrations with diagonal, Hilbert, and Vandermonde matrices.

But what does it mean geometrically for a matrix to be ill-conditioned? How close is it to being singular? The answer involves the geometry of the set of singular matrices in the operator norm.

The singular set is closed (optional)

Lemma 1 (Σ\Sigma is closed)

The set of singular matrices Σ={MRn×n:det(M)=0}\Sigma = \{M \in \mathbb{R}^{n\times n} : \det(M) = 0\} is closed in the operator norm.

Proof 2

The determinant det:Rn×nR\det : \mathbb{R}^{n \times n} \to \mathbb{R} is a polynomial in the n2n^2 entries, so it is continuous. The singleton {0}\{0\} is closed in R\mathbb{R}, so Σ=det1({0})\Sigma = \det^{-1}(\{0\}) is closed as the preimage of a closed set under a continuous map.

Because Σ\Sigma is closed, every invertible matrix AA sits at some strictly positive distance from Σ\Sigma.

How far is an invertible matrix from Σ\Sigma? The following theorem gives the exact answer.

Theorem 3 (Distance to Singularity)

For any invertible ARn×nA \in \mathbb{R}^{n \times n} with the 2-norm, the distance to the set of singular matrices Σ={MRn×n:det(M)=0}\Sigma = \{M \in \mathbb{R}^{n\times n} : \det(M) = 0\} is

dist2(A,Σ)  =  minERn×nA+E singularE2  =  σmin(A)  =  A2κ2(A).\operatorname{dist}_2(A, \Sigma) \;=\; \min_{\substack{E \in \mathbb{R}^{n\times n} \\ A + E \text{ singular}}} \|E\|_2 \;=\; \sigma_{\min}(A) \;=\; \frac{\|A\|_2}{\kappa_2(A)}.

Proof 3

Recall that σmin(A)=minx=1Ax\sigma_{\min}(A) = \min_{\|\mathbf{x}\|=1} \|A\mathbf{x}\|.

Lower bound. Let EE be any perturbation such that A+EA + E is singular. Then there exists a unit vector z\mathbf{z} with (A+E)z=0(A + E)\mathbf{z} = \mathbf{0}, so Ez=AzE\mathbf{z} = -A\mathbf{z}. Therefore

E2Ez2=Az2σmin(A).\|E\|_2 \geq \|E\mathbf{z}\|_2 = \|A\mathbf{z}\|_2 \geq \sigma_{\min}(A).

Upper bound. Let v\mathbf{v} be a unit vector achieving σmin(A)=Av2\sigma_{\min}(A) = \|A\mathbf{v}\|_2. Set E=AvvTE^* = -A\mathbf{v}\,\mathbf{v}^T. Then E2=Av2=σmin(A)\|E^*\|_2 = \|A\mathbf{v}\|_2 = \sigma_{\min}(A), and

(A+E)v=AvAv(vTv)=0,(A + E^*)\mathbf{v} = A\mathbf{v} - A\mathbf{v}(\mathbf{v}^T\mathbf{v}) = \mathbf{0},

so A+EA + E^* is singular.

Combining: dist(A,Σ)=σmin(A)\operatorname{dist}(A, \Sigma) = \sigma_{\min}(A). The identity σmin=A2/κ2(A)\sigma_{\min} = \|A\|_2 / \kappa_2(A) follows from κ2(A)=σmax/σmin\kappa_2(A) = \sigma_{\max}/\sigma_{\min}.

Demmel’s insight

Example 1 (Numerically singular matrices)

A matrix with κ(A)1/εmach\kappa(A) \gtrsim 1/\varepsilon_{\text{mach}} is numerically indistinguishable from a singular matrix.

By Theorem 3, the relative distance from AA to the nearest singular matrix is

dist(A,Σ)A=σminσmax=1κ(A).\frac{\operatorname{dist}(A, \Sigma)}{\|A\|} = \frac{\sigma_{\min}}{\sigma_{\max}} = \frac{1}{\kappa(A)}.

Now consider floating-point storage. The computer does not see AA. It sees A~=A+E\tilde{A} = A + E where E/Aεmach\|E\|/\|A\| \sim \varepsilon_{\text{mach}}. The computer’s view of AA is a matrix somewhere inside a ball of radius εmachA\varepsilon_{\text{mach}}\|A\| centered at AA.

If κ(A)1/εmach\kappa(A) \gtrsim 1/\varepsilon_{\text{mach}}:

  • The distance to the nearest singular matrix is σmin=A/κ(A)εmachA\sigma_{\min} = \|A\|/\kappa(A) \lesssim \varepsilon_{\text{mach}}\|A\|.

  • The floating-point storage error is εmachA\sim \varepsilon_{\text{mach}}\|A\|.

  • The uncertainty ball overlaps Σ\Sigma. The computer cannot tell whether AA is singular or not.

This is why ill-conditioned systems are fundamentally hard. It is not a failure of the algorithm. The problem itself is on the boundary of being unsolvable.

The figure below illustrates this geometry. Each matrix AiA_i sits at distance σmin(Ai)\sigma_{\min}(A_i) from Σ\Sigma. The dashed circles show the floating-point uncertainty ball of radius εmachA\varepsilon_{\text{mach}}\|A\|. For A3A_3, the ball overlaps Σ\Sigma.

Source
<Figure size 900x550 with 1 Axes>

Residuals and Backward Error

Definition 2 (Residual)

Given a linear system Ax=bA\mathbf{x} = \mathbf{b} and a computed solution x^\hat{\mathbf{x}}, the residual is

r=bAx^\mathbf{r} = \mathbf{b} - A\hat{\mathbf{x}}

The residual is cheap to compute and directly measures the backward error.

Proposition 1 (The Residual Measures Backward Error)

The relative backward error (perturbing b\mathbf{b} only) equals r/b\|\mathbf{r}\|/\|\mathbf{b}\|. That is, x^\hat{\mathbf{x}} is the exact solution of the perturbed system Ax^=b+δbA\hat{\mathbf{x}} = \mathbf{b} + \delta\mathbf{b} with δb=r\delta\mathbf{b} = -\mathbf{r}.

Proof 4

By definition, r=bAx^\mathbf{r} = \mathbf{b} - A\hat{\mathbf{x}}, so rearranging:

Ax^=brA\hat{\mathbf{x}} = \mathbf{b} - \mathbf{r}

Setting δb=r\delta\mathbf{b} = -\mathbf{r}, we have Ax^=b+δbA\hat{\mathbf{x}} = \mathbf{b} + \delta\mathbf{b} exactly. The computed solution x^\hat{\mathbf{x}} is the exact solution of this perturbed system, with perturbation size

δbb=rb\frac{\|\delta\mathbf{b}\|}{\|\mathbf{b}\|} = \frac{\|\mathbf{r}\|}{\|\mathbf{b}\|}

This is the smallest perturbation to b\mathbf{b} (with AA fixed) that makes x^\hat{\mathbf{x}} an exact solution, so r/b\|\mathbf{r}\|/\|\mathbf{b}\| is the relative backward error.

Warning: Small Residual \neq Small Error

Combining the residual with the sensitivity theorem (Theorem 2):

x^xxκ(A)rb\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \cdot \frac{\|\mathbf{r}\|}{\|\mathbf{b}\|}

For ill-conditioned AA, a tiny residual can hide a large error. The residual measures backward error; you need to multiply by κ(A)\kappa(A) to bound forward error.

Forward Stability

Now we turn to algorithms. We have two candidates for what “good algorithm” should mean. The first is the most natural wish.

Definition 3 (Forward Stable Algorithm)

An algorithm for solving Ax=bA\mathbf{x} = \mathbf{b} is forward stable if the computed solution x^\hat{\mathbf{x}} satisfies:

x^xx=O(εmach)\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} = O(\varepsilon_{\text{mach}})

That is, the relative forward error is of order machine epsilon, regardless of the conditioning of AA.

This sounds ideal. Why not demand it?

Proposition 2 (Forward Stability Is Unattainable)

No algorithm for solving Ax=bA\mathbf{x} = \mathbf{b} in floating-point arithmetic can be forward stable for all nonsingular AA.

Proof 5

The obstacle is not algorithmic; it is inherent in floating-point representation.

The input is already perturbed. When we store AA and b\mathbf{b} in floating point, we do not have the true AA and b\mathbf{b}. We have A~\tilde{A} and b~\tilde{\mathbf{b}} with

A~AA=O(εmach),b~bb=O(εmach)\frac{\|\tilde{A} - A\|}{\|A\|} = O(\varepsilon_{\text{mach}}), \qquad \frac{\|\tilde{\mathbf{b}} - \mathbf{b}\|}{\|\mathbf{b}\|} = O(\varepsilon_{\text{mach}})

Even before any computation begins, the problem has been perturbed. By the sensitivity theorem (Theorem 2), the exact solution of the perturbed problem already differs from x\mathbf{x} by

x~xx=O(κ(A)εmach)\frac{\|\tilde{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} = O(\kappa(A) \cdot \varepsilon_{\text{mach}})

No algorithm can undo this. The input perturbation gets amplified by κ(A)\kappa(A). An algorithm that achieved O(εmach)O(\varepsilon_{\text{mach}}) forward error regardless of κ(A)\kappa(A) would have to “know” the true AA despite only having access to A~\tilde{A}.

Forward stability requires κ(A)=O(1)\kappa(A) = O(1), which is a condition on the problem, not the algorithm. For well-conditioned problems, forward and backward stability coincide. For ill-conditioned problems, forward stability is unachievable by any algorithm.

Backward Stability: The Right Standard

Since we cannot control forward error directly, we control what we can: the backward error. This leads to the Higham standard.

Definition 4 (Backward Stable Algorithm)

An algorithm for solving Ax=bA\mathbf{x} = \mathbf{b} is backward stable if the computed solution x^\hat{\mathbf{x}} satisfies:

(A+δA)x^=b+δb,δAA=O(εmach),δbb=O(εmach)(A + \delta A)\hat{\mathbf{x}} = \mathbf{b} + \delta\mathbf{b}, \quad \frac{\|\delta A\|}{\|A\|} = O(\varepsilon_{\text{mach}}), \quad \frac{\|\delta\mathbf{b}\|}{\|\mathbf{b}\|} = O(\varepsilon_{\text{mach}})

That is, x^\hat{\mathbf{x}} is the exact solution of a nearby problem.

Why this is the right standard. A backward stable algorithm introduces perturbations no larger than those already present from storing AA and b\mathbf{b} in floating point. The algorithm does not make the situation any worse than it already is. You cannot ask for more than that.

Combined with the sensitivity theorem (Theorem 2):

x^xxκ(A)εmach\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \cdot \varepsilon_{\text{mach}}

This is the optimal forward error bound. It is optimal because:

  1. The factor εmach\varepsilon_{\text{mach}} comes from the backward error, which is as small as possible.

  2. The factor κ(A)\kappa(A) is intrinsic to the problem. No algorithm can remove it.

A backward stable algorithm achieves the best forward error that any algorithm can achieve on that problem. If the answer is not accurate enough, the problem is to blame, not the algorithm.

Example 2 (Why backward stability can still give a “bad” answer)

Consider the ill-conditioned 2×22 \times 2 system

A=(1111+1012),b=(22+1012).A = \begin{pmatrix} 1 & 1 \\ 1 & 1 + 10^{-12} \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} 2 \\ 2 + 10^{-12} \end{pmatrix}.

The true solution is x=(1,1)T\mathbf{x} = (1, 1)^T. Here κ(A)4×1012\kappa(A) \approx 4 \times 10^{12}.

Run a backward stable solver (say Householder QR) in double precision (εmach1016\varepsilon_{\text{mach}} \approx 10^{-16}). You might get, for example,

x^=(1.00040.9996).\hat{\mathbf{x}} = \begin{pmatrix} 1.0004 \\ 0.9996 \end{pmatrix}.

This answer is wrong in the 4th digit. But the algorithm is blameless. To see why, compute the residual:

Ax^b=(1.0004+0.99961.0004+0.9996(1+1012))(22+1012)=(04×1016).A\hat{\mathbf{x}} - \mathbf{b} = \begin{pmatrix} 1.0004 + 0.9996 \\ 1.0004 + 0.9996\,(1 + 10^{-12}) \end{pmatrix} - \begin{pmatrix} 2 \\ 2 + 10^{-12} \end{pmatrix} = \begin{pmatrix} 0 \\ -4 \times 10^{-16} \end{pmatrix}.

So setting δA=0\delta A = 0 and δb=Ax^b=(0,4×1016)T\delta\mathbf{b} = A\hat{\mathbf{x}} - \mathbf{b} = (0,\, -4 \times 10^{-16})^T gives

(A+δA)x^=b+δb(A + \delta A)\hat{\mathbf{x}} = \mathbf{b} + \delta\mathbf{b}

exactly, with δb/b2×1016\|\delta\mathbf{b}\|/\|\mathbf{b}\| \approx 2 \times 10^{-16}. The computed x^\hat{\mathbf{x}} is the true solution of a right-hand side that agrees with b\mathbf{b} to 16 digits. (One could equally absorb the residual into δA\delta A via δA=rx^T/x^2\delta A = \mathbf{r}\hat{\mathbf{x}}^T / \|\hat{\mathbf{x}}\|^2; the backward error is not unique.)

Where did the 4 digits go? The condition number amplifies the input uncertainty:

x^xxκ(A)εmach4×10121016=4×104.\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \cdot \varepsilon_{\text{mach}} \approx 4 \times 10^{12} \cdot 10^{-16} = 4 \times 10^{-4}.

The lost accuracy is real, but it was lost the moment AA was rounded into floating point. Any other backward stable algorithm produces an answer in the same neighborhood. The only way to do better is to reduce κ(A)\kappa(A) by reformulating the problem, or to use higher precision.

“Exact solution of a nearby problem” is exactly as strong as the data deserves. For well-conditioned AA, nearby problems have nearby solutions, so the answer is accurate. For ill-conditioned AA, nearby problems can have very different solutions, so accuracy is unavailable regardless of the algorithm.

Remark 1 (Which Algorithms Are Backward Stable?)

  • Householder QR is backward stable (see Theorem 2).

  • LU with partial pivoting is backward stable in practice, though the theoretical worst case allows growth factor 2n12^{n-1} (see the stability notebook).

  • Classical Gram-Schmidt is not backward stable: the loss of orthogonality scales with κ(A)\kappa(A), not εmach\varepsilon_{\text{mach}}.

Practical Guideline: Always Check the Condition Number

Since the forward error satisfies

x^xxκ(A)εmach\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\mathbf{x}\|} \lesssim \kappa(A) \cdot \varepsilon_{\text{mach}}

we should always estimate κ(A)\kappa(A) before trusting the solution. But how?

The Challenge

Computing κ(A)=AA1\kappa(A) = \|A\| \|A^{-1}\| exactly requires A1A^{-1}, which costs O(n3)O(n^3) operations, as expensive as solving the system! We need a cheaper approach.

Hager’s Algorithm: A Clever Trick

The key insight (Hager, 1984; refined by Higham) is that we can estimate A1\|A^{-1}\| using only a few solves with the already-factored matrix.

Intuitive Derivation

We want B1\|B\|_1 where B=A1B = A^{-1}, without forming BB. Recall

B1=maxx0Bx1x1=maxx1=1Bx1.\|B\|_1 = \max_{\mathbf{x} \neq 0} \frac{\|B\mathbf{x}\|_1}{\|\mathbf{x}\|_1} = \max_{\|\mathbf{x}\|_1 = 1} \|B\mathbf{x}\|_1.

So we want to find the unit-1-norm vector x\mathbf{x} that makes Bx1\|B\mathbf{x}\|_1 largest. This is a constrained optimization; we will climb toward the maximum using only actions we can perform cheaply.

Step 1: evaluate the objective. For a given x\mathbf{x}, form y=Bx=A1x\mathbf{y} = B\mathbf{x} = A^{-1}\mathbf{x} by solving Ay=xA\mathbf{y} = \mathbf{x}. One triangular solve using the stored LU factors. The current value of the objective is y1=iyi\|\mathbf{y}\|_1 = \sum_i |y_i|.

Step 2: find an ascent direction. Write f(x)=Bx1=i(Bx)if(\mathbf{x}) = \|B\mathbf{x}\|_1 = \sum_i |(B\mathbf{x})_i|. Where differentiable,

f(x)=BTξ,ξi=sign(yi).\nabla f(\mathbf{x}) = B^T \boldsymbol\xi, \qquad \xi_i = \operatorname{sign}(y_i).

To compute this gradient we need to apply BT=(A1)T=(AT)1B^T = (A^{-1})^T = (A^T)^{-1} to ξ\boldsymbol\xi, that is, solve ATz=ξA^T \mathbf{z} = \boldsymbol\xi. A second triangular solve (same factors, transposed).

Step 3: move along the gradient. The feasible set {x11}\{\|\mathbf{x}\|_1 \leq 1\} is a polytope. A linear objective over a polytope is maximized at a vertex, and the vertices of the 1-norm ball are the signed unit vectors ±ej\pm \mathbf{e}_j. The best vertex is the one that maximizes ξTBx=zTx\boldsymbol\xi^T B \mathbf{x} = \mathbf{z}^T \mathbf{x}, namely

xnew=sign(zj)ej,j=argmaxizi.\mathbf{x}_{\text{new}} = \operatorname{sign}(z_j)\, \mathbf{e}_j, \qquad j = \arg\max_i |z_i|.

Jumping to this vertex is a single coordinate step, not a small increment.

Step 4: stop when no vertex improves. If zzTx\|\mathbf{z}\|_\infty \leq \mathbf{z}^T \mathbf{x}, the current vertex already maximizes the linearized objective, so we have reached a local maximum of ff. Return y1\|\mathbf{y}\|_1 as the estimate of B1\|B\|_1.

The starting point x=1/n\mathbf{x} = \mathbf{1}/n is interior and avoids an unlucky first gradient. Each iteration costs two triangular solves; convergence typically takes 2-5 iterations. The estimate is a lower bound on A11\|A^{-1}\|_1, and in practice agrees with the true value to within a small factor.

A 2D picture

The geometry is easiest to see in 2D. The feasible set {x11}\{\|\mathbf{x}\|_1 \le 1\} is the diamond with vertices ±e1,±e2\pm \mathbf{e}_1, \pm \mathbf{e}_2. The objective f(x)=Bx1f(\mathbf{x}) = \|B\mathbf{x}\|_1 is convex, so its maximum on the diamond is attained at one of these four vertices. Hager’s iteration is a vertex-hopping ascent: from the current point, the gradient z=BTsign(Bx)\mathbf{z} = B^T \operatorname{sign}(B\mathbf{x}) singles out the vertex sign(zj)ej\operatorname{sign}(z_j)\,\mathbf{e}_j with j=argmaxizij = \arg\max_i |z_i| as the best linear-prediction step.

Source
<Figure size 700x700 with 1 Axes>

Reading the figure. The shaded diamond is the feasible set {x11}\{\|\mathbf{x}\|_1 \le 1\}, with vertices ±e1,±e2\pm\mathbf{e}_1, \pm\mathbf{e}_2 (blue dots). Each vertex is labeled with the objective value f(v)=Bv1f(v) = \|Bv\|_1; the maximum, f(+e1)=4f(+\mathbf{e}_1) = 4, is exactly B1\|B\|_1 (the largest column sum). Gray curves are level sets {f(x)=c}\{f(\mathbf{x}) = c\} for c=1,2,3c = 1, 2, 3. The black square is the interior start x0=1/2\mathbf{x}_0 = \mathbf{1}/2; the green arrow is the gradient f(x0)=BTsign(Bx0)\nabla f(\mathbf{x}_0) = B^T \operatorname{sign}(B\mathbf{x}_0), which selects j=argmaxizi=1j = \arg\max_i |z_i| = 1. The red arrow is the resulting vertex jump to xnew=e1\mathbf{x}_{\mathrm{new}} = \mathbf{e}_1, where the algorithm terminates.

Algorithm 1 (Hager’s 1-Norm Estimator)

Input: LU factorization of AA

Output: Estimate of A11\|A^{-1}\|_1

  1. x=1/nx = \mathbf{1}/n

  2. repeat

  3. \qquad Solve ATy=xA^T y = x \quad (reuses the LU factorization!)

  4. \qquad if y1yprev1\|y\|_1 \leq \|y_{\text{prev}}\|_1 then break

  5. \qquad ξ=sign(y)\xi = \text{sign}(y)

  6. \qquad Solve Az=ξAz = \xi

  7. \qquad j=argmaxizij = \arg\max_i |z_i|

  8. \qquad if zzTx\|z\|_\infty \leq z^T x then break

  9. \qquad x=ejx = e_j \quad (unit vector with 1 in position jj)

  10. return y1\|y\|_1

Cost: Each iteration requires two triangular solves. Typically converges in 2-5 iterations, so total cost is O(n2)O(n^2), much cheaper than the O(n3)O(n^3) factorization.

See the Condition Number Estimation notebook for a Python implementation.

Remark 2 (\infty-norm version)

Applying Hager to ATA^T estimates A1\|A^{-1}\|_\infty, since B=BT1\|B\|_\infty = \|B^T\|_1. So Hager gives cheap κ1(A)\kappa_1(A) or κ(A)\kappa_\infty(A) estimates; the 2-norm would need an SVD.

Remark 3 (Why the 1-norm is sufficient)

Hager’s algorithm estimates κ1(A)\kappa_1(A), not κ2(A)\kappa_2(A). This is fine for two reasons.

First, all matrix norms on Rn×n\mathbb{R}^{n \times n} are equivalent. The sharp bounds between the 1-norm and 2-norm are:

1nA1A2nA1\frac{1}{\sqrt{n}}\|A\|_1 \leq \|A\|_2 \leq \sqrt{n}\,\|A\|_1

so κ1(A)\kappa_1(A) and κ2(A)\kappa_2(A) can differ by at most a factor of nn. Since the forward error bound (Proposition 1) in the 1-norm is

x^x1x1κ1(A)r1b1\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|_1}{\|\mathbf{x}\|_1} \lesssim \kappa_1(A) \cdot \frac{\|\mathbf{r}\|_1}{\|\mathbf{b}\|_1}

we can convert to a 2-norm bound using v2v1nv2\|\mathbf{v}\|_2 \leq \|\mathbf{v}\|_1 \leq \sqrt{n}\,\|\mathbf{v}\|_2 for vectors and the matrix norm equivalence above. This gives

x^x2x2nκ1(A)r2b2\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|_2}{\|\mathbf{x}\|_2} \lesssim n \cdot \kappa_1(A) \cdot \frac{\|\mathbf{r}\|_2}{\|\mathbf{b}\|_2}

The extra factor of nn is pessimistic (it comes from worst-casing the norm conversion at every step). In practice the 1-norm bound is already a good estimate of the 2-norm error. For the “digits lost” rule of thumb, a factor of nn changes log10κ\log_{10}\kappa by at most log10n\log_{10} n, which is negligible.

Second, the 1-norm is cheap: A1=maxjiaij\|A\|_1 = \max_j \sum_i |a_{ij}| costs O(n2)O(n^2) and requires no singular value computation. The 2-norm A2=σmax\|A\|_2 = \sigma_{\max} is more expensive. Since Hager’s algorithm is a practical tool for cheap error bounds, the 1-norm is the natural choice.

The weighted residual ρ\rho

LAPACK’s a-posteriori error estimates (e.g. xGERFS) do not use r/b\|\mathbf{r}\|/\|\mathbf{b}\| but a slightly different and sharper quantity:

ρ  =  rAx^.\rho \;=\; \frac{\|\mathbf{r}\|}{\|A\| \, \|\hat{\mathbf{x}}\|}.

This is the weighted (or scaled) residual. It equals r/b\|\mathbf{r}\|/\|\mathbf{b}\| up to a constant whenever Ax^b\|A\hat{\mathbf{x}}\| \approx \|\mathbf{b}\|, but it is the optimal backward error when perturbing AA (Rigal–Gaches 1967): x^\hat{\mathbf{x}} is the exact solution of (A+E)x^=b(A + E)\hat{\mathbf{x}} = \mathbf{b} for some EE with E/Aρ\|E\|/\|A\| \le \rho. A backward stable algorithm for a linear system is one where ρ=O(εmach)\rho = O(\varepsilon_{\text{mach}}).

Theorem 4 (Golden Rule (weighted residual form))

For a computed solution x^\hat{\mathbf{x}} of Ax=bA\mathbf{x} = \mathbf{b},

x^xx^    κ(A)ρ,ρ=rAx^.\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\hat{\mathbf{x}}\|} \;\le\; \kappa(A) \cdot \rho, \qquad \rho = \frac{\|\mathbf{r}\|}{\|A\|\,\|\hat{\mathbf{x}}\|}.

Proof 6

From r=A(xx^)\mathbf{r} = A(\mathbf{x} - \hat{\mathbf{x}}) we get x^x=A1r\hat{\mathbf{x}} - \mathbf{x} = -A^{-1}\mathbf{r}, hence x^xA1r\|\hat{\mathbf{x}} - \mathbf{x}\| \le \|A^{-1}\|\,\|\mathbf{r}\|. Divide by x^\|\hat{\mathbf{x}}\| and multiply and divide by A\|A\|:

x^xx^A1rx^=AA1rAx^=κ(A)ρ.\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|}{\|\hat{\mathbf{x}}\|} \le \|A^{-1}\| \frac{\|\mathbf{r}\|}{\|\hat{\mathbf{x}}\|} = \|A\|\|A^{-1}\| \cdot \frac{\|\mathbf{r}\|}{\|A\|\,\|\hat{\mathbf{x}}\|} = \kappa(A) \cdot \rho.

Compared to the r/b\|\mathbf{r}\|/\|\mathbf{b}\| form, ρ\rho is computable without reference to b\mathbf{b} or x\mathbf{x} separately (only AA, r\mathbf{r}, and x^\hat{\mathbf{x}} appear), and it combines cleanly with Hager’s estimator for κ(A)\kappa(A). Paired in the \infty-norm (see Remark 2), ρ\rho and κ\kappa_\infty are the standard package for practical forward error bounds.

Practical Workflow

Solving Ax=bA\mathbf{x} = \mathbf{b} responsibly:

  1. Factor: Compute A=LUA = LU (or A=QRA = QR).

  2. Solve: Use the factorization to obtain x^\hat{\mathbf{x}}.

  3. Estimate κ(A)\kappa_\infty(A): Apply Hager’s algorithm to ATA^T (see Remark 2). This costs only O(n2)O(n^2) extra. For QR, the same idea works: estimate R1\|R^{-1}\| via triangular solves, then κ(A)AR1\kappa(A) \approx \|A\| \cdot \|R^{-1}\|.

  4. Compute the residual: r=bAx^\mathbf{r} = \mathbf{b} - A\hat{\mathbf{x}}.

  5. Weighted residual: ρ=r/(Ax^)\rho = \|\mathbf{r}\|_\infty / (\|A\|_\infty \|\hat{\mathbf{x}}\|_\infty). If ρεmach\rho \gtrsim \varepsilon_{\text{mach}} your solver lost backward stability; otherwise proceed.

  6. Forward error bound: By Theorem 4,

    x^xx^κ(A)ρ.\frac{\|\hat{\mathbf{x}} - \mathbf{x}\|_\infty}{\|\hat{\mathbf{x}}\|_\infty} \lesssim \kappa_\infty(A) \cdot \rho.

    If κ(A)εmach1\kappa_\infty(A) \cdot \varepsilon_{\text{mach}} \gtrsim 1, the answer may be meaningless.

Steps 3-6 are essentially free compared to the O(n3)O(n^3) factorization, and they give you a computable forward error bound.