
A Motivating Example

Consider computing $f(x) = \sqrt{x+1} - \sqrt{x}$ for large $x$.

The condition number is $\kappa \leq 1/2$, so this is a well-conditioned problem!

Yet using 4-digit decimal arithmetic:

$$f(1984) = \sqrt{1985} - \sqrt{1984} = 44.55 - 44.54 = 0.01$$

The true value is 0.01122. We lost two significant digits — 10% relative error from a well-conditioned problem!

What went wrong? The algorithm subtracts two nearly-equal numbers, amplifying their individual roundoff errors. This is called subtractive cancellation.

A stable alternative: Rationalize the expression:

$$\sqrt{x+1} - \sqrt{x} = \frac{(\sqrt{x+1} - \sqrt{x})(\sqrt{x+1} + \sqrt{x})}{\sqrt{x+1} + \sqrt{x}} = \frac{1}{\sqrt{x+1} + \sqrt{x}} = \frac{1}{89.09} = 0.01122 \quad \checkmark$$

Same problem, different algorithm, full accuracy.
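
You can see the same effect in code by repeating the computation in single precision. A minimal sketch assuming NumPy (the exact digits printed depend on the platform):

```python
import numpy as np

x = np.float32(1984.0)

# Naive formula: subtracts two nearly equal square roots (cancellation)
naive = np.sqrt(x + np.float32(1.0)) - np.sqrt(x)

# Rationalized formula: the subtraction is gone, so no cancellation
stable = np.float32(1.0) / (np.sqrt(x + np.float32(1.0)) + np.sqrt(x))

reference = 1.0 / (np.sqrt(1985.0) + np.sqrt(1984.0))  # double-precision reference

print(f"naive:  {naive:.8f}   rel. error {abs(naive - reference) / reference:.1e}")
print(f"stable: {stable:.8f}   rel. error {abs(stable - reference) / reference:.1e}")
```

In single precision the naive version typically keeps only three or four correct digits, while the rationalized version stays accurate to the level of single-precision roundoff.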

This example reveals something fundamental: not all errors are the problem’s fault. We need a framework to separate algorithm quality from problem sensitivity.

Forward Error

Consider computing $y = f(x)$. Due to roundoff or approximation, we actually compute $\tilde{y}$; the forward error is the distance $|\tilde{y} - y|$ between them.

Forward error is the most intuitive measure—it directly answers “how wrong is my answer?”

Absolute vs. Relative Error

For meaningful comparisons, we often use the relative error, $|\tilde{y} - y| / |y|$, rather than the absolute error $|\tilde{y} - y|$.

Example: An error of 0.001 is negligible when the true value is about $10^{6}$, but it is a 100% error when the true value is $10^{-3}$.

Relative error captures this distinction. When we say “accurate to 6 digits,” we mean relative error $\approx 10^{-6}$.
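
A two-line check makes this concrete (the values below are chosen for illustration, not taken from the text):

```python
# The same absolute error, 0.001, on two very different scales
cases = [(1.0e6, 1.0e6 + 0.001),   # huge true value: the error is negligible
         (1.0e-3, 2.0e-3)]         # tiny true value: the error is 100%

for true, approx in cases:
    abs_err = abs(approx - true)
    rel_err = abs_err / abs(true)
    print(f"absolute error {abs_err:.1e}, relative error {rel_err:.1e}")
```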

The Catch

Forward error seems like the natural measure—we want to know how wrong we are! But there’s a problem:

Forward error requires knowing the true answer $y = f(x)$.

If we knew the true answer, we wouldn’t need to compute it. This is where backward error comes in.

Backward Error

Instead of asking “how wrong is my answer?”, backward error asks: “for how small a change in the input would my computed answer be exactly right?”

In other words: our computed $\tilde{y}$ might not equal $f(x)$, but it does equal $f(\tilde{x})$ for some nearby $\tilde{x}$. The backward error measures how far $\tilde{x}$ is from $x$.
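
One common way to state this formally (the symbols $\eta$ and $\Delta x$ are introduced here for the sketch, not taken from the text above) is as the smallest relative input perturbation that reproduces the computed output exactly:

$$\eta(\tilde{y}) = \min\left\{ \frac{|\Delta x|}{|x|} \;:\; f(x + \Delta x) = \tilde{y} \right\}$$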

Why Backward Error?

At first, backward error seems less useful than forward error. But it turns out to be more fundamental for algorithm analysis. Here’s why:

1. Backward error is computable

Forward error requires knowing the true answer—but if we knew that, we wouldn’t need to compute it!

Backward error, on the other hand, is often directly computable; for many problems it equals a residual that we can evaluate directly (see below).

2. Backward error separates algorithm from problem

There are two sources of error:

- Did the algorithm introduce unnecessary error? (This is what backward error measures.)
- Does the problem itself amplify small input perturbations? (This is what the condition number $\kappa$ measures.)

Backward error isolates the first question. If an algorithm has small backward error, it’s doing its job well—any remaining forward error is the problem’s fault (ill-conditioning), not the algorithm’s.

3. Backward error accounts for input uncertainty

Real-world inputs have measurement error. If your input $x$ carries uncertainty $\delta$, then an algorithm whose backward error is below $\delta$ is effectively exact: the answer it returns is the true answer to a problem indistinguishable from the one you actually posed.

Principle: Don’t ask for more accuracy than your inputs justify.

Residual as Backward Error

For many problems, backward error equals the residual—a directly computable quantity.

| Problem | Residual | Interpretation |
| --- | --- | --- |
| Linear system $Ax = b$ | $r = b - A\tilde{x}$ | $\tilde{x}$ solves $Ax = b - r$ exactly |
| Root finding $f(x) = 0$ | $f(\tilde{r})$ | $\tilde{r}$ is a root of $f(x) - f(\tilde{r})$ |
| Least squares | $\lVert b - A\tilde{x}\rVert$ | How well $\tilde{x}$ fits the data |
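
For a linear system, for example, the residual gives a backward-error estimate we can evaluate without ever knowing the true solution. A minimal NumPy sketch using one common normwise measure, $\lVert r\rVert / (\lVert A\rVert\,\lVert\tilde{x}\rVert)$ (the matrix and right-hand side are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)

x_tilde = np.linalg.solve(A, b)   # computed (inexact) solution
r = b - A @ x_tilde               # residual: directly computable

# Normwise relative backward error estimate: ||r|| / (||A|| * ||x_tilde||)
eta = np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x_tilde))
print(f"backward error ~ {eta:.1e}")   # typically a small multiple of machine epsilon
```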

Key insight: Backward stability implies forward stability (for well-conditioned problems). This is why we focus on backward stability—it’s the stronger, more useful guarantee.

The Relationship: Forward Error ≤ κ × Backward Error

Now we connect backward error to forward error using the condition number from the previous section.

Derivation

Suppose we compute $\tilde{y}$ instead of the true $y = f(x)$. If the algorithm is backward stable:

$$\tilde{y} = f(\tilde{x}) \quad \text{for some } \tilde{x} \text{ near } x$$

The forward error is the difference $|\tilde{y} - y| = |f(\tilde{x}) - f(x)|$.

By Taylor expansion:

$$|f(\tilde{x}) - f(x)| \approx |f'(x)| \cdot |\tilde{x} - x|$$

Converting to relative errors:

$$\frac{|\tilde{y} - y|}{|y|} \approx \frac{|f'(x)|}{|f(x)|} \cdot |\tilde{x} - x| = \underbrace{\left|\frac{x f'(x)}{f(x)}\right|}_{\kappa} \cdot \frac{|\tilde{x} - x|}{|x|}$$
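
We can sanity-check this relationship numerically on the motivating example $f(x) = \sqrt{x+1} - \sqrt{x}$: perturb the input by a known relative amount $\delta$ (playing the role of the backward error) and compare the relative change in the output against $\kappa \cdot \delta$. A small sketch, with the closed form $\kappa = \sqrt{x}/(2\sqrt{x+1})$ obtained by substituting this $f$ into $|x f'(x)/f(x)|$:

```python
import numpy as np

f = lambda x: np.sqrt(x + 1) - np.sqrt(x)

x = 1984.0
delta = 1e-6                                 # relative input perturbation (backward error)
kappa = np.sqrt(x) / (2 * np.sqrt(x + 1))    # |x f'(x) / f(x)| for this f, about 0.5

forward = abs(f(x * (1 + delta)) - f(x)) / abs(f(x))
print(forward, kappa * delta)                # the two values agree to several digits
```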

What This Means

| Component | Controlled by | Can we improve it? |
| --- | --- | --- |
| Forward error | Both | Indirectly |
| Condition number $\kappa$ | The problem | No (it's math) |
| Backward error | The algorithm | Yes! |

Practical implications:

- If the algorithm is backward stable and the problem is well-conditioned, the forward error is automatically small.
- If the problem is ill-conditioned, even a backward-stable algorithm can return a large forward error; the fault lies with the problem, not the algorithm.
- The only levers available are choosing a better algorithm (smaller backward error) or reformulating the problem (smaller $\kappa$).

Unstable Algorithms: Common Pitfalls

Even well-conditioned problems can produce terrible results with unstable algorithms. Three operations are particularly dangerous:

| Operation | Problem | Stable Alternative |
| --- | --- | --- |
| Subtracting close numbers | Relative error explodes | Reformulate algebraically |
| Dividing by small numbers | Amplifies numerator error | Use logarithms or rescale |
| Adding numbers of vastly different magnitude | Small values lost | Sort and sum smallest first |

Example: The Quadratic Formula

Solve $x^2 - 10^6 x + 1 = 0$. The roots are $x_1 \approx 10^6$ and $x_2 \approx 10^{-6}$.

Standard formula for the small root:

$$x_2 = \frac{10^6 - \sqrt{10^{12} - 4}}{2} \approx \frac{10^6 - 10^6}{2} = 0$$

Catastrophic cancellation! The true value is $\approx 10^{-6}$, so we have 100% relative error.

Stable alternative: Use $x_1 x_2 = c/a = 1$, so $x_2 = 1/x_1$. Compute $x_1$ (no cancellation), then $x_2 = 1/x_1$ with full precision.
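
A minimal sketch of both approaches in double precision, using the coefficients from the example above:

```python
import numpy as np

a, b, c = 1.0, -1e6, 1.0

# Naive formula for the small root: subtracts two nearly equal numbers
x2_naive = (-b - np.sqrt(b**2 - 4*a*c)) / (2*a)

# Stable version: compute the large root first (its terms are added, so there
# is no cancellation), then recover the small root from x1 * x2 = c/a
x1 = (-b + np.sqrt(b**2 - 4*a*c)) / (2*a)
x2_stable = (c / a) / x1

print(x2_naive)   # loses most of its significant digits to cancellation
print(x2_stable)  # accurate to essentially full double precision
```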

Example: Computing $e^x - 1$

For small $x$, direct computation fails:

```python
import numpy as np

x = 1e-15
result = np.exp(x) - 1  # Returns ~1.1e-15 (only 1 correct digit!)
```

Why? $e^{10^{-15}} \approx 1.000000000000001$. The information about $x$ lives entirely in the last few bits of this number, right at the edge of double precision, so after subtracting $1.0$ only about one correct digit survives.

Stable alternative: Use the dedicated function:

```python
result = np.expm1(1e-15)  # Returns 1.0e-15 (full precision!)
```

Example: Kahan Summation

When summing many numbers, roundoff errors accumulate. Kahan summation tracks and compensates for these errors:

```python
def kahan_sum(x):
    """Compensated summation — much more accurate than naive sum."""
    s = 0.0  # Running sum
    c = 0.0  # Compensation for lost low-order bits
    for xi in x:
        y = xi - c        # Compensated value to add
        t = s + y         # New sum (may lose precision)
        c = (t - s) - y   # Recover what was lost
        s = t
    return s
```

This achieves $O(\varepsilon_{\text{mach}})$ error instead of $O(n \cdot \varepsilon_{\text{mach}})$ for naive summation.
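
A quick comparison, assuming the `kahan_sum` defined above (the exact error sizes vary by platform): sum the same small value a million times, where naive left-to-right accumulation steadily drops low-order bits.

```python
import numpy as np

x = np.full(1_000_000, 0.1)     # one million copies of 0.1; the exact sum is 100000.0

naive = 0.0
for xi in x:                    # plain left-to-right accumulation
    naive += xi

print(abs(naive - 1e5))         # error grows with n (bounded by ~n * eps in relative terms)
print(abs(kahan_sum(x) - 1e5))  # compensated sum: error stays near machine-epsilon level
```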

Summary

| Concept | Measures | Controlled by | Computable? |
| --- | --- | --- | --- |
| Forward error | Distance to true answer | Problem + algorithm | Usually not |
| Backward error | Perturbation to input | Algorithm | Often yes (residual) |
| Condition number | Sensitivity of problem | Problem (math) | Sometimes |

The workflow:

  1. Check if the problem is well-conditioned (small $\kappa$)

  2. Use a backward-stable algorithm (small backward error)

  3. Then forward error will be small automatically

If forward error is large despite backward stability, the problem is ill-conditioned—reformulate if possible.