
A Teaser: The Fastest Inverse Square Root

Here are 6 lines of C that compute $1/\sqrt{x}$ faster than any library call in the 1990s:

```c
float Q_rsqrt( float x )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = x * 0.5F;
    y  = x;
    i  = * ( long * ) &y;               // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );       // what the fuck?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
    return y;
}
```

This code — from the Quake III Arena source — powered real-time 3D graphics on 1990s hardware. It looks like black magic, but every line has a precise mathematical explanation.

By the end of this chapter, you’ll understand every line.

Overview

This chapter is driven by a mystery from Chapter 1: why does the finite difference error increase when we make the step size smaller? By the end, you’ll not only understand the answer, but also see how floating-point representation can be exploited for creative algorithmic tricks.

  1. Errors and floating-point arithmetic — We define absolute and relative error, then introduce floating-point numbers as approximations with a relative error guarantee (machine epsilon). This immediately solves the finite difference mystery: subtracting nearly equal numbers amplifies relative error, creating a trade-off between truncation and round-off.

  2. Condition numbers — Could we have predicted the finite difference problem before running any code? Yes — the condition number $\kappa$ measures how much a computation amplifies input errors. We’ll see that subtraction is ill-conditioned when operands are close, giving a deeper theoretical explanation of the trade-off.

  3. Stable and unstable algorithms — The condition number measures the problem’s sensitivity, but the algorithm matters too. We define stable and unstable algorithms, see how the same well-conditioned problem can give terrible results with a bad algorithm, and identify the three most dangerous floating-point operations.

  4. Fast inverse square root — Now we look inside the floating-point representation (IEEE 754 bit layout) and discover that reinterpreting float bits as an integer gives an approximate logarithm. This insight, combined with Newton’s method, produces one of the most famous algorithms in computer science.

Learning Outcomes

After completing this chapter, you should be able to: