A Teaser: The Fastest Inverse Square Root¶
Here is a short C function that computed $1/\sqrt{x}$ faster than any library call in the 1990s:
```c
float Q_rsqrt( float x )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = x * 0.5F;
    y  = x;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the fuck?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration

    return y;
}
```

This code — from the Quake III Arena source — powered real-time 3D graphics on 1990s hardware. It looks like black magic, but every line has a precise mathematical explanation.
By the end of this chapter, you’ll understand every line.
Overview¶
This chapter is driven by a mystery from Chapter 1: why does the finite difference error increase when we make the step size smaller? By the end, you’ll not only understand the answer, but also see how floating-point representation can be exploited for creative algorithmic tricks.
Errors and floating-point arithmetic — We define absolute and relative error, then introduce floating-point numbers as approximations with a relative error guarantee (machine epsilon). This immediately solves the finite difference mystery: subtracting nearly equal numbers amplifies relative error, creating a trade-off between truncation and round-off.
Condition numbers — Could we have predicted the finite difference problem before running any code? Yes — the condition number measures how much a computation amplifies input errors. We’ll see that subtraction is ill-conditioned when operands are close, giving a deeper theoretical explanation of the trade-off.
Stable and unstable algorithms — The condition number measures the problem’s sensitivity, but the algorithm matters too. We define stable and unstable algorithms, see how the same well-conditioned problem can give terrible results with a bad algorithm, and identify the three most dangerous floating-point operations.
Fast inverse square root — Now we look inside the floating-point representation (IEEE 754 bit layout) and discover that reinterpreting float bits as an integer gives an approximate logarithm. This insight, combined with Newton’s method, produces one of the most famous algorithms in computer science.
Learning Outcomes¶
After completing this chapter, you should be able to:
L2.1: Define absolute and relative error, and explain why relative error is the appropriate measure for floating-point computations.
L2.2: State the floating-point guarantee ($\mathrm{fl}(x) = x(1+\delta)$ with $|\delta| \le \epsilon_{\mathrm{machine}}$) and define machine epsilon.
L2.3: Explain the finite difference error trade-off between truncation and round-off.
L2.4: Compute condition numbers for simple functions and identify ill-conditioned problems.
L2.5: Explain the difference between a stable and unstable algorithm, and give an example of each for the same problem.
L2.6: Explain how the fast inverse square root exploits IEEE 754 bit representation.