
Big Idea

Runge’s divergence is a property of node placement, not of polynomial interpolation as such. Equispaced nodes leave the interpolating polynomial free to oscillate with exponentially growing amplitude near the endpoints. The Chebyshev nodes are the cosine images of equispaced angles on a semicircle. They cluster at the endpoints and cure the problem: with them, smooth functions are approximated geometrically. The Lebesgue constant quantifies the contrast. Equispaced nodes amplify errors by $\sim 2^n$, Chebyshev nodes by $\sim \log n$.

Runge’s Phenomenon

Consider the function $f(x) = 1/(1 + 25x^2)$ on $[-1, 1]$. It is smooth on the real line. We interpolate it at $n+1$ equispaced nodes and plot the result for a few values of $n$.

[Figure: interpolants of the Runge function at equispaced nodes, one panel per value of $n$.]

The interpolant passes through every node, but between nodes near the endpoints it oscillates with amplitude that grows without bound. The function is analytic on the whole real line. The divergence is entirely a property of the node placement.
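The divergence is easy to reproduce numerically. A minimal sketch, assuming numpy and scipy are available; `BarycentricInterpolator` is used purely as a convenient, stable way to form the interpolant through given samples:

```python
import numpy as np
from scipy.interpolate import BarycentricInterpolator

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

def max_error_equispaced(n):
    """Sup-norm error (on a fine grid) of the degree-n interpolant
    through n+1 equispaced samples of the Runge function."""
    nodes = np.linspace(-1.0, 1.0, n + 1)
    p = BarycentricInterpolator(nodes, runge(nodes))
    xx = np.linspace(-1.0, 1.0, 2001)
    return np.max(np.abs(p(xx) - runge(xx)))

for n in (5, 10, 15, 20):
    print(n, max_error_equispaced(n))  # error grows with n instead of shrinking
```

By $n = 20$ the sup-norm error is already in the tens, even though the interpolant matches the function exactly at all 21 nodes.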

Reminder: analytic functions

A function $f$ is real-analytic at $x_0$ if its Taylor series about $x_0$ converges to $f(x)$ in some neighborhood of $x_0$. Analytic functions are smooth (infinitely differentiable) and are completely determined by their values on any open set. Polynomials, $e^x$, $\sin x$, $\cos x$, and rational functions $p(x)/q(x)$ away from the zeros of $q$ are all analytic.

Being analytic on $\mathbb{R}$ does not mean the Taylor series converges globally. The Runge function $1/(1 + 25x^2)$ is analytic on all of $\mathbb{R}$, but its Taylor series about $x = 0$ converges only for $|x| < 1/5$, the distance to the nearest complex poles at $x = \pm i/5$. That complex-plane picture is what actually controls polynomial approximation rates; we return to it in Regularity and Coefficient Decay.

Why does moving the nodes help? Polynomial interpolation on $[-1, 1]$ is really an angular problem in disguise. Under the change of variables $\theta = \arccos x$, the endpoints $\pm 1$ correspond to $\theta = 0, \pi$, while the interior is compressed. Nodes that are equispaced in $x$ are badly unequal in $\theta$, with the largest angular gaps sitting near $\pm 1$. Those are exactly the gaps where the oscillations blow up.

Sampling uniformly in θ\theta instead produces nodes that cluster at the endpoints. The standard choice is the Chebyshev nodes.

Chebyshev Nodes

Definition 1 (Chebyshev nodes of the second kind)

The Chebyshev points on $[-1, 1]$ are

$$x_j = \cos\!\left(\frac{j\pi}{n}\right), \qquad j = 0, 1, \ldots, n.$$

Geometric origin

Place $n+1$ equispaced points on the upper unit semicircle and project them straight down to the $x$-axis. The angles are equispaced, $\theta_j = j\pi/n$, so $x_j = \cos\theta_j$. That is the whole picture.

[Figure: $n+1$ equispaced points on the upper unit semicircle projected down to the Chebyshev nodes on the $x$-axis.]

The clustering near $\pm 1$ is exactly what tames the endpoint instability seen in the Runge example. The optional final section on Lebesgue constants makes this precise.
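Both effects are easy to check numerically. A small sketch, assuming numpy: equispaced-in-$x$ nodes have their largest angular gaps at the endpoints, while the Chebyshev nodes (equispaced in $\theta$) cluster in $x$ near $\pm 1$:

```python
import numpy as np

n = 10

# Equispaced in x: the angular gaps blow up near the endpoints.
x_eq = np.linspace(-1.0, 1.0, n + 1)
theta_eq = np.arccos(x_eq)              # decreasing from pi to 0
gaps_theta = -np.diff(theta_eq)
print(gaps_theta[0], gaps_theta[n // 2])  # endpoint gap >> interior gap

# Equispaced in theta (Chebyshev): the x-gaps shrink near the endpoints.
x_ch = np.cos(np.arange(n + 1) * np.pi / n)
gaps_x = -np.diff(x_ch)                 # x_ch is decreasing
print(gaps_x[0], gaps_x[n // 2])        # endpoint gap << interior gap
```

For $n = 10$ the endpoint angular gap of the equispaced nodes is roughly three times the interior gap, while the Chebyshev nodes show the opposite pattern in $x$.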

Barycentric weights for free

The barycentric formula from Proposition 1 needs the weights $\lambda_j = 1 / \prod_{k \ne j}(x_j - x_k)$. On Chebyshev nodes these collapse to a closed form (up to a common scale, which cancels in the second barycentric formula):

$$\lambda_j = (-1)^j\, \delta_j, \qquad \delta_j = \begin{cases} \tfrac{1}{2}, & j = 0 \text{ or } j = n, \\ 1, & \text{otherwise}. \end{cases}$$

No $O(n^2)$ precomputation is needed: just a sign flip and a halving at the endpoints. Evaluating $p_n(x)$ at an arbitrary point is then one $O(n)$ pass with no setup cost, which is what makes the barycentric formula on Chebyshev nodes the evaluator of choice.
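A sketch of the resulting evaluator, assuming numpy; the node-hit branch avoids the $0/0$ that occurs when the evaluation point coincides with a node:

```python
import numpy as np

def cheb_bary_eval(fvals, x):
    """Evaluate the polynomial interpolant of the values fvals given at
    the Chebyshev nodes x_j = cos(j pi/n), using the second barycentric
    formula with the closed-form weights (-1)^j * delta_j."""
    n = len(fvals) - 1
    nodes = np.cos(np.arange(n + 1) * np.pi / n)
    w = (-1.0) ** np.arange(n + 1)
    w[0] *= 0.5
    w[-1] *= 0.5                       # delta_j = 1/2 at the two endpoints
    x = np.atleast_1d(np.asarray(x, dtype=float))
    diff = x[:, None] - nodes[None, :]
    hit = diff == 0.0                  # evaluation exactly at a node
    diff[hit] = 1.0                    # placeholder; those rows are fixed below
    terms = w / diff
    p = terms @ fvals / terms.sum(axis=1)
    at_node = hit.any(axis=1)
    p[at_node] = np.asarray(fvals)[hit.argmax(axis=1)[at_node]]
    return p
```

Since the interpolant reproduces any polynomial of degree at most $n$ exactly, feeding it samples of a low-degree polynomial makes a convenient correctness check.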

Chebyshev polynomials and the recurrence

The Chebyshev nodes came from sampling uniformly in $\theta$ under $x = \cos\theta$. The natural basis for functions on the $\theta$ side is the Fourier cosine basis $\{\cos(k\theta)\}_{k \ge 0}$, the standard frequencies on a periodic interval. We ask: under $x = \cos\theta$, what do these basis functions look like on the $x$ side?

Check the first few by expanding with angle-addition:

$$\cos(0\cdot\theta) = 1, \qquad \cos(1\cdot\theta) = \cos\theta, \qquad \cos(2\theta) = 2\cos^2\theta - 1, \qquad \cos(3\theta) = 4\cos^3\theta - 3\cos\theta.$$

Each one is a polynomial in $x = \cos\theta$, of degree exactly $k$. This is not an accident.

Definition 2 (Chebyshev polynomial of the first kind)

The $k$-th Chebyshev polynomial of the first kind $T_k$ is the polynomial characterised by

$$T_k(\cos\theta) = \cos(k\theta) \qquad \text{for all } \theta \in \mathbb{R}.$$

The definition is only useful if such a polynomial exists for every $k$. The recurrence makes this explicit.

Proposition 1 (Three-term recurrence)

$T_0(x) = 1$, $T_1(x) = x$, and for $k \ge 1$,

$$T_{k+1}(x) = 2x\, T_k(x) - T_{k-1}(x).$$

By induction this makes every $T_k$ a polynomial of degree exactly $k$.

Proof 1

The product-to-sum identity $2\cos\theta\cos(k\theta) = \cos((k+1)\theta) + \cos((k-1)\theta)$ rearranges to

$$\cos((k+1)\theta) = 2\cos\theta\, \cos(k\theta) - \cos((k-1)\theta).$$

Setting $x = \cos\theta$ and substituting $\cos(j\theta) = T_j(x)$ converts the right-hand side to $2x\, T_k(x) - T_{k-1}(x)$, which must therefore equal $T_{k+1}(x)$. The base cases $T_0 = 1$ and $T_1 = x$ are immediate from the definition, and induction carries polynomiality and the degree forward.

The first few Chebyshev polynomials are therefore

$$T_0 = 1, \quad T_1 = x, \quad T_2 = 2x^2 - 1, \quad T_3 = 4x^3 - 3x, \quad T_4 = 8x^4 - 8x^2 + 1,$$

and the pattern continues. Under $x = \cos\theta$ they are literally the Fourier cosine basis, which means they inherit a lot of nice properties for free: bounded by 1 on $[-1, 1]$, $k+1$ extrema there with alternating signs $\pm 1$, orthogonal under the weighted inner product we will see below. The Chebyshev polynomials are just a re-encoding of $\mathbb{P}_n$ in that natural frequency basis, related to the monomials $1, x, \ldots, x^n$ by an upper-triangular change of basis.
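The recurrence is also the natural way to generate the polynomials numerically. A sketch, assuming numpy, with coefficient arrays stored lowest-degree-first as in numpy's polynomial module:

```python
import numpy as np

def cheb_T(kmax):
    """Coefficient arrays (lowest degree first) of T_0..T_kmax, built
    from the three-term recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    T = [np.array([1.0]), np.array([0.0, 1.0])]
    for k in range(1, kmax):
        a = np.zeros(k + 2)
        a[1:] = 2.0 * T[k]       # multiply T_k by 2x: shift degrees up by one
        a[:k] -= T[k - 1]        # subtract T_{k-1}
        T.append(a)
    return T

T = cheb_T(4)
print(T[4])                      # [1, 0, -8, 0, 8] : T_4 = 8x^4 - 8x^2 + 1

# Defining property: T_k(cos theta) = cos(k theta).
theta = 0.7
print(np.polynomial.polynomial.polyval(np.cos(theta), T[3]), np.cos(3 * theta))
```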

The Chebyshev Series

So far the Chebyshev polynomials have appeared as a tool for placing nodes. They are also a basis of $\mathbb{P}_n$. More importantly, the full family $\{T_k\}_{k \ge 0}$ is an orthogonal system on $[-1, 1]$ under a natural weighted inner product, and suffices to represent any Lipschitz function on $[-1, 1]$ as a convergent series.

Definition 3 (Chebyshev series)

The Chebyshev series of a Lipschitz function $f$ on $[-1, 1]$ is

$$f(x) = \sum_{k=0}^\infty c_k\, T_k(x).$$

Using $x = \cos\theta$ and $T_k(\cos\theta) = \cos(k\theta)$, the coefficients have the equivalent forms

$$c_k = \frac{2}{\pi} \int_{-1}^{1} \frac{f(x)\, T_k(x)}{\sqrt{1 - x^2}}\, dx = \frac{2}{\pi} \int_0^{\pi} f(\cos\theta)\, \cos(k\theta)\, d\theta,$$

with prefactor $1/\pi$ instead of $2/\pi$ for $c_0$. The series converges uniformly on $[-1, 1]$. The $c_k$ are the Chebyshev coefficients of $f$.
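As a concrete check of the integral formula, the coefficients of $e^x$ can be computed by quadrature. A sketch, assuming scipy; for $e^x$ the exact values are classical Bessel-function identities, $c_0 = I_0(1)$ and $c_k = 2I_k(1)$ for $k \ge 1$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import iv            # modified Bessel function I_k

def cheb_coeff(f, k):
    """c_k = (2/pi) * int_0^pi f(cos t) cos(k t) dt   (1/pi for k = 0)."""
    val, _ = quad(lambda t: f(np.cos(t)) * np.cos(k * t), 0.0, np.pi)
    return val * ((1.0 if k == 0 else 2.0) / np.pi)

c = [cheb_coeff(np.exp, k) for k in range(6)]
print(c[0], iv(0, 1.0))                 # both ~ 1.2661
print(c[3], 2 * iv(3, 1.0))
```

The rapid decay of the $c_k$ for this entire function previews the geometric convergence discussed in the Big Idea.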

Definition 4 (Chebyshev projection)

The Chebyshev projection of degree $n$ of $f$ is the truncated series

$$(P_n f)(x) = \sum_{k=0}^n c_k\, T_k(x),$$

where the $c_k$ are the Chebyshev coefficients of Definition 3.

Projection vs. interpolant

The Chebyshev projection $P_n f$ defined above is not the same polynomial as the Chebyshev interpolant $p_n$, the polynomial of degree $\le n$ that passes through $f$ at the $n+1$ Chebyshev nodes. The two agree up to an aliasing correction that decays at the same rate as the Chebyshev coefficients themselves, so for convergence purposes we treat them interchangeably. But in practice we always compute the interpolant (via the DCT below), not the projection, because the interpolant only needs nodal samples of $f$, whereas the projection needs the true integral coefficients $c_k$.

Aside: Fourier connection (optional)

If you have seen Fourier series, the substitution $x = \cos\theta$ turns the Chebyshev series of $f$ on $[-1, 1]$ into the Fourier cosine series of $f(\cos\theta)$ on $[0, \pi]$. Statements about Fourier series (decay rates, Gibbs phenomenon, Parseval) carry over verbatim.

Computing the Coefficients

We rarely have a closed form for $c_k$. Instead we sample $f$ at the Chebyshev nodes and use a discrete formula. Two facts make this efficient: discrete orthogonality of the $T_k$ at those nodes, and the cosine structure that lets the FFT do the work.

The values↔coefficients matrix

Just as the monomial basis produced the Vandermonde system $V\mathbf{c} = \mathbf{f}$, the Chebyshev basis at the Chebyshev nodes produces its own values↔coefficients map. With $x_j = \cos(j\pi/n)$ and $T_k(x_j) = \cos(jk\pi/n)$, the equations $p_n(x_j) = f_j$ become

$$\underbrace{\begin{pmatrix} T_0(x_0) & T_1(x_0) & \cdots & T_n(x_0) \\ T_0(x_1) & T_1(x_1) & \cdots & T_n(x_1) \\ \vdots & & & \vdots \\ T_0(x_n) & T_1(x_n) & \cdots & T_n(x_n) \end{pmatrix}}_{\textstyle T \;=\; \big(\cos(jk\pi/n)\big)_{jk}} \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} f_0 \\ f_1 \\ \vdots \\ f_n \end{pmatrix}.$$

Compare to the Vandermonde matrix from Polynomial Interpolation: Values and Coefficients: same layout (rows = nodes, columns = basis functions), but cosines in place of monomial powers. That change of basis is what fixes the ill-conditioning. Here $\kappa_2(T) = \sqrt{2}$ once the endpoint samples are given half weight, independent of $n$, versus a $\kappa_2(V)$ that exploded exponentially and broke double precision by $n \approx 25$. And inverting $T$ is free: the FFT applies $T^{-1}$ in $O(n \log n)$ operations. The two lemmas below make this precise.

Remark 1 (Where the $\sqrt{2}$ comes from)

With the endpoint terms halved, as in the $\sum''$ sums below, the matrix $T$ satisfies the discrete orthogonality relation $T^\top W T = D$, where $W = \mathrm{diag}(\tfrac{1}{2}, 1, \ldots, 1, \tfrac{1}{2})$ and $D = \mathrm{diag}(n, n/2, \ldots, n/2, n)$. In this weighted inner product the columns of $T$ are mutually orthogonal, but the endpoint columns have norm $\sqrt{n}$ and the interior ones have norm $\sqrt{n/2}$, differing by a factor of $\sqrt{2}$. Rescaling by $W^{1/2}$ on the left and $D^{-1/2}$ on the right gives a genuinely orthogonal matrix with condition number 1, so the weighted matrix satisfies $\kappa_2(W^{1/2} T) = \sqrt{\kappa_2(D)} = \sqrt{2}$.
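A quick numerical check of this, assuming numpy; the vector `w` encodes the halved endpoint samples:

```python
import numpy as np

n = 16
j = np.arange(n + 1)
T = np.cos(np.outer(j, j) * np.pi / n)       # T[j, k] = T_k(x_j) = cos(jk pi/n)

# Half-weight the endpoint samples and check discrete orthogonality.
w = np.r_[0.5, np.ones(n - 1), 0.5]
G = T.T @ (w[:, None] * T)                   # = T^T W T
assert np.allclose(G, np.diag(np.diag(G)))   # off-diagonal entries vanish
print(np.diag(G)[:3])                        # n, n/2, n/2, ...

# Condition number of the half-weighted matrix is exactly sqrt(2).
kappa = np.linalg.cond(np.sqrt(w)[:, None] * T)
print(kappa, np.sqrt(2))
```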

Lemma 1 (Discrete coefficient formula)

For $f$ sampled at the Chebyshev nodes $x_j = \cos(j\pi/n)$, $j = 0, \ldots, n$, the discrete Chebyshev coefficients

$$\tilde c_k = \frac{2}{n} \sum_{j=0}^{n}{}'' f(x_j)\, \cos\!\left(\frac{jk\pi}{n}\right), \qquad k = 0, 1, \ldots, n,$$

(with $\sum''$ denoting that the $j = 0$ and $j = n$ terms are halved, and with prefactor $1/n$ instead of $2/n$ for $k = 0$ and $k = n$, mirroring the continuous case) are the unique coefficients of the Chebyshev interpolant

$$p_n(x) = \sum_{k=0}^n \tilde c_k\, T_k(x)$$

satisfying $p_n(x_j) = f(x_j)$ for all $j$. If $f \in \mathbb{P}_n$ then $\tilde c_k = c_k$ exactly.

Proof 2

The $T_k$ satisfy a discrete orthogonality relation at the Chebyshev nodes:

$$\sum_{j=0}^{n}{}'' T_k(x_j)\, T_l(x_j) = \begin{cases} n, & k = l \in \{0, n\}, \\ n/2, & k = l, \quad 0 < k < n, \\ 0, & k \ne l, \quad 0 \le k, l \le n. \end{cases}$$

This follows from $T_k(x_j) = \cos(jk\pi/n)$ and the standard discrete orthogonality of cosines. Imposing $p_n(x_j) = f(x_j)$ and applying the orthogonality relation isolates $\tilde c_k$ as stated. If $f \in \mathbb{P}_n$ the interpolant is $f$, so $\tilde c_k = c_k$.

For functions fPnf \notin \mathbb{P}_n the discrete coefficients c~k\tilde c_k approximate the true ckc_k to within an aliasing error that decays at the same rate as ckc_k itself, so for our purposes we treat the two interchangeably and write ckc_k for both.

Lemma 2 (DCT computation)

The discrete coefficients of Lemma 1 are the output of the type-I Discrete Cosine Transform of the value vector $(f(x_0), \ldots, f(x_n))$. Equivalently, they are the result of solving $T\mathbf{c} = \mathbf{f}$. The Cooley–Tukey FFT algorithm (Cooley & Tukey, 1965) carries out this basis change in $O(n \log n)$ operations.

The two lemmas together give the practical picture. Starting from $n+1$ nodal samples of a smooth $f$, one DCT performs the basis change from values to Chebyshev coefficients, and the inverse DCT reconstructs the values. Each costs $O(n \log n)$ when computed via an FFT. This $O(n^2) \to O(n \log n)$ acceleration is the reason spectral methods are practical, and we use it freely in the integration and differentiation sections that follow.
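In code the whole values→coefficients step is a few lines. A sketch using `scipy.fft.dct` (type I); halving the first and last output entries implements the convention that the $k = 0$ and $k = n$ coefficients carry prefactor $1/n$:

```python
import numpy as np
from scipy.fft import dct

def cheb_coeffs(fvals):
    """Coefficients of the Chebyshev interpolant through values sampled
    at the nodes x_j = cos(j pi/n), j = 0..n, via the type-I DCT."""
    n = len(fvals) - 1
    c = dct(fvals, type=1) / n     # = (2/n) * sum'' f_j cos(jk pi/n)
    c[0] /= 2.0                    # prefactor 1/n at k = 0 and k = n
    c[-1] /= 2.0
    return c

# Exactness on P_n: sampling T_3 must return the unit coefficient vector e_3.
n = 8
x = np.cos(np.arange(n + 1) * np.pi / n)
c = cheb_coeffs(4 * x**3 - 3 * x)
print(np.round(c, 12))             # (0, 0, 0, 1, 0, ..., 0)
```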

Evaluating at Arbitrary Points

The DCT recovers $p_n(x_j)$ at the Chebyshev nodes themselves, but we often want $p_n(x)$ at some other $x \in [-1, 1]$. Given the coefficients $c_0, \ldots, c_n$, we need to evaluate

$$p_n(x) = \sum_{k=0}^n c_k\, T_k(x).$$

A naïve approach builds $T_0(x), T_1(x), \ldots, T_n(x)$ via the three-term recurrence and accumulates the sum, which is $O(n)$ work. Clenshaw’s algorithm does the same in $O(n)$ but runs the recurrence backward on the coefficients, never forming the intermediate $T_k(x)$ values. The backward sweep is backward-stable on $[-1, 1]$, which is why libraries implement it as the standard evaluator.

Algorithm 1 (Clenshaw’s algorithm)

Input. Coefficients $c_0, c_1, \ldots, c_n$; evaluation point $x \in [-1, 1]$.

Output. $p_n(x) = \sum_{k=0}^n c_k\, T_k(x)$.

  1. Set $b_{n+2} = b_{n+1} = 0$.

  2. For $k = n, n-1, \ldots, 1$, compute

    $$b_k = 2x\, b_{k+1} - b_{k+2} + c_k.$$

  3. Return $p_n(x) = c_0 + x\, b_1 - b_2$.

Proof 3

Rearranging the recurrence definition, $c_k = b_k - 2x\, b_{k+1} + b_{k+2}$, and using $2x\, T_k(x) = T_{k+1}(x) + T_{k-1}(x)$ from the Chebyshev recurrence,

$$c_k\, T_k = b_k\, T_k - b_{k+1}(T_{k+1} + T_{k-1}) + b_{k+2}\, T_k.$$

Summing from $k = 1$ to $n$, the cross-terms telescope: the $b_k T_k$ sum cancels the $b_{k+1} T_{k+1}$ sum shifted by one index, and the $b_{k+1} T_{k-1}$ sum cancels the $b_{k+2} T_k$ sum shifted by one index, leaving only the boundary terms

$$\sum_{k=1}^n c_k\, T_k(x) = b_1\, T_1(x) - b_2\, T_0(x) = x\, b_1 - b_2.$$

Adding $c_0 T_0(x) = c_0$ gives the stated formula.
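The algorithm is a direct transcription into code. A sketch, assuming numpy; `np.polynomial.chebyshev.chebval` evaluates the same series and serves as a cross-check:

```python
import numpy as np

def clenshaw(c, x):
    """Evaluate p(x) = sum_k c_k T_k(x) by Clenshaw's backward recurrence."""
    b1 = b2 = 0.0                  # b_{n+1} and b_{n+2}
    for ck in c[:0:-1]:            # k = n, n-1, ..., 1
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return c[0] + x * b1 - b2

c = [0.5, -1.0, 0.25, 2.0]
print(clenshaw(c, 0.3), np.polynomial.chebyshev.chebval(0.3, c))
```

The pair `(b1, b2)` plays the role of $(b_{k+1}, b_{k+2})$ in the loop, so no array of intermediate $T_k(x)$ values is ever formed.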

A Classical Aside: The Minimax Property

The Chebyshev polynomial $T_n$ has a beautiful classical extremal property: among all monic polynomials of degree $n$ on $[-1, 1]$, the rescaled polynomial $2^{1-n}\, T_n$ has the smallest sup-norm, equal to $2^{1-n}$. This is the minimax or equioscillation property. Applied to the node polynomial $\omega(x) = \prod_j (x - x_j)$ of $n+1$ interpolation nodes, it means the zeros of $T_{n+1}$, the first-kind Chebyshev nodes, are the choice that minimises $\max |\omega|$. The second-kind nodes used throughout this chapter are close cousins with the same asymptotic behaviour. None of this is load-bearing for the convergence rates; it is a standalone classical fact. See Trefethen (2013), Ch. 3 for the proof and context.
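The sup-norms are easy to compare on a fine grid. A sketch, assuming numpy, pitting the monic Chebyshev polynomial against a monic node polynomial with equispaced roots:

```python
import numpy as np

n = 10
xx = np.linspace(-1.0, 1.0, 10001)

# Monic Chebyshev polynomial 2^{1-n} T_n: sup-norm exactly 2^{1-n}.
monic_cheb = 2.0 ** (1 - n) * np.polynomial.chebyshev.Chebyshev.basis(n)(xx)
print(np.max(np.abs(monic_cheb)), 2.0 ** (1 - n))

# Monic node polynomial with n equispaced roots: noticeably larger sup-norm.
roots = np.linspace(-1.0, 1.0, n)
omega = np.prod(xx[:, None] - roots[None, :], axis=1)
print(np.max(np.abs(omega)))
```

The grid contains $\pm 1$, where $|T_n| = 1$, so the first maximum is exact rather than approximate.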

References
  1. Cooley, J. W., & Tukey, J. W. (1965). An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90), 297–301.
  2. Trefethen, L. N. (2013). Approximation Theory and Approximation Practice. SIAM.