
Big Idea

We have Barron's bound $\|f - f_n\|_{L^2} \le C_f / \sqrt{n}$ in any dimension. The contrast is the deterministic alternative: a tensor-product Chebyshev grid that inherits the 1D rate at the price of $n^d$ basis functions. The exponential blow-up has nothing to do with the choice of polynomials and everything to do with how Sobolev regularity accounts for the spectrum.

Definition 1 (Curse of dimensionality)

A numerical method for approximating a function $f: \mathbb{R}^d \to \mathbb{R}$ to accuracy $\varepsilon$ suffers from the curse of dimensionality if the number of basis functions, parameters, or function evaluations $n$ it needs grows exponentially in the dimension $d$, i.e.

$$n \;=\; \Omega\!\left(C^{\,d}\right) \quad \text{or} \quad n \;=\; \Omega\!\left(\varepsilon^{-d/k}\right)$$

for some constant $C > 1$ and per-axis smoothness $k$. The cost remains finite for any fixed $d$, but grows so fast with $d$ that even moderate dimensions ($d \approx 10$ to $20$) make the method unaffordable. The phrase is due to Bellman (1957) in the context of dynamic programming.
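
To put numbers on the second form, here is a quick back-of-the-envelope check with illustrative values $\varepsilon = 10^{-6}$ and $k = 2$, chosen for the example rather than taken from any particular method:

eps, k = 1e-6, 2                      # illustrative accuracy and per-axis smoothness
for d in (1, 2, 3, 10, 20):
    print(f'd = {d:2d}:  n ~ eps^(-d/k) = {eps**(-d / k):.1e}')

At $d = 20$ the count already reaches $\sim 10^{60}$ basis functions.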

Chebyshev series in 2D

Before the example we need to know what a 2D Chebyshev expansion is. A function $f: [-1, 1] \to \mathbb{R}$ has a 1D Chebyshev expansion $f(x) = \sum_{j} c_j T_j(x)$. For a function $f: [-1, 1]^2 \to \mathbb{R}$, the natural extension is the tensor product: use $T_j(x)$ in the first variable and $T_k(y)$ in the second, and form all products,

$$f(x, y) \;=\; \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} c_{jk}\, T_j(x)\, T_k(y).$$

Truncating to $j < n$ and $k < n$ gives an approximation built from $n \times n = n^2$ basis functions $T_j(x)\,T_k(y)$. The coefficient $c_{jk}$ is the inner product of $f$ against $T_j(x)\,T_k(y)$, computed in practice by applying the 1D DCT along each axis: first along the $x$ axis on every $y$-row, then along the $y$ axis on every column. The result is a 2D coefficient matrix $C \in \mathbb{R}^{n \times n}$ whose $(j, k)$ entry is $c_{jk}$.

The same recipe extends to $d$ dimensions. A degree-$n$ tensor product in $\mathbb{R}^d$ has $n^d$ basis functions and $n^d$ coefficients, computed by $d$ DCTs.
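
A minimal sketch of that $d$-dimensional loop, assuming the same second-kind grid and DCT-I conventions as the polyfit2d defined below (polyfitnd is a hypothetical name, not a library routine):

import numpy as np
from scipy.fft import dct

def polyfitnd(values):
    """Tensor-product Chebyshev coefficients of a d-dimensional array of
    values on the tensor grid of second-kind points: one DCT per axis."""
    c = np.asarray(values, dtype=float)
    for axis in range(c.ndim):
        n = c.shape[axis]
        # reverse so samples run over cos(pi*k/(n-1)), k = 0, ..., n-1
        c = dct(np.flip(c, axis=axis), type=1, axis=axis) / (n - 1)
        # halve the first and last coefficient along this axis
        sl = [slice(None)] * c.ndim
        sl[axis] = 0;  c[tuple(sl)] *= 0.5
        sl[axis] = -1; c[tuple(sl)] *= 0.5
    return c

On a 2D array this reproduces polyfit2d; the cost is $d$ transforms over $n^d$ values, so the $n^d$ storage dominates long before the transforms do.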

A 2D example

For a concrete look, fit a smooth non-separable function on $[-1, 1]^2$ by tensor-product Chebyshev with $n = 32$ per axis, then plot the $32 \times 32$ matrix of coefficients on a log scale.

import numpy as np
import matplotlib.pyplot as plt
from scipy.fft import dct

def chebpts(n):
    # Chebyshev points of the second kind on [-1, 1], in increasing order
    return np.sin(np.pi * np.arange(-n+1, n, 2) / (2*(n-1)))

def polyfit2d(values):
    """2D Chebyshev coefficients from values on the tensor grid of
    second-kind points.

    Apply the 1D DCT-I recipe along each axis in turn.
    """
    n1, n2 = values.shape
    c = dct(values[::-1, :], type=1, axis=0) / (n1 - 1)   # x-axis transform; rows reversed to match DCT-I ordering
    c[0, :] *= 0.5; c[-1, :] *= 0.5                       # halve endpoint coefficients
    c = dct(c[:, ::-1], type=1, axis=1) / (n2 - 1)        # y-axis transform
    c[:, 0] *= 0.5; c[:, -1] *= 0.5
    return c

n = 32
x = chebpts(n)
X, Y = np.meshgrid(x, x, indexing='ij')

f = np.exp(X * Y - 0.5 * X**2)             # smooth, non-separable
c = polyfit2d(f)

fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(np.log10(np.abs(c) + 1e-20), cmap='viridis',
               origin='lower', vmin=-15, vmax=0)
plt.colorbar(im, ax=ax, label=r'$\log_{10}|c_{jk}|$')
ax.set_xlabel(r'$k$ ($y$-direction index)')
ax.set_ylabel(r'$j$ ($x$-direction index)')
ax.set_title(rf'2D Chebyshev coefficients of $e^{{xy - x^2/2}}$, $n^2={n*n}$')
plt.tight_layout(); plt.show()

keep = np.abs(c) > 1e-12
print(f'Coefficients above 1e-12: {keep.sum()} of {n*n}')
[Figure: heatmap of $\log_{10}|c_{jk}|$ over $(j, k)$ for $e^{xy - x^2/2}$]
Coefficients above 1e-12: 87 of 1024

The mass concentrates in the low-frequency corner (small $j$ and $k$), and the entries decay fast in both directions. Even so, reaching $10^{-12}$ accuracy keeps about 9 coefficients per axis, roughly 90 in total. A 1D Chebyshev expansion of a comparably smooth function settles down in about 13 coefficients. The tensor product roughly multiplies the per-axis cost: in 3D the same target would keep $\approx 9^3 \approx 700$ coefficients, in 10D about $9^{10} \approx 3 \times 10^9$, even though the per-axis convergence is excellent.
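
Continuing the notebook above, one can extrapolate that count under the rough assumption that the kept set stays a per-axis cube:

per_axis = int(round(np.sqrt(keep.sum())))   # ~9 effective coefficients per axis
for d in (1, 2, 3, 10):
    print(f'd = {d:2d}: ~{float(per_axis)**d:.2g} coefficients kept')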

Where the $d$-dependence comes from

The exponential cost is not specific to Chebyshev. It is built into how Sobolev regularity accounts for the spectrum.

A function $f \in H^k(\mathbb{R}^d)$ has $\|f\|_{H^k}^2 = \int (1+|\omega|^2)^k\, |\hat f(\omega)|^2\, d\omega$ finite. Truncating the spectrum at frequency $R$, the $L^2$ tail is bounded by

$$\int_{|\omega| > R} |\hat f|^2\,d\omega \;\le\; R^{-2k}\, \|f\|_{H^k}^2,$$

so to reach $L^2$ accuracy $\varepsilon$ we need $R^{-k}\,\|f\|_{H^k} \lesssim \varepsilon$, i.e. $R \sim \varepsilon^{-1/k}$. To resolve every mode below $R$ on a grid we need one basis function per resolvable mode in the ball $\{|\omega| \le R\}$, whose volume scales as $R^d$. Hence

$$n \;\sim\; R^d \;\sim\; \varepsilon^{-d/k}.$$

The $d$ in the exponent comes from the volume of the frequency ball, not from a quirk of the polynomial basis. Any deterministic discretisation has to cover that volume.
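
A quick numerical check of that claim: counting integer frequency modes inside the ball shows the $R^d$ growth, up to the dimension-dependent ball-volume constant in front.

import numpy as np

R = 8
for d in (1, 2, 3):
    axes = [np.arange(-R, R + 1)] * d
    grid = np.array(np.meshgrid(*axes)).reshape(d, -1)   # all lattice points in [-R, R]^d
    inside = int((np.sum(grid**2, axis=0) <= R**2).sum())
    print(f'd = {d}: {inside} modes with |omega| <= R, vs R^d = {R**d}')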

For an analytic function (Bernstein ellipse with parameter $\rho$), $n \sim \log_\rho(1/\varepsilon)$ per axis, so the cost is $\log^d(1/\varepsilon)$ instead of $\varepsilon^{-d/k}$. Better, but still exponential in $d$.
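
Plugging in illustrative values ($\varepsilon = 10^{-12}$, $\rho = 2$, $k = 4$; none of these come from the text) makes the per-axis gap, and the shared blow-up in $d$, concrete:

import numpy as np

eps, rho, k = 1e-12, 2.0, 4
n_analytic = np.log(1 / eps) / np.log(rho)   # ~40 per axis
n_sobolev = eps**(-1 / k)                    # ~1000 per axis
for d in (2, 5, 10):
    print(f'd = {d:2d}: analytic ~ {n_analytic**d:.1e}, H^k ~ {n_sobolev**d:.1e}')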

What Barron’s bound buys

For a Barron-class target the trade becomes

$$n \;\sim\; \frac{C_f^2}{\varepsilon^2},$$

with no $d$ in the exponent. The price is the slow $1/\sqrt{n}$ rate; the prize is escaping the curse.
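
A closing back-of-the-envelope comparison with illustrative $C_f$, $\varepsilon$, and $k$ (again not from the text): the Barron count is fixed across dimensions while the tensor count runs away.

eps, C_f, k = 1e-2, 10.0, 4
n_barron = C_f**2 / eps**2                   # dimension-free: 1e6 here
for d in (2, 10, 100):
    print(f'd = {d:3d}: Barron ~ {n_barron:.0e}, tensor ~ {eps**(-d / k):.0e}')

With these numbers the tensor grid still wins for small $d$; the crossover, here around $d \approx 12$, is where the curse starts to bite.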