Overview

We’ve studied approximation with classical tools such as polynomials and Fourier series.

Neural networks offer a different approach: they approximate functions using compositions of simple nonlinear functions, achieving dimension-independent convergence rates for certain function classes.
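
To make “compositions of simple nonlinear functions” concrete, here is a minimal sketch of a one-hidden-layer ReLU network: each neuron applies a fixed nonlinearity to an affine map of the input, and the outputs are combined linearly. The names (`relu`, `shallow_net`) and the random weights are purely illustrative.

```python
import numpy as np

def relu(z):
    """Elementwise ReLU nonlinearity."""
    return np.maximum(z, 0.0)

def shallow_net(x, W, b, c):
    """One-hidden-layer network: f(x) = sum_k c_k * relu(w_k . x + b_k).

    x : (d,) input vector
    W : (n, d) hidden-layer weights
    b : (n,) hidden-layer biases
    c : (n,) output weights
    """
    return c @ relu(W @ x + b)

# Illustrative random network with n = 50 neurons on a d = 3 input.
rng = np.random.default_rng(0)
d, n = 3, 50
W = rng.normal(size=(n, d))
b = rng.normal(size=n)
c = rng.normal(size=n) / n
print(shallow_net(rng.normal(size=d), W, b, c))
```

The approximation question of this chapter is how well such sums of $n$ simple neurons can represent a target function as $n$ grows.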

Why Neural Networks Now?

This chapter connects approximation theory to modern machine learning:

| Classical | Neural Networks |
| --- | --- |
| Polynomials, Fourier series | Compositions of sigmoids/ReLUs |
| Explicit coefficients | Learned parameters |
| Exponential in $d$ | Can be dimension-independent |
| Linear in parameters | Nonlinear optimization |

Historical Context

Preview: The Barron Miracle

Classical approximation theory says: to approximate a $d$-dimensional function with accuracy $\epsilon$ using polynomials, you need $O(\epsilon^{-d})$ terms.

In $d = 100$ dimensions (modest for machine learning), that is $\epsilon^{-100}$ terms: completely impractical!

Barron’s theorem says: for functions with bounded Barron norm, a neural network with $n$ neurons achieves:

$$
\|f - f_n\|_{L^2} \leq \frac{C_f}{\sqrt{n}}
$$

The dimension $d$ appears only in the constant $C_f$, not in the convergence rate!

This is why neural networks can handle high-dimensional problems that defeat polynomial methods.
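
To see how large the gap is, here is a minimal sketch comparing the two parameter counts: the classical $O(\epsilon^{-d})$ term count against the $n \geq (C_f/\epsilon)^2$ neurons implied by the Barron bound. The values $\epsilon = 0.1$ and $C_f = 10$ are illustrative assumptions, and hidden constants in the $O(\cdot)$ are ignored.

```python
import math

def poly_terms(eps, d):
    """Rough classical count: O(eps^{-d}) terms for accuracy eps in d dimensions."""
    return eps ** (-d)

def barron_neurons(eps, C_f):
    """Neurons needed so that C_f / sqrt(n) <= eps, i.e. n >= (C_f / eps)^2."""
    return math.ceil((C_f / eps) ** 2)

eps, C_f = 0.1, 10.0  # illustrative accuracy target and Barron constant
for d in (2, 10, 100):
    print(f"d={d:3d}:  polynomial terms ~ {poly_terms(eps, d):.1e},  "
          f"Barron neurons ~ {barron_neurons(eps, C_f)}")
```

The polynomial count explodes with $d$ while the neuron count stays fixed, which is exactly the dimension-independence claimed above.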

Learning Outcomes

After completing this chapter, you should be able to: