The three notions of convergence (strong, weak, weak*) form a chain of implications. In probability, they correspond to convergence in norm, convergence in expectation, and convergence in distribution. The hierarchy of topologies explains when they coincide (reflexive spaces) and when they diverge. The direct method of the calculus of variations combines Banach-Alaoglu with weak lower semicontinuity to prove existence of minimizers.
## Convergence in probability: all three in action
The three convergences appear naturally in probability, where they correspond to familiar notions.
### Convergence in expectation as weak convergence
A random variable $X$ on a probability space $(\Omega, \mathcal F, \mathbb P)$ is an element of $L^1(\Omega, \mathcal F, \mathbb P)$ precisely when it has finite expectation: $\mathbb E[|X|] < \infty$. The $L^1$ norm is the expected absolute value: $\|X\|_{L^1} = \mathbb E[|X|]$.
Example 1 (Convergence in expectation in $L^1$)
Let $(\Omega, \mathcal F, \mathbb P)$ be a probability space. The Banach space $L^1(\Omega, \mathcal F, \mathbb P)$ has dual $L^\infty(\Omega, \mathcal F, \mathbb P)$. Each bounded measurable function $g \in L^\infty$ defines a continuous linear functional $\varphi_g$ on $L^1$ via

$$\varphi_g(X) = \mathbb E[gX] = \int_\Omega g(\omega)\, X(\omega)\, d\mathbb P(\omega).$$
Think of $g$ as a weight function that selects which parts of $\Omega$ to measure. Typical choices:

- $g = \mathbf 1_A$ (indicator of a measurable set $A \in \mathcal F$): $\varphi_g(X) = \int_A X \, d\mathbb P$, the integral of $X$ over $A$.
- $g = \operatorname{sgn}(X)$: $\varphi_g(X) = \mathbb E[|X|] = \|X\|_{L^1}$, recovering the norm.
- $g(\omega) = e^{-in\omega}$ (a character, if $\Omega$ is the circle with normalized Lebesgue measure): $\varphi_g(X) = \hat X(n)$, a Fourier coefficient.
A sequence $(X_n)$ of random variables converges weakly ($X_n \rightharpoonup X$ in $L^1$) if and only if

$$\mathbb E[g X_n] \to \mathbb E[g X] \quad \text{for every } g \in L^\infty.$$
In the foliation picture: each weight $g$ defines a foliation of $L^1$, and weak convergence means the “weighted expectation” $\mathbb E[g X_n]$ stabilizes for every choice of weight. In particular, taking $g = \mathbf 1_A$ for every measurable $A$ requires $\int_A X_n \, d\mathbb P \to \int_A X \, d\mathbb P$: the integrals of $X_n$ agree in the limit with those of $X$ over every measurable set.
Strong convergence in $L^1$ is $\mathbb E[|X_n - X|] \to 0$, which implies weak convergence but not vice versa. A sequence can have all its weighted expectations converge without the random variables getting close in the $L^1$ norm.
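The classical witness is oscillation. As a numerical sketch (the sequence below is our illustrative choice, not from the text): for $X_n(\omega) = \operatorname{sgn}\sin(2^n \pi \omega)$ on $([0,1], \text{Lebesgue})$, each weighted expectation $\mathbb E[g X_n]$ tends to $0$, yet $\mathbb E[|X_n|] = 1$ for all $n$; the sequence converges weakly to $0$ without converging strongly.

```python
import numpy as np

# Oscillation gives weak but not strong convergence (a sketch; the
# sequence X_n is our illustrative choice, not from the text).
# X_n(w) = sgn(sin(2^n * pi * w)): the weighted expectations E[g X_n]
# vanish in the limit, yet ||X_n||_{L^1} = E[|X_n|] = 1 for every n.

N = 2**20                            # midpoint quadrature on [0, 1]
w = (np.arange(N) + 0.5) / N

def X(n):
    return np.sign(np.sin(2**n * np.pi * w))

g = w                                # a bounded weight, g(w) = w

for n in (2, 6, 10):
    weighted = np.mean(g * X(n))     # quadrature for E[g X_n]
    norm = np.mean(np.abs(X(n)))     # quadrature for E[|X_n|]
    print(f"n={n:2d}  E[g X_n] = {weighted:+.5f}  ||X_n||_L1 = {norm:.3f}")
```

Increasing $n$ makes the weighted expectation shrink geometrically while the $L^1$ norm stays pinned at $1$: exactly the gap between the two topologies.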
### Measures and weak* convergence
The richest example of weak* convergence in practice is the convergence of measures. By the Riesz–Markov–Kakutani theorem, $C_0(\mathbb R)^* \cong \mathcal M(\mathbb R)$ (the space of finite signed Radon measures), and weak* convergence $\mu_n \xrightarrow{\,w^*\,} \mu$ of measures means $\int f \, d\mu_n \to \int f \, d\mu$ for every continuous $f$ vanishing at infinity. For probability measures, this is convergence in distribution — the convergence in the central limit theorem.
We develop this fully — including the total variation norm, Wasserstein distances, tightness, Prokhorov’s theorem, and optimal transport — in the distributions chapter on measure norms. There, the duality framework from this chapter combines with the distributional framework to give a complete picture of what different norms on measures “see.”
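A Monte Carlo sketch can make the weak* statement concrete (the setup below is our own, not part of the chapter's development): let $\mu_n$ be the law of the standardized sum of $n$ iid $\mathrm{Uniform}(0,1)$ variables. The CLT says $\mu_n \to N(0,1)$ in distribution, i.e. $\int f \, d\mu_n \to \int f \, d\gamma$ for bounded continuous $f$; for the test function $f = \cos$, the Gaussian integral is $e^{-1/2}$.

```python
import numpy as np

# Weak* convergence of laws, tested against one continuous function f
# (our Monte Carlo sketch). mu_n = law of the standardized sum
# S_n = (U_1 + ... + U_n - n/2) / sqrt(n/12), U_i iid Uniform(0,1).
# CLT: integral of f d(mu_n) -> integral of f dN(0,1) = exp(-1/2)
# for f(x) = cos(x).

rng = np.random.default_rng(0)

def integral_f_mu_n(n, samples=100_000):
    U = rng.random((samples, n))
    S = (U.sum(axis=1) - n / 2) / np.sqrt(n / 12)
    return np.cos(S).mean()          # Monte Carlo estimate of ∫ f dmu_n

target = np.exp(-0.5)                # ∫ cos dN(0,1) = e^{-1/2}
for n in (1, 2, 10, 50):
    print(f"n={n:3d}  ∫f dmu_n ≈ {integral_f_mu_n(n):.4f}  (target {target:.4f})")
```

Note that this checks convergence against a single $f$; convergence in distribution requires it for all bounded continuous test functions, which is what the duality $C_0^* = \mathcal M$ packages.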
## The direct method: a first look
The combination of Banach-Alaoglu and weak lower semicontinuity is the engine behind existence proofs in PDE and the calculus of variations. We give a brief preview here; we will develop this in detail in later sections.
Definition 1 (Weak lower semicontinuity)
A functional $F : X \to \mathbb R \cup \{+\infty\}$ on a Banach space $X$ is weakly lower semicontinuous if whenever $x_n \rightharpoonup x$ weakly in $X$,

$$F(x) \le \liminf_{n \to \infty} F(x_n).$$
The norm itself is the prototypical example: if $x_n \rightharpoonup x$, then $\|x\| \le \liminf_n \|x_n\|$ (this follows from $\|x\| = f(x) = \lim_n f(x_n) \le \liminf_n \|x_n\|$ for any norming functional $f$ with $\|f\| = 1$ and $f(x) = \|x\|$). Many energy functionals in applications inherit this property from convexity.
The direct method proceeds in three steps:
1. **Boundedness.** Show that a minimizing sequence $(x_n)$ for $F$ is bounded: $\sup_n \|x_n\| < \infty$.
2. **Compactness.** Extract a weakly convergent subsequence $x_{n_k} \rightharpoonup x^*$. In a reflexive space this is Corollary 1; in a dual space $X = Y^*$, use Banach-Alaoglu (Theorem 1) for weak* compactness instead.
3. **Lower semicontinuity.** Conclude that

   $$F(x^*) \le \liminf_{k \to \infty} F(x_{n_k}) = \inf F,$$

   so $x^*$ is a minimizer.
Step 1 is problem-specific, typically a coercivity estimate. Step 2 is pure functional analysis, exactly what Banach-Alaoglu and Eberlein-Šmulian provide. Step 3 requires that $F$ behave well under weak limits, which is where convexity or compensated compactness arguments enter. We will return to this in detail when we study Sobolev spaces and variational problems.
## The direct method in a reflexive space
Example 2 (Minimizing the Dirichlet energy)
Let $\Omega \subset \mathbb R^n$ be a bounded open set with smooth boundary, and fix a boundary datum $g \in H^{1/2}(\partial\Omega)$. Consider the Dirichlet energy

$$E(u) = \frac{1}{2} \int_\Omega |\nabla u|^2 \, dx$$

over the admissible set $\mathcal A = \{ u \in H^1(\Omega) : u|_{\partial\Omega} = g \}$. We claim that $E$ attains its minimum on $\mathcal A$.
**Step 1 (Boundedness).** Let $(u_n) \subset \mathcal A$ be a minimizing sequence: $E(u_n) \to \inf_{\mathcal A} E$. Since $(E(u_n))$ is bounded, so is $(\|\nabla u_n\|_{L^2})$. The Poincaré inequality (applied to $u_n - u_g$ for any fixed extension $u_g \in H^1(\Omega)$ of $g$) gives $\sup_n \|u_n\|_{H^1} < \infty$.
**Step 2 (Compactness).** The Sobolev space $H^1(\Omega)$ is a Hilbert space, hence reflexive. By Theorem 1, the bounded sequence $(u_n)$ has a weakly convergent subsequence $u_{n_k} \rightharpoonup u^*$ in $H^1(\Omega)$. The trace operator is continuous in the weak topology, so $u^*|_{\partial\Omega} = g$ and $u^* \in \mathcal A$.
**Step 3 (Lower semicontinuity).** The functional $E$ is convex and continuous in the $H^1$ norm topology, so by Mazur’s theorem (Theorem 2) its sublevel sets are weakly closed. Equivalently, $E$ is weakly lower semicontinuous:

$$E(u^*) \le \liminf_{k \to \infty} E(u_{n_k}) = \inf_{\mathcal A} E.$$
Therefore $u^*$ is a minimizer. The minimizer is unique by strict convexity of $E$, and satisfies the Euler-Lagrange equation $\Delta u^* = 0$ in $\Omega$ (Laplace’s equation).
The entire argument rests on two pillars: reflexivity of $H^1(\Omega)$ gives weak compactness, and convexity of $E$ gives weak lower semicontinuity. Remove either and the argument collapses.
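The Euler-Lagrange characterization can be checked in a finite-dimensional sketch (the 1D discretization below is ours, not the text's): minimizing the discrete Dirichlet energy with fixed endpoint values forces the discrete Laplace equation $u_{i-1} - 2u_i + u_{i+1} = 0$, whose solution is the straight line between the boundary data.

```python
import numpy as np

# Finite-difference sketch of Example 2 on Omega = (0,1) (our setup).
# Minimizing E(u) = (1/2) * sum_i ((u_{i+1}-u_i)/h)^2 * h subject to
# u_0 = a, u_M = b gives, via dE/du_i = 0, the discrete Laplace
# equation u_{i-1} - 2 u_i + u_{i+1} = 0: the linear interpolant.

M, a, b = 100, 1.0, 3.0
h = 1.0 / M

# Tridiagonal system for the interior unknowns u_1 .. u_{M-1}.
A = np.diag(-2.0 * np.ones(M - 1)) \
  + np.diag(np.ones(M - 2), 1) + np.diag(np.ones(M - 2), -1)
rhs = np.zeros(M - 1)
rhs[0], rhs[-1] = -a, -b             # boundary data moved to the RHS
u_int = np.linalg.solve(A, rhs)

u = np.concatenate(([a], u_int, [b]))
x = np.linspace(0.0, 1.0, M + 1)

def energy(v):
    return 0.5 * np.sum(np.diff(v)**2) / h   # discrete Dirichlet energy

print("max deviation from line:", np.max(np.abs(u - (a + (b - a) * x))))
print("E(minimizer): ", energy(u))
print("E(competitor):", energy(a + (b - a) * x**2))  # same boundary data
```

Any admissible competitor (here the parabola with the same endpoint values) has strictly larger energy, in line with the strict convexity of $E$.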
Remark 1 (Why convexity gives weak lower semicontinuity)
Step 3 is the subtlest part of the direct method. Why can’t the energy increase when we pass to a weak limit? There are three ways to see this.
**Direct proof for the Dirichlet energy.** Since $u_{n_k} \rightharpoonup u^*$ in $H^1(\Omega)$, we have $\nabla u_{n_k} \rightharpoonup \nabla u^*$ in $L^2(\Omega)$. Weak convergence means $\int_\Omega \nabla u_{n_k} \cdot v \, dx \to \int_\Omega \nabla u^* \cdot v \, dx$ for every $v \in L^2(\Omega)$. Taking $v = \nabla u^*$:

$$\|\nabla u^*\|_{L^2}^2 = \lim_{k \to \infty} \int_\Omega \nabla u_{n_k} \cdot \nabla u^* \, dx \le \liminf_{k \to \infty} \|\nabla u_{n_k}\|_{L^2} \, \|\nabla u^*\|_{L^2}$$

by Cauchy–Schwarz. Dividing by $\|\nabla u^*\|_{L^2}$ (when nonzero) gives $\|\nabla u^*\|_{L^2} \le \liminf_k \|\nabla u_{n_k}\|_{L^2}$, hence $E(u^*) \le \liminf_k E(u_{n_k})$. The same argument shows that every norm is weakly lower semicontinuous: the norm can only drop under weak limits, never increase.
**Via Mazur’s theorem.** More generally, let $F : X \to \mathbb R$ be any convex, norm-continuous functional. Its sublevel sets $\{ x \in X : F(x) \le c \}$ are convex (by convexity of $F$) and norm-closed (by continuity). By Mazur’s theorem (Theorem 2), convex norm-closed sets are weakly closed. So the sublevel sets are weakly closed, which is equivalent to $F$ being weakly lower semicontinuous.
**The intuition.** Pass first to a subsequence along which $F(x_{n_k}) \to \liminf_n F(x_n)$. By Mazur’s theorem, there exist convex combinations $y_m = \sum_k \lambda_k^{(m)} x_{n_k}$ of the tail $\{ x_{n_k} : k \ge m \}$ converging strongly to $x$. For a convex functional, Jensen’s inequality gives

$$F(y_m) \le \sum_k \lambda_k^{(m)} F(x_{n_k}).$$

The right-hand side is at most $\sup_{k \ge m} F(x_{n_k})$, which is eventually close to $\liminf_n F(x_n)$. By norm continuity, $F(y_m) \to F(x)$, giving the inequality $F(x) \le \liminf_n F(x_n)$. In short: weak limits are limits of averages, and convex functionals can’t be fooled by averaging.
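The averaging intuition can be seen numerically (our construction, not the text's): the Rademacher-type functions $r_n(x) = \operatorname{sgn}\sin(2^n \pi x)$ each have $L^1$ norm $1$ and converge weakly to $0$, while their Cesàro means, a particular sequence of convex combinations as in Mazur's theorem, converge strongly to the weak limit $0$.

```python
import numpy as np

# Mazur's averaging in action (our sketch): each r_n has ||r_n||_L1 = 1,
# but the Cesaro means (convex combinations) of the weakly-null sequence
# have L^1 norms shrinking toward 0, the norm of the weak limit.

N = 2**18                            # midpoint grid resolving n <= 16
x = (np.arange(N) + 0.5) / N
r = [np.sign(np.sin(2**n * np.pi * x)) for n in range(1, 17)]

for m in (1, 4, 16):
    avg = np.mean(r[:m], axis=0)     # convex combination of r_1..r_m
    print(f"m={m:2d}  ||avg||_L1 = {np.mean(np.abs(avg)):.3f}")
```

Since the $r_n$ behave like independent $\pm 1$ coin flips, the $L^1$ norm of the $m$-term average decays like $\sqrt{2/(\pi m)}$: averaging destroys the oscillation that weak convergence ignores.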
## What goes wrong without reflexivity
Example 3 (Failure of the direct method in )
Consider the functional

$$F(f) = \int_0^1 f(x)^2 \, dx$$

over the constraint set $C = \{ f \in L^1(0,1) : f \ge 0, \ \int_0^1 f \, dx = 1 \}$. The infimum is $1$, attained by the constant function $f \equiv 1$ (by Jensen’s inequality, $\int f^2 \ge (\int f)^2 = 1$).
Now replace the objective with something that rewards concentration. Define

$$G(f) = \int_0^1 x \, f(x) \, dx$$

over the same constraint set $C$. A minimizing sequence is given by the approximations to the identity:

$$f_n = n \, \mathbf 1_{[0, 1/n]}, \qquad n = 1, 2, \dots$$

Each $f_n \in C$, with $f_n \ge 0$ and $\int_0^1 f_n \, dx = 1$, and $G(f_n) = n \int_0^{1/n} x \, dx = \tfrac{1}{2n} \to 0$, so $\inf_C G = 0$. But the infimum is not attained: $G(f) > 0$ for every $f \in C$.
But more instructive is what happens to the sequence itself. As measures, $f_n \, dx \xrightarrow{\,w^*\,} \delta_0$ in $\mathcal M([0,1])$: for any continuous test function $\varphi \in C([0,1])$,

$$\int_0^1 \varphi(x) \, f_n(x) \, dx = n \int_0^{1/n} \varphi(x) \, dx \to \varphi(0).$$
The weak* limit $\delta_0$ is a perfectly good Radon measure, but it does not belong to $L^1(0,1)$. The mass has concentrated at a single point. This is the failure mode: $L^1$ is not reflexive, so bounded sequences need not have weakly convergent subsequences in $L^1$. The sequence escapes to the larger space of measures, and the direct method cannot recover a minimizer in the original space.
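A short numerical check of the concentration (with the specific objective $G(f) = \int_0^1 x f \, dx$ being our reconstruction of the example's formulas): the sequence $f_n = n \, \mathbf 1_{[0,1/n]}$ keeps total mass $1$, drives the objective to the unattained infimum $0$, and pairs against continuous test functions like the Dirac measure $\delta_0$.

```python
import numpy as np

# Concentration in L^1 (the objective G is our reconstruction of the
# example): f_n = n * 1_{[0,1/n]} has mass 1 for every n, while
# G(f_n) = ∫ x f_n dx = 1/(2n) -> 0 and ∫ phi f_n dx -> phi(0),
# i.e. f_n dx converges weak* to the Dirac measure delta_0.

N = 10**6
x = (np.arange(N) + 0.5) / N         # midpoint grid on (0, 1)
phi = np.cos                         # a continuous test function

for n in (10, 100, 1000):
    f_n = np.where(x < 1.0 / n, float(n), 0.0)
    mass = np.mean(f_n)              # ∫ f_n dx          (stays 1)
    G = np.mean(x * f_n)             # ∫ x f_n dx        (-> 0)
    pairing = np.mean(phi(x) * f_n)  # ∫ phi f_n dx      (-> phi(0) = 1)
    print(f"n={n:4d}  mass={mass:.3f}  G={G:.5f}  ∫phi f_n={pairing:.5f}")
```

The mass never leaves, but it piles up on a shrinking interval; in the limit it sits on a set of measure zero, which no $L^1$ function can represent.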
The contrast between Example 2 and Example 3 illustrates why reflexivity is not a technical convenience but a structural requirement. In reflexive spaces, bounded sets are weakly compact and the direct method closes. In non-reflexive spaces like $L^1$, minimizing sequences can concentrate or oscillate their way out of the space, and one must either enlarge the space (to measures) or impose additional compactness conditions (such as equi-integrability via the Dunford-Pettis theorem).