
Applications of Weak and Weak* Convergence

Big Idea

The three notions of convergence (strong, weak, weak*) form a chain of implications. In probability, they correspond to convergence in $L^1$ norm, convergence in expectation, and convergence in distribution. The hierarchy of topologies explains when they coincide (reflexive spaces) and when they diverge. The direct method of the calculus of variations combines Banach-Alaoglu with weak lower semicontinuity to prove existence of minimizers.

Convergence in probability: all three in action

The three convergences appear naturally in probability, where they correspond to familiar notions.

Convergence in expectation as weak convergence

A random variable $X$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is an element of $L^1(\Omega, \mathbb{P})$ precisely when it has finite expectation: $\mathbb{E}[|X|] = \int_\Omega |X|\,d\mathbb{P} < \infty$. The $L^1$ norm is the expected absolute value: $\|X\|_{L^1} = \mathbb{E}[|X|]$.

Example 1 (Convergence in expectation in $L^1$)

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. The Banach space $L^1(\Omega, \mathbb{P})$ has dual $(L^1)^* = L^\infty(\Omega, \mathbb{P})$. Each bounded measurable function $h \in L^\infty$ defines a continuous linear functional on $L^1$ via

$$\varphi_h(X) = \int_\Omega X(\omega)\,h(\omega)\,d\mathbb{P}(\omega) = \mathbb{E}[Xh].$$

Think of $h$ as a weight function that selects which parts of $X$ to measure. Typical choices:

  • $h = \mathbf{1}_A$ (indicator of a measurable set): $\varphi_h(X) = \int_A X\,d\mathbb{P}$, the integral of $X$ over $A$.

  • $h = \operatorname{sgn}(X)$: $\varphi_h(X) = \mathbb{E}[|X|] = \|X\|_{L^1}$, recovering the norm.

  • $h = e^{i\xi \cdot (\cdot)}$ (a character, if $\Omega \subseteq \mathbb{R}^d$): $\varphi_h(X) = \widehat{X}(\xi)$, a Fourier coefficient.
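These functionals are easy to probe numerically. The sketch below is our own illustration, not part of the theory: it takes $\Omega = [0,1]$ with Lebesgue measure, the arbitrary choice $X(\omega) = \omega - 1/2$, and approximates $\mathbb{E}[Xh]$ by midpoint Riemann sums.

```python
import numpy as np

# Probing phi_h(X) = E[Xh] on Omega = [0,1] with Lebesgue measure.
# X(omega) = omega - 1/2 is an arbitrary illustrative random variable;
# integrals are approximated by midpoint Riemann sums.
N = 100_000
omega = (np.arange(N) + 0.5) / N       # midpoint grid on [0, 1]
X = omega - 0.5

def phi(h):
    """phi_h(X) = E[Xh], approximated on the grid."""
    return np.mean(X * h)

# h = indicator of A = [0, 1/2]: the integral of X over A
h_ind = (omega <= 0.5).astype(float)
print(phi(h_ind))                      # exact value: -1/8

# h = sgn(X): recovers the L^1 norm E[|X|]
h_sgn = np.sign(X)
print(phi(h_sgn))                      # exact value: E[|X|] = 1/4
```

Different weights read off different features of the same $X$: the indicator sees only the mass on $A$, while $\operatorname{sgn}(X)$ sees the full norm.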

A sequence of random variables $X_n \in L^1$ converges weakly ($X_n \rightharpoonup X$) if and only if

$$\mathbb{E}[X_n h] \to \mathbb{E}[X h] \quad \text{for every } h \in L^\infty.$$

In the foliation picture: each weight $h$ defines a foliation of $L^1$, and weak convergence means the “weighted expectation” $\mathbb{E}[X_n h]$ stabilizes for every choice of weight. In particular, taking $h = \mathbf{1}_A$ for every measurable $A$ requires $\int_A X_n\,d\mathbb{P} \to \int_A X\,d\mathbb{P}$: the integrals of $X_n$ agree in the limit with those of $X$ over every measurable set.

Strong convergence in $L^1$ is $\mathbb{E}[|X_n - X|] \to 0$, which implies weak convergence but not conversely. A sequence can have all its weighted expectations converge without the random variables getting close in $L^1$ norm.
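A concrete instance of the gap can be checked numerically. The oscillating example $X_n(\omega) = \sin(2\pi n\omega)$ on $[0,1]$ is our own choice for illustration: by the Riemann-Lebesgue lemma, $\mathbb{E}[X_n h] \to 0$ for every $h \in L^\infty$, so $X_n \rightharpoonup 0$ weakly, yet $\mathbb{E}[|X_n|] = 2/\pi$ for every $n$.

```python
import numpy as np

# Weak but not strong convergence in L^1([0,1]): X_n(w) = sin(2*pi*n*w).
# Riemann-Lebesgue gives E[X_n h] -> 0 for every h in L^inf, so X_n -> 0 weakly,
# yet E[|X_n - 0|] = 2/pi for all n: no strong convergence.
N = 1_000_000
w = (np.arange(N) + 0.5) / N            # midpoint grid on [0, 1]
h = (w <= 1/3).astype(float)            # one fixed weight, h = 1_{[0, 1/3]}

weighted, norms = [], []
for n in [1, 10, 100, 1000]:
    Xn = np.sin(2 * np.pi * n * w)
    weighted.append(np.mean(Xn * h))    # E[X_n h]: decays like 1/n here
    norms.append(np.mean(np.abs(Xn)))   # E[|X_n|]: stays at 2/pi ~ 0.6366

print(weighted)
print(norms)
```

The weighted expectations shrink while the norms refuse to budge: the oscillations cancel against every fixed weight but never cancel against themselves.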

Measures and weak* convergence

The richest example of weak* convergence in practice is the convergence of measures. By the Riesz–Markov–Kakutani theorem, $C(K)^* \cong \mathcal{M}(K)$ (the space of finite signed Radon measures), and weak* convergence of measures means $\int g \, d\mu_n \to \int g \, d\mu$ for every continuous $g$. For probability measures, this is convergence in distribution: the mode of convergence in the central limit theorem.
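The duality can be tested directly on one continuous bounded $g$. The Monte Carlo sketch below is our own illustration (the choices $g = \cos$ and uniform summands are arbitrary): $\mu_n$ is the law of a standardized sum of $n$ uniform variables, $\mu = N(0,1)$, and $\int \cos\,d\mu = e^{-1/2}$.

```python
import numpy as np

# Weak* convergence of measures, tested on one continuous bounded g:
# mu_n = law of the standardized sum of n uniform(-1,1) variables,
# mu = N(0,1), g = cos, with \int cos d(N(0,1)) = exp(-1/2).
rng = np.random.default_rng(0)
samples = 200_000

def integral_mu_n(n):
    # Var(uniform(-1,1)) = 1/3, so dividing the sum by sqrt(n/3) standardizes it.
    U = rng.uniform(-1.0, 1.0, size=(samples, n))
    S = U.sum(axis=1) / np.sqrt(n / 3.0)
    return np.cos(S).mean()             # Monte Carlo estimate of \int g d(mu_n)

target = np.exp(-0.5)                   # \int g d(mu) for the N(0,1) limit
print(integral_mu_n(1), integral_mu_n(30), target)
```

Already at $n = 30$ the integral against $\mu_n$ is close to the Gaussian value, even though the densities of $\mu_1$ and $\mu$ look nothing alike.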

We develop this fully — including the total variation norm, Wasserstein distances, tightness, Prokhorov’s theorem, and optimal transport — in the distributions chapter on measure norms. There, the duality framework from this chapter combines with the distributional framework to give a complete picture of what different norms on measures “see.”

The direct method: a first look

The combination of Banach-Alaoglu and weak lower semicontinuity is the engine behind existence proofs in PDE and the calculus of variations. We give a brief preview here; we will develop this in detail in later sections.

Definition 1 (Weak lower semicontinuity)

A functional $\mathcal{F} \colon X \to \mathbb{R} \cup \{+\infty\}$ is weakly lower semicontinuous if whenever $x_n \rightharpoonup x$ weakly in $X$,

$$\mathcal{F}(x) \leq \liminf_{n \to \infty} \mathcal{F}(x_n).$$

The norm itself is the prototypical example: if $x_n \rightharpoonup x$, then $\|x\| \leq \liminf \|x_n\|$ (this follows from $|f(x)| = \lim |f(x_n)| \leq \|f\| \liminf \|x_n\|$ for any norming functional $f$). Many energy functionals in applications inherit this property from convexity.

The direct method proceeds in three steps:

  1. Boundedness. Show that a minimizing sequence $(u_n)$ for $\mathcal{F}$ is bounded: $\|u_n\| \leq C$.

  2. Compactness. Extract a weakly convergent subsequence $u_{n_k} \rightharpoonup u$. In a reflexive space this is Corollary 1; in a dual space $X = Y^*$, use Banach-Alaoglu (Theorem 1) for weak* compactness instead.

  3. Lower semicontinuity. Conclude that

$$\mathcal{F}(u) \leq \liminf_{k \to \infty} \mathcal{F}(u_{n_k}) = \inf \mathcal{F},$$

so $u$ is a minimizer.

Step 1 is problem-specific, typically a coercivity estimate. Step 2 is pure functional analysis, exactly what Banach-Alaoglu and Eberlein-Šmulian provide. Step 3 requires that $\mathcal{F}$ behaves well under weak limits, which is where convexity or compensated compactness arguments enter. We will return to this in detail when we study Sobolev spaces and variational problems.

The direct method in a reflexive space

Example 2 (Minimizing the Dirichlet energy)

Let $\Omega \subset \mathbb{R}^d$ be a bounded open set with smooth boundary, and fix a boundary datum $g \in H^{1/2}(\partial\Omega)$. Consider the Dirichlet energy

$$\mathcal{E}(u) = \frac{1}{2}\int_\Omega |\nabla u|^2 \, dx$$

over the admissible set $\mathcal{A} = \{u \in H^1(\Omega) : u|_{\partial\Omega} = g\}$. We claim that $\mathcal{E}$ attains its minimum on $\mathcal{A}$.

Step 1 (Boundedness). Let $(u_n)$ be a minimizing sequence: $\mathcal{E}(u_n) \to \inf_{\mathcal{A}} \mathcal{E}$. Since $\mathcal{E}(u_n)$ is bounded, so is $\|\nabla u_n\|_{L^2}$. The Poincaré inequality (applied to $u_n - \tilde{g}$ for any fixed extension $\tilde{g} \in H^1(\Omega)$ of $g$) gives $\|u_n\|_{H^1} \leq C$.

Step 2 (Compactness). The Sobolev space $H^1(\Omega) = W^{1,2}(\Omega)$ is a Hilbert space, hence reflexive. By Theorem 1, the bounded sequence $(u_n)$ has a weakly convergent subsequence $u_{n_k} \rightharpoonup u$ in $H^1(\Omega)$. The trace operator is continuous in the weak topology, so $u|_{\partial\Omega} = g$ and $u \in \mathcal{A}$.

Step 3 (Lower semicontinuity). The functional $u \mapsto \int_\Omega |\nabla u|^2 \, dx$ is convex and continuous in the norm topology, so by Mazur’s theorem (Theorem 2) its sublevel sets are weakly closed. Equivalently, $\mathcal{E}$ is weakly lower semicontinuous:

$$\mathcal{E}(u) \leq \liminf_{k \to \infty} \mathcal{E}(u_{n_k}) = \inf_{\mathcal{A}} \mathcal{E}.$$

Therefore $u$ is a minimizer. The minimizer is unique by strict convexity of $\mathcal{E}$, and satisfies the Euler-Lagrange equation $-\Delta u = 0$ in $\Omega$ (Laplace’s equation).

The entire argument rests on two pillars: reflexivity of $H^1$ gives weak compactness, and convexity of $\mathcal{E}$ gives weak lower semicontinuity. Remove either and the argument collapses.
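A finite-dimensional analogue makes the minimization concrete. The sketch below is a one-dimensional illustration, not the $H^1$ argument itself: it discretizes the Dirichlet energy on a grid over $[0,1]$ with boundary data $u(0)=a$, $u(1)=b$ (our arbitrary choices). Setting the gradient of the discrete energy to zero gives the discrete Laplace equation, whose solution is the linear interpolant, the 1-D harmonic function.

```python
import numpy as np

# One-dimensional, finite-dimensional analogue of the Dirichlet problem:
# minimize E(u) = (1/2) * sum_i ((u[i+1]-u[i])/h)^2 * h over grid functions
# on [0,1] with u(0) = a, u(1) = b fixed. Setting the gradient of E to zero
# gives the discrete Laplace equation -u[i-1] + 2u[i] - u[i+1] = 0.
M = 101                                  # grid points, including both endpoints
a, b = 1.0, 3.0                          # boundary data (arbitrary illustration)
h = 1.0 / (M - 1)
x = np.linspace(0.0, 1.0, M)

def energy(v):
    return 0.5 * np.sum(((v[1:] - v[:-1]) / h) ** 2) * h

# Tridiagonal system for the interior unknowns u[1..M-2].
n_int = M - 2
A = (2.0 * np.eye(n_int)
     - np.eye(n_int, k=1)
     - np.eye(n_int, k=-1))
rhs = np.zeros(n_int)
rhs[0], rhs[-1] = a, b                   # boundary values move to the right side

u = np.empty(M)
u[0], u[-1] = a, b
u[1:-1] = np.linalg.solve(A, rhs)

exact = a + (b - a) * x                  # the harmonic (linear) interpolant
competitor = exact + 0.1 * np.sin(np.pi * x)   # respects the boundary data
print(np.max(np.abs(u - exact)))         # the discrete minimizer is the line
print(energy(u) <= energy(competitor))   # and it beats an admissible competitor
```

In finite dimensions the compactness step is free (bounded closed sets are compact), which is exactly the luxury that reflexivity restores in $H^1$.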

Remark 1 (Why convexity gives weak lower semicontinuity)

Step 3 is the subtlest part of the direct method. Why can’t the energy increase when we pass to a weak limit? There are three ways to see this.

Direct proof for the Dirichlet energy. Since $u_{n_k} \rightharpoonup u$ in $H^1$, we have $\nabla u_{n_k} \rightharpoonup \nabla u$ in $L^2$. Weak convergence means $\langle \nabla u_{n_k}, w \rangle \to \langle \nabla u, w \rangle$ for every $w \in L^2$. Taking $w = \nabla u$:

$$\|\nabla u\|_{L^2}^2 = \langle \nabla u, \nabla u \rangle = \lim_k \langle \nabla u_{n_k}, \nabla u \rangle \leq \liminf_k \|\nabla u_{n_k}\|_{L^2} \, \|\nabla u\|_{L^2}$$

by Cauchy–Schwarz. Dividing by $\|\nabla u\|_{L^2}$ (the case $\nabla u = 0$ is trivial) gives $\|\nabla u\|_{L^2} \leq \liminf_k \|\nabla u_{n_k}\|_{L^2}$, hence $\mathcal{E}(u) \leq \liminf_k \mathcal{E}(u_{n_k})$. The same argument shows that every norm is weakly lower semicontinuous: the norm can only drop under weak limits, never increase.

Via Mazur’s theorem. More generally, let $\mathcal{F}$ be any convex and norm-continuous functional. Its sublevel sets $\{u : \mathcal{F}(u) \leq c\}$ are convex (by convexity of $\mathcal{F}$) and norm-closed (by continuity). By Mazur’s theorem (Theorem 2), convex norm-closed sets are weakly closed. So the sublevel sets are weakly closed, which is equivalent to $\mathcal{F}$ being weakly lower semicontinuous.

The intuition. By Mazur’s theorem, there exist convex combinations $v_k = \sum_j \lambda_j^{(k)} u_{n_j}$ converging strongly to $u$. For a convex functional, Jensen’s inequality gives

$$\mathcal{F}(v_k) = \mathcal{F}\!\left(\sum_j \lambda_j^{(k)} u_{n_j}\right) \leq \sum_j \lambda_j^{(k)} \mathcal{F}(u_{n_j}).$$

The right-hand side is at most $\max_j \mathcal{F}(u_{n_j})$, which (after passing to a subsequence realizing the limit inferior) is eventually close to $\liminf \mathcal{F}(u_{n_k})$. By norm continuity, $\mathcal{F}(v_k) \to \mathcal{F}(u)$, giving the inequality. In short: weak limits are limits of averages, and convex functionals can’t be fooled by averaging.
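The averaging intuition can be checked numerically. In $L^2([0,1])$ the sequence $u_n(x) = \sin(2\pi n x)$ converges weakly to $0$ while $\|u_n\|_{L^2} = 1/\sqrt{2}$ for every $n$; by orthogonality the Cesàro averages $v_k = \frac{1}{k}\sum_{j \le k} u_j$ satisfy $\|v_k\|_{L^2} = 1/\sqrt{2k} \to 0$, a strongly convergent sequence of convex combinations exactly as Mazur's theorem promises. (The specific sequence and grid below are our choices for illustration.)

```python
import numpy as np

# Mazur averaging, numerically: u_n(x) = sin(2*pi*n*x) in L^2([0,1]).
# Each ||u_n|| = 1/sqrt(2) and u_n -> 0 only weakly, but the Cesaro averages
# v_k = (1/k) * (u_1 + ... + u_k) converge to 0 strongly: orthogonality of the
# sines gives ||v_k||_{L^2} = 1/sqrt(2k).
N = 200_000
x = (np.arange(N) + 0.5) / N             # midpoint grid on [0, 1]

def l2_norm(f):
    return np.sqrt(np.mean(f * f))       # L^2([0,1]) norm via a Riemann sum

for k in [1, 10, 100]:
    v = sum(np.sin(2 * np.pi * j * x) for j in range(1, k + 1)) / k
    print(k, l2_norm(v), 1 / np.sqrt(2 * k))   # measured vs. predicted norm
```

Each individual $u_n$ keeps its full norm, but averaging lets the oscillations cancel one another, which is precisely why convex functionals cannot be fooled.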

What goes wrong without reflexivity

Example 3 (Failure of the direct method in $L^1$)

Consider the functional

$$\mathcal{F}(u) = \left| \int_0^1 u(x) \, dx - 1 \right|$$

over the constraint set $\mathcal{C} = \{u \in L^1([0,1]) : u \geq 0, \ \|u\|_{L^1} = 1\}$. The infimum is $\inf_{\mathcal{C}} \mathcal{F} = 0$, attained by the constant function $u = 1$.

Now replace the objective with something that rewards concentration. Define

$$\mathcal{G}(u) = -\int_0^1 u(x)^2 \, dx$$

over the same constraint set $\mathcal{C}$. A minimizing sequence is given by the approximations to the identity:

$$u_n(x) = n \, \mathbf{1}_{[0, 1/n]}(x).$$

Each $u_n \in \mathcal{C}$ with $\|u_n\|_{L^1} = 1$ and $\mathcal{G}(u_n) = -n \to -\infty$, so $\inf_{\mathcal{C}} \mathcal{G} = -\infty$.

But more instructive is what happens to the sequence itself. As measures, $u_n \, dx \rightharpoonup^* \delta_0$ in $\mathcal{M}([0,1]) = C([0,1])^*$: for any continuous test function $\varphi$,

$$\int_0^1 \varphi(x) \, u_n(x) \, dx = n \int_0^{1/n} \varphi(x) \, dx \to \varphi(0).$$

The weak* limit $\delta_0$ is a perfectly good Radon measure, but it does not belong to $L^1$. The mass has concentrated at a single point. This is the failure mode: $L^1$ is not reflexive, so bounded sequences need not have weakly convergent subsequences in $L^1$. The sequence escapes to the larger space of measures, and the direct method cannot recover a minimizer in the original space.
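The concentration is easy to watch numerically. The sketch below is our own illustration (with the arbitrary test function $\varphi = \cos$ on a midpoint grid); it verifies the three facts at once: $\|u_n\|_{L^1} = 1$, $\int \varphi\, u_n\,dx \to \varphi(0) = 1$, and $\mathcal{G}(u_n) = -n$.

```python
import numpy as np

# Watching u_n = n * 1_{[0,1/n]} concentrate. On a midpoint grid over [0,1]:
# the L^1 norm stays 1, the action on a continuous phi tends to phi(0),
# and G(u_n) = -n runs off to -infinity. phi = cos is an arbitrary test function.
N = 1_000_000
x = (np.arange(N) + 0.5) / N

for n in [10, 100, 1000]:
    un = n * (x <= 1.0 / n)
    mass = np.mean(un)                   # ||u_n||_{L1} = 1 for every n
    action = np.mean(np.cos(x) * un)     # \int phi u_n dx -> phi(0) = 1
    G = -np.mean(un ** 2)                # G(u_n) = -n -> -infinity
    print(n, mass, action, G)
```

The $L^1$ norm sees nothing unusual while the $L^2$-type energy blows up: exactly the signature of mass concentrating at a point.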

The contrast between Example 2 and Example 3 illustrates why reflexivity is not a technical convenience but a structural requirement. In reflexive spaces, bounded sets are weakly compact and the direct method closes. In non-reflexive spaces like $L^1$, minimizing sequences can concentrate or oscillate their way out of the space, and one must either enlarge the space (to measures) or impose additional compactness conditions (such as equi-integrability via the Dunford-Pettis theorem).