The Weak Topology

Big Idea

For applications in PDE and the calculus of variations, we need a topology that is Hausdorff (so limits are unique) and has many compact sets (so we can extract convergent subsequences). These demands pull in opposite directions. The weak topology is the sweet spot: coarse enough for compactness, fine enough for separation.

What we need from a topology

For the existence arguments that drive much of applied analysis, we need two things from a topology on a Banach space $X$.

Recall that convergence in a topological space is defined in terms of open sets: a sequence $(x_n)$ in a topological space $X$ converges to $x \in X$ if for every open set $U$ containing $x$, there exists $n_0 \in \mathbb{N}$ such that $x_n \in U$ for all $n \geq n_0$. In a metric space, this reduces to the familiar $\varepsilon$-definition, but in a general topological space the open sets may look very different from metric balls.

Hausdorff (unique limits). We need limits to be unique: if $x_n \to x$ and $x_n \to y$, then $x = y$. A topology is Hausdorff if any two distinct points can be separated by disjoint open sets. This forces uniqueness: if $x \neq y$, choose disjoint open sets $U \ni x$ and $V \ni y$. Eventually $x_n \in U$ and eventually $x_n \in V$, but $U \cap V = \emptyset$, a contradiction. Without the Hausdorff property, a sequence can converge to multiple points simultaneously (in the indiscrete topology $\{\emptyset, X\}$, every sequence converges to every point), and limit arguments become meaningless.

Compact sets (convergent subsequences). We need to extract convergent subsequences from bounded families. A set $K$ is compact if every open cover has a finite subcover. Why does this give subsequences? Take a sequence $(x_n)$ in $K$. Any open cover of $K$ has a finite subcover $U_1, \ldots, U_m$. Since infinitely many terms of the sequence are distributed among finitely many sets, at least one $U_j$ contains infinitely many $x_n$. Because the cover was arbitrary, its members can be taken as small as we like, so the terms are forced to accumulate somewhere in $K$: every sequence in a compact set has a cluster point. In a metric space this yields a convergent subsequence; in a general topological space, passing from a cluster point to a convergent subsequence needs an extra argument. Compactness guarantees that sequences cannot escape.

The tension. These two demands pull in opposite directions. The indiscrete topology $\{\emptyset, X\}$ makes every space trivially compact (the only open cover by nonempty sets is $\{X\}$), but no two points can be separated. The discrete topology (every subset is open) makes every space Hausdorff, but only finite sets are compact: given an infinite set $A$, the cover $\{\{x\} : x \in A\}$ consists of open singletons and has no finite subcover. In general, making a topology coarser helps compactness but threatens separation; making it finer helps separation but works against compactness.

In finite dimensions, the norm topology gives us both: it is Hausdorff and closed bounded sets are compact (Heine-Borel). In infinite dimensions, the norm topology remains Hausdorff but loses compactness: the closed unit ball is never norm-compact (Riesz’s theorem). For instance, the standard basis vectors $e_1, e_2, e_3, \ldots$ in $\ell^2$ all lie on the unit sphere with $\|e_m - e_n\| = \sqrt{2}$ for $m \neq n$, so no subsequence is Cauchy. The bumps “escape” into new dimensions and never cluster. The norm topology has too many open sets.
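The basis-vector computation can be checked numerically. The following is a minimal sketch, working in a finite truncation of $\ell^2$; the dimension `dim` and the helper names are illustrative assumptions, not part of the text. It confirms that distinct basis vectors stay at distance $\sqrt{2}$ from each other.

```python
import math

def basis_vector(n, dim):
    """n-th standard basis vector (1-indexed) in a dim-dimensional truncation."""
    e = [0.0] * dim
    e[n - 1] = 1.0
    return e

def norm(v):
    """Euclidean (l^2) norm of a finite list of coordinates."""
    return math.sqrt(sum(c * c for c in v))

def distance(u, v):
    return norm([a - b for a, b in zip(u, v)])

dim = 10  # truncation level, chosen arbitrarily for illustration
distances = {
    (m, n): distance(basis_vector(m, dim), basis_vector(n, dim))
    for m in range(1, 7)
    for n in range(1, 7)
    if m != n
}

# Every pair of distinct basis vectors is sqrt(2) apart,
# so no subsequence of (e_n) can be Cauchy.
print(set(round(d, 12) for d in distances.values()))
```

The truncation only illustrates the pattern; the actual obstruction lives in the infinite-dimensional space, where there are infinitely many such vectors.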

The idea is to weaken the topology just enough to recover compactness while keeping the Hausdorff property. The Hahn-Banach theorem makes this possible: it guarantees enough continuous linear functionals to separate points, so the topology they generate is Hausdorff. The resulting weak topology has fewer open sets (and therefore fewer open covers), making compactness easier to achieve.

From norm balls to weak neighborhoods

In a normed space $X$, the norm topology is generated by the open balls

$$B_r(x) = \{ y \in X : \|y - x\| < r \}.$$

The idea behind the weak topology is to relax the notion of neighborhood. The building blocks are slabs: for a single functional $f \in X^*$ and $\varepsilon > 0$, define

$$S(f, x, \varepsilon) = \{ y \in X : |f(y - x)| < \varepsilon \}.$$

Geometrically, $f$ defines a family of parallel hyperplanes $H_c = \{x \in X : f(x) = c\}$, and the slab $S(f, x, \varepsilon)$ is the region between $H_{f(x) - \varepsilon}$ and $H_{f(x) + \varepsilon}$: a strip of width $2\varepsilon / \|f\|$ in the direction normal to the hyperplanes. It constrains $y$ only in the direction that $f$ measures, and is unbounded in every direction within $\ker f$.

Figure (two panels). Left: the norm ball is bounded in every direction. Right: the slab $|\langle x, a \rangle| < 1$ constrains only the direction normal to $\ker f$ and extends to infinity along $\ker f$. The vector $a$ is the normal direction. The point $y_n$ lies inside the slab but far outside the norm ball.

A weak open set is built up in three layers. Recall that a basis for a topology is a collection $\mathcal{B}$ of open sets such that every open set in the topology is a union of members of $\mathcal{B}$. Equivalently, for every open set $U$ and every $x \in U$, there exists $B \in \mathcal{B}$ with $x \in B \subseteq U$. In the norm topology, the open balls form a basis. In the weak topology, the role of open balls is played by tubes.

  1. Sub-basic open sets (slabs). A single slab $S(f, x, \varepsilon)$ imposes one linear constraint. It is an infinite strip, constraining one direction and placing no restriction on any direction in $\ker f$.

  2. Basic open sets (finite intersections of slabs). A finite intersection

    $$U(x; f_1, \ldots, f_n, \varepsilon) = \bigcap_{i=1}^n S(f_i, x, \varepsilon)$$

    constrains $n$ directions simultaneously. Think of this as a tube: bounded cross-section in finitely many directions, infinite extent in all others. These tubes form a basis for the weak topology.

  3. General open sets (arbitrary unions of tubes). Every weakly open set is a union of tubes, just as every norm-open set is a union of open balls.

The key point: in infinite dimensions, every tube is unbounded. Since $n < \dim X$ always, the common kernel $\bigcap_{i=1}^n \ker f_i$ is a nontrivial subspace, and the tube around $x$ contains $x$ plus all of that subspace. Since every point in a weakly open set has a tube around it, and every tube is unbounded, every nonempty weakly open set is unbounded in infinite dimensions. No bounded set like a norm ball can be weakly open, and no bounded set can even contain a nonempty weakly open set.

Definition 1 (Weak topology)

The weak topology on $X$, denoted $\sigma(X, X^*)$, is the coarsest topology making every $f \in X^*$ continuous. A neighborhood basis at $x \in X$ consists of the basic open sets:

$$U(x; f_1, \ldots, f_n, \varepsilon) = \{ y \in X : |f_i(y - x)| < \varepsilon \text{ for } i = 1, \ldots, n \}$$

where $f_1, \ldots, f_n \in X^*$ and $\varepsilon > 0$.

Continuity of $f$ requires preimages of open sets to be open. Every open set in $\mathbb{R}$ is a union of open intervals, and $f^{-1}((a,b))$ is exactly a slab between two parallel hyperplanes. So any topology making all $f \in X^*$ continuous must contain all slabs. The coarsest such topology is the one generated by the slabs alone, with no extra open sets added.

The weak topology is Hausdorff: if $x \neq y$, then by the Hahn-Banach theorem there exists $f \in X^*$ with $f(x) \neq f(y)$, so $x$ and $y$ lie in disjoint slabs. Without Hahn-Banach, we would have no guarantee that $X^*$ contains enough functionals to distinguish points, and the weak topology could fail to be Hausdorff.

Example 1 (The weak topology on $\ell^2$)

Take $X = \ell^2$ with its standard orthonormal basis $(e_n)$. By the Riesz representation theorem, every $f \in (\ell^2)^*$ is of the form $f(x) = \langle x, a \rangle$ for some $a \in \ell^2$. Consider the functional $f_1(x) = \langle x, e_1 \rangle = x_1$. The slab

$$S(f_1, 0, 1) = \{x \in \ell^2 : |x_1| < 1\}$$

is the region between the hyperplanes $x_1 = -1$ and $x_1 = 1$. This slab contains vectors with arbitrarily large norm, as long as their first component is small. For instance, $y_n = \tfrac{1}{2} e_1 + n\, e_2$ satisfies $|f_1(y_n)| = \tfrac{1}{2} < 1$, so $y_n \in S(f_1, 0, 1)$ for every $n$, yet $\|y_n\| = \sqrt{1/4 + n^2} \to \infty$. The slab sees only the first coordinate and is blind to the $e_2$ direction.

Adding more functionals narrows the neighborhood but never makes it bounded. With the coordinate functionals $f_k(x) = \langle x, e_k \rangle = x_k$, the set $U(0; f_1, \ldots, f_n, \varepsilon) = \{x : |x_k| < \varepsilon \text{ for } k = 1, \ldots, n\}$ constrains $n$ coordinates but leaves infinitely many unconstrained. This is a tube: bounded cross-section in the first $n$ coordinates, unbounded in all others.
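The membership claims in this example are easy to test by hand or by machine. Below is a small sketch, again in a finite truncation of $\ell^2$; the function names `in_tube` and `y` and the truncation length are hypothetical choices made for illustration. It checks that $y_n$ stays inside the slab $S(f_1, 0, 1)$ while its norm grows, and that adding a second coordinate constraint excludes it.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def in_tube(x, center, num_constrained, eps):
    """True if |x_k - center_k| < eps for the first num_constrained coordinates.

    With num_constrained = 1 this is the slab S(f_1, center, eps).
    """
    return all(abs(x[k] - center[k]) < eps for k in range(num_constrained))

def y(n, dim=5):
    """y_n = (1/2) e_1 + n e_2, stored in a dim-dimensional truncation."""
    v = [0.0] * dim
    v[0] = 0.5
    v[1] = float(n)
    return v

origin = [0.0] * 5
for n in (1, 10, 1000):
    # One constraint (slab): y_n is inside. Two constraints (tube): y_n is outside.
    print(n, in_tube(y(n), origin, 1, 1.0), in_tube(y(n), origin, 2, 1.0), norm(y(n)))
```

Constraining the second coordinate removes $y_n$ from the neighborhood, but a similar escape along $e_3$ is then available, which is the point of the tube picture.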

Remark 1 (Open, closed, or neither)

Every set in a topological space falls into exactly one of four categories: (1) open but not closed, (2) closed but not open, (3) both (clopen), or (4) neither. In the norm and weak topologies considered here, a typical set falls in category (4).

When passing from the norm topology to the weak topology, sets can only move toward category (4). A norm-open set may lose its openness; a norm-closed set may lose its closedness. No set gains either property, because every weakly open set is norm-open and every weakly closed set is norm-closed. The “neither” category grows at the expense of categories (1) and (2); the only clopen sets, $\emptyset$ and $X$ (a normed space is connected), stay clopen.

For example, in $\ell^2$ with the norm topology, the half-open shell $\{x : 1 \leq \|x\| < 2\}$ is neither open nor closed. In the weak topology, the open ball $\{x : \|x\| < 1\}$ also joins the “neither” category: it is no longer weakly open (it is bounded), and it is not weakly closed either.

Remark 2 (Open and closed sets in the weak topology)

The weak topology has fewer open sets than the norm topology. Two consequences:

  • Harder to be open. Norm-open balls are generally not weakly open in infinite dimensions (they would need to contain an unbounded weak neighborhood, which they cannot).

  • Fewer closed sets, not more. A common misconception is that fewer open sets means more closed sets. In fact, closed sets are complements of open sets, so fewer open sets means fewer closed sets too. A set that is “not open” in the weak topology does not become closed. Most sets are neither.

    However, convex norm-closed sets remain weakly closed: this is Mazur's theorem. The closed unit ball is weakly closed because it is convex, not because of a general principle about coarser topologies. Non-convex norm-closed sets can fail to be weakly closed. For example, the set of standard basis vectors $\{e_n\}$ in $\ell^2$ is norm-closed (all pairwise distances are $\sqrt{2}$), but $e_n \rightharpoonup 0$ weakly, so $0$ is in its weak closure.

Remark 3 (The unit ball as an intersection of slabs)

The slab picture gives a clean characterization of the unit ball. Each $f \in X^*$ with $\|f\| \leq 1$ defines a closed slab $\{x : |f(x)| \leq 1\}$, and the unit ball fits inside every such slab. The sup formula gives the converse:

$$B_X = \bigcap_{\|f\| \leq 1} \{x : |f(x)| \leq 1\}.$$

The unit ball is exactly the intersection of all dual slabs. This means $B_X$ is weakly closed (it is an intersection of weakly closed sets, since each $f$ is weakly continuous).

Strong and weak convergence

With the weak topology defined, we can now describe the corresponding notions of convergence. We begin with norm convergence for comparison, then turn to weak convergence.

Definition 2 (Strong convergence)

Let $X$ be a normed space and let $(x_n)_{n \geq 1}$ be a sequence in $X$. We say $x_n \to x$ strongly (or in norm) if

$$\|x_n - x\| \to 0 \quad \text{as } n \to \infty.$$

This is the familiar notion: the points $x_n$ physically move toward $x$, and the distance $\|x_n - x\|$ shrinks to zero. Since $|f(x_n) - f(x)| \leq \|f\| \cdot \|x_n - x\|$, strong convergence forces all instrument readings to converge uniformly over $\|f\| \leq 1$. But it is defined by the norm directly, not by the instruments.

Definition 3 (Weak convergence)

Let $X$ be a normed space and let $(x_n)_{n \geq 1}$ be a sequence in $X$. We say $x_n \rightharpoonup x$ weakly if

$$f(x_n) \to f(x) \quad \text{for every } f \in X^*.$$

Each instrument $f$ foliates $X$ into level sets $\{x : f(x) = c\}$, the “isotherms” for that measurement. Weak convergence means: in every foliation, the readings $f(x_n)$ settle down to $f(x)$. The objects $x_n$ need not become close to $x$; they can keep bouncing around, as long as every instrument eventually reads the same value as it does on $x$.

Proposition 1 (Strong convergence implies weak convergence)

Let $X$ be a normed space. If $x_n \to x$ strongly, then $x_n \rightharpoonup x$ weakly.

Proof 1

Let $f \in X^*$. Since $f$ is a bounded linear functional,

$$|f(x_n) - f(x)| = |f(x_n - x)| \leq \|f\| \cdot \|x_n - x\| \to 0.$$

Since $f$ was arbitrary, $x_n \rightharpoonup x$.

The converse is false in infinite dimensions:

Example 2 (Weak but not strong convergence in $L^2$)

In $L^2([0,1])$, the sequence $x_n = \sin(n\pi t)$ converges weakly to 0, but not strongly. Each $x_n$ has norm $\|x_n\|_{L^2} = 1/\sqrt{2}$, so the sequence stays on a sphere. But for any $g \in L^2$, the Riemann-Lebesgue lemma gives:

$$\int_0^1 g(t)\sin(n\pi t)\,dt \to 0 \quad \text{as } n \to \infty$$

so $f_g(x_n) \to 0$ for every functional $f_g \in (L^2)^*$, i.e., $x_n \rightharpoonup 0$.

Figure (three panels). Left: the functions $\sin(n\pi t)$ oscillate faster and faster, never settling down pointwise. Center: the $L^2$ norm stays at $1/\sqrt{2}$ for every $n$, so the sequence does not converge strongly. Right: the inner product $\langle g, x_n \rangle$ decays to 0 for every test function $g \in L^2$, no matter its shape. Every “foliation height” converges, even though the functions themselves keep bouncing around. This is weak convergence without strong convergence.
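The two behaviours in this example can be reproduced numerically. The sketch below is only an illustration under stated assumptions: it approximates the $L^2([0,1])$ inner product with a midpoint rule, and the test function $g(t) = t^2$ is a hypothetical choice, not one taken from the text. It tabulates the reading $\langle g, x_n \rangle$ and the norm $\|x_n\|$ for a few values of $n$.

```python
import math

def inner_product(g, h, num_points=20000):
    """Midpoint-rule approximation of the L^2([0,1]) inner product of g and h."""
    step = 1.0 / num_points
    return step * sum(
        g((k + 0.5) * step) * h((k + 0.5) * step) for k in range(num_points)
    )

def x(n):
    """The function x_n(t) = sin(n pi t)."""
    return lambda t: math.sin(n * math.pi * t)

def g(t):
    """A hypothetical fixed test function (any square-integrable g would do)."""
    return t * t

indices = (1, 5, 25, 125)
readings = {n: inner_product(g, x(n)) for n in indices}
norms = {n: math.sqrt(inner_product(x(n), x(n))) for n in indices}

for n in indices:
    # The reading shrinks with n, while the norm stays near 1/sqrt(2).
    print(n, readings[n], norms[n])
```

A finite table cannot prove convergence, but it shows the pattern the Riemann-Lebesgue lemma guarantees: the reading of a fixed instrument decays while the norm does not move.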

Remark 4 (Pointwise vs. uniform convergence over the dual)

Why does weak convergence not imply strong convergence? After all, weak convergence requires $\langle g, x_n \rangle \to 0$ for all $g \in L^2$. Isn’t “all” a strong demand?

The key is that weak convergence is pointwise in $g$: we fix $g$ first, then send $n \to \infty$. For any fixed $g$, the Fourier coefficients $\langle g, \sin(n\pi t) \rangle = \hat{g}(n) \to 0$ because they are square-summable (Bessel's inequality), and the terms of a square-summable series tend to zero. Each instrument eventually loses track of the oscillation.

But we are free to change the instrument with $n$. Choose $g_n = \sin(n\pi t)$, i.e., the instrument that is perfectly aligned with $x_n$. Then

$$\langle g_n, x_n \rangle = \|\sin(n\pi t)\|_{L^2}^2 = \frac{1}{2}$$

for every $n$: this “adaptive” instrument always catches the oscillation. The supremum over the unit ball,

$$\sup_{\|g\| \leq 1} |\langle g, x_n \rangle| = \|x_n\| = \frac{1}{\sqrt{2}},$$

never decays. But this supremum is the norm of $x_n$, and driving it to zero would be exactly strong convergence.

Why is this not a problem for weak convergence? Because the definition (Definition 3) quantifies as: for every fixed $g \in X^*$, $\langle g, x_n \rangle \to 0$. The functional $g$ is chosen once and for all, and then we ask whether the sequence of numbers $\langle g, x_n \rangle$ converges. An adaptive choice $g_n$ that changes with $n$ does not define a single sequence of real numbers: it defines a different measurement at each step. This is not what any individual instrument reads; it is a meta-observation assembled by switching instruments. No single linear measuring instrument in $X^*$ witnesses the non-convergence.

So the gap between weak and strong convergence is precisely the gap between pointwise and uniform convergence over $X^*$: each fixed linear instrument eventually stops detecting the oscillation, but the worst-case instrument shifts with $n$ to stay aligned with $x_n$.
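The contrast between a fixed instrument and an adaptive one can be seen side by side. This is a rough numerical sketch, assuming a midpoint-rule approximation of the inner product; the fixed instrument $g = \sin(3\pi t)$ is a hypothetical choice made only for illustration.

```python
import math

def inner_product(g, h, num_points=4000):
    """Midpoint-rule approximation of the L^2([0,1]) inner product."""
    step = 1.0 / num_points
    return step * sum(
        g((k + 0.5) * step) * h((k + 0.5) * step) for k in range(num_points)
    )

def x(n):
    return lambda t: math.sin(n * math.pi * t)

fixed_instrument = x(3)  # hypothetical fixed choice g = sin(3 pi t)
indices = [1, 3, 10, 50]

# Fixed g: the reading is nonzero only when x_n happens to align with g.
fixed_readings = [inner_product(fixed_instrument, x(n)) for n in indices]

# Adaptive g_n = x_n: the reading is ||x_n||^2 at every step.
adaptive_readings = [inner_product(x(n), x(n)) for n in indices]

for n, fixed, adaptive in zip(indices, fixed_readings, adaptive_readings):
    print(n, round(fixed, 9), round(adaptive, 9))
```

The fixed column detects the sequence once and then reads zero; the adaptive column never decays, which is the numerical face of the norm staying at $1/\sqrt{2}$.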

The identity $\sup_{\|g\| \leq 1} |\langle g, x_n \rangle| = \|x_n\|$ is a consequence of the Hahn-Banach theorem: for any $x \in X$, there exists a norming functional $g_x \in X^*$ with $\|g_x\| = 1$ and $g_x(x) = \|x\|$. So the supremum over the unit ball of $X^*$ is always attained, and equals the norm. In other words, the norm of $x$ is exactly the largest reading that any unit-norm linear instrument can produce on $x$.

This is also where the Banach-Steinhaus theorem (Theorem 1) enters: weak convergence $x_n \rightharpoonup x$ implies that $\langle g, x_n \rangle$ is bounded for each fixed $g$. Each $x_n$ acts as a bounded linear functional on $X^*$ via evaluation, $x_n(g) = g(x_n)$, and the family $\{x_n\}$ is pointwise bounded on $X^*$. The uniform boundedness principle then gives $\sup_n \|x_n\| < \infty$. So weak convergence automatically implies norm-boundedness, but not norm-convergence.

The foliation picture of weak convergence

Each functional $g \in L^2$ defines a foliation of $L^2$ into parallel hyperplanes $\{x : \langle g, x \rangle = c\}$. Weak convergence $x_n \rightharpoonup 0$ means that in every foliation, the heights $\langle g, x_n \rangle$ converge to 0. The points $x_n$ jump between different hyperplanes in each foliation, but eventually the heights settle near the origin’s level set.

Figure (three panels). Each panel is a different foliation of $L^2$, defined by a functional $g$. The vertical axis is the “height” $\langle g, x_n \rangle$ that the foliation assigns to $x_n = \sin(n\pi t)$. The points jump around between level sets, but in every foliation the heights converge to 0 (the blue line). The sequence never converges in norm, yet every foliation eventually reads near-zero heights. This is weak convergence.

Strong vs. Weak

Strong convergence means the objects cluster in space. Weak convergence means every instrument reading converges, even though the objects may keep moving. The gap is real: $\sin(n\pi t) \rightharpoonup 0$ but $\|\sin(n\pi t)\| = 1/\sqrt{2} \not\to 0$.

In finite dimensions, this gap disappears: weak and strong convergence are equivalent (finitely many instruments suffice to control the norm). The gap is an essentially infinite-dimensional phenomenon.
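To see why finitely many instruments control the norm, here is a short derivation, sketched under the assumption that $X$ has a basis $e_1, \ldots, e_d$ with coordinate functionals $f_1, \ldots, f_d$ satisfying $f_i(e_j) = \delta_{ij}$ (these names are introduced only for illustration; in finite dimensions every linear functional is continuous, so each $f_i \in X^*$). Every $y \in X$ expands as $y = \sum_{i=1}^d f_i(y)\, e_i$, and the triangle inequality gives

$$
\|x_n - x\| = \Big\| \sum_{i=1}^d f_i(x_n - x)\, e_i \Big\| \leq \sum_{i=1}^d |f_i(x_n) - f_i(x)| \cdot \|e_i\|.
$$

If $x_n \rightharpoonup x$, each of the $d$ terms on the right tends to zero, so $\|x_n - x\| \to 0$. In infinite dimensions no finite list of functionals bounds the norm in this way.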

Weak limits can lose mass

Strong convergence preserves the norm: $\|x_n - x\| \to 0$ implies $\|x_n\| \to \|x\|$. Weak convergence does not. The norm can drop in the limit, but it cannot increase.

Proposition 2 (Weak lower semicontinuity of the norm)

Let $X$ be a normed space. If $x_n \rightharpoonup x$ weakly, then

$$\|x\| \leq \liminf_{n \to \infty} \|x_n\|.$$

Proof 2

By the sup formula, for any $f \in X^*$ with $\|f\| \leq 1$, we have $|f(x_n)| \leq \|f\| \cdot \|x_n\| \leq \|x_n\|$. Since $x_n \rightharpoonup x$, the left side converges: $|f(x_n)| \to |f(x)|$. The right side $\|x_n\|$ need not converge (the norms may oscillate), but $\liminf$ always exists. If $a_n \leq b_n$ and $\lim a_n$ exists, then $\lim a_n \leq \liminf b_n$. Therefore

$$|f(x)| = \lim_{n \to \infty} |f(x_n)| \leq \liminf_{n \to \infty} \|x_n\|.$$

Taking the supremum over all $f$ with $\|f\| \leq 1$:

$$\|x\| = \sup_{\|f\| \leq 1} |f(x)| \leq \liminf_{n \to \infty} \|x_n\|.$$

This is the price of weak convergence: mass can escape in the limit. In the example $\sin(n\pi t) \rightharpoonup 0$, the norm stays at $1/\sqrt{2}$ while the weak limit has norm 0, so the inequality can be strict.
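A second instance of the norm drop, different from the sine example, can be checked in coordinates. The sketch below is an illustration under assumptions not found in the text: it takes $x = (1, \tfrac12, \tfrac14, \ldots)$ in a finite truncation of $\ell^2$ and sets $x_n = x + e_n$, which converges weakly to $x$ because $e_n \rightharpoonup 0$. The norms $\|x_n\|$ stay strictly above $\|x\|$ and approach $\sqrt{\|x\|^2 + 1}$.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

dim = 60  # truncation level, an arbitrary illustrative choice
x = [2.0 ** -k for k in range(dim)]  # x = (1, 1/2, 1/4, ...), truncated

def x_n(n):
    """x + e_n in the truncation (n is 1-indexed, n <= dim)."""
    v = list(x)
    v[n - 1] += 1.0
    return v

limit_norm = norm(x)
sequence_norms = [norm(x_n(n)) for n in range(1, dim + 1)]

# ||x|| sits strictly below every ||x_n||, and below their limiting value.
print(limit_norm, sequence_norms[-1], math.sqrt(limit_norm ** 2 + 1))
```

The extra unit of squared norm carried by $e_n$ is the mass that the weak limit does not see.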

Basic properties of weak convergence

Proposition 3 (Weak limits are unique)

Let $X$ be a normed space. If $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$, then $x = y$.

Proof 3

For every $f \in X^*$, we have $f(x) = \lim f(x_n) = f(y)$, so $f(x - y) = 0$ for all $f \in X^*$. By the sup formula, $\|x - y\| = \sup_{\|f\| \leq 1} |f(x - y)| = 0$, so $x = y$.

Proposition 4 (Weakly convergent sequences are bounded)

Let $X$ be a Banach space. If $x_n \rightharpoonup x$, then $\sup_n \|x_n\| < \infty$.

Proof 4

Each $x_n$ defines an element $J[x_n] \in X^{**}$ by $J[x_n](f) = f(x_n)$. Since $x_n \rightharpoonup x$, for each fixed $f \in X^*$ the sequence $J[x_n](f) = f(x_n)$ converges and is therefore bounded. So the family $\{J[x_n]\}_{n \geq 1} \subseteq X^{**}$ is pointwise bounded on $X^*$. Since $X^*$ is a Banach space, the Banach-Steinhaus theorem (Theorem 1) gives $\sup_n \|J[x_n]\|_{X^{**}} < \infty$. By Lemma 1, $\|J[x_n]\|_{X^{**}} = \|x_n\|$, so $\sup_n \|x_n\| < \infty$.

Proposition 5 (Compact operators turn weak convergence into strong convergence)

Let $X, Y$ be Banach spaces and $A : X \to Y$ a compact linear operator. If $x_n \rightharpoonup x$ in $X$, then $Ax_n \to Ax$ strongly in $Y$.

Proof 5

Since $x_n \rightharpoonup x$, Proposition 4 gives $\sup_n \|x_n\| \leq C$. The sequence $(x_n - x)$ converges weakly to 0 and is bounded, so $(A(x_n - x))$ lies in the image of a bounded set under a compact operator, hence its closure is compact in $Y$.

It suffices to show every subsequence of $(Ax_n)$ has a further subsequence converging to $Ax$. Let $(x_{n_k})$ be any subsequence. Since $A$ is compact and $(x_{n_k})$ is bounded, there exists a further subsequence $(x_{n_{k_j}})$ such that $Ax_{n_{k_j}} \to z$ for some $z \in Y$. We identify $z = Ax$: for any $g \in Y^*$, the functional $f = g \circ A \in X^*$, so

$$g(Ax_{n_{k_j}}) = f(x_{n_{k_j}}) \to f(x) = g(Ax).$$

Since $Ax_{n_{k_j}} \to z$ strongly, also $g(Ax_{n_{k_j}}) \to g(z)$, giving $g(z) = g(Ax)$ for all $g \in Y^*$. By Hahn-Banach, $z = Ax$.

Since every subsequence of $(Ax_n)$ has a further subsequence converging to $Ax$, the full sequence converges: $Ax_n \to Ax$.
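A concrete instance may help. The sketch below assumes a specific operator that does not appear in the text: the diagonal operator $A(v_1, v_2, v_3, \ldots) = (v_1, v_2/2, v_3/3, \ldots)$ on $\ell^2$, a standard example of a compact operator. Applied to the weakly null sequence $(e_n)$, it gives $\|Ae_n\| = 1/n \to 0$, even though $\|e_n\| = 1$ throughout. The computation is done in a finite truncation with illustrative helper names.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def basis_vector(n, dim):
    """n-th standard basis vector (1-indexed) in a dim-dimensional truncation."""
    e = [0.0] * dim
    e[n - 1] = 1.0
    return e

def apply_diagonal(v):
    """Apply A: (v_1, v_2, v_3, ...) -> (v_1 / 1, v_2 / 2, v_3 / 3, ...)."""
    return [c / (k + 1) for k, c in enumerate(v)]

dim = 200  # truncation level, chosen only so that e_100 fits
image_norms = {n: norm(apply_diagonal(basis_vector(n, dim))) for n in (1, 10, 100)}

for n, value in image_norms.items():
    # ||e_n|| = 1 for every n, but ||A e_n|| = 1/n shrinks to zero.
    print(n, norm(basis_vector(n, dim)), value)
```

The operator damps exactly the high coordinates into which the sequence escapes, which is why the weak limit becomes a norm limit after applying $A$.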

Weak compactness in reflexive spaces

The whole point of weakening the topology was to gain compactness. With fewer open sets there are fewer open covers, so it becomes easier for a set to be compact. The following theorem makes this precise for reflexive spaces.

Theorem 1 (Weak compactness of the unit ball)

Let $X$ be a reflexive Banach space. Then the closed unit ball $B_X$ is compact in the weak topology. Moreover, every bounded sequence in $X$ has a weakly convergent subsequence (passing from weak compactness to subsequences uses the Eberlein-Šmulian theorem).

The proof uses the Banach-Alaoglu theorem (Theorem 1) and the canonical embedding; see Corollary 1.

This theorem is the reason reflexivity matters in applications. In reflexive spaces like $L^p$ ($1 < p < \infty$) and Sobolev spaces $W^{k,p}$ ($1 < p < \infty$), every bounded sequence has a weakly convergent subsequence. This is the compactness step in the direct method of the calculus of variations: minimize an energy functional over a bounded set, extract a weakly convergent subsequence, and pass to the limit.

What goes wrong without reflexivity

Why does the theorem require reflexivity? If weak compactness held for every Banach space, then every bounded sequence would have a weakly convergent subsequence. Consider $X = c_0$ with the duality chain $c_0^* = \ell^1$, $c_0^{**} = \ell^\infty$ (recall Example 5). The space $c_0$ is not reflexive: the canonical embedding $J : c_0 \hookrightarrow \ell^\infty$ misses elements like $(1, 1, 1, \ldots) \in \ell^\infty \setminus J(c_0)$.

Consider the bounded sequence $x_n = (\underbrace{1, 1, \ldots, 1}_{n}, 0, 0, \ldots) \in c_0$, with $\|x_n\|_\infty = 1$. Can we extract a weakly convergent subsequence? In fact the full sequence already has the property that the readings converge: for any $f = (f_k) \in \ell^1 = c_0^*$,

$$f(x_n) = \sum_{k=1}^n f_k$$

which is a partial sum of the absolutely convergent series $\sum f_k$ (since $f \in \ell^1$), so $f(x_n) \to \sum_{k=1}^\infty f_k$.

What we want: a subsequence $(x_{n_j})$ and some $x \in c_0$ such that $f(x_{n_j}) \to f(x)$ for every $f \in \ell^1$.

What this requires: we already know $f(x_n) \to \sum_{k=1}^\infty f_k$ for every $f$. Since a subsequence of a convergent sequence converges to the same limit, $f(x_{n_j}) \to \sum_{k=1}^\infty f_k$ as well. So the candidate $x$ must satisfy $f(x) = \sum_{k=1}^\infty f_k$ for every $f \in \ell^1$.

Why the only candidate is not in $c_0$: choose $f = e_k = (0, \ldots, 0, 1, 0, \ldots) \in \ell^1$, which reads the $k$-th coordinate. On one hand, $e_k(x) = x_k$. On the other hand, $\sum_j (e_k)_j = 1$. So $x_k = 1$ for every $k$, forcing $x = (1, 1, 1, \ldots)$. But this constant sequence does not converge to zero, so $x \notin c_0$. No subsequence of $(x_n)$ converges weakly in $c_0$.
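The readings in this argument are simple enough to tabulate. The following sketch is illustrative only: it truncates sequences to a finite length and uses the hypothetical functional $f = (2^{-1}, 2^{-2}, 2^{-3}, \ldots) \in \ell^1$, for which $\sum_k f_k = 1$. It shows $f(x_n)$ approaching that sum, and the coordinate readings $e_k(x_n)$ all becoming 1.

```python
def x(n, length):
    """x_n = (1, ..., 1, 0, 0, ...) with n leading ones, truncated to `length` entries."""
    return [1.0 if k < n else 0.0 for k in range(length)]

def pair(f, v):
    """Duality pairing between an l^1 sequence f and a c_0 sequence v (truncated)."""
    return sum(a * b for a, b in zip(f, v))

length = 60  # truncation level, an arbitrary illustrative choice
f = [2.0 ** -(k + 1) for k in range(length)]  # hypothetical f in l^1 with sum 1

readings = [pair(f, x(n, length)) for n in range(1, 41)]

def coordinate_reading(k, n):
    """e_k(x_n): the k-th coordinate (1-indexed) of x_n."""
    return x(n, length)[k - 1]

# f(x_n) = 1 - 2^{-n} climbs toward sum(f) = 1.
print(readings[0], readings[9], readings[-1])
# For fixed k, e_k(x_n) equals 1 as soon as n >= k, so the candidate limit has x_k = 1.
print([coordinate_reading(5, n) for n in (1, 4, 5, 20)])
```

Every coordinate of the candidate limit is forced to equal 1, which is the constant sequence that $c_0$ does not contain.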

Where does this candidate live? The canonical embedding $J : c_0 \hookrightarrow \ell^\infty = c_0^{**}$ identifies each element of $c_0$ with an evaluation functional on $\ell^1$. The candidate $x = (1, 1, 1, \ldots)$ is a bounded sequence, so it defines a valid element of $\ell^\infty$. It acts on $\ell^1$ by

$$J(x)(f) = \sum_{k=1}^\infty f_k \quad \text{for } f = (f_k) \in \ell^1.$$

This is a perfectly good element of the bidual $c_0^{**} = \ell^\infty$, but it is not in the range of $J$. Explicitly:

$$J(c_0) = \{ (y_k) \in \ell^\infty : y_k \to 0 \}, \qquad (1, 1, 1, \ldots) \in \ell^\infty \setminus J(c_0),$$

since the constant sequence does not converge to zero. So $J$ is not surjective: $J(c_0) \subsetneq \ell^\infty = c_0^{**}$. The sequence $(x_n)$ has “escaped” into the bidual: the readings converge, but the limit lives in $c_0^{**} \setminus J(c_0)$.

Reflexivity means $J(X) = X^{**}$: there is no gap, so there is no room to escape. In a reflexive space, every candidate limit that is consistent with the readings already lives in $X$.

Separation and convexity in the weak topology

Optional Extension

This section explores the topological separation properties of the weak topology and their connection to Mazur’s theorem. It is not needed for the main development but clarifies why convexity plays such a distinguished role.

Separation axioms: from $T_2$ to $T_4$

The separation axioms form a hierarchy of increasing strength:

$$T_4 \implies T_{3\frac{1}{2}} \implies T_3 \implies T_2 \implies T_1.$$

Each level upgrades what can be separated: $T_2$ separates points from points, $T_3$ separates points from closed sets by open sets, $T_{3\frac{1}{2}}$ strengthens this to separation by continuous functions, and $T_4$ separates closed sets from closed sets.

Proposition 6 (Metric spaces are $T_4$)

Every metric space $(X, d)$ is $T_4$.

Proof 6

Given disjoint closed sets $A, B \subset X$, the function

$$f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}$$

is continuous with $f = 0$ on $A$ and $f = 1$ on $B$ (this works because $d(x, A) + d(x, B) > 0$ when $A \cap B = \emptyset$ and both are closed), so $U = f^{-1}([0, \tfrac{1}{2}))$ and $V = f^{-1}((\tfrac{1}{2}, 1])$ are disjoint open sets separating $A$ and $B$.
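The separating function in this proof can be evaluated directly in a simple case. The sketch below assumes the metric space is the real line and takes the hypothetical closed sets $A = [0, 1]$ and $B = [2, 3]$; neither choice comes from the text.

```python
def dist_to_interval(x, lo, hi):
    """Distance from the real number x to the closed interval [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0

A = (0.0, 1.0)  # hypothetical closed set A = [0, 1]
B = (2.0, 3.0)  # hypothetical closed set B = [2, 3]

def separating_function(x):
    """f(x) = d(x, A) / (d(x, A) + d(x, B)); the denominator is positive since A and B are disjoint."""
    d_a = dist_to_interval(x, *A)
    d_b = dist_to_interval(x, *B)
    return d_a / (d_a + d_b)

for point in (0.5, 1.25, 1.5, 1.75, 2.5):
    value = separating_function(point)
    # Points with f < 1/2 lie in U (around A); points with f > 1/2 lie in V (around B).
    print(point, value, "U" if value < 0.5 else ("V" if value > 0.5 else "neither"))
```

The midpoint $1.5$ falls in neither open set, which is consistent with $U$ and $V$ being disjoint.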

Since every normed space is a metric space, every Banach space is $T_4$ in its norm topology. When we pass to the weak topology, we lose the metric but the topology is still Hausdorff ($T_2$), since $X^*$ separates points.

Mazur’s theorem: convexity restores separation

The natural question is: can we separate points from closed sets, not just from other points? The weak topology is defined so that every $f \in X^*$ is continuous, and the Hahn-Banach theorem produces such an $f$ that strictly separates a point from a closed convex set. So for convex sets, the functionals in $X^*$ play exactly the role of the separating continuous functions required by the $T_{3\frac{1}{2}}$ axiom. Mazur’s theorem makes this precise: convexity is exactly the condition under which norm-closure and weak-closure agree.

Theorem 2 (Mazur’s theorem)

Let $X$ be a normed space and $C \subseteq X$ a convex set. Then $C$ is norm-closed if and only if it is weakly closed.

Proof 7

Since the weak topology is coarser than the norm topology, every weakly closed set is norm-closed. The content is the converse: a norm-closed convex set is weakly closed.

It suffices to show that if $x_0 \notin C$ and $C$ is norm-closed and convex, then $x_0$ is not in the weak closure of $C$, i.e., there is a weakly open set containing $x_0$ that misses $C$.

Since $C$ is norm-closed and $x_0 \notin C$, there exists $\delta > 0$ with $B_\delta(x_0) \cap C = \emptyset$. By the geometric form of the Hahn-Banach theorem (strict separation of a point from a closed convex set), there exist $f \in X^*$ and $\alpha \in \mathbb{R}$ such that

$$f(x_0) > \alpha \geq f(c) \quad \text{for all } c \in C.$$

The set $\{x \in X : f(x) > \alpha\}$ is a weakly open half-space (the preimage of an open ray under the weakly continuous $f$) containing $x_0$ and disjoint from $C$. Therefore $x_0$ is not in the weak closure of $C$.

Mazur’s theorem says that for convex sets, you cannot tell the difference between norm-closure and weak-closure. This is why the closed unit ball, closed convex hulls, and closed subspaces are all weakly closed. Non-convex sets do not enjoy this protection: the set $\{e_n\}$ in $\ell^2$ is norm-closed but not weakly closed.

The connection to separation is now clear: the weak topology is not metrizable in infinite dimensions, so the metric argument for $T_4$ no longer applies and separation of arbitrary closed sets is not guaranteed, but Hahn-Banach gives separation of points from closed convex sets. Mazur’s theorem is the payoff: convexity is the precise condition under which the weak topology’s separation power suffices.