The Weak* Topology and Banach-Alaoglu

Big Idea

The dual space $X^*$ is itself a Banach space, so it has a weak topology. But that weak topology requires the bidual $X^{**}$, which can be enormous. The weak* topology sidesteps the bidual entirely by testing only against elements of $X$. This gives a coarser topology with fewer open sets, making compactness easier, and leads to the Banach-Alaoglu theorem.

We now want a weak-type topology on the dual space $X^*$, for the same reason we wanted one on $X$: to gain compactness. Since $X^*$ is a Banach space in its own right, the most natural first attempt is to apply the same construction as before. Weak convergence of a sequence $(f_n)$ in $X^*$ would then mean

$$\phi(f_n) \to \phi(f) \quad \text{for every } \phi \in (X^*)^* = X^{**}.$$

This requires testing against every element of the bidual $X^{**}$.

Example 1 (The bidual can be much larger than $X$)

Recall the duality chain from Example 5: for $X = c_0$, $c_0^* = \ell^1$ and $c_0^{**} = \ell^\infty$. To check weak convergence of a sequence $(f_n)$ in $\ell^1$, we would need to test against every element of $\ell^\infty$.

Each $x \in c_0$ gives an element of $\ell^\infty$ via the canonical embedding $J$, and testing $f_n$ against $J(x)$ just means evaluating $f_n(x) = \sum_k (f_n)_k x_k$, where $(f_n)_k$ is the $k$-th component of $f_n$. This is concrete: we point the instrument $f_n$ at the object $x$ and read off the result.

But $\ell^\infty$ also contains elements not in $J(c_0)$. Consider $\phi = (1, 1, 1, \ldots) \in \ell^\infty$. Its action on $f \in \ell^1$ is $\phi(f) = \sum_k f_k$. This is a perfectly valid bounded linear functional on $\ell^1$, but it does not come from the canonical embedding: there is no $x \in c_0$ with $J(x) = \phi$, since $(1, 1, 1, \ldots)$ does not converge to zero. Requiring convergence against such "phantom" test functionals is a strictly stronger demand, and it makes compactness harder to achieve.

Now recall the canonical embedding $J : X \hookrightarrow X^{**}$, which identifies each $x \in X$ with the evaluation functional $J(x)(f) = f(x)$. Every element of $X$ already lives inside $X^{**}$. So instead of testing against all of $X^{**}$, we could test against just $J(X)$: ask only that

$$f_n(x) \to f(x) \quad \text{for every } x \in X.$$

This is weaker (we test against fewer functionals), but it has two advantages. First, it uses only elements of $X$, which we already understand. Second, because the topology is coarser (fewer open sets, fewer open covers), compactness becomes easier.

The weak* topology

Definition 1 (Weak* topology)

The weak* topology on $X^*$, denoted $\sigma(X^*, X)$, is the coarsest topology making every evaluation map $\hat{x} : X^* \to \mathbb{R}$, $\hat{x}(f) = f(x)$, continuous. A basis of open neighborhoods of $f \in X^*$ is:

$$V(f; x_1, \ldots, x_n, \varepsilon) = \{ g \in X^* : |g(x_i) - f(x_i)| < \varepsilon \text{ for } i = 1, \ldots, n \}$$

where $x_1, \ldots, x_n \in X$ and $\varepsilon > 0$.

Continuity of $\hat{x}$ requires preimages of open sets to be open. Every open set in $\mathbb{R}$ is a union of open intervals, and $\hat{x}^{-1}((a, b)) = \{g \in X^* : a < g(x) < b\}$ is a slab in $X^*$. So any topology making all evaluation maps continuous must contain all such slabs. The coarsest such topology is the one generated by the slabs alone: its basic open sets are finite intersections of slabs, which is why each basic weak* neighborhood constrains the values of $g$ on only finitely many test objects.

Example 2 (The weak* topology on $\ell^1 = (c_0)^*$)

Take $X = c_0$ (sequences converging to 0 with sup norm). Its dual is $X^* = \ell^1$ with $f(x) = \sum_k f_k x_k$ for $f = (f_k) \in \ell^1$, $x = (x_k) \in c_0$.

The norm ball in $\ell^1$ is $B_1 = \{f \in \ell^1 : \sum_k |f_k| \leq 1\}$, the familiar cross-polytope (diamond) in finite-dimensional slices.

Now take the test object $x = e_1 = (1, 0, 0, \ldots) \in c_0$. The evaluation $\hat{e}_1(f) = f_1$ defines a weak* slab

$$V_1 = \{ f \in \ell^1 : |f_1| < 1 \}.$$

This slab constrains only the first component of $f$. A sequence like $f^{(n)} = \tfrac{1}{2} e_1 + e_n$ with $n \geq 2$ has $|f^{(n)}_1| = \tfrac{1}{2} < 1$ (inside the slab) but $\|f^{(n)}\|_1 = \tfrac{3}{2}$ (outside the norm ball).

More generally, the weak* neighborhood $V(0; e_1, \ldots, e_n, \varepsilon) = \{f \in \ell^1 : |f_k| < \varepsilon \text{ for } k = 1, \ldots, n\}$ constrains the first $n$ components but places no restriction on $f_k$ for $k > n$. In infinite dimensions, every weak* neighborhood of 0 is unbounded in $\ell^1$.
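The slab computations in this example can be checked numerically on finitely supported sequences. The following is a minimal sketch, assuming sequences are truncated to length `N` and stored as NumPy arrays; the helper names `e` and `in_weak_star_nbhd` are hypothetical, introduced only for illustration.

```python
import numpy as np

N = 50  # truncation length (assumption: we only model finitely supported sequences)

def e(k, n=N):
    """Standard basis vector e_k (1-indexed), truncated to length n."""
    v = np.zeros(n)
    v[k - 1] = 1.0
    return v

def in_weak_star_nbhd(g, f, xs, eps):
    """Hypothetical helper: is g in V(f; x_1, ..., x_n, eps)?

    On truncated sequences the pairing g(x) = sum_k g_k x_k is a dot product.
    """
    return all(abs(np.dot(g - f, x)) < eps for x in xs)

zero = np.zeros(N)

# f^(5) = (1/2) e_1 + e_5 lies in the slab V_1 = {|f_1| < 1} ...
f5 = 0.5 * e(1) + e(5)
inside_slab = in_weak_star_nbhd(f5, zero, [e(1)], 1.0)
# ... but its l^1 norm is 3/2, so it is outside the norm ball.
l1_norm = float(np.sum(np.abs(f5)))

# V(0; e_1, e_2, e_3, eps) ignores coordinates k > 3, so it contains
# t * e_4 for every t: the neighborhood is unbounded in l^1.
eps = 0.1
unbounded_witnesses = [
    in_weak_star_nbhd(t * e(4), zero, [e(1), e(2), e(3)], eps)
    for t in (1.0, 1e3, 1e6)
]
```

The truncation is only a computational convenience: the point is that membership in a basic neighborhood depends on finitely many pairings, so arbitrarily large vectors supported elsewhere pass the test.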

Where weak* sits: the hierarchy of topologies

The weak* topology is one of three natural topologies on $X^*$. The canonical embedding $J : X \hookrightarrow X^{**}$ mediates between them: each $x \in X$ defines an evaluation functional $J(x) \in X^{**}$ by $J(x)(f) = f(x)$.

Since $J(X) \subseteq X^{**}$, every weak* open set is also weakly open, and every weakly open set is norm-open. More test functionals means more slabs, which means more open sets:

$$\text{weak* topology} \subseteq \text{weak topology on } X^* \subseteq \text{norm topology on } X^*.$$

Each inclusion means "coarser than or equal to." The first inclusion is strict when $X$ is not reflexive (there exist elements of $X^{**} \setminus J(X)$ that generate extra slabs). When $X$ is reflexive, $J(X) = X^{**}$ and the weak* and weak topologies on $X^*$ coincide.

Example 3 ($c_0 \hookrightarrow \ell^1 \hookrightarrow \ell^\infty$)

Take $X = c_0$. Then $X^* = \ell^1$ and $X^{**} = \ell^\infty$. The canonical embedding $J : c_0 \hookrightarrow \ell^\infty$ sends $x = (x_k) \in c_0$ to the same sequence viewed in $\ell^\infty$. Since $c_0 \subsetneq \ell^\infty$ (e.g., the constant sequence $(1, 1, 1, \ldots) \in \ell^\infty \setminus c_0$), the space $c_0$ is not reflexive.

The weak* topology on $\ell^1$ uses slabs from $c_0$: neighborhoods constrain $f(x)$ for $x \in c_0$. The weak topology on $\ell^1$ uses slabs from $\ell^\infty$: neighborhoods can additionally constrain $f$ against sequences that do not converge to 0. The weak topology is strictly finer, with more open sets and therefore fewer compact sets.

Remark 1 (The compactness payoff)

The coarser the topology, the easier it is for sets to be compact. This is why compactness improves as we move left in the chain:

  • Norm topology: the closed unit ball of $X^*$ is compact only if $X^*$ is finite-dimensional.

  • Weak topology on $X^*$: the closed unit ball is compact if and only if $X^*$ is reflexive.

  • Weak* topology on $X^*$: the closed unit ball is always compact (Banach-Alaoglu).

The weak* topology is the coarsest of the three, so it is the easiest setting for compactness. This is exactly why Banach-Alaoglu works without any reflexivity assumption.

Weak* convergence

Definition 2 (Weak* convergence)

Let $X$ be a normed space and let $(f_n)_{n \geq 1}$ be a sequence in $X^*$. We say $f_n \xrightarrow{w^*} f$ (weak* convergence) if

$$f_n(x) \to f(x) \quad \text{for every } x \in X.$$

Now the picture flips: each $f_n$ is an instrument, and the sequence is a sequence of instruments, not objects. Think of replacing your entire measurement apparatus, swapping one thermometer for another, one scale for another. Fix any object $x$ and read off $f_1(x), f_2(x), f_3(x), \ldots$. Weak* convergence means these readings stabilize to $f(x)$ for every fixed object. Geometrically, each $f_n$ defines a different foliation (different isotherms), and these foliations rearrange from step to step, but at every fixed point the height reading converges.

Proposition 1 (Strong convergence in $X^*$ implies weak* convergence)

Let $X$ be a normed space. If $f_n \to f$ in the norm of $X^*$, then $f_n \xrightarrow{w^*} f$.

Proof 1

For any fixed $x \in X$,

$$|f_n(x) - f(x)| = |(f_n - f)(x)| \leq \|f_n - f\| \cdot \|x\| \to 0.$$

Since $x$ was arbitrary, $f_n \xrightarrow{w^*} f$.

Proposition 2 (Weak convergence in $X^*$ implies weak* convergence)

Let $X$ be a normed space. If $f_n \rightharpoonup f$ weakly in $X^*$, then $f_n \xrightarrow{w^*} f$.

Proof 2

Weak convergence in $X^*$ means $\phi(f_n) \to \phi(f)$ for every $\phi \in X^{**}$. For any $x \in X$, the canonical image $J(x) \in X^{**}$ acts by $J(x)(g) = g(x)$. In particular,

$$f_n(x) = J(x)(f_n) \to J(x)(f) = f(x).$$

Since $J(X) \subseteq X^{**}$, weak convergence tests against a larger set of functionals than weak* convergence, so the former implies the latter.

In summary:

$$\text{strong in } X^* \implies \text{weak in } X^* \implies \text{weak* in } X^*,$$

and neither arrow reverses in general. When $X$ is reflexive ($J$ is surjective), the last two notions coincide.

Example 4 (Weak* convergent but not weakly convergent in $\ell^1$)

Take $X = c_0$, so $X^* = \ell^1$ and $X^{**} = \ell^\infty$. The standard basis vectors $e_n \in \ell^1$ converge weak* to 0: for any $x = (x_k) \in c_0$,

$$e_n(x) = x_n \to 0$$

since $x \in c_0$ means $x_n \to 0$. But $e_n$ does not converge weakly to 0 in $\ell^1$. The element $\phi = (1, 1, 1, \ldots) \in \ell^\infty = (\ell^1)^*$ satisfies

$$\phi(e_n) = 1 \quad \text{for all } n,$$

so $\phi(e_n) \not\to 0$. This $\phi$ is precisely the kind of "phantom" test functional in $X^{**} \setminus J(X)$ from Example 1: it does not correspond to any object in $c_0$, but it detects the non-convergence.
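The contrast in this example can be seen in a few lines of arithmetic. The sketch below is illustrative only: it assumes sequences are truncated to a finite length `N`, and it uses the particular test object $x_k = 1/k$ as one representative element of $c_0$ (an illustrative choice, not something singled out in the text).

```python
import numpy as np

N = 2000  # truncation length (assumption: finite prefixes stand in for sequences)
k = np.arange(1, N + 1)

x = 1.0 / k        # a test object in c_0: x_k = 1/k -> 0
phi = np.ones(N)   # the "phantom" functional (1, 1, 1, ...) in l^infty

def e(n):
    """Standard basis vector e_n of l^1 (1-indexed), truncated to length N."""
    v = np.zeros(N)
    v[n - 1] = 1.0
    return v

ns = [1, 10, 100, 1000]

# Weak* readings: e_n(x) = x_n, which shrinks as n grows.
weak_star_readings = [float(np.dot(e(n), x)) for n in ns]

# Readings against the phantom functional: phi(e_n) = 1 for every n.
phantom_readings = [float(np.dot(phi, e(n))) for n in ns]
```

The first list decays like $1/n$ while the second stays at 1, which is the numerical shadow of "weak* convergent but not weakly convergent."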

Visualizing weak* convergence in $\mathbb{R}^2$

Consider $f_n(x,y) = (1 - 1/n)\,x + (1/n)\,y$ on $(\mathbb{R}^2, \|\cdot\|_\infty)$. Each $f_n$ has kernel line $(1 - 1/n)x + (1/n)y = 0$, which slowly rotates toward the $y$-axis as $n \to \infty$. For any fixed point $(a, b)$:

$$f_n(a, b) = \left(1 - \frac{1}{n}\right)a + \frac{1}{n}b \to a = f(a, b)$$

where $f(x,y) = x$. The foliations converge pointwise: at each location, the height readings stabilize, even though the kernel lines are visibly rotating from picture to picture.

As with weak convergence, the $\mathbb{R}^2$ picture is a visual scaffold: in finite dimensions weak* = weak = strong, so the foliations converge uniformly on bounded sets, not just pointwise. The genuine weak* phenomenon, where pointwise convergence of height readings does not imply uniform convergence on the unit ball, requires infinite dimensions. The top row below shows the geometry (rotating level sets), but the real content is in the bottom row: height readings at fixed test points stabilizing one by one.

[Figure: six panels in two rows]

Top row: the level sets of $f_n$ rotate as $n$ increases. For $n = 2$ the level sets are diagonal; by $n = 4$ they are nearly vertical; at the limit $f(x,y) = x$ they are exactly vertical. Bottom row: for each fixed test point, the height reading $f_n(p)$ converges to the limiting value $f(p) = p_x$. This is weak* convergence: the foliations rearrange, but at every fixed point the readings stabilize.
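The readings in this two-dimensional example are easy to tabulate. The sketch below is a minimal illustration, not the code behind the figure: the test point `p` is an arbitrary choice, and the function names are hypothetical. It also uses the standard fact that the dual norm of $(\mathbb{R}^2, \|\cdot\|_\infty)$ is the $\ell^1$ norm of the coefficient vector, which gives $\|f_n - f\| = 2/n$.

```python
import numpy as np

def f_n(n, p):
    """f_n(x, y) = (1 - 1/n) x + (1/n) y evaluated at p = (x, y)."""
    return (1 - 1 / n) * p[0] + (1 / n) * p[1]

def f(p):
    """Limit functional f(x, y) = x."""
    return p[0]

ns = (1, 10, 100, 1000)
p = (2.0, -3.0)  # an arbitrary fixed test point (illustrative choice)

# Pointwise gap at p: |f_n(p) - f(p)| = |b - a| / n.
pointwise_gap = [abs(f_n(n, p) - f(p)) for n in ns]

def dual_norm_gap(n):
    """||f_n - f|| in the dual of (R^2, sup norm): l^1 norm of the coefficients."""
    coeffs = np.array([(1 - 1 / n) - 1.0, 1 / n])
    return float(np.sum(np.abs(coeffs)))

# Uniform gap over the unit ball: 2/n, which also tends to 0.
uniform_gap = [dual_norm_gap(n) for n in ns]
```

Both gaps shrink at rate $1/n$, which reflects that in finite dimensions pointwise and norm convergence of functionals coincide.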

The Banach-Alaoglu theorem

We claimed that the weak* topology makes compactness easier. The following theorem delivers on this promise: the closed unit ball of $X^*$ is always weak* compact, for any normed space $X$.

The intuition is simple. Each $f \in B_{X^*}$ is determined by its readings on all objects, and since $\|f\| \leq 1$, each reading satisfies $|f(x)| \leq \|x\|$, so $f(x) \in [-\|x\|, \|x\|]$. An instrument is a choice of one bounded number per object. Given a sequence of instruments, focus on a single object $x_1$: the readings $f_n(x_1)$ are bounded, so Bolzano-Weierstrass gives a convergent subsequence. Extract a further subsequence to make the readings converge at $x_2$, then $x_3$, and so on.

This is the same diagonal argument behind Arzelà-Ascoli and compact operators: whenever there are countably many coordinates to control, Bolzano-Weierstrass plus diagonalization does the job. If $X$ is separable, a countable dense subset provides the coordinates, and the argument goes through.

We give the full proof for separable spaces, then state the general result.

Theorem 1 (Banach-Alaoglu (separable case))

Let $X$ be a separable Banach space and $(f_n) \subset X^*$ a bounded sequence with $\|f_n\| \leq M$. Then $(f_n)$ has a weak* convergent subsequence.

Proof 3

Step 1: set up the countable dense subset. Since $X$ is separable, there exists a countable dense subset $\{x_1, x_2, x_3, \ldots\} \subset X$.

Step 2: successive extraction. Evaluate the sequence $(f_n)$ at $x_1$. Since $|f_n(x_1)| \leq M\|x_1\|$ for all $n$, the sequence $(f_n(x_1))$ is bounded in $\mathbb{R}$. By Bolzano-Weierstrass, extract a convergent subsequence:

$$(f_n) \;\supset\; (f_{n_1^{(j)}})_{j \geq 1} \quad \text{such that } f_{n_1^{(j)}}(x_1) \text{ converges.}$$

Now evaluate this subsequence at $x_2$. The numbers $(f_{n_1^{(j)}}(x_2))$ are again bounded, so extract a further subsequence:

$$(f_{n_1^{(j)}}) \;\supset\; (f_{n_2^{(j)}})_{j \geq 1} \quad \text{such that } f_{n_2^{(j)}}(x_1) \text{ and } f_{n_2^{(j)}}(x_2) \text{ both converge.}$$

The second subsequence still converges at $x_1$ because it is a subsequence of something that already converged there. Continue: at stage $k$, extract a subsequence $(f_{n_k^{(j)}})_{j \geq 1}$ that converges at $x_1, \ldots, x_k$. This gives nested subsequences:

$$(f_n) \;\supset\; \underbrace{(f_{n_1^{(j)}})}_{\text{conv at } x_1} \;\supset\; \underbrace{(f_{n_2^{(j)}})}_{\text{conv at } x_1, x_2} \;\supset\; \cdots \;\supset\; \underbrace{(f_{n_k^{(j)}})}_{\text{conv at } x_1, \ldots, x_k} \;\supset\; \cdots$$

Step 3: the diagonal trick. Define $g_j := f_{n_j^{(j)}}$, the $j$-th element of the $j$-th subsequence. For any fixed $k$ and $j \geq k$, the element $g_j = f_{n_j^{(j)}}$ belongs to the $j$-th subsequence, which is a sub-subsequence of the $k$-th subsequence. So the tail $g_k, g_{k+1}, g_{k+2}, \ldots$ is a subsequence of the $k$-th extracted sequence, which converges at $x_k$. Since finitely many initial terms do not affect convergence, $(g_j(x_k))_{j \geq 1}$ converges for every $k$. Call the limit $L_k$.

Step 4: extend from the dense subset to all of $X$. Take arbitrary $x \in X$. Given $\varepsilon > 0$, pick $x_k$ from the dense subset with $\|x - x_k\| < \varepsilon$. Then:

$$|g_j(x) - g_l(x)| \leq |g_j(x) - g_j(x_k)| + |g_j(x_k) - g_l(x_k)| + |g_l(x_k) - g_l(x)|.$$

The first and third terms are bounded by $\|g_j\| \cdot \|x - x_k\| \leq M\varepsilon$ each. The middle term is less than $\varepsilon$ for $j, l$ large enough since $(g_j(x_k))$ converges. So $|g_j(x) - g_l(x)| \leq (2M + 1)\varepsilon$ for $j, l$ sufficiently large. The sequence $(g_j(x))$ is Cauchy in $\mathbb{R}$, hence convergent. Define $g(x) := \lim_{j \to \infty} g_j(x)$.

Step 5: the limit is in $X^*$. Linearity of $g$ follows from linearity of each $g_j$ and limits. Boundedness follows from $|g(x)| = \lim |g_j(x)| \leq M\|x\|$, so $\|g\| \leq M$ and $g \in X^*$. By construction $g_j(x) \to g(x)$ for all $x \in X$, which is exactly $g_j \xrightarrow{w^*} g$.
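The successive extraction in Step 2 rests on a pigeonhole argument: halve the interval of possible readings and keep the half that still contains most of the sequence. The sketch below is a finite analogue only, not the proof itself. It works with `N` functionals and `K` test objects, and the table `readings[n][k]`, standing in for $f_n(x_k)$, is filled with made-up bounded values; all names are hypothetical.

```python
import math

M = 1.0     # bound on the readings |f_n(x_k)| <= M (norms of x_k absorbed into M)
N = 4096    # number of functionals in the finite sample
K = 3       # number of test objects x_1, ..., x_K
DEPTH = 2   # bisection steps per test object

# Illustrative bounded readings: readings[n][k] plays the role of f_n(x_k).
readings = [[M * math.sin((n + 1) * (k + 1)) for k in range(K)] for n in range(N)]

def refine(indices, k, depth):
    """Keep a large subfamily of `indices` whose k-th readings cluster.

    Each bisection keeps the half-interval holding at least half of the
    surviving indices, mirroring the pigeonhole step in Bolzano-Weierstrass.
    """
    lo, hi = -M, M
    for _ in range(depth):
        mid = (lo + hi) / 2
        left = [n for n in indices if readings[n][k] <= mid]
        right = [n for n in indices if readings[n][k] > mid]
        if len(left) >= len(right):
            indices, hi = left, mid
        else:
            indices, lo = right, mid
    return indices

# Successive extraction: refine at x_1, then within that family at x_2, and so on.
survivors = list(range(N))
for k in range(K):
    survivors = refine(survivors, k, DEPTH)

# After DEPTH bisections the readings at each x_k lie in an interval of this width.
width = 2 * M / 2**DEPTH
spreads = [
    max(readings[n][k] for n in survivors) - min(readings[n][k] for n in survivors)
    for k in range(K)
]
```

Later refinements only discard indices, so the clustering achieved at $x_1$ is preserved when refining at $x_2$ and $x_3$. This is the finite counterpart of "the second subsequence still converges at $x_1$." In the actual proof the sequence is infinite, the bisection never runs out of indices, and the diagonal selection of Step 3 combines all stages.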

Leonidas Alaoglu extended this result to non-separable spaces in his 1938 PhD thesis at the University of Chicago, replacing the diagonal argument with Tychonoff’s theorem.

Remark 2 (The general (non-separable) case)

For arbitrary normed spaces, the theorem states that $B_{X^*}$ is compact (not just sequentially compact) in the weak* topology. The proof embeds $B_{X^*}$ into the product $P = \prod_{x \in X} [-\|x\|, \|x\|]$ via $\Phi(f) = (f(x))_{x \in X}$. By Tychonoff's theorem, $P$ is compact in the product topology. The product topology is exactly the topology of pointwise convergence, which is the weak* topology. One checks that $\Phi(B_{X^*})$ is closed in $P$ (limits of linear functions are linear, and the norm bound is preserved), so it is compact as a closed subset of a compact space.

Corollary 1 (Weak sequential compactness in separable reflexive spaces)

Let $X$ be a separable reflexive Banach space. Then every bounded sequence in $X$ has a weakly convergent subsequence.

Proof 4

Since $X$ is reflexive, the canonical embedding $J : X \to X^{**}$ is surjective, so $J(B_X) = B_{X^{**}}$. Now apply Banach-Alaoglu to the space $X^*$: the unit ball $B_{(X^*)^*} = B_{X^{**}}$ is compact in the weak* topology of $X^{**}$ (i.e., the topology of pointwise convergence on $X^*$). But under the identification $J : X \cong X^{**}$, the weak* topology on $X^{**}$ corresponds exactly to the weak topology on $X$ (both test against elements of $X^*$). Therefore $B_X$ is weakly compact.

For sequences: since $X$ is separable and reflexive, $X^{**} \cong X$ is separable, and a Banach space whose dual is separable is itself separable, so $X^*$ is separable in norm. The diagonal argument from the Banach-Alaoglu proof then applies directly: given $(x_n)$ with $\|x_n\| \leq C$, pick a countable dense subset $\{f_1, f_2, \ldots\} \subset X^*$, diagonalize to extract a subsequence along which $f_k(x_{n_j})$ converges for every $k$, and extend to all of $X^*$ by density and uniform boundedness. The limit $\phi(f) := \lim_{j} f(x_{n_j})$ is a bounded linear functional on $X^*$ with $\|\phi\| \leq C$, so by reflexivity $\phi = J(x)$ for some $x \in X$, and therefore $x_{n_j} \rightharpoonup x$.

This corollary is the reason reflexivity matters in applications. In a separable reflexive space (such as $L^p$ or $W^{k,p}$ for $1 < p < \infty$), bounded sequences always have weakly convergent subsequences. Without reflexivity, Banach-Alaoglu still gives weak* compactness of $B_{X^*}$, but this is a statement about functionals on $X$, not about elements of $X$ itself.
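A standard instance of the corollary, not taken from the text above, is the orthonormal basis $(e_n)$ in the separable reflexive space $\ell^2$: it is bounded and converges weakly to 0, yet no subsequence converges in norm. The sketch below checks this against a single test functional; the truncation length `N` and the choice $y_k = 1/k$ are assumptions made for illustration.

```python
import numpy as np

N = 5000  # truncation length (assumption: finite prefixes stand in for sequences)
k = np.arange(1, N + 1)

# A fixed element of l^2 (sum of 1/k^2 is finite), acting through the inner product.
y = 1.0 / k

def e(n):
    """Standard basis vector e_n of l^2 (1-indexed), truncated to length N."""
    v = np.zeros(N)
    v[n - 1] = 1.0
    return v

ns = [1, 10, 100, 1000]

# Weak readings: <e_n, y> = y_n = 1/n, which tends to 0.
pairings = [float(np.dot(e(n), y)) for n in ns]

# The sequence stays on the unit sphere: ||e_n||_2 = 1 for all n.
norms = [float(np.linalg.norm(e(n))) for n in ns]

# Distinct basis vectors are sqrt(2) apart, so no subsequence is norm-Cauchy.
distance = float(np.linalg.norm(e(10) - e(100)))
```

One test functional proves nothing by itself; the general statement follows because every $y \in \ell^2$ has $y_n \to 0$. The computation only shows the gap between weak and norm behavior that the corollary allows.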