The Hahn–Banach Theorem

We saw that $X^*$ could be trivial ($L^p$ for $0 < p < 1$), and asked whether the dual is always a complete measurement system for Banach spaces. The following theorem answers yes: every bounded linear functional on a subspace extends to the whole space with the same norm. This guarantees that $X^*$ is rich enough to separate points, recover the norm, and support the weak topology.

Theorem 1 (Hahn–Banach theorem)

Let $X$ be a normed space and $M \subset X$ a linear subspace. Let $f \in M^*$ with $|f(x)| \leq k\|x\|$ for all $x \in M$. Then there exists an extension $F \in X^*$ of $f$ to all of $X$ with $|F(x)| \leq k\|x\|$ for all $x \in X$.

Note that $F$ is not necessarily unique.

Proof 1

The proof has three ingredients. The first two are structural; the third is where the analysis lives.

Ingredient 1: Zorn’s lemma setup.

Define the poset:

$$
E = \{(g, G) : G \supseteq M \text{ is a subspace}, \; g \text{ extends } f \text{ on } G, \; |g(x)| \leq k\|x\| \;\forall x \in G\}
$$

Order by extension: $(h, H) \leq (g, G)$ if $H \subseteq G$ and $g|_H = h$. This is nonempty ($(f, M) \in E$), chains have upper bounds (take the union), so Zorn gives a maximal element $(F, W)$.

Ingredient 2: Maximal implies total.

Suppose $W \neq X$. Pick $z \in X \setminus W$ and set $Z = W + \text{span}(z)$. By the codimension-1 decomposition, every element of $Z$ can be written uniquely as $w + \alpha z$ with $w \in W$, $\alpha \in \mathbb{R}$. We want to define:

$$
\tilde{F}(w + \alpha z) = F(w) + \alpha c
$$

for some constant $c$. If we can find any valid $c$, then $(\tilde{F}, Z)$ strictly extends $(F, W)$, contradicting maximality.

Ingredient 3: The constant $c$ exists.

The requirement $|\tilde{F}(w + \alpha z)| \leq k\|w + \alpha z\|$ constrains $c$ to lie in an interval. The existing bound $|F(\cdot)| \leq k\|\cdot\|$ on $W$, combined with the triangle inequality, guarantees that the lower bound on $c$ is $\leq$ the upper bound. The interval is nonempty.
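One way to make the interval explicit, in the real case used above (a sketch; the names $w_1, w_2$ are introduced here only for illustration): for any $w_1, w_2 \in W$,

$$
F(w_1) + F(w_2) = F(w_1 + w_2) \leq k\|w_1 + w_2\| \leq k\|w_1 - z\| + k\|w_2 + z\|,
$$

and rearranging gives

$$
\sup_{w_1 \in W} \bigl( F(w_1) - k\|w_1 - z\| \bigr) \;\leq\; \inf_{w_2 \in W} \bigl( k\|w_2 + z\| - F(w_2) \bigr).
$$

Any $c$ between these two quantities satisfies $F(w) + c \leq k\|w + z\|$ and $F(w) - c \leq k\|w - z\|$ for all $w \in W$. Scaling by $|\alpha|$ then yields $\tilde{F}(w + \alpha z) \leq k\|w + \alpha z\|$ for every $\alpha \neq 0$, and applying this to $-(w + \alpha z)$ gives the bound on the absolute value.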

Since $c$ exists, we can extend, contradicting maximality. Therefore $W = X$.

What to take away
  • Ingredients 1 and 2 follow the same Zorn’s lemma template as the Hamel basis proof, combined with the codimension-1 decomposition. No new ideas are needed.

  • Ingredient 3 is the only piece that requires actual work, and the only tool it uses is the triangle inequality. This is where the assumption that we’re in a normed space matters, and it’s exactly the property that fails in $L^p$ for $0 < p < 1$, which is why the dual collapses there.
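
A small finite-dimensional illustration of the failing inequality (a sketch; the two-coordinate quantity below stands in for the $L^p$ expression and is an assumption of this example): take $p = 1/2$ and $\|(a,b)\|_p = (|a|^p + |b|^p)^{1/p}$ on $\mathbb{R}^2$. Then

$$
\|(1,0)\|_{1/2} = \|(0,1)\|_{1/2} = 1, \qquad \|(1,1)\|_{1/2} = (1 + 1)^2 = 4 > 2 = \|(1,0)\|_{1/2} + \|(0,1)\|_{1/2},
$$

so the triangle-inequality step in Ingredient 3 is not available. The example only shows which step breaks; $\mathbb{R}^2$ itself still has plenty of linear functionals.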

Geometric Consequences

The corollaries of Hahn–Banach are all separation results — they say you can always find a hyperplane that separates things.

Corollary 1 (The distinguishing property)

Let $x, y \in X$. If $f(x) = f(y)$ for all $f \in X^*$ then $x = y$.

Proof 2

If $x \neq y$, define $\varphi(\alpha(x - y)) = \alpha\|x - y\|$ on $\text{span}(x - y)$. Then $\|\varphi\| = 1$ and Hahn–Banach extends $\varphi$ to $F \in X^*$ with $F(x) - F(y) = F(x-y) = \|x-y\| \neq 0$, contradicting the assumption.

The contrapositive gives the intuition: if two points are on the same level set for every possible foliation, no matter how you orient the hyperplanes, then they must be the same point.

[Figure: two panels.]

Left: the functional $f(u,v) = v$ gives the same reading for $x$ and $y$ (they lie on the same horizontal level set), so $f$ alone cannot distinguish them. Right: the functional $g(u,v) = u$ gives different readings (different vertical level sets) and separates the two points. No single functional suffices; we need the entire dual $X^*$ to guarantee that distinct points are always distinguishable.

Corollary 2 (Norming property)

For each $x \in X$, there exists $f \in X^*$ with $f(x) = \|x\|$ and $\|f\|_{\text{op}} = 1$.

Proof 3

Assume $x \neq 0$ (for $x = 0$, any $f$ with $\|f\| = 1$ works, provided $X \neq \{0\}$). Define $f(\alpha x) = \alpha\|x\|$ on $\text{span}(x)$. Then $|f(\alpha x)| = |\alpha|\|x\| = \|\alpha x\|$, so $\|f\| = 1$ on $\text{span}(x)$. Hahn–Banach extends to $F \in X^*$ with $\|F\| = 1$ and $F(x) = \|x\|$.
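
In finite dimensions the norming functional can be written down directly, with no appeal to Zorn's lemma. The sketch below assumes $X = \mathbb{R}^n$ with the sup norm, where the dual norm of a coefficient vector is its $\ell^1$ norm. The helper names and the sample points are hypothetical, chosen only for illustration.

```python
def sup_norm(x):
    """Sup norm on R^n: the largest coordinate magnitude."""
    return max(abs(t) for t in x)


def dual_norm(a):
    """Norm of the functional v -> sum(a_i * v_i) on (R^n, sup norm).

    For the sup norm this is the l^1 norm of the coefficient vector.
    """
    return sum(abs(t) for t in a)


def reading(a, x):
    """Value of the functional with coefficients a at the point x."""
    return sum(ai * xi for ai, xi in zip(a, x))


def norming_functional(x):
    """Coefficients a with dual_norm(a) == 1 and reading(a, x) == sup_norm(x).

    Picks a coordinate of largest magnitude and reads it with matching sign.
    Assumes x is nonzero.
    """
    i = max(range(len(x)), key=lambda j: abs(x[j]))
    a = [0.0] * len(x)
    a[i] = 1.0 if x[i] >= 0 else -1.0
    return a


x0 = [1.8, 0.6]  # hypothetical point with sup norm 1.8
a = norming_functional(x0)
print(a, dual_norm(a), reading(a, x0))

# Distinguishing property: the norming functional of x - y separates x and y.
x, y = [1.0, 2.0], [3.0, 2.0]  # hypothetical distinct points
b = norming_functional([xi - yi for xi, yi in zip(x, y)])
print(reading(b, x) - reading(b, y))
```

The second part mirrors Proof 2: the gap between the two readings equals the sup norm of $x - y$, so it is nonzero whenever the points differ.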

[Figure: single panel.]

The functional $f(u,v) = u$ has $\|f\| = 1$ (the unit ball fits inside the strip $|f| \leq 1$) and achieves $f(x_0) = \|x_0\|_\infty = 1.8$. The reading equals the norm exactly.

Corollary 3 (The sup formula)

Let $X$ be a normed space and $B_{X^*} = \{f \in X^* : \|f\| \leq 1\}$ the closed unit ball of the dual. Then for all $x \in X$:

$$
\|x\| = \sup\{|f(x)| : f \in B_{X^*}\} = \max\{|f(x)| : f \in B_{X^*}\}
$$

In particular, the supremum is attained: there exists $f \in B_{X^*}$ with $|f(x)| = \|x\|$. The dual $X^*$ is a complete measurement system with no information loss.

Proof 4

The inequality $\leq$: for any $f$ with $\|f\| \leq 1$, we have $|f(x)| \leq \|f\|\|x\| \leq \|x\|$, so $\sup \leq \|x\|$.

The inequality $\geq$ (and that it is a max): by the norming property, there exists $f \in X^*$ with $\|f\| = 1$ and $f(x) = \|x\|$. This $f$ attains the supremum.

[Figure: single panel.]

Three unit-norm functionals give different readings of $x_0 = (1.5, 1.0)$: $f_1(u,v) = u$ reads 1.5, $f_2(u,v) = v$ reads 1.0, and $f_3(u,v) = (u+v)/2$ reads 1.25. (With respect to $\|\cdot\|_\infty$, the norm of a functional is the $\ell^1$ norm of its coefficients, so all three have norm 1.) The best reading equals $\|x_0\|_\infty = 1.5$. The norming property guarantees such an optimal functional always exists.
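
The sup formula can be checked numerically for the point in the figure. This is a sketch under the assumption that $X = \mathbb{R}^2$ with the sup norm, so that the dual unit ball is the $\ell^1$ ball of coefficient vectors. The sampling scheme and helper names are illustrative choices, not part of the argument.

```python
import random


def sup_norm(x):
    """Sup norm on R^n."""
    return max(abs(t) for t in x)


def reading(a, x):
    """Value of the functional with coefficients a at the point x."""
    return sum(ai * xi for ai, xi in zip(a, x))


def random_unit_functional(n, rng):
    """Random coefficient vector rescaled to l^1 norm 1 (up to rounding)."""
    raw = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    total = sum(abs(t) for t in raw)
    if total == 0.0:  # essentially never happens; fall back to a coordinate functional
        return [1.0] + [0.0] * (n - 1)
    return [t / total for t in raw]


rng = random.Random(0)  # fixed seed for reproducibility
x0 = [1.5, 1.0]
samples = [random_unit_functional(len(x0), rng) for _ in range(1000)]
best_sampled = max(abs(reading(a, x0)) for a in samples)

# Extreme points of the l^1 ball: plus or minus the coordinate functionals.
extreme = []
for i in range(len(x0)):
    for sign in (1.0, -1.0):
        e = [0.0] * len(x0)
        e[i] = sign
        extreme.append(e)
best_extreme = max(abs(reading(a, x0)) for a in extreme)

# No sampled functional beats the norm; an extreme point attains it.
print(best_sampled <= sup_norm(x0) + 1e-12, best_extreme == sup_norm(x0))
```

Random sampling only approaches the supremum from below, while the coordinate functional $f_1$ attains it, which is the "max" part of the corollary in this example.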

Corollary 4 (Classification of closures)

Let $M$ be a linear subspace of a normed space $X$ and let $x_0 \in X$. Then $x_0 \in \overline{M}$ if and only if there exists no bounded linear functional $f$ such that $f(x) = 0$ for all $x \in M$ but $f(x_0) \neq 0$.

Proof 5

$(\Rightarrow)$ If $x_0 \in \overline{M}$, pick $x_n \in M$ with $x_n \to x_0$. For any $f \in X^*$ vanishing on $M$, continuity gives $f(x_0) = \lim f(x_n) = 0$.

$(\Leftarrow)$ Suppose $x_0 \notin \overline{M}$. Then $\overline{M}$ is a proper closed subspace and $d = \text{dist}(x_0, \overline{M}) > 0$. Define $g$ on $\overline{M} + \text{span}(x_0)$ by $g(m + \alpha x_0) = \alpha$. Then $|g(m + \alpha x_0)| = |\alpha| \leq \|m + \alpha x_0\|/d$ (since $\|m + \alpha x_0\| \geq |\alpha| d$). So $\|g\| \leq 1/d$. Hahn–Banach extends $g$ to $f \in X^*$. By construction $f = 0$ on $M$ but $f(x_0) = 1 \neq 0$.
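
A worked instance of the $(\Leftarrow)$ construction, under assumptions chosen here for illustration: take $X = (\mathbb{R}^2, \|\cdot\|_\infty)$, $M = \text{span}\{(1,1)\}$ (already closed), and $x_0 = (1,0)$. Then

$$
d = \inf_{t \in \mathbb{R}} \|(1,0) - t(1,1)\|_\infty = \inf_{t \in \mathbb{R}} \max(|1 - t|, |t|) = \tfrac{1}{2}.
$$

Writing $(u,v) = t(1,1) + \alpha(1,0)$ gives $\alpha = u - v$, so $g(u,v) = u - v$, which in this example is already defined on all of $X$. It vanishes on $M$, satisfies $g(x_0) = 1$, and

$$
\|g\| = \sup_{\|(u,v)\|_\infty \leq 1} |u - v| = 2 = \frac{1}{d},
$$

so the bound $\|g\| \leq 1/d$ from the proof is attained here.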

Geometric Hahn–Banach: Separation of Convex Sets

The corollaries above separate points from points and points from the unit ball. The geometric form of Hahn–Banach separates a point from any closed convex set.

Theorem 2 (Separation of point from closed convex set)

Let $C \subset X$ be a closed convex set and $x_0 \notin C$. Then there exists $f \in X^*$ and a constant $\gamma$ such that

$$
f(x_0) > \gamma \geq f(c) \quad \text{for all } c \in C
$$

That is, there is a closed hyperplane $f^{-1}(\gamma)$ that strictly separates $x_0$ from $C$.
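
For intuition, here is a concrete separating functional in a special case. The sketch assumes $X = \mathbb{R}^2$ with the Euclidean norm and $C$ the closed unit ball, and it uses the nearest-point projection, a Hilbert-space shortcut rather than the general Hahn–Banach argument. The function name and the sample point are hypothetical.

```python
import math


def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def separate_from_unit_ball(x0):
    """Return (a, gamma) with dot(a, x0) > gamma >= dot(a, c) for all c in the ball.

    The ball is the closed Euclidean unit ball; x0 must lie outside it.
    """
    r = math.sqrt(dot(x0, x0))
    if r <= 1.0:
        raise ValueError("x0 must lie outside the closed unit ball")
    p = [t / r for t in x0]                  # nearest point of the ball to x0
    a = [xi - pi for xi, pi in zip(x0, p)]   # normal direction x0 - p
    gamma = dot(a, p)                        # value of the functional at p
    return a, gamma


x0 = [2.0, 1.0]  # hypothetical point outside the ball
a, gamma = separate_from_unit_ball(x0)

# Spot-check the inequality on 360 boundary points of the ball.
angles = [2.0 * math.pi * k / 360 for k in range(360)]
worst = max(dot(a, [math.cos(t), math.sin(t)]) for t in angles)
print(dot(a, x0) > gamma, worst <= gamma + 1e-12)
```

Here $f(v) = a \cdot v$ plays the role of the functional in the theorem, and $\gamma$ is its value at the point of $C$ closest to $x_0$.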

[Figure: two panels.]

Why closedness and convexity matter