
The Geometry of Linear Functionals

Before proving Hahn–Banach, it helps to understand what a linear functional looks like geometrically. This perspective clarifies what duality is really doing.

Kernels and level sets

For a nonzero linear functional $f \in X'$ (where $X'$ denotes the algebraic dual), define two canonical sets:

$$K(f) = \{x \in X : f(x) = 0\}, \qquad I(f) = \{x \in X : f(x) = 1\}.$$

Both are level sets of $f$: $K(f)$ is where $f$ vanishes, and $I(f)$ is where $f$ equals 1.

Note that $0 \in K(f)$ always (since $f(0) = 0$ by linearity), while $0 \notin I(f)$ (since $f(0) = 0 \neq 1$). So the kernel passes through the origin; the unit level set does not.

The fundamental decomposition

Proposition 1 (Codimension-1 decomposition)

Let $f \in X'$ be a nonzero linear functional, and let $x_0 \in X$ with $f(x_0) \neq 0$. Then:

  1. $K(f)$ is a subspace of codimension 1 (see Corollary 1): it is “missing exactly one dimension.”

  2. Every vector $x \in X$ has a unique representation

    $$x = y + \lambda x_0, \qquad y \in K(f), \quad \lambda \in \mathbb{R}.$$

  3. The scalar $\lambda$ is determined: $\lambda = \dfrac{f(x)}{f(x_0)}$.

Proof 1

Given any $x \in X$, set $\lambda = f(x)/f(x_0)$ and $y = x - \lambda x_0$. Then $f(y) = f(x) - \lambda f(x_0) = f(x) - \frac{f(x)}{f(x_0)} f(x_0) = 0$, so $y \in K(f)$ and $x = y + \lambda x_0$ by construction.

Uniqueness: If $x = y_1 + \lambda_1 x_0 = y_2 + \lambda_2 x_0$ with $y_1, y_2 \in K(f)$, then $(\lambda_1 - \lambda_2)x_0 = y_2 - y_1 \in K(f)$, so $(\lambda_1 - \lambda_2)f(x_0) = 0$. Since $f(x_0) \neq 0$, we get $\lambda_1 = \lambda_2$, hence $y_1 = y_2$.

Codimension 1: The decomposition gives $X = K(f) \oplus \text{span}(x_0)$. Since $\text{span}(x_0)$ is one-dimensional, $K(f)$ has codimension 1: it is a maximal proper subspace, i.e., a hyperplane through the origin.
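The construction in the proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functional `f`, the transverse vector `x0`, and the test vector `x` are hypothetical choices in $\mathbb{R}^3$, and exact rational arithmetic is used so that $f(y) = 0$ holds exactly.

```python
from fractions import Fraction

def decompose(f, x0, x):
    """Split x as y + lam * x0 with f(y) == 0, following Proposition 1."""
    fx0 = f(x0)
    if fx0 == 0:
        raise ValueError("x0 must satisfy f(x0) != 0")
    lam = Fraction(f(x)) / Fraction(fx0)          # lam = f(x) / f(x0)
    y = tuple(xi - lam * x0i for xi, x0i in zip(x, x0))
    return y, lam

# Hypothetical example data: f(a, b, c) = a + 2b - c on R^3.
def f(v):
    a, b, c = v
    return a + 2 * b - c

x0 = (0, 1, 0)   # f(x0) = 2, so x0 is transverse to the kernel
x = (3, 1, 4)    # f(x) = 3 + 2 - 4 = 1
y, lam = decompose(f, x0, x)
```

Here `lam` is $1/2$ and `y` is $(3, 1/2, 4)$, which lies in the kernel since $3 + 1 - 4 = 0$.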

The foliation picture

The decomposition tells us exactly how $f$ organizes the space geometrically.

Every $x \in X$ is uniquely $x = y + \lambda x_0$ with $y \in K(f)$, and $f(x) = \lambda f(x_0)$. So the level sets of $f$ are:

$$f^{-1}(c) = \left\{y + \frac{c}{f(x_0)}\,x_0 : y \in K(f)\right\} = K(f) + \frac{c}{f(x_0)}\,x_0$$

These are parallel copies of $K(f)$, shifted along the $x_0$-direction. They foliate $X$ (every point lies on exactly one level set), and $f$ assigns a “height” $c$ to each slice.

If we normalize so that $f(x_0) = 1$, the formula simplifies to $f^{-1}(c) = K(f) + c\,x_0$: the slice at height $c$ is the kernel shifted by $c$ copies of $x_0$.

Example 1 (Foliation in $\mathbb{R}^3$)

Take $f(x,y,z) = 2x + 3y - z$.

  • $K(f) = \{2x + 3y - z = 0\}$, a plane through the origin.

  • Pick $x_0 = (1, 0, 0)$; then $f(x_0) = 2$.

  • The level sets $f^{-1}(c) = \{2x + 3y - z = c\}$ are parallel planes.

  • Moving along $x_0$ increases $f$ by 2 per unit step: $f(v + t\,x_0) = f(v) + 2t$.
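As a quick check of the level-set formula $f^{-1}(c) = K(f) + \frac{c}{f(x_0)}\,x_0$ for this functional, the sketch below shifts a few kernel points and evaluates $f$ on the results. The transverse direction $x_0 = (1, 0, 0)$ (with $f(x_0) = 2$), the sample kernel points, and the height $c = 5$ are all assumptions made for illustration.

```python
from fractions import Fraction

def f(v):
    x, y, z = v
    return 2 * x + 3 * y - z

def level_point(k, c, x0):
    """Return k + (c / f(x0)) * x0, a point on the level set f = c when f(k) = 0."""
    t = Fraction(c) / Fraction(f(x0))
    return tuple(ki + t * xi for ki, xi in zip(k, x0))

x0 = (1, 0, 0)   # assumed transverse direction, f(x0) = 2

# Sample points of the kernel; note that (1, 0, 2) lies in K(f), not off it.
kernel_pts = [(0, 0, 0), (1, 0, 2), (0, 1, 3), (3, -2, 0)]

# Shifting each kernel point by (5 / 2) * x0 should land on the slice f = 5.
heights = [f(level_point(k, 5, x0)) for k in kernel_pts]
```

Every entry of `heights` equals 5, so all the shifted points sit on the same slice.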

Figure: Level sets of $f(x,y) = x + y$ (left) and $g(x,y) = 2(x+y)$ (right). Both functionals share the same kernel $K(f) = K(g) = \{x + y = 0\}$, but $g$ has twice the spacing density: the $c = 1$ level set of $g$ sits where the $c = 1/2$ level set of $f$ would be.

What the kernel and scale each determine

A nonzero functional is determined by two independent pieces of geometric data:

1. The kernel determines the orientation of the slices. Two functionals with the same kernel produce exactly the same family of parallel hyperplanes. The kernel tells you which directions are horizontal (directions lying in $K(f)$) and which direction is transverse.

2. The scale determines the spacing between slices. If $g = 2f$, then $\ker(g) = \ker(f)$: same orientation, same family of parallel hyperplanes. But $g^{-1}(1) = f^{-1}(1/2)$, which sits halfway between $f^{-1}(0)$ and $f^{-1}(1)$. Rescaling $f$ by $\lambda$ doesn’t rotate the slices; it compresses or stretches the spacing between them.
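The relabeling of slices under rescaling can be checked numerically. The sketch below assumes the concrete pair $f(x,y) = x + y$ and $g = 2f$ on $\mathbb{R}^2$, and the sample points are arbitrary choices on the line $x + y = 1/2$.

```python
from fractions import Fraction

def f(p):
    return p[0] + p[1]

def g(p):
    return 2 * f(p)

# Points on the level set f = 1/2, obtained by shifting kernel points (t, -t)
# along the transverse direction (1, 0).
half = Fraction(1, 2)
points = [(t + half, -t) for t in (-2, 0, 1, 3)]

f_values = [f(p) for p in points]   # every entry is 1/2
g_values = [g(p) for p in points]   # every entry is 1
```

The same geometric line carries the label $1/2$ for $f$ and the label $1$ for $g$: the slices do not move, only their heights are rescaled.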

Multiple functionals: adding vs. intersecting

A common source of confusion is the distinction between adding two functionals (an operation in $X^*$) and intersecting their kernels (an operation on subspaces of $X$). These produce very different geometric objects.

Example 2 (Adding vs. intersecting functionals in $\mathbb{R}^3$)

Work in $\mathbb{R}^3$ with the first two coordinate functionals:

$$f_1(x,y,z) = x, \qquad f_2(x,y,z) = y.$$

One functional. The kernel $\ker f_1 = \{x = 0\}$ is the $yz$-plane, a codimension-1 subspace. The decomposition gives $\mathbb{R}^3 = \ker f_1 \oplus \operatorname{span}\{e_1\}$: every vector is uniquely a point in the $yz$-plane plus a multiple of $e_1$.

Intersecting two kernels. The intersection $\ker f_1 \cap \ker f_2 = \{x = 0\} \cap \{y = 0\} = \{(0, 0, z) : z \in \mathbb{R}\}$ is the $z$-axis, a codimension-2 subspace. Each independent functional removes one degree of freedom. The decomposition extends: $\mathbb{R}^3 = (\ker f_1 \cap \ker f_2) \oplus \operatorname{span}\{e_1, e_2\}$. Imposing $n$ independent constraints carves out a codimension-$n$ subspace.

Adding two functionals. The sum $f_1 + f_2$ is the functional $(f_1 + f_2)(x,y,z) = x + y$. This is a single functional, so its kernel $\ker(f_1 + f_2) = \{x + y = 0\}$ is a plane: still codimension 1. It is a different plane from either $\ker f_1$ or $\ker f_2$, tilted at $45^\circ$ between them. Adding functionals combines two measurements into one new measurement; it does not impose two constraints simultaneously.

Dependent functionals. If instead $f_2 = 2 f_1$, then $\ker f_1 \cap \ker f_2 = \ker f_1$, which is still the $yz$-plane, still codimension 1. The second functional measures the same direction as the first (just with a different scale), so it adds no new geometric information.

Cancellation. If instead $f_2 = -f_1$, then $f_1 + f_2 = 0$, the zero functional. Its kernel is all of $\mathbb{R}^3$: the two measurements cancel perfectly, and the constraint vanishes.
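In finite dimensions each case above reduces to a rank computation: writing the functionals as rows of a matrix, the codimension of the common kernel is the rank of that matrix. The sketch below is a minimal illustration of this, with a small hand-written `rank` helper (an assumption of this example, not something from the text) working over exact rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a small matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

f1, f2 = (1, 0, 0), (0, 1, 0)            # coefficient rows of f1 and f2
f_sum = tuple(a + b for a, b in zip(f1, f2))

codim_intersection = rank([f1, f2])                       # ker f1 ∩ ker f2
codim_sum = rank([f_sum])                                 # ker(f1 + f2)
codim_dependent = rank([f1, tuple(2 * a for a in f1)])    # f2 = 2 f1
codim_cancel = rank([tuple(a - a for a in f1)])           # f1 + (-f1) = 0
```

The four values are 2, 1, 1, and 0, matching the $z$-axis, the tilted plane, the $yz$-plane, and all of $\mathbb{R}^3$ respectively.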

Figure: Left: the kernel of a single functional $f_1(x,y,z) = x$ is the $yz$-plane (codimension 1), with the transverse direction $e_1$ shown in red. Center: intersecting the kernels of two independent functionals $f_1$ and $f_2$ gives the $z$-axis (codimension 2), shown as the thick purple line where the two faint planes meet. Right: the sum $f_1 + f_2$ is a single functional with kernel $\{x + y = 0\}$, a different plane (codimension 1, green). Adding functionals produces one new measurement; intersecting kernels imposes multiple constraints.

Remark 1 (From intersecting kernels to weak neighborhoods)

The intersection $\ker f_1 \cap \cdots \cap \ker f_n$ for $n$ linearly independent functionals is a subspace of codimension $n$. If we thicken each kernel into a slab $\{y : |f_i(y)| < \varepsilon\}$ and intersect, we get a tube: bounded cross-section in $n$ directions, unbounded in the remaining (codimension-$n$) directions. These tubes are exactly the basic neighborhoods of the weak topology (see Definition 1). Each additional independent functional constrains one more direction but still leaves infinitely many unconstrained in infinite dimensions.
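A membership test for such a tube is short enough to write out. The sketch below is a finite-dimensional toy, assuming functionals on $\mathbb{R}^3$ given as coefficient tuples; the name `in_tube` and the sample points are hypothetical.

```python
def in_tube(functionals, eps, y):
    """True if |f_i(y)| < eps for every functional f_i (given as coefficient tuples)."""
    return all(abs(sum(a * b for a, b in zip(f, y))) < eps for f in functionals)

f1, f2 = (1, 0, 0), (0, 1, 0)
eps = 0.1

# Huge displacement along z, the direction no functional measures: still inside.
far_along_z = in_tube([f1, f2], eps, (0.05, -0.05, 1e9))

# Small displacement along x, which f1 does measure: outside the f1 slab.
off_in_x = in_tube([f1, f2], eps, (0.2, 0.0, 0.0))
```

The first point is in the tube despite being very far from the origin, while the second is not, which is the sense in which the tube is thin in $n$ directions and unbounded in the rest.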

Functionals are determined by their kernels

The decomposition immediately yields a powerful uniqueness result.

If $f, g : X \to \mathbb{R}$ are nonzero linear functionals with $\ker(f) = \ker(g)$, then $f = \lambda g$ for some scalar $\lambda \neq 0$.

Proof 2

Pick any $x_0 \notin \ker(f) = \ker(g)$. By the decomposition, every $x \in X$ can be written as $x = y + \frac{f(x)}{f(x_0)}\,x_0$ with $y \in \ker(f) = \ker(g)$. Applying $g$:

$$g(x) = g(y) + \frac{f(x)}{f(x_0)}\,g(x_0) = 0 + \frac{g(x_0)}{f(x_0)}\,f(x)$$

since $g(y) = 0$ (because $y \in \ker(g)$). Setting $\lambda = g(x_0)/f(x_0)$, which is nonzero because $x_0 \notin \ker(g)$, gives $g = \lambda f$, or equivalently $f = \lambda^{-1} g$.
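The proof also says how to compute the constant: evaluate both functionals at any single vector outside the common kernel. A small sketch, assuming the hypothetical pair $f(x,y) = x + y$ and $g(x,y) = 3x + 3y$ on $\mathbb{R}^2$:

```python
from fractions import Fraction

def ratio_from_transverse(f, g, x0):
    """lambda = g(x0) / f(x0), as in the proof, for any x0 outside the common kernel."""
    return Fraction(g(x0)) / Fraction(f(x0))

def f(v):
    return v[0] + v[1]

def g(v):
    return 3 * v[0] + 3 * v[1]

lam = ratio_from_transverse(f, g, (1, 0))
lam_other = ratio_from_transverse(f, g, (2, 5))   # a different transverse vector

samples = [(1, -1), (4, 7), (-2, 9), (0, 3)]
agrees = all(g(x) == lam * f(x) for x in samples)
```

Both choices of transverse vector give the same ratio, and $g = \lambda f$ holds on every sample, including the kernel point $(1, -1)$.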

Consequence for the dual space: Classifying nonzero linear functionals up to scaling is the same as classifying codimension-1 subspaces. The projective algebraic dual $(X' \setminus \{0\})/\!\sim$ (where $f \sim \lambda f$ for $\lambda \neq 0$) is in bijection with the set of hyperplanes through the origin.

Characterizing continuity via hyperplanes

Here is where the algebraic vs. topological dual distinction becomes geometric. The key fact is surprisingly sharp: a hyperplane is either closed (and nowhere dense) or dense in $X$; there is nothing in between.

A proper closed subspace of a normed space has empty interior (hence is nowhere dense).

Proof 3

Let $M \subsetneq X$ be a proper closed subspace. Suppose for contradiction that $M$ contains an open ball $B(x, r)$ for some $x \in M$, $r > 0$. Since $M$ is a subspace and $x \in M$, for any $v \in B(0, r)$ we have $v = (x + v) - x$, and $x + v \in B(x, r) \subset M$, so $v \in M$. Thus $B(0, r) \subset M$. But then for any $y \in X$ with $y \neq 0$, the vector $\frac{r}{2\|y\|}\,y$ lies in $B(0, r) \subset M$, so $y \in M$ (since $M$ is a subspace). This gives $M = X$, contradicting properness.

Why “empty interior” sounds wrong at first

Interior here means in the full topology of $X$, not the relative topology on $M$. The $x$-axis in $\mathbb{R}^2$ looks solid from the inside, but any open ball in $\mathbb{R}^2$ escapes into the $y$-direction: the subspace is “infinitely thin” when viewed from the ambient space.

So: a continuous linear functional has a closed kernel, and this kernel is a “thin” hyperplane, that is, a codimension-1 subspace with empty interior in $X$.

A codimension-1 subspace of a normed space is either closed or dense.

Proof 4

Let $K(f)$ be a codimension-1 subspace (the kernel of some nonzero $f$). Its closure $\overline{K(f)}$ is a closed subspace containing $K(f)$. Since $K(f)$ has codimension 1, there are only two possibilities:

  • $\overline{K(f)} = K(f)$, in which case $K(f)$ is closed.

  • $\overline{K(f)} \supsetneq K(f)$. Then $\overline{K(f)}$ is a closed subspace strictly containing a codimension-1 subspace, so it must be all of $X$. Hence $K(f)$ is dense in $X$.

There is no intermediate option.

Proposition 3 (Continuity via the kernel)

A nonzero linear functional $f : X \to \mathbb{R}$ is continuous if and only if $\ker(f)$ is closed.

Proof 5

($\Rightarrow$): If $f$ is continuous, then $\ker(f) = f^{-1}(\{0\})$ is the preimage of a closed set, hence closed.

($\Leftarrow$): Suppose $\ker(f)$ is closed and $f \neq 0$. Pick $x_0$ with $f(x_0) \neq 0$; by the decomposition, $X = \ker(f) \oplus \text{span}(x_0)$. Since $\ker(f)$ is closed and $x_0 \notin \ker(f)$, $x_0$ has positive distance from $\ker(f)$: set $\delta = \text{dist}(x_0, \ker(f)) > 0$. For any $x = y + \lambda x_0$ with $y \in \ker(f)$:

$$\|x\| = \|y + \lambda x_0\| \geq |\lambda| \cdot \text{dist}(x_0, \ker(f)) = |\lambda|\delta$$

(since $-y/\lambda \in \ker(f)$ when $\lambda \neq 0$, so $\|x_0 + y/\lambda\| \geq \delta$, and multiplying by $|\lambda|$ gives $\|x\| \geq |\lambda|\delta$; the case $\lambda = 0$ is trivial). Therefore:

$$|f(x)| = |\lambda| \cdot |f(x_0)| \leq \frac{|f(x_0)|}{\delta}\,\|x\|$$

So $f$ is bounded with $\|f\| \leq |f(x_0)|/\delta$.
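The bound $\|f\| \leq |f(x_0)|/\delta$ can be tried out numerically in a setting where everything is computable by hand. The sketch below assumes Euclidean $\mathbb{R}^2$ and the hypothetical functional $f(x,y) = 3x + 4y$, for which the distance to the kernel has the closed form $|f(v)|/\|a\|_2$; the operator norm is only estimated by sampling the unit circle.

```python
import math

a = (3.0, 4.0)   # hypothetical functional f(x, y) = 3x + 4y on Euclidean R^2

def f(v):
    return a[0] * v[0] + a[1] * v[1]

def dist_to_kernel(v):
    """Euclidean distance from v to the line {f = 0}: |f(v)| / ||a||."""
    return abs(f(v)) / math.hypot(*a)

x0 = (1.0, 0.0)                            # f(x0) = 3, so x0 is off the kernel
bound = abs(f(x0)) / dist_to_kernel(x0)    # |f(x0)| / delta from the proof

# Estimate the operator norm by sampling the unit circle.
sampled_norm = max(
    abs(f((math.cos(t), math.sin(t))))
    for t in (2 * math.pi * k / 3600 for k in range(3600))
)
```

In this example the bound evaluates to about 5, and the sampled norm approaches it from below, so the estimate from the proof happens to be tight here.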

Remark 2 (Connection to the quotient norm)

The quantity $\delta = \operatorname{dist}(x_0, \ker(f))$ appearing in the proof above is the quotient norm $\|[x_0]\|_{X/\ker(f)}$ (see Definition 2). The closedness of $\ker(f)$ is essential: it guarantees that the quotient norm is a genuine norm, so that $\|[x_0]\| > 0$ for $x_0 \notin \ker(f)$.

In fact, the entire proof is the quotient perspective in disguise. The functional $f$ factors through the quotient as

$$X \xrightarrow{\;\pi\;} X/\ker(f) \xrightarrow{\;\widetilde{f}\;} \mathbb{R}$$

where $\pi$ is the canonical projection and $\widetilde{f}$ is the isomorphism from Theorem 1. Since $X/\ker(f)$ is one-dimensional, $\widetilde{f}$ is automatically bounded (all norms on a one-dimensional space are equivalent), and $\pi$ is bounded with $\|\pi\| \leq 1$. The composite $f = \widetilde{f} \circ \pi$ is therefore bounded.

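Combining everything: for a nonzero linear functional $f$ on a normed space $X$, either $f$ is continuous and $\ker(f)$ is closed (hence nowhere dense), or $f$ is discontinuous and $\ker(f)$ is dense in $X$.

There is nothing in between: every hyperplane is either cleanly closed or pathologically dense.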

Remark 3 (Dense hyperplanes are pathological)

If $K(f)$ is dense, then every level set $f^{-1}(c)$ is also dense. Points at height 0, height 1, and height $1{,}000{,}000$ are all interleaved at arbitrarily fine scales throughout the space. With a continuous functional, you’d see clean parallel bands of color, smoothly transitioning. With a discontinuous functional, every color appears in every tiny ball.

Concretely: if $f$ is discontinuous, then for any $\varepsilon > 0$ and any $M > 0$, there exists $x$ with $\|x\| < \varepsilon$ but $|f(x)| > M$. The slicing exists algebraically, but the height function is wildly discontinuous. This is why continuous functionals (the ones with closed, nowhere-dense kernels) are the only ones useful for analysis.
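To make the $\varepsilon$–$M$ statement tangible, here is a sketch using one standard unbounded functional, chosen for illustration and not taken from the text above: on the space of finitely supported sequences with the sup norm, let $f(x) = \sum_k k\,x_k$. Sequences are represented as dictionaries from index to value, and the helper names are hypothetical.

```python
import math

def f(x):
    """f(x) = sum_k k * x_k on finitely supported sequences (1-indexed dict)."""
    return sum(k * v for k, v in x.items())

def sup_norm(x):
    return max((abs(v) for v in x.values()), default=0.0)

def witness(eps, M):
    """Return x with sup_norm(x) < eps and |f(x)| > M."""
    n = math.floor(2 * M / eps) + 1   # n > 2M / eps, so n * eps / 2 > M
    return {n: eps / 2}               # a single spike of height eps / 2 at index n

x = witness(0.5, 1000.0)
```

With $\varepsilon = 0.5$ and $M = 1000$ the witness is a spike of height $0.25$ at index $4001$, so its norm is $0.25$ while $f(x) = 1000.25$. Pushing the spike further out makes $f$ as large as desired without increasing the norm.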

A Worked Example: Hahn–Banach Extension in $\mathbb{R}^2$

The Hahn–Banach theorem, stated in the hyperplane language, says: if you know the heights of points along a subspace, you can extend the height function to the entire space by choosing a closed hyperplane.

We make this explicit with a worked example. We use $\mathbb{R}^2$ with the $\ell^\infty$ norm $\|(x,y)\|_\infty = \max(|x|, |y|)$, so the unit ball is a square.

Why the $\ell^\infty$ norm and not Euclidean?

With the Euclidean norm (round unit ball), the Hahn–Banach extension turns out to be unique: the roundness of the ball forces a single choice. This is a general fact about Hilbert spaces: the Riesz representation theorem pins down the unique extension.

With the $\ell^\infty$ norm (square unit ball), the extension is non-unique: there is a whole interval of valid choices, each giving a different kernel. This is the generic situation in Banach spaces.

Step 1: Define $f$ on a subspace.

Let $M = \text{span}((1,1))$. Define $f$ on $M$ by $f(t(1,1)) = t$. Check the norm: $|f(t(1,1))| = |t|$ and $\|t(1,1)\|_\infty = |t| \cdot \|(1,1)\|_\infty = |t|$. So $\|f\| = 1$ on $M$.

We know the heights along the diagonal line $M$: the origin has height 0, the point $(1/2, 1/2)$ has height $1/2$, the point $(1,1)$ has height 1.

Step 2: Decompose.

Pick $z = (1,-1) \notin M$ as the new direction. Every $(x,y) \in \mathbb{R}^2$ decomposes as:

$$(x,y) = \underbrace{\frac{x+y}{2}(1,1)}_{\text{component in }M} + \underbrace{\frac{x-y}{2}}_{\text{coefficient }\alpha}\,(1,-1)$$

Step 3: The extension is determined by one number.

Set $c = F(z) = F(1,-1)$, the height we assign to the new direction. By linearity:

$$F(x,y) = \frac{x+y}{2} + \frac{x-y}{2} \cdot c = \frac{1+c}{2}\,x + \frac{1-c}{2}\,y$$

Note $a + b = 1$ where $a = (1+c)/2$ and $b = (1-c)/2$, ensuring $F(1,1) = 1 = f(1,1)$.

Step 4: Find which values of $c$ are valid.

We need $\|F\|_{\text{op}} \leq 1$ in the $\ell^\infty$ norm. For a linear functional $F(x,y) = ax + by$ on $(\mathbb{R}^2, \|\cdot\|_\infty)$, the operator norm is $\|F\| = |a| + |b|$. The constraint $|a| + |b| \leq 1$ combined with $a + b = 1$ forces $a, b \geq 0$, that is, $(1 + c)/2 \geq 0$ and $(1 - c)/2 \geq 0$, giving:

$$\boxed{c \in [-1, 1]}$$

Every $c$ in this interval gives a valid norm-preserving extension. The constraint $|a| + |b| \leq 1$ is the $\ell^1$ unit ball in the $(a,b)$-plane, and $a + b = 1$ is a line cutting through it. The valid extensions correspond to the segment where the line meets the ball.
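Steps 3 and 4 can be replayed mechanically. The sketch below builds the coefficients $(a, b)$ from $c$, evaluates the operator norm $|a| + |b|$, and filters a grid of candidate values; the function names and the candidate grid are assumptions made for this illustration.

```python
from fractions import Fraction

def extension_coeffs(c):
    """Coefficients (a, b) of F(x, y) = a*x + b*y with F(1,1) = 1 and F(1,-1) = c."""
    c = Fraction(c)
    return (1 + c) / 2, (1 - c) / 2

def op_norm_linf(a, b):
    """Operator norm of (x, y) -> a*x + b*y on (R^2, sup norm): |a| + |b|."""
    return abs(a) + abs(b)

# Candidate heights c = -2, -3/2, ..., 3/2, 2.
candidates = [Fraction(k, 2) for k in range(-4, 5)]

# Keep those whose extension has operator norm at most 1.
valid = [c for c in candidates if op_norm_linf(*extension_coeffs(c)) <= 1]
```

Only the candidates in $[-1, 1]$ survive. For example, $c = 2$ gives $(a, b) = (3/2, -1/2)$ with norm $2$, so that extension is not norm-preserving.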

Figure: The $\ell^1$ unit ball $|a| + |b| \leq 1$ (diamond) is the dual ball of $(\mathbb{R}^2, \|\cdot\|_\infty)$, so the operator norm constraint $\|F\| \leq 1$ requires $(a,b)$ to lie inside it. The extension constraint $a + b = 1$ (ensuring $F = f$ on $M$) is a line. Valid extensions are the segment where the line meets the diamond, parametrized by $c \in [-1, 1]$.

Step 5: Each $c$ gives a different kernel.

Figure: Three norm-preserving extensions of $f$ from $M = \mathrm{span}(1,1)$ to all of $\mathbb{R}^2$, corresponding to $c = 0, 1, -1$. In each panel, the $\ell^\infty$ unit square (shaded) fits between the $F = -1$ and $F = 1$ level sets, confirming $\|F\| = 1$. The kernel (red) and the subspace $M$ (dashed) differ in each case, but $F(1,1) = 1$ in all three.

Takeaway

Same functional on $M$. Three different extensions. Three different kernels. In each case the unit square fits between the level sets $F^{-1}(-1)$ and $F^{-1}(1)$: the norm constraint $\|F\| \leq 1$ is satisfied because the flat sides of the square allow the kernel to tilt freely. With a round ball, only one tilt would work. Hahn–Banach says: in any Banach space, no matter how large, at least one valid $c$ always exists.

Visualizing the full family of extensions

Every $c \in [-1, 1]$ gives a different valid extension. Each extension defines a strip between $F^{-1}(-1)$ and $F^{-1}(1)$, and the norm constraint $\|F\| = 1$ means the unit square must fit inside that strip. The unit ball is the intersection of all such strips.

Figure: Each valid extension $F$ (parametrized by $c \in [-1,1]$) defines a strip between $F^{-1}(-1)$ and $F^{-1}(1)$, shown as a pair of lines in matching color. The $\ell^\infty$ unit square (shaded) fits inside every strip. All $F^{-1}(1)$ lines pass through $(1,1)$ and all $F^{-1}(-1)$ lines pass through $(-1,-1)$. The unit ball is exactly the intersection of all these strips, which is the geometric content of the sup formula $\|x\| = \sup_{\|F\| \leq 1} |F(x)|$.
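The claim that the strips cut out exactly the unit square can be checked point by point: for this one-parameter family, $\max_{c \in [-1,1]} |F_c(x,y)|$ should equal $\max(|x|, |y|)$. The sketch below approximates the maximum on a grid of $c$ values that includes both endpoints; the grid size and the sample points are arbitrary choices for illustration.

```python
from fractions import Fraction

def F(c, p):
    """The extension with F(1, -1) = c, evaluated at p = (x, y)."""
    x, y = p
    return (1 + c) * Fraction(x) / 2 + (1 - c) * Fraction(y) / 2

def sup_over_family(p, steps=200):
    """max |F_c(p)| over a grid of c in [-1, 1] (the grid includes both endpoints)."""
    grid = [Fraction(2 * k, steps) - 1 for k in range(steps + 1)]
    return max(abs(F(c, p)) for c in grid)

points = [(1, 1), (3, -2), (Fraction(1, 2), Fraction(-7, 4)), (0, 0), (-5, 4)]
matches = [
    sup_over_family(p) == max(abs(Fraction(p[0])), abs(Fraction(p[1])))
    for p in points
]
```

The maximum is always attained at an endpoint, since $F_1(x,y) = x$ and $F_{-1}(x,y) = y$ and $|F_c|$ is convex in $c$. That is why the grid result agrees exactly with the $\ell^\infty$ norm at every sample point.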