Before proving Hahn–Banach, it helps to understand what a linear functional looks like geometrically. This perspective clarifies what duality is really doing.
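Kernels and level sets
For a nonzero linear functional $f \in X^{\#}$ (where $X^{\#}$ denotes the algebraic dual), define two canonical sets:
$\ker f = \{x \in X : f(x) = 0\}$, the kernel
$f^{-1}(1) = \{x \in X : f(x) = 1\}$, the unit level set
Both are level sets of $f$: $\ker f$ is where $f$ vanishes, and $f^{-1}(1)$ is where $f$ equals 1.
Note that $0 \in \ker f$ always (since $f(0) = 0$ by linearity), while $0 \notin f^{-1}(1)$ (since $f(0) = 0 \neq 1$). So the kernel passes through the origin; the unit level set does not.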
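The fundamental decomposition
Proposition 1 (Codimension-1 decomposition)
Let $f$ be a nonzero linear functional on $X$, and let $x_0 \in X$ with $f(x_0) \neq 0$. Then:
$\ker f$ is a subspace of codimension 1 (see Corollary 1): it is “missing exactly one dimension.”
Every vector $x \in X$ has a unique representation
$$x = m + \alpha x_0, \qquad m \in \ker f, \quad \alpha \text{ a scalar}.$$
The scalar is determined: $\alpha = f(x)/f(x_0)$.
Proof 1
Given any $x \in X$, set $\alpha = f(x)/f(x_0)$ and $m = x - \alpha x_0$. Then $f(m) = f(x) - \alpha f(x_0) = 0$, so $m \in \ker f$ and $x = m + \alpha x_0$ by construction.
Uniqueness: If $m + \alpha x_0 = m' + \alpha' x_0$ with $m, m' \in \ker f$, then $m - m' = (\alpha' - \alpha) x_0$, so $(\alpha' - \alpha) f(x_0) = f(m - m') = 0$. Since $f(x_0) \neq 0$, we get $\alpha = \alpha'$, hence $m = m'$.
Codimension 1: The decomposition gives $X = \ker f \oplus \operatorname{span}(x_0)$. Since $\operatorname{span}(x_0)$ is one-dimensional, $\ker f$ has codimension 1. It is a maximal proper subspace, i.e., a hyperplane through the origin.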
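The decomposition in Proposition 1 can be checked numerically. The sketch below assumes a hypothetical functional $f(x) = w \cdot x$ on $\mathbb{R}^3$; the weight vector `w`, the transverse vector `x0`, and the helper `decompose` are illustrative choices, not taken from the text.

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
w = np.array([1.0, 2.0, -1.0])    # f(x) = w . x
x0 = np.array([1.0, 0.0, 0.0])    # any vector with f(x0) != 0

def f(x):
    return float(w @ x)

def decompose(x):
    """Return (m, alpha) with x = m + alpha * x0 and f(m) = 0."""
    alpha = f(x) / f(x0)
    m = x - alpha * x0
    return m, alpha

x = np.array([3.0, -1.0, 4.0])
m, alpha = decompose(x)
print("alpha =", alpha)                    # the scalar f(x) / f(x0)
print("f(m)  =", f(m))                     # kernel part has height 0
print("reconstructed:", m + alpha * x0)    # recovers x
```

Any other `x0` with `f(x0) != 0` gives a different kernel part `m` for the same `x`, which reflects the fact that the complement of $\ker f$ is not canonical.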
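The foliation picture
The decomposition tells us exactly how $f$ organizes the space geometrically.
Every $x \in X$ is uniquely $x = m + \alpha x_0$ with $m \in \ker f$, and $f(x) = \alpha f(x_0)$. So the level sets of $f$ are:
$$f^{-1}(c) = \ker f + \frac{c}{f(x_0)}\, x_0 \qquad \text{for each scalar } c.$$
These are parallel copies of $\ker f$, shifted along the $x_0$-direction. They foliate $X$ (every point lies on exactly one level set), and $f$ assigns a “height” to each slice.
If we normalize so that $f(x_0) = 1$:
$f^{-1}(0) = \ker f$ is the slice through the origin.
$f^{-1}(1) = \ker f + x_0$ is the slice one step along $x_0$.
$f^{-1}(n) = \ker f + n x_0$ for any integer $n$: evenly spaced slices.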
Example 1 (Foliation in $\mathbb{R}^3$)
Take .
— a plane through the origin.
Pick ; then .
The level sets $f^{-1}(c)$ are parallel planes.
Moving along $x_0$ increases $f$ by 2 per unit step.
import numpy as np
import matplotlib.pyplot as plt
# Color map for the five highlighted level sets c = -2, -1, 0, 1, 2
level_colors = {-2: 'C4', -1: 'C3', 0: 'C0', 1: 'C1', 2: 'C2'}
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
t = np.linspace(-3, 3, 100)
# Left: foliation by f(x,y) = x + y in R^2
ax = axes[0]
for c in np.arange(-5, 5, 0.5):
    y_line = c - t
    if c in level_colors:
        ax.plot(t, y_line, color=level_colors[c], alpha=0.9, lw=2.5)
    else:
        ax.plot(t, y_line, color='gray', alpha=0.2, lw=0.7)
ax.set_xlim(-2.5, 2.5)
ax.set_ylim(-2.5, 2.5)
ax.set_aspect('equal')
ax.axhline(0, color='k', lw=0.5, alpha=0.3)
ax.axvline(0, color='k', lw=0.5, alpha=0.3)
ax.set_title(r'Foliation by $f(x,y) = x + y$', fontsize=14)
ax.set_xlabel(r'$x$', fontsize=12)
ax.set_ylabel(r'$y$', fontsize=12)
ax.tick_params(labelsize=11)
for c, col in level_colors.items():
    ax.plot([], [], color=col, lw=2.5, label=f'$c = {c}$')
ax.legend(fontsize=11, loc='upper right', ncol=1)
# Right: foliation by g(x,y) = 2(x + y) — same kernel, denser spacing
ax = axes[1]
for c in np.arange(-9, 9, 1):
    y_line = c / 2 - t  # g = 2(x+y) = c => x+y = c/2
    if c in level_colors:
        ax.plot(t, y_line, color=level_colors[c], alpha=0.9, lw=2.5)
    else:
        ax.plot(t, y_line, color='gray', alpha=0.2, lw=0.7)
ax.set_xlim(-2.5, 2.5)
ax.set_ylim(-2.5, 2.5)
ax.set_aspect('equal')
ax.axhline(0, color='k', lw=0.5, alpha=0.3)
ax.axvline(0, color='k', lw=0.5, alpha=0.3)
ax.set_title(r'Foliation by $g(x,y) = 2(x + y)$: denser spacing', fontsize=14)
ax.set_xlabel(r'$x$', fontsize=12)
ax.set_ylabel(r'$y$', fontsize=12)
ax.tick_params(labelsize=11)
for c, col in level_colors.items():
    ax.plot([], [], color=col, lw=2.5, label=f'$c = {c}$')
ax.legend(fontsize=11, loc='upper right', ncol=1)
plt.tight_layout()
plt.show()
Level sets of $f(x,y) = x + y$ (left) and $g(x,y) = 2(x + y)$ (right). Both functionals share the same kernel $\{x + y = 0\}$, but $g$ has twice the spacing density: the $c = 1$ level set of $g$ sits where the $c = \tfrac{1}{2}$ level set of $f$ would be.
What the kernel and scale each determine
A nonzero functional $f$ is determined by two independent pieces of geometric data:
1. The kernel $\ker f$ determines the orientation of the slices. Two functionals with the same kernel produce exactly the same family of parallel hyperplanes. The kernel tells you which directions are horizontal (directions lying in $\ker f$) and which direction is transverse.
2. The scale determines the spacing between slices. If $g = 2f$, then $\ker g = \ker f$: same orientation, same family of parallel hyperplanes. But $g^{-1}(1) = f^{-1}(\tfrac{1}{2})$, which sits halfway between $f^{-1}(0)$ and $f^{-1}(1)$. Rescaling by $\lambda \neq 0$ doesn’t rotate the slices; it compresses or stretches the spacing between them.
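The kernel/scale split can be checked on the two functionals from the figure. This is a minimal sketch, assuming $f(x,y) = x + y$ and $g = 2f$ as in the plot; the sampled points are arbitrary.

```python
import numpy as np

def f(p):
    return p[0] + p[1]

def g(p):
    return 2.0 * f(p)   # same kernel as f, doubled scale

rng = np.random.default_rng(0)
s = rng.uniform(-3, 3, size=5)

# Points on the kernel of f: (s, -s). Both functionals vanish there.
kernel_pts = np.stack([s, -s], axis=1)
f_on_kernel = np.array([f(p) for p in kernel_pts])
g_on_kernel = np.array([g(p) for p in kernel_pts])

# Points on the level set f = 1/2: (s, 1/2 - s). There g equals 1.
half_pts = np.stack([s, 0.5 - s], axis=1)
g_on_half = np.array([g(p) for p in half_pts])

print(f_on_kernel, g_on_kernel, g_on_half)
```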
Multiple functionals: adding vs. intersecting
A common source of confusion is the distinction between adding two functionals (an operation in the algebraic dual $X^{\#}$) and intersecting their kernels (an operation on subspaces of $X$). These produce very different geometric objects.
Example 2 (Adding vs. intersecting functionals in $\mathbb{R}^3$)
Work in $\mathbb{R}^3$ with the three coordinate functionals:
$$f_1(x,y,z) = x, \qquad f_2(x,y,z) = y, \qquad f_3(x,y,z) = z.$$
One functional. The kernel $\ker f_1 = \{x = 0\}$ is the $yz$-plane, a codimension-1 subspace. The decomposition gives $\mathbb{R}^3 = \ker f_1 \oplus \operatorname{span}(e_1)$: every vector is uniquely a point in the $yz$-plane plus a multiple of $e_1$.
Intersecting two kernels. The intersection $\ker f_1 \cap \ker f_2 = \{x = 0,\ y = 0\}$ is the $z$-axis, a codimension-2 subspace. Each independent functional removes one degree of freedom. The decomposition extends: $\mathbb{R}^3 = (\ker f_1 \cap \ker f_2) \oplus \operatorname{span}(e_1, e_2)$. Imposing $n$ independent constraints carves out a codimension-$n$ subspace.
Adding two functionals. The sum $f_1 + f_2$ is the functional $(x,y,z) \mapsto x + y$. This is a single functional, so its kernel $\{x + y = 0\}$ is a plane: still codimension 1. It is a different plane from either $\ker f_1$ or $\ker f_2$, tilted at $45^\circ$ between them. Adding functionals combines two measurements into one new measurement; it does not impose two constraints simultaneously.
Dependent functionals. If $g = \lambda f_1$ with $\lambda \neq 0$, then $\ker f_1 \cap \ker g = \ker f_1$: still the $yz$-plane, still codimension 1. The second functional measures the same direction as the first (just with a different scale), so it adds no new geometric information.
Cancellation. If $g = -f_1$, then $f_1 + g = 0$, the zero functional. Its kernel is all of $\mathbb{R}^3$: the two measurements cancel perfectly, and the constraint vanishes.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
fig = plt.figure(figsize=(16, 5))
# --- Panel 1: ker f1 (yz-plane) ---
ax = fig.add_subplot(131, projection='3d')
yy, zz = np.meshgrid(np.linspace(-2, 2, 10), np.linspace(-2, 2, 10))
xx = np.zeros_like(yy)
ax.plot_surface(xx, yy, zz, alpha=0.25, color='C0')
ax.quiver(0, 0, 0, 2, 0, 0, color='C3', arrow_length_ratio=0.15, lw=2)
ax.text(2.2, 0, 0, r'$e_1$', fontsize=11, color='C3')
ax.set_xlabel('$x$'); ax.set_ylabel('$y$'); ax.set_zlabel('$z$')
ax.set_title(r'$\ker f_1 = \{x=0\}$' + '\ncodim 1', fontsize=11)
ax.set_xlim(-2, 2); ax.set_ylim(-2, 2); ax.set_zlim(-2, 2)
# --- Panel 2: ker f1 ∩ ker f2 (z-axis) ---
ax = fig.add_subplot(132, projection='3d')
# Show both planes faintly
ax.plot_surface(xx, yy, zz, alpha=0.12, color='C0')
xx2 = np.copy(yy); yy2 = np.zeros_like(zz)
ax.plot_surface(xx2, yy2, zz, alpha=0.12, color='C1')
# The z-axis
z_line = np.linspace(-2, 2, 50)
ax.plot(np.zeros_like(z_line), np.zeros_like(z_line), z_line, 'C4', lw=3)
ax.set_xlabel('$x$'); ax.set_ylabel('$y$'); ax.set_zlabel('$z$')
ax.set_title(r'$\ker f_1 \cap \ker f_2$' + '\ncodim 2 (the $z$-axis)',
fontsize=11)
ax.set_xlim(-2, 2); ax.set_ylim(-2, 2); ax.set_zlim(-2, 2)
# --- Panel 3: ker(f1 + f2) = {x + y = 0} ---
ax = fig.add_subplot(133, projection='3d')
# The plane x + y = 0, i.e., y = -x
xx3 = np.linspace(-2, 2, 10)
zz3 = np.linspace(-2, 2, 10)
xx3, zz3 = np.meshgrid(xx3, zz3)
yy3 = -xx3
ax.plot_surface(xx3, yy3, zz3, alpha=0.25, color='C2')
ax.set_xlabel('$x$'); ax.set_ylabel('$y$'); ax.set_zlabel('$z$')
ax.set_title(r'$\ker(f_1 + f_2) = \{x+y=0\}$' + '\ncodim 1 (a different plane)',
fontsize=11)
ax.set_xlim(-2, 2); ax.set_ylim(-2, 2); ax.set_zlim(-2, 2)
plt.tight_layout()
plt.show()
Left: the kernel of a single functional $f_1$ is the $yz$-plane (codimension 1), with the transverse direction $e_1$ shown in red. Center: intersecting the kernels of two independent functionals $f_1$ and $f_2$ gives the $z$-axis (codimension 2), shown as the thick purple line where the two faint planes meet. Right: the sum $f_1 + f_2$ is a single functional with kernel $\{x + y = 0\}$, a different plane (codimension 1, green). Adding functionals produces one new measurement; intersecting kernels imposes multiple constraints.
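The codimension counts in Example 2 reduce to matrix ranks: writing each functional on $\mathbb{R}^3$ as a row vector, the codimension of a common kernel is the rank of the stacked rows. The sketch below is one way to check this; the helper name `codim_of_common_kernel` and the multiple `3.0` in the dependent case are illustrative assumptions.

```python
import numpy as np

f1 = np.array([1.0, 0.0, 0.0])   # f1(x, y, z) = x
f2 = np.array([0.0, 1.0, 0.0])   # f2(x, y, z) = y

def codim_of_common_kernel(*functionals):
    """Codimension of the intersection of the kernels = rank of the stacked rows."""
    return int(np.linalg.matrix_rank(np.vstack(functionals)))

codim_single = codim_of_common_kernel(f1)              # one functional
codim_intersection = codim_of_common_kernel(f1, f2)    # two independent constraints
codim_sum = codim_of_common_kernel(f1 + f2)            # one new functional
codim_dependent = codim_of_common_kernel(f1, 3.0 * f1) # second one adds nothing
codim_cancel = codim_of_common_kernel(f1 - f1)         # zero functional

print(codim_single, codim_intersection, codim_sum, codim_dependent, codim_cancel)
```

The sum and the intersection give different ranks because stacking rows records two constraints, while adding rows collapses them into one.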
Remark 1 (From intersecting kernels to weak neighborhoods)
The intersection $\ker f_1 \cap \cdots \cap \ker f_n$ for $n$ linearly independent functionals is a subspace of codimension $n$. If we thicken each kernel into a slab $\{x : |f_i(x)| < \varepsilon\}$ and intersect, we get a tube: bounded cross-section in $n$ directions, unbounded in the remaining (codimension-$n$) directions. These tubes are exactly the basic neighborhoods of the weak topology (see Definition 1). Each additional independent functional constrains one more direction but still leaves infinitely many unconstrained in infinite dimensions.
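Functionals are determined by their kernels
The decomposition immediately yields a powerful uniqueness result.
If $f, g$ are nonzero linear functionals with $\ker f = \ker g$, then $g = \lambda f$ for some scalar $\lambda \neq 0$.
Proof 2
Pick any $x_0 \notin \ker f$. By the decomposition, every $x \in X$ can be written as $x = m + \alpha x_0$ with $m \in \ker f$ and $\alpha = f(x)/f(x_0)$. Applying $g$:
$$g(x) = g(m) + \alpha\, g(x_0) = \frac{g(x_0)}{f(x_0)}\, f(x),$$
since $g(m) = 0$ (because $m \in \ker f = \ker g$). Setting $\lambda = g(x_0)/f(x_0)$ gives $g = \lambda f$, and $\lambda \neq 0$ because $x_0 \notin \ker g$.
Consequence for the dual space: Classifying linear functionals up to scaling is the same as classifying codimension-1 subspaces. The projective algebraic dual $\mathbb{P}(X^{\#})$ (where $f \sim \lambda f$ for $\lambda \neq 0$) is in bijection with the set of hyperplanes through the origin.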
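The argument of Proof 2 can be traced in coordinates. In this sketch, which assumes a hypothetical functional on $\mathbb{R}^3$, a second functional is built only from a basis of the kernel (via a cross product), and the scalar $\lambda = g(x_0)/f(x_0)$ is then recovered. All vectors are illustrative choices.

```python
import numpy as np

w_f = np.array([2.0, -1.0, 0.5])     # hypothetical f(x) = w_f . x
k1 = np.array([1.0, 2.0, 0.0])       # two independent vectors in ker f
k2 = np.array([0.0, 1.0, 2.0])
assert abs(w_f @ k1) < 1e-12 and abs(w_f @ k2) < 1e-12

# A second functional defined only through the kernel: g vanishes on k1 and k2.
w_g = np.cross(k1, k2)

x0 = np.array([1.0, 0.0, 0.0])       # f(x0) = 2, so x0 lies outside ker f
lam = (w_g @ x0) / (w_f @ x0)        # lambda = g(x0) / f(x0)

# Compare g and lambda * f on random vectors.
rng = np.random.default_rng(1)
xs = rng.normal(size=(100, 3))
max_gap = float(np.max(np.abs(xs @ w_g - lam * (xs @ w_f))))
print("lambda =", lam, " max |g - lambda f| =", max_gap)
```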
Characterizing continuity via hyperplanes
Here is where the algebraic vs. topological dual distinction becomes geometric. The key fact is surprisingly sharp: a hyperplane is either closed (and nowhere dense) or dense in $X$. There is nothing in between.
A proper closed subspace of a normed space has empty interior (hence is nowhere dense).
Proof 3
Let $M \subsetneq X$ be a proper closed subspace. Suppose for contradiction that $M$ contains an open ball $B(x_0, r)$ for some $x_0 \in M$, $r > 0$. Since $M$ is a subspace and $x_0 \in M$, for any $y \in B(0, r)$ we have $x_0 + y \in B(x_0, r) \subseteq M$, and $x_0 \in M$, so $y = (x_0 + y) - x_0 \in M$. Thus $B(0, r) \subseteq M$. But then for any $x \in X$ with $x \neq 0$, the vector $\frac{r}{2\|x\|}\, x$ lies in $B(0, r) \subseteq M$, so $x \in M$ (since $M$ is a subspace). This gives $M = X$, contradicting properness.
Interior here means in the full topology of $X$, not the relative topology on $M$. The $x$-axis in $\mathbb{R}^2$ looks solid from the inside, but any open ball in $\mathbb{R}^2$ escapes into the $y$-direction: the subspace is “infinitely thin” when viewed from the ambient space.
So: a continuous linear functional has a closed kernel, and this kernel is a “thin” hyperplane, a codimension-1 subspace with empty interior in $X$.
A codimension-1 subspace of a normed space is either closed or dense.
Proof 4
Let $H$ be a codimension-1 subspace (the kernel of some nonzero $f$). Its closure $\overline{H}$ is a closed subspace containing $H$. Since $H$ has codimension 1, there are only two possibilities:
$\overline{H} = H$, in which case $H$ is closed.
$\overline{H} \supsetneq H$, but then $\overline{H}$ is a closed subspace strictly containing a codimension-1 subspace, so it must be all of $X$. Hence $H$ is dense in $X$.
There is no intermediate option.
Proposition 3 (Continuity via the kernel)
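A nonzero linear functional $f$ is continuous if and only if $\ker f$ is closed.
Proof 5
($\Rightarrow$) If $f$ is continuous, then $\ker f = f^{-1}(\{0\})$ is the preimage of a closed set, hence closed.
($\Leftarrow$) Suppose $\ker f$ is closed and pick $x_0$ with $f(x_0) = 1$. Since $x_0 \notin \ker f$ and $\ker f$ is closed, the distance $d = \operatorname{dist}(x_0, \ker f) = \inf_{m \in \ker f} \|x_0 - m\|$ is strictly positive. For any $x$ with $f(x) \neq 0$, the vector $m = x_0 - x/f(x)$ lies in $\ker f$, so $\|x\|/|f(x)| = \|x_0 - m\| \geq d$, i.e. $|f(x)| \leq \|x\|/d$. This bound also holds trivially when $f(x) = 0$, so $f$ is bounded with $\|f\| \leq 1/d$.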
Remark 2 (Connection to the quotient norm)
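The quantity $\operatorname{dist}(x_0, \ker f)$ appearing in the proof above is the quotient norm $\|[x_0]\|_{X/\ker f}$ (see Definition 2). The closedness of $\ker f$ is essential: it guarantees that the quotient norm is a genuine norm, so that $\|[x_0]\| > 0$ for $x_0 \notin \ker f$.
In fact, the entire proof is the quotient perspective in disguise. The functional $f$ factors through the quotient as
$$f = \tilde{f} \circ \pi,$$
where $\pi : X \to X/\ker f$ is the canonical projection and $\tilde{f}$ is the isomorphism from Theorem 1. Since $X/\ker f$ is one-dimensional, $\tilde{f}$ is automatically bounded (all norms on a one-dimensional space are equivalent), and $\pi$ is bounded with $\|\pi\| \leq 1$. The composite $f = \tilde{f} \circ \pi$ is therefore bounded.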
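In the Euclidean setting the distance from $x_0$ to $\ker f$ can be computed by orthogonal projection, which gives a concrete check of the standard identity $|f(x_0)| = \|f\| \cdot \operatorname{dist}(x_0, \ker f)$. The sketch below assumes a hypothetical functional $f(x) = w \cdot x$ on $\mathbb{R}^3$ with the Euclidean norm; `w` and `x0` are illustrative choices.

```python
import numpy as np

w = np.array([2.0, -1.0, 2.0])   # hypothetical f(x) = w . x
x0 = np.array([1.0, 1.0, 1.0])

f_x0 = float(w @ x0)
norm_f = float(np.linalg.norm(w))   # operator norm of f for the Euclidean norm

# Nearest point of ker f to x0: remove the component of x0 along w.
nearest = x0 - (f_x0 / (w @ w)) * w
dist = float(np.linalg.norm(x0 - nearest))   # quotient norm of [x0]

print("f(nearest) =", float(w @ nearest))    # lies in the kernel
print("dist       =", dist)
print("|f(x0)| / dist =", abs(f_x0) / dist, " ||f|| =", norm_f)
```

With a non-Euclidean norm the nearest point is generally not given by orthogonal projection, so this particular computation is specific to the assumed setting.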
Combining everything:
$f \in X^*$ (continuous) $\iff$ $\ker f$ is closed $\iff$ $\ker f$ is nowhere dense: a thin, clean hyperplane with empty interior. The space is sliced into well-separated parallel copies of $\ker f$, and the height function $f$ varies continuously.
$f \notin X^*$ (discontinuous) $\iff$ $\ker f$ is not closed $\iff$ $\ker f$ is dense: the hyperplane comes arbitrarily close to every point in $X$.
There is nothing in between: every hyperplane is either cleanly closed or pathologically dense.
Remark 3 (Dense hyperplanes are pathological)
If $\ker f$ is dense, then every level set $f^{-1}(c)$ is also dense. Points at height 0, height 1, and any other height are all interleaved at arbitrarily fine scales throughout the space. With a continuous functional, you’d see clean parallel bands of color, smoothly transitioning. With a discontinuous functional, every color appears in every tiny ball.
Concretely: if $f$ is discontinuous, then for any $x \in X$ and any $\varepsilon > 0$, there exists $y$ with $\|x - y\| < \varepsilon$ but $|f(x) - f(y)| \geq 1$. The slicing exists algebraically, but the height function is wildly discontinuous. This is why continuous functionals, the ones with closed, nowhere-dense kernels, are the only ones useful for analysis.
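A standard discontinuous functional makes this concrete. The sketch below assumes the space of finitely supported sequences with the sup norm and the functional $f(x) = \sum_k k\, x_k$; this example and the helper `nearby_point_with_jump` are illustrative and not taken from the text. Given $x$ and $\varepsilon$, it produces a point within $\varepsilon$ of $x$ whose height differs by at least 1.

```python
import numpy as np

def f(x):
    """f(x) = sum_k k * x_k on finitely supported sequences (k starts at 1)."""
    k = np.arange(1, len(x) + 1)
    return float(k @ x)

def sup_norm(x):
    return float(np.max(np.abs(x))) if len(x) else 0.0

def nearby_point_with_jump(x, eps):
    """Return y with ||x - y||_inf = eps / 2 and f(y) - f(x) > 1."""
    # Choose a coordinate index n with n * eps / 2 > 1.
    n = max(len(x), int(np.ceil(2.0 / eps)) + 1)
    y = np.zeros(n)
    y[:len(x)] = x
    y[n - 1] += eps / 2.0   # tiny perturbation far out in the sequence
    return y

x = np.array([1.0, -2.0, 0.5])
for eps in [1e-1, 1e-3]:
    y = nearby_point_with_jump(x, eps)
    x_padded = np.pad(x, (0, len(y) - len(x)))
    print(eps, sup_norm(y - x_padded), f(y) - f(x))
```

The perturbation shrinks with $\varepsilon$ while the jump in height does not, which is exactly the failure of continuity at $x$.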
A Worked Example: Hahn–Banach Extension in $\mathbb{R}^2$
The Hahn–Banach theorem, stated in the hyperplane language, says: if you know the heights of points along a subspace, you can extend the height function to the entire space by choosing a closed hyperplane.
We make this explicit with a worked example. We use $\mathbb{R}^2$ with the $\ell^\infty$ norm $\|(x,y)\|_\infty = \max(|x|, |y|)$, so the unit ball is a square.
With the Euclidean norm (round unit ball), the Hahn–Banach extension turns out to be unique: the roundness of the ball forces a single choice. This is a general fact about Hilbert spaces: the Riesz representation theorem pins down the unique extension.
With the $\ell^\infty$ norm (square unit ball), the extension is non-unique: there is a whole interval of valid choices, each giving a different kernel. This is the generic situation in Banach spaces.
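Step 1: Define $f$ on a subspace.
Let $M = \operatorname{span}\{(1,1)\}$. Define $f$ on $M$ by $f(t, t) = t$. Check the norm: $\|(t,t)\|_\infty = |t|$ and $|f(t,t)| = |t|$. So $\|f\| = 1$ on $M$.
We know the heights along the diagonal line $y = x$: the origin has height 0, the point $(\tfrac{1}{2}, \tfrac{1}{2})$ has height $\tfrac{1}{2}$, the point $(1,1)$ has height 1.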
Step 2: Decompose.
Pick $(1,-1)$ as the new direction. Every $(x,y) \in \mathbb{R}^2$ decomposes as:
$$(x, y) = \frac{x+y}{2}\,(1,1) + \frac{x-y}{2}\,(1,-1).$$
Step 3: The extension is determined by one number.
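Set $c = F(1,-1)$, the height we assign to the new direction. By linearity:
$$F(x,y) = \frac{x+y}{2} + c\,\frac{x-y}{2} = \frac{1+c}{2}\,x + \frac{1-c}{2}\,y.$$
Note $F(x,y) = ax + by$ where $a = \frac{1+c}{2}$ and $b = \frac{1-c}{2}$, ensuring $F(1,1) = a + b = 1$.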
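Step 4: Find which values of $c$ are valid.
We need $\|F\| = 1$ in the $\ell^\infty$ norm. For a linear functional $F(x,y) = ax + by$ on $(\mathbb{R}^2, \|\cdot\|_\infty)$, the operator norm is $\|F\| = |a| + |b|$. The constraint $|a| + |b| \leq 1$ combined with $a + b = 1$ forces $a, b \geq 0$, giving:
$$-1 \leq c \leq 1.$$
Every $c$ in this interval gives a valid norm-preserving extension. The constraint $|a| + |b| \leq 1$ is the $\ell^1$ unit ball in the $(a,b)$-plane, and $a + b = 1$ is a line that touches it along one edge. The valid extensions correspond to the segment where the line meets the ball.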
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
fig, ax = plt.subplots(1, 1, figsize=(6, 5))
# The l^1 unit ball: |a| + |b| <= 1
diamond = np.array([[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0]])
dm = Polygon(diamond[:-1], fill=True, facecolor='C0', alpha=0.15,
edgecolor='C0', lw=2, label=r'$|a| + |b| \leq 1$')
ax.add_patch(dm)
# The line a + b = 1
t = np.linspace(-0.5, 1.5, 200)
ax.plot(t, 1 - t, 'C3', lw=2, label=r'$a + b = 1$')
# The feasible segment: a + b = 1 and |a| + |b| <= 1 => a, b >= 0
# So a in [0, 1], b = 1 - a in [0, 1]
a_seg = np.linspace(0, 1, 100)
b_seg = 1 - a_seg
ax.plot(a_seg, b_seg, 'C1', lw=4, alpha=0.8, solid_capstyle='round',
label=r'Valid extensions')
# Mark the three special cases
cases = [(1, 0, r'$c=1$: $F=x$'), (0.5, 0.5, r'$c=0$: $F=\frac{x+y}{2}$'),
(0, 1, r'$c=-1$: $F=y$')]
for (a_pt, b_pt, label) in cases:
    ax.plot(a_pt, b_pt, 'ko', ms=7, zorder=10)
    ax.text(a_pt + 0.05, b_pt + 0.06, label, fontsize=9)
ax.set_xlim(-1.3, 1.5)
ax.set_ylim(-1.3, 1.5)
ax.set_aspect('equal')
ax.axhline(0, color='k', lw=0.5, alpha=0.3)
ax.axvline(0, color='k', lw=0.5, alpha=0.3)
ax.set_xlabel(r'$a$')
ax.set_ylabel(r'$b$')
ax.set_title(r'Feasible $(a,b)$: line $a+b=1$ meets $\ell^1$ ball', fontsize=12)
ax.legend(fontsize=9, loc='lower left')
plt.tight_layout()
plt.show()
The $\ell^1$ unit ball (diamond) is the dual ball of $\ell^\infty$, so the operator norm constraint $\|F\| \leq 1$ requires $(a,b)$ to lie inside it. The extension constraint $a + b = 1$ (ensuring $F = f$ on $M$) is a line. Valid extensions are the segment where the line meets the diamond, parametrized by $c \in [-1, 1]$.
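The range of valid $c$ can also be checked by brute force. Since $|F|$ is convex, its maximum over the unit square is attained at a corner, so the operator norm is a maximum over four points. The sketch below uses that observation; the helper names `extension_coeffs` and `op_norm_linf` are illustrative.

```python
from itertools import product

def extension_coeffs(c):
    """Coefficients (a, b) of F(x, y) = a x + b y with F(1, -1) = c and F(1, 1) = 1."""
    return (1 + c) / 2, (1 - c) / 2

def op_norm_linf(a, b):
    """Norm of F on (R^2, max-norm): maximum of |F| over the corners of the unit square."""
    return max(abs(a * x + b * y) for x, y in product([-1, 1], repeat=2))

for c in [-1.5, -1.0, 0.0, 0.5, 1.0, 1.5]:
    a, b = extension_coeffs(c)
    norm = op_norm_linf(a, b)
    # The corner maximum agrees with |a| + |b|; the extension is valid iff it is at most 1.
    print(f"c = {c:5.2f}  ||F|| = {norm:.2f}  |a|+|b| = {abs(a) + abs(b):.2f}  valid = {norm <= 1 + 1e-12}")
```

Values of $c$ outside $[-1, 1]$ still extend $f$ linearly, but the norm grows to $|c|$, so they are not norm-preserving.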
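Step 5: Each $c$ gives a different kernel.
$c = 0$: $F(x,y) = \frac{x+y}{2}$. Kernel: $x + y = 0$ (the anti-diagonal).
$c = 1$: $F(x,y) = x$. Kernel: $x = 0$ (the $y$-axis).
$c = -1$: $F(x,y) = y$. Kernel: $y = 0$ (the $x$-axis).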
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
fig, axes = plt.subplots(1, 3, figsize=(16, 6))
# The l-infinity unit ball (square)
square = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1], [-1, -1]])
# The subspace M = span((1,1))
t_M = np.linspace(-2, 2, 100)
cases = [
(0, r'$c = 0$: $F = \frac{x+y}{2}$', r'Kernel: $x+y=0$'),
(1, r'$c = 1$: $F = x$', r'Kernel: $x=0$'),
(-1, r'$c = -1$: $F = y$', r'Kernel: $y=0$'),
]
for ax, (c_val, title, ker_label) in zip(axes, cases):
    a = (1 + c_val) / 2
    b = (1 - c_val) / 2
    # Draw the unit square
    sq_patch = Polygon(square[:-1], fill=True, facecolor='C0', alpha=0.15, edgecolor='C0', lw=2.5)
    ax.add_patch(sq_patch)
    # Draw level sets of F(x,y) = ax + by = const
    for level in np.arange(-2.5, 3.0, 0.5):
        t = np.linspace(-2.5, 2.5, 200)
        if abs(b) > 1e-10:
            y_line = (level - a * t) / b
            mask = (y_line > -2.5) & (y_line < 2.5)
            lw = 2.5 if abs(level - round(level)) < 0.01 and level == int(level) else 0.8
            alpha = 0.8 if lw > 1 else 0.3
            color = 'C3' if abs(level) < 0.01 else ('C1' if abs(level - 1) < 0.01 else 'gray')
            ax.plot(t[mask], y_line[mask], color=color, alpha=alpha, lw=lw)
        else:
            # b = 0: the level sets are vertical lines x = level / a
            if abs(a) > 1e-10:
                x_val = level / a
                if -2.5 < x_val < 2.5:
                    lw = 2.5 if abs(level - round(level)) < 0.01 and level == int(level) else 0.8
                    alpha = 0.8 if lw > 1 else 0.3
                    color = 'C3' if abs(level) < 0.01 else ('C1' if abs(level - 1) < 0.01 else 'gray')
                    ax.axvline(x_val, color=color, alpha=alpha, lw=lw)
    # Draw the subspace M
    ax.plot(t_M, t_M, 'k--', alpha=0.4, lw=1.5, label=r'$M = \mathrm{span}(1,1)$')
    # Mark special points
    ax.plot(1, 1, 'ko', ms=7, zorder=5)
    ax.text(1.1, 1.1, r'$(1,1)$', fontsize=12)
    ax.plot(0.5, 0.5, 'ko', ms=5, zorder=5)
    ax.plot(0, 0, 'ko', ms=5, zorder=5)
    ax.set_xlim(-2.6, 2.6)
    ax.set_ylim(-2.6, 2.6)
    ax.set_aspect('equal')
    ax.set_title(f'{title}\n{ker_label}', fontsize=13)
    ax.set_xlabel(r'$x$', fontsize=12)
    ax.set_ylabel(r'$y$', fontsize=12)
    ax.tick_params(labelsize=11)
    # Color legend
    ax.plot([], [], 'C3', lw=2.5, label=r'$F = 0$ (kernel)')
    ax.plot([], [], 'C1', lw=2.5, label=r'$F = 1$')
    ax.plot([], [], 'gray', lw=1, label='Other levels')
    ax.legend(fontsize=10, loc='upper right')
plt.tight_layout()
plt.show()
Three norm-preserving extensions of $f$ from $M$ to all of $\mathbb{R}^2$, corresponding to $c = 0, 1, -1$. In each panel, the unit square (shaded) fits between the $F = -1$ and $F = 1$ level sets, confirming $\|F\| = 1$. The kernel (red) changes from panel to panel while the subspace $M$ (dashed) stays fixed, and $F(1,1) = 1$ in all three.
Same functional on $M$. Three different extensions. Three different kernels. In each case the unit square fits between the level sets $F = -1$ and $F = 1$: the norm constraint is satisfied because the flat sides of the square allow the kernel to tilt freely. With a round ball, only one tilt would work. Hahn–Banach says: in any Banach space, no matter how large, at least one valid $F$ always exists.
Visualizing the full family of extensions
Every $c \in [-1, 1]$ gives a different valid extension. Each extension defines a strip between $F^{-1}(-1)$ and $F^{-1}(1)$, and the norm constraint $\|F\| \leq 1$ means the unit square must fit inside that strip. The unit ball is the intersection of all such strips.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
square = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1], [-1, -1]])
sq_patch = Polygon(square[:-1], fill=True, facecolor='C0', alpha=0.15, edgecolor='C0', lw=2.5,
zorder=5)
ax.add_patch(sq_patch)
# Draw F^{-1}(1) and F^{-1}(-1) for various c values
# F(x,y) = ax + by with a = (1+c)/2, b = (1-c)/2
# F = ±1 => ax + by = ±1
c_values = np.linspace(-1, 1, 9)
cmap = plt.cm.coolwarm
t = np.linspace(-3.5, 3.5, 300)
for i, c_val in enumerate(c_values):
    a = (1 + c_val) / 2
    b = (1 - c_val) / 2
    color = cmap(i / (len(c_values) - 1))
    for level in [1, -1]:
        if abs(b) > 1e-10:
            y_line = (level - a * t) / b
            mask = (y_line > -3.5) & (y_line < 3.5)
            ax.plot(t[mask], y_line[mask], color=color, alpha=0.6, lw=1.8)
        else:
            # b = 0, a = 1: x = ±1
            ax.axvline(level, color=color, alpha=0.6, lw=1.8)
    # Lightly shade the strip for a few representative values
    if abs(c_val) < 0.01 or abs(abs(c_val) - 1) < 0.01:
        if abs(b) > 1e-10:
            y_upper = (1 - a * t) / b
            y_lower = (-1 - a * t) / b
            ax.fill_between(t, y_lower, y_upper, alpha=0.04, color=color)
# Mark (1,1) — all F^{-1}(1) lines pass through it
ax.plot(1, 1, 'ko', ms=8, zorder=10)
ax.text(1.12, 1.12, r'$(1,1)$', fontsize=13)
# Mark (-1,-1) — all F^{-1}(-1) lines pass through it
ax.plot(-1, -1, 'ko', ms=8, zorder=10)
ax.text(-1.35, -1.2, r'$(-1,-1)$', fontsize=13)
# Draw the subspace M
ax.plot(t, t, 'k--', alpha=0.3, lw=1.5, label=r'$M$', zorder=1)
ax.set_xlim(-2.8, 2.8)
ax.set_ylim(-2.8, 2.8)
ax.set_aspect('equal')
ax.set_title(r'Strips $F^{-1}(-1)$ to $F^{-1}(1)$ for all valid extensions', fontsize=14)
ax.set_xlabel(r'$x$', fontsize=12)
ax.set_ylabel(r'$y$', fontsize=12)
ax.tick_params(labelsize=11)
# Colorbar
sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(-1, 1))
sm.set_array([])
cbar = plt.colorbar(sm, ax=ax, shrink=0.7)
cbar.ax.tick_params(labelsize=11)
cbar.set_label(r'$c = F(1,-1)$', fontsize=12)
plt.tight_layout()
plt.show()
Each valid extension $F$ (parametrized by $c \in [-1, 1]$) defines a strip between $F^{-1}(-1)$ and $F^{-1}(1)$, shown as a pair of lines in matching color. The unit square (shaded) fits inside every strip. All $F^{-1}(1)$ lines pass through $(1,1)$ and all $F^{-1}(-1)$ lines pass through $(-1,-1)$. The unit ball is exactly the intersection of all these strips, which is the geometric content of the sup formula $\|x\| = \sup_{\|F\| \leq 1} |F(x)|$.
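The strip picture can be tested numerically: for each point $p$, the largest value of $|F_c(p)|$ over the family $c \in [-1, 1]$ should equal $\|p\|_\infty$. The sketch below samples $c$ on a grid that includes both endpoints; the function name `sup_over_extensions` and the random test points are illustrative assumptions.

```python
import numpy as np

def sup_over_extensions(p, num=201):
    """Max over sampled c in [-1, 1] of |F_c(p)|, with F_c(x, y) = (1+c)/2 x + (1-c)/2 y."""
    c = np.linspace(-1.0, 1.0, num)
    values = (1 + c) / 2 * p[0] + (1 - c) / 2 * p[1]
    return float(np.max(np.abs(values)))

rng = np.random.default_rng(0)
points = rng.uniform(-2, 2, size=(1000, 2))

# Compare with the max-norm of each point.
gaps = [abs(sup_over_extensions(p) - max(abs(p[0]), abs(p[1]))) for p in points]
print("largest discrepancy:", max(gaps))
```

Because $F_c(p)$ is affine in $c$, the maximum is attained at $c = \pm 1$, that is, at the coordinate functionals $F = x$ and $F = y$. In this example the one-parameter family already recovers the norm, which need not hold for the extensions of a single functional in general.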