In finite dimensions, every linear operator maps bounded sets to bounded sets, and since
closed bounded sets are compact (Heine-Borel), bounded sequences always have convergent
subsequences. This is the engine behind most of finite-dimensional linear algebra: eigenvalue
decompositions, the SVD, and the Fredholm alternative all rely on extracting convergent subsequences.
In infinite dimensions, closed bounded sets are no longer compact
(Example 2), and this machinery breaks down for general bounded
operators. Compact operators are precisely the class of operators for which it does not break
down—they are the infinite-dimensional operators that still behave like matrices.
Why is extracting convergent subsequences the engine behind finite-dimensional linear algebra?
The eigenvalue decomposition and the SVD both reduce to finding a vector that achieves an
extremum. The method is always:
1. Take a sequence approaching the supremum/infimum.
2. Extract a convergent subsequence (compactness of the unit ball).
3. Pass to the limit: the limit achieves the optimum (continuity).
Eigenvalues of a symmetric matrix are found by maximizing the Rayleigh quotient
R(x)=⟨Ax,x⟩/∥x∥² over the unit sphere. In ℝⁿ the unit sphere
is compact, so a maximizing sequence has a convergent subsequence whose limit is an eigenvector.
Restrict to its orthogonal complement and repeat—the full eigenvalue decomposition follows by
induction. Each eigenvalue λi yields a projection Pi=⟨⋅,ψi⟩ψi onto the corresponding eigenspace, and the spectral decomposition A=∑λiPi
says the operator is a sum of scaled orthogonal projections.
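The maximize-deflate-repeat procedure can be run numerically. Below is a minimal NumPy sketch (not from the notes; the test matrix with spectrum {5, 4, 3, 2, 1} is chosen for illustration): power iteration maximizes the Rayleigh quotient, and subtracting λᵢPᵢ deflates each found eigenpair.

```python
import numpy as np

# Illustrative test matrix: symmetric, with known spectrum {5, 4, 3, 2, 1}.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q @ np.diag([5.0, 4.0, 3.0, 2.0, 1.0]) @ Q.T

def top_eigenpair(M, iters=2000):
    """Power iteration: maximizes the Rayleigh quotient <Mx,x>/||x||^2."""
    x = rng.standard_normal(M.shape[0])
    for _ in range(iters):
        x = M @ x
        x /= np.linalg.norm(x)
    return x @ M @ x, x

# Maximize, deflate to the orthogonal complement, repeat.
eigvals, eigvecs = [], []
M = A.copy()
for _ in range(A.shape[0]):
    lam, psi = top_eigenpair(M)
    eigvals.append(lam)
    eigvecs.append(psi)
    M = M - lam * np.outer(psi, psi)  # subtract lam_i * P_i, P_i = <., psi_i> psi_i

# Spectral decomposition A = sum_i lam_i P_i reconstructs A.
A_rebuilt = sum(l * np.outer(v, v) for l, v in zip(eigvals, eigvecs))
print(np.round(eigvals, 6))
print(np.allclose(A_rebuilt, A))
```

Deflation works because subtracting λᵢPᵢ zeroes out the eigenvalue just found, so the next power iteration runs effectively on the orthogonal complement.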
The SVD of a general (non-symmetric) matrix follows the same pattern by reducing to the
symmetric case: form K∗K (which is self-adjoint and positive), apply the Rayleigh quotient
argument to find its eigenvalues σi² and eigenvectors vi, then set
ui=Kvi/σi. The result is K=∑σi⟨⋅,vi⟩ui.
Each step requires extracting a convergent subsequence from a maximizing sequence on the unit sphere.
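The reduction to the symmetric case can be checked numerically. A small NumPy sketch (illustrative only; the random 4×3 test matrix is an assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.standard_normal((4, 3))  # a generic (non-symmetric) test matrix

# Reduce to the symmetric case: K^T K is self-adjoint and positive semi-definite.
w, V = np.linalg.eigh(K.T @ K)          # eigenvalues ascending, columns = v_i
w, V = w[::-1], V[:, ::-1]              # sort descending
sigma = np.sqrt(np.clip(w, 0.0, None))  # singular values sigma_i
U = (K @ V) / sigma                     # columns u_i = K v_i / sigma_i

# K = sum_i sigma_i <., v_i> u_i
K_rebuilt = sum(s * np.outer(u, v) for s, u, v in zip(sigma, U.T, V.T))
print(np.allclose(K_rebuilt, K))
```

The columns of U come out orthonormal automatically: ⟨uᵢ,uⱼ⟩ = ⟨Kvᵢ,Kvⱼ⟩/(σᵢσⱼ) = ⟨K∗Kvᵢ,vⱼ⟩/(σᵢσⱼ) = δᵢⱼ.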
A compact operator maps bounded sequences to sequences with convergent subsequences—by
definition. This is step 2, transplanted from a property of the space to a property of the
operator. For the Hilbert-Schmidt spectral theorem, you maximize the Rayleigh quotient on the unit
sphere. You cannot extract a convergent subsequence from the maximizing sequence (xn) directly,
but since A is compact, (Axn) has a convergent subsequence, which suffices to show (xn)
converges to an eigenvector. Compactness of the operator substitutes for compactness of the ball.
The sequential characterization is the one we use most often in practice: bounded sequence in,
convergent subsequence out. Compare this with a general bounded operator, which only guarantees
bounded sequence in, bounded sequence out.
Proof:
The unit ball B1(0)⊂X is bounded, so by compactness of the operator, K(B1(0))
has compact closure and is hence bounded in Y. Thus sup_{∥x∥≤1} ∥Kx∥ < ∞, i.e., K is bounded.
The converse is false in infinite dimensions: the identity operator I:ℓ2→ℓ2 is
bounded but not compact, since the orthonormal sequence (en) is bounded but has no convergent
subsequence.
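The failure can be seen concretely in a finite truncation of ℓ² (an illustrative model, not from the notes): the orthonormal sequence stays at mutual distance √2, so no subsequence is Cauchy, while a compact diagonal operator sends the same bounded sequence to one that converges.

```python
import numpy as np

N = 200                 # truncate l^2 to R^N for illustration
e = np.eye(N)           # rows e[n] model the orthonormal sequence (e_n)

# Under the identity, every pair of the (e_n) is at distance sqrt(2),
# so no subsequence is Cauchy, hence none converges.
dist = np.linalg.norm(e[0] - e[99])
print(dist)

# The compact diagonal operator K e_n = (1/n) e_n sends the same bounded
# sequence to one converging to 0, so convergent subsequences certainly exist.
K = np.diag(1.0 / np.arange(1, N + 1))
image_norms = np.array([np.linalg.norm(K @ e[n]) for n in range(N)])
print(image_norms[:3], image_norms[-1])
```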
In algebraic language, K(X):=K(X,X) is a closed two-sided ideal in the algebra L(X) of
bounded operators. Composing a compact operator with any bounded operator (on either side)
produces another compact operator.
The closedness statement is the most important: it says that the operator-norm limit of compact
operators is again compact. This is the key tool for proving compactness of specific operators.
Proof:
Let (x_j) be a bounded sequence in X. We use a diagonal argument.
Since K_1 is compact, (K_1 x_j) has a convergent subsequence; write (x_j^(1)) for the corresponding subsequence of (x_j).
Since K_2 is compact, (K_2 x_j^(1)) has a convergent subsequence along a further subsequence (x_j^(2)).
Continuing, at stage l the sequence (K_l x_j^(l))_j converges. The diagonal sequence
y_j := x_j^(j) satisfies: (K_l y_j)_j converges for every l, since (y_j)_{j≥l} is a subsequence of (x_j^(l))_j.
Now estimate:
$$\|Ky_i - Ky_j\| \le \underbrace{\|Ky_i - K_n y_i\|}_{\le\,\|K-K_n\|\cdot C} + \underbrace{\|K_n y_i - K_n y_j\|}_{\to\,0 \text{ as } i,j\to\infty} + \underbrace{\|K_n y_j - K y_j\|}_{\le\,\|K-K_n\|\cdot C}$$

where C = sup_j ∥y_j∥ < ∞. For any ε>0, first choose n so that
∥K−K_n∥ < ε/(3C), then choose i,j large enough so that
∥K_n y_i − K_n y_j∥ < ε/3. Then ∥Ky_i − Ky_j∥ < ε, so (Ky_j) is Cauchy in Y and converges since Y
is a Banach space.
Finite-Rank Operators: Compact Operators as “Infinite-Dimensional Matrices”
The connection between compact operators and matrices runs through finite-rank operators.
Proof:
Let (xn) be a bounded sequence in X. Then (Axn) is a bounded sequence in the
finite-dimensional space R(A). By Bolzano-Weierstrass, it has a convergent subsequence.
This is the precise sense in which compact operators generalize matrices. A matrix
A∈ℝ^{m×n} defines an operator of rank at most min(m,n)—always finite.
The key theorem (Theorem 1) now tells us:
Compact operators are precisely the operators that can be approximated by “matrices”
(finite-rank operators) in the operator norm.
In Hilbert spaces this is exact: every compact operator is the operator-norm limit of
finite-rank operators. In general Banach spaces, this is the approximation property (which
most natural spaces satisfy, though Enflo showed it can fail).
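For a diagonal operator the approximation is fully quantitative. A minimal sketch (the diagonal entries 1/n and the truncation to ℝᴺ are assumptions chosen for illustration): the rank-n truncation K_n satisfies ∥K − K_n∥ = 1/(n+1) → 0.

```python
import numpy as np

N = 100
K = np.diag(1.0 / np.arange(1, N + 1))  # truncated diagonal compact operator

# K_n keeps the first n diagonal entries: a finite-rank (rank-n) operator.
# The remainder is diagonal, so its operator norm is its largest entry, 1/(n+1).
for n in (5, 10, 50):
    K_n = K.copy()
    K_n[n:, n:] = 0.0
    print(n, np.linalg.norm(K - K_n, 2))  # spectral norm of the remainder
```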
The spectral theory of compact operators is the payoff: it tells us that compact operators have
a spectrum that looks just like the spectrum of a matrix, up to a possible accumulation point at
zero.
When the compact operator is additionally self-adjoint, we get a complete spectral decomposition—the
infinite-dimensional analogue of the eigenvalue decomposition of a symmetric matrix.
This is the infinite-dimensional eigendecomposition for self-adjoint operators: the operator A
is completely determined by its eigenvalues and eigenfunctions, just as a symmetric matrix is
determined by its eigenvalues and eigenvectors. The spectral representation
A=∑λi⟨⋅,ψi⟩ψi is the direct analogue of the matrix
diagonalization A=QΛQᵀ. Note that this is not the SVD—it is the eigenvalue
decomposition, which requires self-adjointness. The SVD
K=∑σi⟨⋅,vi⟩ui is a separate factorization that works for all
compact operators (see Remark 1).
If A−1 is compact and self-adjoint, then A−1 satisfies the Hilbert-Schmidt theorem with
eigenvalues μj→0. The eigenvalues of A are λj=1/μj→∞.
This is the typical situation for Laplacian-type operators: A=−Δ with suitable boundary
conditions is unbounded, but the solution operator A−1 (given by a Green’s function) is
compact (Example 1). Hence −Δ has a discrete spectrum of eigenvalues
tending to infinity, with eigenfunctions forming an ONB. This is why Fourier series work:
the eigenfunctions of the Laplacian on [0,2π] with periodic boundary conditions are precisely {e^{inx} : n ∈ ℤ}.
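A finite-difference sketch makes this visible (illustrative; it uses Dirichlet rather than periodic boundary conditions on (0,π), where the exact eigenvalues of −d²/dx² are k²): the discrete spectrum grows like k², while the eigenvalues of the solution operator tend to zero.

```python
import numpy as np

# Second-difference discretization of A = -d^2/dx^2 on (0, pi), Dirichlet BCs.
n = 400
h = np.pi / (n + 1)
main = 2.0 * np.ones(n)
off = -np.ones(n - 1)
A = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2

lam = np.linalg.eigvalsh(A)   # ascending: lam_k ~ k^2, tending to infinity
print(lam[:4])                # close to 1, 4, 9, 16
mu = 1.0 / lam                # eigenvalues of the solution operator A^{-1}
print(mu.min())               # mu_j -> 0: the inverse is compact, A itself is not bounded
```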
Compact operators are the bridge between the abstract operator theory of this chapter and several
later topics in the course:
Weak convergence + compact operator ⇒ strong convergence. If xn⇀x
weakly and K is compact, then Kxn→Kx strongly. This is a key tool in the calculus of
variations for passing to the limit in nonlinear problems.
Rellich-Kondrachov compactness. The Sobolev embedding H1(Ω)↪L2(Ω)
is compact for bounded Ω. This is the source of compactness in elliptic PDE theory and the
reason the direct method of the calculus of variations works.
Fixed point theory. Compact operators are the setting for Schauder’s fixed point theorem and
the Leray-Schauder degree, which extend Brouwer’s fixed point theorem to infinite dimensions.
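The first bridge (weak convergence upgraded to strong convergence) can be seen in a truncated ℓ² model (an illustrative sketch; the test vector y is an assumption): e_n ⇀ 0 weakly since ⟨e_n, y⟩ → 0 for every fixed y, yet ∥e_n∥ = 1 throughout, while a compact diagonal operator turns this into genuine norm convergence.

```python
import numpy as np

N = 1000
y = 1.0 / np.arange(1, N + 1)   # a fixed test vector in (truncated) l^2

# Weak convergence e_n -> 0: the pairing <e_n, y> = y_n tends to 0 for every
# fixed y, but ||e_n|| = 1 for all n, so there is no strong convergence.
pairings = y.copy()             # pairings[n] = <e_{n+1}, y>
print(pairings[0], pairings[-1])

# Applying the compact diagonal operator K e_n = (1/n) e_n upgrades this:
# ||K e_n|| = 1/n -> 0, i.e. K e_n -> 0 strongly.
image_norms = 1.0 / np.arange(1, N + 1)
print(image_norms[-1])
```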