
Why Stochastic?

Many systems we want to model are not purely deterministic: a pollen grain buffeted by molecular collisions, an asset price driven by market fluctuations, the concentration of a chemical species subject to random reaction events. In each case the dynamics combine a predictable trend with random fluctuations, and this section builds the tools to describe both.


Brownian Motion

The Drunkard’s Walk

Imagine a person standing at the origin on the real line. At each tick of a clock (spaced $\delta t$ apart), they step right by $+\delta$ or left by $-\delta$, each with probability $\tfrac{1}{2}$. Let $\xi_j \in \{-\delta,\,+\delta\}$ denote the $j$-th step. The steps are independent, and by symmetry each has

$$\mathbb{E}[\xi_j] = \tfrac{1}{2}(+\delta) + \tfrac{1}{2}(-\delta) = 0, \qquad \mathbb{E}[\xi_j^2] = \tfrac{1}{2}\delta^2 + \tfrac{1}{2}\delta^2 = \delta^2.$$

Here $\mathbb{E}[X]$ denotes the expected value (or mean) of a random variable $X$: the average over all possible outcomes, weighted by their probabilities. Since $\mathbb{E}[\xi_j] = 0$, the variance is $\operatorname{Var}(\xi_j) = \mathbb{E}[\xi_j^2] = \delta^2$.

After $n$ steps the position is $S_n = \sum_{j=1}^n \xi_j$. Since $S_{n+1} = S_n + \xi_{n+1}$, we can ask: if the walker is currently at position $s$, what is the expected position after one more step? In general, the conditional expectation of a random variable $Y$ given that another random variable $X$ takes the value $x$ is

$$\mathbb{E}[Y \mid X = x] = \sum_{y} y\;\mathbb{P}(Y = y \mid X = x),$$

the average of $Y$ restricted to only those outcomes where $X = x$. In our case $Y = S_{n+1}$, $X = S_n$, and $x = s$. Given $S_n = s$, the only two possible values of $S_{n+1}$ are $s + \delta$ and $s - \delta$, each with conditional probability $\tfrac{1}{2}$. So:

$$\mathbb{E}[S_{n+1} \mid S_n = s] = \tfrac{1}{2}(s + \delta) + \tfrac{1}{2}(s - \delta) = s.$$

This says: no matter where the walker currently stands, the expected position after one more step is unchanged. To get the unconditional expectation, we average over all possible values of $S_n$, weighting each by its probability. Write $p_k = \mathbb{P}(S_n = k\delta)$ for the probability that the walker is at position $k\delta$ after $n$ steps. Then

$$\mathbb{E}[S_{n+1}] = \sum_{k} \mathbb{E}[S_{n+1} \mid S_n = k\delta]\;p_k = \sum_{k} (k\delta)\,p_k = \mathbb{E}[S_n].$$

In the second equality we used the conditional result above: $\mathbb{E}[S_{n+1} \mid S_n = k\delta] = k\delta$. The final sum is the definition of $\mathbb{E}[S_n]$: recall that $S_n$ takes the values $k\delta$ with probabilities $p_k$, so $\mathbb{E}[S_n] = \sum_k (k\delta)\,p_k$. (This argument is an instance of the law of total expectation.) Since $S_0 = 0$, unrolling gives $\mathbb{E}[S_n] = \mathbb{E}[S_{n-1}] = \cdots = \mathbb{E}[S_0] = 0$.

For the mean-square displacement $\mathbb{E}[S_n^2]$ (the average squared distance from the origin, taken over many realizations), square $S_{n+1} = S_n + \xi_{n+1}$ and average over the two cases:

$$\mathbb{E}[S_{n+1}^2 \mid S_n] = \tfrac{1}{2}(S_n + \delta)^2 + \tfrac{1}{2}(S_n - \delta)^2 = S_n^2 + \delta^2.$$

The cross term $2\,S_n\,\delta$ cancels between the two cases. Taking expectations gives the recursion $\mathbb{E}[S_{n+1}^2] = \mathbb{E}[S_n^2] + \delta^2$, so starting from $S_0 = 0$:

$$\mathbb{E}[S_n^2] = n\,\delta^2.$$

After time $t = n\,\delta t$ we have $n = t/\delta t$, so the mean-square displacement is

$$\mathbb{E}[S_n^2] = \frac{t}{\delta t}\,\delta^2.$$

For this to have a well-defined limit as the discretization is refined, we need $\delta^2 / \delta t$ to remain constant, i.e. $\delta \propto \sqrt{\delta t}$. The natural choice is $\delta = \sqrt{\delta t}$, giving $\mathbb{E}[S_n^2] = t$.
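This scaling is easy to check by direct simulation. The sketch below (assuming NumPy is available; the walk is exactly the coin-flip model above with $\delta = \sqrt{\delta t}$) estimates $\mathbb{E}[S_n]$ and $\mathbb{E}[S_n^2]$ from many independent walkers:

```python
import numpy as np

rng = np.random.default_rng(0)

dt = 0.01              # clock tick
T = 1.0                # final time
n = int(T / dt)        # number of steps
M = 50_000             # number of independent walkers

delta = np.sqrt(dt)    # step size chosen so that delta^2 / dt = 1

# Each step is +delta or -delta with probability 1/2, independently.
steps = rng.choice([-delta, delta], size=(M, n))
S = steps.sum(axis=1)  # position of each walker after n steps

print(S.mean())        # close to 0: the walk is unbiased
print(np.mean(S**2))   # close to t = 1: mean-square displacement grows like t
```

The sample averages play the role of the expectations in the derivation; their fluctuations shrink as the number of walkers $M$ grows.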

This is precisely what is observed experimentally: if you track a pollen grain suspended in water under a microscope, the mean-square displacement grows linearly with time. Einstein (1905) explained this by the argument above, connecting the microscopic randomness of molecular collisions to the macroscopic diffusion rate.

The Drunkard’s Walk (General Version)

Imagine a person standing at the origin who, at regular time intervals $\delta t$, takes a step of random size. Let $\xi_j$ denote the $j$-th step, drawn from a probability distribution with density $p(\xi)$ (so that $\mathbb{P}(a \leq \xi_j \leq b) = \int_a^b p(\xi)\,d\xi$). Suppose the steps are independent with mean zero and variance $\sigma^2$:

$$\mathbb{E}[\xi_j] = \int_{-\infty}^{\infty} \xi\,p(\xi)\,d\xi = 0, \qquad \operatorname{Var}(\xi_j) = \int_{-\infty}^{\infty} \xi^2\,p(\xi)\,d\xi = \sigma^2.$$


After $n$ steps the position is $S_n = \sum_{j=1}^n \xi_j$. What are the mean and variance of $S_n$? Writing out the expected value against the joint density $p(\xi_1, \ldots, \xi_n)$:

$$\mathbb{E}[S_n] = \int \cdots \int \left(\sum_{j=1}^n \xi_j\right) p(\xi_1, \ldots, \xi_n)\,d\xi_1 \cdots d\xi_n.$$

Integration is linear, so the sum can be handled term by term. Since the steps are independent, the joint density factors as $p(\xi_1, \ldots, \xi_n) = p(\xi_1)\cdots p(\xi_n)$, and in the $j$-th term the integrals over the other variables collapse: $\int p(\xi_k)\,d\xi_k = 1$ for $k \neq j$. This gives

$$\mathbb{E}[S_n] = \sum_{j=1}^n \int \xi_j\,p(\xi_j)\,d\xi_j = \sum_{j=1}^n \mathbb{E}[\xi_j] = 0.$$

For the variance, independence is essential. Recall that $\operatorname{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$. Since $\mathbb{E}[S_n] = 0$, the variance reduces to $\operatorname{Var}(S_n) = \mathbb{E}[S_n^2]$, the mean-square displacement: the average of the squared distance from the origin, taken over many realizations of the walk. Expanding $S_n^2$:

$$\mathbb{E}[S_n^2] = \mathbb{E}\!\left[\left(\sum_{j=1}^n \xi_j\right)^{\!2}\right] = \sum_{j=1}^n \mathbb{E}[\xi_j^2] + 2\sum_{i < j} \mathbb{E}[\xi_i\,\xi_j].$$

Independence means $\mathbb{E}[\xi_i\,\xi_j] = \mathbb{E}[\xi_i]\,\mathbb{E}[\xi_j] = 0$ for $i \neq j$, so all cross terms vanish and

$$\operatorname{Var}(S_n) = \sum_{j=1}^n \operatorname{Var}(\xi_j) = n\sigma^2.$$

After time $t = n\,\delta t$ the mean-square displacement is $\mathbb{E}[S_n^2] = (t / \delta t)\,\sigma^2$. For this to depend only on $t$ and not on the discretization $\delta t$, we need $\sigma^2 \propto \delta t$, i.e. each step has standard deviation proportional to $\sqrt{\delta t}$.


Why Gaussian? The Central Limit Theorem

Nothing in the drunkard's walk argument used the shape of the step distribution, only that the steps are independent with mean zero and finite variance. The $\pm\delta$ coin flip, a uniform distribution, or any other zero-mean, finite-variance law all give the same linear growth of mean-square displacement. So why does Brownian motion have Gaussian increments?

The answer is the Central Limit Theorem (CLT): if $\xi_1, \xi_2, \ldots$ are i.i.d. with mean 0 and variance $\sigma^2$, then the normalized sum converges in distribution to a Gaussian,

$$\frac{S_n}{\sigma\sqrt{n}} = \frac{\xi_1 + \cdots + \xi_n}{\sigma\sqrt{n}} \;\xrightarrow{d}\; \mathcal{N}(0,1) \qquad \text{as } n \to \infty.$$

Equivalently, for large $n$ the sum $S_n$ is approximately $\mathcal{N}(0, n\sigma^2)$. In our setting, $n = t/\delta t$ steps of variance $\sigma^2 = \delta t$ give $S_n \approx \mathcal{N}(0, t)$, regardless of the step distribution. The Gaussian emerges as the universal limit of many small independent kicks: a coin-flip walk, a uniform walk, or molecular collisions all produce the same Gaussian in the continuum limit.
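This universality can be seen numerically. The sketch below (assuming NumPy; the uniform step law is an arbitrary non-Gaussian choice) builds a walk from uniform steps of variance $\delta t$ and checks two Gaussian signatures of $S_n$: variance $t$, and roughly 68.3% of the mass within one standard deviation:

```python
import numpy as np

rng = np.random.default_rng(1)

dt, T, M = 0.01, 1.0, 50_000
n = int(T / dt)

# Uniform steps on [-a, a] have variance a^2 / 3; pick a so each step has variance dt.
a = np.sqrt(3 * dt)
S = rng.uniform(-a, a, size=(M, n)).sum(axis=1)

print(S.var())                           # close to t = 1
print(np.mean(np.abs(S) <= np.sqrt(T)))  # close to 0.683, the N(0, t) one-sigma mass
```

Repeating the experiment with the $\pm\delta$ coin-flip steps gives statistically indistinguishable results, exactly as the CLT predicts.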

This universality is made precise by Donsker’s theorem (the functional CLT): the rescaled random walk converges to Brownian motion as a continuous process, not just at a single time. Brownian motion is the unique scaling limit of every finite-variance random walk, in the same way the Gaussian is the universal limit for sums of independent random variables. Using Gaussian increments from the start simply gives exact Brownian increments at every scale, rather than only in the limit.

From the Random Walk to Continuous Noise

Setting $\sigma^2 = \delta t$ and taking the steps to be Gaussian, $\xi_j = \sqrt{\delta t}\,Z_j$ with $Z_j \sim \mathcal{N}(0,1)$, the random walk updates as

$$W_{j} = W_{j-1} + \sqrt{\delta t}\; Z_j.$$

The $\sqrt{\delta t}$ scaling is precisely what Einstein's argument requires. When $\delta t \to 0$, this random walk converges to a continuous random function $W(t)$ called Brownian motion (or a Wiener process).
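A discrete Brownian path is just a cumulative sum of these updates. A minimal sketch (assuming NumPy; the grid parameters are arbitrary) generates many paths at once and checks the defining statistics:

```python
import numpy as np

rng = np.random.default_rng(2)

dt, T, M = 0.002, 1.0, 5_000
n = int(T / dt)

# M independent paths: cumulative sums of independent N(0, dt) increments.
dW = np.sqrt(dt) * rng.standard_normal((M, n))
W = np.cumsum(dW, axis=1)    # W[i, j] approximates W(t_{j+1}) on realization i

# Across realizations, W(t) has mean ~ 0 and variance ~ t at each fixed time.
print(W[:, -1].mean())       # close to 0
print(W[:, -1].var())        # close to T = 1
```

Each row of `W` is one realization; plotting a few rows against the time grid shows the jagged, nowhere-smooth character of Brownian paths.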

Computing with Stochastic Processes

A stochastic process like $W(t)$ gives a different function on every realization. The best we can do is describe statistics of the process: expected values, variances, and correlations across realizations. When we solve an SDE numerically, the solution $X_n$ at each time step is itself a random variable that depends on all the random increments $\Delta W_0, \Delta W_1, \ldots, \Delta W_{n-1}$.

In practice, we rarely know the density of $X_n$ explicitly, so we cannot evaluate $\mathbb{E}[X_n]$ as an integral. Instead, we approximate it by Monte Carlo sampling: run $M$ independent simulations and average the results,

$$\mathbb{E}[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx \approx \frac{1}{M}\sum_{i=1}^{M} X^{(i)}, \qquad X^{(i)} \sim p.$$

This is a quadrature rule for the integral $\int x\,p(x)\,dx$, where the sample points $X^{(i)}$ are drawn from the distribution $p$ rather than placed on a deterministic grid. See the companion notebook Monte Carlo Methods for Monte Carlo integration examples, importance sampling, and the Metropolis–Hastings algorithm.
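As a concrete instance of this quadrature rule (a sketch assuming NumPy; the integrand $x^4$ against the standard normal density is chosen only because the exact answer, $\mathbb{E}[X^4] = 3$, is known):

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo estimate of E[X^4] = int x^4 p(x) dx for X ~ N(0, 1).
# The exact value is 3; the error shrinks like 1 / sqrt(M).
for M in (100, 10_000, 1_000_000):
    X = rng.standard_normal(M)
    est = np.mean(X**4)
    print(M, est, abs(est - 3.0))
```

The slow $1/\sqrt{M}$ decay of the error is the price of Monte Carlo quadrature; its advantage is that it works whenever we can sample from $p$, even when $p$ has no closed form.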


From Euler’s Method to Euler–Maruyama

The Integral Form of an ODE (Duhamel’s Principle)

Start from $\frac{dx}{dt} = f(t,x)$. Integrate both sides from $t_n$ to $t_{n+1}$:

$$x(t_{n+1}) = x(t_n) + \int_{t_n}^{t_{n+1}} f(s, x(s))\,ds.$$

This is Duhamel’s principle. Euler’s method approximates the integral by freezing the integrand at the left endpoint (a rectangle rule):

$$\int_{t_n}^{t_{n+1}} f(s, x(s))\,ds \approx f(t_n, x_n) \cdot h, \qquad h = t_{n+1} - t_n.$$
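In code, the rectangle rule becomes a one-line update. A minimal sketch (plain Python; the test problem $dx/dt = -x$ with exact solution $e^{-t}$ is an illustrative choice):

```python
import math

def euler(f, x0, t0, t1, n):
    """Euler's method: freeze f at the left endpoint of each of n equal steps."""
    h = (t1 - t0) / n
    x = x0
    for k in range(n):
        x = x + h * f(t0 + k * h, x)
    return x

# dx/dt = -x, x(0) = 1, exact solution x(t) = e^{-t}.
approx = euler(lambda t, x: -x, 1.0, 0.0, 1.0, 10_000)
print(approx, math.exp(-1.0))   # the two agree to roughly 4 decimal places
```

Halving $h$ roughly halves the error: Euler's method is first order for ODEs, a benchmark to keep in mind when we measure the stochastic analogue below.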

The SDE and Its Integral Form

An SDE $dX = a(t, X)\,dt + b(t, X)\,dW$ is shorthand for an integral equation: Duhamel's principle with two integrals, one against $ds$ and one against the Brownian increments $dW(s)$. Over one time step $[t_n, t_{n+1}]$:

$$X(t_{n+1}) = X(t_n) + \underbrace{\int_{t_n}^{t_{n+1}} a(s, X(s))\,ds}_{\text{deterministic integral}} + \underbrace{\int_{t_n}^{t_{n+1}} b(s, X(s))\,dW(s)}_{\text{stochastic integral}}$$

We approximate both integrals by the same rectangle-rule logic, freezing each integrand at the left endpoint:

$$\int_{t_n}^{t_{n+1}} a(s, X(s))\,ds \approx a(t_n, X_n)\,h, \qquad \int_{t_n}^{t_{n+1}} b(s, X(s))\,dW(s) \approx b(t_n, X_n)\,\Delta W_n,$$

where $\Delta W_n = W(t_{n+1}) - W(t_n)$ is the Brownian increment over the step.

The Euler–Maruyama Scheme

Putting the pieces together gives the update rule

$$X_{n+1} = X_n + a(t_n, X_n)\,h + b(t_n, X_n)\,\Delta W_n, \qquad \Delta W_n \sim \mathcal{N}(0, h),$$

with the increments $\Delta W_n$ drawn independently at each step. This is Euler's method with one extra term for the noise.

Example 1 (Geometric Brownian Motion)

The SDE

$$dX = \lambda X\,dt + \mu X\,dW, \qquad X(0) = X_0$$

where $\lambda, \mu$ are constants, is called geometric Brownian motion (GBM). Both drift and diffusion are proportional to $X$. This is the SDE underlying the Black–Scholes model for asset prices.

Modelling interpretation. The relative change in price over a short interval has two components:

  • A deterministic trend $\lambda\,dt$ (expected rate of return).

  • A random fluctuation $\mu\,dW$ (volatility).

So $dX/X = \lambda\,dt + \mu\,dW$, i.e. $dX = \lambda X\,dt + \mu X\,dW$. The multiplicative noise ensures prices stay positive and fluctuations scale with price level.

Euler–Maruyama: With $a(t,X) = \lambda X$ and $b(t,X) = \mu X$:

$$X_{n+1} = X_n + \lambda X_n\,h + \mu X_n\,\Delta W_n = X_n(1 + \lambda h + \mu\,\Delta W_n)$$

Exact solution. This is one of the rare SDEs with a closed-form solution, making it an ideal test problem: we can compare the Euler–Maruyama approximation against the true solution on the same Brownian path. The exact solution is

$$X(t) = X_0 \exp\!\left[(\lambda - \tfrac{1}{2}\mu^2)\,t + \mu\,W(t)\right]$$

The $-\tfrac{1}{2}\mu^2$ correction is a mathematical consequence of the Itô correction term (see Theorem 3 below), not a modelling choice. The derivation uses Itô's formula, the stochastic chain rule derived in the next section: set $Y = \ln X$ and apply Itô's formula with $\varphi(x) = \ln x$ (so $\varphi' = 1/x$, $\varphi'' = -1/x^2$). Here $f(X) = \lambda X$ and $g(X) = \mu X$, so

$$dY = \frac{1}{X}(\lambda X\,dt + \mu X\,dW) + \frac{1}{2}\!\left(-\frac{1}{X^2}\right)(\mu X)^2\,dt = (\lambda - \tfrac{1}{2}\mu^2)\,dt + \mu\,dW.$$

This has constant coefficients. Integrating from 0 to $t$:

$$Y(t) - Y(0) = (\lambda - \tfrac{1}{2}\mu^2)\,t + \mu\,W(t).$$

Exponentiating ($X = e^Y$) gives the result.
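The comparison is easy to carry out numerically. The sketch below (assuming NumPy; the parameter values are arbitrary) runs Euler–Maruyama and evaluates the exact solution on the same Brownian increments:

```python
import numpy as np

rng = np.random.default_rng(4)

lam, mu, X0 = 2.0, 1.0, 1.0     # drift, volatility, initial condition
T, n = 1.0, 2**10
h = T / n

# One Brownian path, shared by the numerical and exact solutions.
dW = np.sqrt(h) * rng.standard_normal(n)
W = np.cumsum(dW)

# Euler-Maruyama: X_{n+1} = X_n (1 + lam h + mu dW_n).
X = np.empty(n + 1)
X[0] = X0
for j in range(n):
    X[j + 1] = X[j] * (1 + lam * h + mu * dW[j])

# Exact solution evaluated on the same path.
t = np.linspace(h, T, n)
X_exact = X0 * np.exp((lam - 0.5 * mu**2) * t + mu * W)

print(abs(X[-1] - X_exact[-1]) / X_exact[-1])   # small pathwise relative error
```

Because both curves are driven by the same `dW`, the gap between them is purely discretization error, not statistical noise, and it shrinks as `h` is reduced.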


Convergence of Euler–Maruyama

Both $X_n$ (numerical) and $X(t_n)$ (exact) are random variables. On any particular run, the error $|X_n - X(t_n)|$ depends on the Brownian path. We need convergence concepts that account for randomness. Now that we have the expected value at our disposal, there are two natural ways to measure the error.

Strong Convergence: Mean of the Error

Strong convergence asks: how close is each numerical path to the true path, on average? We compute the pathwise error $|X_n - X(t_n)|$ on each realization, then take the expected value across all realizations: a method has strong order $\gamma$ if $\mathbb{E}\,|X_n - X(t_n)| \leq C\,h^{\gamma}$. Euler–Maruyama has strong order $\tfrac{1}{2}$, so halving the step size shrinks the pathwise error only by a factor of $\sqrt{2}$. This matters when individual trajectories are important, for example when simulating a specific stock price path or a particular particle trajectory.

Weak Convergence: Error of the Means

Strong convergence is demanding: it requires every individual path to be accurate. Often we only care about statistics of the solution, for instance the expected payoff of a financial derivative or the mean concentration of a chemical species. Weak convergence asks a different question: how well does the method reproduce the expected value? Rather than averaging the error, we look at the error of the average, $|\mathbb{E}[X_n] - \mathbb{E}[X(t_n)]|$. Euler–Maruyama has weak order 1: the error in the mean shrinks linearly with the step size.
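Both notions can be estimated empirically. A sketch (assuming NumPy; geometric Brownian motion is used because its exact solution is available) computes the strong error $\mathbb{E}\,|X_n - X(T)|$ and the weak error $|\mathbb{E}[X_n] - \mathbb{E}[X(T)]|$ at the final time for a few step sizes:

```python
import numpy as np

rng = np.random.default_rng(5)

lam, mu, X0, T = 2.0, 1.0, 1.0, 1.0
M = 20_000                          # number of sample paths

def endpoint_errors(n):
    """Strong and weak error of Euler-Maruyama for GBM with step h = T / n."""
    h = T / n
    dW = np.sqrt(h) * rng.standard_normal((M, n))
    X = np.full(M, X0)
    for j in range(n):
        X = X * (1 + lam * h + mu * dW[:, j])
    X_exact = X0 * np.exp((lam - 0.5 * mu**2) * T + mu * dW.sum(axis=1))
    strong = np.mean(np.abs(X - X_exact))        # E|X_n - X(T)|
    weak = abs(np.mean(X) - np.mean(X_exact))    # |E[X_n] - E[X(T)]|
    return strong, weak

for n in (8, 32, 128):
    print(n, endpoint_errors(n))    # both errors shrink as h decreases
```

Fitting a line to these errors against $h$ on log-log axes recovers slopes near $\tfrac{1}{2}$ (strong) and $1$ (weak), up to Monte Carlo noise from the finite number of paths.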


Stochastic Calculus and Itô’s Formula

To understand where the exact GBM solution comes from, and in particular the $-\tfrac{1}{2}\mu^2$ correction, we need a stochastic version of the chain rule. This requires first making sense of integration with respect to Brownian motion.

The Itô Stochastic Integral

We have already seen that the integral form of an SDE involves an expression $\int_0^T H(s)\,dW(s)$, where $H$ is some process. The Itô integral defines this as the limit of left-endpoint Riemann sums:

$$\int_0^T H(s)\,dW(s) = \lim_{n \to \infty} \sum_{j=0}^{n-1} H(t_j)\,\bigl(W(t_{j+1}) - W(t_j)\bigr), \qquad t_j = \frac{jT}{n}.$$

The left-endpoint choice is what makes this the Itô integral (as opposed to a midpoint or right-endpoint convention). It is also the choice that Euler–Maruyama naturally implements.
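A classic sanity check of the left-endpoint definition (a sketch assuming NumPy): take $H = W$ itself. Itô's formula gives the closed form $\int_0^T W\,dW = \tfrac{1}{2}\bigl(W(T)^2 - T\bigr)$, and the left-endpoint Riemann sum reproduces it on a fine grid:

```python
import numpy as np

rng = np.random.default_rng(6)

T, n = 1.0, 100_000
h = T / n

dW = np.sqrt(h) * rng.standard_normal(n)
W = np.cumsum(dW)

# Left-endpoint Riemann sum: sum_j W(t_j) (W(t_{j+1}) - W(t_j)), with W(t_0) = 0.
W_left = np.concatenate(([0.0], W[:-1]))
ito_sum = np.sum(W_left * dW)

print(ito_sum, 0.5 * (W[-1]**2 - T))   # the two agree closely for small h
```

Note the $-T/2$: the ordinary calculus answer would be $W(T)^2/2$, and the discrepancy is exactly the Itô correction. A right-endpoint or midpoint sum converges to a different value, which is why the convention matters.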

Two properties of the Itô integral are essential for everything that follows:

  • Zero mean: $\mathbb{E}\bigl[\int_0^T H(s)\,dW(s)\bigr] = 0$, because each left-endpoint value $H(t_j)$ is independent of the increment $W(t_{j+1}) - W(t_j)$ that multiplies it.

  • The Itô isometry: $\mathbb{E}\bigl[\bigl(\int_0^T H(s)\,dW(s)\bigr)^2\bigr] = \int_0^T \mathbb{E}[H(s)^2]\,ds$.

Itô’s Formula

Now we can derive the stochastic chain rule. Suppose $X(t)$ satisfies the SDE $dX = f(X)\,dt + g(X)\,dW$, and let $\varphi$ be a twice continuously differentiable function. We want the SDE satisfied by $Y(t) = \varphi(X(t))$.

Expanding $\varphi(X + dX)$ to second order and using $(dW)^2 = dt$ (in the mean-square sense), the terms of first order in $dt$ give

$$dY = \varphi'(X)\bigl(f(X)\,dt + g(X)\,dW\bigr) + \tfrac{1}{2}\varphi''(X)\,g(X)^2\,dt.$$

The extra $\tfrac{1}{2}\varphi'' g^2\,dt$ term, absent from the ordinary chain rule, is the Itô correction; with $\varphi(x) = \ln x$ it produces the $-\tfrac{1}{2}\mu^2$ in the GBM solution above.


Summary
