Modes of Convergence
A sequence of random variables $X_1, X_2, \ldots$ can approach a limit $X$ in several non-equivalent senses. Textbooks usually present the lattice of implications and a few hand-picked counterexamples in passing. Here the counterexamples are the centerpiece: each one is a recipe for sample paths $X_n(\omega)$, and the same paths are read four different ways.
Start by dragging the step $n$ — directly on the canvas, or with the slider — and watch the readouts. Then switch presets — each one is the canonical example of one mode of convergence failing while another succeeds.
1. Convergence of random variables
Each colored trace is one sample path $X_n(\omega_k)$ — same $\omega_k$ across all $n$, different $\omega_k$ per path. The horizontal axis is $n$; the shaded band is the $\varepsilon$-tube around the limit $X$. Four readouts measure the four modes:
- Almost sure ($X_n \to^{\text{a.s.}} X$). Fraction of paths that leave the tube at some step $m \geq n$ visible on the canvas; should shrink to $0$.
- In probability ($X_n \to^p X$). Fraction of paths outside the tube at the current step $n$; should shrink to $0$.
- In $L^p$ ($X_n \to^{L^p} X$). Sample average of $|X_n - X|^p$ across paths, an estimate of $\mathbb{E}|X_n - X|^p$, for $p \in \{1, 2\}$; should shrink to $0$.
- In distribution ($X_n \to^d X$). Lévy distance between the law of $X_n$ and the law of $X$; should shrink to $0$. The Lévy distance metrizes convergence in distribution, whereas the Kolmogorov distance $\sup_x |F_n(x) - F(x)|$ can stay at $1$ when the limit $X$ is degenerate (take $X_n = 1/n$ and $X = 0$).
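The first three readouts can be sketched in a few lines. This is a minimal illustration, not the code behind the canvas: the function names `spike` and `readouts`, the tube width `eps`, and the finite `horizon` that stands in for "some $m \geq n$ visible on the canvas" are all assumptions made for the example. The Lévy distance is left out to keep the sketch short.

```python
import random

def spike(n, omega):
    """Spike preset: X_n(omega) = n on [0, 1/n], else 0. The limit X is 0."""
    return float(n) if omega <= 1.0 / n else 0.0

def readouts(path_fn, omegas, n, horizon, eps=0.5, limit=0.0):
    """Empirical readouts at step n over the sample points `omegas`.

    `horizon` is the last step visible on the canvas, a finite stand-in
    for the 'some m >= n' in the almost-sure readout.
    """
    k = len(omegas)
    # In probability: fraction of paths outside the eps-tube at step n.
    outside_now = sum(abs(path_fn(n, w) - limit) > eps for w in omegas) / k
    # Almost sure: fraction of paths outside the tube at some m in [n, horizon].
    outside_later = sum(
        any(abs(path_fn(m, w) - limit) > eps for m in range(n, horizon + 1))
        for w in omegas
    ) / k
    # L^p: sample average of |X_n - X|^p for p = 1 and p = 2.
    l1 = sum(abs(path_fn(n, w) - limit) for w in omegas) / k
    l2 = sum(abs(path_fn(n, w) - limit) ** 2 for w in omegas) / k
    return {"prob": outside_now, "as": outside_later, "L1": l1, "L2": l2}

rng = random.Random(0)  # fixed seed so the paths are reproducible
omegas = [rng.random() for _ in range(2000)]
for n in (1, 10, 100):
    print(n, readouts(spike, omegas, n, horizon=200))
```

On the spike, the two fractions shrink with $n$ while the `L1` readout stays near $1$, which is the disagreement between modes that the presets are built to show.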
2. The implication lattice
Five strict implications hold; nothing else does. Click any node to load the preset where that mode is the headline example; click any dashed (non-)arrow to load the canonical counterexample that breaks it. Each node is tinted by the loaded preset — green where that mode converges, red where it fails.
3. Why the modes shrink: Markov, Chebyshev, Chernoff
The escape rate $\mathbb{P}(|X_n - X| \geq \varepsilon)$ is the central object behind convergence in probability. Three inequalities bound it from above, each using progressively more information:
- Markov: $\mathbb{P}(|X_n - X| \geq \varepsilon) \leq \mathbb{E}|X_n - X| / \varepsilon$. Uses the first moment.
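- Chebyshev: $\mathbb{P}(|X_n - X| \geq \varepsilon) \leq \mathbb{E}|X_n - X|^2/\varepsilon^2$, which equals $\mathrm{Var}(X_n - X)/\varepsilon^2$ when $X_n - X$ has mean zero. Uses the second moment.
- Chernoff: $\mathbb{P}(S_n - n\mu \geq n t) \leq e^{-n \cdot I(t)}$ for $t > 0$, where $S_n$ is a sum of $n$ iid terms with mean $\mu$ and $I$ is the rate function (the Legendre transform of the log-MGF of a centered term). Uses the whole MGF.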
Figure 3 overlays the Markov and Chebyshev bounds on the empirical escape rate from Figure 1. They should sit above the dots, sometimes tightly, sometimes wastefully. The Cauchy preset is the dramatic failure case: neither the first nor the second moment is finite, so Markov and Chebyshev both give vacuous bounds, and the LLN itself fails because the mean does not exist.
Markov bounds are usually loose because they only see the first moment. Chebyshev tightens them by applying Markov to the squared deviation. The next level, Chernoff, buys an exponential drop by applying Markov to $e^{\lambda S_n}$ and then optimizing over $\lambda > 0$. Figure 4 shows the same tail $\mathbb{P}(\bar S_n - \mu \geq t)$ on a log scale, with all three bounds and the empirical rate. The Chernoff curve pulls away in a straight line; Chebyshev sags slowly like $1/n$.
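The comparison can be reproduced exactly for one concrete case. The sketch below assumes fair coin flips (Bernoulli with $\mu = 1/2$), a choice made only because the binomial tail and the rate function $I(t) = \mathrm{KL}(1/2 + t \,\|\, 1/2)$ are then available in closed form; the function names are illustrative. Markov and Chebyshev are applied to the two-sided deviation $|\bar S_n - \mu|$, so they also bound the one-sided tail.

```python
import math

def exact_tail(n, t):
    """P(mean - 1/2 >= t) for n fair coin flips, summed from the binomial pmf."""
    k_min = math.ceil(n * (0.5 + t) - 1e-12)  # small slack against float rounding
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

def markov_bound(n, t):
    """E|mean - 1/2| / t, with the first absolute moment computed exactly."""
    mad = sum(math.comb(n, k) * abs(k / n - 0.5) for k in range(n + 1)) / 2 ** n
    return mad / t

def chebyshev_bound(n, t):
    """Var(mean) / t^2 = 1 / (4 n t^2) for a fair coin."""
    return 1.0 / (4.0 * n * t * t)

def chernoff_bound(n, t):
    """exp(-n I(t)) with I(t) = KL(1/2 + t || 1/2), valid for 0 < t < 1/2."""
    p = 0.5 + t
    rate = p * math.log(2.0 * p) + (1.0 - p) * math.log(2.0 * (1.0 - p))
    return math.exp(-n * rate)

t = 0.1
for n in (10, 50, 100, 200):
    print(n, exact_tail(n, t), markov_bound(n, t),
          chebyshev_bound(n, t), chernoff_bound(n, t))
```

Taking logarithms of the last column gives values that are linear in $n$, which is the straight line on the log-scale plot.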
A subtler use of Chernoff: pair the bound with the Borel–Cantelli lemma ($\sum_n \mathbb{P}(A_n) < \infty \Rightarrow \mathbb{P}(A_n \text{ i.o.}) = 0$). When the MGF is finite, $\mathbb{P}(|\bar S_n - \mu| \geq \varepsilon)$ decays exponentially and is therefore summable in $n$ for every $\varepsilon > 0$, so the strong law follows. Chebyshev's $1/n$ rate is not summable, which is why proving the strong LLN from Chebyshev alone requires the extra subsequence trick.
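As a worked version of the summability claim, assume a two-sided Chernoff bound $\mathbb{P}(|\bar S_n - \mu| \geq \varepsilon) \leq 2e^{-n I(\varepsilon)}$ with $I(\varepsilon) > 0$ (taking $I(\varepsilon)$ to be the smaller of the upper-tail and lower-tail rates is a simplification made here) and a finite variance $\sigma^2$ for the Chebyshev comparison. Then

$$\sum_{n \geq 1} 2 e^{-n I(\varepsilon)} = \frac{2 e^{-I(\varepsilon)}}{1 - e^{-I(\varepsilon)}} < \infty, \qquad \sum_{n \geq 1} \frac{\sigma^2}{n \varepsilon^2} = \infty.$$

The first sum is a geometric series, so Borel–Cantelli applies for every $\varepsilon > 0$. The second is a multiple of the harmonic series, so the Chebyshev bound alone gives no conclusion. Along the subsequence $n = k^2$ the Chebyshev bound becomes $\sigma^2/(k^2 \varepsilon^2)$, which is summable, and that is the starting point of the subsequence trick.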
4. When can you swap limit and expectation?
Almost-sure convergence does not imply $L^1$ convergence: the spike preset shows $X_n \to 0$ a.s. while $\mathbb{E} X_n = 1$ for every $n$. Two theorems give sufficient conditions for $\mathbb{E} X_n \to \mathbb{E} X$, and a third gives the one-sided inequality that survives without them:
- Monotone convergence (MCT). If $0 \leq X_n \uparrow X$, then $\mathbb{E} X_n \uparrow \mathbb{E} X$. No domination needed.
- Dominated convergence (DCT). If $X_n \to X$ a.s. and $|X_n| \leq g$ for some integrable $g$, then $\mathbb{E} X_n \to \mathbb{E} X$.
- Fatou. For $X_n \geq 0$: $\mathbb{E}[\liminf X_n] \leq \liminf \mathbb{E} X_n$. The inequality can be strict: the spike preset has $\liminf \mathbb{E} X_n = 1$ versus $\mathbb{E}[\liminf X_n] = 0$.
Figure 5 lets you propose a dominating function $g(\omega)$ for the spike. The canvas plots all the $X_n(\omega) = n\cdot\mathbf{1}_{[0,1/n]}(\omega)$ on the same axes; you adjust $g$ as a power-law envelope $g(\omega) = c \cdot \omega^{-\alpha}$ and the readout reports $\int_0^1 g$. The point is to see why no integrable $g$ dominates: $g$ must satisfy $g(\omega) \geq n$ on $[0, 1/n]$ for every $n$, which forces $g(\omega) \geq \lfloor 1/\omega \rfloor > 1/\omega - 1$ on $(0, 1]$, and $\int_0^1 (1/\omega - 1) \,d\omega = \infty$.
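The trade-off for the power-law envelope can be checked directly. In the sketch below the function names are illustrative and not taken from the figure. Because $g$ is decreasing, the binding constraint on $[0, 1/n]$ sits at the right edge, where $g(1/n) = c\,n^{\alpha}$ must be at least $n$.

```python
import math

def spike_sup(omega):
    """sup_n X_n(omega) for the spike: the largest n with omega <= 1/n."""
    return math.floor(1.0 / omega)

def envelope_integral(c, alpha):
    """Integral of c * omega^(-alpha) over (0, 1]: finite only when alpha < 1."""
    return c / (1.0 - alpha) if alpha < 1.0 else math.inf

def first_violation(c, alpha):
    """Smallest n with g(1/n) < n, or None if g dominates every X_n.

    g(1/n) = c * n**alpha, so the constraint reads c >= n**(1 - alpha).
    For alpha very close to 1 the threshold below can overflow a float.
    """
    if alpha >= 1.0:
        return None if c >= 1.0 else 1
    # Start just below the analytic threshold c**(1/(1-alpha)), then step
    # forward so floating-point rounding cannot skip the first failure.
    n = max(1, math.floor(c ** (1.0 / (1.0 - alpha))) - 1)
    while c * n ** alpha >= n:
        n += 1
    return n

for c, alpha in [(2.0, 0.5), (5.0, 0.9), (1.0, 1.0), (1.0, 1.5)]:
    print(c, alpha, envelope_integral(c, alpha), first_violation(c, alpha))
```

Every row either reports a finite integral together with a spike that escapes the envelope, or no escape together with an infinite integral; no setting of $c$ and $\alpha$ achieves both.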
The MCT story is gentler. Figure 6 picks a nonnegative integrable $f$ on $[0,1]$ and lets $X_n = f \cdot \mathbf{1}_{[0, 1 - 1/n]}$, a sequence that fills in toward $f$ from the left. Because $0 \leq X_n \uparrow f$ at every $\omega < 1$, hence almost everywhere, MCT gives $\mathbb{E} X_n \uparrow \mathbb{E} f$ with no other hypothesis. Flip the monotonicity off and the guarantee disappears.
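A numerical version of the fill-in, under the assumption $f(\omega) = 2\omega$ (a hypothetical choice, picked because $\mathbb{E} X_n = (1 - 1/n)^2$ is then known in closed form). The midpoint rule stands in for the expectation under the uniform measure on $[0,1]$.

```python
def expectation(h, cells=20_000):
    """Midpoint-rule approximation of the integral of h over [0, 1]."""
    return sum(h((i + 0.5) / cells) for i in range(cells)) / cells

def f(omega):
    """Hypothetical nonnegative integrand; integrates to 1 on [0, 1]."""
    return 2.0 * omega

def x_n(n):
    """X_n = f * 1_[0, 1 - 1/n], returned as a function of omega."""
    cutoff = 1.0 - 1.0 / n
    return lambda omega: f(omega) if omega <= cutoff else 0.0

steps = (1, 2, 5, 10, 100)
values = [expectation(x_n(n)) for n in steps]
for n, v in zip(steps, values):
    print(n, v, (1.0 - 1.0 / n) ** 2)  # numerical value next to the closed form
```

The printed values increase toward $\mathbb{E} f = 1$, as MCT predicts for a nonnegative increasing sequence.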
5. A guided tour through the lattice
The four counterexamples occupy specific positions in the lattice. Working them out by hand once is rewarding; clicking through them quickly is the next best thing.
- Spike: a.s. and in probability, but not in $L^1$. The path $X_n(\omega) = n\cdot\mathbf{1}_{[0,1/n]}(\omega)$ is $0$ for every $n > 1/\omega$ when $\omega > 0$, yet $\mathbb{E}|X_n - 0| = \mathbb{E} X_n = 1$ for all $n$. (Breaks: a.s. $\Rightarrow L^1$, in prob $\Rightarrow L^1$.)
- Typewriter: in probability and in $L^p$, but not almost surely. Write $k = \lfloor \log_2 n \rfloor$, so that $2^k \leq n < 2^{k+1}$, and let $X_n = \mathbf{1}_{[(n-2^k)/2^k, (n-2^k+1)/2^k]}$. Then $\mathbb{E}|X_n|^p = 2^{-k} \to 0$, but for each $\omega \in [0,1]$ at least one indicator per level $k$ lights up, so $X_n(\omega) = 1$ infinitely often. (Breaks: in prob $\Rightarrow$ a.s., $L^p \Rightarrow$ a.s.)
- Sign flip: in distribution, but not in probability. With $X$ a fair $\pm 1$ and $X_n = (-1)^n X$, every $X_n$ has the same distribution as $X$, so $X_n \overset{d}{\to} X$, but $|X_n - X|$ equals $2$ for every odd $n$ and $0$ for every even $n$, so $\mathbb{P}(|X_n - X| \geq \varepsilon) = 1$ along the odd steps for any $\varepsilon \leq 2$. (Breaks: in dist $\Rightarrow$ in prob.)
- Cauchy mean: fails the LLN entirely. The running mean of iid Cauchy variables is itself Cauchy with the same scale, so it does not concentrate as $n$ grows. Markov and Chebyshev cannot help: the mean is undefined and the second moment is infinite.
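Of the four, the typewriter has the indexing that is easiest to get wrong, so here is a small sketch of it. The function names are illustrative, and the convention $n = 2^k + j$ with $0 \leq j < 2^k$ is read off the indicator formula in the list above.

```python
def typewriter(n, omega):
    """X_n(omega) for n >= 1: indicator of [j / 2^k, (j + 1) / 2^k], n = 2^k + j."""
    k = n.bit_length() - 1   # k = floor(log2 n), so 2^k <= n < 2^(k+1)
    j = n - (1 << k)         # position of the interval within level k
    width = 1.0 / (1 << k)
    return 1.0 if j * width <= omega <= (j + 1) * width else 0.0

def prob_nonzero(n):
    """P(X_n = 1) under the uniform measure on [0, 1]: the interval length 2^-k."""
    return 1.0 / (1 << (n.bit_length() - 1))

def hits(omega, n_max):
    """Steps n <= n_max at which the path through omega sits at 1."""
    return [n for n in range(1, n_max + 1) if typewriter(n, omega) == 1.0]

print(hits(0.3, 31))  # one hit in each of the levels k = 0, 1, 2, 3, 4
print([prob_nonzero(n) for n in (1, 2, 4, 8, 16)])
```

Every level $k$ contributes at least one hit for any fixed $\omega$, so the path never settles at $0$, while `prob_nonzero` halves at each level and drives the in-probability and $L^p$ readouts to $0$.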