Modes of Convergence

Almost sure, in probability, in distribution, in $L^p$ — the same sample paths, four different lenses.

A sequence of random variables $X_1, X_2, \ldots$ can approach a limit $X$ in several non-equivalent senses. Textbooks usually present the lattice of implications and a few hand-picked counterexamples in passing. Here the counterexamples are the centerpiece: each one is a recipe for sample paths $X_n(\omega)$, and the same paths are read four different ways.

Start by dragging the step $n$ — directly on the canvas, or with the slider — and watch the readouts. Then switch presets — each one is the canonical example of one mode of convergence failing while another succeeds.

1. Convergence of random variables

Each colored trace is one sample path $X_n(\omega_k)$ — same $\omega_k$ across all $n$, different $\omega_k$ per path. The horizontal axis is $n$; the shaded band is the $\varepsilon$-tube around the limit $X$. Four readouts measure the four modes:

Figure 1 · Sample paths, four lenses
Legend: sample paths $X_n(\omega_k)$; $\varepsilon$-tube around $X$; currently outside tube; future escapes ($m \geq n$); largest deviation ($L^p$ driver).
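The tube-based and $L^p$ readouts can be approximated by simulation. The following is a minimal sketch, assuming the spike preset $X_n(\omega) = n\cdot\mathbf{1}_{[0,1/n]}(\omega)$ with $\omega$ uniform on $[0,1]$ and limit $X = 0$. The function names, the finite `horizon` standing in for "all $m \geq n$", and the number of paths are illustrative choices, not the page's implementation.

```python
import random

def spike(n, omega):
    """Spike preset: X_n(omega) = n on [0, 1/n], else 0 (limit X = 0)."""
    return float(n) if omega <= 1.0 / n else 0.0

def readouts(n, omegas, eps=0.5, horizon=200, p=1):
    """Monte Carlo readouts at step n (assumed definitions, finite horizon)."""
    k = len(omegas)
    # In probability: fraction of paths currently outside the eps-tube.
    outside_now = sum(abs(spike(n, w)) >= eps for w in omegas) / k
    # Almost sure: fraction of paths that escape at some m in [n, horizon].
    future_escape = sum(
        any(abs(spike(m, w)) >= eps for m in range(n, horizon + 1))
        for w in omegas
    ) / k
    # L^p: empirical E|X_n - X|^p.
    lp_moment = sum(abs(spike(n, w)) ** p for w in omegas) / k
    # Largest deviation among the simulated paths.
    largest = max(abs(spike(n, w)) for w in omegas)
    return outside_now, future_escape, lp_moment, largest

rng = random.Random(0)
omegas = [rng.random() for _ in range(5_000)]
for n in (1, 10, 100):
    print(n, readouts(n, omegas))
```

For the spike, both tube readouts shrink roughly like $1/n$, while the empirical first moment stays near 1: the few paths still outside the tube sit at height $n$.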

2. The implication lattice

Five strict implications hold; nothing else does. Click any node to load the preset where that mode is the headline example; click any dashed (non-)arrow to load the canonical counterexample that breaks it. Each node is tinted by the loaded preset — green where that mode converges, red where it fails.

Figure 2 · Modes of convergence — what implies what
Hover or click an edge to see what it claims (or what counterexample disproves the converse). Node tint shows which modes hold for the loaded preset.
Reading the diagram. A solid arrow $A \Rightarrow B$ means every sequence that converges in mode $A$ also converges in mode $B$. A dashed arrow $A \not\Rightarrow B$ means a counterexample exists — clicking it loads that counterexample into Figure 1. The lattice has a sharp asymmetry: almost-sure and $L^p$ are both stronger than convergence in probability, but neither implies the other (the typewriter sits in one gap, the spike sits in the other).
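The typewriter gap can be checked directly. The sketch below assumes the standard typewriter construction (write $n = 2^k + j$ with $0 \leq j < 2^k$ and let $X_n$ be the indicator of the dyadic interval $[j/2^k, (j+1)/2^k)$); the page's preset may differ in details, and the helper names are hypothetical.

```python
def typewriter(n, omega):
    """Typewriter sequence on [0, 1): n = 2**k + j with 0 <= j < 2**k,
    X_n = indicator of the dyadic interval [j/2**k, (j+1)/2**k)."""
    k = n.bit_length() - 1      # block index: 2**k <= n < 2**(k+1)
    j = n - (1 << k)            # position inside block k
    width = 1.0 / (1 << k)
    return 1.0 if j * width <= omega < (j + 1) * width else 0.0

def lp_moment(n):
    """E|X_n|^p for omega uniform on [0, 1): the interval length, for any p > 0."""
    return 1.0 / (1 << (n.bit_length() - 1))

def hits(omega, n_max):
    """Steps n <= n_max at which the path X_n(omega) equals 1 (outside the tube)."""
    return [n for n in range(1, n_max + 1) if typewriter(n, omega) == 1.0]

omega = 0.3
print(hits(omega, 64))   # the path is hit once in every complete dyadic block
print(lp_moment(64))     # the L^p moment shrinks to 0 as n grows
```

Every $\omega$ is covered once per block, so each path leaves the tube infinitely often (no almost-sure convergence), yet the moment $2^{-k}$ tends to zero (convergence in $L^p$, hence in probability).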

3. Why the modes shrink: Markov, Chebyshev, Chernoff

The escape rate $\mathbb{P}(|X_n - X| \geq \varepsilon)$ is the central object behind convergence in probability. Three inequalities bound it from above, each using progressively more information:
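For reference, the three inequalities named in the section title can be stated in their standard textbook forms. The symbols $Y$, $t$, and $\lambda$ are notation introduced here for illustration, not the page's own:

$$
\begin{aligned}
\text{Markov:}\quad & \mathbb{P}(|Y| \geq \varepsilon) \leq \frac{\mathbb{E}|Y|}{\varepsilon}, \\
\text{Chebyshev:}\quad & \mathbb{P}(|Y - \mathbb{E}Y| \geq \varepsilon) \leq \frac{\operatorname{Var}(Y)}{\varepsilon^2}, \\
\text{Chernoff:}\quad & \mathbb{P}(Y \geq t) \leq \inf_{\lambda > 0} e^{-\lambda t}\,\mathbb{E}\,e^{\lambda Y}.
\end{aligned}
$$

Markov needs a finite first moment, Chebyshev a finite variance, and Chernoff a finite moment generating function for some $\lambda > 0$.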

Figure 3 overlays the Markov and Chebyshev bounds on the empirical escape rate from Figure 1. Both should sit above the dots, sometimes tightly, sometimes wastefully. The Cauchy preset is the dramatic failure case: the second moment is infinite, so Chebyshev gives only a vacuous bound (Markov fares no better, since the Cauchy distribution has no finite mean either), and the LLN itself fails.

Figure 3 · Empirical escape rate vs. Markov & Chebyshev bounds
Legend: empirical $\mathbb{P}(|X_n - X|\geq\varepsilon)$; Markov bound; Chebyshev bound.
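A comparison of this kind can be sketched as follows, assuming $X_n$ is the sample mean of $n$ Exponential(1) draws and $X = \mu = 1$. This is an illustrative choice, not necessarily one of the page's presets. Both moments are estimated from the same simulated trials, so the bounds hold exactly for the empirical distribution.

```python
import random

def escape_and_bounds(n, eps=0.25, trials=4000, seed=0):
    """Sample mean of n Exponential(1) draws (assumed example, mu = 1).
    Returns (empirical escape rate, Markov bound, Chebyshev bound) for
    P(|mean - mu| >= eps), with moments estimated from the same trials."""
    rng = random.Random(seed)
    mu = 1.0
    devs = []
    for _ in range(trials):
        mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        devs.append(abs(mean - mu))
    empirical = sum(d >= eps for d in devs) / trials
    # Markov: E|mean - mu| / eps, capped at 1.
    markov = min(1.0, (sum(devs) / trials) / eps)
    # Chebyshev (centered at mu): E[(mean - mu)^2] / eps^2, capped at 1.
    chebyshev = min(1.0, (sum(d * d for d in devs) / trials) / eps ** 2)
    return empirical, markov, chebyshev

for n in (5, 20, 80, 320):
    print(n, escape_and_bounds(n))
```

For small $n$ both bounds are capped at 1 and say nothing; as $n$ grows, Chebyshev typically drops below Markov because it decays like $1/n$ rather than $1/\sqrt{n}$.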

Markov bounds are usually loose because they only see the first moment. Chebyshev tightens them by squaring the deviation before applying Markov, and the next level, Chernoff, buys an exponential drop by applying Markov to $e^{\lambda X}$ and then optimizing over $\lambda > 0$. Figure 4 shows the same tail $\mathbb{P}(\bar S_n - \mu \geq t)$ on a log scale, with all three bounds and the empirical rate. The Chernoff curve pulls away in a straight line; Chebyshev sags slowly like $1/n$.

Figure 4 · Concentration of $\bar S_n$ on a log scale
Legend: empirical tail; Markov; Chebyshev; Chernoff.
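The three bounds can be computed in closed form for a simple case. The sketch below assumes $\bar S_n$ is the mean of $n$ Bernoulli($\mu$) draws, an example chosen here because the exact tail is a finite binomial sum; the function name and default parameters are illustrative.

```python
import math

def tail_bounds(n, mu=0.5, t=0.1):
    """Bounds on P(mean - mu >= t) for the mean of n Bernoulli(mu) draws
    (assumed example). Returns (exact, markov, chebyshev, chernoff)."""
    a = mu + t
    k_min = math.ceil(n * a - 1e-9)   # smallest count with mean >= mu + t
    exact = sum(
        math.comb(n, k) * mu**k * (1 - mu) ** (n - k)
        for k in range(k_min, n + 1)
    )
    markov = mu / a                                      # E[mean] / (mu + t), flat in n
    chebyshev = min(1.0, mu * (1 - mu) / (n * t**2))     # Var(mean) / t^2, like 1/n
    # Chernoff: optimizing over lambda gives exp(-n * KL(a || mu)).
    kl = a * math.log(a / mu) + (1 - a) * math.log((1 - a) / (1 - mu))
    chernoff = math.exp(-n * kl)
    return exact, markov, chebyshev, chernoff

for n in (10, 100, 400):
    print(n, [round(math.log10(v), 3) for v in tail_bounds(n)])
```

On the log scale the Chernoff column falls linearly in $n$ (slope set by the KL divergence), the Chebyshev column falls only logarithmically, and the Markov column does not move.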

A subtler use of Chernoff: pair the bound with the Borel–Cantelli lemma ($\sum_n \mathbb{P}(A_n) < \infty \Rightarrow \mathbb{P}(A_n \text{ i.o.}) = 0$). Exponential decay of $\mathbb{P}(|\bar S_n - \mu| \geq \varepsilon)$ is more than summable, so the strong law follows. Chebyshev's $1/n$ rate is not summable, which is why proving the strong LLN from Chebyshev alone requires the extra subsequence trick.
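The summability contrast can be made explicit. Assume a Chernoff-type bound of the form $e^{-cn}$ for some constant $c > 0$ (a hypothetical constant depending on $\varepsilon$) and the Chebyshev bound $\sigma^2/(n\varepsilon^2)$, where $\sigma^2$ denotes the per-sample variance (a symbol introduced here):

$$
\sum_{n \geq 1} e^{-cn} = \frac{1}{e^{c} - 1} < \infty,
\qquad
\sum_{n \geq 1} \frac{\sigma^2}{n\,\varepsilon^2} = \infty,
\qquad
\sum_{k \geq 1} \frac{\sigma^2}{k^2\,\varepsilon^2} = \frac{\pi^2 \sigma^2}{6\,\varepsilon^2} < \infty.
$$

The third sum is the Chebyshev bound evaluated along the subsequence $n_k = k^2$. Borel–Cantelli then gives almost-sure convergence along the squares, and a separate argument is needed to control the steps between consecutive squares.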

4. When can you swap limit and expectation?

Almost-sure convergence does not imply $L^1$ convergence — the spike preset shows $X_n \to 0$ a.s. while $\mathbb{E} X_n = 1$ for every $n$. Three theorems give sufficient conditions for $\mathbb{E} X_n \to \mathbb{E} X$:

Figure 5 lets you propose a dominating function $g(\omega)$ for the spike. The canvas plots all the $X_n(\omega) = n\cdot\mathbf{1}_{[0,1/n]}(\omega)$ on the same axes; you adjust $g$ as a power-law envelope $g(\omega) = c \cdot \omega^{-\alpha}$ and the readout reports $\int_0^1 g$. The point is to see why no integrable $g$ dominates: $g$ must satisfy $g(\omega) \geq n$ on $[0, 1/n]$ for every $n$, which forces $g(\omega) \geq \sup_n X_n(\omega) = \lfloor 1/\omega \rfloor > 1/\omega - 1$ on $(0, 1]$, and $\int_0^1 (1/\omega - 1) \,d\omega = \infty$.

Figure 5 · Can you dominate the spike?
Legend: $X_n(\omega) = n\cdot\mathbf{1}_{[0,1/n]}$; candidate envelope $g(\omega) = c\,\omega^{-\alpha}$.
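The trade-off in the power-law envelope can be checked numerically. This is a minimal sketch with hypothetical helper names, assuming $c > 0$ and $\alpha \geq 0$: the integral $\int_0^1 c\,\omega^{-\alpha}\,d\omega$ is finite only for $\alpha < 1$, while domination of every spike needs $c\,n^{\alpha} \geq n$ for all $n$.

```python
import math

def envelope_integral(c, alpha):
    """Integral over (0, 1] of g(w) = c * w**(-alpha); finite only for alpha < 1."""
    return c / (1.0 - alpha) if alpha < 1.0 else math.inf

def first_violation(c, alpha, n_max=10_000):
    """Smallest n <= n_max at which g fails to dominate X_n = n * 1[0, 1/n].
    g is non-increasing, so its minimum on (0, 1/n] is g(1/n) = c * n**alpha.
    Returns None if no violation is found up to n_max (not a proof of domination)."""
    for n in range(1, n_max + 1):
        if c * n**alpha < n:
            return n
    return None

for c, alpha in [(2.0, 0.5), (3.0, 0.5), (1.0, 1.0)]:
    print(c, alpha, envelope_integral(c, alpha), first_violation(c, alpha))
```

Raising $c$ only postpones the first violation when $\alpha < 1$; pushing $\alpha$ up to 1 removes the violation but makes the integral infinite.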

The MCT story is gentler. Figure 6 picks a non-negative integrable $f$ on $[0,1]$ and lets $X_n = f \cdot \mathbf{1}_{[0, 1 - 1/n]}$, a sequence that fills in toward $f$ from the left. Because $0 \leq X_n \uparrow f$ at every $\omega < 1$ (hence almost surely), MCT gives $\mathbb{E} X_n \uparrow \mathbb{E} f$ with no hypothesis beyond non-negativity. Flip the monotonicity off and the guarantee disappears.

Figure 6 · MCT: monotone fill-up
Legend: limit $f(\omega)$; current $X_n$.
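The monotone fill-up can be reproduced numerically. The sketch below assumes $\omega$ uniform on $[0,1]$ and uses $f(\omega) = 3\omega^2$ as an illustrative non-negative choice with $\mathbb{E} f = 1$; the function name and the midpoint rule are choices made here, not the page's.

```python
def expectation_truncated(f, n, steps=10_000):
    """E[X_n] for X_n = f * 1[0, 1 - 1/n], omega uniform on [0, 1],
    computed with a midpoint Riemann sum on [0, 1 - 1/n]."""
    b = 1.0 - 1.0 / n
    h = b / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

def f(w):
    """Assumed example: non-negative on [0, 1] with E f = 1."""
    return 3.0 * w * w

values = [expectation_truncated(f, n) for n in (1, 2, 5, 10, 100, 1000)]
print(values)
```

The printed values increase toward $\mathbb{E} f = 1$, matching the closed form $(1 - 1/n)^3$ for this choice of $f$.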

5. A guided tour through the lattice

The four counterexamples occupy specific positions in the lattice. Working them out by hand once is rewarding; clicking through them quickly is the next best thing.

What this page is not. It is not a proof gallery — the inequalities and implications are quoted, not derived. It is also not the CLT story; the CLT lives more comfortably with the named distributions and Berry–Esseen, which is itself a refinement of convergence in distribution rather than a separate mode.

What next