GEMM & Tiling Explorer

Try this:
  1. Pick a model to see how the matrix sizes differ
  2. Pick a GPU and watch the roofline and wave map change
  3. Drag the batch/seq sliders to grow the M dimension
  4. Change the tile size to see the wave efficiency shift
  5. Click rows in the table to inspect each op

Hardware

Model

Layer Operation

Workload

Tiling

Optimizations

Summary (selected op)

FLOPs
Bytes (tiled)
Arith. Intensity
Bound
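
As a rough illustration of how the four summary values relate, here is a minimal sketch for a single GEMM C[M,N] = A[M,K] x B[K,N]. It assumes the textbook 2*M*K*N FLOP count, a best-case byte count in which each matrix crosses the memory bus exactly once, and a simple roofline classification. The function names and the hardware numbers (PEAK_FLOPS, MEM_BW) are hypothetical placeholders, not values taken from the explorer, whose exact formulas may differ.

```python
def gemm_flops(m, k, n):
    # One multiply and one add per (m, k, n) triple.
    return 2 * m * k * n


def gemm_min_bytes(m, k, n, dtype_bytes=2):
    # Best case: read A and B once, write C once (fp16 by default).
    return dtype_bytes * (m * k + k * n + m * n)


def roofline(flops, bytes_moved, peak_flops, mem_bw):
    # Arithmetic intensity in FLOP per byte.
    ai = flops / bytes_moved
    # Ridge point: the intensity at which compute and memory time are equal.
    ridge = peak_flops / mem_bw
    bound = "compute" if ai >= ridge else "memory"
    # Roofline time estimate: the slower of the two limits, in milliseconds.
    time_ms = max(flops / peak_flops, bytes_moved / mem_bw) * 1e3
    return ai, bound, time_ms


# Hypothetical GPU: 100 TFLOP/s peak, 1 TB/s memory bandwidth.
PEAK_FLOPS = 100e12
MEM_BW = 1e12

for m in (1, 4096):
    f = gemm_flops(m, 4096, 4096)
    b = gemm_min_bytes(m, 4096, 4096)
    ai, bound, t = roofline(f, b, PEAK_FLOPS, MEM_BW)
    print(f"M={m}: AI={ai:.1f} FLOP/B, {bound}-bound, ~{t:.4f} ms")
```

Under these assumptions a small M leaves the op memory-bound, while a large M pushes the arithmetic intensity well past the ridge point.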

Matrix Dimensions —

Tiling Grid & Wave Mapping
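
One common way to reason about wave efficiency is sketched below. It assumes each output tile maps to one thread block and each SM runs one thread block at a time, so tiles execute in "waves" of num_sms; a partially filled last wave leaves SMs idle. The function name and the SM count of 108 are illustrative assumptions, and the explorer may model occupancy differently.

```python
import math


def wave_stats(m, n, tile_m, tile_n, num_sms):
    # Number of output tiles needed to cover the M x N result.
    tiles = math.ceil(m / tile_m) * math.ceil(n / tile_n)
    # Assumption: one tile per SM at a time, so tiles run in waves of num_sms.
    waves = math.ceil(tiles / num_sms)
    # Fraction of SM slots across all waves that do useful work.
    efficiency = tiles / (waves * num_sms)
    return tiles, waves, efficiency


# Hypothetical GPU with 108 SMs, 4096 x 4096 output.
for tile in (128, 256):
    tiles, waves, eff = wave_stats(4096, 4096, tile, tile, num_sms=108)
    print(f"tile {tile}x{tile}: {tiles} tiles, {waves} waves, {eff:.1%} efficient")
```

In this sketch a larger tile yields fewer tiles, which can make the last wave a bigger share of the total and lower the efficiency.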

Single Tile Accumulation

Memory Traffic: Naive vs Tiled
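
The gap between naive and tiled traffic can be estimated with a simple counting model, sketched here under stated assumptions: the naive kernel re-reads a full row of A and column of B from memory for every output element, while the tiled kernel loads a tile_m x K strip of A and a K x tile_n strip of B once per output tile. Edge tiles are counted as full tiles, caches are ignored, and the function names are hypothetical; this is not necessarily the formula the explorer uses.

```python
import math


def naive_bytes(m, k, n, dtype_bytes=2):
    # Every output element reads K values of A and K values of B, then writes once.
    return dtype_bytes * (2 * m * n * k + m * n)


def tiled_bytes(m, k, n, tile_m, tile_n, dtype_bytes=2):
    tiles = math.ceil(m / tile_m) * math.ceil(n / tile_n)
    # Each output tile loads one strip of A and one strip of B (edge tiles padded).
    loads = tiles * (tile_m * k + k * tile_n)
    # C is written once in total.
    return dtype_bytes * (loads + m * n)


naive = naive_bytes(1024, 1024, 1024)
tiled = tiled_bytes(1024, 1024, 1024, tile_m=128, tile_n=128)
print(f"naive: {naive / 1e6:.1f} MB, tiled: {tiled / 1e6:.1f} MB, "
      f"reduction: {naive / tiled:.1f}x")
```

In this model the input traffic shrinks roughly in proportion to the tile edge, which is why larger tiles raise arithmetic intensity.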

Roofline Model

Per-Layer GEMM Breakdown

Operation | M | K | N | FLOPs | Bytes (tiled) | AI (FLOP/B) | Bound | Time (ms)

Click a preset to load an interesting configuration.