Model Overview

See the full transformer model with every layer stacked. Click any layer to drill into per-operation detail.

Model

Hardware

Quantization

Workload

Summary

Transformer Layer Stack

Model Totals (per decode token, batch=1)

Click a preset to load an interesting configuration.