Product update
New browser AI benchmark: ONNX on WebGPU and CPU
BottleneckRadar now includes a dedicated browser AI benchmark. It runs a real MobileNet v2 image-classification model locally through ONNX Runtime Web and compares CPU/WebAssembly inference with WebGPU-accelerated inference when WebGPU is available.
What the benchmark reports
The test measures model initialization time, median inference latency, p90 (90th-percentile) latency, and completed inferences per second. When both backends finish successfully, it also reports the WebGPU speedup relative to the CPU/WebAssembly backend. Model download time is shown separately and does not affect the compute score.
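As a rough illustration of how these figures relate, the sketch below derives them from a list of per-inference latencies. The names `summarize`, `percentile`, and `webgpuSpeedup` are hypothetical. The choices made here (nearest-rank percentiles, throughput computed from summed latencies, speedup as the ratio of median latencies) are assumptions, not necessarily the formulas BottleneckRadar uses.

```typescript
// Summary of per-inference latencies, all in milliseconds.
interface LatencySummary {
  medianMs: number;
  p90Ms: number;
  inferencesPerSecond: number;
}

// Nearest-rank percentile over a sorted copy of the samples; p is in (0, 100].
// For an even sample count, p = 50 returns the lower of the two middle values.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile needs at least one sample");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const clamped = Math.min(sorted.length, Math.max(1, rank));
  return sorted[clamped - 1];
}

// Assumption: inferences run back to back, so total time is the sum of latencies.
function summarize(samplesMs: number[]): LatencySummary {
  const totalMs = samplesMs.reduce((sum, ms) => sum + ms, 0);
  return {
    medianMs: percentile(samplesMs, 50),
    p90Ms: percentile(samplesMs, 90),
    inferencesPerSecond: (samplesMs.length * 1000) / totalMs,
  };
}

// Assumption: speedup is the ratio of median latencies (values above 1 favor WebGPU).
function webgpuSpeedup(cpu: LatencySummary, gpu: LatencySummary): number {
  return cpu.medianMs / gpu.medianMs;
}
```

The p90 figure is useful alongside the median because it exposes occasional slow runs that a median hides.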
Why ONNX is used
ONNX (Open Neural Network Exchange) is a portable format for trained machine-learning models. It lets the same MobileNet v2 model run through different browser execution backends, so the CPU and WebGPU comparison reflects a real workload rather than a synthetic loop.
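The idea of running one model through several backends can be sketched as a small timing harness. This is a minimal sketch, not BottleneckRadar's implementation: `SessionLike`, `SessionFactory`, `runBackend`, and `compareBackends` are hypothetical names, the backend labels `"wasm"` and `"webgpu"` are assumed, and the session factory is injected so the sketch does not depend on any particular runtime API. Here `initMs` covers whatever the factory does, so a real harness would need to keep download time out of it.

```typescript
// Minimal shapes this sketch relies on (hypothetical, not a real library's types).
interface SessionLike {
  run(input: Float32Array): Promise<unknown>;
}
type SessionFactory = (modelUrl: string, backend: string) => Promise<SessionLike>;

interface BackendResult {
  backend: string;
  ok: boolean;
  initMs?: number;
  latenciesMs?: number[];
  error?: string;
}

// Times session creation and `runs` sequential inferences on one backend.
// A failure (for example, an unsupported backend) is reported, not thrown.
async function runBackend(
  createSession: SessionFactory,
  modelUrl: string,
  backend: string,
  input: Float32Array,
  runs: number,
  now: () => number = () => performance.now(),
): Promise<BackendResult> {
  try {
    const initStart = now();
    const session = await createSession(modelUrl, backend);
    const initMs = now() - initStart;

    const latenciesMs: number[] = [];
    for (let i = 0; i < runs; i++) {
      const start = now();
      await session.run(input);
      latenciesMs.push(now() - start);
    }
    return { backend, ok: true, initMs, latenciesMs };
  } catch (err) {
    return { backend, ok: false, error: String(err) };
  }
}

// Runs the same model and input on each backend in turn.
async function compareBackends(
  createSession: SessionFactory,
  modelUrl: string,
  input: Float32Array,
  runs: number,
  now: () => number = () => performance.now(),
): Promise<BackendResult[]> {
  const results: BackendResult[] = [];
  for (const backend of ["wasm", "webgpu"]) {
    results.push(await runBackend(createSession, modelUrl, backend, input, runs, now));
  }
  return results;
}
```

With ONNX Runtime Web, the factory would presumably wrap `ort.InferenceSession.create(modelUrl, { executionProviders: [backend] })` and adapt `run` to build the model's named input tensor. That wiring is left out here.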
How to interpret the result
- Lower inference latency means the browser completes each model run faster.
- More inferences per second means better sustained AI throughput.
- A "WebGPU unavailable" result can reflect missing browser, driver, operating-system, or policy support rather than weak hardware.
- Laptop power mode and background applications can change results between runs.
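To make the "WebGPU unavailable" case more concrete, the sketch below separates two common situations: the browser does not expose the WebGPU API at all, or it exposes the API but returns no adapter. The names `detectWebGpu`, `NavigatorLike`, `GpuLike`, and `WebGpuStatus` are hypothetical, and a real check would likely need finer-grained reporting. Only the standard `navigator.gpu.requestAdapter()` call is assumed from the platform.

```typescript
// Structural stand-ins for the parts of the browser API this sketch touches.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGpuStatus = "available" | "api-missing" | "no-adapter";

// Distinguishes "browser has no WebGPU API" from "API present, but no usable adapter".
// Neither outcome by itself says the hardware is slow.
async function detectWebGpu(nav: NavigatorLike): Promise<WebGpuStatus> {
  if (!nav.gpu) {
    return "api-missing";
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "available" : "no-adapter";
  } catch {
    // Treat a rejected request the same as a missing adapter in this sketch.
    return "no-adapter";
  }
}
```

In a page this would be called as `detectWebGpu(navigator as NavigatorLike)`. An "api-missing" result usually points at the browser version or settings, while "no-adapter" more often points at driver, operating-system, or policy restrictions.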
The model input and benchmark output stay on the device. The first run downloads the approximately 13.3 MB model, while later runs can use the browser cache.