Product update

New browser AI benchmark: ONNX on WebGPU and CPU

Published June 13, 2026. Updated June 13, 2026.

BottleneckRadar now includes a dedicated browser AI benchmark. It runs a real MobileNet v2 image-classification model locally through ONNX Runtime Web and compares CPU/WebAssembly inference with WebGPU acceleration when available.

What the benchmark reports

The test measures model initialization, median inference latency, p90 latency, and completed inferences per second. It also reports the relative WebGPU speedup when both backends finish successfully. Model download time is shown separately and does not affect the compute score.
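As a rough illustration of how these figures relate, the sketch below derives the median, p90, and inferences per second from a list of per-inference latencies. It is not BottleneckRadar's actual code: the `LatencySummary` shape and function names are hypothetical, and it assumes a nearest-rank p90 and back-to-back timed runs.

```typescript
// Hypothetical summary shape; field names are illustrative only.
interface LatencySummary {
  medianMs: number;
  p90Ms: number;
  inferencesPerSecond: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function nearestRank(sortedMs: number[], p: number): number {
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(rank, 1) - 1];
}

// Median: the middle value, or the mean of the two middle values.
function median(sortedMs: number[]): number {
  const mid = Math.floor(sortedMs.length / 2);
  return sortedMs.length % 2 === 1
    ? sortedMs[mid]
    : (sortedMs[mid - 1] + sortedMs[mid]) / 2;
}

// Summarizes per-inference latencies (milliseconds) from timed runs only.
// Model download and initialization are assumed to be measured elsewhere.
function summarizeLatencies(latenciesMs: number[]): LatencySummary {
  if (latenciesMs.length === 0) {
    throw new Error("at least one latency sample is required");
  }
  const sorted = latenciesMs.slice().sort((a, b) => a - b);
  const totalMs = sorted.reduce((sum, ms) => sum + ms, 0);
  return {
    medianMs: median(sorted),
    p90Ms: nearestRank(sorted, 90),
    // Assumes runs execute back to back, so throughput is count / total time.
    inferencesPerSecond: (sorted.length * 1000) / totalMs,
  };
}
```

Because download time is never added to `latenciesMs`, a slow network changes none of these numbers, which matches the separation described above.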

Why ONNX is used

ONNX is a portable format for trained machine-learning models. It lets the same MobileNet v2 model run through different browser execution backends, making the CPU and WebGPU comparison more meaningful than a synthetic loop.
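To make the "same model, different backends" idea concrete, here is a minimal sketch of how a benchmark might choose session options per backend. The `Backend` type, the function names, and the `hasWebGpu` flag are assumptions made for illustration. Only the execution provider names `"wasm"` and `"webgpu"` come from ONNX Runtime Web.

```typescript
type Backend = "cpu" | "webgpu";

// "wasm" and "webgpu" are ONNX Runtime Web execution provider names.
// The CPU path in the browser runs through WebAssembly.
function sessionOptionsFor(backend: Backend): { executionProviders: string[] } {
  return { executionProviders: [backend === "webgpu" ? "webgpu" : "wasm"] };
}

// Decides which backends to benchmark. `hasWebGpu` is a hypothetical flag
// the caller derives from feature detection (for example, whether
// navigator.gpu exists and an adapter request succeeds).
function planBackends(hasWebGpu: boolean): Backend[] {
  return hasWebGpu ? ["cpu", "webgpu"] : ["cpu"];
}
```

In ONNX Runtime Web, options of this shape are passed as the second argument to `InferenceSession.create`, alongside the model URL. Using the same model file for every planned backend keeps the comparison like for like.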

How to interpret the result

Lower inference latency means the browser completes each model run faster.
More inferences per second means better sustained AI throughput.
A "WebGPU unavailable" result can reflect missing browser, driver, operating-system, or policy support rather than weak hardware.
Laptop power mode and background applications can change results between runs.
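The reading rules above can be sketched as a small helper. This is an illustration under stated assumptions, not the benchmark's implementation: the `BackendResult` shape is hypothetical, and the speedup is assumed to be the ratio of median latencies (CPU divided by WebGPU), which the benchmark may define differently.

```typescript
// Hypothetical result shape for one backend; not BottleneckRadar's schema.
interface BackendResult {
  status: "ok" | "unavailable" | "failed";
  medianMs?: number;
}

// Returns the WebGPU speedup as a ratio of median latencies (CPU / WebGPU),
// or null when either backend did not finish successfully.
function webGpuSpeedup(cpu: BackendResult, webgpu: BackendResult): number | null {
  if (cpu.status !== "ok" || webgpu.status !== "ok") return null;
  if (!cpu.medianMs || !webgpu.medianMs) return null;
  return cpu.medianMs / webgpu.medianMs;
}

// Produces a one-line reading of the comparison.
function describeComparison(cpu: BackendResult, webgpu: BackendResult): string {
  const speedup = webGpuSpeedup(cpu, webgpu);
  if (speedup !== null) {
    return "WebGPU speedup: " + speedup.toFixed(2) + "x";
  }
  if (webgpu.status === "unavailable") {
    // Missing support is a software or policy condition, not a hardware score.
    return "WebGPU unavailable (browser, driver, OS, or policy); not a hardware verdict";
  }
  return "No speedup reported: both backends must finish successfully";
}
```

A ratio above 1 means WebGPU had the lower median latency. Since power mode and background load shift both medians, comparing several runs gives a steadier picture than a single one.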

The model input and benchmark output stay on the device. The first run downloads the approximately 13.3 MB model, while later runs can use the browser cache.

Run the Browser AI Benchmark
