The Device Offload Efficiency (DOE) is defined as the ratio of the total useful computation time on the CPUs in use to the sum of that time and the total host idle time (summed over all CPUs) attributable to managing the targeted accelerator devices.
\[DOE = \frac{TotalUseful}{TotalUseful + TotalWaitForDevices}\]
This metric accounts for the time during which CPUs perform no useful computation because they are launching kernels and waiting for them to complete, sending data to accelerators, and/or waiting for data.
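As a rough illustration of the formula, the following sketch computes DOE from per-CPU timings. The function name `device_offload_efficiency`, the variable names, and the timing values are hypothetical and chosen only for this example; in practice the two totals would come from a trace or profiling tool.

```python
def device_offload_efficiency(total_useful: float,
                              total_wait_for_devices: float) -> float:
    """Return DOE = TotalUseful / (TotalUseful + TotalWaitForDevices)."""
    if total_useful < 0 or total_wait_for_devices < 0:
        raise ValueError("times must be non-negative")
    denominator = total_useful + total_wait_for_devices
    if denominator == 0:
        raise ValueError("no measured time: DOE is undefined")
    return total_useful / denominator


# Hypothetical timings in seconds for four CPUs (made-up values).
useful_per_cpu = [8.0, 7.0, 9.0, 6.0]   # useful computation on the host
wait_per_cpu = [2.0, 3.0, 1.0, 4.0]     # idle time spent managing devices

# Both terms of the formula are totals summed over all CPUs.
doe = device_offload_efficiency(sum(useful_per_cpu), sum(wait_per_cpu))
print(f"DOE = {doe:.2f}")  # 30 / (30 + 10) = 0.75
```

A value of 1.0 would mean the host never idles on behalf of the devices, while values close to 0 indicate that the CPUs spend most of their time launching kernels, transferring data, or waiting.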
To fully understand the formulas, see also the glossary of the metrics terms.
Related patterns: Avoidable transfers between host and GPU for MPI communication (GPU-Unaware MPI)