This version implements a distributed matrix-matrix multiplication where computation is spread across GPUs and data is exchanged using MPI collectives. Two variants are included: one using explicit data transfers between host and GPU, and the other leveraging the GPU-aware features of common MPI libraries.
Without GPU-aware MPI, data needs to reside in host memory for MPI communication. Hence, whenever data computed on the GPUs has to be aggregated on a single process, it must first be transferred from GPU to CPU. This variant can be executed with
mpirun matmul KERNEL UNAWARE
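
A minimal sketch of this staging pattern is shown below. The buffer names, sizes, and the choice of MPI_Reduce as the collective are illustrative assumptions, not the exercise's actual code:

```c
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

/* Without GPU-aware MPI: copy the device result to a host buffer
 * before handing it to the MPI collective (names are hypothetical). */
void gather_result_unaware(const double *d_partial, double *h_result,
                           size_t n, int root, MPI_Comm comm)
{
    /* Stage the device data in host memory first... */
    double *h_partial = (double *)malloc(n * sizeof(double));
    cudaMemcpy(h_partial, d_partial, n * sizeof(double),
               cudaMemcpyDeviceToHost);

    /* ...then communicate using host pointers only. */
    MPI_Reduce(h_partial, h_result, (int)n, MPI_DOUBLE, MPI_SUM, root, comm);

    free(h_partial);
}
```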
With GPU-aware MPI, the application developer no longer needs to issue explicit data transfers. Instead, device buffers can be used directly in MPI calls. Data transfers between host and GPU are either removed entirely or optimized by the MPI library, leading to improved runtime performance. The GPU-aware MPI variant can be executed with
mpirun matmul KERNEL AWARE