GPU SAXPY

Program's name: GPU SAXPY
Available version(s):

GPU SAXPY with optimal cpu binding

GPU SAXPY with default cpu binding

Programming language(s): C
Programming model(s): MPI · OpenMP

This repository contains a small application that reproduces performance issues caused by GPU affinity. It repeatedly performs the single-precision "a times x plus y" (SAXPY) operation \(Y \leftarrow a X + Y\), where a is a scalar and X and Y are vectors.
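As an illustration of the operation described above, the following is a minimal sketch of a SAXPY loop offloaded with an OpenMP target construct. It is not taken from the repository: the function name saxpy and its signature are assumptions made for this example. If the compiler has no offloading support, the pragma is ignored or falls back to the host, and the loop produces the same result.

```c
#include <stddef.h>

/* Illustrative sketch, not the repository's kernel.
 * Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * x is copied to the device; y is copied to the device and back. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

No device clause is given here, so the OpenMP runtime offloads to its default device.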

It uses OpenMP target offloading to perform this operation on the GPU. It also uses MPI to divide large vectors into smaller parts that are processed in parallel. Each MPI rank uses one device to compute its partial result.
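The exact decomposition used by the application is not described here. One common way to split a vector of n elements across ranks is a contiguous block decomposition, sketched below. The type block_range and the function rank_block are hypothetical names. In an MPI program, the rank and nranks arguments would typically come from MPI_Comm_rank and MPI_Comm_size.

```c
#include <stddef.h>

/* Half-open index range [begin, end) owned by one rank (hypothetical type). */
typedef struct {
    size_t begin;
    size_t end;
} block_range;

/* Assumed block decomposition: every rank gets n / nranks elements and the
 * first n % nranks ranks get one extra, so block sizes differ by at most one.
 * Requires nranks > 0 and 0 <= rank < nranks. */
block_range rank_block(size_t n, int rank, int nranks)
{
    size_t base = n / (size_t)nranks;
    size_t rem = n % (size_t)nranks;
    size_t r = (size_t)rank;
    block_range b;

    b.begin = r * base + (r < rem ? r : rem);
    b.end = b.begin + base + (r < rem ? 1 : 0);
    return b;
}
```

Each rank would then run SAXPY only on the elements in its own range, which is what allows one device per rank to work independently.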

This kernel has two versions:

The version on the main branch uses no CPU binding and does not specify the device to which the SAXPY kernel is offloaded. It therefore has no control over which CPU offloads to which device.
The version on the cpu-binding branch uses a device clause to select the device to which the kernel is offloaded. In addition, the --cpu-bind option of the srun command specifies which NUMA domain each task runs on.
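The difference between the two versions can be illustrated with a short sketch of the device clause. This is an assumption about how such a kernel might look, not the code on the cpu-binding branch. The names visible_devices, device_for_rank and saxpy_on_device are hypothetical, and the modulo mapping is an added safeguard for the case where there are more ranks than devices.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Number of offload devices seen by the OpenMP runtime (0 without OpenMP). */
int visible_devices(void)
{
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

/* Hypothetical mapping: rank i uses device i, wrapping around if there are
 * more ranks than devices. Returns 0 when no device is available. */
int device_for_rank(int rank, int num_devices)
{
    return num_devices > 0 ? rank % num_devices : 0;
}

/* Same SAXPY loop as before, but the device clause pins the offload target
 * instead of leaving the choice to the runtime's default device. */
void saxpy_on_device(int dev, size_t n, float a, const float *x, float *y)
{
    (void)dev; /* unused when compiled without OpenMP */
    #pragma omp target teams distribute parallel for device(dev) \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Selecting the device only fixes one side of the affinity problem. Which cores the offloading process runs on is decided by the job launcher, which is why the cpu-binding branch also relies on the --cpu-bind option of srun.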

Build instructions

The Clang compiler with OpenMP target offloading support is required, along with an MPI implementation.

Before building, edit line 2 of src/Makefile so that the GPU architecture matches the hardware on your target system. The default value is sm_90, which matches NVIDIA H100 GPUs.

To build the program, navigate to the src folder and run make:

$ cd src
$ make

This will generate a binary called kernel.exe.

Executing the program

You can execute the kernel by running:

$ mpirun -np <number of ranks> ./kernel.exe <size of vectors x and y>

The number of ranks should match the number of GPUs in the system. Rank i will offload to device i.

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 676553 (POP1) and 824080 (POP2).

Currently, the project receives funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101143931 (POP3). The JU receives support from the European Union's Horizon Europe research and innovation programme and Spain, Germany, France, Portugal and the Czech Republic.