CPU to GPU

Program's name: CPU to GPU
Available version(s):

CPU to GPU, CPU version

CPU to GPU, OpenCL version

Programming language(s): C++
Programming model(s): OpenCL

This kernel solves the 3D diffusion equation. There are currently three implementations: cpu_diffusion, which uses a single CPU core; cpu_openmp_diffusion, which uses multiple CPU cores via OpenMP; and opencl_diffusion, where the iterations are computed on the GPU while the CPU launches kernels and manages the data transfer between MPI ranks.

First, the initial state is set and stored in field u. The diffusion is then computed for the given number of iterations. The MPI ranks are connected in the x-direction, with rank 0 located at lower x-coordinates than rank 1. The cells at xmax of rank 0 are used as ghost cells for rank 1, while the cells at xmin of rank 1 are used as ghost cells for rank 0.
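The ghost cell layout described above can be illustrated with a small sketch. It emulates both ranks inside a single process and copies the planes directly, so it contains no MPI calls. The Block type, the one-plane ghost width and the x-slowest memory layout are assumptions made for illustration; they are not taken from the kernel sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical local block of one rank: nx interior planes in x plus one
// ghost plane on each x side, so the plane index x runs from 0 to nx + 1.
struct Block {
    std::size_t nx, ny, nz;
    std::vector<double> u;

    Block(std::size_t nx_, std::size_t ny_, std::size_t nz_, double value)
        : nx(nx_), ny(ny_), nz(nz_), u((nx_ + 2) * ny_ * nz_, value) {}

    // x is the slowest index, so each yz-plane is contiguous in memory.
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        return (x * ny + y) * nz + z;
    }
};

// Copy the yz-plane at plane index sx of src into plane index dx of dst.
void copy_plane(const Block& src, std::size_t sx, Block& dst, std::size_t dx) {
    assert(src.ny == dst.ny && src.nz == dst.nz);
    const std::size_t plane = src.ny * src.nz;
    for (std::size_t i = 0; i < plane; ++i) {
        dst.u[dx * plane + i] = src.u[sx * plane + i];
    }
}

// Emulated exchange between the two ranks (a real run would use MPI here).
void exchange_ghosts(Block& rank0, Block& rank1) {
    // Cells at xmax of rank 0 become the low-x ghost cells of rank 1.
    copy_plane(rank0, rank0.nx, rank1, 0);
    // Cells at xmin of rank 1 become the high-x ghost cells of rank 0.
    copy_plane(rank1, 1, rank0, rank0.nx + 1);
}
```

In the actual kernel the two copies would be replaced by a send and a receive on each rank. With this layout a whole plane is one contiguous buffer, which is one reason a sketch like this puts x as the slowest index.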

Each iteration starts with the exchange of ghost cells via non-blocking MPI. The field holding the Laplacian is then initialized, and the Laplacian of field u is computed. Finally, u is updated using the equation u(t+dt)=u(t)+dt*Laplace(u(t)).
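The per-iteration work (initialize the Laplacian field, compute the Laplacian of u, update u) can be sketched for a single rank as follows. This is a minimal illustration, not the kernel's own code: the Grid type, the 7-point stencil, the unit grid spacing and the untouched outermost cell layer are all assumptions, and the ghost cell exchange is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cubic grid with n cells per direction, stored as a flat array.
struct Grid {
    std::size_t n;
    std::vector<double> data;

    explicit Grid(std::size_t n_) : n(n_), data(n_ * n_ * n_, 0.0) {}

    double& at(std::size_t x, std::size_t y, std::size_t z) {
        return data[(x * n + y) * n + z];
    }
    double at(std::size_t x, std::size_t y, std::size_t z) const {
        return data[(x * n + y) * n + z];
    }
};

// One explicit step: u(t+dt) = u(t) + dt * Laplace(u(t)).
// Assumes unit grid spacing; the outermost layer of cells is left unchanged.
void diffusion_step(Grid& u, Grid& lap, double dt) {
    assert(u.n == lap.n);
    const std::size_t n = u.n;

    // Step 1: initialize the field holding the Laplacian.
    std::fill(lap.data.begin(), lap.data.end(), 0.0);

    // Step 2: 7-point stencil on all interior cells.
    for (std::size_t x = 1; x + 1 < n; ++x) {
        for (std::size_t y = 1; y + 1 < n; ++y) {
            for (std::size_t z = 1; z + 1 < n; ++z) {
                lap.at(x, y, z) = u.at(x - 1, y, z) + u.at(x + 1, y, z)
                                + u.at(x, y - 1, z) + u.at(x, y + 1, z)
                                + u.at(x, y, z - 1) + u.at(x, y, z + 1)
                                - 6.0 * u.at(x, y, z);
            }
        }
    }

    // Step 3: update u in place.
    for (std::size_t i = 0; i < u.data.size(); ++i) {
        u.data[i] += dt * lap.data[i];
    }
}
```

Keeping the Laplacian in a separate field means every stencil evaluation reads values of u from the same time level. Under the assumptions of this sketch (unit spacing, unit diffusion coefficient), the explicit scheme is stable only for dt up to 1/6.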

Getting started

Prerequisites

To build and run this kernel you will need:

MPI library
C++ compiler
GPU with OpenCL runtime (opencl_diffusion only)

Building and running the kernel

The kernel can be compiled with the provided Makefile via

make cpu_diffusion|cpu_openmp_diffusion|opencl_diffusion|clean

The resulting executables are run with mpirun; the number of ranks must be two. The program requires two arguments:

n: number of elements in each direction, i.e. n*n*n elements will be used
cycles: number of iterations to perform

To run the kernel, execute

mpirun -np 2 ./cpu_diffusion <n> <cycles>

For example: mpirun -np 2 ./cpu_diffusion 40 1000

At the end of the computation, each rank reports the rate at which work items were processed. A higher value indicates better performance.
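The text does not state how the reported rate is defined. One plausible reading, shown here purely as an assumption, is the number of cell updates divided by the elapsed wall-clock time. The function name work_items_per_second is hypothetical, and whether a rank counts all n*n*n cells or only its own share is not specified above.

```cpp
#include <cassert>
#include <cstddef>

// Assumed definition: (cells per iteration * iterations) / elapsed seconds.
// The kernel's actual definition of a "work item" may differ.
double work_items_per_second(std::size_t n, std::size_t cycles, double seconds) {
    assert(seconds > 0.0);
    const double items = static_cast<double>(n) * static_cast<double>(n)
                       * static_cast<double>(n) * static_cast<double>(cycles);
    return items / seconds;
}
```

Under this reading, the example run with n = 40 and 1000 cycles performs 64,000,000 cell updates in total, so a run taking two seconds would correspond to 32,000,000 work items per second.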

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 676553 (POP1) and 824080 (POP2).

Currently, the project receives funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101143931 (POP3). The JU receives support from the European Union's Horizon Europe research and innovation programme and Spain, Germany, France, Portugal and the Czech Republic.