This version of the code removes the need to communicate the result of the force calculation to another MPI process by repeating the calculation on the receiving process. Thus, \({\bf F}_{i,j}\) is calculated on process \(p\) and \({\bf F}_{j,i}\) is calculated on process \(q\).
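The idea of recomputing a pair force instead of sending it can be sketched in plain C, without MPI calls. This is a minimal illustration under stated assumptions, not the kernel's actual code: the `vec3` type, the functions `pair_force` and `local_forces`, and the simple 1/r pair force are all hypothetical, and every process is assumed to already hold the positions of all atoms.

```c
#include <stddef.h>

/* Hypothetical 3-vector type, used only for this sketch. */
typedef struct { double x, y, z; } vec3;

/* Illustrative pair force on atom i due to atom j, with magnitude 1/r.
   This is an assumption chosen for brevity; the force field in
   md_mpi_comm.c may be different. Swapping the arguments flips the sign. */
static vec3 pair_force(vec3 ri, vec3 rj)
{
    double dx = ri.x - rj.x, dy = ri.y - rj.y, dz = ri.z - rj.z;
    double r2 = dx * dx + dy * dy + dz * dz;
    vec3 f = { dx / r2, dy / r2, dz / r2 };
    return f;
}

/* Accumulate the total force on the locally owned atoms [lo, hi).
   The inner loop runs over all n atoms, so the owner of atom i computes
   F_ij itself and the owner of atom j computes F_ji itself. No force
   value has to be sent between processes; only positions are shared. */
static void local_forces(const vec3 *pos, size_t n,
                         size_t lo, size_t hi, vec3 *force)
{
    for (size_t i = lo; i < hi; ++i) {
        vec3 acc = { 0.0, 0.0, 0.0 };
        for (size_t j = 0; j < n; ++j) {
            if (j == i) continue;          /* skip self-interaction */
            vec3 f = pair_force(pos[i], pos[j]);
            acc.x += f.x;
            acc.y += f.y;
            acc.z += f.z;
        }
        force[i - lo] = acc;               /* indexed from the local offset */
    }
}
```

In this sketch, process \(p\) would call `local_forces` with its own range `[lo, hi)` and process \(q\) with a different range. Each pair force is therefore evaluated twice in total, which trades extra arithmetic for fewer messages.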
Code purpose:
md_mpi_comm.c can be used to demonstrate that poor MPI Transfer efficiency may occur under a specific set of conditions when using a simple MPI_SEND/MPI_RECV strategy to split the force calculation among processes.
How to use:
The Makefile command make generates an executable file named md_mpi_comm.exe using the GNU compiler. To run the code, first define the number of time steps and atoms to be used and then launch the application on a specific number of MPI processes. For example, replace NUMSTEPS, NUMATOMS and NUMPROC in the following command, where these are respectively the number of time steps, atoms, and MPI processes.
mpirun -n <NUMPROC> ./md_mpi_comm.exe NUMSTEPS NUMATOMS
Default values are assigned if the number of iterations is not provided or is less than 0.
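The fallback behaviour described above could be implemented along the following lines. This is only a sketch: the helper name `arg_or_default` is hypothetical, the actual default values used by md_mpi_comm.c are not stated here, and treating non-numeric input as "not provided" is an assumption.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: return the integer found at argv[idx], or
   `fallback` when the argument is missing, not a number, out of the
   range of int, or less than 0. */
static int arg_or_default(int argc, char **argv, int idx, int fallback)
{
    if (idx >= argc) {
        return fallback;                   /* argument not provided */
    }
    char *end;
    long v = strtol(argv[idx], &end, 10);
    if (end == argv[idx] || v < 0 || v > INT_MAX) {
        return fallback;                   /* unparsable, negative, or too large */
    }
    return (int)v;
}
```

With the argument order of the command line above, the number of time steps would be read with `arg_or_default(argc, argv, 1, ...)` and the number of atoms with `arg_or_default(argc, argv, 2, ...)`, passing whatever defaults the kernel uses as the last argument.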
Screen output similar to the following will be generated:
> POP WP7 kernel
> Version of code: original version with performance bottleneck
> Implements Pattern: High weighted communication in between ranks
> Problem size: NUMSTEPS = 10 TOTATOMS = 2000
> Kernel wall time (integration) = 20.12