RankDLB demonstrates performance issues arising in programs where the computational load per MPI rank evolves over time and therefore creates a load imbalance among MPI ranks. The computational problem must contain a coupling between MPI ranks where data is exchanged between ranks after the computation of a single iteration has completed.
The behaviour described above is best illustrated for discrete particle codes where numerical particles travel through space. In these codes, particles are usually located in cells and an iteration corresponds to a time step. Typical computations contain millions of time steps and millions of cells. The computational domain, i.e. the physical space is partitioned and each MPI rank processes the molecules in one partition. The data exchanged corresponds to molecules traveling across partition boundaries. The computation of the next cycle can only start after the exchange of molecules has completed. A load imbalance occurs when molecules accumulate in one partition. The load imbalance forces those ranks with less work to wait for the ranks with more work in order to complete the data exchange, CPU time is wasted.
The load can be balanced by moving cells and the molecules in those cell among MPI ranks. Dynamic load balancing reduces the wait time at the end of each iteration to a minimum but introduces additional computational cost because data structures to be updated. Therefore a compromise between the cost of load balancing and the gained time must be made. Usually, dynamic load balancing is performed when the CPU time lost due to load imbalance exceeds a given fraction of the total CPU time in a single iteration.