
MPI endpoint contention

MPI processes often have to communicate with a list of neighbours. Depending on the order of the send and receive calls, many processes may become “synchronized”, in the sense that all of them try to send to the same destination at the same time. The limited incoming bandwidth at that destination then becomes the bottleneck for the overall communication performance.

The pattern arises in the code structure sketched below. This way of programming communications is fairly typical of many codes.

rank_id_t neighbors[N]; // ordered list of neighbors of this rank

for (int i=0; i < N; i++) {
   send(neighbors[i]);
}

Here, neighbors is the list of neighbors of the process and N is the number of neighbors. Typically the list is ordered from lower-rank to higher-rank neighbors. Rank 0 is then the first entry in the list of every process that has it as a neighbor, so all neighbors of rank 0 send their first message to it, overloading its receive bandwidth.
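The effect can be made concrete with a small counting model that needs no MPI. The sketch below assumes the simplest possible topology, in which every rank is a neighbour of every other rank, and counts how many messages target the same destination in a given send step. The helper names `sorted_neighbors` and `max_fan_in_ordered` are hypothetical and introduced only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: fill `out` with the neighbours of `rank` in ascending
 * order, ASSUMING every other rank in [0, nranks) is a neighbour.
 * Returns the number of neighbours written. */
static int sorted_neighbors(int rank, int nranks, int *out)
{
    int n = 0;
    for (int r = 0; r < nranks; r++) {
        if (r != rank) {
            out[n++] = r;
        }
    }
    return n;
}

/* Largest number of messages aimed at one destination during send step
 * `step`, when each rank walks its ascending neighbour list in order.
 * Returns 0 for an empty job and -1 if memory allocation fails. */
static int max_fan_in_ordered(int nranks, int step)
{
    if (nranks <= 0 || step < 0) {
        return 0;
    }
    int *incoming = calloc((size_t)nranks, sizeof *incoming);
    int *nbrs = malloc((size_t)nranks * sizeof *nbrs);
    if (incoming == NULL || nbrs == NULL) {
        free(incoming);
        free(nbrs);
        return -1;
    }

    /* Each rank sends to the step-th entry of its list, if it has one. */
    for (int rank = 0; rank < nranks; rank++) {
        int n = sorted_neighbors(rank, nranks, nbrs);
        if (step < n) {
            incoming[nbrs[step]]++;
        }
    }

    int worst = 0;
    for (int r = 0; r < nranks; r++) {
        if (incoming[r] > worst) {
            worst = incoming[r];
        }
    }
    free(incoming);
    free(nbrs);
    return worst;
}
```

Under this assumed topology with 8 ranks, `max_fan_in_ordered(8, 0)` evaluates to 7: ranks 1 to 7 all target rank 0 in the first step, while rank 0 alone targets rank 1. In the second step the hot spot moves to rank 1, which receives 6 messages at once.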

Recommended best-practice(s):

Re-schedule communications
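As a rough illustration of what re-scheduling could look like, one possible approach (an assumption made here, not something spelled out on this page) is for each rank to start with its first neighbour of higher rank and wrap around to the lower-rank neighbours at the end. The function name `rescheduled_order` is hypothetical.

```c
/* Copy an ascending neighbour list into `out`, rotated so that it starts
 * with the first neighbour whose rank is greater than `my_rank` and then
 * wraps around to the lower-rank neighbours.
 * ASSUMPTION: `sorted` holds n distinct ranks in ascending order and does
 * not contain `my_rank`; `out` has room for n entries. */
static void rescheduled_order(const int *sorted, int n, int my_rank, int *out)
{
    /* Find the first neighbour with a higher rank than ours. */
    int start = 0;
    while (start < n && sorted[start] < my_rank) {
        start++;
    }
    /* If there is none (start == n), the modulo keeps the original order. */
    for (int i = 0; i < n; i++) {
        out[i] = sorted[(start + i) % n];
    }
}
```

For example, rank 2 with sorted neighbours {0, 1, 3, 4} would send in the order 3, 4, 0, 1. The send loop then iterates over `out` instead of `neighbors`. In the special case where every rank neighbours every other rank, rank r sends to (r + 1 + i) mod P in step i for P ranks, so each destination receives exactly one message per step instead of up to P - 1.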

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 676553 (POP1) and 824080 (POP2).

Currently, the project receives funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101143931 (POP3). The JU receives support from the European Union's Horizon Europe research and innovation programme and Spain, Germany, France, Portugal and the Czech Republic.