Re-schedule communications

Pattern addressed: MPI endpoint contention

MPI processes often have to communicate with a list of neighbors. Depending on the order of the send and receive calls, many processes can become “synchronized”, all trying to send to the same destination at the same time. The limited incoming bandwidth at that destination then becomes the bottleneck for overall communication performance.
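To make the effect concrete, here is a minimal sketch under a deliberately simplified model, not a measurement of any real MPI run. It assumes every rank has all other ranks as neighbors, that all ranks walk their list in ascending rank order, and that sends proceed in lockstep, one per step. The function name `senders_to_dest_naive` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical lockstep model: every rank in 0..nprocs-1 has all other
 * ranks as neighbors and visits them in ascending rank order, one send
 * per step. Returns how many ranks target `dest` at step `step`. */
size_t senders_to_dest_naive(size_t nprocs, size_t step, size_t dest)
{
    size_t count = 0;
    for (size_t rank = 0; rank < nprocs; rank++) {
        /* The neighbor list of `rank` is 0..nprocs-1 with `rank` itself
         * removed, so the entry at position `step` skips over `rank`. */
        size_t target = (step < rank) ? step : step + 1;
        if (target < nprocs && target == dest)
            count++;
    }
    return count;
}
```

Under these assumptions, with 8 ranks, `senders_to_dest_naive(8, 0, 0)` is 7 while `senders_to_dest_naive(8, 0, 1)` is 1: in the first step all ranks but one target rank 0. Real executions are not in lockstep, so this only illustrates the tendency.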

A simple way to address the issue is to order each process's neighbor list so that such endpoint contention is avoided. Optimal communication schedules can be computed, but in practice a simple heuristic will probably reduce the effect: each process starts with the first neighbor whose rank is higher than its own, proceeds through the list in ascending rank order, and wraps around to its lowest-ranked neighbor once the end of the list is reached.

rank_id_t neighbors[N]; // list of neighbors of this rank, sorted by ascending rank

// Index of the first neighbor with rank greater than myRank.
// The modulo below keeps the traversal in bounds for any start index.
int next_neigh = search_idx(neighbors, myRank);

for (int i = 0; i < N; i++) {
   send(neighbors[(next_neigh + i) % N]); // circular traversal of the list of neighbors
}
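The traversal above can be sketched as compilable C, shown here without MPI so that only the ordering logic is in view. This is one possible implementation under stated assumptions, not the authors' code: ranks are plain `int` values, the neighbor list is sorted in ascending rank order, and the names `first_higher_neighbor` and `build_send_order` are hypothetical stand-ins for `search_idx` and the send loop.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first entry in the ascending-sorted list `neighbors`
 * whose rank is greater than my_rank; returns n when there is none. */
size_t first_higher_neighbor(const int *neighbors, size_t n, int my_rank)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {                      /* binary search on the sorted list */
        size_t mid = lo + (hi - lo) / 2;
        if (neighbors[mid] > my_rank)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

/* Fill `order` (capacity n) with the neighbors in circular order,
 * starting at the first neighbor ranked above my_rank and wrapping
 * around to the lowest-ranked neighbor. */
void build_send_order(const int *neighbors, size_t n, int my_rank, int *order)
{
    assert(n == 0 || (neighbors != NULL && order != NULL));
    size_t start = first_higher_neighbor(neighbors, n, my_rank);
    for (size_t i = 0; i < n; i++)
        order[i] = neighbors[(start + i) % n];   /* start == n wraps to 0 */
}
```

For a rank 3 with neighbors {0, 2, 5, 7}, `build_send_order` yields the order 5, 7, 0, 2. In an MPI code, each entry of `order` would then serve as the destination rank of the corresponding send call, for example `MPI_Isend`.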

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 676553 (POP1) and 824080 (POP2).

Currently, the project receives funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101143931 (POP3). The JU receives support from the European Union's Horizon Europe research and innovation programme and Spain, Germany, France, Portugal and the Czech Republic.