Communication imbalance with improved domain decomposition

The proposed best practice statically addresses the communication imbalance by reducing the computational load on processes that have more neighbors. Here, the computation cost on rank 0 is drastically reduced and its load is distributed among the other processes. The result is shown in the following figure.


The pictures show how heavily underloading one process can result in higher global performance, provided the majority of processes can be kept balanced and with less communication.
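
As a rough illustration of the idea, the sketch below splits a fixed number of work items among ranks in proportion to a per-rank weight, so that a rank with many neighbors can be given a smaller share of the computation. The function name `weighted_partition`, the eight-rank setup, and the weight of 0.2 for rank 0 are assumptions chosen for the example and are not taken from the measured case; in practice the weights would have to be tuned from traces of the actual application.

```python
def weighted_partition(total_items, weights):
    """Split total_items among ranks in proportion to weights.

    Uses largest-remainder rounding so the counts sum exactly to total_items.
    """
    total_weight = sum(weights)
    # Ideal (fractional) share of each rank.
    exact = [total_items * w / total_weight for w in weights]
    # Start from the integer part of each share.
    counts = [int(x) for x in exact]
    leftover = total_items - sum(counts)
    # Give the remaining items to the ranks with the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda r: exact[r] - counts[r],
                   reverse=True)
    for r in order[:leftover]:
        counts[r] += 1
    return counts


# Hypothetical setup: 8 ranks, rank 0 has the most neighbors and is
# deliberately underloaded (weight 0.2 instead of 1.0).
weights = [0.2] + [1.0] * 7
counts = weighted_partition(1000, weights)

for rank, n in enumerate(counts):
    print(f"rank {rank}: {n} items")
```

Because the decomposition is static, the weights are fixed before the run; the remaining ranks stay balanced among themselves while rank 0 keeps time free for its extra communication.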