This version uses MPI_Alltoallv instead of the custom-built communication based on MPI_Issend and MPI_Irecv. The code is much shorter, but it exposes no opportunity to overlap communication with data processing.
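As a rough illustration of the collective approach described above, the following sketch shows how a variable-sized exchange might be set up with MPI_Alltoallv. It is not the benchmark's actual code: the helper `build_displs`, the `USE_MPI` guard, and the workload (each rank sends `dest + 1` integers to every destination) are assumptions made for this example only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: exclusive prefix sum over per-rank counts.
   displs[i] becomes the offset of the block for rank i.
   Returns the total number of elements across all blocks. */
int build_displs(const int *counts, int *displs, int nranks)
{
    assert(nranks > 0);
    int total = 0;
    for (int i = 0; i < nranks; i++) {
        displs[i] = total;
        total += counts[i];
    }
    return total;
}

#ifdef USE_MPI /* assumed build line: mpicc -DUSE_MPI example.c */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    int *scounts = malloc(nranks * sizeof(int));
    int *rcounts = malloc(nranks * sizeof(int));
    int *sdispls = malloc(nranks * sizeof(int));
    int *rdispls = malloc(nranks * sizeof(int));

    /* Assumed workload: send (dest + 1) ints to each destination rank. */
    for (int d = 0; d < nranks; d++)
        scounts[d] = d + 1;

    /* Each rank learns how many elements it will receive from every peer. */
    MPI_Alltoall(scounts, 1, MPI_INT, rcounts, 1, MPI_INT, MPI_COMM_WORLD);

    int stotal = build_displs(scounts, sdispls, nranks);
    int rtotal = build_displs(rcounts, rdispls, nranks);

    int *sendbuf = malloc(stotal * sizeof(int));
    int *recvbuf = malloc(rtotal * sizeof(int));
    for (int i = 0; i < stotal; i++)
        sendbuf[i] = rank;

    /* One blocking call replaces the hand-written send/receive loops.
       It returns only once the local buffers are complete, so no data
       processing can be interleaved with the exchange here. */
    MPI_Alltoallv(sendbuf, scounts, sdispls, MPI_INT,
                  recvbuf, rcounts, rdispls, MPI_INT, MPI_COMM_WORLD);

    free(sendbuf); free(recvbuf);
    free(scounts); free(rcounts);
    free(sdispls); free(rdispls);
    MPI_Finalize();
    return 0;
}
#endif
```

The sketch makes the trade-off visible: the whole exchange is a single call, but the application regains control only after it finishes. MPI-3 also defines a nonblocking counterpart, MPI_Ialltoallv, which returns a request that can be completed later with MPI_Wait; whether any overlap actually happens then depends on the MPI implementation.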
SW Co-design: Could the MPI implementation itself overlap communication and data processing while performing the collective?