The following pseudocode shows the structure of the main loop over bands, which processes NTG bands per iteration:
DO I = 1, NB, NTG
    CALL pack NTG bands
    CALL multi-band FW-FFT along Z
    CALL multi-band Scatter
    CALL multi-band FW-FFT along XY
    CALL VOFR
    CALL multi-band BW-FFT along XY
    CALL multi-band Scatter
    CALL multi-band BW-FFT along Z
    CALL unpack NTG bands
END DO
The communication is split into two parts. The first takes place in the pack/unpack routines, where the G-vectors are redistributed among the processes belonging to the different task groups. The second takes place in the scatter between the 1D and 2D FFTs, where the data is moved from the 1D pencils to the 2D planes with an MPI_Alltoall. It is important to stress that this second communication takes place only within a task group.
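To make the two communication patterns concrete, the following Python sketch enumerates which bands each loop iteration handles and which ranks would talk to each other in each phase. It is only an illustration under stated assumptions, not the program's actual layout. The function names `band_batches` and `communication_groups` are hypothetical. The mapping of ranks to task groups (contiguous blocks of equal size) is assumed. The truncation of the last batch when NB is not a multiple of NTG is also an assumption, since the text does not say how that case is handled.

```python
def band_batches(nb, ntg):
    """Yield the 1-based band indices handled by each pass of DO I = 1, NB, NTG.

    Assumption: a final, incomplete batch is simply truncated.
    """
    for i in range(1, nb + 1, ntg):
        yield list(range(i, min(i + ntg, nb + 1)))


def communication_groups(nproc, ntg):
    """Return (scatter_groups, pack_groups) as lists of rank lists.

    Assumed layout (hypothetical): task group t holds the contiguous ranks
    t*size .. (t+1)*size - 1, where size = nproc // ntg.
    """
    if nproc % ntg != 0:
        raise ValueError("this sketch requires nproc to be a multiple of ntg")
    size = nproc // ntg
    # Scatter between the 1D and 2D FFTs: only ranks of the same task group.
    scatter_groups = [list(range(t * size, (t + 1) * size)) for t in range(ntg)]
    # Pack/unpack: ranks at the same position in the different task groups.
    pack_groups = [list(range(p, nproc, size)) for p in range(size)]
    return scatter_groups, pack_groups


if __name__ == "__main__":
    for batch in band_batches(nb=10, ntg=4):
        print("bands in this iteration:", batch)
    scatter, pack = communication_groups(nproc=8, ntg=4)
    print("scatter (within task group):", scatter)
    print("pack/unpack (across task groups):", pack)
```

With 8 processes and 4 task groups, each scatter group in this sketch contains 2 ranks, while each pack/unpack group contains 4 ranks, one from every task group. Under this assumed layout, the MPI_Alltoall in the scatter involves fewer processes than it would without task groups.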