This is the baseline version of the pils program. The program iterates over
a sequence of phases separated by MPI_Barrier() calls. Each phase executes a
given number of parallel regions (determined by the grain parameter), and the
main body of each region is a loop whose iterations each take task-duration.
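
The nesting described above (phases, then parallel regions, then loop
iterations) can be sketched in C as follows. This is only an illustrative
outline, not the actual pils source: the names run_phases, pils_counts and
hook_fn are hypothetical, the number of regions per phase is passed in
directly as a stand-in for whatever value the grain parameter yields, and the
barrier and the per-iteration work are supplied as callbacks so that the
sketch compiles without MPI. In an MPI build the barrier hook would
presumably wrap MPI_Barrier(MPI_COMM_WORLD).

```c
#include <stddef.h>

/* Hypothetical hook type. In a real run the barrier hook would call
 * MPI_Barrier(MPI_COMM_WORLD) and the work hook would keep the CPU busy
 * for task-duration. Both are assumptions made for this sketch. */
typedef void (*hook_fn)(void *ctx);

/* Counters that make the loop structure observable. */
typedef struct {
    long barriers;    /* phase-ending barrier points reached */
    long regions;     /* parallel regions executed */
    long iterations;  /* loop iterations executed */
} pils_counts;

/* regions_per_phase stands in for the value derived from the grain
 * parameter; how pils computes it is not shown here. */
static pils_counts run_phases(int phases, int regions_per_phase,
                              int iters_per_region,
                              hook_fn barrier, hook_fn work, void *ctx)
{
    pils_counts c = {0, 0, 0};

    for (int p = 0; p < phases; ++p) {
        for (int r = 0; r < regions_per_phase; ++r) {
            /* Body of one parallel region: a loop of fixed-length tasks.
             * It runs serially here; the real program would execute it
             * in parallel. */
            for (int i = 0; i < iters_per_region; ++i) {
                if (work != NULL) {
                    work(ctx);
                }
                ++c.iterations;
            }
            ++c.regions;
        }
        /* End of phase: every process would synchronize here. */
        if (barrier != NULL) {
            barrier(ctx);
        }
        ++c.barriers;
    }
    return c;
}
```

Calling run_phases(3, 4, 5, NULL, NULL, NULL), for example, walks through 3
phases of 4 regions with 5 iterations each without performing any real work
or synchronization.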