Co-design at POP CoE project

Replacing critical section with reduction

OpenMP critical section: The OpenMP standard provides a critical construct, which allows only one thread at a time to execute the enclosed block of code. This protects the block from race conditions, for example when writing to a shared array or incrementing a shared counter. However, using this construct, especially inside parallel loops, can severely reduce performance: execution is serialised as threads “queue” to enter the critical region, and managing the lock that guards the region adds substantial overhead.

The recommendation below applies when the critical section corresponds to a recurrent operation.

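This best practice recommends that, if the critical block performs a reduction operation, it be replaced by the OpenMP reduction clause, which has a much lower overhead than a critical section: each thread accumulates into its own private copy of the reduction variable, and the private copies are combined once at the end of the construct.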
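To see where the lower overhead comes from, the effect of a reduction can be approximated by hand. The sketch below is illustrative only: the function name sum_manual_partials is hypothetical, and an OpenMP implementation is free to combine the per-thread values in a different way. The point is that the protected update happens once per thread rather than once per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written approximation of reduction(+:sum), for illustration only. */
double sum_manual_partials(const double *array, int n)
{
    assert(array != NULL || n == 0);
    double sum = 0.0;

    #pragma omp parallel
    {
        double local = 0.0;          /* private to each thread */

        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            local += array[i];       /* no synchronisation inside the loop */
        }

        #pragma omp critical
        sum += local;                /* one protected update per thread */
    }

    return sum;
}
```

In practice the reduction clause is preferable to this hand-written form, since it is shorter and leaves the combination strategy to the implementation.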

Consider the following pseudo-code:

sum = 0.0;

#pragma omp parallel for
for ( int i = 0; i < Ni; i++ ) {
  // work on array[:]
#pragma omp critical
  sum += array[i];
}

This could be rewritten as:

sum = 0.0;

#pragma omp parallel for reduction(+:sum)
for ( int i = 0; i < Ni; i++ ) {
  // work on array[:]
  sum += array[i];  // each thread updates its own private copy of sum
}
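For experimentation, the two variants can be wrapped as complete C functions along the following lines. This is a minimal sketch: the function names sum_with_critical and sum_with_reduction and the parameter n (standing in for Ni) are assumptions made for the example, not part of the original pseudo-code.

```c
#include <assert.h>
#include <stddef.h>

/* Original pattern: every iteration has to enter the critical section. */
double sum_with_critical(const double *array, int n)
{
    assert(array != NULL || n == 0);
    double sum = 0.0;

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp critical
        sum += array[i];
    }

    return sum;
}

/* Recommended pattern: per-thread private sums, combined at the end. */
double sum_with_reduction(const double *array, int n)
{
    assert(array != NULL || n == 0);
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += array[i];
    }

    return sum;
}
```

Both functions need OpenMP enabled at compile time (for example -fopenmp with GCC or Clang); without it the pragmas are ignored and the loops run serially. Because the reduction adds the elements in a different order, floating-point results may differ from the critical-section version in the last few bits.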