BEM4I miniApp (Chunksize 500)

Version name: BEM4I miniApp (Chunksize 500); a version of the BEM4I miniApp program.
Repository: [home] and version downloads: [.zip] [.tar.gz] [.tar.bz2] [.tar]
Implemented best practices: Chunk/task grain-size trade-off (parallelism/overhead)

This is a version of the original BEM4I kernel code in which the work-sharing loop over all degrees of freedom in the global system is scheduled dynamically with a chunk size of 500 iterations.

...
#pragma omp parallel
{
// apply K, K', V and D
{ ... }

// loop over all degrees of freedom 
#pragma omp for schedule(dynamic, 500)
  for(int j = 0; j < nDOFs; j++)
  { ... }

} // end of parallel region
MPI_Allreduce(...);
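
The grain-size trade-off behind the chunk size can be illustrated with a small standalone sketch. It is not the BEM4I kernel: `dof_work`, `process_dofs` and `num_chunks` are hypothetical names introduced here, the per-DOF work is a placeholder for the elided loop body, and the `reduction` clause is an assumption added only so the sketch returns a checkable value. With `schedule(dynamic, 500)`, threads request work in blocks of 500 iterations, so a loop of `nDOFs` iterations is handed out in roughly `nDOFs / 500` scheduling steps (rounded up).

```c
#include <stddef.h>

/* Number of chunks a dynamic schedule hands out for n iterations
 * when each chunk holds `chunk` iterations (the last may be shorter). */
long num_chunks(long n, long chunk)
{
    return (n + chunk - 1) / chunk;
}

/* Hypothetical stand-in for the per-DOF work that the original
 * kernel elides as { ... }. */
static double dof_work(int j)
{
    return (double)(j % 7);
}

/* Work-sharing loop over all degrees of freedom, scheduled dynamically
 * in chunks of 500 iterations. The reduction is an assumption of this
 * sketch; it is not taken from the original code. */
double process_dofs(int nDOFs)
{
    double sum = 0.0;

#pragma omp parallel
    {
#pragma omp for schedule(dynamic, 500) reduction(+ : sum)
        for (int j = 0; j < nDOFs; j++) {
            sum += dof_work(j);
        }
    }

    return sum;
}
```

For example, `num_chunks(100000, 500)` gives 200 scheduling steps, whereas the OpenMP default chunk size of 1 for `schedule(dynamic)` would give 100000. Larger chunks reduce scheduling overhead but leave fewer units to balance across threads. Without OpenMP enabled at compile time the pragmas are ignored and the loop runs serially with the same result.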

The following experiments have been registered: