Sam(oa)² (work-sharing)

Version's name: Sam(oa)² (work-sharing); a version of the Sam(oa)² program.
Patterns and behaviours: Dynamic load imbalance in MPI · Load imbalance due to computational complexity (unknown a priori) · Loop iterations manually distributed
Recommended best practices: Over-decomposition using tasking · Task migration among processes

Description

This version of Sam(oa)² uses a classical work-sharing approach to distribute the workload (sections) within a process among the participating threads. The structure of the implementation is equivalent to a static schedule in OpenMP, where the sections are distributed evenly by count across the threads. A simplified version is illustrated in the following code snippet:

! executed by every thread in the parallel team
do time_step = 1, N_TIME_STEPS
    sec_idx_start = calculate_start_idx_for_thread()
    sec_idx_end   = calculate_end_idx_for_thread()
    
    do idx = sec_idx_start, sec_idx_end
        traverse( sections(idx) )
    end do

    ! After the serial loop has been executed, all section traversals for the current thread have been completed
    ! Boundary exchange can be initiated for those sections
    
    exchange_boundary_data()
    apply_AMR()
end do

Because sections can exhibit varying execution times, distributing them evenly by count can lead to load imbalance within a process, with threads that finish early idling at the barrier. Owing to its more complex internal design and implementation, Sam(oa)² does not use !$omp do schedule(static). Instead, the parallel region is spawned at the outer level, every thread executes the time-stepping loop, and each thread manually calculates its own start and end indices. This design choice makes it difficult to apply a dynamic schedule to tackle the load imbalance.
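
The effect of such a manual static distribution can be sketched in a few lines. The following is a minimal illustration in Python, not the Sam(oa)² implementation: the function names, the contiguous block-partitioning rule, and the per-section costs are all assumptions made for the example. It only shows how an even split by section count can still leave threads with very different amounts of work.

```python
def static_block_range(num_sections, num_threads, thread_id):
    """Return the inclusive, 1-based (start, end) section range for one thread.

    Hypothetical stand-in for the manual index calculation: sections are split
    into contiguous blocks, and the first `num_sections % num_threads` threads
    receive one extra section. An empty range is returned as (start, start - 1).
    """
    base, extra = divmod(num_sections, num_threads)
    start = thread_id * base + min(thread_id, extra) + 1
    count = base + (1 if thread_id < extra else 0)
    return start, start + count - 1


def thread_loads(section_costs, num_threads):
    """Sum the assumed traversal cost of the sections owned by each thread."""
    loads = []
    for tid in range(num_threads):
        start, end = static_block_range(len(section_costs), num_threads, tid)
        # Convert the inclusive 1-based range into a Python slice.
        loads.append(sum(section_costs[start - 1:end]))
    return loads


# Made-up per-section traversal costs; in practice they are unknown a priori.
costs = [1, 1, 1, 1, 8, 8, 1, 1, 1, 1]
loads = thread_loads(costs, num_threads=4)

# Every thread has to wait until the most heavily loaded thread has finished.
idle = [max(loads) - load for load in loads]
print("load per thread:", loads)
print("idle per thread:", idle)
```

With these made-up costs, every thread owns two or three sections, yet the thread that happens to own both expensive sections carries most of the total work while the others wait for it.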

How to Build & Run

The repository provides two detailed README files with descriptions and instructions on how to build and run the various application versions. Furthermore, it provides some scripts for the experiments that have been carried out here. For the prerequisites, refer to the README file in the repository.

This page focuses on the work-sharing version, which can be built with the following steps:

# change to the scripts directory
cd scripts/claix

# specify the paths to the Sam(oa)² repository and ASAGI installation directory
export SAMOA_DIR=/path/to/samoa
export ASAGI_DIR=/path/to/ASAGI

# make sure the script is executable
chmod u+x ./samoa_build_intel.sh

# compile the work-sharing version
COMPILE_WS=1 ./samoa_build_intel.sh

After the application has been built successfully, the executable can be found in the bin directory located in the Sam(oa)² repository.

In order to run the work-sharing version, execute the following steps:

# change to the scripts directory
cd scripts/claix

# specify the paths to the Sam(oa)² repository and ASAGI data directory
export SAMOA_DIR=/path/to/samoa
export ASAGI_DATA_DIR=/path/to/asagi/dir
# specify the path to a dummy output folder
export OUTPUT_DIR=/path/to/temp/output/folder # not used in this configuration, but still required by the program

# make sure the script is executable
chmod u+x ./samoa_run_intel.sh

# run the work-sharing version
OMP_NUM_THREADS_VAR=11 NUM_RANKS=4 RUN_WS=1 ./samoa_run_intel.sh

The following experiments have been registered: