This version of Sam(oa)² balances load not only across the threads within a process but also, using the Chameleon library, across processes that may run on different compute nodes in distributed memory. Chameleon is a task-based programming environment for developing reactive HPC applications. It is built on MPI + OpenMP, which allows easy integration into existing hybrid MPI + OpenMP applications. For further information, please visit the Chameleon GitHub site.
In this version, work is expressed as Chameleon tasks, each of which is either executed on the process that created it or temporarily migrated to a different process and executed there to balance the load. Migrations usually happen as early as possible so that communication overlaps with computation. As in the tasking version, all threads of a process participate in executing these tasks, regardless of whether a task was created locally or migrated from a different process. A call to chameleon_distributed_taskwait() ensures that execution proceeds only after all tasks have been completed globally. A simplified version is illustrated in the following code snippet:

    ! executed by every thread in the parallel team
    do time_step = 1, N_TIME_STEPS
      sec_idx_start = calculate_start_idx_for_thread()
      sec_idx_end = calculate_end_idx_for_thread()
      do idx = sec_idx_start, sec_idx_end
        ! create a chameleon task and specify entry function as well as arguments
        ! e.g. section data (left out for brevity)
        cur_task = chameleon_create_task(traverse_fcn_pointer, num_args, args_info)
        ! enqueue task in the chameleon runtime system
        i_error = chameleon_add_task_fortran(cur_task)
      end do
      ! wait until all section traversals have been completed (globally)
      i_error = chameleon_distributed_taskwait()
      exchange_boundary_data()
      apply_AMR()
    end do

The repository provides two detailed README files with descriptions and instructions on how to build and run the various application versions. Furthermore, it provides some scripts for the experiments that have been carried out here. For the prerequisites, refer to the README file in the repository.
This description focuses on the Chameleon version, which can be built with the following steps:

    # change to the scripts directory
    cd scripts/claix
    # specify the paths to the Sam(oa)² repository and ASAGI installation directory
    export SAMOA_DIR=/path/to/samoa
    export ASAGI_DIR=/path/to/ASAGI
    # make sure the script is executable
    chmod u+x ./samoa_build_intel.sh
    # compile the Chameleon version
    # Append linux env vars with Chameleon include and lib folder (here: realized with an environment module)
    module load chameleon
    export CHAMELEON_DIR=/path/to/Chameleon/install/dir
    COMPILE_CHAMELEON=1 ./samoa_build_intel.sh

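The build steps rely on three exported paths (SAMOA_DIR, ASAGI_DIR, CHAMELEON_DIR), and a typo in any of them tends to surface only deep inside the build. A minimal pre-flight sketch is shown below. The helpers check_dir_var and check_build_env are hypothetical and not part of the Sam(oa)² scripts; they only assume that each of the three variables should name an existing directory.

```shell
# Hypothetical helper (not part of the Sam(oa)² scripts): succeed only if the
# variable whose NAME is passed in $1 is set and names an existing directory.
check_dir_var() {
    local name="$1"
    local value
    # Indirect lookup of the variable called $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "error: $name is not set" >&2
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "error: $name=$value is not a directory" >&2
        return 1
    fi
    return 0
}

# Hypothetical wrapper: check every path exported in the build steps.
check_build_env() {
    local var
    for var in SAMOA_DIR ASAGI_DIR CHAMELEON_DIR; do
        check_dir_var "$var" || return 1
    done
    return 0
}

# Possible usage from scripts/claix, after the exports:
#   check_build_env && COMPILE_CHAMELEON=1 ./samoa_build_intel.sh
```

The block only defines functions, so sourcing it has no side effects; the build script runs only if every check passes.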
After the application has been built successfully, the executable can be found in the bin directory located in the Sam(oa)² repository.
In order to run the Chameleon version, execute the following steps:

    # change to the scripts directory
    cd scripts/claix
    # specify the paths to the Sam(oa)² repository and ASAGI installation directory
    export SAMOA_DIR=/path/to/samoa
    export ASAGI_DATA_DIR=/path/to/asagi/dir
    # specify the path to a dummy output folder
    export OUTPUT_DIR=/path/to/temp/output/folder # not used in this configuration but still required by program
    # make sure the script is executable
    chmod u+x ./samoa_run_intel.sh
    # run the Chameleon version
    # Append linux env vars with Chameleon include and lib folder (here: realized with an environment module)
    module load chameleon
    OMP_NUM_THREADS_VAR=11 NUM_RANKS=4 RUN_CHAMELEON=1 ./samoa_run_intel.sh

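When choosing values for the run command, it can help to know how many threads the job will start in total. The sketch below assumes, based only on the variable names, that NUM_RANKS is the number of MPI ranks and OMP_NUM_THREADS_VAR the number of OpenMP threads per rank; the run script itself is the authority on their exact meaning. The helpers is_positive_int and total_threads are hypothetical and not part of the repository.

```shell
# Hypothetical helper: succeed only for a positive decimal integer
# without leading zeros (so it is never misread as octal).
is_positive_int() {
    case "$1" in
        ''|0*|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: print ranks * threads-per-rank.
#   $1: number of MPI ranks (assumed meaning of NUM_RANKS)
#   $2: OpenMP threads per rank (assumed meaning of OMP_NUM_THREADS_VAR)
total_threads() {
    if ! is_positive_int "$1" || ! is_positive_int "$2"; then
        echo "error: both arguments must be positive integers" >&2
        return 1
    fi
    echo $(( $1 * $2 ))
}

# Possible usage with the values from the run command:
#   total_threads 4 11
```

With the values from the run command, 4 ranks with 11 threads each give 44 threads in total, which can then be compared against the cores requested from the batch system.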