Alya is a simulation code for high-performance computational mechanics. It solves coupled multiphysics problems using high-performance computing techniques for distributed- and shared-memory supercomputers, together with vectorization and optimization at the node level.
This kernel corresponds to the algebraic system assembly of a finite element (FE) code for solving partial differential equations (PDEs). The matrix assembly consists of a loop over the elements that computes the element matrices and right-hand sides and assembles them into the local system.
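The assembly loop described above can be sketched in a few lines. This is only an illustration under simplifying assumptions (a 1D mesh of two-node linear elements for -u'' = f, dense storage); the function names are hypothetical and are not taken from the Alya sources. Only nelem and npoin mirror the names printed in the program output.

```python
# Illustrative 1D assembly loop (not Alya code): two-node linear elements
# on a uniform mesh, dense global matrix for simplicity.

def element_matrices(h, f):
    """Element stiffness matrix and load vector for -u'' = f, element length h."""
    ke = [[1.0 / h, -1.0 / h],
          [-1.0 / h, 1.0 / h]]
    fe = [f * h / 2.0, f * h / 2.0]
    return ke, fe

def assemble(nelem, length=1.0, f=1.0):
    npoin = nelem + 1                      # number of mesh nodes
    h = length / nelem                     # uniform element size
    connectivity = [(e, e + 1) for e in range(nelem)]
    K = [[0.0] * npoin for _ in range(npoin)]
    rhs = [0.0] * npoin
    for nodes in connectivity:             # loop over the elements
        ke, fe = element_matrices(h, f)    # element matrix and right-hand side
        for a, i in enumerate(nodes):      # scatter-add into the global system
            rhs[i] += fe[a]
            for b, j in enumerate(nodes):
                K[i][j] += ke[a][b]
    return K, rhs

K, rhs = assemble(4)
```

Note that interior nodes receive contributions from two elements. These shared entries are the ones that need protection once the element loop runs in parallel.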
This version of the kernel uses OpenMP atomic constructs to synchronise access to shared data.
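To see why the synchronisation is needed, consider the sketch below. It is not the kernel's code: Python has no OpenMP atomic construct, so a lock stands in for the atomic update, and the way chunks of elements are handed to threads is an assumption made for illustration. All names are hypothetical.

```python
import threading

def assemble_rhs_threaded(nelem, num_threads=4, chunk=3):
    """Each thread adds element contributions into a shared right-hand side."""
    npoin = nelem + 1
    rhs = [0.0] * npoin
    lock = threading.Lock()   # stand-in for an atomic update (assumption)
    # Split the element range into chunks of consecutive elements.
    chunks = [range(s, min(s + chunk, nelem)) for s in range(0, nelem, chunk)]

    def worker(tid):
        # Thread tid takes every num_threads-th chunk (hypothetical schedule).
        for elements in chunks[tid::num_threads]:
            for e in elements:
                for node in (e, e + 1):
                    # Nodes on chunk boundaries are shared between threads,
                    # so the read-modify-write must be protected.
                    with lock:
                        rhs[node] += 0.5

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rhs

rhs_threaded = assemble_rhs_threaded(10)
```

With the update protected, every interior node ends up with exactly two contributions regardless of how the threads interleave.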
The source code for this version and the multidependencies one is the same; a preprocessor macro (ALYA_OMPSS, not defined in this case) selects which of the two versions is compiled.
Building the kernel requires gfortran. It has been tested with version 8.4.1.
#> ./compile.sh
#> export OMP_NUM_THREADS=16 (or any other thread count)
#> ./miniapp.x --implicit ./tests/cavtet04_300_MM1_NO_COLORING_16.bin
The “--implicit” execution flag tells the program to run an implicit assembly (as opposed to an explicit assembly). Only implicit assembly is considered in both versions of the program (master and multidependencies).
The directory “tests/” contains the input files needed to run the program. They are in binary format, so they cannot be edited by hand.
If everything goes fine, you should see output like the following:
start read
miniapp_read
miniapp_read: element integration
miniapp_read: mesh data
miniapp_read: parallel data
miniapp_read: ompss data
end read
----------------------------------
nelem= 318320
npoin= 57604
VECTOR_SIZE= 16
par_omp_nelem_chunk= 300
num_subd_par= 1
----------------------------------
Using OpenMP
NUM_THREADS= 16
MAX_THREADS= 16
Number of subdomains= 1
Chunk size (can be changed)= 600
IMPLICIT METHOD -------------->
time= 0.26172455213963985
time= 0.25982044730335474
time= 0.26099143736064434
time= 0.26015086565166712
time= 0.26262409798800945
time= 0.25985855888575315
time= 0.26031209155917168
time= 0.26034436654299498
time= 0.26018605008721352
time= 0.26048123929649591
time= 0.25992043968290091
loop finished correctly, time= 0.26046895943582060
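In the sample output above, the value on the final line equals the mean of the ten timings that follow the first one. Whether the program deliberately discards the first iteration as a warm-up is an assumption based on these numbers only. The following sketch parses the timing lines and checks that relation; the function and variable names are hypothetical.

```python
import re

# Timing lines copied from the sample output above.
sample_output = """\
IMPLICIT METHOD -------------->
time= 0.26172455213963985
time= 0.25982044730335474
time= 0.26099143736064434
time= 0.26015086565166712
time= 0.26262409798800945
time= 0.25985855888575315
time= 0.26031209155917168
time= 0.26034436654299498
time= 0.26018605008721352
time= 0.26048123929649591
time= 0.25992043968290091
loop finished correctly, time= 0.26046895943582060
"""

def parse_times(text):
    """Return (per-iteration timings, value reported on the final line)."""
    per_iteration, reported = [], None
    for line in text.splitlines():
        match = re.search(r"time=\s*([0-9.eE+-]+)", line)
        if match is None:
            continue                      # not a timing line
        value = float(match.group(1))
        if line.strip().startswith("loop finished"):
            reported = value              # summary line
        else:
            per_iteration.append(value)   # one assembly iteration
    return per_iteration, reported

times, reported = parse_times(sample_output)
# Mean over all iterations except the first one.
mean_after_first = sum(times[1:]) / len(times[1:])
print(len(times), mean_after_first, reported)
```

To apply it to a real run, redirect the program output to a file and pass the file contents to parse_times instead of sample_output.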