Alya assembly (master)

Version name: Alya assembly (master); a version of the Alya assembly program.
Patterns and behaviours: Recommended best-practices:

Alya is a simulation code for high-performance computational mechanics. It solves coupled multiphysics problems using high-performance computing techniques for distributed- and shared-memory supercomputers, together with vectorization and optimization at the node level.

This kernel corresponds to the algebraic system assembly of a finite element (FE) code for solving partial differential equations (PDEs). The matrix assembly consists of a loop over the elements that computes the element matrices and right-hand sides and assembles them into the local system.
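
As a rough illustration of such an element loop (this is not Alya's Fortran source), the C sketch below assembles a toy 1D problem. The function names `element_system` and `assemble_serial`, the arrays `lnods`, `amat` and `rhs`, the 2-node linear elements and the dense matrix storage are all assumptions made for this example.

```c
#include <assert.h>

/* Toy setting (assumed for illustration): 1D mesh of 2-node linear elements,
   each of length h, constant source term f, dense row-major global matrix. */

/* Element matrix ke and element right-hand side fe for -u'' = f. */
static void element_system(double h, double f, double ke[2][2], double fe[2])
{
    ke[0][0] =  1.0 / h;  ke[0][1] = -1.0 / h;
    ke[1][0] = -1.0 / h;  ke[1][1] =  1.0 / h;
    fe[0] = 0.5 * f * h;
    fe[1] = 0.5 * f * h;
}

/* lnods stores 2 node indices per element. amat (npoin x npoin) and
   rhs (npoin) must be zero-initialised by the caller. */
void assemble_serial(int nelem, int npoin, const int *lnods,
                     double h, double f, double *amat, double *rhs)
{
    assert(h > 0.0);
    for (int ielem = 0; ielem < nelem; ++ielem) {
        double ke[2][2], fe[2];
        element_system(h, f, ke, fe);            /* compute element system */
        for (int a = 0; a < 2; ++a) {            /* scatter into global system */
            const int i = lnods[2 * ielem + a];
            rhs[i] += fe[a];
            for (int b = 0; b < 2; ++b) {
                const int j = lnods[2 * ielem + b];
                amat[i * npoin + j] += ke[a][b];
            }
        }
    }
}
```

Nodes shared by neighbouring elements receive contributions from each of them, which is why the scatter step needs care once the element loop runs in parallel.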

This version of the kernel makes use of atomic constructs to synchronise access to shared data.
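
The idea can be sketched in C with OpenMP as follows. This is a minimal sketch on an assumed toy problem (1D mesh, 2-node elements, dense matrix), not the kernel's Fortran code; `assemble_atomic`, `lnods`, `amat` and `rhs` are hypothetical names. Each update of a shared entry is protected with `#pragma omp atomic`, so threads processing neighbouring elements cannot lose contributions to a shared node.

```c
#include <assert.h>

/* Toy setting (assumed): elements of length h, constant source f,
   dense row-major matrix amat (npoin x npoin), vector rhs (npoin).
   Both amat and rhs must be zero-initialised by the caller. */
void assemble_atomic(int nelem, int npoin, const int *lnods,
                     double h, double f, double *amat, double *rhs)
{
    assert(h > 0.0);
    #pragma omp parallel for
    for (int ielem = 0; ielem < nelem; ++ielem) {
        /* Element system for -u'' = f with linear shape functions. */
        const double ke[2][2] = {{ 1.0 / h, -1.0 / h},
                                 {-1.0 / h,  1.0 / h}};
        const double fe[2] = {0.5 * f * h, 0.5 * f * h};

        for (int a = 0; a < 2; ++a) {
            const int i = lnods[2 * ielem + a];
            /* Several elements may share node i: update atomically. */
            #pragma omp atomic
            rhs[i] += fe[a];
            for (int b = 0; b < 2; ++b) {
                const int j = lnods[2 * ielem + b];
                #pragma omp atomic
                amat[i * npoin + j] += ke[a][b];
            }
        }
    }
}
```

With GCC, the pragmas take effect when compiling with `-fopenmp`; without that flag they are ignored and the loop runs serially with the same result.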

The source code for this version and the multidependencies one is the same; a preprocessor macro (ALYA_OMPSS, not defined in this case) selects which of the two versions is compiled.
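
The selection mechanism can be pictured with the following C sketch. Only the macro name ALYA_OMPSS comes from the text above; the enum, the function `compiled_variant` and the contents of the two branches are hypothetical placeholders, and the real kernel is written in Fortran.

```c
/* Hypothetical labels for the two variants described in the text. */
enum assembly_variant {
    VARIANT_MASTER_ATOMICS,
    VARIANT_MULTIDEPENDENCIES
};

/* Reports which variant was selected when this file was compiled. */
enum assembly_variant compiled_variant(void)
{
#ifdef ALYA_OMPSS
    /* Branch kept only when ALYA_OMPSS is defined at compile time. */
    return VARIANT_MULTIDEPENDENCIES;
#else
    /* Default branch: the master version, which relies on atomics. */
    return VARIANT_MASTER_ATOMICS;
#endif
}
```

With a GCC-style compiler, a macro like this is typically defined on the command line (for example `-DALYA_OMPSS`); leaving it undefined keeps the default branch.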

How to build

The kernel requires gfortran to be built. It has been tested with version 8.4.1.

#> ./compile.sh


How to execute

#> export OMP_NUM_THREADS=16   # or any other number of threads
#> ./miniapp.x --implicit ./tests/cavtet04_300_MM1_NO_COLORING_16.bin


The “--implicit” flag tells the program to run an implicit assembly (as opposed to an explicit one). Only implicit assembly is considered for both versions of the program (master and multidependencies).

The directory “tests/” contains the input files needed to run the program. They are in binary format, so it is not possible to modify them.

If everything goes well, you should see output similar to the following:

 start read
----------------------------------
nelem=                     318320
npoin=                      57604

VECTOR_SIZE=                   16
par_omp_nelem_chunk=          300
num_subd_par=                   1
----------------------------------
Using OpenMP
Number of subdomains=           1
Chunk size (can be changed)=         600
IMPLICIT METHOD -------------->
time=  0.26172455213963985
time=  0.25982044730335474
time=  0.26099143736064434
time=  0.26015086565166712
time=  0.26262409798800945
time=  0.25985855888575315
time=  0.26031209155917168
time=  0.26034436654299498
time=  0.26018605008721352
time=  0.26048123929649591
time=  0.25992043968290091
loop finished correctly, time=  0.26046895943582060