Alya assembly (master)

Version's name: Alya assembly (master); a version of the Alya assembly program.
Repository: [home] and version downloads: [.zip] [.tar.gz] [.tar.bz2] [.tar]
Patterns and behaviours: Recommended best-practices:

Alya is a simulation code for high performance computational mechanics. Alya solves coupled multiphysics problems using high performance computing techniques for distributed and shared memory supercomputers, together with vectorization and optimization at the node level.

This kernel corresponds to the algebraic system assembly of a finite element (FE) code for solving partial differential equations (PDEs). The matrix assembly consists of a loop over the elements that computes the element matrices and right-hand sides and assembles them into the local system.
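
As a rough illustration of such an element loop, the following Python sketch assembles the stiffness matrix and right-hand side of a 1D Poisson problem with linear elements. It is not Alya's code: the model problem, the dense matrix storage, and the names `element_matrices` and `assemble` are assumptions chosen to keep the example short.

```python
def element_matrices(h, source):
    """Element stiffness matrix and load vector of a linear 1D element of length h."""
    k = [[1.0 / h, -1.0 / h],
         [-1.0 / h, 1.0 / h]]
    f = [source * h / 2.0, source * h / 2.0]
    return k, f


def assemble(num_elements, length=1.0, source=1.0):
    """Loop over the elements and scatter their contributions into K and F."""
    num_nodes = num_elements + 1
    h = length / num_elements
    K = [[0.0] * num_nodes for _ in range(num_nodes)]
    F = [0.0] * num_nodes
    for e in range(num_elements):
        nodes = (e, e + 1)                 # global node numbers of element e
        k, f = element_matrices(h, source)
        for a, i in enumerate(nodes):      # a: local index, i: global index
            F[i] += f[a]
            for b, j in enumerate(nodes):
                K[i][j] += k[a][b]
    return K, F


K, F = assemble(num_elements=2)
for row in K:
    print(row)
print(F)
```

Node 1 is shared by both elements, so its diagonal entry and load receive two contributions. These shared entries are the ones that need protection once the element loop runs in parallel.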

This version of the kernel uses OpenMP atomic constructs to synchronise access to shared data.
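
Python has no direct counterpart of an atomic construct, so the sketch below uses a lock around each shared update as a stand-in. It only shows where the synchronisation sits in a threaded assembly loop. The function `assemble_rhs`, its parameters, and the chunked work distribution are hypothetical and do not reproduce the kernel's implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def assemble_rhs(connectivity, element_loads, num_nodes, num_threads=4, chunk=2):
    """Accumulate element loads into a shared right-hand side from several threads."""
    rhs = [0.0] * num_nodes
    lock = threading.Lock()

    def process(start):
        # Each task handles one chunk of consecutive elements.
        stop = min(start + chunk, len(connectivity))
        for e in range(start, stop):
            nodes = connectivity[e]
            contribution = element_loads[e] / len(nodes)
            for node in nodes:
                # Neighbouring elements share nodes, so this read-modify-write
                # must not interleave with another thread's update.
                with lock:
                    rhs[node] += contribution

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and re-raises any exception from a task.
        list(pool.map(process, range(0, len(connectivity), chunk)))
    return rhs


connectivity = [(0, 1), (1, 2), (2, 3), (3, 4)]   # four 1D elements, five nodes
print(assemble_rhs(connectivity, [0.5] * 4, num_nodes=5))
```

A lock serialises more work than a hardware atomic update would, so this sketch says nothing about the performance of the real kernel.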

The source code for this version and the multidependencies version is the same; a preprocessor symbol (ALYA_OMPSS, not defined in this case) selects which of the two versions is compiled.

How to build

Building the kernel requires gfortran. It has been tested with version 8.4.1.

#> ./

How to execute

#> export OMP_NUM_THREADS=16   # or any other thread count
#> ./miniapp.x --implicit ./tests/cavtet04_300_MM1_NO_COLORING_16.bin

The “--implicit” execution flag tells the program to run an implicit assembly (as opposed to an explicit assembly). Only implicit assembly is considered for both versions of the program (master and multidependencies).

The directory “tests/” contains the input files needed to run the program. They are in binary format, so they cannot be edited by hand.

If everything goes well, you should see output similar to the following:

 start read
 miniapp_read: element integration
 miniapp_read: mesh data
 miniapp_read: parallel data
 miniapp_read: ompss data
 end   read
 nelem=                     318320
 npoin=                      57604

 VECTOR_SIZE=                   16
 par_omp_nelem_chunk=          300
 num_subd_par=                   1
 Using OpenMP
 NUM_THREADS=           16
 MAX_THREADS=           16
 Number of subdomains=           1
 Chunk size (can be changed)=         600
 IMPLICIT METHOD -------------->
 time=  0.26172455213963985
 time=  0.25982044730335474
 time=  0.26099143736064434
 time=  0.26015086565166712
 time=  0.26262409798800945
 time=  0.25985855888575315
 time=  0.26031209155917168
 time=  0.26034436654299498
 time=  0.26018605008721352
 time=  0.26048123929649591
 time=  0.25992043968290091
 loop finished correctly, time=  0.26046895943582060
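
In the sample output above, the final reported time equals the mean of the last ten per-iteration times, that is, all iterations except the first. The Python sketch below extracts the timings from such output and checks that relation. The helper `parse_times` is hypothetical, and whether the program always discards the first iteration is an assumption based on this one sample.

```python
import re
from statistics import mean

TIME_RE = re.compile(r"time=\s*([0-9.eE+-]+)")

SAMPLE = """\
 IMPLICIT METHOD -------------->
 time=  0.26172455213963985
 time=  0.25982044730335474
 time=  0.26099143736064434
 time=  0.26015086565166712
 time=  0.26262409798800945
 time=  0.25985855888575315
 time=  0.26031209155917168
 time=  0.26034436654299498
 time=  0.26018605008721352
 time=  0.26048123929649591
 time=  0.25992043968290091
 loop finished correctly, time=  0.26046895943582060
"""


def parse_times(output):
    """Return (per-iteration times, final reported time) from the program output."""
    iterations, reported = [], None
    for line in output.splitlines():
        match = TIME_RE.search(line)
        if match is None:
            continue
        value = float(match.group(1))
        if line.strip().startswith("time="):
            iterations.append(value)   # one line per assembly iteration
        else:
            reported = value           # the "loop finished correctly" line
    return iterations, reported


iterations, reported = parse_times(SAMPLE)
print("iterations:", len(iterations))
print("mean without first iteration:", mean(iterations[1:]))
print("reported:", reported)
```

Comparing the first iteration with the later ones in this way gives a quick view of any warm-up cost when trying different values of OMP_NUM_THREADS.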