Alya is a simulation code for high-performance computational mechanics. Alya solves coupled multiphysics problems using high-performance computing techniques for distributed- and shared-memory supercomputers, together with vectorization and optimization at the node level.
This kernel corresponds to the algebraic system assembly of a finite element (FE) code for solving partial differential equations (PDEs). The matrix assembly consists of a loop over the elements that computes the element matrices and right-hand sides and assembles them into the local system.
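To make the assembly loop concrete, here is a minimal sketch in Python. It is not Alya's code: it assumes a toy 1D problem (-u'' = f on a uniform mesh with linear two-node elements) and uses hypothetical names (assemble, elmat, elrhs). It only illustrates the pattern described above: loop over elements, compute an element matrix and right-hand side, and add them into the global arrays.

```python
def assemble(num_elements, length=1.0, source=1.0):
    """Assemble matrix and right-hand side for -u'' = source on [0, length].

    Toy 1D example with linear elements; an illustration, not Alya's kernel.
    """
    num_nodes = num_elements + 1
    h = length / num_elements

    # Connectivity: element e joins nodes e and e + 1.
    connectivity = [(e, e + 1) for e in range(num_elements)]

    matrix = [[0.0] * num_nodes for _ in range(num_nodes)]
    rhs = [0.0] * num_nodes

    for nodes in connectivity:
        # Element matrix and right-hand side of a linear 1D element.
        elmat = [[1.0 / h, -1.0 / h],
                 [-1.0 / h, 1.0 / h]]
        elrhs = [source * h / 2.0, source * h / 2.0]

        # Scatter-add the element contributions into the global system.
        for a, i in enumerate(nodes):
            rhs[i] += elrhs[a]
            for b, j in enumerate(nodes):
                matrix[i][j] += elmat[a][b]

    return matrix, rhs


if __name__ == "__main__":
    matrix, rhs = assemble(4)
    print(matrix[1])  # row of an interior node: [-4.0, 8.0, -4.0, 0.0, 0.0]
    print(rhs)        # [0.125, 0.25, 0.25, 0.25, 0.125]
```

Interior nodes receive contributions from two elements (8.0 = 4.0 + 4.0 on the diagonal). In a parallel loop, such shared entries are where two threads may try to update the same memory location at the same time.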
This version of the kernel uses the multidependencies feature of the OmpSs programming model to avoid atomic constructs.
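The following Python sketch illustrates why a dependency-based scheme can do without atomics. It is an assumption-laden illustration, not the OmpSs implementation: it assumes that elements are grouped into chunks (subdomains) and that two chunks are "neighbors" when they share at least one mesh node. Chunks that are not neighbors never write to the same entries, so they can be assembled at the same time without atomic updates. The function names and the static "waves" are hypothetical; a task runtime would resolve such conflicts dynamically rather than in fixed waves.

```python
def chunk_elements(connectivity, chunk_size):
    """Split the element list into consecutive chunks (toy 'subdomains')."""
    return [connectivity[i:i + chunk_size]
            for i in range(0, len(connectivity), chunk_size)]


def neighbor_graph(chunks):
    """Two chunks are neighbors if they share at least one node."""
    node_sets = [{n for elem in chunk for n in elem} for chunk in chunks]
    neighbors = {i: set() for i in range(len(chunks))}
    for i in range(len(chunks)):
        for j in range(i + 1, len(chunks)):
            if node_sets[i] & node_sets[j]:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def conflict_free_waves(neighbors):
    """Greedily group chunks so that no wave contains two neighbors."""
    remaining = sorted(neighbors)
    waves = []
    while remaining:
        wave = []
        for c in remaining:
            if all(c not in neighbors[w] for w in wave):
                wave.append(c)
        waves.append(wave)
        remaining = [c for c in remaining if c not in wave]
    return waves


if __name__ == "__main__":
    # Toy 1D mesh: 8 two-node elements, chunks of 2 elements each.
    connectivity = [(e, e + 1) for e in range(8)]
    chunks = chunk_elements(connectivity, 2)
    neighbors = neighbor_graph(chunks)
    print(max(len(n) for n in neighbors.values()))  # 2
    print(conflict_free_waves(neighbors))           # [[0, 2], [1, 3]]
```

In this toy mesh, chunks 0 and 2 share no node and could be assembled concurrently, while chunks 0 and 1 share node 2 and must not run at the same time.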
The source code for this version and the master one is the same; a preprocessor directive (ALYA_OMPSS, defined in this case) selects which of the two versions is compiled.
This version of the kernel requires OmpSs and gfortran to build. If OmpSs is not available on your system, you can follow this link:
for a complete installation guide.
For building the kernel, just type:
#> export OMP_NUM_THREADS=16 (or whatever you want)
#> ./miniapp_ALYA_OMPSS.x --implicit ./tests/cavtet04_600_MM1_ALYA_OMPSS_16.bin
The "--implicit" execution flag tells the program to run an implicit assembly (as opposed to an explicit assembly). Only implicit assembly is considered for both versions of the program (master and multidependencies).
The directory "tests/" contains the input files needed to run the program. They are in binary format, so it is not possible to modify them.
If everything goes fine, you should see something like this:
start read
miniapp_read
miniapp_read: element integration
miniapp_read: mesh data
miniapp_read: parallel data
miniapp_read: ompss data
end read
----------------------------------
nelem= 318320
npoin= 57604
VECTOR_SIZE= 16
par_omp_nelem_chunk= 600
num_subd_par= 530
----------------------------------
Using OpenMP
NUM_THREADS= 16
MAX_THREADS= 16
Number of subdomains= 530
Chunk size (can be changed)= 600
Max neighbors= 22
IMPLICIT METHOD -------------->
time= 0.13966536056250334
time= 0.13289238139986992
time= 0.13748298678547144
time= 0.19586112350225449
time= 0.14218118786811829
time= 0.15607176814228296
time= 0.13923503085970879
time= 0.16393845435231924
time= 0.16770993731915951
time= 0.16430170554667711
time= 0.16796042304486036
loop finished correctly, time= 0.15676349988207222
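For reading the timings, the short Python sketch below extracts the "time=" values from such a log and summarizes them. The helper name summarize is hypothetical and not part of the kernel. In the sample output shown here, the final value coincides with the mean of the per-iteration timings when the first one is left out; this is an observation from these numbers, not documented behavior of the program.

```python
import re

# Timing lines copied from the sample output above.
SAMPLE = """
time= 0.13966536056250334
time= 0.13289238139986992
time= 0.13748298678547144
time= 0.19586112350225449
time= 0.14218118786811829
time= 0.15607176814228296
time= 0.13923503085970879
time= 0.16393845435231924
time= 0.16770993731915951
time= 0.16430170554667711
time= 0.16796042304486036
loop finished correctly, time= 0.15676349988207222
"""


def summarize(log):
    """Return (per-iteration times, reported final time, mean without first)."""
    values = [float(v) for v in re.findall(r"time=\s*([0-9.eE+-]+)", log)]
    # The last match is the summary line; the others are individual iterations.
    *iterations, reported = values
    steady = iterations[1:]  # leave out the first iteration
    return iterations, reported, sum(steady) / len(steady)


if __name__ == "__main__":
    iterations, reported, mean_without_first = summarize(SAMPLE)
    print(len(iterations))                              # 11
    print(min(iterations), max(iterations))
    print(abs(mean_without_first - reported) < 1e-12)   # True
```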