# Resources for Co-design at POP CoE

## List of programs

BEM4I miniApp

BEM4I is a library of parallel boundary element based solvers developed at IT4Innovations National Supercomputing Center. It supports solutions of the Laplace, Helmholtz, Lamé, and wave equations. The library implements OpenMP and hybrid OpenMP/MPI parallelization. Development focuses on an efficient implementation for multi- and many-core architectures. System matrices assembled within the BEM are generally dense, and the library uses the Adaptive Cross Approximation (ACA) technique to approximate them. The resulting linear system is solved by an iterative solver chosen according to the properties of the system matrix. For the Helmholtz and wave equations the solver is the GMRES method; for the Laplace and Lamé equations it can be the CG method.
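
To illustrate only the last step, here is a minimal textbook conjugate gradient (CG) iteration on a small dense symmetric positive definite system. It is a sketch, not BEM4I code: the function name `conjugate_gradient`, its parameters, and the plain list-of-rows matrix format are assumptions made for this example, and no ACA compression or parallelization is shown.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A given as a list of rows."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # converged: residual norm below tolerance
            break
        # Dense matrix-vector product A p
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        # New direction is the residual made conjugate to the previous directions
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x
```

CG relies on the matrix being symmetric positive definite, which is why a more general method such as GMRES is needed when that property does not hold.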

CalculiX IO

CalculiX is a free three-dimensional structural finite element analysis program. It supports linear and non-linear calculations of static, dynamic, and thermal problems. The code is written in C and Fortran. Parallelization is achieved using the pthread programming model.

CalculiX solver

CalculiX is a free three-dimensional structural finite element analysis program. It supports linear and non-linear calculations of static, dynamic, and thermal problems. The code is written in C and Fortran. Parallelization is achieved using the pthread programming model.

Communication Imbalance

The Communication Imbalance kernel is a synthetic program that reproduces a communication pattern between several MPI processes. It first computes a connectivity matrix that records which ranks communicate with each other, and it also preassigns a given number of elements to each rank.
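
The setup phase could be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel's actual code: the ring-style neighbour pattern, the linear skew in element counts, and the names `build_connectivity`, `assign_elements`, and `send_volume` are all hypothetical, and ranks are modelled as list indices in a single process rather than as MPI processes.

```python
def build_connectivity(n_ranks, n_neighbours):
    """Return conn where conn[src][dst] is True when rank src sends to rank dst.

    Assumed pattern: each rank sends to its next n_neighbours ranks on a ring.
    Requires n_neighbours < n_ranks so that no rank sends to itself.
    """
    conn = [[False] * n_ranks for _ in range(n_ranks)]
    for src in range(n_ranks):
        for k in range(1, n_neighbours + 1):
            conn[src][(src + k) % n_ranks] = True
    return conn


def assign_elements(n_ranks, base, skew):
    """Preassign an element count per rank; higher ranks get more (assumed skew)."""
    return [base + skew * rank for rank in range(n_ranks)]


def send_volume(conn, elements):
    """Elements sent per rank, assuming a rank sends all its elements to each neighbour."""
    return [elements[src] * sum(row) for src, row in enumerate(conn)]
```

With an uneven element assignment, the per-rank send volumes differ even though every rank has the same number of neighbours, which is the kind of communication imbalance the kernel is meant to expose.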

DuMuX DUNE kernel

DuMuX DUNE is a free and open-source simulator for flow and transport processes in porous media, written in C++. This is the DuMuX DUNE kernel, which implements one of the communication and computation patterns found in DuMuX DUNE. The kernel implements a sparse alltoallv communication pattern in which computation is performed on the individual communicated buffers.
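
The data movement of such a pattern can be mimicked in a single process as a rough sketch. Nothing here is taken from the kernel itself: ranks are simulated as list indices, the names `sparse_alltoallv` and `process_buffers` are hypothetical, and the per-buffer computation (a sum) is a placeholder. In an MPI program the exchange would typically be an `MPI_Alltoallv` call in which most send and receive counts are zero.

```python
def sparse_alltoallv(send_buffers):
    """Simulate an all-to-all exchange with variable, mostly zero, message sizes.

    send_buffers[src] maps a destination rank to the list of values sent to it;
    pairs that do not communicate are simply absent from the mapping.
    Returns recv_buffers, where recv_buffers[dst] maps a source rank to its data.
    """
    n_ranks = len(send_buffers)
    recv_buffers = [dict() for _ in range(n_ranks)]
    for src, outgoing in enumerate(send_buffers):
        for dst, payload in outgoing.items():
            recv_buffers[dst][src] = list(payload)   # copy, as a real send would
    return recv_buffers


def process_buffers(recv_for_rank):
    """Placeholder per-buffer computation: sum each received buffer separately."""
    return {src: sum(buf) for src, buf in recv_for_rank.items()}
```

For example, `sparse_alltoallv([{1: [1.0, 2.0]}, {0: [3.0], 2: [4.0, 5.0]}, {}])` delivers one buffer each to ranks 0, 1, and 2, and rank 2 sends nothing.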

FFTXlib

FFTXlib is the stand-alone kernel that represents the Fast Fourier Transform (FFT) algorithm used in the Quantum ESPRESSO application, one of the most widely used plane-wave Density Functional Theory (DFT) codes in the materials science community. The FFT kernel implements layered MPI communication with FFT task groups, which splits the cost of collective communication operations to limit their impact on performance.

JuPedSim

JuPedSim is an open-source framework for simulating, analyzing, and visualizing pedestrian dynamics in complex geometries, which may include several exits and obstacles.

OpenMP Critical

The OpenMP critical-section pattern was observed in an oil and gas code, and the computational aspects of that original code are recreated here. This application solves the 3D wave equation $\frac{\partial^{2}u}{\partial t^{2}} = c^{2}\nabla^{2}u$ using the pseudospectral method.
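
As a rough illustration of the pseudospectral idea, the sketch below evaluates the Laplacian in Fourier space, where differentiation becomes multiplication by $-k^{2}$, and advances the solution with a second-order leapfrog step. It makes several simplifying assumptions that are not from the original code: one periodic dimension instead of three, a naive $O(N^2)$ discrete Fourier transform instead of an FFT library, leapfrog time stepping, no OpenMP, and hypothetical function names.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform (the inverse includes the 1/n scaling)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * j * k / n)
        out.append(acc / n if inverse else acc)
    return out


def spectral_laplacian_1d(u, length):
    """Second derivative of periodic samples u on a domain of the given length."""
    n = len(u)
    u_hat = dft(u)
    out_hat = []
    for k in range(n):
        m = k if k <= n // 2 else k - n          # signed mode number
        wavenumber = 2 * math.pi * m / length
        out_hat.append(-(wavenumber ** 2) * u_hat[k])
    return [z.real for z in dft(out_hat, inverse=True)]


def leapfrog_step(u_prev, u_curr, c, dt, length):
    """One step of u_next = 2 u_curr - u_prev + (c dt)^2 * laplacian(u_curr)."""
    lap = spectral_laplacian_1d(u_curr, length)
    return [2 * uc - up + (c * dt) ** 2 * l
            for uc, up, l in zip(u_curr, u_prev, lap)]
```

Applied to samples of $\sin(x)$ on $[0, 2\pi)$, `spectral_laplacian_1d` returns values close to $-\sin(x)$, which is a convenient sanity check.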

Parallel File I/O

A naive approach to file I/O in parallel software is for one process to sequentially read or write ASCII data from or to a single file (e.g. using the C `fscanf` and `fprintf` functions), with point-to-point communications to share the data with all other processes.
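
The shape of this naive pattern can be sketched in a single process. The example is illustrative only: the file layout (one number per line), the helper names `read_ascii_values`, `distribute`, and `demo`, and the contiguous chunking are assumptions, and the point-to-point sends are stood in for by list slices rather than real MPI calls.

```python
import os
import tempfile


def read_ascii_values(path):
    """Sequentially parse one float per line, as a single reader process would."""
    with open(path, "r", encoding="ascii") as handle:
        return [float(line) for line in handle if line.strip()]


def distribute(values, n_ranks):
    """Split values into contiguous chunks, one per rank.

    In an MPI code the reader would send each chunk with a point-to-point call;
    here the 'send' is just a list slice.
    """
    base, extra = divmod(len(values), n_ranks)
    chunks, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        chunks.append(values[start:start + size])
        start += size
    return chunks


def demo(n_values=10, n_ranks=4):
    """Write a small ASCII file, read it back serially, and split it per rank."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.txt")
        with open(path, "w", encoding="ascii") as handle:
            for i in range(n_values):
                handle.write(f"{float(i)}\n")
        return distribute(read_ascii_values(path), n_ranks)
```

Every value passes through the single reader before reaching its owner, so both the text parsing and the distribution are serialized on one process.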

RankDLB

RankDLB demonstrates performance issues arising in programs where the computational load per MPI rank evolves over time and therefore creates a load imbalance among MPI ranks. The computational problem must contain a coupling between MPI ranks where data is exchanged between ranks after the computation of a single iteration has completed.
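
A toy model shows why this coupling makes the imbalance costly. The sketch below is not RankDLB itself: the linear load drift in `rank_loads`, its default parameters, and the function names are hypothetical, and work is measured in abstract time units. Load balance is taken as the mean per-rank work divided by the maximum, so 1.0 means perfectly balanced.

```python
def rank_loads(n_ranks, iteration, base=100.0, drift=5.0):
    """Hypothetical load model: rank r gains drift * r units of work per iteration."""
    return [base + drift * rank * iteration for rank in range(n_ranks)]


def load_balance(loads):
    """Mean over maximum of the per-rank work; 1.0 is perfectly balanced."""
    return sum(loads) / (len(loads) * max(loads))


def simulate(n_ranks, n_iterations):
    """Return (total_time, per-iteration load balance) for a coupled run.

    Because ranks exchange data after every iteration, each iteration takes
    as long as its slowest rank.
    """
    total_time = 0.0
    balance = []
    for it in range(n_iterations):
        loads = rank_loads(n_ranks, it)
        total_time += max(loads)
        balance.append(load_balance(loads))
    return total_time, balance
```

In this model the run starts balanced and the load balance value falls with each iteration as the per-rank loads drift apart.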