A Centre of Excellence in HPC

Programming language(s): Fortran

Programming model(s): OpenMP

Uses following algorithm(s): Fast Fourier Transformation

Used in following discipline(s): Density Functional Theory

The Juelich KKR code family (JuKKR) is a collection of codes developed at Forschungszentrum Juelich that implement the Korringa-Kohn-Rostoker (KKR) Green’s function method for density functional theory calculations. Because the KKR method is based on a multiple scattering formalism, it allows for highly accurate all-electron calculations.

One of the main codes in the family is KKRhost, which is used for electronic structure calculations of periodic systems. The code is written in Fortran and parallelized using a hybrid MPI + OpenMP approach. MPI ranks are logically arranged in a two-dimensional grid: along one dimension, ranks are distributed among atoms, and along the other, among the energy points of the quantum system under consideration.
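To illustrate the two-dimensional arrangement, the following is a minimal sketch in C of how a linear rank number could map onto an (atom, energy) grid position, and which block of items a group would then own. The row-major ordering, the block distribution, and all names (`GridCoord`, `rank_to_coord`, `block_begin`, `block_end`) are assumptions made for illustration. They are not taken from KKRhost, and no MPI calls are made.

```c
#include <assert.h>

/* Position of one rank in a hypothetical 2D (atom x energy) grid. */
typedef struct {
    int atom_group;    /* index along the atom dimension */
    int energy_group;  /* index along the energy dimension */
} GridCoord;

/* Assumption: row-major layout, energy index varying fastest. */
GridCoord rank_to_coord(int rank, int energy_groups)
{
    assert(rank >= 0 && energy_groups > 0);
    GridCoord c = { rank / energy_groups, rank % energy_groups };
    return c;
}

/* Block distribution of n items over `parts` groups:
   first item owned by group g. */
int block_begin(int n, int parts, int g)
{
    assert(n >= 0 && parts > 0 && g >= 0 && g < parts);
    return (int)((long)n * g / parts);
}

/* One past the last item owned by group g. */
int block_end(int n, int parts, int g)
{
    assert(n >= 0 && parts > 0 && g >= 0 && g < parts);
    return (int)((long)n * (g + 1) / parts);
}
```

With 4 atom groups and 6 energy groups (24 ranks), `rank_to_coord(13, 6)` yields atom group 2 and energy group 1. Under this block distribution, that energy group would own energy points 4 to 7 out of 24.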

The application employs a self-consistency loop to iteratively compute the electron density of a quantum system until convergence, as shown by the following pseudo-C code:

```
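for (int i = 0; i < MAX_SCF_ITERATIONS; i++)
{
    // main1a: single-site problem for every atom and energy point
    for (int a = 0; a < NUM_ATOMS; a++)
        for (int e = 0; e < NUM_ENERGY_PTS; e++)
            singleSiteProblem(a, e);
    // main1b: algebraic Dyson equation for every energy point and k-point
    for (int e = 0; e < NUM_ENERGY_PTS; e++)
        for (int k = 0; k < NUM_KPTS; k++)
            dysonEquation(e, k);
    // main1c: density construction for every atom and energy point
    for (int a = 0; a < NUM_ATOMS; a++)
        for (int e = 0; e < NUM_ENERGY_PTS; e++)
            constructDensity(a, e);
    // main2: energy integration and exchange correlation
    newPotential = energyIntegrationAndExchangeCorrelation();
}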
```

Typically, the *main1b* part, which solves the algebraic Dyson equation, is the hotspot and accounts for around 60% of the application's runtime.
In contrast, *main1c* accounts for around 25% of the runtime and *main1a* for around 10%.
The remaining 5% of the runtime is spent in the energy integration and exchange-correlation parts (*main2*).
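These runtime shares bound what optimizing a single phase can achieve. As a worked estimate using Amdahl's law, and assuming the quoted shares hold for the run in question, speeding up only *main1b* (share $f = 0.6$) by a factor $s$ while leaving the other phases unchanged gives an overall speedup of

$$
S(s) = \frac{1}{(1 - f) + \dfrac{f}{s}} = \frac{1}{0.4 + \dfrac{0.6}{s}}
$$

For example, $S(2) = 1/0.7 \approx 1.43$, and even for $s \to \infty$ the overall speedup cannot exceed $1/0.4 = 2.5$.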

Our testcase is a 3D unit cell of a crystal lattice containing 4 gold atoms. In each of the three dimensions, 40 k-points are used to discretize the unit cell, giving 40 × 40 × 40 = 64000 k-points in total. Moreover, there are 24 energy points. With respect to the pseudocode shown above, this yields the following parameters:

- NUM_ATOMS = 4
- NUM_ENERGY_PTS = 24
- NUM_KPTS = 64000
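The parameter values translate directly into the number of calls each loop nest of the pseudocode makes per self-consistency iteration. The sketch below does that arithmetic in C. The struct `WorkCounts` and the function `count_work` are hypothetical names introduced here for illustration and are not part of KKRhost.

```c
#include <assert.h>

/* Calls per SCF iteration, following the loop nests of the pseudocode. */
typedef struct {
    long num_kpts;           /* k-points in the full 3D mesh */
    long single_site_calls;  /* main1a: atoms x energy points */
    long dyson_calls;        /* main1b: energy points x k-points */
    long density_calls;      /* main1c: atoms x energy points */
} WorkCounts;

WorkCounts count_work(long num_atoms, long num_energy_pts, long kpts_per_dim)
{
    assert(num_atoms > 0 && num_energy_pts > 0 && kpts_per_dim > 0);
    WorkCounts w;
    /* Same number of k-points along each of the three dimensions. */
    w.num_kpts = kpts_per_dim * kpts_per_dim * kpts_per_dim;
    w.single_site_calls = num_atoms * num_energy_pts;
    w.dyson_calls = num_energy_pts * w.num_kpts;
    w.density_calls = num_atoms * num_energy_pts;
    return w;
}
```

For the testcase, `count_work(4, 24, 40)` gives 64000 k-points, 96 calls each in the *main1a* and *main1c* loop nests, and 24 × 64000 = 1536000 calls in the *main1b* loop nest.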

Our kernel is a miniapp extracted from the KKRhost code. It represents the k-point integration that solves the algebraic Dyson equation in the *main1b* part, using two selected energy points as examples.
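The overall shape of such a k-point integration can be sketched as a sum over k-points for one fixed energy point. This is only a structural illustration under stated assumptions: the per-k-point work is reduced to a hypothetical scalar callback (`kpoint_term_fn`), whereas the real kernel solves a matrix equation; all k-points are assumed to carry equal weight; and placing an OpenMP reduction on the k loop is an assumption, not a statement about how KKRhost is threaded. The names `integrate_kpoints` and `constant_term` are likewise invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-k-point term: stands in for the Dyson-equation
   solve at energy point e and k-point k. */
typedef double (*kpoint_term_fn)(int e, long k);

/* Equal-weight k-point sum for one energy point.
   Assumption: every k-point has weight 1/num_kpts. */
double integrate_kpoints(kpoint_term_fn term, int e, long num_kpts)
{
    assert(term != NULL && num_kpts > 0);
    double sum = 0.0;
    /* Ignored by the compiler when OpenMP is not enabled. */
    #pragma omp parallel for reduction(+:sum)
    for (long k = 0; k < num_kpts; k++) {
        sum += term(e, k);
    }
    return sum / (double)num_kpts;
}

/* Trivial demo term that depends only on the energy index. */
double constant_term(int e, long k)
{
    (void)k;  /* unused in this demo */
    return (double)(e + 1);
}
```

Calling `integrate_kpoints(constant_term, e, 64000)` for each of the two selected energy indices would mirror the miniapp's outer structure. With the demo term, the result for a given `e` is simply `e + 1`.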