Patterns and behaviours:
Load imbalance due to computational complexity (unknown a priori)

Recommended best practices:
Conditional nested tasks within an unbalanced phase

This program represents the original behavior of the GMRES solver found in the CalculiX application.
It solves the first non-symmetric linear system occurring in the very first time step of a simulation of airflow through a bent pipe.

The structure of the program is given by the following (pseudo) code:

```
gmres_serial(int i)
{
    // get data chunk based on i
    // perform GMRES solver on data chunk in serial
}
```

```
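// one thread per data chunk, each with its own thread id
for (cid = 0; cid < NUM_CPUS; cid++)
{
    pthread_create(tid[cid], gmres_serial(cid))
}
// wait until every thread has finished its chunk
for (cid = 0; cid < NUM_CPUS; cid++)
    pthread_join(tid[cid])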
```
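
The pseudocode above leaves out the calling conventions of the pthread API. A minimal sketch of the same create/join structure in compilable C could look as follows. The names `worker_arg`, `gmres_worker` and `run_workers`, as well as the fixed value of `NUM_CPUS`, are assumptions made for illustration and are not taken from the CalculiX sources.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_CPUS 4 /* hypothetical fixed thread count for this sketch */

/* Hypothetical per-thread argument: chunk index plus a completion flag. */
typedef struct {
    int cid;
    int done;
} worker_arg;

/* Stand-in for the serial GMRES call. pthreads expect the signature
 * void *(*)(void *), so the chunk index is passed through a pointer. */
static void *gmres_worker(void *p)
{
    assert(p != NULL);
    worker_arg *arg = (worker_arg *)p;
    /* ... solve the subsystem that belongs to arg->cid here ... */
    arg->done = 1;
    return NULL;
}

/* Create one worker per chunk, then join all workers that were created.
 * Returns the number of workers that ran to completion. */
static int run_workers(void)
{
    pthread_t tid[NUM_CPUS];
    worker_arg args[NUM_CPUS];
    int created = 0;
    int completed = 0;

    for (created = 0; created < NUM_CPUS; created++) {
        args[created].cid = created;
        args[created].done = 0;
        if (pthread_create(&tid[created], NULL, gmres_worker,
                           &args[created]) != 0)
            break; /* stop creating, but still join the earlier threads */
    }
    for (int cid = 0; cid < created; cid++) {
        if (pthread_join(tid[cid], NULL) == 0)
            completed += args[cid].done;
    }
    return completed;
}
```

Each thread gets its own `worker_arg` slot, so the loop counter is never shared between threads. A caller can compare the return value of `run_workers()` with `NUM_CPUS` to detect a failed thread creation.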

In order to solve a large sparse linear system of size `1536000 x 1536000` with more than 10 million nonzero elements, `#T` smaller subsystems are created, where `#T` denotes the number of worker threads used.
Each thread solves one of these small subsystems on its own using a serial GMRES implementation.
In effect, each thread computes a subset of the rows of the solution vector of the whole system.
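
How the rows are assigned to threads is not stated here. As an illustration only, the sketch below assumes a contiguous block split in which every thread receives nearly the same number of rows. The type `row_range` and the function `rows_for_thread` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open row range [first, last) owned by one thread. */
typedef struct {
    size_t first;
    size_t last;
} row_range;

/* Assumed contiguous block split: the first (n % nthreads) threads
 * receive one extra row, so chunk sizes differ by at most one. */
static row_range rows_for_thread(size_t n, size_t nthreads, size_t tid)
{
    assert(nthreads > 0 && tid < nthreads);
    size_t base = n / nthreads;
    size_t extra = n % nthreads;
    row_range r;
    r.first = tid * base + (tid < extra ? tid : extra);
    r.last = r.first + base + (tid < extra ? 1 : 0);
    return r;
}
```

With `n = 1536000` and four threads, every thread would own 384000 rows under this assumption. Equal row counts do not imply equal run time, though: the cost of solving each subsystem is not known in advance, which is the load imbalance this kernel is meant to show.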

In this original version of the program, the threads are created using the *pthread* programming model.
The program creates as many threads as given by the value of `NUM_CPUS`, which can be set using the `OMP_NUM_THREADS` environment variable.
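
One way such a thread count could be derived from the environment is sketched below. The function name `num_cpus_from_env`, the fallback argument and the upper bound of 1024 are assumptions for illustration; the actual program may parse the variable differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Return the thread count given by OMP_NUM_THREADS, or `fallback`
 * if the variable is unset, empty, not a plain integer, or outside
 * the (hypothetical) range 1..1024. */
static int num_cpus_from_env(int fallback)
{
    assert(fallback >= 1);
    const char *s = getenv("OMP_NUM_THREADS");
    if (s == NULL || *s == '\0')
        return fallback;

    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 1 || v > 1024)
        return fallback;
    return (int)v;
}
```

Running the program with `OMP_NUM_THREADS=8` set in the environment would then yield eight worker threads under this sketch, while a missing or malformed value falls back to the default passed by the caller.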

## Building the program

The source code of the program comes with a *Makefile*.
This *Makefile* offers a single target, `pattern-pthread`, which builds the program representing the original solver behavior as described here.
Executing this program shows behavior related to the pattern “Load imbalance due to computational complexity (unknown a priori)”.

Moreover, this *Makefile* offers the possibility to instrument the code using *scorep*.
To do so, comment out lines 2 and 3 and uncomment lines 6, 7 and 8 so that the beginning of the *Makefile* looks like this:

```
# run without scorep instrumentation
#SCOREP =
#SCOREP_FLAG =
# run with scorep instrumentation
SCOREP = scorep --user --nocompiler
SCOREP_PTHREAD = $(SCOREP) --thread=pthread
SCOREP_FLAG = -DSCOREP
###############################################
```
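
The `--user` option together with `-DSCOREP` suggests that the sources contain manual Score-P user regions guarded by that macro. The following sketch shows what such a guarded region could look like. The function `solve_chunk`, the region name `"gmres_chunk"` and the placeholder loop are assumptions for illustration and do not reproduce the actual instrumentation of the program.

```c
#include <assert.h>

#ifdef SCOREP
#include <scorep/SCOREP_User.h> /* only needed when built with -DSCOREP */
#endif

/* Hypothetical per-chunk work, wrapped in a user region when the code
 * is compiled with -DSCOREP through the scorep wrapper. */
static long solve_chunk(int cid)
{
    assert(cid >= 0);
#ifdef SCOREP
    SCOREP_USER_REGION_DEFINE(region)
    SCOREP_USER_REGION_BEGIN(region, "gmres_chunk",
                             SCOREP_USER_REGION_TYPE_COMMON)
#endif
    long work = 0;
    for (int i = 0; i <= cid; i++) /* placeholder for the GMRES solve */
        work += i;
#ifdef SCOREP
    SCOREP_USER_REGION_END(region)
#endif
    return work;
}
```

Without `-DSCOREP` the function compiles as plain C, so the same source can be built with or without instrumentation by switching the *Makefile* variables shown above.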

