This program represents the original behavior of the GMRES solver found in the CalculiX application. It solves the first non-symmetric linear system occurring in the very first time step of a simulation of airflow through a bent pipe.
The structure of the program is given by the following (pseudo) code:
gmres_serial(int i)
{
    // get data chunk based on i
    // perform GMRES solver on data chunk in serial
}

for (cid = 0; cid < NUM_CPUS; cid++)
{
    // start one worker thread per data chunk
    pthread_create(&tid[cid], gmres_serial, cid)
}
for (cid = 0; cid < NUM_CPUS; cid++)
{
    // wait for every worker thread to finish
    pthread_join(tid[cid])
}
In order to solve a large sparse linear system of size 1536000 x 1536000 with more than 10 million non-zero elements, #T smaller subsystems are created, where #T denotes the number of worker threads used.
Each thread solves one of these small subsystems on its own using a serial GMRES implementation.
Each thread therefore computes a subset of the rows of the final solution vector of the whole system.
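The splitting of the rows among the threads can be sketched as follows. This is only an illustration: it assumes contiguous blocks of nearly equal size, which the description above does not confirm, and the names row_range and rows_for_thread are hypothetical and not taken from the program.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open row range [begin, end) assigned to one worker thread. */
typedef struct {
    size_t begin;
    size_t end;
} row_range;

/* Split n_rows into n_threads contiguous blocks whose sizes differ by at
 * most one row. Contiguous blocks are an assumption made for this sketch;
 * the program may select its data chunks differently. */
row_range rows_for_thread(size_t n_rows, size_t n_threads, size_t tid)
{
    assert(n_threads > 0 && tid < n_threads);
    size_t base  = n_rows / n_threads;  /* rows that every thread gets */
    size_t extra = n_rows % n_threads;  /* first `extra` threads get one more */
    row_range r;
    r.begin = tid * base + (tid < extra ? tid : extra);
    r.end   = r.begin + base + (tid < extra ? 1 : 0);
    return r;
}
```

With 1536000 rows and 4 threads, each thread would own 384000 rows under this scheme. Note that an equal number of rows does not imply an equal amount of work, since the number of GMRES iterations needed for each subsystem is not known in advance.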
In this original version of the program, the threads are created using the pthread programming model.
The program creates as many threads as given by the value of NUM_CPUS, which can be set using the OMP_NUM_THREADS environment variable.
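How NUM_CPUS is derived from OMP_NUM_THREADS is not shown here. A minimal sketch of one possible way to read the variable is given below. The function names parse_thread_count and num_cpus_from_env, as well as the fallback to a single thread, are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a thread count from the string s. Returns `fallback` if s is
 * NULL, empty, not a plain positive decimal integer, or out of range.
 * A list value such as "4,2" is also rejected in this sketch. */
int parse_thread_count(const char *s, int fallback)
{
    assert(fallback > 0);
    if (s == NULL || *s == '\0')
        return fallback;

    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v <= 0 || v > INT_MAX)
        return fallback;
    return (int)v;
}

/* Read OMP_NUM_THREADS; the fallback of 1 thread is an assumption. */
int num_cpus_from_env(void)
{
    return parse_thread_count(getenv("OMP_NUM_THREADS"), 1);
}
```

Keeping the parsing separate from the getenv call makes the logic easy to check without touching the environment.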
Along with the source code of the program comes a Makefile. This Makefile offers a single target, pattern-pthread, which builds the program representing the original solver behavior as described here.
Executing this program exhibits behavior related to the pattern “Load imbalance due to computational complexity (unknown a priori)”.
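One common way to quantify such an imbalance is the ratio of the slowest thread's runtime to the mean runtime of all threads. The sketch below computes this ratio from per-thread wall-clock times. The metric and the name imbalance_factor are not taken from the program; they only illustrate what the pattern means in numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Ratio of the slowest thread's time to the mean time over all threads.
 * 1.0 means perfectly balanced; larger values mean that the other threads
 * spend more time idle before the final join. */
double imbalance_factor(const double *seconds, size_t n_threads)
{
    assert(seconds != NULL && n_threads > 0);
    double max = seconds[0];
    double sum = 0.0;
    for (size_t i = 0; i < n_threads; i++) {
        if (seconds[i] > max)
            max = seconds[i];
        sum += seconds[i];
    }
    double mean = sum / (double)n_threads;
    assert(mean > 0.0);
    return max / mean;
}
```

For hypothetical thread times of 1, 1, 1 and 5 seconds, the mean is 2 seconds and the factor is 2.5, meaning the slowest thread takes two and a half times the average runtime.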
Moreover, the Makefile makes it possible to instrument the code with Score-P (scorep). To do so, comment out lines 2 and 3 and uncomment lines 6, 7 and 8, so that the beginning of the Makefile looks like this:
# run without scorep instrumentation
#SCOREP =
#SCOREP_FLAG =
# run with scorep instrumentation
SCOREP = scorep --user --nocompiler
SCOREP_PTHREAD = $(SCOREP) --thread=pthread
SCOREP_FLAG = -DSCOREP
###############################################
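Since the instrumented build passes --user to scorep and defines SCOREP, the source presumably contains manual Score-P user regions guarded by that macro. How the program actually does this is not shown here. The following is a sketch under that assumption, using the user-region macros from the Score-P user API. The function solve_chunk and the region name "gmres_chunk" are hypothetical.

```c
#include <assert.h>

#ifdef SCOREP
#include <scorep/SCOREP_User.h>
#else
/* Without -DSCOREP the region macros expand to nothing. */
#define SCOREP_USER_REGION_DEFINE(handle)
#define SCOREP_USER_REGION_BEGIN(handle, name, type)
#define SCOREP_USER_REGION_END(handle)
#endif

/* Hypothetical stand-in for the per-thread solver call. */
long solve_chunk(int chunk_id)
{
    SCOREP_USER_REGION_DEFINE(region)
    assert(chunk_id >= 0);

    SCOREP_USER_REGION_BEGIN(region, "gmres_chunk", SCOREP_USER_REGION_TYPE_COMMON)
    long work = 0;
    for (int i = 0; i <= chunk_id; i++)
        work += i;  /* placeholder for the GMRES iterations */
    SCOREP_USER_REGION_END(region)

    return work;
}
```

Built without -DSCOREP, the function compiles and runs as plain C. Built through the scorep wrapper with -DSCOREP, the time spent between the BEGIN and END macros would be recorded per thread, which is what makes the different solve times of the subsystems visible in a profile or trace.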