This kernel is extracted from the hotspot of DENISE Black Edition, a 2D time-domain isotropic (visco)elastic finite-difference (FD) modeling and full-waveform inversion (FWI) code. It comes from the part of the code that computes the propagation of SH waves. It demonstrates how severely subnormal floating-point calculations can slow execution when they are not handled properly.
The kernel can be compiled with or without flush-to-zero (FTZ) enabled to showcase potential performance differences due to subnormal handling. It can be run by using the provided Makefile as described below.
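To make the effect of FTZ concrete, here is a minimal sketch that is not part of the kernel. It assumes an x86-64 target, where the FTZ and DAZ bits live in the MXCSR register and can be set with the standard SSE intrinsic macros. The helper names `enable_ftz_daz` and `halve_smallest_normal` are hypothetical and exist only for this illustration. The Makefile itself relies on the -mdaz-ftz compiler flag rather than on code like this.

```c
#include <float.h>

#if defined(__x86_64__)
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _MM_FLUSH_ZERO_ON */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE, _MM_DENORMALS_ZERO_ON */
#endif

/* Hypothetical helper: switches on FTZ and DAZ for the calling thread.
 * Returns 1 on x86-64, 0 on targets this sketch does not cover. */
int enable_ftz_daz(void)
{
#if defined(__x86_64__)
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);          /* subnormal results become 0 */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);  /* subnormal inputs read as 0 */
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: halves the smallest normal float (FLT_MIN).
 * Under default IEEE 754 arithmetic the result is a nonzero subnormal;
 * with FTZ active it is flushed to zero. The volatile qualifier forces
 * the multiplication to happen at run time instead of being folded. */
float halve_smallest_normal(void)
{
    volatile float x = FLT_MIN;
    x = x * 0.5f;
    return x;
}
```

Calling `halve_smallest_normal()` before and after `enable_ftz_daz()` shows the difference: the first call yields a tiny nonzero value, the second yields exactly zero on a supported target. Values that are flushed this way no longer take the slow subnormal path.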
Requirements:

- A compiler that supports -mdaz-ftz, either GCC (set by default) or Clang.
- perf for measuring the hardware performance counters.

To compile the kernel, use the provided Makefile, which uses GCC by default. The Makefile includes the following compile targets:

- kernel: Compiles the kernel with default settings, FTZ disabled.
- kernel_ftz: Compiles the kernel with flush-to-zero (FTZ) mode, enabled by compiling with -mdaz-ftz.
- kernel_nooutput: As kernel, but without verbose output during execution.
- kernel_ftz_nooutput: As kernel_ftz, but without verbose output during execution.
- all: Compiles all versions of the kernel.

The following run targets are available in the Makefile:

- run: Runs the kernel compiled with FTZ disabled and outputs, for each iteration, the execution time and (for reference) the number of subnormals occurring in update_s_elastic_PML_SH.
- run_ftz: Runs the kernel with FTZ enabled.
- perf_ipc: Runs the kernel without FTZ and measures instructions per cycle (IPC) using perf.
- perf_ipc_ftz: Runs the kernel with FTZ enabled and measures IPC using perf.
- perf_assists (only available for Intel CPUs): Runs the kernel without FTZ and measures floating-point assists using perf.
- perf_assists_ftz (only available for Intel CPUs): Runs the kernel with FTZ enabled and measures floating-point assists using perf.
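
The run target reports how many subnormals occur per iteration. As a rough sketch of how such a count can be obtained, and not the kernel's actual implementation, the helpers below inspect the bit pattern of each value. They assume that float is an IEEE 754 binary32 type. The names `is_subnormal_f32` and `count_subnormals` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if v is subnormal, else 0.
 * Assumes IEEE 754 binary32: a subnormal has an all-zero exponent
 * field and a nonzero mantissa. The sign bit is ignored. */
int is_subnormal_f32(float v)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);  /* well-defined way to read the bits */
    return (bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0;
}

/* Hypothetical helper: counts the subnormal entries in values[0..n-1]. */
size_t count_subnormals(const float *values, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        count += (size_t)is_subnormal_f32(values[i]);
    }
    return count;
}
```

Checking the bits directly, rather than comparing against a threshold, keeps the count independent of the floating-point mode: comparisons on subnormal inputs can treat them as zero when denormals-are-zero handling is active.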