GPU Kernel Main

Version name: GPU Kernel Main; a version of the GPU-Kernel program.
Repository: [home] and version downloads: [.zip] [.tar.gz] [.tar.bz2] [.tar]
Patterns and behaviours:

GPU uncoalesced memory transfer

Recommended best-practices:

GPU align memory accesses

Available version(s):

GPU Kernel Optimized

This kernel showcases the importance of memory coalescing using a simple matrix multiplication: two 1024x1024 matrices of random integers are multiplied. The multiplication is implemented the naive way, i.e. the threads are distributed over a two-dimensional grid, and each thread iterates along a row of the first matrix and down a column of the second. No optimization techniques such as blocking or tiling are used.
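To make the coalescing effect concrete, the following host-side C++ sketch counts how many memory segments the threads of one warp touch in a single step of the inner loop, for two possible thread-to-element mappings. It is a simplified model, not the kernel's source: row-major storage, 4-byte integers, 32-thread warps, 128-byte segments, and the function names are all assumptions made for illustration, and the page does not state which mapping the kernel actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Assumed model parameters (only N comes from the description above).
constexpr std::size_t N = 1024;            // matrix dimension
constexpr std::size_t kWarpSize = 32;      // threads per warp (assumed)
constexpr std::size_t kSegmentBytes = 128; // memory segment size (assumed)

// Counts the distinct 128-byte segments touched when each of the 32 threads
// in a warp loads one 4-byte int at (row(t), col(t)) of a row-major matrix.
template <typename RowFn, typename ColFn>
std::size_t segments_touched(RowFn row, ColFn col) {
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < kWarpSize; ++t) {
        assert(row(t) < N && col(t) < N);  // stay inside the matrix
        std::size_t byte_offset = (row(t) * N + col(t)) * sizeof(std::int32_t);
        segments.insert(byte_offset / kSegmentBytes);
    }
    return segments.size();
}

// Consecutive threads handle consecutive columns and all read row k:
// the 32 addresses are adjacent in memory.
std::size_t coalesced_case(std::size_t k) {
    return segments_touched([k](std::size_t) { return k; },
                            [](std::size_t t) { return t; });
}

// Consecutive threads handle consecutive rows and all read column k:
// neighbouring addresses are N * 4 bytes apart.
std::size_t strided_case(std::size_t k) {
    return segments_touched([](std::size_t t) { return t; },
                            [k](std::size_t) { return k; });
}
```

Under these assumptions `coalesced_case(0)` returns 1 and `strided_case(0)` returns 32: in the strided mapping each thread of the warp lands in its own segment, so the same 32 values cost up to 32 times as many memory transactions.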

The following experiments have been registered:

GPU-kernel analysis of initial version

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 676553 (POP1) and 824080 (POP2).

Currently, the project receives funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101143931 (POP3). The JU receives support from the European Union's Horizon Europe research and innovation programme and Spain, Germany, France, Portugal and the Czech Republic.