GPU Kernel Optimized

Version name: GPU Kernel Optimized; a version of the GPU-Kernel program.
Repository: [home] and version downloads: [.zip] [.tar.gz] [.tar.bz2] [.tar]
Patterns and behaviours: Recommended best-practices: Implemented best practices: GPU align memory accesses

This version of the matrix multiplication kernel is structurally identical to the original version, i.e. the matrix multiplication is still implemented in the naive way. However, in this version we mark the input arrays with the __restrict keyword, which explicitly assures the compiler that the arrays do not alias and that it is therefore safe to optimize the memory access pattern. This simple change yields a nearly 3x improvement in runtime over the original version.
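The kernel source itself is not reproduced on this page, so the following is only a minimal host-side C++ sketch of the idea, not the program's actual code. The function name `matmul_naive`, the row-major layout, and the square `n x n` shape are assumptions made for illustration. `__restrict` is a compiler extension (accepted by GCC, Clang, MSVC and nvcc), not standard C++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration: naive row-major product C = A * B for n x n matrices.
// The __restrict qualifiers promise the compiler that a, b and c never refer to
// overlapping memory. Without that promise, every store to c could in principle
// modify a or b, so the compiler has to reload the inputs after each store.
void matmul_naive(const float* __restrict a,
                  const float* __restrict b,
                  float* __restrict c,
                  std::size_t n)
{
    // Debug-only check; it catches just the simplest violation of the promise
    // (the output sharing a base pointer with an input).
    assert(c != a && c != b);

    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            c[i * n + j] = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                // Accumulate directly into c, as a naive implementation would.
                c[i * n + j] += a[i * n + k] * b[k * n + j];
            }
        }
    }
}
```

The loop nest is deliberately left untouched; only the pointer qualifiers differ from an unannotated version. The qualifier is a promise rather than a check: passing overlapping arrays results in undefined behaviour, so it should only be added where the caller can guarantee that the arrays are distinct.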