
OpenMP (Offload)

OpenMP has supported heterogeneous systems since version 4.0. Such a system consists of a host and one or more attached accelerator devices. The host is where the program begins its execution, while the target devices (such as GPUs) are accelerators attached to the host that can execute portions of the computation. A key goal of the OpenMP offload model is performance portability across different HPC clusters: it hides device-specific architectural details from the user.

OpenMP offloads work to these accelerators using the target construct, which transfers both code and data from the host to the target device for execution. OpenMP also provides a set of API routines for device-specific operations, such as querying device information, managing device data, and managing the thread hierarchy. The standard further defines environment variables that are read when the program starts and configure how kernels execute on the device.

The typical workflow for executing kernels on a device involves three main steps: 1) The host maps its data to the target device's data environment; 2) The host offloads one or more OpenMP target regions to the device for execution, potentially reusing the same data environment across several regions; and 3) The computed results are transferred from the device back to the host.

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No 676553 (POP1) and 824080 (POP2).

Currently, the project receives funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101143931 (POP3). The JU receives support from the European Union's Horizon Europe research and innovation programme and Spain, Germany, France, Portugal and the Czech Republic.