In a naive implementation, I/O operations are most likely implemented in serial: data is read from or written to disk on demand, whenever it is needed. This can significantly degrade performance if each operation transfers only a very small amount of data and many such operations occur.
Modern HPC file systems are designed to handle writing large amounts of data to a file. If the application instead performs many write operations that each write a very small chunk of data, the file system's capabilities are used inefficiently. This becomes even more apparent if the file system is connected to the HPC system via a network. In this case each write operation initiates a separate data transfer over the network, so every operation also pays the latency of establishing the connection to the file system. This cost adds up quickly when a large number of small write operations is performed.
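As a rough illustration of how the latency adds up, consider a simple cost model. The symbols and all numerical values below are assumptions chosen for illustration, not measurements: N write operations of s bytes each, a per-operation latency t_lat, a sustained bandwidth B, and an aggregation buffer of S bytes.

```latex
% Unbuffered: every small write pays the latency once.
T_{\text{small}} \approx N \left( t_{\text{lat}} + \frac{s}{B} \right)

% Buffered: the same data is sent in N' = \lceil N s / S \rceil large transfers.
T_{\text{large}} \approx \left\lceil \frac{N s}{S} \right\rceil t_{\text{lat}} + \frac{N s}{B}

% Assumed values: N = 1.6 \times 10^{6},\ s = 20\,\text{B},\
% t_{\text{lat}} = 10^{-4}\,\text{s},\ B = 10^{9}\,\text{B/s},\ S = 10^{6}\,\text{B}
T_{\text{small}} \approx 1.6 \times 10^{6} \left( 10^{-4} + 2 \times 10^{-8} \right)\,\text{s} \approx 160\,\text{s}
T_{\text{large}} \approx 32 \times 10^{-4}\,\text{s} + 0.032\,\text{s} \approx 0.035\,\text{s}
```

Under these assumed values the latency term dominates the unbuffered case almost entirely, while the buffered case is limited mainly by bandwidth.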
It is therefore recommended to use fewer write operations, each writing a large chunk of data. This can be achieved by accumulating the data that needs to be written in a buffer and then writing the whole buffer to the file with a single write operation.
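The accumulate-then-write idea can also be sketched by hand in standard Fortran. This is a minimal sketch rather than a definitive implementation: the program name, the file name output.bin, and the placeholder coordinate values are hypothetical, and the data is written as unformatted stream output (binary) instead of formatted text. Whether the single write statement results in a single transfer to the file system depends on the compiler's runtime library.

```fortran
program buffered_write_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: max_points = 1000000   ! one million points, as in the text
  real(dp), allocatable :: buffer(:, :)        ! 3 coordinates per point
  integer :: unit_no, i

  allocate(buffer(3, max_points))

  ! Accumulate all data in memory first (placeholder values for illustration).
  do i = 1, max_points
    buffer(:, i) = [real(i, dp), 2.0_dp * i, 3.0_dp * i]
  end do

  ! A single write statement hands the whole buffer to the runtime library.
  open(newunit=unit_no, file='output.bin', form='unformatted', &
       access='stream', status='replace', action='write')
  write(unit_no) buffer
  close(unit_no)

  deallocate(buffer)
end program buffered_write_sketch
```

The trade-off of this approach is memory: the buffer holds all 3 x 1,000,000 double-precision values (about 24 MB) until it is written. For larger data sets the buffer could be filled and flushed in fixed-size chunks instead.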
The following code skeleton illustrates the use of buffered write operations.
    ! MAX_POINTS = number of discretization points
    #define MAX_POINTS 1000000

    TYPE point
      real*8 :: position(3)
    END TYPE

    TYPE(point) :: points(MAX_POINTS)
    INTEGER :: i, j

    ! buffered='yes' requests buffered output for unit 42
    OPEN(42, file = 'output.dat', status = 'unknown', action = 'write', &
         position = 'rewind', buffered = 'yes')

    ! array indices start at 1
    DO i = 1, MAX_POINTS
      write(42, *) (points(i)%position(j), j=1,3)
    END DO

    close(42)
In this code example, the position coordinates in 3D space for 1 million discretization points are written to a file called output.dat. By specifying the keyword buffered='yes' when opening the file, the Fortran runtime library provides an internal buffer to which all subsequent write operations that target this specific file will write. Note that buffered is a compiler extension (available, for example, in the Intel Fortran compiler) and is not part of the Fortran standard. Once the buffer is full, or when the file is closed, its content is written to the file. The size of this buffer can and should be tuned to match the characteristics of the given application.
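How the buffer size is tuned depends on the compiler. As a hedged sketch, assuming the Intel Fortran compiler, the OPEN statement accepts the non-standard specifiers blocksize (buffer size in bytes) and buffercount (number of buffers) alongside buffered. The values 1048576 and 2 below are arbitrary starting points for experimentation, not recommendations, and other compilers will likely reject these specifiers.

```fortran
program tuned_buffer_sketch
  implicit none
  integer :: i

  ! Intel Fortran extensions (assumption: compiled with ifort/ifx):
  !   buffered    = 'yes'   enable buffering in the runtime library
  !   blocksize   = 1048576 buffer size in bytes (1 MiB, arbitrary choice)
  !   buffercount = 2       number of buffers (arbitrary choice)
  open(42, file='output.dat', status='replace', action='write', &
       buffered='yes', blocksize=1048576, buffercount=2)

  ! Many small formatted writes; the runtime collects them in its buffer.
  do i = 1, 1000
    write(42, *) i
  end do

  ! Closing the unit flushes any data still held in the buffer.
  close(42)
end program tuned_buffer_sketch
```

A reasonable way to choose the values is to time the application's real output phase for a few buffer sizes on the target file system, since the best setting depends on both the write pattern and the file system.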
In a small test example, around 1.6 million write operations of 20 bytes each (two double-precision floating-point numbers plus one integer) completed in roughly 0.7 seconds when using buffered I/O operations.
Recommended in program(s): CalculiX I/O unbuffered
Implemented in program(s): CalculiX I/O (buffered)