# Parallel multi-file I/O

Pattern addressed: Sequential ASCII file I/O

In this pattern, data held on all processes is written to or read from a single ASCII file by a single process. This is inefficient for several reasons.

Required condition: In a per-process checkpoint/restart phase

This best practice uses multiple files for reading and writing, e.g. one file per process. The approach is appropriate when a single file is not required, e.g. when writing checkpoint data for a restart on the same number of processes, or when it is acceptable to aggregate the multiple files afterwards.

A strategy of reading or writing one file per process can give good performance, and avoids the need to communicate data between processes. However, opening many files for read and write operations can be inefficient, in which case a parallel file I/O library is likely to give better performance. Experimentation is necessary to decide which approach is best.

If adopting a multi-file approach, using node-local file systems will typically give better performance. Alternatively, it may be necessary to restrict file I/O to a small subset of processes so that the file system is not overwhelmed by many simultaneous open, read, and write requests.
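One common way to restrict I/O to a subset of processes is to nominate one writer per group of ranks, with the other ranks in the group sending their data to that writer. A minimal sketch of the rank-to-writer mapping is shown below; the function name and group size are illustrative assumptions, not part of any particular library.

```python
def io_rank_for(rank, ranks_per_writer=4):
    """Map a rank to the rank that performs I/O on its behalf.

    Hypothetical aggregation scheme: ranks are split into contiguous
    groups of `ranks_per_writer`, and the lowest rank in each group
    is the designated writer for that group.
    """
    return (rank // ranks_per_writer) * ranks_per_writer
```

With groups of four, ranks 0-3 would send their data to rank 0, ranks 4-7 to rank 4, and so on, so only every fourth process opens a file.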

The following pseudo-code shows the structure of the file I/O when writing data. Each process opens its own file and writes the data held in the array `data` to it. Each file has a unique name based on the process rank.

```
get_my_process_id(proc_id)

filename = "file_" + to_string(proc_id)
open_file(filename)
write_data_to_file(data, filename)
close_file(filename)
```
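A concrete version of the write phase might look as follows. This is a minimal sketch in Python: the `file_<rank>` naming scheme follows the pseudo-code above, while the use of `multiprocessing` to stand in for MPI ranks and the plain one-value-per-line format are assumptions for illustration.

```python
import os
from multiprocessing import Pool

def write_checkpoint(rank, data, directory="."):
    """Write this process's data to its own per-rank checkpoint file.

    The file name is derived from the rank, so no coordination between
    processes is needed and no data has to be communicated.
    """
    filename = os.path.join(directory, "file_" + str(rank))
    with open(filename, "w") as f:
        for value in data:
            f.write(str(value) + "\n")
    return filename

if __name__ == "__main__":
    # Simulate four "processes", each writing its own file in parallel.
    args = [(rank, [rank * 10 + i for i in range(3)]) for rank in range(4)]
    with Pool(4) as pool:
        files = pool.starmap(write_checkpoint, args)
    print(files)
```

In a real MPI application the rank would come from `MPI_Comm_rank` and each process would call the write routine directly; the key point is that each rank writes to a distinct file with no inter-process communication.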


The process is reversed when reading the data back from file, as follows.

```
get_my_process_id(proc_id)

filename = "file_" + to_string(proc_id)
open_file(filename)
read_data_from_file(data, filename)
close_file(filename)
```
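The read phase can be sketched in the same way. This assumes the same hypothetical `file_<rank>` naming and one-value-per-line format as the write example, and that the job restarts on the same number of processes so each rank finds its own file.

```python
import os

def read_checkpoint(rank, directory="."):
    """Read this process's data back from its per-rank checkpoint file.

    Mirrors the write phase: the file name is reconstructed from the
    rank, so each process reads only its own data.
    """
    filename = os.path.join(directory, "file_" + str(rank))
    with open(filename) as f:
        return [int(line) for line in f]
```

Because the rank-to-file mapping is deterministic, restart works without any index or metadata file, but it breaks if the restart uses a different process count; in that case the files must be aggregated or a parallel I/O library used instead.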