Python loops

Program's name: Python loops
Available version(s):
Programming language(s): Python
Used in following discipline(s): Seismic Data Processing

The kernel Python loops is a synthetic program based on a real-world HPC Python script. It reproduces an inefficient way of writing loop-based compute algorithms in Python.

The kernel first reads the input arrays from a file. It then executes two compute steps, each of which traverses the input arrays and performs some basic computation on each element. The second step allocates two output arrays that hold the final results. These two output arrays are finally written to a binary file so that the correctness of the solution can be checked after modifying the code. The first compute step is much more expensive than the second: it has 5 levels of for-loops and calls matrix operations such as transpose and ravel, whereas the second step has only 4 levels of for-loops and performs only matrix multiplications and additions.
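
The correctness check described above can be sketched as follows. This is only an illustration: the kernel's actual file format is not specified here, so the sketch assumes NumPy's .npz container, and the names save_outputs, outputs_match, output1 and output2 are hypothetical.

```python
import numpy as np

def save_outputs(path, output1, output2):
    """Store both output arrays in one binary .npz file (assumed format)."""
    np.savez(path, output1=output1, output2=output2)

def outputs_match(reference_path, candidate_path, rtol=1e-12, atol=0.0):
    """Return True if both files hold the same arrays within a tolerance."""
    with np.load(reference_path) as ref, np.load(candidate_path) as new:
        return all(
            ref[name].shape == new[name].shape
            and np.allclose(ref[name], new[name], rtol=rtol, atol=atol)
            for name in ("output1", "output2")
        )
```

A typical workflow would be to save the outputs of the unmodified kernel once as a reference, and then call outputs_match after every optimization. A tolerance is used rather than exact equality because reordering floating-point operations can change the last bits of the result.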

The operations of both compute steps are trivial. The only important things to know are:

  • the final solution is deterministic.
  • the second compute step has an if-else clause in its innermost loop.
  • there are no data dependencies. All iterations within a compute step are independent.

The following pseudo-code summarizes what the compute steps look like:

def compute_step_1():
    for i in N:
        # load_temporal_data
        for j in M:
            # load_temporal_data
            for k in P:
                for l in Z:
                    # load_temporal_data
                    for f in H:
                        # matrix_operations
                        m.transpose()
                        m.ravel()
                        m.sum()
                m += a/b*c
    return m

def compute_step_2():
    for j in M:
        for k in P:
            for l in Z:
                # initializes_output_arrays
                for f in H:
                    if condition:
                        output1 += a*b
                        output2 += a*b
                    else:
                        output1 = 0
                        output2 = a
    return [output1, output2]
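
Because the iterations are independent, the if-else clause of the second step can in principle be expressed as a whole-array operation. The sketch below is a toy version of that pattern, not the kernel's real code: it assumes that each (j, k, l, f) iteration touches only its own array element, and the condition a > threshold, as well as the names step2_loops and step2_vectorized, are hypothetical.

```python
import numpy as np

def step2_loops(a, b, threshold):
    """Naive version: four nested loops with an if-else in the innermost one."""
    M, P, Z, H = a.shape
    out1 = np.zeros_like(a)
    out2 = np.zeros_like(a)
    for j in range(M):
        for k in range(P):
            for l in range(Z):
                for f in range(H):
                    if a[j, k, l, f] > threshold:  # assumed condition
                        out1[j, k, l, f] += a[j, k, l, f] * b[j, k, l, f]
                        out2[j, k, l, f] += a[j, k, l, f] * b[j, k, l, f]
                    else:
                        out1[j, k, l, f] = 0
                        out2[j, k, l, f] = a[j, k, l, f]
    return out1, out2

def step2_vectorized(a, b, threshold):
    """Same result computed with whole-array operations and np.where."""
    mask = a > threshold
    prod = a * b
    out1 = np.where(mask, prod, 0.0)  # 'if' branch where mask is True, else 0
    out2 = np.where(mask, prod, a)    # 'if' branch where mask is True, else a
    return out1, out2
```

Both functions should return the same arrays for floating-point inputs of equal shape, which can be checked with np.allclose. The vectorized form moves the loops and the branch into NumPy's compiled routines.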

The main issue with this kernel is that it implements the algorithm in a very naive way. As written, it is executed entirely by Python’s interpreter, which is far slower than compiled code.
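
The cost of interpreted loops can be observed with a small, self-contained measurement. The sketch below is unrelated to the kernel's actual computation: it compares a hypothetical element-by-element sum of squares with the equivalent NumPy call, which runs in compiled code. The measured times depend on the machine and the array size, so no figures are given here.

```python
import timeit
import numpy as np

def sum_of_squares_loop(values):
    """Element-by-element loop, executed by the Python interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_numpy(values):
    """Same reduction delegated to NumPy's compiled dot product."""
    return float(np.dot(values, values))

if __name__ == "__main__":
    data = np.random.default_rng(0).random(1_000_000)
    t_loop = timeit.timeit(lambda: sum_of_squares_loop(data), number=3)
    t_numpy = timeit.timeit(lambda: sum_of_squares_numpy(data), number=3)
    print(f"interpreted loop: {t_loop:.3f} s, NumPy: {t_numpy:.3f} s")
```

The two functions agree only up to floating-point rounding, since they may accumulate the terms in a different order.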