Python loops (numpy)

One best-practice recommendation is to rely on NumPy's optimized routines for our operations. In this particular case, we use NumPy's vectorization to remove all for-loops, so that NumPy computes all elements of our arrays in a single call instead of one element per loop iteration. The outcome of this optimization is shown below.
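To make the transformation concrete, here is a minimal sketch of replacing an explicit for-loop with a vectorized NumPy expression. The function names and the computation itself (`factor * a + b`) are hypothetical and chosen only for illustration; they are not the compute_step_1 or compute_step_2 kernels measured in this section.

```python
import numpy as np


def scale_and_add_loop(a, b, factor):
    """Hypothetical kernel written with an explicit Python for-loop."""
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        # One interpreted iteration per element.
        out[i] = factor * a[i] + b[i]
    return out


def scale_and_add_vectorized(a, b, factor):
    """Same computation as a single vectorized NumPy expression."""
    # NumPy applies the operation to every element in compiled code.
    return factor * a + b


rng = np.random.default_rng(0)
a = rng.random(10_000)
b = rng.random(10_000)

loop_result = scale_and_add_loop(a, b, 2.5)
vec_result = scale_and_add_vectorized(a, b, 2.5)

# Both versions should agree up to floating-point rounding.
print(np.allclose(loop_result, vec_result))
```

The vectorized version has no Python-level loop at all: the iteration over elements happens inside NumPy's compiled routines.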

Compute Structure (numpy)

Metric                               Python for-loops   NumPy
Total elapsed time [s]               1545.69            1.67
compute_step_1 elapsed time [s]      1544.10            1.59
compute_step_2 elapsed time [s]      0.56               0.02
Total instructions                   9.62e12            1.34e10
compute_step_1 instructions          9.62e12            1.33e10
compute_step_2 instructions          4.87e9             9.19e7
Total average IPC                    2.02               2.60
compute_step_1 average IPC           2.02               2.65
compute_step_2 average IPC           2.63               1.88

By using NumPy vectorization we achieved a speedup of 925.56x in total elapsed time (from 1545.69 s to 1.67 s). Since this version replaces Python's for-loops with NumPy vectorization, the result shows just how slow generic Python loops are.
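The speedup figure can be reproduced from the elapsed times reported in the table. The short sketch below does that arithmetic; the dictionary names are hypothetical, and the numbers are copied from the measurements above. Note that the compute_step_2 ratio is coarse, since its NumPy timing is reported with only two decimals.

```python
# Elapsed times in seconds, copied from the table above.
loop_times = {"total": 1545.69, "compute_step_1": 1544.10, "compute_step_2": 0.56}
numpy_times = {"total": 1.67, "compute_step_1": 1.59, "compute_step_2": 0.02}

# Speedup = time before the optimization / time after it.
speedups = {name: loop_times[name] / numpy_times[name] for name in loop_times}

for name, value in speedups.items():
    print(f"{name}: {value:.2f}x")
```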