One of the best-practice proposals is to use the Numba JIT (just-in-time) compiler to speed up Python loops. The outcome of this optimization is shown below, comparing the original code (Master) with the Numba version.
Metric | Master | Numba |
---|---|---|
Total elapsed time [s] | 1545.69 | 261.70 |
compute_step_1 elapsed time [s] | 1544.10 | 253.29 |
compute_step_2 elapsed time [s] | 0.56 | 2.9e-3 |
Total instructions | 9.62e12 | 1.24e12 |
compute_step_1 instructions | 9.62e12 | 1.19e12 |
compute_step_2 instructions | 4.87e9 | 5.25e6 |
Total average IPC | 2.02 | 1.72 |
compute_step_1 average IPC | 2.02 | 1.71 |
compute_step_2 average IPC | 2.63 | 1.53 |
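By adding Numba decorators we achieved a speedup of 5.9x (1545.69 s down to 261.70 s), mostly by reducing the total number of instructions by a factor of 7.76 (9.62e12 down to 1.24e12). The average IPC dropped from 2.02 to 1.72, which is part of why the speedup is smaller than the reduction in instructions. Almost all of the gain comes from compute_step_1, which accounts for nearly the entire run time in both versions. The machine code generated by Numba needs far fewer instructions than the Python interpreter does to execute the same loops.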
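As an illustration of what "using Numba decorators" can look like, here is a minimal sketch. The function `sum_squared_distances` and its loop body are hypothetical stand-ins, since the actual code of compute_step_1 and compute_step_2 is not shown here. The sketch assumes the hot loop works on NumPy arrays and scalars, which is the case Numba's nopython mode handles best. The `try`/`except` fallback is only there so the snippet still runs, uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # nopython-mode JIT decorator
except ImportError:
    # Assumption: fall back to a no-op decorator so the sketch still runs
    # (as plain interpreted Python) when Numba is unavailable.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def sum_squared_distances(positions):
    """Hypothetical kernel: sum of squared distances over all point pairs.

    Written as explicit nested loops, the kind of code that is slow in the
    interpreter and that Numba compiles to machine code.
    """
    n = positions.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i, 0] - positions[j, 0]
            dy = positions[i, 1] - positions[j, 1]
            total += dx * dx + dy * dy
    return total


points = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
result = sum_squared_distances(points)
```

Note that the first call to a decorated function includes the compilation time, so timings are best taken on a second call or on a run long enough for that one-off cost to be negligible.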