Hi guys, this week's tip is about measuring the performance of your Python code. Maybe you left your cubicle last Friday with some piece of code that wasn't performing well? Well, today you will learn how to measure it! :)
The timeit module
We all know that Python comes with batteries included because in its standard library there are a lot of tools that you can easily use out of the box.
One of these tools is the timeit module, which lets you measure how long a specific piece of Python code takes to run. In short, it takes your piece of code, runs it a million times by default, and then returns the total execution time of the run.
Ok, let's start with an example. You have been asked to write a small function that takes a list of strings and returns a single string that is the concatenation of all the strings in the list.
Here is your piece of code:
```python
def concatenate(strings: list[str]) -> str:
    result = ""
    for item in strings:
        # Append a separator and the next string on every iteration
        result = result + "," + item

    return result
```
It was easy, wasn't it?
Now we need to test it… let's write some code to test it.
```python
def concatenate(strings: list[str]) -> str:
    result = ""
    for item in strings:
        result = result + "," + item

    return result

if __name__ == '__main__':
    import random
    import string

    # Build a list of 100 random ten-character uppercase strings
    my_list = []
    for _ in range(100):
        my_string = ""
        for _ in range(10):
            my_string = my_string + random.choice(string.ascii_uppercase)
        my_list.append(my_string)

    result = concatenate(my_list)
    print(result)
```
Ok, now we have added some code to create a random list of 100 ten-character strings and to call our concatenate() function on it.
Running the program we can see that the program works… but how fast is it?
Let's find out with the timeit() function of the timeit module.
The signature of the function we will use is the following:
```
timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000, globals=None)
```
So in our code we can import the timeit module and use this function, where:
stmt is the statement to be timed, written as a string
setup is an optional string of code that prepares the environment; it is run once, before the timing starts
timer is an optional parameter to specify the timer we want to use (by default it is time.perf_counter())
number is the number of times the statement is executed; by default it is one million
globals is an optional parameter that lets you specify the namespace in which the code is executed
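To make these parameters a bit more concrete, here is a minimal sketch that is separate from the program we are building. The statement, the setup string and the small number value are made up for illustration only.

```python
import timeit

# setup runs once and is not timed; stmt runs `number` times and is timed.
elapsed = timeit.timeit(
    stmt="sorted(data)",
    setup="data = list(range(1000, 0, -1))",
    number=1000,  # far fewer than the default of one million runs
)

# The return value is the total time for all runs, in seconds, as a float.
print(f"{elapsed:.4f} s for 1000 runs")
```

The exact figure will differ on every machine and every run, so treat it as a rough indication rather than a precise measurement.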
Let's try it:
```python
def concatenate(strings: list[str]) -> str:
    result = ""
    for item in strings:
        result = result + "," + item

    return result

if __name__ == '__main__':
    import random
    import string
    import timeit

    # Build a list of 100 random ten-character uppercase strings
    my_list = []
    for _ in range(100):
        my_string = ""
        for _ in range(10):
            my_string = my_string + random.choice(string.ascii_uppercase)
        my_list.append(my_string)

    # Run the statement one million times and print the total time in seconds
    print(timeit.timeit("concatenate(my_list)", globals=globals()))
```
Please note that in our example we had to pass the globals parameter: it tells timeit in which namespace to look up the names used in the statement, here our concatenate() function and my_list.
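If passing globals=globals() feels too broad, there are a couple of other ways to make your names visible to timeit. The sketch below uses a stand-in function and list (join_all and words are hypothetical names, not part of the program above) and a reduced number so it finishes quickly.

```python
import timeit

def join_all(words):
    # Stand-in for the function under test.
    return ",".join(words)

words = ["AAA", "BBB", "CCC"]

# Option 1: pass a zero-argument callable instead of a string.
t_callable = timeit.timeit(lambda: join_all(words), number=10_000)

# Option 2: pass a dict that contains only the names the statement needs.
t_namespace = timeit.timeit(
    "join_all(words)",
    globals={"join_all": join_all, "words": words},
    number=10_000,
)

print(t_callable, t_namespace)
```

Keep in mind that the callable form adds the cost of one extra function call per run, which can matter when the code being timed is very fast.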
Running the example on my Intel-based MacBook Pro, this is what I got:
```
$ python timeit1.py
24.02475388700077
```
Ok, let's face it, this code sucks guys…
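Keep in mind that the number printed is the total for one million calls, not the cost of a single call. A small sketch of the conversion, using the figure from my run:

```python
total_seconds = 24.02475388700077  # output of timeit1.py on my machine
number = 1_000_000                 # timeit's default number of runs

# Convert the total into the average cost of one call, in microseconds
per_call_us = total_seconds / number * 1_000_000
print(f"{per_call_us:.2f} microseconds per call")  # 24.02 microseconds per call
```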
Please note that while it runs, the timeit() function disables the garbage collector so that the individual measurements are more comparable. This is usually ok, but sometimes you want garbage collection included in the measurement, because it can be an important part of the performance of the code you are measuring. In that case, you can re-enable it by making gc.enable() the first statement of the setup parameter.
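A minimal sketch of re-enabling garbage collection through setup, with a made-up statement and a reduced number so it finishes quickly. The import gc inside setup is my addition, so that the snippet does not rely on gc already being available in the namespace where the code runs.

```python
import timeit

stmt = "[str(i) for i in range(100)]"  # made-up statement to time

# Default behaviour: the garbage collector is disabled while timing.
without_gc = timeit.timeit(stmt, number=10_000)

# Re-enable it as the first thing the setup code does.
with_gc = timeit.timeit(stmt, setup="import gc; gc.enable()", number=10_000)

print(f"GC disabled: {without_gc:.4f} s")
print(f"GC enabled:  {with_gc:.4f} s")
```

For a statement this small the two numbers will probably be close; the difference tends to show up with code that allocates many container objects.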
However, as you may know, there's a better way to concatenate strings in Python: the .join() method of a string object.
So in our case, our code could be written like this:
```python
def concatenate(strings: list[str]) -> str:
    return ",".join(strings)

if __name__ == '__main__':
    import random
    import string
    import timeit

    # Build a list of 100 random ten-character uppercase strings
    my_list = ["".join(random.choice(string.ascii_uppercase) for _ in range(10)) for _ in range(100)]

    print(timeit.timeit("concatenate(my_list)", globals=globals()))
```
Yes, in the previous code we had reinvented the wheel… and by the way: if you are wondering how I created the my_list list in this last example, check out my article about list comprehensions.
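One detail worth checking before comparing timings: the two versions do not return exactly the same string, because the loop version also puts a separator before the first item. A quick sketch (the names concatenate_loop and concatenate_join are mine, just to tell the two versions apart):

```python
def concatenate_loop(strings):
    # Same logic as the first version: a separator is added before every item.
    result = ""
    for item in strings:
        result = result + "," + item
    return result

def concatenate_join(strings):
    # Same logic as the second version: separators only between items.
    return ",".join(strings)

sample = ["AAA", "BBB", "CCC"]
print(concatenate_loop(sample))  # ,AAA,BBB,CCC
print(concatenate_join(sample))  # AAA,BBB,CCC
```

The difference is a single character, so it should not meaningfully affect the timing comparison, but it matters if other code depends on the exact output.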
Now, let's run this code and see what we get:
```
$ python timeit2.py
1.2398378039997624
```
Ok, we have cut the running time by about 95%, let's call it a day! ;)
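For the record, here is how that figure can be derived from the two timings I got, as a small sketch:

```python
slow = 24.02475388700077   # timeit1.py: string concatenation in a loop
fast = 1.2398378039997624  # timeit2.py: ",".join()

reduction = (slow - fast) / slow * 100  # percentage of time saved
speedup = slow / fast                   # how many times faster

print(f"time reduced by {reduction:.1f}%")  # time reduced by 94.8%
print(f"about {speedup:.1f}x faster")       # about 19.4x faster
```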
So, what have we learned?
to time a small piece of Python code, a good first tool is the timeit module of the standard library
optimizing your code is super important
the standard library is your friend :)
The timeit module has a lot of other features that we haven't discussed in this article; to find out more, refer to the official standard library documentation.
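As a taste of those other features, here is a short sketch of two of them, timeit.repeat() and Timer.autorange(). The statement and setup strings are made up for illustration, and the number is reduced so the example runs quickly.

```python
import timeit

stmt = '",".join(words)'
setup = 'words = ["ABCDEFGHIJ"] * 100'

# repeat() runs the whole measurement several times and returns a list.
# The minimum is usually the most useful value: higher ones tend to
# reflect interference from other processes rather than your code.
timings = timeit.repeat(stmt, setup=setup, repeat=5, number=10_000)
print(f"best of 5: {min(timings):.4f} s for 10,000 runs")

# autorange() picks a number of loops so that the total time is at
# least 0.2 seconds, and returns (number_of_loops, total_time).
loops, total = timeit.Timer(stmt, setup=setup).autorange()
print(f"{loops} loops took {total:.4f} s")
```

The module can also be used from the command line with python -m timeit, which is handy for quick one-off checks.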