"Premature optimization is the root of all evil" — Donald Knuth
Today while programming I found myself thinking about the performance impact of various basic Python operations such as method invocation, the "in" operator, etc.
This is a mistake: you really shouldn't be thinking about this stuff while programming. The performance difference between doing things one way versus another is usually so small that you couldn't measure it.
Besides, your program is often IO-bound rather than CPU-bound, which means that even if you magically reduced CPU time to zero you probably wouldn't notice much of a difference anyway.
The correct way to go about optimization is to ignore it altogether and just focus on optimizing for readability. Optimizing for performance is the last thing you should do. Literally. When you're done programming and you discover you really do need to squeeze more performance out of your program, you can run it through a profiler, identify the hotspots where your program is spending most of its time, and then go optimize those for better IO or CPU performance.
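The standard library already ships the profiler for this workflow. A minimal sketch using cProfile and pstats (the `slow_sum` function is just a made-up hotspot for illustration):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # deliberately unidiomatic loop so it shows up as a hotspot
    total = 0
    for i in range(n):
        total += i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100000)
profiler.disable()

# print the 5 most expensive functions by cumulative time
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```

The report tells you where the time actually went, which is usually not where you guessed.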
But... I was still curious about Python's performance so I ran some tests. Just to put my mind at ease. And because it's pretty neat to measure things that are literally happening millions of times a second.
Some fun facts (my computer, one thread):
In general, high-level Pythonic code is faster than equivalent lower-level Python code.
Python's no-op statement, "pass", can run 50M times a second.
Function calls can happen 6.5M times a second.
Instances can be created 3.5M times a second.
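Numbers like these are easy to reproduce with the timeit module; the exact rates depend heavily on your machine and Python version, so treat the figures above as rough orders of magnitude. A sketch:

```python
import timeit

n = 1_000_000

# seconds per iteration of an empty statement
pass_time = timeit.timeit("pass", number=n) / n

# seconds per call of an empty function
call_time = timeit.timeit("f()", setup="def f(): pass", number=n) / n

# seconds per instantiation of a trivial class
inst_time = timeit.timeit("C()", setup="class C: pass", number=n) / n

for label, t in [("pass", pass_time), ("call", call_time), ("instance", inst_time)]:
    print("%-8s ~%.0f per second" % (label, 1 / t))
```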
For membership tests, a set is faster than a list once you have more than 5 elements:
# execution time increases linearly with the size of the list
# runs at 40M comparisons a second
# if len(array) == 1000000 this will run 40 times a second
val in array

# execution time is constant
# runs 8M times a second
val in set
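You can verify the gap yourself; the exact break-even point and rates will vary by machine. A sketch using a deliberate worst case for the list (a value that isn't there, so every element is scanned):

```python
import timeit

data = list(range(1000))
data_set = set(data)
missing = -1  # not in the collection: the list must scan all 1000 elements

list_time = timeit.timeit(lambda: missing in data, number=10000)
set_time = timeit.timeit(lambda: missing in data_set, number=10000)

print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

At 1000 elements the set wins by a wide margin, since its lookup is a constant-time hash probe.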
set(iterable) is 4X slower than list(iterable), but still really fast:
In one second you can create a set with 130,000 elements, or a list with 446,000 elements.
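The construction gap is also easy to measure; the exact ratio depends on the element type and machine (hashing each element is what makes the set slower). A sketch:

```python
import timeit

n = 100_000
items = list(range(n))

# copying into a list is a near-memcpy; building a set hashes every element
list_time = timeit.timeit(lambda: list(items), number=100)
set_time = timeit.timeit(lambda: set(items), number=100)

print("list(): %.3fs  set(): %.3fs  ratio: %.1fX"
      % (list_time, set_time, set_time / list_time))
```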
Bottom line: thinking about writing well-optimized Python code is usually a waste of time and energy.
Here's a little code snippet I use to time stuff:
import time

def bench(f, howmany):
    start = time.time()
    for i in range(howmany):
        f()
    end = time.time()
    elapsed = end - start
    print("%d runs in %.4f seconds (%.2f per/sec)" % (howmany, elapsed, howmany / elapsed))

# example usage: measure calls to a no-op function
def noop():
    pass

bench(noop, 1000000)
The timeit module from the standard library can come in useful as well:
import timeit

# by default timeit runs the statement a million times
timeit.Timer(stmt="pass").timeit()

# how many times a second (1000 runs divided by their total time)
1 / timeit.Timer("pass").timeit(1000) * 1000