Blog Tags: 

Python performance tests reaffirms "premature optimization is the root of all evil"

"Premature optimization is the root of all evil"

Today while programming I found myself thinking about the performance impact of various basic Python operations such as method invocation, the "in" operator, etc.

This is a mistake because you really shouldn't be thinking about this stuff while programming. The performance impact of doing things one way vs another way is usually so small that you couldn't measure it.

Besides, often your program is IO bound rather than CPU bound which means even if you magically optimized CPU time to 0 you probably wouldn't notice it that much anyway.

The correct way to go about optimization is to ignore it altogether and just focus on optimizing for readability. Optimizing for performance is the last thing you should do. Literally. When you're done programming and you discover you really do need to squeeze out more performance out of your program you can run it through a profiler, identify the hotspots where your program is spending most of its time and then go optimize those for better IO or CPU performance.

But... I was still curious about Python's performance so I ran some tests. Just to put my mind at ease. And because it's pretty neat to measure things that are literally happening millions of times a second.

Some fun facts (my computer, one thread):

  • In general, high level Pythonic code is faster than equivalent lower level Python code.

  • Python's null-op, "pass" can run 50M times a second

  • function calls can happen 6.5M times a second

  • instances can be created 3.5M times a second

  • If making comparisons, set is faster than array when you have over 5 elements:

    # execution time increases linearly with size of array
    # runs at 40M comparisons a second
    # if len(array) == 1000000 this will run 40 times a second
    val in array
    # execution time constant
    # runs 8M times a second
    val in set
  • set(iterable) is 4X slower than list(iterable) but still really fast

    In 1 second == create a set with 130,000 elements. In 1 second == create a list with 446,000 elements.

Bottom line: thinking about writing well optimized Python code is usually a waste of time and energy.


Here's a little code snippets I use to time stuff:

import os
import time

def bench(f, howmany):
    start = time.time()
    for i in xrange(howmany):
    end = time.time()

    elapsed = end - start

    print "%d runs in %.4f seconds (%.2f per/sec)" % (howmany, elapsed, howmany / elapsed)

The timeit module from the standard library can can come in useful as well:

import timeit

# by default timeit runs a million times

# how many times a second
1 / timeit.Timer("pass").timeit(1000) * 1000


Liraz Siri's picture

I think it's better to focus on readability rather than adapt your style to squeeze out more performance our of a specific implementation of Python. Readability never gets old, while the underlying implementation details may change in a way that breaks your assumptions on what is fast and what isn't.
Liraz Siri's picture

It's not unusual for rules of thumb to break down when you venture too far from the well beaten path. Circumstances matter.


Add new comment