Python Performance
Parity Testing
from timeit import Timer t1 = Timer("for i in xrange(100): i % 2") t2 = Timer("for i in xrange(100): i & 1") # The "not" tests show what happens when interpreting # the result as a boolean t3 = Timer("for i in xrange(100): not i % 2") t4 = Timer("for i in xrange(100): not i & 1") print "Checking for odd parity with `mod`:\t%.4f" % t1.timeit() print "Checking for odd parity with `and`:\t%.4f" % t2.timeit() print "Checking for even parity with `mod`:\t%.4f" % t3.timeit() print "Checking for even parity with `and`:\t%.4f" % t4.timeit()
MacPython 2.7.2
Checking for odd parity with `mod`: 6.5617 Checking for odd parity with `and`: 5.3778 Checking for even parity with `mod`: 8.4417 Checking for even parity with `and`: 7.4086
PyPy 1.6.0 (with GCC 4.0.1; Python 2.7.2)
Checking for odd parity with `mod`: 0.2556 Checking for odd parity with `and`: 0.2312 Checking for even parity with `mod`: 1.7576 Checking for even parity with `and`: 0.6614
The results for odd parity were murky. Sometimes mod was slightly faster; sometimes bitwise-and was faster. There was no question with the even parity, however: The bitwise-and operator played much more nicely with the `not` operator than did the mod operator.
Jython 2.5.2
Checking for odd parity with `mod`: 3.4480 Checking for odd parity with `and`: 1.9380 Checking for even parity with `mod`: 3.6050 Checking for even parity with `and`: 2.0440
Tuple Unpacking
from timeit import Timer index1 = Timer("x = tpl[0]", "tpl = (5,)") unpack1 = Timer("x, = tpl", "tpl = (5,)") index2 = Timer("x = tpl[1]", "tpl = (5, 6)") unpack2 = Timer("y, x = tpl", "tpl = (5, 6)") index3 = Timer("x = tpl[2]", "tpl = (5, 6, 7)") unpack3 = Timer("y, y, x = tpl", "tpl = (5, 6, 7)") index4 = Timer("x = tpl[3]", "tpl = (5, 6, 7, 8)") unpack4 = Timer("y, y, y, x = tpl", "tpl = (5, 6, 7, 8)") list_index2 = Timer("[tpl[1] for tpl in tuples]", "tuples = [(i, i * i) for i in xrange(100)]") list_unpack2 = Timer("[y for x, y in tuples]", "tuples = [(i, i * i) for i in xrange(100)]") list_map2 = Timer("map(itemgetter(1), tuples)", "tuples = [(i, i * i) for i in xrange(100)]; from operator import itemgetter") times = 100000000 print "Indexing vs. unpacking a 1-tuple:\t%.4f\t%.4f" % (index1.timeit(number=times), unpack1.timeit(number=times)) print "Indexing vs. unpacking a 2-tuple:\t%.4f\t%.4f" % (index2.timeit(number=times), unpack2.timeit(number=times)) print "Indexing vs. unpacking a 3-tuple:\t%.4f\t%.4f" % (index3.timeit(number=times), unpack3.timeit(number=times)) print "Indexing vs. unpacking a 4-tuple:\t%.4f\t%.4f" % (index4.timeit(number=times), unpack4.timeit(number=times)) print "Indexing vs. unpacking a list of 2-tuples:\t%.4f\t%.4f" % (list_index2.timeit(), list_unpack2.timeit()) print "map() and itemgetter() (just for kicks):\t%.4f" % (list_map2.timeit())
MacPython 2.7.2
Indexing vs. unpacking a 1-tuple: 5.0712 3.3939 Indexing vs. unpacking a 2-tuple: 5.7888 6.2801 Indexing vs. unpacking a 3-tuple: 6.1820 7.5976 Indexing vs. unpacking a 4-tuple: 7.1802 7.8219 Indexing vs. unpacking a list of 2-tuples: 8.6561 8.3513 map() and itemgetter() (just for kicks): 9.1651
Unpacking is slightly faster for a tuple of a single item. This happens more often than you might thing; consider, for example, `struct.unpack(">H")`, which returns a tuple. Thus, use `val, = struct.unpack(">H")` in these situations instead of `val = struct.unpack(">H")[0]`. That said, use with care, since tuple unpacking is also slightly more unreadable than indexing, and so it does not seem that tuple unpacking causes a bottleneck for our software... yet. As the tuple grows, however, indexing is always faster. Also, as one might have suspected, `itemgetter` works more slowly than a list comprehension.
PyPy 1.6.0 (with GCC 4.0.1; Python 2.7.2)
Indexing vs. unpacking a 1-tuple: 0.2268 0.2279 Indexing vs. unpacking a 2-tuple: 0.2301 0.2302 Indexing vs. unpacking a 3-tuple: 0.2335 0.2320 Indexing vs. unpacking a 4-tuple: 0.2332 0.2344 Indexing vs. unpacking a list of 2-tuples: 1.2610 1.2698 map() and itemgetter() (just for kicks): 5.4586
There is no clear difference in pypy; both the indexing and unpacking operations seem to vary constantly. (I tested informally using a 250 item tuple. My test with 1000 slowed down unpacking considerably, but I suspect the bottleneck was with the source code parser, not the operation itself.) It is clear that map and itemgetter are significantly slower for pypy, however.
Jython 2.5.2
Indexing vs. unpacking a 1-tuple: 0.6510 1.1520 Indexing vs. unpacking a 2-tuple: 0.9610 0.7800 Indexing vs. unpacking a 3-tuple: 0.8930 0.8330 Indexing vs. unpacking a 4-tuple: 1.0250 0.8070 Indexing vs. unpacking a list of 2-tuples: 36.4800 40.4600 map() and itemgetter() (just for kicks): 11.0170
The Jython results varied from run to run, but it looks like unpacking was almost always faster. It also looks like Jython does not handle list comprehensions very well. Now you know.
Powers of Two
from timeit import Timer # Use 62 to prevent slowdown from long ints t1 = Timer("for i in xrange(62): 1 << i") t2 = Timer("for i in xrange(62): 2 ** i") times = 1000000 print "Bit-shifting vs. Exponentation:\t%.4f\t%.4f" % (t1.timeit(number=times), t2.timeit(number=times))
MacPython 2.7.2
Bit-shifting vs. Exponentation: 3.8654 8.3995
Bit-shifting wins by a longshot.
PyPy 1.6.0 (with GCC 4.0.1; Python 2.7.2)
Bit-shifting vs. Exponentation: 0.2184 2.0279
Again, bit-shifting wins by a longshot.
Jython 2.5.2
Bit-shifting vs. Exponentation: 2.6870 17.0960
Don't use Jython.