From the department of ‘optimisations that are so premature that if you ever find yourself actually caring about them, you need to find better problems to solve’, I recently had this thought: What’s the cheapest/fastest empty iterable in Python?
Reminder: iterable and iterator are not the same thing.
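To make that distinction concrete, here is a minimal sketch: an iterable is anything you can hand to iter(), while an iterator is the object iter() gives back and next() advances. The variable names are only for illustration.

```python
empty = ""        # an iterable: iter() accepts it
it = iter(empty)  # an iterator: the object that does the actual stepping

# An iterator returns itself from iter(); a plain iterable such as a string
# has no next method of its own (__next__ in Python 3, next in Python 2).
iterator_is_self = iter(it) is it
string_is_iterator = hasattr(empty, "__next__") or hasattr(empty, "next")

# Asking an iterator over an empty iterable for a value raises StopIteration,
# which is why a for loop over an empty iterable finishes immediately.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(iterator_is_self, string_is_iterator, exhausted)
```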
The need for an empty iterable occasionally comes up, like when you need to provide a default value for a missing key in a dictionary, and you need it to be something you can iterate over without running the risk of a TypeError. The beautiful thing, of course, is that you can iterate over an empty iterable and just have nothing happen, so the actual type or contents don’t matter.
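As a small sketch of that dictionary case: the config dict and its keys below are made up for illustration, and the empty string stands in for whichever empty iterable you prefer.

```python
# Hypothetical data: "tags" is present, "plugins" is missing.
config = {"tags": ["a", "b"]}

seen = []

# Key present: the loop runs over the stored list.
for tag in config.get("tags", ""):
    seen.append(tag)

# Key missing: the default is an empty iterable, so the loop body never runs.
for plugin in config.get("plugins", ""):
    seen.append(plugin)

# Without a default, get() returns None, and iterating over None fails.
try:
    for plugin in config.get("plugins"):
        seen.append(plugin)
    failed = False
except TypeError:
    failed = True

print(seen, failed)
```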
So I set out to test it. Again: you should never need to actually care about this. If you can live with the actual overhead of iterating over something, you can live with the overhead if that something is empty, no matter the actual type of iterable.
I evaluated strings, lists, tuples, dictionaries and sets. My hypothesis was that the fastest would be a string or maybe a tuple.
The test was performed on a late-2016 13″ MacBook Pro with a 3.3 GHz Intel Core i7 and I timed it using the timeit module.
First I tested out simply declaring the different types of iterables:
kweli:~ j$ python -m timeit -c '""'
100000000 loops, best of 3: 0.00672 usec per loop
kweli:~ j$ python -m timeit -c '[]'
100000000 loops, best of 3: 0.0187 usec per loop
kweli:~ j$ python -m timeit -c '()'
100000000 loops, best of 3: 0.0119 usec per loop
kweli:~ j$ python -m timeit -c '{}'
10000000 loops, best of 3: 0.0305 usec per loop
kweli:~ j$ python -m timeit -c 'set()'
10000000 loops, best of 3: 0.0924 usec per loop
So far so good: strings are the fastest, followed by tuples, lists and dicts, with sets trailing far behind.
Then to actually iterating over them:
kweli:~ j$ python -m timeit -c 'for i in "": pass'
10000000 loops, best of 3: 0.0433 usec per loop
kweli:~ j$ python -m timeit -c 'for i in []: pass'
10000000 loops, best of 3: 0.0514 usec per loop
kweli:~ j$ python -m timeit -c 'for i in (): pass'
10000000 loops, best of 3: 0.0438 usec per loop
kweli:~ j$ python -m timeit -c 'for i in {}: pass'
10000000 loops, best of 3: 0.0707 usec per loop
kweli:~ j$ python -m timeit -c 'for i in set(): pass'
10000000 loops, best of 3: 0.136 usec per loop
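And again, the hypothesis is confirmed, but interestingly the difference between lists and strings/tuples is much smaller when iterating compared to just declaring: a list takes a bit under 20% longer to iterate over than a string or a tuple, but roughly 1.6 to 2.8 times as long to declare.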
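If you would rather rerun the comparison from a script than from the shell, something like the following should do. It is only a sketch: the helper name best_usec and the loop counts are my own choices, it uses timeit's default timer rather than the -c option, and the numbers will differ between machines and Python versions.

```python
import timeit

# The same five candidates as in the command-line runs.
candidates = ['""', "[]", "()", "{}", "set()"]


def best_usec(stmt, number=100000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    runs = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(runs) / number * 1e6


# Map each candidate to (declare time, iterate time).
results = {}
for literal in candidates:
    results[literal] = (
        best_usec(literal),
        best_usec("for i in %s: pass" % literal),
    )

# Print the candidates, fastest iteration first.
for literal, (declare, iterate) in sorted(results.items(), key=lambda kv: kv[1][1]):
    print("%-6s declare %.4f usec, iterate %.4f usec" % (literal, declare, iterate))
```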
So in conclusion, use a string as an empty iterable, unless you have any reason at all not to. The difference is infinitesimal.