Assuming that I have a list with a huge number of items,
l = [ 1, 4, 6, 30, 2, ... ]
I want to get the number of items from that list, where an item satisfies a certain condition. My first thought was:
count = len([i for i in l if my_condition(i)])
But if the filtered list also has a great number of items, I think that creating a new list for the filtered result is just a waste of memory. For efficiency, IMHO, the above call can’t be better than:
count = 0
for i in l:
    if my_condition(i):
        count += 1
Is there any functional-style way to get the number of items that satisfy the condition without generating a temporary list?
You can use a generator expression:
>>> l = [1, 3, 7, 2, 6, 8, 10]
>>> sum(1 for i in l if i % 4 == 3)
2
or even
>>> sum(i % 4 == 3 for i in l)
2
which uses the fact that True == 1 and False == 0.
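Summing booleans works because bool is a subclass of int in Python. A small sketch to check this, reusing the sample list (the variable names are only illustrative):

```python
# bool is a subclass of int, so True and False act as 1 and 0 in arithmetic.
values = [1, 3, 7, 2, 6, 8, 10]

# A list of booleans, built here only to make the intermediate values visible.
flags = [v % 4 == 3 for v in values]

count_from_flags = sum(flags)          # adds 1 for every True
is_subclass = issubclass(bool, int)    # True
```

In real code the generator form avoids building the flags list at all.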
Alternatively, you could use itertools.imap (Python 2) or simply map (Python 3):
>>> def my_condition(x):
...     return x % 4 == 3
...
>>> sum(map(my_condition, l))
2
You want a generator comprehension rather than a list here.
l = [1, 4, 6, 7, 30, 2]

def my_condition(x):
    return x > 5 and x < 20

print(sum(1 for x in l if my_condition(x)))
# -> 2
print(sum(1 for x in range(1000000) if my_condition(x)))
# -> 14
You could also use itertools.imap (though I think the explicit list and generator expressions look somewhat more Pythonic).
Note that, though it’s not obvious from the sum example, you can compose generator comprehensions nicely. For example,
inputs = xrange(1000000)  # In Python 3 and above, use range instead of xrange
odds = (x for x in inputs if x % 2)  # Pick odd numbers
sq_inc = (x**2 + 1 for x in odds)  # Square and add one
print(sum(x // 2 for x in sq_inc))  # Actually evaluate each one (// keeps integer division on Python 3)
# -> 83333333333500000
The cool thing about this technique is that you can specify conceptually separate steps in code without forcing evaluation and storage in memory until the final result is evaluated.
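To see the deferred evaluation directly, here is a minimal sketch. The noisy_square helper and the evaluated list are hypothetical, added only to record when work actually happens:

```python
evaluated = []  # records which inputs were actually processed

def noisy_square(x):
    """Hypothetical helper: squares x and notes that it ran."""
    evaluated.append(x)
    return x * x

inputs = range(10)
odds = (x for x in inputs if x % 2)          # nothing computed yet
squares = (noisy_square(x) for x in odds)    # still nothing computed

calls_before = len(evaluated)   # 0: building the pipeline does no work
total = sum(squares)            # consuming the last stage drives every stage
calls_after = len(evaluated)    # 5: one call per odd number in range(10)
```

Each value flows through all stages one at a time, so no stage ever holds the full sequence in memory.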
This can also be done using reduce if you prefer functional programming:
from functools import reduce  # required on Python 3; reduce is a built-in on Python 2
reduce(lambda count, i: count + my_condition(i), l, 0)
This way you only do 1 pass and no intermediate list is generated.
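As a rough check of the single-pass claim, the sketch below runs the same reduce call over a one-shot iterator and over an empty list. The predicate and the count_matching wrapper are assumptions made for illustration:

```python
from functools import reduce  # needed on Python 3; also available on Python 2.6+

def my_condition(x):
    # Hypothetical predicate, chosen only for illustration.
    return x % 4 == 3

def count_matching(iterable):
    # The initial value 0 is the starting count; each boolean result adds 0 or 1.
    return reduce(lambda count, i: count + my_condition(i), iterable, 0)

one_shot = iter([1, 3, 7, 2, 6, 8, 10])   # can only be traversed once
count = count_matching(one_shot)           # 2
empty_count = count_matching([])           # 0, thanks to the initial value
```

Without the trailing 0, reduce would use the first element as the starting count and would raise TypeError on an empty input.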
You could do something like:
l = [1,2,3,4,5,..]
count = sum(1 for i in l if my_condition(i))
which just adds 1 for each element that satisfies the condition.
from itertools import imap  # Python 2 only; on Python 3 the built-in map is already lazy
sum(imap(my_condition, l))
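One caveat with summing the mapped results: it only counts correctly when the condition returns a bool (or 0/1). A short sketch of the difference, using a hypothetical predicate that returns a truthy non-bool value:

```python
def truthy_condition(x):
    # Hypothetical predicate returning a truthy/falsy value that is not a bool:
    # the remainder itself (0 is falsy, anything else is truthy).
    return x % 3

l = [1, 2, 3, 4, 5, 6]

# Summing the raw results adds the remainders, not the number of matches.
raw_sum = sum(map(truthy_condition, l))                 # 1 + 2 + 0 + 1 + 2 + 0 = 6

# Coercing with bool() first, or counting with a filter clause, gives the count.
coerced = sum(bool(truthy_condition(x)) for x in l)     # 4
filtered = sum(1 for x in l if truthy_condition(x))     # 4
```

The sum(1 for ... if ...) form is the safer default when the predicate's return type is not known.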