Removing objects whose counts are less than threshold in counter.

I have a counter declared as main_dict = Counter(), and values are added with main_dict[word] += 1. In the end I want to remove all elements with a frequency below 15. Is there a function in Counter to do this?

Any help appreciated.

>>> from collections import Counter
>>> counter = Counter({'baz': 20, 'bar': 15, 'foo': 10})
>>> Counter({k: c for k, c in counter.items() if c >= 15})
Counter({'baz': 20, 'bar': 15})
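
The comprehension above builds a new Counter and leaves the original untouched. If the existing object has to be modified in place, one possible sketch is the following; remove_below is a hypothetical helper name, not part of collections.

```python
from collections import Counter

def remove_below(counter, threshold):
    """Delete, in place, every key whose count is below threshold (hypothetical helper)."""
    # Collect the keys first: deleting while iterating over the Counter
    # itself would raise a RuntimeError.
    for key in [k for k, c in counter.items() if c < threshold]:
        del counter[key]

main_dict = Counter({'baz': 20, 'bar': 15, 'foo': 10})
remove_below(main_dict, 15)
print(main_dict)  # Counter({'baz': 20, 'bar': 15})
```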

No, you’ll need to remove them manually. Using itertools.dropwhile() makes that a little easier perhaps:

from itertools import dropwhile

for key, count in dropwhile(lambda key_count: key_count[1] >= 15, main_dict.most_common()):
    del main_dict[key]

Demonstration:

>>> main_dict
Counter({'baz': 20, 'bar': 15, 'foo': 10})
>>> for key, count in dropwhile(lambda key_count: key_count[1] >= 15, main_dict.most_common()):
...     del main_dict[key]
... 
>>> main_dict
Counter({'baz': 20, 'bar': 15})

By using dropwhile you only test keys until the first count below 15 turns up; after that it stops testing and passes everything else through. That works well with the sorted most_common() list. If many values are below 15, that saves the execution time of all those tests.
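
To see how few tests dropwhile actually runs, here is a small sketch that records every count the predicate is asked about. The extra keys 'qux' and 'quux' and the calls list are made up for illustration.

```python
from collections import Counter
from itertools import dropwhile

main_dict = Counter({'baz': 20, 'bar': 15, 'foo': 10, 'qux': 3, 'quux': 1})
calls = []  # every count the predicate gets tested against

def at_least_15(key_count):
    calls.append(key_count[1])
    return key_count[1] >= 15

# most_common() returns a separate sorted list, so deleting from
# main_dict inside the loop is safe.
for key, count in dropwhile(at_least_15, main_dict.most_common()):
    del main_dict[key]

print(main_dict)  # Counter({'baz': 20, 'bar': 15})
print(calls)      # [20, 15, 10]
```

The predicate runs for 20, 15 and 10 only; the counts 3 and 1 are passed through without being tested.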

Another method:

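from collections import Counter

c = Counter({'baz': 20, 'bar': 15, 'foo': 10})
# elements() repeats each key as many times as its count, so the kept keys are re-counted
print(Counter(el for el in c.elements() if c[el] >= 15))
# Counter({'baz': 20, 'bar': 15})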

May I suggest another solution:

from collections import Counter
main_dict = Counter({'baz': 20, 'bar': 15, 'foo': 10})  
trsh = 15

main_dict = Counter(dict(filter(lambda x: x[1] >= trsh, main_dict.items())))
print(main_dict)

>>> Counter({'baz': 20, 'bar': 15})

I also had the same problem, but I needed a list of all keys from the Counter whose values are at or above some threshold. To do this (in Python 3, map() returns an iterator, so wrap it in list()):

keys_list = list(map(lambda x: x[0], filter(lambda x: x[1] >= trsh, main_dict.items())))
print(keys_list) 

>>> ['baz', 'bar']

An elegant solution when you only need to drop counts of zero or less:

main_dict += Counter()
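
A short sketch of what that does, with made-up counts: Counter addition keeps only positive counts, so adding an empty Counter in place removes every key whose count is zero or negative.

```python
from collections import Counter

main_dict = Counter({'baz': 2, 'bar': 0, 'foo': -1})
main_dict += Counter()  # in-place addition keeps only positive counts
print(main_dict)  # Counter({'baz': 2})

# Unary plus (Python 3.3+) gives the same result as a new Counter.
other = +Counter({'baz': 2, 'bar': 0, 'foo': -1})
print(other)  # Counter({'baz': 2})
```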

An example of how to filter items whose count is greater than or less than a threshold in a Counter:

from collections import Counter
from itertools import takewhile, dropwhile


data = (
    "Here's a little song about Roy G. Biv. "
    "He makes up all the colors that you see where you live. "
    "If you know all the colors, sing them with me: "
    "red, orange, yellow, green, blue, indigo, violet all that you see."
)

c = Counter(data)

more_than_10 = dict(takewhile(lambda i: i[1] > 10, c.most_common()))
less_than_2 = dict(dropwhile(lambda i: i[1] >= 2, c.most_common()))

print(f"> 10 {more_than_10} \n< 2 {less_than_2}")

Output:

> 10 {' ': 40, 'e': 23, 'o': 16, 'l': 15, 't': 12} 
< 2 {"'": 1, 'R': 1, 'G': 1, 'B': 1, 'p': 1, 'I': 1, 'f': 1, ':': 1}

Simply do a list comprehension of dictionary items:

[el for el in c.items() if el[1] >= 15]
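
Note that this comprehension yields a list of (key, count) tuples rather than a Counter. A minimal sketch of turning that list back into a Counter or into a plain list of keys follows; the names pairs, filtered and keys are chosen here for illustration.

```python
from collections import Counter

c = Counter({'baz': 20, 'bar': 15, 'foo': 10})

pairs = [el for el in c.items() if el[1] >= 15]
print(pairs)  # [('baz', 20), ('bar', 15)]

filtered = Counter(dict(pairs))       # back to a Counter
keys = [key for key, _ in pairs]      # just the surviving keys
print(filtered)  # Counter({'baz': 20, 'bar': 15})
print(keys)      # ['baz', 'bar']
```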


The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.