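collections.Counter: making most_common include equal counts
Question:
In collections.Counter, the method most_common(n) returns only the n most frequent items. That is what I need, except that I also need it to include any items whose count is equal to that of the n-th item.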
from collections import Counter
test = Counter(["A","A","A","B","B","C","C","D","D","E","F","G","H"])
-->Counter({'A': 3, 'C': 2, 'B': 2, 'D': 2, 'E': 1, 'G': 1, 'F': 1, 'H': 1})
test.most_common(2)
-->[('A', 3), ('C', 2)]
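I need [('A', 3), ('B', 2), ('C', 2), ('D', 2)], because in this case B, C and D all have the same count as the item at n = 2. My real data is DNA code and can be quite large, so I need this to be reasonably efficient.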
Answer:
You can do the following:
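from collections import Counter
from itertools import takewhile

def get_items_upto_count(dct, n):
    data = dct.most_common()  # all items, sorted by count in descending order
    if n <= 0 or not data:
        return []
    # Count of the n-th most common item (clamped if n exceeds the number of items).
    val = data[min(n, len(data)) - 1][1]
    # Collect all leading items whose count is greater than or equal to `val`.
    return list(takewhile(lambda x: x[1] >= val, data))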
test = Counter(["A","A","A","B","B","C","C","D","D","E","F","G","H"])
print(get_items_upto_count(test, 2))
# [('A', 3), ('B', 2), ('C', 2), ('D', 2)]  (order among tied items can vary by Python version)
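Since the question mentions large inputs, here is a possible variation that avoids sorting every item. It is only a sketch, not part of the original answer: the function name most_common_with_ties is made up for illustration. It uses heapq.nlargest to find the n-th highest count, filters the counter in one pass, and sorts only the items that survive. Whether this is actually faster depends on how many distinct items there are relative to n, so it is worth timing on real data.

```python
import heapq
from collections import Counter


def most_common_with_ties(counter, n):
    """Return the n most common items plus any items tied with the n-th count.

    Hypothetical helper for illustration; not part of collections.Counter.
    """
    if n <= 0 or not counter:
        return []
    # The n largest counts; the last one is the cutoff (the n-th highest count).
    # If n exceeds the number of items, the cutoff is simply the smallest count.
    cutoff = heapq.nlargest(n, counter.values())[-1]
    # Keep every item at or above the cutoff, then sort only that subset.
    kept = [(item, count) for item, count in counter.items() if count >= cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


test = Counter(["A", "A", "A", "B", "B", "C", "C", "D", "D", "E", "F", "G", "H"])
print(most_common_with_ties(test, 2))
```

For the example data this returns A with count 3 followed by B, C and D with count 2. The relative order of the tied items follows the counter's iteration order.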