Elegant way to reduce a list of dictionaries?
Question:
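I have a list of dictionaries, each containing exactly the same keys. I want to find the average value for each key, and I'd like to know how to do it using reduce (or, failing that, with something more elegant than nested for loops).
Here is the list: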
[
    {
        "accuracy": 0.78,
        "f_measure": 0.8169374016795885,
        "precision": 0.8192088044235794,
        "recall": 0.8172222222222223
    },
    {
        "accuracy": 0.77,
        "f_measure": 0.8159133315763016,
        "precision": 0.8174754717495807,
        "recall": 0.8161111111111111
    },
    {
        "accuracy": 0.82,
        "f_measure": 0.8226353934130455,
        "precision": 0.8238175920455686,
        "recall": 0.8227777777777778
    }, ...
]
I'd like to get back a dictionary like this:
{
    "accuracy": 0.81,
    "f_measure": 0.83,
    "precision": 0.84,
    "recall": 0.83
}
Here is what I have so far, but I don't like it:
folds = [ ... ]
keys = folds[0].keys()
# Start every key's running mean at 0.
results = dict.fromkeys(keys, 0)
for fold in folds:
    for k in keys:
        # Add this fold's share of the mean for key k.
        results[k] += fold[k] / len(folds)
print(results)
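Answer:
As an alternative, if you do this kind of calculation on your data regularly, you may want to use pandas (it is overkill for a one-off, but it greatly simplifies tasks like this...)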
import pandas as pd
data = [
    {
        "accuracy": 0.78,
        "f_measure": 0.8169374016795885,
        "precision": 0.8192088044235794,
        "recall": 0.8172222222222223
    },
    {
        "accuracy": 0.77,
        "f_measure": 0.8159133315763016,
        "precision": 0.8174754717495807,
        "recall": 0.8161111111111111
    },
    {
        "accuracy": 0.82,
        "f_measure": 0.8226353934130455,
        "precision": 0.8238175920455686,
        "recall": 0.8227777777777778
    },  # ...
]
result = pd.DataFrame.from_records(data).mean().to_dict()
This gives you:
{'accuracy': 0.79000000000000004,
'f_measure': 0.8184953755563118,
'precision': 0.82016728940624295,
'recall': 0.81870370370370382}
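Since the question asks specifically about reduce, here is a minimal sketch of how that might look without pandas. It assumes Python 3, a non-empty list, and that every dictionary has the same keys with numeric values. The function name mean_by_key and the small sample list are made up for illustration and are not from the original post.

```python
from functools import reduce


def mean_by_key(records):
    """Average each key over a non-empty list of dicts sharing the same keys."""
    # Fold the list into a single dict of per-key sums.
    # Each step builds a new dict, so the input dicts are never mutated.
    totals = reduce(
        lambda acc, rec: {k: acc[k] + rec[k] for k in acc},
        records,
    )
    # Divide each sum by the number of records to get the mean.
    return {k: total / len(records) for k, total in totals.items()}


# Hypothetical sample data, chosen so the means are easy to check by hand.
sample = [
    {"a": 1.0, "b": 2.0},
    {"a": 3.0, "b": 6.0},
]
averages = mean_by_key(sample)
print(averages)
```

Whether this is more elegant than the nested loops is a matter of taste: it does a single pass of pairwise sums and then one division per key, rather than dividing inside the loop.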