python - Aggregating values from one dictionary to populate another in Python

I have a list of about 12K dictionaries. Every dictionary has the same keys: year, code, and category.

L = [{"year": "2015", "code": "VU", "category": "Vulnerable"}, {"year": "2008", "code": "VU", "category": "Vulnerable"}, {"year": "2004", "code": "LC", "category": "Least Concern"}]

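I am trying to create a new dictionary that uses each value of code as a key, with the set of unique years for that code as the key's value (I don't necessarily need the category key-value pair):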

{"VU": {2008, 2015}, "LC": {2004}}

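I created a dictionary, codes_dict, with the correct codes as keys and empty sets as values (because I don't want duplicates, and I really only need the earliest and most recent years).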

codes = (e['code'] for e in L)
codes_dict = dict.fromkeys(codes, set())

for e in L:
    codes_dict[e['code']].add(e['year'])

However, when I try to populate the values, every year gets added to every code:

{'VU': {'2004', '2008', '2015'}, 'LC': {'2004', '2008', '2015'}}

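What am I missing? I tried using a list instead of a set and got the same result (with duplicates). Also, using = instead of add() means only the last value is kept, whereas I want the whole range.

Performance isn't really an issue, since this is just a quick diagnostic.

Bonus: if there is a better way to do this in pandas, I'd love to hear it.

Thanks!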

Best answer

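In your code, all the values point to the same set: dict.fromkeys(codes, set()) evaluates set() once and assigns that single object to every key. Try this instead (using defaultdict; alternatively, you could use get and set each element to a new set if it doesn't exist yet):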

from collections import defaultdict

L = [{"year": "2015", "code": "VU", "category": "Vulnerable"}, {"year": "2008", "code": "VU", "category": "Vulnerable"}, {"year": "2004", "code": "LC", "category": "Least Concern"}]


codes_dict = defaultdict(set)
for e in L:
    codes_dict[e['code']].add(e['year'])

print(dict(codes_dict))
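
To make the shared-set problem visible, and to illustrate the non-defaultdict alternative mentioned above, here is a minimal sketch. It uses dict.setdefault rather than get (a swapped-in but equivalent idea), converts the years to int so they match the desired output, and adds a hypothetical year_range dictionary for the earliest and latest year per code, which the question says is all that is really needed.

```python
L = [
    {"year": "2015", "code": "VU", "category": "Vulnerable"},
    {"year": "2008", "code": "VU", "category": "Vulnerable"},
    {"year": "2004", "code": "LC", "category": "Least Concern"},
]

# dict.fromkeys evaluates set() once, so every key shares that one object.
shared = dict.fromkeys((e["code"] for e in L), set())
print(shared["VU"] is shared["LC"])  # True

# Alternative to defaultdict: setdefault stores a fresh set for each missing key.
codes_dict = {}
for e in L:
    # int() is an assumption here: the source data stores years as strings.
    codes_dict.setdefault(e["code"], set()).add(int(e["year"]))

# Earliest and latest year per code (year_range is a hypothetical name).
year_range = {code: (min(years), max(years)) for code, years in codes_dict.items()}
print(year_range)  # {'VU': (2008, 2015), 'LC': (2004, 2004)}
```

The same reasoning explains why switching from a set to a list did not help: any mutable default passed to dict.fromkeys is shared by all keys.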

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/59505654/
