I have compiled some data into a dictionary, as follows:
gen_dict = {
    "item_C_v001": "jack",
    "item_C_v002": "kris",
    "item_A_v003": "john",
    "item_B_v006": "peter",
    "item_A_v005": "john",
    "item_A_v004": "dave"
}
I am trying to print the results in the following format:
Item Name | No. of Vers. | User
item_A | 3 | dave, john
item_B | 1 | peter
item_C | 2 | jack, kris
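It groups the versions of the same item into one row, counts how many versions there are, and lists the user names.
I am having trouble combining the user names. I used set(), but the same set of users ends up on all 3 rows of my output. Also, although my "Item Name" and "No. of Vers." columns look correct, is there a way to check that the number of versions found actually matches each name? With a small dataset I can count by hand, but what if I have a large one?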
strip_ver_list = []
user_list = []
for item_name, user in gen_dict.iteritems():
    # Strip out the version digits
    strip_ver = item_name[:-3]
    strip_ver_list.append(strip_ver)
    user_list.append(user)
# This will count and remove the duplicates
versions_num = dict((duplicate, strip_ver_list.count(duplicate)) for duplicate in strip_ver_list)
for name, num in sorted(versions_num.iteritems()):
    print "Version Name : {0}\nNo. of Versions : {1}\nUsers : {2}".format(name, num, set(user_list))
This is the output I get:
Item Name | No. of Vers. | User
item_A | 3 | set(['dave', 'john', 'jack', 'kris', 'peter'])
item_B | 1 | set(['dave', 'john', 'jack', 'kris', 'peter'])
item_C | 2 | set(['dave', 'john', 'jack', 'kris', 'peter'])
This is the only approach I could come up with. If there is another workable way to solve this, please share it.
Best answer
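I would use a defaultdict to aggregate the data. The code in the question passes the whole user_list to set() on every row, so each row prints every user; keeping a separate set of users per item avoids that. Roughly: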
>>> from collections import defaultdict
>>> gen_dict = {
...     "item_C_v001": "jack",
...     "item_C_v002": "kris",
...     "item_A_v003": "john",
...     "item_B_v006": "peter",
...     "item_A_v005": "john",
...     "item_A_v004": "dave"
... }
Now...
>>> versions_num = defaultdict(lambda: dict(versions=set(), users=set()))
>>> for item_name, user in gen_dict.items():
...     strip_ver = item_name[:-5]
...     version_num = item_name[-3:]
...     versions_num[strip_ver]['versions'].add(version_num)
...     versions_num[strip_ver]['users'].add(user)
...
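The question also asks how to check the version counts when the data is too large to tally by hand. One possible sanity check, sketched below, is to count the stripped names a second way with collections.Counter and compare the two results. This is only an illustration, not part of the original answer; the names base and expected are introduced here, and the slicing assumes every key ends in a five-character "_vNNN" suffix.

```python
from collections import Counter, defaultdict

gen_dict = {
    "item_C_v001": "jack",
    "item_C_v002": "kris",
    "item_A_v003": "john",
    "item_B_v006": "peter",
    "item_A_v005": "john",
    "item_A_v004": "dave",
}

# Aggregate as in the answer: one set of versions and one set of users per item.
# Assumption: every key ends in a 5-character "_vNNN" suffix.
versions_num = defaultdict(lambda: {"versions": set(), "users": set()})
for item_name, user in gen_dict.items():
    base = item_name[:-5]
    versions_num[base]["versions"].add(item_name[-3:])
    versions_num[base]["users"].add(user)

# Independent tally: count the stripped names directly.
expected = Counter(item_name[:-5] for item_name in gen_dict)

# Each item must report the same number of versions both ways.
for base, data in versions_num.items():
    assert len(data["versions"]) == expected[base], base

# The per-item counts must add up to the number of input keys.
assert sum(expected.values()) == len(gen_dict)
```

If either assertion fails, some key was grouped under the wrong name or does not follow the assumed suffix pattern.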
Finally,
>>> for item, data in versions_num.items():
...     print("Item {} \tno. of Versions: {}\tUsers:{}".format(item, len(data['versions']), ",".join(data['users'])))
...
Item item_B no. of Versions: 1 Users:peter
Item item_A no. of Versions: 3 Users:john,dave
Item item_C no. of Versions: 2 Users:kris,jack
>>>
If you want it sorted:
>>> for item, data in sorted(versions_num.items()):
...     print("Item {} \tno. of Versions: {}\tUsers:{}".format(item, len(data['versions']), ",".join(data['users'])))
...
Item item_A no. of Versions: 3 Users:john,dave
Item item_B no. of Versions: 1 Users:peter
Item item_C no. of Versions: 2 Users:kris,jack
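The output above still differs from the table layout requested in the question, and the order of users within a row is not guaranteed because sets are unordered. A minimal sketch that sorts both the items and the users and uses the " | " separators from the question might look like the following. The helper name format_rows is hypothetical, and the slicing again assumes a fixed "_vNNN" suffix.

```python
from collections import defaultdict

gen_dict = {
    "item_C_v001": "jack",
    "item_C_v002": "kris",
    "item_A_v003": "john",
    "item_B_v006": "peter",
    "item_A_v005": "john",
    "item_A_v004": "dave",
}

# Group versions and users per item, as in the answer.
grouped = defaultdict(lambda: {"versions": set(), "users": set()})
for item_name, user in gen_dict.items():
    base = item_name[:-5]  # assumes a 5-character "_vNNN" suffix
    grouped[base]["versions"].add(item_name[-3:])
    grouped[base]["users"].add(user)


def format_rows(grouped):
    """Build the header plus one row per item (hypothetical helper)."""
    rows = ["Item Name | No. of Vers. | User"]
    for base, data in sorted(grouped.items()):
        # Sort the users so the row is the same on every run.
        users = ", ".join(sorted(data["users"]))
        rows.append("{} | {} | {}".format(base, len(data["versions"]), users))
    return rows


rows = format_rows(grouped)
print("\n".join(rows))
```

With the sample dictionary this prints the header followed by the three rows shown in the question, with users listed alphabetically.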
Regarding python - count and remove duplicates in keys while retaining the values, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42076644/