python - Remove key-value pairs from dictionaries grouped by brand in Python

Tags: python json dictionary grouping key-value

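I have a JSON file, A.json, that contains multiple dictionaries. Grouping the records by brand, I want to remove from each record's "model" dictionary the key-value pairs that are common to every record of the same brand.

For example, consider the brand "Ford":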

{"Number": '123', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}}

{"Number": '891', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00", "Mustang3":"1.00", "Mustang6":"1.64"}}

The keys in model that are common to both dictionaries are Mustang1 and Mustang3, so I remove those two key-value pairs from model. The final dictionaries would be:

{"Number": '123', "brand": "Ford", "model":{"Mustang2":"3.00", "Mustang4":"1.64"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang8":"3.00", "Mustang6":"1.64"}}

A.json

{"Number": '123', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}}
{"Number": '321', "brand": "Toyota", "model":{"Camry":"2.64", "Prius":"3.00", "Corolla":"1.00", "Tundra":"1.64"}}
{"Number": '111', "brand": "Honda", "model":{"Accord":"2.64", "Civic":"3.00", "Insight":"1.00", "Pilot":"1.64"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00", "Mustang3":"1.00", "Mustang6":"1.64"}}
{"Number": '745', "brand": "Toyota", "model":{"Camry":"2.64", "Sienna":"3.00", "4Runner":"1.00", "Prius":"1.64"}}
{"Number": '325', "brand": "Honda", "model":{"Accord":"2.64", "Passport":"3.00", "HR-V":"1.00", "Pilot":"1.64"}}
{"Number": '745', "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}}
{"Number": '325', "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}}

Expected result: result.json

{"Number": '123', "brand": "Ford", "model":{"Mustang2":"3.00", "Mustang4":"1.64"}}
{"Number": '321', "brand": "Toyota", "model":{"Corolla":"1.00", "Tundra":"1.64"}}
{"Number": '111', "brand": "Honda", "model":{"Civic":"3.00", "Insight":"1.00"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang8":"3.00", "Mustang6":"1.64"}}
{"Number": '745', "brand": "Toyota", "model":{"Sienna":"3.00", "4Runner":"1.00"}}
{"Number": '325', "brand": "Honda", "model":{"Passport":"3.00", "HR-V":"1.00"}}
{"Number": '745', "brand": "Accura", "model":{}}
{"Number": '325', "brand": "Accura", "model":{}}

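Best answer

First, your A.json is not a valid JSON file: JSON strings must use double quotes, and multiple objects have to be wrapped in an array. Here is a corrected version: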

[{"Number": "123", "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00", "Mustang3":"1.00", "Mustang4":"1.64"}},
{"Number": "321", "brand": "Toyota", "model":{"Camry":"2.64", "Prius":"3.00", "Corolla":"1.00", "Tundra":"1.64"}},
{"Number": "111", "brand": "Honda", "model":{"Accord":"2.64", "Civic":"3.00", "Insight":"1.00", "Pilot":"1.64"}},
{"Number": "891", "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00", "Mustang3":"1.00", "Mustang6":"1.64"}},
{"Number": "745", "brand": "Toyota", "model":{"Camry":"2.64", "Sienna":"3.00", "4Runner":"1.00", "Prius":"1.64"}},
{"Number": "325", "brand": "Honda", "model":{"Accord":"2.64", "Passport":"3.00", "HR-V":"1.00", "Pilot":"1.64"}},
{"Number": "745", "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}},
{"Number": "325", "brand": "Accura", "model":{"TLX":"2.64", "MDX":"3.00"}}]

The content of the file should be parsed with the json module:

import io    # lets us test without a real file
import json

# json_text is a string containing the corrected JSON above
f = io.StringIO(json_text)
ds = json.load(f)  # ds is a list of dicts
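If the original file cannot be corrected by hand, it may help that each of its lines happens to be a valid Python dict literal (single-quoted strings are legal in Python, just not in JSON). The following is a minimal sketch, assuming one record per line; `raw_text` and `records` are illustrative names, and the two sample lines are shortened from the question's A.json.

```python
import ast

# Two lines shortened from the question's A.json (note the single-quoted numbers).
raw_text = """{"Number": '123', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang2":"3.00"}}
{"Number": '891', "brand": "Ford", "model":{"Mustang1":"2.64", "Mustang8":"3.00"}}
"""

# Parse each non-empty line as a Python literal. literal_eval only
# evaluates literals, so it does not run arbitrary code from the file.
records = [ast.literal_eval(line) for line in raw_text.splitlines() if line.strip()]
print(records[0]["Number"], records[1]["model"]["Mustang8"])
```

The resulting list has the same shape as the one `json.load` returns for the corrected file, so the remaining steps apply unchanged.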

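Second, build the set of models common to each brand:

common_by_brand = {}
for d in ds:
    if d["brand"] in common_by_brand:
        # keep only the model keys that also appear in this record
        common_by_brand[d["brand"]] &= set(d["model"])
    else:
        # first record of this brand: start from all of its model keys
        common_by_brand[d["brand"]] = set(d["model"])
# {'Ford': {'Mustang1', 'Mustang3'}, 'Toyota': {'Camry', 'Prius'}, 'Honda': {'Accord', 'Pilot'}, 'Accura': {'TLX', 'MDX'}}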
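One edge case worth keeping in mind: with the intersection approach, a brand that appears in only one record ends up with all of its keys counted as "common". The question does not say what should happen then, so the sketch below assumes that a lone record should be left untouched. The function name `common_models_by_brand` and the sample data are illustrative only.

```python
from collections import defaultdict

def common_models_by_brand(records):
    """Return {brand: set of model keys present in every record of that brand}.

    Assumption for this sketch: a brand with a single record maps to an
    empty set, so nothing is removed from a lone record.
    """
    key_sets = defaultdict(list)
    for rec in records:
        key_sets[rec["brand"]].append(set(rec["model"]))
    return {
        brand: set.intersection(*sets) if len(sets) > 1 else set()
        for brand, sets in key_sets.items()
    }

sample = [
    {"Number": "123", "brand": "Ford", "model": {"Mustang1": "2.64", "Mustang2": "3.00"}},
    {"Number": "891", "brand": "Ford", "model": {"Mustang1": "2.64", "Mustang8": "3.00"}},
    {"Number": "321", "brand": "Toyota", "model": {"Camry": "2.64"}},
]
print(common_models_by_brand(sample))
```

For brands with two or more records this gives the same sets as the loop above.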

Third, simply iterate over the list and remove those common models:

for d in ds:
    common = common_by_brand[d["brand"]]
    d["model"] = {k: v for k, v in d["model"].items() if k not in common}
# [{'Number': '123', 'brand': 'Ford', 'model': {'Mustang2': '3.00', 'Mustang4': '1.64'}}, {'Number': '321', 'brand': 'Toyota', 'model': {'Corolla': '1.00', 'Tundra': '1.64'}}, {'Number': '111', 'brand': 'Honda', 'model': {'Civic': '3.00', 'Insight': '1.00'}}, {'Number': '891', 'brand': 'Ford', 'model': {'Mustang8': '3.00', 'Mustang6': '1.64'}}, {'Number': '745', 'brand': 'Toyota', 'model': {'Sienna': '3.00', '4Runner': '1.00'}}, {'Number': '325', 'brand': 'Honda', 'model': {'Passport': '3.00', 'HR-V': '1.00'}}, {'Number': '745', 'brand': 'Accura', 'model': {}}, {'Number': '325', 'brand': 'Accura', 'model': {}}]
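The code above compares keys only. The question speaks of common key-value pairs, so if a key should be removed only when its value also matches across the brand, one possible variant is to intersect `(key, value)` tuples instead. This is a sketch under that assumption; `remove_common_pairs` is a hypothetical helper, and the "1.50" value in the sample is made up to show the difference.

```python
def remove_common_pairs(records):
    """Drop (key, value) pairs shared by every record of the same brand.

    Returns new dicts and leaves the input records unmodified. Values must
    be hashable (they are strings in the question's data).
    """
    common = {}
    for rec in records:
        pairs = set(rec["model"].items())
        if rec["brand"] in common:
            common[rec["brand"]] &= pairs
        else:
            common[rec["brand"]] = pairs
    return [
        {**rec, "model": {k: v for k, v in rec["model"].items()
                          if (k, v) not in common[rec["brand"]]}}
        for rec in records
    ]

sample = [
    {"Number": "123", "brand": "Ford", "model": {"Mustang1": "2.64", "Mustang3": "1.00"}},
    {"Number": "891", "brand": "Ford", "model": {"Mustang1": "2.64", "Mustang3": "1.50"}},
]
cleaned = remove_common_pairs(sample)
print(cleaned)
```

Here Mustang1 is removed from both records, while Mustang3 is kept because its value differs between them.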

Fourth, write the result to a file in JSON format (an in-memory buffer stands in for the file here):

g = io.StringIO()
json.dump(ds, g)
print(g.getvalue())
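To write to a real file instead of an in-memory buffer, the same `json.dump` call can be given a file object opened for writing. The sketch below assumes the output name result.json from the question and uses a temporary directory only to stay self-contained; `results` is a stand-in for the processed list.

```python
import json
import os
import tempfile

# Stand-in for the processed list of records.
results = [
    {"Number": "745", "brand": "Accura", "model": {}},
    {"Number": "325", "brand": "Accura", "model": {}},
]

# A temporary directory keeps the sketch self-contained; in practice the
# path would simply be "result.json".
out_path = os.path.join(tempfile.mkdtemp(), "result.json")
with open(out_path, "w", encoding="utf-8") as fh:
    json.dump(results, fh, indent=2)  # indent is optional, for readability

# Read the file back to confirm it round-trips.
with open(out_path, encoding="utf-8") as fh:
    reloaded = json.load(fh)
print(reloaded == results)
```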

Formatted output:

[{"Number": "123", "brand": "Ford", "model": {"Mustang2": "3.00", "Mustang4": "1.64"}},
{"Number": "321", "brand": "Toyota", "model": {"Corolla": "1.00", "Tundra": "1.64"}},
{"Number": "111", "brand": "Honda", "model": {"Civic": "3.00", "Insight": "1.00"}},
{"Number": "891", "brand": "Ford", "model": {"Mustang8": "3.00", "Mustang6": "1.64"}},
{"Number": "745", "brand": "Toyota", "model": {"Sienna": "3.00", "4Runner": "1.00"}},
{"Number": "325", "brand": "Honda", "model": {"Passport": "3.00", "HR-V": "1.00"}},
{"Number": "745", "brand": "Accura", "model": {}},
{"Number": "325", "brand": "Accura", "model": {}}]

Regarding "python - Remove key-value pairs from dictionaries grouped by brand in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55895451/
