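python - How can I keep all key-value pairs of a JSON document that contains an object with duplicate keys?

Tags: python json python-2.7 duplicates deserialization

I'm having some trouble getting this right. My data looks like this: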

{  
      "completedProtocol": "Extract",
      "map": [
        {
          "sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024254" }, { "clarityId": "claritySample1", "espId": "ESP024255" }, { "clarityId": "claritySample1", "espId": "ESP024256"}],
          "sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
          "files":["http://fileserver.net/path/to/datafile3"]
        }
      ],
      "map": [
        {
          "sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024258" }, { "clarityId": "claritySample1", "espId": "ESP024259" }, { "clarityId": "claritySample1", "espId": "ESP024260"}],
          "sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
          "files":["http://fileserver.net/path/to/datafile3"]
        }
      ]
    }

I want to convert it into this:

[{"map": [
        {
          "sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024254" }, { "clarityId": "claritySample1", "espId": "ESP024255" }, { "clarityId": "claritySample1", "espId": "ESP024256"}],
          "sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
          "files":["http://fileserver.net/path/to/datafile3"]
        }
      ]},
{"map":[
        {
          "sampleIDsIn": [{ "clarityId": "claritySample1", "espId": "ESP024258" }, { "clarityId": "claritySample1", "espId": "ESP024259" }, { "clarityId": "claritySample1", "espId": "ESP024260"}],
          "sampleIDsOut": ["claritySample3", "claritySample4", "claritySample5"],
          "files":["http://fileserver.net/path/to/datafile3"]
        }
      ]}]

My code so far is:

import json

obj = json.loads(body)
newData = [dct for dct in obj if 'map' in dct]

But this only returns:

[u'map']

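If I just call json.loads on the body, it returns only the second value of map, which overwrites the first.

Note: I want a list of single-item dictionaries; I don't want to collect the values under one key.

Any ideas?

Best answer

You can use a custom object_pairs_hook function to make json.loads() return a list of single-item dictionaries instead of a single dictionary in which duplicate keys overwrite one another: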

import json

def keep_duplicates(ordered_pairs):
    """Return one single-item dict per (key, value) pair so duplicate keys survive."""
    result = []
    for key, value in ordered_pairs:
        result.append({key: value})
    return result

From the docs:

object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict() will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.
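
To make the quoted behaviour concrete, here is a small sketch. The names doc, default_result, seen_pairs and the identity helper are illustrative and not part of the original answer. It compares a plain json.loads() call, which keeps only the last value of a repeated key, with a hook that simply hands back the ordered pairs it receives:

```python
import json

doc = '{"a": 1, "a": 2, "a": 3}'

# Without a hook, each later duplicate overwrites the earlier value.
default_result = json.loads(doc)

def identity(ordered_pairs):
    # Hypothetical helper: return the pairs untouched so they can be inspected.
    return ordered_pairs

# With the hook, every (key, value) pair is passed in, in document order.
seen_pairs = json.loads(doc, object_pairs_hook=identity)

print(default_result)
print(seen_pairs)
```

Whatever the hook returns replaces the dict for that object, which is why keep_duplicates can return a list.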

Usage:

>>> json.loads('{"a": 1, "a": 2, "a": 3}', object_pairs_hook=keep_duplicates)
[{u'a': 1}, {u'a': 2}, {u'a': 3}]

In your case, since you apparently aren't interested in anything other than the "map" keys, you can filter the result afterwards:

all_data = json.loads(body, object_pairs_hook=keep_duplicates)
map_data = [x for x in all_data if 'map' in x]

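...which gives you one single-item dictionary per "map" key, in document order. Note that the hook is called for every JSON object in the document, so nested objects (such as the entries of sampleIDsIn) also come back as lists of single-item dictionaries rather than as plain dicts.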
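
Because object_pairs_hook runs on nested objects as well, one possible refinement is to split an object only when it really contains a repeated key and to return an ordinary dict otherwise. The following is a minimal sketch under that assumption; keep_duplicates_only and the shortened body string are made up for illustration and are not from the original answer:

```python
import json

def keep_duplicates_only(ordered_pairs):
    # Hypothetical variant of keep_duplicates: split an object into
    # single-item dicts only if one of its keys occurs more than once.
    keys = [key for key, _ in ordered_pairs]
    if len(keys) == len(set(keys)):
        return dict(ordered_pairs)
    return [{key: value} for key, value in ordered_pairs]

# Shortened stand-in for the document in the question.
body = '''{
  "completedProtocol": "Extract",
  "map": [{"sampleIDsIn": [{"clarityId": "claritySample1", "espId": "ESP024254"}]}],
  "map": [{"sampleIDsIn": [{"clarityId": "claritySample1", "espId": "ESP024258"}]}]
}'''

all_data = json.loads(body, object_pairs_hook=keep_duplicates_only)
map_data = [x for x in all_data if 'map' in x]
print(map_data)
```

With this variant the nested sampleIDsIn entries stay plain dicts. The trade-off is that the top-level result is a list only when duplicates are present, so code that consumes it has to handle both a dict and a list.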

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48327661/
