json - Reading JSON with multiple levels into a DataFrame [python]

I have JSON files in this general format:

{"attribute1": "test1",
 "attribute2": "test2",
 "data": {
      "0": 
         {"metadata": {
             "timestamp": "2022-08-14"},
         "detections": {
             "0": {"dim1": 40, "dim2": 30},
             "1": {"dim1": 50, "dim2": 20}}},
      "1": 
         {"metadata": {
             "timestamp": "2022-08-15"},
         "detections": {
             "0": {"dim1": 30, "dim2": 10},
             "1": {"dim1": 100, "dim2": 80}}}}}

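These JSON files hold measurements collected with a 3D camera. Each entry under the top-level key data corresponds to a frame; every frame has its own metadata and can contain multiple detections objects, each with its own dimensions (represented here by dim1 and dim2). I want to convert this type of JSON file into a pandas DataFrame in the following format: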

timestamp   dim1  dim2
2022-08-14    40    30
2022-08-14    50    20
2022-08-15    30    10
2022-08-15   100    80

So any field in metadata (here I have only included timestamp, but there may be several) must be repeated for every entry under the detections key.

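I can convert this type of JSON into a pandas DataFrame, but it takes multiple steps and a for loop within each file, followed by concatenating everything at the end. I have also tried pd.json_normalize and experimented with the parameters record_path, meta and max_level, but so far I have not managed to convert this type of JSON into a DataFrame in just a few steps. Is there a clean way to do this?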

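Best answer

Use a nested list comprehension to flatten the values and merge the sub-dictionaries, then pass the result to the DataFrame constructor: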

import pandas as pd

json = {"attribute1": "test1",
 "attribute2": "test2",
 "data": {
      "0": 
         {"metadata": {
             "timestamp": "2022-08-14"},
         "detections": {
             "0": {"dim1": 40, "dim2": 30},
             "1": {"dim1": 50, "dim2": 20}}},
      "1": 
         {"metadata": {
             "timestamp": "2022-08-15"},
         "detections": {
             "0": {"dim1": 30, "dim2": 10},
             "1": {"dim1": 100, "dim2": 80}}}}}

# One row per detection: merge each frame's metadata with that detection's fields
L = [{**x['metadata'], **y} for x in json['data'].values()
                            for y in x['detections'].values()]

df = pd.DataFrame(L)
print(df)
    timestamp  dim1  dim2
0  2022-08-14    40    30
1  2022-08-14    50    20
2  2022-08-15    30    10
3  2022-08-15   100    80
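
The same idea can be wrapped in a small helper for files that carry several metadata fields. The following is only a sketch under stated assumptions: the function name detections_to_frame, the keep_ids flag, and the camera metadata field are hypothetical and not part of the original question or answer. It assumes every frame follows the layout shown above, and it treats a frame with no detections as contributing no rows.

```python
import json

import pandas as pd


def detections_to_frame(doc, keep_ids=False):
    """Return one row per detection, with the frame's metadata repeated.

    doc is the parsed JSON document; keep_ids (hypothetical option) also
    stores the frame and detection keys as columns.
    """
    rows = []
    for frame_id, frame in doc["data"].items():
        meta = frame.get("metadata", {})
        for det_id, det in frame.get("detections", {}).items():
            row = {**meta, **det}  # metadata first, then detection fields
            if keep_ids:
                row = {"frame": frame_id, "detection": det_id, **row}
            rows.append(row)
    return pd.DataFrame(rows)


# Sample input; "camera" is an invented second metadata field.
raw = """
{"data": {
    "0": {"metadata": {"timestamp": "2022-08-14", "camera": "A"},
          "detections": {"0": {"dim1": 40, "dim2": 30},
                         "1": {"dim1": 50, "dim2": 20}}},
    "1": {"metadata": {"timestamp": "2022-08-15", "camera": "A"},
          "detections": {}}}}
"""
doc = json.loads(raw)
df = detections_to_frame(doc, keep_ids=True)
print(df)
```

For a file on disk, doc could instead come from json.load on an open file handle. Note that if a metadata field and a detection field share a name, the detection value overwrites the metadata value in the merged row.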

For more on json - Reading JSON with multiple levels into a DataFrame [python], see the similar question on Stack Overflow: https://stackoverflow.com/questions/74715965/
