python - JSON to a Python pandas DataFrame

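I am new to Python and have been focusing on learning pandas and XlsxWriter to help automate some workflows. I have attached a snippet of a JSON file that I have access to but have not been able to convert into a pandas DataFrame.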

If I use pd.read_json(filename), it lumps the contents of variationProducts and productAttributes into a single cell each, which makes them unusable.

Question: how can I take this JSON file and make it look like the pandas DataFrame output at the bottom?

[
  {
    "ID": "12345",
    "productName": "Product A ",
    "minPrice": "$89.00",
    "maxPrice": "$89.00",
    "variationProducts": [
      {
        "variantColor": "JJ0BVE7",
        "variantSize": "080",
        "sellingPrice": "$89.00",
        "inventory": 3
      },
      {
        "variantColor": "JJ0BVE7",
        "variantSize": "085",
        "sellingPrice": "$89.00",
        "inventory": 6
      }
    ],
    "productAttributes": [
      {
        "ID": "countryOfOrigin",
        "value": "Imported"
      },
      {
        "ID": "csProductCode",
        "value": "1100"
      }
    ]
  },
  {
    "ID": "23456",
    "productName": "Product B",
    "minPrice": "$29.99",
    "maxPrice": "$69.00",
    "variationProducts": [
      {
        "variantColor": "JJ169Q0",
        "variantSize": "050",
        "sellingPrice": "$69.00",
        "inventory": 55
      },
      {
        "variantColor": "JJ123Q0",
        "variantSize": "055",
        "sellingPrice": "$69.00",
        "inventory": 5
      }
    ],
    "productAttributes": [
      {
        "ID": "countryOfOrigin",
        "value": "Imported"
      },
      {
        "ID": "csProductCode",
        "value": "1101"
      }
    ]
  }
]

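I mocked up this sample output in Excel. The variationProducts are aggregated at the variantColor level, so for Product A the inventory is the sum of the two variants, even though they have different variantSize values: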

    ID     productName  maxPrice  minPrice  countryOfOrigin  csProductCode  variantColor  inventory
    12345  Product A    $89       $89       Imported         1100           JJ0BVE7       9
    23456  Product B    $69       $30       Imported         1101           JJ169Q0       55
    23456  Product B    $69       $30       Imported         1101           JJ123Q0       5

Best answer

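You can use json_normalize. Here d is assumed to hold the already-parsed JSON (a list of product dicts); in pandas 1.0 and later the same function is also exposed at the top level as pd.json_normalize: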

In [11]: pd.io.json.json_normalize(d, "variationProducts", ["ID", "maxPrice", "minPrice", "productName"], record_prefix=".")
Out[11]:
   .inventory .sellingPrice .variantColor .variantSize     ID maxPrice minPrice productName
0           3        $89.00       JJ0BVE7          080  12345   $89.00   $89.00  Product A
1           6        $89.00       JJ0BVE7          085  12345   $89.00   $89.00  Product A
2          55        $69.00       JJ169Q0          050  23456   $69.00   $29.99   Product B
3           5        $69.00       JJ123Q0          055  23456   $69.00   $29.99   Product B

In [12]: pd.io.json.json_normalize(d, "productAttributes", ["ID", "maxPrice", "minPrice", "productName"], record_prefix=".")
Out[12]:
               .ID    .value     ID maxPrice minPrice productName
0  countryOfOrigin  Imported  12345   $89.00   $89.00  Product A
1    csProductCode      1100  12345   $89.00   $89.00  Product A
2  countryOfOrigin  Imported  23456   $69.00   $29.99   Product B
3    csProductCode      1101  23456   $69.00   $29.99   Product B

You can then join/merge these two frames together on the product ID...
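
The answer leaves the join step open, so here is one possible way to finish it, offered as a sketch rather than the answerer's own method. It assumes pandas 1.0 or later (for the top-level pd.json_normalize), rebuilds the question's sample as an in-memory list d, and uses the hypothetical prefix "attr_" instead of "." so the attribute columns do not clash with the product-level ID. The attributes are pivoted to one column per attribute, the variants are summed per variantColor, and the two frames are merged on ID.

```python
import pandas as pd

# Sample data mirroring the question's JSON, already parsed into Python objects.
d = [
    {
        "ID": "12345", "productName": "Product A ",
        "minPrice": "$89.00", "maxPrice": "$89.00",
        "variationProducts": [
            {"variantColor": "JJ0BVE7", "variantSize": "080",
             "sellingPrice": "$89.00", "inventory": 3},
            {"variantColor": "JJ0BVE7", "variantSize": "085",
             "sellingPrice": "$89.00", "inventory": 6},
        ],
        "productAttributes": [
            {"ID": "countryOfOrigin", "value": "Imported"},
            {"ID": "csProductCode", "value": "1100"},
        ],
    },
    {
        "ID": "23456", "productName": "Product B",
        "minPrice": "$29.99", "maxPrice": "$69.00",
        "variationProducts": [
            {"variantColor": "JJ169Q0", "variantSize": "050",
             "sellingPrice": "$69.00", "inventory": 55},
            {"variantColor": "JJ123Q0", "variantSize": "055",
             "sellingPrice": "$69.00", "inventory": 5},
        ],
        "productAttributes": [
            {"ID": "countryOfOrigin", "value": "Imported"},
            {"ID": "csProductCode", "value": "1101"},
        ],
    },
]

meta = ["ID", "productName", "maxPrice", "minPrice"]

# One row per variant, with the product-level fields repeated on each row.
variants = pd.json_normalize(d, "variationProducts", meta)

# Sum inventory per product and variantColor; sort=False keeps input order.
summary = variants.groupby(
    meta + ["variantColor"], sort=False, as_index=False
)["inventory"].sum()

# One row per attribute; the "attr_" prefix (an illustrative choice) avoids
# a name clash between the attribute's own "ID" and the product "ID".
attrs = pd.json_normalize(d, "productAttributes", ["ID"], record_prefix="attr_")

# Reshape to one row per product and one column per attribute name.
attrs_wide = attrs.pivot(
    index="ID", columns="attr_ID", values="attr_value"
).reset_index()
attrs_wide.columns.name = None

# Attach the attribute columns to the aggregated variants.
result = summary.merge(attrs_wide, on="ID", how="left")
result = result[["ID", "productName", "maxPrice", "minPrice",
                 "countryOfOrigin", "csProductCode",
                 "variantColor", "inventory"]]
print(result)
```

The prices stay as the original strings (for example "$29.99"), so matching the rounded figures in the question's Excel mock-up would need a separate conversion step. Reading the data from disk would typically be done with json.load, provided the file is valid JSON.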

Source: this question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/46979118/
