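I'm new to Python and have been focusing on learning pandas and XlsxWriter to help automate some workflows. Below is a snippet of a JSON file I have access to but have been unable to convert into a pandas DataFrame.
If I use pd.read_json(filename), it lumps the contents of variationProducts and productAttributes into a single cell each, which garbles them.
Question: how do I take this JSON file and make it look like the pandas DataFrame output at the bottom: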
[
    {
        "ID": "12345",
        "productName": "Product A ",
        "minPrice": "$89.00",
        "maxPrice": "$89.00",
        "variationProducts": [
            {
                "variantColor": "JJ0BVE7",
                "variantSize": "080",
                "sellingPrice": "$89.00",
                "inventory": 3
            },
            {
                "variantColor": "JJ0BVE7",
                "variantSize": "085",
                "sellingPrice": "$89.00",
                "inventory": 6
            }
        ],
        "productAttributes": [
            {
                "ID": "countryOfOrigin",
                "value": "Imported"
            },
            {
                "ID": "csProductCode",
                "value": "1100"
            }
        ]
    },
    {
        "ID": "23456",
        "productName": "Product B",
        "minPrice": "$29.99",
        "maxPrice": "$69.00",
        "variationProducts": [
            {
                "variantColor": "JJ169Q0",
                "variantSize": "050",
                "sellingPrice": "$69.00",
                "inventory": 55
            },
            {
                "variantColor": "JJ123Q0",
                "variantSize": "055",
                "sellingPrice": "$69.00",
                "inventory": 5
            }
        ],
        "productAttributes": [
            {
                "ID": "countryOfOrigin",
                "value": "Imported"
            },
            {
                "ID": "csProductCode",
                "value": "1101"
            }
        ]
    }
]
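I made this sample output in Excel. variationProducts is rolled up at the variantColor level, so for Product A the inventory is the sum of the two variants even though they have different variantSize values: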
ID     productName  maxPrice  minPrice  countryOfOrigin  csProductCode  variantColor  inventory
12345  Product A    $89       $89       Imported         1100           JJ0BVE7       9
23456  Product B    $69       $30       Imported         1101           JJ169Q0       55
23456  Product B    $69       $30       Imported         1101           JJ123Q0       5
Best answer
You can use json_normalize (d below is the parsed JSON, i.e. a list of dicts):
In [11]: pd.io.json.json_normalize(d, "variationProducts", ["ID", "maxPrice", "minPrice", "productName"], record_prefix=".")
Out[11]:
   .inventory  .sellingPrice  .variantColor  .variantSize     ID  maxPrice  minPrice  productName
0           3         $89.00        JJ0BVE7           080  12345    $89.00    $89.00    Product A
1           6         $89.00        JJ0BVE7           085  12345    $89.00    $89.00    Product A
2          55         $69.00        JJ169Q0           050  23456    $69.00    $29.99    Product B
3           5         $69.00        JJ123Q0           055  23456    $69.00    $29.99    Product B
In [12]: pd.io.json.json_normalize(d, "productAttributes", ["ID", "maxPrice", "minPrice", "productName"], record_prefix=".")
Out[12]:
               .ID    .value     ID  maxPrice  minPrice  productName
0  countryOfOrigin  Imported  12345    $89.00    $89.00    Product A
1    csProductCode      1100  12345    $89.00    $89.00    Product A
2  countryOfOrigin  Imported  23456    $69.00    $29.99    Product B
3    csProductCode      1101  23456    $69.00    $29.99    Product B
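For readers on a recent pandas, here is a minimal sketch of how d might be obtained and normalized. It assumes pandas 1.0 or later, where the function is exposed as pd.json_normalize and the pd.io.json path used above is deprecated. The inline raw string (one product only), the "attr." prefix, and the variable names are illustrative choices, not part of the original answer.

```python
import json

import pandas as pd

# Hypothetical inline sample standing in for the JSON file (one product only).
raw = """
[{"ID": "12345", "productName": "Product A",
  "minPrice": "$89.00", "maxPrice": "$89.00",
  "variationProducts": [
    {"variantColor": "JJ0BVE7", "variantSize": "080",
     "sellingPrice": "$89.00", "inventory": 3},
    {"variantColor": "JJ0BVE7", "variantSize": "085",
     "sellingPrice": "$89.00", "inventory": 6}],
  "productAttributes": [
    {"ID": "countryOfOrigin", "value": "Imported"},
    {"ID": "csProductCode", "value": "1100"}]}]
"""

# Parse the text into a list of dicts. For a file on disk, the usual
# equivalent is: with open(path) as f: d = json.load(f)
d = json.loads(raw)

meta = ["ID", "productName", "minPrice", "maxPrice"]

# One row per entry of the nested "variationProducts" list.
variants = pd.json_normalize(d, record_path="variationProducts", meta=meta)

# The attribute records have their own "ID" key, so a record prefix is
# needed to keep it apart from the product-level "ID" in meta.
attrs = pd.json_normalize(
    d, record_path="productAttributes", meta=meta, record_prefix="attr."
)

print(variants)
print(attrs)
```

Note that the parser rejects trailing commas inside objects, so the file has to be strictly valid JSON before json.load will accept it.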
You can then join/merge these two together...
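One possible way to carry out that merge and reach the layout requested in the question is sketched below. It is not the answerer's code: it assumes pandas 1.0 or later (pd.json_normalize), uses a hand-typed copy of the question's data with whitespace trimmed, and the names meta, inventory, attrs_wide, and result are made up for illustration. The idea is to sum inventory per variantColor, pivot the attribute rows into columns, and merge on the product ID.

```python
import pandas as pd

# Sample data mirroring the question's JSON, already parsed into Python objects.
d = [
    {
        "ID": "12345", "productName": "Product A",
        "minPrice": "$89.00", "maxPrice": "$89.00",
        "variationProducts": [
            {"variantColor": "JJ0BVE7", "variantSize": "080",
             "sellingPrice": "$89.00", "inventory": 3},
            {"variantColor": "JJ0BVE7", "variantSize": "085",
             "sellingPrice": "$89.00", "inventory": 6},
        ],
        "productAttributes": [
            {"ID": "countryOfOrigin", "value": "Imported"},
            {"ID": "csProductCode", "value": "1100"},
        ],
    },
    {
        "ID": "23456", "productName": "Product B",
        "minPrice": "$29.99", "maxPrice": "$69.00",
        "variationProducts": [
            {"variantColor": "JJ169Q0", "variantSize": "050",
             "sellingPrice": "$69.00", "inventory": 55},
            {"variantColor": "JJ123Q0", "variantSize": "055",
             "sellingPrice": "$69.00", "inventory": 5},
        ],
        "productAttributes": [
            {"ID": "countryOfOrigin", "value": "Imported"},
            {"ID": "csProductCode", "value": "1101"},
        ],
    },
]

meta = ["ID", "productName", "maxPrice", "minPrice"]

# Step 1: one row per variant, then sum inventory per product and color.
variants = pd.json_normalize(d, record_path="variationProducts", meta=meta)
inventory = variants.groupby(
    meta + ["variantColor"], sort=False, as_index=False
)["inventory"].sum()

# Step 2: one row per attribute, pivoted so each attribute becomes a column.
attrs = pd.json_normalize(
    d, record_path="productAttributes", meta=["ID"], record_prefix="attr."
)
attrs_wide = attrs.pivot(
    index="ID", columns="attr.ID", values="attr.value"
).reset_index()
attrs_wide.columns.name = None  # drop the leftover "attr.ID" axis label

# Step 3: attach the attribute columns to every variant row of the product.
result = inventory.merge(attrs_wide, on="ID", how="left")
result = result[meta + ["countryOfOrigin", "csProductCode",
                        "variantColor", "inventory"]]
print(result)
```

With this sample, the inventory column comes out as 9, 55, and 5, matching the Excel mock-up. The prices remain strings such as "$29.99"; rounding them to the "$30" style shown in the mock-up would be a separate formatting step.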
This Q&A on converting JSON to a pandas DataFrame comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/46979118/