I used BeautifulSoup to fetch this data. It looks like a nested dictionary, but I have not been able to convert it into a DataFrame. The type is.
{"page":1,"rows":[{"id":"160128","cell":{"fund_id":"160128","bond_ratio":"132.04","report_dt":"2019-12-31","is_outdate":false,"maturity_dt_tips":""}},{"id":"160130","cell":{"fund_id":"160130","bond_ratio":"165.29","report_dt":"2019-12-31","is_outdate":false,"maturity_dt_tips":""}},{"id":"160131","cell":{"fund_id":"160131","bond_ratio":"94.93","report_dt":"2019-12-31","is_outdate":false,"maturity_dt_tips":""}}],"total":3}
How do I get the values under each key of "cell"? Thanks.
   fund_id  bond_ratio   report_dt  is_outdate  maturity_dt_tips
0   160128      132.04  2019-12-31       false
1   160130      165.29  2019-12-31       false
2   160131       94.93  2019-12-31       false
Best answer
d = {"page":1,"rows":[{"id":"160128","cell":{"fund_id":"160128","bond_ratio":"132.04","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}},{"id":"160130","cell":{"fund_id":"160130","bond_ratio":"165.29","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}},{"id":"160131","cell":{"fund_id":"160131","bond_ratio":"94.93","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}}],"total":3}
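import pandas as pd

# In pandas 1.0 and later, json_normalize is available at the top level
# (the old pandas.io.json import path is deprecated).
df = pd.json_normalize(d['rows'])
print(df)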
       id  cell.fund_id  cell.bond_ratio  cell.report_dt  cell.is_outdate  \
0  160128        160128           132.04      2019-12-31            False
1  160130        160130           165.29      2019-12-31            False
2  160131        160131            94.93      2019-12-31            False

  cell.maturity_dt_tips
0
1
2
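The steps above start from a dict d that was typed in by hand, with Python's False in place of JSON's false. If the result from BeautifulSoup is still JSON text, it would need to be decoded first. A minimal sketch, assuming the text has already been pulled out of the soup object (for example with soup.get_text()); the variable raw is a hypothetical stand-in for that text, shortened here to one row:

```python
import json

import pandas as pd

# Assumption: `raw` holds the JSON text extracted from the page,
# e.g. raw = soup.get_text(). Shortened to a single row for illustration.
raw = (
    '{"page":1,"rows":[{"id":"160128","cell":{"fund_id":"160128",'
    '"bond_ratio":"132.04","report_dt":"2019-12-31","is_outdate":false,'
    '"maturity_dt_tips":""}}],"total":1}'
)

d = json.loads(raw)                # JSON false is decoded to Python False
df = pd.json_normalize(d["rows"])  # one row per entry, nested keys flattened
print(df.columns.tolist())
```

After json.loads, d is an ordinary dict and the rest of the answer applies unchanged.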
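Then, if necessary, remove everything before the "." in the column names: split each name with str.split and select the last value of each resulting list by indexing with [-1]: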
df.columns = df.columns.str.split('.').str[-1]
print(df)
       id  fund_id  bond_ratio   report_dt  is_outdate  maturity_dt_tips
0  160128   160128      132.04  2019-12-31       False
1  160130   160130      165.29  2019-12-31       False
2  160131   160131       94.93  2019-12-31       False
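The result above still carries the outer id column, which the question's desired output does not include. One possible alternative, not part of the original answer, is to build the frame from the inner "cell" dicts only. The sketch below also converts the string values to numeric and datetime types; that step is an optional assumption about how the data will be used:

```python
import pandas as pd

d = {"page":1,"rows":[{"id":"160128","cell":{"fund_id":"160128","bond_ratio":"132.04","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}},{"id":"160130","cell":{"fund_id":"160130","bond_ratio":"165.29","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}},{"id":"160131","cell":{"fund_id":"160131","bond_ratio":"94.93","report_dt":"2019-12-31","is_outdate":False,"maturity_dt_tips":""}}],"total":3}

# Keep only the inner "cell" dict of each row; the outer "id" is dropped
# and no "cell." prefix appears in the column names.
cells = pd.DataFrame([row["cell"] for row in d["rows"]])

# Optional: the values arrive as strings, so convert them when
# arithmetic or date operations are needed.
cells["bond_ratio"] = pd.to_numeric(cells["bond_ratio"])
cells["report_dt"] = pd.to_datetime(cells["report_dt"])

print(cells)
print(cells.dtypes)
```

The list comprehension is enough here because every "cell" dict is flat; json_normalize is the more general choice when the nesting goes deeper.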
Regarding "python - How to convert a nested dictionary from BeautifulSoup to a pandas DataFrame", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/60275631/