所以我有一些代码可以从 api 生成 json 响应:
r4 = requests.get(url, params=mlp)
mlpr = r4.json()
响应的 1 行如下所示。
'command': 'SELECT', 'rowCount': 134, 'oid': None, 'rows': [{'match_id': 5334428840, 'start_time': 1586029157, 'leagueid': 11823, 'patch': '7.25', 'name': 'ESL One Los Angeles 2020 Online powered by Intel', 'radiant_team': 'Cyber Legacy', 'dire_team': 'B8', 'picks_bans': [{'is_pick': False, 'hero_id': 98, 'team': 0, 'order': 0}, {'is_pick': False, 'hero_id': 95, 'team': 1, 'order': 1}, {'is_pick': False, 'hero_id': 66, 'team': 0, 'order': 2}, {'is_pick': False, 'hero_id': 43, 'team': 1, 'order': 3}, {'is_pick': False, 'hero_id': 49, 'team': 0, 'order': 4}, {'is_pick': False, 'hero_id': 110, 'team': 1, 'order': 5}, {'is_pick': False, 'hero_id': 79, 'team': 0, 'order': 6}, {'is_pick': False, 'hero_id': 106, 'team': 1, 'order': 7}, {'is_pick': True, 'hero_id': 96, 'team': 0, 'order': 8}, {'is_pick': True, 'hero_id': 86, 'team': 1, 'order': 9}, {'is_pick': True, 'hero_id': 129, 'team': 1, 'order': 10}, {'is_pick': True, 'hero_id': 50, 'team': 0, 'order': 11}, {'is_pick': False, 'hero_id': 12, 'team': 0, 'order': 12}, {'is_pick': False, 'hero_id': 77, 'team': 1, 'order': 13}, {'is_pick': True, 'hero_id': 128, 'team': 1, 'order': 14}, {'is_pick': True, 'hero_id': 121, 'team': 0, 'order': 15}, {'is_pick': True, 'hero_id': 41, 'team': 1, 'order': 16}, {'is_pick': True, 'hero_id': 42, 'team': 0, 'order': 17}, {'is_pick': False, 'hero_id': 126, 'team': 1, 'order': 18}, {'is_pick': False, 'hero_id': 65, 'team': 0, 'order': 19}, {'is_pick': True, 'hero_id': 31, 'team': 0, 'order': 20}, {'is_pick': True, 'hero_id': 45, 'team': 1, 'order': 21}]}
正如您所见,picks_bans“列有一个包含 3 个附加列的嵌套字典,我需要将其拉出并分布在每个匹配 ID 上。
这是我用来将响应放入初始Dataframe
的代码,但它只能让我达到初始级别。
mlpr_df = pd.DataFrame(mlpr.get('rows'))
mlpr_df
[示例数据框][1]
如何适当取消 picks_bans
列的嵌套?
编辑:我尝试将代码更改为:
r4 = requests.get(url, params=mlp)
mlpr = r4.json()
data = mlpr.get('rows')
df = pd.concat([pd.DataFrame(data),
json_normalize(data['picks_bans'])],
axis=1).drop('picks_bans', 1)
我收到错误消息“列表索引必须是整数或切片,而不是 str”
最佳答案
json_normalize
就是您正在寻找的。
作为一个技巧,我使用第一行数据的键列表减去要扩展的字段来获取用作元数据的字段列表 - 它更容易编写且更具弹性。我已将调用中的参数名称更加明确。
import pandas as pd
from pandas import json_normalize
df = json_normalize(data, record_path="picks_bans",
meta=[col for col in data[0].keys() if col != "picks_bans"])
df.head()
# is_pick hero_id team order match_id start_time leagueid patch name radiant_team dire_team
# -- --------- --------- ------ ------- ---------- ------------ ---------- ------- ------------------------------------------------ -------------- -----------
# 0 False 98 0 0 5334428840 1586029157 11823 7.25 ESL One Los Angeles 2020 Online powered by Intel Cyber Legacy B8
# 1 False 95 1 1 5334428840 1586029157 11823 7.25 ESL One Los Angeles 2020 Online powered by Intel Cyber Legacy B8
# 2 False 66 0 2 5334428840 1586029157 11823 7.25 ESL One Los Angeles 2020 Online powered by Intel Cyber Legacy B8
数据样本
data = [{'match_id': 5334428840, 'start_time': 1586029157, 'leagueid': 11823, 'patch': '7.25', 'name': 'ESL One Los Angeles 2020 Online powered by Intel', 'radiant_team': 'Cyber Legacy', 'dire_team': 'B8', 'picks_bans':
[{'is_pick': False, 'hero_id': 98, 'team': 0, 'order': 0}, {'is_pick': False, 'hero_id': 95, 'team': 1, 'order': 1}, {'is_pick': False, 'hero_id': 66, 'team': 0, 'order': 2},
{'is_pick': False, 'hero_id': 43, 'team': 1, 'order': 3}, {'is_pick': False, 'hero_id': 49, 'team': 0, 'order': 4}, {'is_pick': False, 'hero_id': 110, 'team': 1, 'order': 5},
{'is_pick': False, 'hero_id': 79, 'team': 0, 'order': 6}, {'is_pick': False, 'hero_id': 106, 'team': 1, 'order': 7}, {'is_pick': True, 'hero_id': 96, 'team': 0, 'order': 8},
{'is_pick': True, 'hero_id': 86, 'team': 1, 'order': 9}, {'is_pick': True, 'hero_id': 129, 'team': 1, 'order': 10}, {'is_pick': True, 'hero_id': 50, 'team': 0, 'order': 11},
{'is_pick': False, 'hero_id': 12, 'team': 0, 'order': 12}, {'is_pick': False, 'hero_id': 77, 'team': 1, 'order': 13}, {'is_pick': True, 'hero_id': 128, 'team': 1, 'order': 14},
{'is_pick': True, 'hero_id': 121, 'team': 0, 'order': 15}, {'is_pick': True, 'hero_id': 41, 'team': 1, 'order': 16}, {'is_pick': True, 'hero_id': 42, 'team': 0, 'order': 17},
{'is_pick': False, 'hero_id': 126, 'team': 1, 'order': 18}, {'is_pick': False, 'hero_id': 65, 'team': 0, 'order': 19}, {'is_pick': True, 'hero_id': 31, 'team': 0, 'order': 20},
{'is_pick': True, 'hero_id': 45, 'team': 1, 'order': 21}]},
{'match_id': 5334428840, 'start_time': 1586029157, 'leagueid': 11823, 'patch': '7.25', 'name': 'ESL One Los Angeles 2020 Online powered by Intel', 'radiant_team': 'Cyber Legacy', 'dire_team': 'B8', 'picks_bans':
[{'is_pick': False, 'hero_id': 98, 'team': 0, 'order': 0}, {'is_pick': False, 'hero_id': 95, 'team': 1, 'order': 1}, {'is_pick': False, 'hero_id': 66, 'team': 0, 'order': 2},
{'is_pick': False, 'hero_id': 43, 'team': 1, 'order': 3}, {'is_pick': False, 'hero_id': 49, 'team': 0, 'order': 4}, {'is_pick': False, 'hero_id': 110, 'team': 1, 'order': 5},
{'is_pick': False, 'hero_id': 79, 'team': 0, 'order': 6}, {'is_pick': False, 'hero_id': 106, 'team': 1, 'order': 7}, {'is_pick': True, 'hero_id': 96, 'team': 0, 'order': 8},
{'is_pick': True, 'hero_id': 86, 'team': 1, 'order': 9}, {'is_pick': True, 'hero_id': 129, 'team': 1, 'order': 10}, {'is_pick': True, 'hero_id': 50, 'team': 0, 'order': 11},
{'is_pick': False, 'hero_id': 12, 'team': 0, 'order': 12}, {'is_pick': False, 'hero_id': 77, 'team': 1, 'order': 13}, {'is_pick': True, 'hero_id': 128, 'team': 1, 'order': 14},
{'is_pick': True, 'hero_id': 121, 'team': 0, 'order': 15}, {'is_pick': True, 'hero_id': 41, 'team': 1, 'order': 16}, {'is_pick': True, 'hero_id': 42, 'team': 0, 'order': 17},
{'is_pick': False, 'hero_id': 126, 'team': 1, 'order': 18}, {'is_pick': False, 'hero_id': 65, 'team': 0, 'order': 19}, {'is_pick': True, 'hero_id': 31, 'team': 0, 'order': 20},
{'is_pick': True, 'hero_id': 45, 'team': 1, 'order': 21}]}
]
关于python - 从 api 响应解析嵌套 json 的最佳方法,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61034851/