python - How do I extract two values from a nested data sample into a pandas DataFrame?

I am using the dataset from Stanford (see Dev Set 2.0). The file is in JSON format. When I read the file it is a dictionary, but I converted it to a DataFrame:

import json
import pandas as pd

# Load the SQuAD dev set; the top-level object is a dict
with open("dev-v2.0.json", "r") as json_file:
    json_data = json.load(json_file)

df = pd.DataFrame.from_dict(json_data)
df = df[0:2]  # for this example, only a subset

All the information I need is in the df['data'] column. Each row holds a lot of data, in the following format:

{'title': 'Normans', 'paragraphs': [{'qas': [{'question': 'In what country is Normandy located?', 'id': '56ddde6b9a695914005b9628', 'answers': [{'text': 'France', 'answer_start': 159}, {'text': 'France', 'answer_start': 159}, {'text': 'France', 'answer_start': 159}, {'text': 'France', 'answer_start': 159}], 'is_impossible': False}, {'question': 'When were the Normans in Normandy?', 'id': '56ddde6b9a695914005b9629', 'answers': [{'text': '10th and 11th centuries', 'answer_start': 94}, {'text': 'in the 10th and 11th centuries', 'answer_start': 87}

I want to pull every question and answer out of all rows of the DataFrame. Ideally, the output would look like this:

Question                                         Answer 
'In what country is Normandy located?'          'France'
'When were the Normans in Normandy?'            'in the 10th and 11th centuries'

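Sorry in advance! I have read the 'Good example' post, but I find it hard to produce reproducible data for this example, because it looks like a dictionary containing a list, and inside the list is a small dictionary, inside another dictionary, and then yet another dictionary... When I use print(df["data"]), it prints only a small part (which does not help in reproducing the problem).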

print(df['data'])
0    {'title': 'Normans', 'paragraphs': [{'qas': [{...
1    {'title': 'Computational_complexity_theory', '...
Name: data, dtype: object

Thanks in advance!

Best answer

The following page, "SQuAD (Stanford Q&A) json to Pandas DataFrame", covers converting dev-v1.1.json to a DataFrame.
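For the v2.0 layout shown in the question, one possible approach is to walk the nesting (article, then paragraphs, then qas, then answers) and collect one row per distinct answer. The following is only a minimal sketch, not the code from the linked page: the function name squad_to_qa_frame and the in-memory sample dict are hypothetical, and the sample copies only the fields visible in the question (real entries may carry more keys).

```python
import pandas as pd

# Hypothetical in-memory sample mimicking the structure printed in the question
# (assumption: the real file has a top-level "data" list of such articles).
sample = {
    "version": "v2.0",
    "data": [
        {
            "title": "Normans",
            "paragraphs": [
                {
                    "qas": [
                        {
                            "question": "In what country is Normandy located?",
                            "id": "56ddde6b9a695914005b9628",
                            "answers": [
                                {"text": "France", "answer_start": 159},
                                {"text": "France", "answer_start": 159},
                            ],
                            "is_impossible": False,
                        },
                        {
                            "question": "When were the Normans in Normandy?",
                            "id": "56ddde6b9a695914005b9629",
                            "answers": [
                                {"text": "10th and 11th centuries", "answer_start": 94},
                                {"text": "in the 10th and 11th centuries", "answer_start": 87},
                            ],
                            "is_impossible": False,
                        },
                    ],
                }
            ],
        }
    ],
}


def squad_to_qa_frame(squad):
    """Flatten article -> paragraphs -> qas -> answers into one row per distinct answer."""
    rows = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                # dict.fromkeys removes duplicate answer texts, keeping first-seen order
                texts = list(dict.fromkeys(a["text"] for a in answers))
                # Questions without answers are kept, with Answer set to None
                for text in texts or [None]:
                    rows.append({"Question": qa["question"], "Answer": text})
    return pd.DataFrame(rows, columns=["Question", "Answer"])


qa_df = squad_to_qa_frame(sample)
print(qa_df)
```

With the real file, the same function could be called on json_data as loaded in the question. If only one row per question is wanted, qa_df.drop_duplicates(subset="Question") keeps the first answer for each question.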

The original question, "python - How do I extract two values from a nested data sample into a pandas DataFrame?", can be found on Stack Overflow: https://stackoverflow.com/questions/58268919/
