I have a dataset loaded from a JSON file, in the following format:
data = {'data': {'content': [{'gender': 'Female',
'id': 'covid-1004200003256',
'state_code': '3272',
'district_code': '3272040',
'subdistrict_code': '3272040004',
'latitude': -6.906,
'longitude': 106.923,
'state_name': 'KOTA SUKABUMI',
'district_name': 'Gunungpuyuh',
'subdistrict_name': 'Karamat',
'stage': 'Isolated',
'status': 'SUSPECT'},
{'gender': 'Female',
'id': 'covid-1004200003255',
'state_code': '3272',
'district_code': '3272040',
'subdistrict_code': '3272040004',
'latitude': -6.906,
'longitude': 106.923,
'state_name': 'KOTA SUKABUMI',
'district_name': 'Gunungpuyuh',
'subdistrict_name': 'Karamat',
'stage': 'Isolated',
'status': 'SUSPECT',
}]}}
So I tried to build a DataFrame with json_normalize:
df = pd.json_normalize(data, 'content')
df.head(10)
But it returns:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-36-4d8ad8c8743a> in <module>()
----> 1 df = pd.json_normalize(data, 'content')
2 df.head(10)
3 frames
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
334 records.extend(recs)
335
--> 336 _recursive_extract(data, record_path, {}, level=0)
337
338 result = DataFrame(records)
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _recursive_extract(data, path, seen_meta, level)
307 else:
308 for obj in data:
--> 309 recs = _pull_records(obj, path[0])
310 recs = [
311 nested_to_record(r, sep=sep, max_level=max_level)
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _pull_records(js, spec)
246 if has non iterable value.
247 """
--> 248 result = _pull_field(js, spec)
249
250 # GH 31507 GH 30145, GH 26284 if result is not list, raise TypeError if not
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _pull_field(js, spec)
237 result = result[field]
238 else:
--> 239 result = result[spec]
240 return result
241
KeyError: 'content'
Is there any way to fix this?
Best answer
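Your call fails because content is not a top-level key of data; it is nested one level down, under data['data']. A string record_path is looked up as a key directly in the object you pass in, so it has to name a key at that object's first level.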
So pass data['data'] instead, like this:
In [934]: df = pd.json_normalize(data['data'], 'content')
In [934]: df
Out[934]:
gender id state_code district_code subdistrict_code latitude longitude state_name district_name subdistrict_name stage status
0 Female covid-1004200003256 3272 3272040 3272040004 -6.906 106.923 KOTA SUKABUMI Gunungpuyuh Karamat Isolated SUSPECT
1 Female covid-1004200003255 3272 3272040 3272040004 -6.906 106.923 KOTA SUKABUMI Gunungpuyuh Karamat Isolated SUSPECT
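As an alternative, record_path also accepts a list of keys, which json_normalize follows one level at a time, so the outer dict can be passed unchanged. The snippet below is a minimal sketch, assuming pandas 1.0 or later (where pd.json_normalize is available at the top level); the data dict is a trimmed copy of the one in the question, keeping only three fields for brevity.

```python
import pandas as pd

# Trimmed copy of the question's structure (only three fields kept for brevity).
data = {'data': {'content': [
    {'gender': 'Female', 'id': 'covid-1004200003256', 'status': 'SUSPECT'},
    {'gender': 'Female', 'id': 'covid-1004200003255', 'status': 'SUSPECT'},
]}}

# A list-valued record_path is followed key by key: data -> 'data' -> 'content'.
df = pd.json_normalize(data, record_path=['data', 'content'])
print(df)
```

Both forms should produce the same rows; the list form is mainly convenient when you want to keep the outer object around, for example to pull meta fields from it.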
Regarding python - Pandas json_normalize returns KeyError, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/65308566/