python - Pandas json_normalize returns KeyError

Tags: python json python-3.x pandas dataframe

I have a dataset loaded from a JSON file, in the following format:

data = {'data': {'content': [{'gender': 'Female',
    'id': 'covid-1004200003256',
    'state_code': '3272',
    'district_code': '3272040',
    'subdistrict_code': '3272040004',
    'latitude': -6.906,
    'longitude': 106.923,
    'state_name': 'KOTA SUKABUMI',
    'district_name': 'Gunungpuyuh',
    'subdistrict_name': 'Karamat',
    'stage': 'Isolated',
    'status': 'SUSPECT'},
   {'gender': 'Female',
    'id': 'covid-1004200003255',
    'state_code': '3272',
    'district_code': '3272040',
    'subdistrict_code': '3272040004',
    'latitude': -6.906,
    'longitude': 106.923,
    'state_name': 'KOTA SUKABUMI',
    'district_name': 'Gunungpuyuh',
    'subdistrict_name': 'Karamat',
    'stage': 'Isolated',
    'status': 'SUSPECT',
    }]}}

I want to build a DataFrame from it with json_normalize:

df = pd.json_normalize(data, 'content')
df.head(10)

But it raises:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-36-4d8ad8c8743a> in <module>()
----> 1 df = pd.json_normalize(data, 'content')
      2 df.head(10)

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _json_normalize(data, record_path, meta, meta_prefix, record_prefix, errors, sep, max_level)
    334                 records.extend(recs)
    335 
--> 336     _recursive_extract(data, record_path, {}, level=0)
    337 
    338     result = DataFrame(records)

/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _recursive_extract(data, path, seen_meta, level)
    307         else:
    308             for obj in data:
--> 309                 recs = _pull_records(obj, path[0])
    310                 recs = [
    311                     nested_to_record(r, sep=sep, max_level=max_level)

/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _pull_records(js, spec)
    246         if has non iterable value.
    247         """
--> 248         result = _pull_field(js, spec)
    249 
    250         # GH 31507 GH 30145, GH 26284 if result is not list, raise TypeError if not

/usr/local/lib/python3.6/dist-packages/pandas/io/json/_normalize.py in _pull_field(js, spec)
    237                 result = result[field]
    238         else:
--> 239             result = result[spec]
    240         return result
    241 

KeyError: 'content'

Is there any way to fix this?

Best answer

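Your call fails because record_path is looked up relative to the object you pass in. The top-level dict data has only one key, 'data'; 'content' sits one level below it, so the lookup raises KeyError: 'content'. A single string given as record_path can only name a key at the first level of the object being normalized.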
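To see why the lookup fails, it can help to inspect the keys at each level. The snippet below is a minimal sketch: `sample` is a hypothetical, abbreviated stand-in for the question's data (one record, two fields), not the full dataset.

```python
# Hypothetical, abbreviated stand-in for the structure in the question.
sample = {'data': {'content': [{'gender': 'Female', 'status': 'SUSPECT'}]}}

# Keys visible at the top level, which is where a string record_path is looked up.
top_level_keys = list(sample.keys())
# Keys one level down, under 'data'.
nested_keys = list(sample['data'].keys())

print(top_level_keys)        # ['data']
print(nested_keys)           # ['content']
print('content' in sample)   # False
```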

So pass data['data'] instead, like this:

In [934]: df = pd.json_normalize(data['data'], 'content')

In [934]: df
Out[934]: 
   gender                   id state_code district_code subdistrict_code  latitude  longitude     state_name district_name subdistrict_name     stage   status
0  Female  covid-1004200003256       3272       3272040       3272040004    -6.906    106.923  KOTA SUKABUMI   Gunungpuyuh          Karamat  Isolated  SUSPECT
1  Female  covid-1004200003255       3272       3272040       3272040004    -6.906    106.923  KOTA SUKABUMI   Gunungpuyuh          Karamat  Isolated  SUSPECT
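As an alternative that is not part of the original answer, record_path also accepts a list of keys, which lets json_normalize walk down from the top-level dict itself. The sketch below assumes pandas 1.0 or later (where pd.json_normalize is available) and uses a shortened version of the question's records; the names df_path and df_sub are chosen here for illustration.

```python
import pandas as pd

# Shortened version of the question's data: same nesting, fewer fields.
data = {'data': {'content': [
    {'id': 'covid-1004200003256', 'stage': 'Isolated', 'status': 'SUSPECT'},
    {'id': 'covid-1004200003255', 'stage': 'Isolated', 'status': 'SUSPECT'},
]}}

# Give the full path to the list of records as a list of keys.
df_path = pd.json_normalize(data, record_path=['data', 'content'])

# The approach from the answer: index into 'data' first, then name 'content'.
df_sub = pd.json_normalize(data['data'], 'content')

# Both calls should produce the same frame.
print(df_path.equals(df_sub))
print(df_path)
```

Passing the path as a list keeps the original object intact, which may be convenient if meta fields from the outer levels are needed later.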

Source: python - Pandas json_normalize returns KeyError, on Stack Overflow: https://stackoverflow.com/questions/65308566/
