python - I'm getting a KeyError in a scraped JSON

Tags: python json beautifulsoup keyerror

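I scraped some JSON from a website. When I try to iterate over it, I get a KeyError, and I'm not sure why. The loop stays within the length of the JSON. Any ideas about what is happening?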

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:90.0) Gecko/20100101 Firefox/90.0"
}
url = "" \
      "=Remote&location_city=San%20Diego&location_city=Encinitas&location_city=Murrieta&location_city=La%20Jolla" \
      "&location_city=Not%20Specified&location_city=Vista&sort_by=score&sort_order=DESC "
request = requests.get(url, headers=headers)
response = BeautifulSoup(request.text, "html.parser")
all_data = response.find_all("script", {"type": "application/ld+json"})
df = pd.DataFrame(columns=("Title", "Department", "Salary Range", "Appointment Percent", "URL"))

for data in all_data:
    jsn = json.loads(data.string)
    jsn_length = len(jsn['itemListElement'])
    # print(json.dumps(jsn, indent=4))
    n = 0
    while n < jsn_length:
        # print(jsn['itemListElement'][n])
        df['URL'] = jsn['itemListElement'][n]
        n += 1


Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2022.1\plugins\python\helpers\pydev\", line 1491, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2022.1\plugins\python\helpers\pydev\_pydev_imps\", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/Will/PycharmProjects/UCSD_JOB_SCRAPE/", line 19, in <module>
    jsn_length = len(jsn['itemListElement'])
KeyError: 'itemListElement'


Element number 250 of the JSON you are referring to does indeed appear to have no itemListElement key:

{
  "@context": "",
  "@type": "Organization",
  "url": "",
  "logo": "",
  "name": "UC San Diego "
}


for data in all_data:
    jsn = json.loads(data.string)
    # Some ld+json blocks (e.g. the Organization one) have no itemListElement key
    if jsn.get('itemListElement') is None:
        print('No itemListElement in the JSON. The JSON is\n' + data.string)
    else:
        # Only index into itemListElement when the key is actually present
        jsn_length = len(jsn['itemListElement'])
        n = 0
        while n < jsn_length:
            # print(jsn['itemListElement'][n])
            df['URL'] = jsn['itemListElement'][n]
            n += 1
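
As a variation on the check above, here is a minimal sketch that skips blocks without itemListElement by using dict.get() with a default, and gathers the values into a plain list first. The sample blocks, the collect_urls helper, the example.com addresses and the assumption that each list item carries a "url" key are all hypothetical; the real page's items may be shaped differently.

```python
import json

# Hypothetical stand-ins for the strings found in the
# <script type="application/ld+json"> tags; the real page content may differ.
sample_blocks = [
    json.dumps({
        "@type": "ItemList",
        "itemListElement": [
            {"@type": "ListItem", "position": 1, "url": "https://example.com/job/1"},
            {"@type": "ListItem", "position": 2, "url": "https://example.com/job/2"},
        ],
    }),
    # A block like the Organization one, which has no itemListElement key.
    json.dumps({"@type": "Organization", "name": "UC San Diego "}),
]


def collect_urls(blocks):
    """Return the 'url' of every list item, skipping blocks without 'itemListElement'."""
    urls = []
    for raw in blocks:
        jsn = json.loads(raw)
        # .get() with a default list means a missing key yields no items
        # instead of raising KeyError.
        for item in jsn.get("itemListElement", []):
            url = item.get("url")  # assumed key name, see the note above
            if url is not None:
                urls.append(url)
    return urls


urls = collect_urls(sample_blocks)
print(urls)
```

The resulting list could then be turned into a table in one step, for example with pd.DataFrame({"URL": urls}), rather than assigning to df['URL'] inside the loop.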
