python - Web scraping in Python: IndexError: list index out of range

Tags: python web-scraping beautifulsoup compiler-errors html-formatting

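This script reads a single URL from a text file, pulls information from that web page, and stores it in a CSV file. The script works fine for a single URL.
Problem: I have added several URLs to the text file, one per line, and now I want the script to read the first URL, do the required work, then go back to the text file, read the second URL, and repeat.
After adding a for loop to do this, I am facing the following error: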

Traceback (most recent call last):
  File "C:\Users\T947610\Desktop\hahah.py", line 22, in <module>
    table = soup.findAll("table", {"class":"display"})[0] #Facing error in this statement
IndexError: list index out of range

f = open("URL.txt", 'r')
for line in f.readlines():
    print (line)
    page = requests.get(line)
    print(page.status_code)
    print(page.content)
    soup = BeautifulSoup(page.text, 'html.parser')
    print("soup command worked")
    table = soup.findAll("table", {"class":"display"})[0] #Facing error in this statement
    rows = table.findAll("tr")

Best answer

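If the script works with a single URL, the cause may be the newline character that comes with each line read from the .txt file. Try applying .strip() to the line; lines read from a file usually carry whitespace at the head and tail: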

page = requests.get(line.strip())
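
A minimal sketch of that cleanup, assuming the file holds one URL per line. The helper name read_urls is hypothetical and not part of the original script. It strips each line and skips blank ones, so the URL handed to requests.get carries no trailing newline:

```python
def read_urls(path):
    """Return the URLs in a text file, one per line, without surrounding whitespace."""
    urls = []
    with open(path, "r") as f:
        for line in f:
            url = line.strip()  # drop the trailing "\n" and any stray spaces
            if url:             # ignore empty lines
                urls.append(url)
    return urls
```

Each entry of read_urls("URL.txt") can then be passed to requests.get directly, in place of the raw line from f.readlines().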

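Also, if soup.findAll() finds nothing, it returns an empty list, and indexing an empty list with [0] raises exactly this IndexError. (It is soup.find() that returns None when nothing matches.) Try printing soup and checking whether the page really contains a table with class "display".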
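
One way to guard against the empty result is sketched below. The function name first_display_table is hypothetical, and find_all is the current Beautiful Soup spelling of findAll. The sketch checks whether any table matched before indexing, and returns None instead of raising:

```python
from bs4 import BeautifulSoup


def first_display_table(html):
    """Return the first <table class="display"> in html, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    tables = soup.find_all("table", {"class": "display"})
    if not tables:  # empty result: indexing [0] here would raise IndexError
        return None
    return tables[0]
```

Inside the loop, a None result can be reported together with the URL and skipped with continue, which also shows which line of the text file produced a page without the expected table.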

Regarding "python - Web scraping in Python: IndexError: list index out of range", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59023899/
