python - Issue extracting child tag text with BeautifulSoup

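I'm having a problem with some code I'm running. It is meant to extract names from a website and eventually build a list of them. The names I want to capture appear in markup like this: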

<th class="left " data-append-csv="David-Cornell" data-stat="player" scope="row"><a href="/en/players/0c9aad01/David-Cornell">David Cornell</a></th>

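I have written code that captures all of these instances, but I get an error even when I use find_next to grab the following tag. I suspect I could just parse the returned text as a string, but that would be a lot of work for this purpose, especially with many different pages.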

from bs4 import BeautifulSoup as bsoup
import requests as reqs

page = reqs.get("https://fbref.com/en/squads/986a26c1/Northampton-Town")
parsepage = bsoup(page.content, 'html.parser')

findplayers = parsepage.find_all('th',attrs={"data-stat":"player"}).find_next('a')
print(findplayers)

For the life of me I cannot capture the next tag. I have tried a number of variations, and running this code produces the following error:

AttributeError: ResultSet object has no attribute 'find_next'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

How can I fix this?

Best answer

find_all returns a list (a ResultSet) of many elements, so you have to call find_next on each element individually. Use a for loop:

from bs4 import BeautifulSoup as bsoup
import requests as reqs

page = reqs.get("https://fbref.com/en/squads/986a26c1/Northampton-Town")
parsepage = bsoup(page.content, 'html.parser')

findplayers = parsepage.find_all('th',attrs={"data-stat":"player"})

for item in findplayers:
    print( item.find_next('a') )
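The loop above prints the whole <a> tag. To end up with a plain list of names, as the question asks, something like the following sketch may help. It parses an inline HTML string instead of fetching the live page, so it runs offline. The header row and the second player row ("Example Player" and its href) are hypothetical, added only for illustration; only the David Cornell row comes from the question. It also uses find('a') rather than find_next('a'), on the assumption that you want a link inside each cell: find_next continues through the rest of the document, so a cell without a link would pick up the link from a later cell.

```python
from bs4 import BeautifulSoup

# Inline sample so the sketch runs without a network request.
# The header row and "Example Player" are hypothetical illustration data.
html = """
<table>
<tr><th data-stat="player" scope="col">Player</th></tr>
<tr><th class="left " data-append-csv="David-Cornell" data-stat="player" scope="row"><a href="/en/players/0c9aad01/David-Cornell">David Cornell</a></th></tr>
<tr><th class="left " data-stat="player" scope="row"><a href="/en/players/00000000/Example-Player">Example Player</a></th></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

names = []
for cell in soup.find_all("th", attrs={"data-stat": "player"}):
    # find() searches only inside this cell; it returns None for the
    # header cell, which has no link.
    link = cell.find("a")
    if link is not None:
        names.append(link.get_text(strip=True))

print(names)

# Equivalent one-liner using a CSS selector: every <a> nested in a matching <th>.
names_css = [a.get_text(strip=True) for a in soup.select('th[data-stat="player"] a')]
print(names_css)
```

Both approaches give the same list here. With the real page, replace the inline string with page.content from the request.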

For more on python - Issue extracting child tag text with BeautifulSoup, see the similar question on Stack Overflow: https://stackoverflow.com/questions/56960254/
