python - Find a div class by the text of an element inside it

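I am scraping a gaming website and want to get the div objects that contain a specific text. In this case, I want the divs with class "GameItemWrap" that contain a link (an a tag with an href) whose text is "SANDBOX Ghost". There are many GameItemWrap divs on the page. I do not want the "SummonerName" div itself, because "GameItemWrap" contains other child elements that I also need.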

This is what I have tried:

duo_name='SANDBOX Ghost'    
gamelist=soup.find('div',"GameItemList")# "GameItemList" is a div that contains "GameItemWrap"
games=gamelist.find_all('GameItemWrap',{('a'):duo_name })

This is what the HTML I am scraping looks like:

<div class="GameItemWrap>
    #some other div classes that i will need in the future 
    <div class="SummonerName">                                                       
        <a href="//www.op.gg/summoner/userName=SANDBOX+Ghost" class="Link" target="_blank">SANDBOX Ghost</a>                                                 
    </div>
</div>

I expect 4 GameItemWrap divs containing the text "SANDBOX Ghost", but when I print

print(len(games))

the output is 0, so this does not work. Also, I would rather not loop over every GameItemWrap div to check whether it contains "SANDBOX Ghost". Is that possible?

Best answer

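After fixing the HTML shown (the class attribute on the first div is missing its closing quote), with bs4 4.7.1 you should be able to use the :contains pseudo-class: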

from bs4 import BeautifulSoup as bs

html ='''
<div class="GameItemWrap">
    #some other div classes that i will need in the future 
    <div class="SummonerName">                                                       
        <a href="//www.op.gg/summoner/userName=SANDBOX+Ghost" class="Link" target="_blank">SANDBOX Ghost</a>                                                 
    </div>
</div>
'''
duo_name = 'SANDBOX Ghost'
soup = bs(html, 'lxml') #'html.parser' if lxml not installed
items = soup.select('.GameItemWrap:contains("' + duo_name + '")')
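
For context, the attempt in the question returns 0 because find_all('GameItemWrap', {'a': duo_name}) searches for a tag named GameItemWrap that has an attribute called a, and no such tag exists. As far as I know, newer Soup Sieve releases also deprecate :contains in favor of :-soup-contains, so the selector above may emit a warning depending on the installed version. A minimal sketch of an alternative that avoids CSS selectors is shown below: it finds the matching links by their text and then climbs to the enclosing GameItemWrap. The second GameItemWrap and the name "Someone Else" are made-up sample data, and the exact-match behavior of string= assumes the link holds only the player name.

```python
from bs4 import BeautifulSoup

# Sample markup; the second GameItemWrap ("Someone Else") is invented for illustration.
html = '''
<div class="GameItemList">
  <div class="GameItemWrap">
    <div class="SummonerName">
      <a href="//www.op.gg/summoner/userName=SANDBOX+Ghost" class="Link" target="_blank">SANDBOX Ghost</a>
    </div>
  </div>
  <div class="GameItemWrap">
    <div class="SummonerName">
      <a href="//www.op.gg/summoner/userName=Someone+Else" class="Link" target="_blank">Someone Else</a>
    </div>
  </div>
</div>
'''

duo_name = 'SANDBOX Ghost'
soup = BeautifulSoup(html, 'html.parser')

# Find every <a> whose text is exactly duo_name
# (older bs4 releases spell this argument text= instead of string=).
links = soup.find_all('a', string=duo_name)

# Climb from each link to its enclosing div.GameItemWrap; skip links outside one.
games = []
for link in links:
    wrap = link.find_parent('div', class_='GameItemWrap')
    if wrap is not None:
        games.append(wrap)

print(len(games))
```

Each entry in games is the whole GameItemWrap tag, so the other child divs remain reachable from it. If one GameItemWrap could hold several matching links, the list would contain that wrapper more than once and would need de-duplicating.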

Regarding python - finding a div class by the text of an element inside it, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56230799/
