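I'm trying to scrape a page of written reviews on Tripadvisor, but I'm having trouble clicking the "More" button to expand all the written reviews on the page. I looked at a similar question (thanks, Saurabh Gaur), but when I use selenium to click the button, a login page pops up.
Is there a way to click the "More" button without triggering this login page? Thank you! :)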
from selenium import webdriver
import re
from bs4 import BeautifulSoup


def clicker(url):
    browser = webdriver.Firefox()
    browser.get(url)

    # Use regex to find that button link
    pageSource = browser.page_source
    soup = BeautifulSoup(pageSource, 'html.parser')

    # Example: soup.findAll(True, {'class': re.compile(r'\bclass1\b')})
    Regex = re.compile(r'.*\bmoreLink.ulBlueLinks.*')
    linkElem = soup.find('span', class_=Regex)['class']
    linkElem = '.'.join(linkElem[0:(len(linkElem)+1)])
    moreButton = 'span.' + linkElem
    print(moreButton)

    button = browser.find_element_by_css_selector(moreButton)
    print(button)
    browser.execute_script("arguments[0].click()", button)


clicker('https://www.tripadvisor.com.sg/Hotel_Review-g295424-d1209362-Reviews-Residence_Spa_at_One_Only_Royal_Mirage_Dubai-Dubai_Emirate_of_Dubai.html')
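Best answer
Here is some sample code for your reference: you can use selenium together with PhantomJS and then click the button. I passed the name attribute of the target tag to find_element_by_name; change it to match the element you need.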
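from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, WebDriverException


def openUrl(link):
    # PhantomJS driver and find_element_by_name are Selenium 3.x APIs.
    driver = webdriver.PhantomJS(
        executable_path='../../phantomjs/bin/phantomjs')
    try:
        driver.get(link)
    except WebDriverException:
        # selenium raises WebDriverException here, not urllib's HTTPError
        print('Error opening ' + link)
        return None
    # Parse the page as loaded, before the button is clicked
    bsObj = BeautifulSoup(driver.page_source, 'html.parser')
    try:
        # Locate the button by its name attribute and click it
        elem1 = driver.find_element_by_name('checkAndShowAnswers')
        elem1.click()
    except NoSuchElementException:
        # 'continue' is only valid inside a loop, so return instead
        return None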
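The find_element_by_* helpers and the PhantomJS driver used above were removed in Selenium 4, so here is a rough sketch of the same find-and-click step written against the newer find_elements(by, value) call. It is only an illustration, not the answerer's code: class_selector and click_all are hypothetical helper names, the polling loop is a hand-rolled stand-in for selenium's own wait utilities, and whether the JavaScript click avoids the login page on Tripadvisor is not something this sketch can promise.

```python
import time


def class_selector(tag, classes):
    """Build a CSS selector such as 'span.a.b' from a tag and class names."""
    return ".".join([tag, *classes])


def click_all(driver, selector, timeout=10.0, poll=0.5):
    """Click every element matching `selector`; return how many were clicked.

    `driver` is assumed to be a Selenium 4 WebDriver. The string
    "css selector" is the value of selenium's By.CSS_SELECTOR constant.
    """
    deadline = time.monotonic() + timeout
    elements = driver.find_elements("css selector", selector)
    # Poll until at least one match appears or the timeout expires.
    while not elements and time.monotonic() < deadline:
        time.sleep(poll)
        elements = driver.find_elements("css selector", selector)
    for element in elements:
        # Click through JavaScript, the same way the question's code does.
        driver.execute_script("arguments[0].click();", element)
    return len(elements)
```

After driver.get(url), this could be called as click_all(driver, class_selector('span', ['moreLink', 'ulBlueLinks'])). The class names come from the regex in the question; the page's actual markup is an assumption and may have changed.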
Source: Stack Overflow, "javascript - How to click the "More" button when scraping Tripadvisor with selenium?": https://stackoverflow.com/questions/40254983/