python - Ebay website hangs after clicking the login button - Selenium Python

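I wrote the following code to log in to a website. So far it only fetches the page and accepts the cookies, but when I try to log in by clicking the login button, the page hangs and the login page never loads.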

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, ElementNotInteractableException


# Accept consent cookies
def accept_cookies(browser):
    try:
        browser.find_element_by_xpath('//*[@id="gdpr-banner-accept"]').click()
    except NoSuchElementException:
        print('Cookies already accepted')
        

# Webpage parameters
base_site = "https://www.ebay-kleinanzeigen.de/"

# Setup remote control browser
fireFoxOptions = webdriver.FirefoxOptions()
#fireFoxOptions.add_argument("--headless")
browser = webdriver.Firefox(executable_path = '/home/Webdriver/bin/geckodriver',firefox_options=fireFoxOptions)
browser.get(base_site)
accept_cookies(browser)

# Click login pop-up 
browser.find_elements_by_xpath("//*[contains(text(), 'Einloggen')]")[1].click()

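Note: there are two login buttons (one in a pop-up and one in the page itself). I have tried both, with the same result.

I have done similar things on other websites without any problem, so I am curious why it does not work here.

Any ideas why this happens, or how to fix it?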

Best answer

I took your code, modified it slightly by adding a couple of optional arguments, and executed it. Here is the result:

Code block:

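from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# The driver setup and the optional arguments mentioned above are not shown in the source.
# Assumption: a default Chrome driver, with chromedriver available on PATH.
driver = webdriver.Chrome()
driver.get("https://www.ebay-kleinanzeigen.de/")
# Wait up to 20 seconds for the cookie banner button to become clickable, then accept
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@id='gdpr-banner-accept']"))).click()
# Wait up to 20 seconds for the login link to become clickable, then click it
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[contains(text(), 'Einloggen')]"))).click()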

Observation: My observation matched yours. The page hangs and the login page never loads.
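To turn the hang into something a script can report instead of waiting on it indefinitely, one option is to poll the browser's URL against a deadline after the click. The helper below is only a sketch: `wait_for_url_change` is a hypothetical name, it assumes that a successful login click changes `driver.current_url`, and it relies on nothing but that Selenium property.

```python
import time


def wait_for_url_change(driver, old_url, timeout=20.0, poll=0.5):
    """Return True if driver.current_url differs from old_url within `timeout` seconds.

    `driver` is any object exposing a `current_url` attribute, such as a
    Selenium WebDriver. Assumption: a successful click navigates to a new URL.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if driver.current_url != old_url:
            return True
        time.sleep(poll)
    # One final check after the deadline has passed
    return driver.current_url != old_url
```

In use, you would store `driver.current_url` before clicking the login link and call `wait_for_url_change(driver, old_url)` afterwards; a `False` result means the navigation did not happen in time. Selenium's own `WebDriverWait` combined with `EC.url_changes(old_url)` expresses the same idea and raises `TimeoutException` on failure.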

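Deep dive

While inspecting the DOM tree of the webpage, you will find that some of the <script> and <link> tags refer to JavaScript files whose paths contain the keyword dist. As an example: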

<script type="text/javascript" async="" src="/static/js/lib/node_modules/@ebayk/prebid/dist/prebid.10o55zon5xxyi.js"></script>
window.BelenConf.prebidFileSrc = '/static/js/lib/node_modules/@ebayk/prebid/dist/prebid.10o55zon5xxyi.js';

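I take this as a clear indication that the website is protected by the bot management service provider Distil Networks, and that navigation by ChromeDriver gets detected and subsequently blocked.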
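The DOM inspection can also be done in code rather than by eye. The sketch below uses only the standard library's `html.parser`; `AssetCollector` and `find_assets` are hypothetical names, and the default `/dist/` keyword is an assumption taken from the example path. Keep in mind that `dist` is also the conventional build-output directory of npm packages, so a match is a hint to investigate further, not proof of a particular vendor.

```python
from html.parser import HTMLParser


class AssetCollector(HTMLParser):
    """Collect src/href values of <script> and <link> tags that contain a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Only script and link tags are of interest here
        if tag not in ("script", "link"):
            return
        for name, value in attrs:
            if name in ("src", "href") and value and self.keyword in value:
                self.matches.append(value)


def find_assets(html, keyword="/dist/"):
    """Return the asset URLs in `html` whose path contains `keyword`."""
    collector = AssetCollector(keyword)
    collector.feed(html)
    return collector.matches
```

With a live session, `find_assets(driver.page_source)` would list the matching asset paths of the page currently loaded.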

Distil

As per the article There Really Is Something About Distil.it...:

Distil protects sites against automatic content scraping bots by observing site behavior and identifying patterns peculiar to scrapers. When Distil identifies a malicious bot on one site, it creates a blacklisted behavioral profile that is deployed to all its customers. Something like a bot firewall, Distil detects patterns and reacts.

Further,

"One pattern with Selenium was automating the theft of Web content", Distil CEO Rami Essaid said in an interview last week. "Even though they can create new bots, we figured out a way to identify Selenium the a tool they're using, so we're blocking Selenium no matter how many times they iterate on that bot. We're doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious".
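The quote does not say how Selenium is identified. One publicly documented signal is the `navigator.webdriver` property from the W3C WebDriver specification, which a page's JavaScript can read and which is true in a browser under WebDriver control. Whether Distil relies on it is not stated in the source, so treat the following as a sketch for checking what your own session exposes; `reports_webdriver` is a hypothetical helper name.

```python
def reports_webdriver(driver):
    """Return True if the current page sees navigator.webdriver as true.

    `driver` is any object with an `execute_script` method, such as a
    Selenium WebDriver. Assumption: this flag is only one of several
    possible signals, and the source does not say which ones Distil uses.
    """
    return bool(driver.execute_script("return navigator.webdriver"))
```

Calling `reports_webdriver(driver)` right after `driver.get(...)` shows whether the page can see this particular automation flag.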

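References

You can find a couple of detailed discussions in:

Regarding python - Ebay website hangs after clicking the login button - Selenium Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65658258/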
