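python - Chrome browser launched through ChromeDriver gets detected

I am trying to use Selenium ChromeDriver in Python for the website www.mouser.co.uk. However, it is detected as a bot on the very first request.

Does anyone have an explanation for this? Here is the code I am using: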

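from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
options.add_argument("--start-maximized")
browser = webdriver.Chrome(service=Service('chromedriver.exe'), options=options)
wait = WebDriverWait(browser, 30)
browser.get('https://www.mouser.co.uk')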

Best answer

I tried to access the URL https://www.mouser.co.uk/ with a few Chrome options, but the browser was indeed detected and redirected to the Pardon Our Interruption page.

Code block:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
# Selenium 4 style: the driver path goes through Service, and chrome_options is replaced by options
service = Service(r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver = webdriver.Chrome(service=service, options=options)
driver.get("https://www.mouser.co.uk")
# Wait up to 30 seconds for the flag link to become clickable, then click it through JavaScript
myElement = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.XPATH, "//a[@id='1_lnkLeftFlag']")))
driver.execute_script("arguments[0].click();", myElement)

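Now, inspecting the Pardon Our Interruption page, you will find that the <body> tag contains:

class attribute dist-GlobalHeader
class attribute dist-PageWrap

This is a clear indication that the website is protected by the Bot Management service provider Distil Networks, and that navigation by ChromeDriver gets detected and subsequently blocked.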
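The check described above can be sketched in code. This is a minimal sketch, not an official detection method: the helper name find_distil_markers is hypothetical, the two class names are simply the ones reported in this answer, and other protected pages may use different markup. It uses only the standard library and scans every tag in an HTML string for those class names.

```python
from html.parser import HTMLParser

# Class names taken from the answer above (assumption: other sites may differ).
DISTIL_MARKERS = frozenset({"dist-GlobalHeader", "dist-PageWrap"})


class _ClassCollector(HTMLParser):
    """Collects every CSS class token seen on any start tag."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated tokens.
                self.classes.update(value.split())


def find_distil_markers(html):
    """Return the sorted marker class names that appear in the HTML string."""
    collector = _ClassCollector()
    collector.feed(html)
    collector.close()
    return sorted(DISTIL_MARKERS & collector.classes)
```

With the driver from the code block above, the helper could be called as find_distil_markers(driver.page_source) right after driver.get(...); a non-empty result suggests the interruption page was served instead of the requested one.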

Distil

As per the article There Really Is Something About Distil.it...:

Distil protects sites against automatic content scraping bots by observing site behavior and identifying patterns peculiar to scrapers. When Distil identifies a malicious bot on one site, it creates a blacklisted behavioral profile that is deployed to all its customers. Something like a bot firewall, Distil detects patterns and reacts.

Further,

"One pattern with Selenium was automating the theft of Web content", Distil CEO Rami Essaid said in an interview last week. "Even though they can create new bots, we figured out a way to identify Selenium, the tool they're using, so we're blocking Selenium no matter how many times they iterate on that bot. We're doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious".

Reference

You can find a couple of detailed discussions in:

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/52832413/
