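java - Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection

I'm trying to automate a very basic task on a website using selenium and chrome, but somehow the website detects when chrome is driven by selenium and blocks every request. I suspect that the website relies on an exposed DOM variable like this one https://stackoverflow.com/a/41904453/648236 to detect selenium-driven browsers.

My question is, is there a way I can make the navigator.webdriver flag false? I am willing to go as far as recompiling the selenium source after making modifications, but I cannot seem to find the NavigatorAutomationInformation source anywhere in the repository https://github.com/SeleniumHQ/selenium

Any help is much appreciated.

P.S: I also tried the following from https://w3c.github.io/webdriver/#interface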

Object.defineProperty(navigator, 'webdriver', {
    get: () => false,
  });

But it only updates the property after the initial page load. I think the site detects the variable before my script is executed.

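Best answer

First the update¹

execute_cdp_cmd(): With the availability of execute_cdp_cmd(cmd, cmd_args) you can now execute Chrome DevTools Protocol (google-chrome-devtools) commands using Selenium. Using this feature you can modify navigator.webdriver to prevent Selenium from being detected.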
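
As a minimal sketch of this idea (the answer does not show the exact command at this point), the DevTools method Page.addScriptToEvaluateOnNewDocument registers a script that runs at the start of every new document, before the page's own scripts. That would address the timing problem described in the question. The names HIDE_WEBDRIVER_JS and hide_webdriver_flag are hypothetical, and driver is assumed to be a Chrome driver from Selenium's Python client.

```python
# JavaScript that redefines navigator.webdriver so that it reads as undefined.
HIDE_WEBDRIVER_JS = (
    "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
)


def hide_webdriver_flag(driver):
    """Register HIDE_WEBDRIVER_JS to run at the start of every new document.

    Assumption: `driver` exposes execute_cdp_cmd(cmd, cmd_args), as the
    Chrome driver in Selenium's Python client does.
    """
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": HIDE_WEBDRIVER_JS},
    )
```

Under these assumptions, you would call hide_webdriver_flag(driver) once after creating the driver and before the first driver.get(...), so the override is already in place when the page loads.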


Preventing detection²

To prevent a Selenium-driven WebDriver from being detected, one approach is to apply any or all of the following steps:

Adding the argument --disable-blink-features=AutomationControlled

from selenium import webdriver

options = webdriver.ChromeOptions() 
options.add_argument('--disable-blink-features=AutomationControlled')
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://www.website.com")

You can find a relevant detailed discussion in "Selenium can't open a second page".

Rotating the user-agent through the execute_cdp_cmd() command as follows:

#Setting up Chrome/83.0.4103.53 as useragent
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36'})

Changing the value of the navigator property webdriver to undefined

driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")

Excluding the enable-automation switch

options.add_experimental_option("excludeSwitches", ["enable-automation"])

Turning off useAutomationExtension

options.add_experimental_option('useAutomationExtension', False)
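
The user-agent step above sets one fixed string. A rough sketch of actual rotation, picking a string at random from a pool before each override, might look like the following. The pool USER_AGENTS and the helper rotate_user_agent are hypothetical names; the second user-agent string is only an illustrative variant of the one used above, and driver is assumed to expose execute_cdp_cmd as Selenium's Python Chrome driver does.

```python
import random

# Illustrative pool: the first entry is the string used in the step above,
# the second is a made-up variant added only for this sketch.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36",
]


def rotate_user_agent(driver, user_agents=USER_AGENTS):
    """Pick a random user-agent from the pool and apply it through CDP."""
    user_agent = random.choice(user_agents)
    driver.execute_cdp_cmd(
        "Network.setUserAgentOverride", {"userAgent": user_agent}
    )
    return user_agent
```

The helper returns the chosen string so the caller can log it or compare it with what navigator.userAgent reports afterwards.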


Sample code³

Combining the steps mentioned above, an effective code block would be:

from selenium import webdriver

options = webdriver.ChromeOptions() 
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36'})
print(driver.execute_script("return navigator.userAgent;"))
driver.get('https://www.httpbin.org/headers')
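
To check whether the changes took effect on the loaded page, one option is to read back the values that page scripts would see. This is only a sketch: read_automation_signals is a hypothetical helper, and driver is assumed to be the driver object created above.

```python
def read_automation_signals(driver):
    """Return the values a page script would read for two common signals.

    Assumption: `driver` exposes execute_script(script), as Selenium's
    Python WebDriver classes do.
    """
    return {
        "webdriver": driver.execute_script("return navigator.webdriver"),
        "userAgent": driver.execute_script("return navigator.userAgent"),
    }
```

Calling it after driver.get(...) matters, because a property redefined with execute_script on one document does not carry over to the next page that is loaded.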


History

As per the W3C Editor's Draft, the current implementation strictly mentions:

The webdriver-active flag is set to true when the user agent is under remote control which is initially set to false.

Further,

Navigator includes NavigatorAutomationInformation;

It is to be noted that:

The NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.

The NavigatorAutomationInformation interface is defined as:

interface mixin NavigatorAutomationInformation {
    readonly attribute boolean webdriver;
};

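The webdriver attribute returns true if the webdriver-active flag is set, false otherwise.

Finally, navigator.webdriver defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, so that alternate code paths can be triggered during automation.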

Caution: Altering/tweaking the above mentioned parameters may block the navigation and get the WebDriver instance detected.


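Update (6-Nov-2019)

As of the current implementation, an ideal way to access a web page without getting detected is to use the ChromeOptions() class to:

Exclude the enable-automation switch
Turn off useAutomationExtension

through an instance of ChromeOptions as follows:

Java Example: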

System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
ChromeOptions options = new ChromeOptions();
options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
options.setExperimentalOption("useAutomationExtension", false);
WebDriver driver =  new ChromeDriver(options);
driver.get("https://www.google.com/");

Python Example:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("https://www.google.com/")

Ruby Example:

  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument("--disable-blink-features=AutomationControlled")
  driver = Selenium::WebDriver.for :chrome, options: options


Legends

¹: Applies only to Selenium's Python clients.

²: Applies only to Selenium's Python clients.

³: Applies only to Selenium's Python clients.

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/61905122/
