python - Chrome/Firefox opens a PDF in a new tab and does not save it in headless mode (Selenium+Python)

Tags: python selenium google-chrome firefox selenium-webdriver

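When running my tests I hit a problem with headless Chrome: clicking a button opens a PDF file in a new tab. If I run the test in non-headless mode, everything works fine. But when I try the same thing in headless mode, the file is not downloaded.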

from selenium.webdriver import ChromeOptions

options = ChromeOptions()
options.add_argument('--no-sandbox')
options.add_argument('--kiosk-printing')
options.add_argument('--test-type')
options.add_argument('--disable-infobars')
options.add_argument('--disable-gpu')
options.add_argument('--verbose')
options.add_argument('--disable-extensions')
options.add_argument('--ignore-certificate-errors')
options.add_experimental_option("prefs", {
    "profile.default_content_settings.popups": 0,
    # dwnld_path is the download directory, defined elsewhere in the test
    "download.default_directory": dwnld_path,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": False,
    # Ask Chrome to download PDFs instead of opening them in its built-in viewer
    "plugins.always_open_pdf_externally": True,
    "plugins.plugins_disabled": ["Chrome PDF Viewer"]
})

I also found this:

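# Register chromedriver's DevTools endpoint so commands can be sent through the driver
wd.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')

# Allow downloads in headless mode and direct them to dwnld_path
params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': dwnld_path}}
command_result = wd.execute("send_command", params)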

But this only helps when headless mode receives an actual download request, not when the file is opened in the browser.
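For reference, Selenium 4 exposes `execute_cdp_cmd` on Chromium-based drivers, so the same DevTools command can be sent without registering a custom `send_command` endpoint. The following is only a sketch under that assumption; the helper name `allow_headless_downloads` is made up for illustration, and it shares the limitation described above (it affects real download requests, not PDFs rendered in a tab).

```python
import os


def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, through the DevTools protocol, to save downloads into download_dir.

    `driver` is assumed to be a Selenium 4 Chromium-based WebDriver
    (for example webdriver.Chrome), which provides execute_cdp_cmd().
    """
    # Use an absolute path and make sure the directory exists.
    download_dir = os.path.abspath(download_dir)
    os.makedirs(download_dir, exist_ok=True)
    # Same command and parameters as the send_command snippet above.
    driver.execute_cdp_cmd(
        "Page.setDownloadBehavior",
        {"behavior": "allow", "downloadPath": download_dir},
    )
    return download_dir
```

With the objects from the question this would be called as `allow_headless_downloads(wd, dwnld_path)` right after the driver is created.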

Best answer

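Try capturing the URL of the PDF file and downloading it with an HTTP library such as requests (the snippet below uses urllib3); I think that will work.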

Like this:

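import io

import certifi
import urllib3
from PyPDF2 import PdfReader  # PyPDF2 3.0 removed PdfFileReader / getNumPages()

# Verify TLS certificates against certifi's CA bundle
http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED', ca_certs=certifi.where())
pdf_url = "http://XXXXXX.pdf"  # URL of the PDF that opens in the new tab
r3 = http.request('GET', pdf_url)
# Parse the downloaded bytes in memory
with io.BytesIO(r3.data) as open_pdf_file:
    read_pdf = PdfReader(open_pdf_file)
    num_pages = len(read_pdf.pages)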

Then we need the second part of the code, where you save the PDF in a similar way.
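The answer does not show the saving step. A minimal sketch of what it might look like, using only the standard library: the helper names `cookie_header`, `fetch_pdf` and `save_pdf` are hypothetical, and the sketch assumes the PDF is reachable with a plain GET and that any login state lives in cookies, which can be copied from Selenium's `get_cookies()`.

```python
import os
import urllib.request


def cookie_header(selenium_cookies):
    """Build a Cookie header value from the list returned by driver.get_cookies()."""
    return "; ".join(f"{c['name']}={c['value']}" for c in selenium_cookies)


def fetch_pdf(pdf_url, selenium_cookies=()):
    """Download pdf_url with a plain GET, reusing the browser session's cookies."""
    request = urllib.request.Request(pdf_url)
    if selenium_cookies:
        request.add_header("Cookie", cookie_header(selenium_cookies))
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def save_pdf(data, dest_path):
    """Write the downloaded bytes to dest_path after a basic sanity check."""
    # A PDF normally starts with the marker "%PDF-"; anything else is
    # probably an HTML login or error page rather than the document.
    if not data.startswith(b"%PDF-"):
        raise ValueError("response does not look like a PDF")
    os.makedirs(os.path.dirname(os.path.abspath(dest_path)), exist_ok=True)
    with open(dest_path, "wb") as fh:
        fh.write(data)
    return dest_path
```

With a live session this could be called as `save_pdf(fetch_pdf(pdf_url, wd.get_cookies()), os.path.join(dwnld_path, "file.pdf"))`, where `pdf_url` is the address of the PDF, for example `wd.current_url` after switching to the new window handle. The file name `file.pdf` is a placeholder.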

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/56381784/
