python - BeautifulSoup scrape img

I am trying to scrape a website for its captcha image link.

The image is there when I inspect the element in the browser, but it does not show up when I scrape the page.

My goal is to get the img.

Below is the code I have tried.

import requests
from bs4 import BeautifulSoup

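with requests.Session() as s:
    url = "https://myurl.com/"
    r = s.get(url)
    # Parse the HTML exactly as the server returned it (no JavaScript is run)
    soup = BeautifulSoup(r.content, "html.parser")
    # find_all is the current name for the older findAll alias
    for item in soup.find_all("img"):
        print(item)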

Best answer

If you go to the Network tab, you will see the following link, which returns the captcha image in JSON format. You don't need Selenium for this.

https://example.com/site/captcha/refresh/1/?_=1574163338269
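The `_=1574163338269` query parameter looks like a millisecond Unix timestamp of the kind browsers' JavaScript libraries append to avoid cached responses. That is an assumption, not something confirmed for this site. If it holds, a fresh value could be generated for each request instead of reusing the one copied from the browser. A minimal sketch, where `build_refresh_url` is a hypothetical helper name:

```python
import time
from urllib.parse import urlencode


def build_refresh_url(base_url, now_ms=None):
    """Append a millisecond timestamp as the `_` query parameter.

    Assumption: the endpoint treats `_` as a cache-busting value only.
    """
    if now_ms is None:
        # Current time in milliseconds since the Unix epoch
        now_ms = int(time.time() * 1000)
    return f"{base_url}?{urlencode({'_': now_ms})}"


# Reproduces the link from the Network tab when given the same timestamp
print(build_refresh_url("https://example.com/site/captcha/refresh/1/", 1574163338269))
```

Passing `now_ms` explicitly keeps the helper deterministic for testing; leaving it out uses the current time.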

You need to parse the response as JSON and then read the value of the url key.

import requests

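with requests.Session() as s:
    url = "https://example.com/site/captcha/refresh/1/?_=1574163338269"
    # verify=False skips TLS certificate verification for this request
    r = s.get(url, verify=False)
    # Decode the JSON body and print the captcha image link
    img = r.json()
    print(img['url'])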
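The `url` value may be a path relative to the site rather than an absolute link; the actual response is not shown, so this is an assumption. A rough sketch of resolving it against the refresh endpoint and saving the image with the same session, so that any cookies tied to the captcha are kept. The names `resolve_captcha_url` and `save_captcha` and the `captcha.png` filename are hypothetical.

```python
from pathlib import Path
from urllib.parse import urljoin


def resolve_captcha_url(refresh_url, payload):
    """Turn the 'url' value from the JSON payload into an absolute URL."""
    # urljoin leaves absolute URLs unchanged and resolves relative paths
    return urljoin(refresh_url, payload["url"])


def save_captcha(session, refresh_url, out_path="captcha.png"):
    """Fetch the JSON, download the image it points to, return its URL."""
    r = session.get(refresh_url)
    r.raise_for_status()
    img_url = resolve_captcha_url(refresh_url, r.json())
    img = session.get(img_url)
    img.raise_for_status()
    # Assumption: the image is a PNG; adjust the extension if it is not
    Path(out_path).write_bytes(img.content)
    return img_url


# Usage (needs the requests package and network access):
# import requests
# with requests.Session() as s:
#     print(save_captcha(s, "https://example.com/site/captcha/refresh/1/?_=1574163338269"))
```

Taking the session as an argument keeps the helpers independent of how the session is configured, for example whether `verify=False` is set on it.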

Network tab

screenshot[1]

Source: the Stack Overflow question "python - BeautifulSoup scrape img": https://stackoverflow.com/questions/58931795/
