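I am trying to download a captcha image, solve it manually, and then submit it in a POST together with the username and password. The response text is just the original login page, so I assume my code is failing. The page I am logging into is on the dark web, but I don't know whether that is actually relevant. The only explanation I can think of is that submitting the POST generates a new captcha. Hopefully someone with a better understanding of HTTP can help me.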
from bs4 import BeautifulSoup
import base64
import requests
session = requests.Session()
session.proxies = {'http': 'socks5h://127.0.0.1:9150', 'https': 'socks5h://127.0.0.1:9150'}
url = session.get("http://waeixxcraed4gw7q.onion/signin")
soup = BeautifulSoup(url.text, "lxml")
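imgs = soup.find_all('img')
# save the captcha from its base64 encoding
# the slice drops the 23-character "data:image/jpeg;base64," prefix of the src
img_data = bytes(imgs[1]['src'][23:], encoding='utf-8')
with open("olympus_captcha.jpg", "wb") as fh:
    # base64.decodestring was removed in Python 3.9; b64decode works on all Python 3 versions
    fh.write(base64.b64decode(img_data))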
#solve the captcha that has been saved to the harddrive
captcha = input("enter captcha:\n")
#attempt login (password and username removed)
payload = {"username":username, "password":password, "captcha":captcha}
response = session.post("http://waeixxcraed4gw7q.onion/signin", data = payload)
print(response.text)
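The fixed `[23:]` slice in the code above only works when the `src` starts with exactly `data:image/jpeg;base64,`. As a minimal sketch of a less brittle way to decode the image, the snippet below splits the data URI at the first comma instead. `decode_data_uri` and `sample_src` are hypothetical names introduced for illustration, and the sample value merely stands in for `imgs[1]['src']`.

```python
import base64


def decode_data_uri(src: str) -> bytes:
    """Return the raw bytes of a base64 data URI, e.g. an inline <img> src."""
    # A base64 data URI looks like "data:<mime type>;base64,<payload>".
    header, sep, encoded = src.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    return base64.b64decode(encoded)


# Hypothetical sample value standing in for imgs[1]['src'].
sample_src = "data:image/jpeg;base64," + base64.b64encode(b"fake-jpeg-bytes").decode("ascii")
captcha_bytes = decode_data_uri(sample_src)
```

The returned bytes can be written to disk with `open("olympus_captcha.jpg", "wb")` exactly as in the question.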
Best answer
I did it the following way. First, install tesseract on your system.
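import requests
from bs4 import BeautifulSoup as bs

session = requests.session()
url = "Login page link"
r = session.get(url)
soup = bs(r.text, 'html.parser')
img = soup.find('img', id='captcha')
img_SRC = img['src']
with open('captcha.jpg', 'wb') as handle:
    # fetch the image through the same session so the captcha matches the session's cookies
    response = session.get(img_SRC, stream=True)
    if not response.ok:
        print("captcha download failed:", response.status_code)
    # write the image to disk in 1 KiB chunks
    for block in response.iter_content(1024):
        if not block:
            break
        handle.write(block)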
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract

# open the captcha file saved above
pic = Image.open('captcha.jpg')
pytesseract.pytesseract.tesseract_cmd = r'Enter the installation path of the \tesseract.exe'
# run OCR on the image to get the captcha text
captchaText = pytesseract.image_to_string(pic)
The text extracted from the captcha can then be used to log in.
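One common reason a login POST just returns the login page again is that the form carries hidden fields (for example a CSRF token) that the payload leaves out. Whether this particular site does so is an assumption, not something the question confirms. The sketch below collects hidden inputs from the login page HTML and merges them with the credentials and the captcha text. `HiddenInputCollector`, `build_login_payload`, the `csrf_token` field, and the sample HTML are all hypothetical. The standard-library `html.parser` is used here only to keep the sketch dependency-free; BeautifulSoup would work equally well.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_payload(page_html, username, password, captcha_text):
    """Merge hidden form fields with the values the user supplies."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    payload = dict(collector.fields)
    # Field names follow the question's payload; a real form may use others.
    payload.update({"username": username, "password": password, "captcha": captcha_text})
    return payload


# Hypothetical login form, used only to exercise the helper.
sample_html = """
<form method="post" action="/signin">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="text" name="captcha">
</form>
"""
payload = build_login_payload(sample_html, "alice", "secret", "X7K2")
```

In a real run, `page_html` would be the text of the same `session.get(url)` response the captcha image came from, and the resulting dictionary would be passed to `session.post(url, data=payload)` on that same session without requesting the login page again in between, since a fresh GET may well issue a new captcha.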
Regarding python-3.x - logging in with a captcha using Python requests, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51049591/