I added a User-Agent to the request headers. Here is my code and the error:
from urllib.request import Request, urlopen
import json
from bs4 import BeautifulSoup
import time
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1)'}
domain=Request("http://online-courses.club/baugasm-series-8-design-abstract-textures-and-poster-with-acrylic-paint-photoshop-and-cinema-4d/",data=bytes(json.dumps(headers), encoding="utf-8"))
response =urlopen(domain)
I also tried a different version; note the change in the domain variable:
from urllib.request import Request, urlopen
import json
from bs4 import BeautifulSoup
import time
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1)'}
domain=Request("http://online-courses.club/baugasm-series-8-design-abstract-textures-and-poster-with-acrylic-paint-photoshop-and-cinema-4d/",headers)
response =urlopen(domain)
Neither version works. The error:
line 9, in <module>
response =urlopen(domain)
File "C:\Users\ABC\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\ABC\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\ABC\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\ABC\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\ABC\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\ABC\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
Best answer
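Use .add_header() to set a proper User-Agent. In both attempts above, the dictionary ends up in Request's data parameter (the request body) rather than in its headers, so the User-Agent is never actually sent as a header.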
For example:
from urllib.request import Request, urlopen

# Build the request first, then attach the User-Agent header to it.
domain = Request("http://online-courses.club/baugasm-series-8-design-abstract-textures-and-poster-with-acrylic-paint-photoshop-and-cinema-4d/")
domain.add_header('User-Agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0')
response = urlopen(domain)
# read() returns the response body as bytes.
print(response.read())
Prints:
b'<!DOCTYPE html>\r\n<html lang="en-US" prefix="og: http://ogp.me/ns#">\r\n<head itemscope="itemscope" itemtype="http://schema.org/WebSite">\r\n\t<meta charset="UTF-8" />
... and so on.
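As a rough check of why the question's second attempt sends no User-Agent header, the sketch below builds two Request objects and inspects them without touching the network. The shortened URL and the describe() helper are illustrative assumptions, not part of the original code. It also shows that passing the dictionary with the headers= keyword should be equivalent to calling .add_header().

```python
from urllib.request import Request

# Shortened URL for illustration only; no request is actually sent here.
URL = "http://online-courses.club/"
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1)"}

# The second positional parameter of Request is `data`, not `headers`,
# so the dictionary becomes the request body and the method turns into POST.
wrong = Request(URL, HEADERS)

# Passing the dictionary by keyword registers it as real HTTP headers.
right = Request(URL, headers=HEADERS)


def describe(req):
    """Hypothetical helper: return (HTTP method, User-Agent or None)."""
    # Request stores header names capitalized, hence "User-agent".
    return req.get_method(), req.get_header("User-agent")


print(describe(wrong))  # ('POST', None)
print(describe(right))  # ('GET', 'Mozilla/5.0 (Windows NT 6.1)')
```

Whether the server accepts a particular User-Agent string is up to the site; the answer above used a full Firefox string, which may matter.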
Regarding python-3.x - how to fix an HTTP error in Python 3 when using urlopen with urllib, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62733630/