This is the code I'm using to fetch Nike clothing data.
import urllib.request
#Base url for website
url = 'http://store.nike.com/us/en_us/pw/mens-clothing/1mdZ7pu?ipp=120'
# A lot of sites don't like the user agents of Python 3, so I specify one here
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
html = urllib.request.urlopen(req).read()
The error looks like this:
urllib.error.HTTPError: HTTP Error 403: Forbidden
How can I open and parse this HTML page?
Best answer
Alternatively, try Selenium WebDriver.
from selenium import webdriver
from bs4 import BeautifulSoup as bs
browser = webdriver.Firefox()
url = 'http://store.nike.com/us/en_us/pw/mens-clothing/1mdZ7pu?ipp=120'
browser.get(url)
source = browser.page_source
soup = bs(source, "html.parser")
print(soup)
This worked for me, although I'm just a beginner :)
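If you would rather stay with urllib, here is a minimal sketch of sending a fuller set of browser-like headers and catching the HTTPError instead of letting it crash the script. The header values and the helper names build_request and fetch_html are my own assumptions, not something the site documents, and there is no guarantee this gets past the 403 on this particular store; sites that check more than headers will still need a real browser such as Selenium.

```python
import urllib.error
import urllib.request

# Hypothetical header set imitating a desktop browser.
# Which headers a given site actually checks is an assumption.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_request(url):
    """Build a Request object that carries the browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch_html(url, timeout=10):
    """Return the page body as bytes, or None if the server replies with an HTTP error."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # A 403 ends up here; report the status and let the caller decide what to do.
        print(f"Request failed: HTTP {err.code} {err.reason}")
        return None
```

Calling fetch_html(url) with the URL from the question returns the raw HTML bytes on success, which can be passed to BeautifulSoup the same way as browser.page_source above. A None result means the server still refused the request.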
About python - I keep getting this error when using Python Beautiful Soup on a website: urllib.error.HTTPError: HTTP Error 403: Forbidden. A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44705023/