I am trying to download a PDF file with the following Python function. I can open the URL (which redirects to another URL) in a browser, but the code gets a 404 error.
import requests

def downloadFile(url, fileName):
    r = requests.get(url, allow_redirects=True, stream=True)
    with open(fileName, "wb") as pdf:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                pdf.write(chunk)

downloadFile("http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-552-storage-guide.pdf", "vsphere-esxi-vcenter-server-552-storage-guide.pdf")
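Best answer
A few websites block requests based on language or location. The following code, which sends an additional Accept-Language header, works: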
import requests

def downloadFile(url, fileName):
    # Send an Accept-Language header, as a browser would
    headers = {'Accept-Language': 'en-US,en;q=0.9,te;q=0.8'}
    r = requests.get(url, allow_redirects=True, stream=True, headers=headers)
    with open(fileName, "wb") as pdf:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                pdf.write(chunk)

downloadFile("http://pubs.vmware.com/vsphere-55/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-552-storage-guide.pdf", "vsphere-esxi-vcenter-server-552-storage-guide.pdf")
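Neither version of the function checks the HTTP status, so a 404 response body would be written to disk as if it were the PDF. Below is a minimal sketch of a more defensive variant. The name download_file, the DEFAULT_HEADERS constant, and the session, headers, and timeout parameters are assumptions added for illustration and are not part of the original answer. Whether the Accept-Language header is enough for a given site is also not guaranteed.

```python
import requests

# Assumed default; the answer above used a longer Accept-Language value.
DEFAULT_HEADERS = {"Accept-Language": "en-US,en;q=0.9"}


def download_file(url, file_name, session=None, headers=None, timeout=30):
    """Stream `url` into `file_name` and return the final URL after redirects.

    Raises requests.HTTPError on a 4xx/5xx response, before the file is
    opened, so an error page is never saved as the PDF.
    """
    http = session or requests  # a requests.Session can be passed for reuse
    merged = {**DEFAULT_HEADERS, **(headers or {})}
    with http.get(url, allow_redirects=True, stream=True,
                  headers=merged, timeout=timeout) as r:
        r.raise_for_status()  # fail loudly on 404 and other HTTP errors
        with open(file_name, "wb") as pdf:
            for chunk in r.iter_content(chunk_size=8192):
                if chunk:  # skip empty keep-alive chunks
                    pdf.write(chunk)
        return r.url
```

Calling download_file(url, "guide.pdf") inside a try/except for requests.HTTPError makes the 404 visible to the caller, and the returned URL shows where the redirect chain ended.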
Regarding "python - Error downloading a live PDF file from a URL in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49042628/