python - Timeout error when downloading .html files from URLs

Tags: python html python-2.7 urllib2 urllib

I get the following error when downloading HTML pages from a list of URLs.

Error: raise URLError(err) urllib2.URLError: <urlopen error [Errno
 10060] A connection attempt failed because the connected party did not
 properly respond after a period of time or established connection
 failed because connected host has failed to respond>

Code:

import os
import urllib2

hdr = {'User-Agent': 'Mozilla/5.0'}

# urls, index and path_current are defined elsewhere in the script
for i, site in enumerate(urls[index]):
    print(site)
    req = urllib2.Request(site, headers=hdr)
    page = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req)
    page_content = page.read()
    # Binary mode keeps the downloaded bytes unchanged on Windows
    with open(os.path.join(path_current, str(i) + '.html'), 'wb') as fid:
        fid.write(page_content)

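I think this may be caused by some proxy setting, or it may need a different timeout, but I am not sure. Please help. I checked the URLs manually and they seem to open fine in a browser.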

Best answer

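Well, since you do not hit this error most of the time, I would guess that your network is slow. Errno 10060 is the Windows socket timeout error (WSAETIMEDOUT), which fits that explanation. Try setting a timeout, in seconds, like this: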

req = urllib2.Request(site, headers=hdr)
timeout_in_sec = 360
page = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(req, timeout=timeout_in_sec)
page_content = page.read()
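
If a longer timeout alone does not help, retrying a failed request a few times may. The following is only a minimal sketch of that idea, not part of the original answer: the function name `fetch` and its parameters (`retries`, `delay`, `sleep`) are hypothetical, and the `try`/`except` import is an assumption added so the same code also runs on Python 3, where `urllib2` was split into `urllib.request` and `urllib.error`.

```python
import socket
import time

try:  # Python 2.7, as used in the question
    import urllib2 as urlrequest
    from urllib2 import URLError
except ImportError:  # Python 3 equivalent modules
    import urllib.request as urlrequest
    from urllib.error import URLError


def fetch(url, opener=None, timeout=30, retries=3, delay=2.0, sleep=time.sleep):
    """Return the body of url, retrying on URLError or socket timeouts.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if retries < 1:
        raise ValueError('retries must be at least 1')
    if opener is None:
        opener = urlrequest.build_opener(urlrequest.HTTPCookieProcessor())
    req = urlrequest.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    last_error = None
    for attempt in range(retries):
        try:
            page = opener.open(req, timeout=timeout)
            try:
                return page.read()
            finally:
                page.close()
        except (URLError, socket.timeout) as err:
            last_error = err
            if attempt < retries - 1:
                # Wait a little longer before each new attempt
                sleep(delay * (attempt + 1))
    # Every attempt failed: re-raise the last error seen
    raise last_error
```

Inside the loop from the question this could be used as `page_content = fetch(site, timeout=60)`. The linearly growing delay is an arbitrary choice; if the failures come from a proxy or firewall rather than a slow network, retrying will not fix them.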

Source: this question, "python - Timeout error when downloading .html files from URLs", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/30373301/
