python - Requests with multiple connections

Tags: python networking download python-requests

I use the Python Requests library to download a big file, for example:

r = requests.get("http://bigfile.com/bigfile.bin")
content = r.content

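The big file downloads at roughly 30 Kb per second, which is a bit slow. Every connection to the big file server is throttled, so I would like to make multiple connections.

Is there a way to make multiple connections at the same time to download one file?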

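Best answer

You can use the HTTP Range header to fetch just part of a file (this has already been covered for Python elsewhere).

Just start several threads and fetch a different range with each one ;)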
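To make the Range header concrete, here is a minimal sketch of a single ranged request using only the standard library. The helper names `range_header` and `fetch_range` are made up for illustration, and the sketch assumes the server answers a ranged request with 206 Partial Content.

```python
import urllib.request


def range_header(start, length):
    """Return the Range header value for `length` bytes beginning at `start`.

    HTTP byte ranges are inclusive on both ends, so the last byte
    requested is start + length - 1.
    """
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    return "bytes=%d-%d" % (start, start + length - 1)


def fetch_range(url, start, length, timeout=30):
    """Fetch one slice of `url` (hypothetical helper, not a library function)."""
    req = urllib.request.Request(url, headers={"Range": range_header(start, length)})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # 206 means the server honoured the range; 200 means it sent the whole file.
        if resp.status != 206:
            raise RuntimeError("server ignored the Range header (status %d)" % resp.status)
        return resp.read()
```

Something like `fetch_range("http://bigfile.com/bigfile.bin", 0, 1024)` would then return the first 1024 bytes, provided the server cooperates.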

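import threading
import urllib.request  # Python 3 replacement for urllib2

url = "http://bigfile.com/bigfile.bin"
chunk_size = 1024 * 1024  # bytes per range request; tune for your server
# This sketch fetches only the first num_chunks * chunk_size bytes and assumes
# the file is at least that long; derive num_chunks from the real file size.
num_chunks = 10

parts = {}  # maps start offset -> bytes of that chunk

def download(url, start):
    # HTTP ranges are inclusive at both ends, hence the -1
    req = urllib.request.Request(url)
    req.add_header('Range', 'bytes=%d-%d' % (start, start + chunk_size - 1))
    with urllib.request.urlopen(req) as f:
        parts[start] = f.read()

threads = []

# Initialize threads
for i in range(num_chunks):
    t = threading.Thread(target=download, args=(url, i * chunk_size))
    t.start()
    threads.append(t)

# Join threads back (order doesn't matter, you just want them all)
for t in threads:
    t.join()

# Sort parts by offset and you're done
result = b''.join(parts[i] for i in sorted(parts))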
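The code above issues a fixed number of range requests, so it only fits a file of a known size. A possible variation, sketched here under the assumption that the server reports the file size in `Content-Length` on a HEAD request, splits the whole file into ranges and fetches them with a thread pool. The names `split_ranges`, `fetch` and `parallel_download` are hypothetical.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte ranges that cover total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    return [(start, min(start + chunk_size, total_size) - 1)
            for start in range(0, total_size, chunk_size)]


def fetch(url, start, end):
    """Download bytes start..end (inclusive) of url."""
    req = urllib.request.Request(url, headers={"Range": "bytes=%d-%d" % (start, end)})
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def parallel_download(url, chunk_size=1024 * 1024, workers=4):
    # Assumption: a HEAD request returns the full size in Content-Length.
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as resp:
        total_size = int(resp.headers["Content-Length"])

    ranges = split_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so no sorting is needed.
        chunks = pool.map(lambda r: fetch(url, r[0], r[1]), ranges)
        return b"".join(chunks)
```

Capping `workers` keeps the number of simultaneous connections small, which matters if the server also limits connections per client.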

Also note that not every server supports the Range header (in particular, servers where PHP scripts are responsible for fetching the data often do not handle it).
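One way to find out before starting any threads is to ask for a single byte and look at the reply. The following is a rough sketch, with hypothetical helper names, that treats a 206 status together with a `Content-Range` header as evidence that ranges are honoured; a server that ignores the header typically replies 200 with the whole file.

```python
import urllib.request


def is_partial_response(status, headers):
    """True if a reply looks like a valid answer to a byte-range request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    content_range = lowered.get("content-range", "")
    return status == 206 and content_range.startswith("bytes ")


def probe_range_support(url, timeout=30):
    """Request only the first byte and check whether the server honours it."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # The body is not read; only the status line and headers matter here.
        return is_partial_response(resp.status, dict(resp.headers.items()))
```

If the probe returns False, falling back to a plain single-connection download is probably the safest choice.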

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/13973188/
