python - Download a web page and all of its resource files in Python

Tags: python urllib2 wget

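I want to be able to download a page and all of its associated resources (images, style sheets, script files, etc.) using Python. I am (somewhat) familiar with urllib2 and know how to download individual URLs, but before I start hacking something together with BeautifulSoup + urllib2, I want to be sure there isn't already a Python equivalent of "wget --page-requisites http://www.google.com".

Specifically, I am interested in gathering statistics about how long it takes to download an entire web page, including all of its resources.

Thanks, Mark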

Best answer

websucker? See http://effbot.org/zone/websucker.htm
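If no ready-made tool fits, the timing part of the question can be approximated with the standard library alone. The sketch below is not websucker and not a full replacement for wget --page-requisites: it assumes Python 3 (where urllib2 became urllib.request), only looks at img, script, and stylesheet link tags, ignores resources referenced from CSS or loaded by JavaScript, and downloads sequentially, so the total will generally be higher than what a browser fetching in parallel would see. The names RequisiteParser, fetch, and timed_page_download are made up for this illustration.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class RequisiteParser(HTMLParser):
    """Collect absolute URLs of images, scripts, and stylesheets on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        if url:
            # Resolve relative references against the page URL, skip duplicates.
            absolute = urljoin(self.base_url, url)
            if absolute not in self.resources:
                self.resources.append(absolute)


def fetch(url):
    """Download one URL and return its body as bytes."""
    with urlopen(url) as response:
        return response.read()


def timed_page_download(page_url, fetch=fetch):
    """Fetch a page and its requisites one by one.

    Returns (timings, total), where timings maps each URL to the seconds
    spent fetching it and total is the overall wall-clock time.
    The fetch parameter is injectable so the logic can be tried offline.
    """
    timings = {}
    start = time.perf_counter()
    html = fetch(page_url)
    timings[page_url] = time.perf_counter() - start

    parser = RequisiteParser(page_url)
    parser.feed(html.decode("utf-8", errors="replace"))

    for resource_url in parser.resources:
        t0 = time.perf_counter()
        fetch(resource_url)
        timings[resource_url] = time.perf_counter() - t0

    return timings, time.perf_counter() - start
```

Calling timed_page_download("http://www.google.com") would then return the per-URL timings and the overall elapsed time. Error handling (timeouts, 404s, non-UTF-8 pages) is left out to keep the sketch short.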

This entry, "python - Download a web page and all of its resource files in Python", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/844115/
