python - Scrapy and proxies

How do you use proxy support with the Python web-scraping framework Scrapy?

Best answer

Does Scrapy work with HTTP proxies?

Yes. Support for HTTP proxies is provided (since Scrapy 0.8) through the HTTP Proxy downloader middleware. See HttpProxyMiddleware.

The simplest way to use a proxy is to set the environment variable http_proxy. How to do this depends on your shell.

C:\>set http_proxy=http://proxy:port
csh% setenv http_proxy http://proxy:port
sh$ export http_proxy=http://proxy:port

If you want to use a proxy for https pages, set the environment variable https_proxy in the same way:

C:\>set https_proxy=https://proxy:port
csh% setenv https_proxy https://proxy:port
sh$ export https_proxy=https://proxy:port
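To check what Python actually sees after setting these variables, the standard library's `urllib.request.getproxies()` can be used; to my knowledge this is also the function Scrapy's HttpProxyMiddleware relies on to read the environment. The sketch below sets the variables from within Python purely for demonstration, and the proxy address is a hypothetical placeholder.

```python
import os
from urllib.request import getproxies

# Hypothetical proxy address; replace with a real host and port.
os.environ["http_proxy"] = "http://proxy.example.com:8080"
os.environ["https_proxy"] = "http://proxy.example.com:8080"

# getproxies() returns a dict mapping URL scheme to proxy URL,
# built from the *_proxy environment variables when they are set.
proxies = getproxies()
print("http proxy: ", proxies.get("http"))
print("https proxy:", proxies.get("https"))
```

If the variables are set in the shell instead, the two `os.environ` lines can be dropped and the script should report the same values when started from that shell.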

Regarding python - Scrapy and proxies, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/4710483/
