python-2.7 - python.failure.Failure OpenSSL.SSL.Error in Scrapy (version 1.0.4)

I am working on a data scraping project, and my code uses Scrapy (version 1.0.4) and Selenium (version 2.47.1).

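import time

from scrapy import Spider
from scrapy.selector import Selector
from scrapy.http import Request
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException


# Plain Spider: CrawlSpider reserves parse() for its own rule handling.
class TradesySpider(Spider):
    name = 'tradesy'
    start_urls = ['My Start url',]

    def __init__(self, *args, **kwargs):
        # Let Scrapy finish its own spider setup before starting the browser.
        super(TradesySpider, self).__init__(*args, **kwargs)
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get(response.url)
        while True:
            # Re-read the page from the browser so each pass sees the page
            # loaded by the last click, not the original response.
            page = Selector(text=self.driver.page_source)
            tradesy_urls = page.xpath('//div[@id="right-panel"]')
            data_urls = tradesy_urls.xpath('div[@class="item streamline"]/a/@href').extract()
            for link in data_urls:
                url = 'My base url' + link
                yield Request(url=url, callback=self.parse_data)
                time.sleep(10)
            try:
                data_path = self.driver.find_element_by_xpath('//*[@id="page-next"]')
            except NoSuchElementException:
                # No "next" button: this was the last page.
                break
            data_path.click()
            # Give the next page time to load.
            time.sleep(10)

    def parse_data(self, response):
        'Scrapy Operations...'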

When I run my code, I get the expected output for some URLs, but for others I get the following error.

2016-01-19 15:45:17 [scrapy] DEBUG: Retrying <GET MY_URL> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'SSL3_READ_BYTES', 'ssl handshake failure')]>]

Please suggest a solution for this problem.

Best answer

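According to this reported issue, you can create your own ContextFactory to handle SSL. The factory below switches the method to SSLv23_METHOD so that the client and server can negotiate the protocol version.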

context.py:

from OpenSSL import SSL
from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory


class CustomContextFactory(ScrapyClientContextFactory):
    """
    Custom context factory that allows SSL negotiation.
    """

    def __init__(self):
        # Use SSLv23_METHOD so we can use protocol negotiation
        self.method = SSL.SSLv23_METHOD

settings.py:

DOWNLOADER_CLIENTCONTEXTFACTORY = 'yourproject.context.CustomContextFactory'
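
To check outside Scrapy which protocol version a failing host actually agrees to, a small probe can help. The following is only a sketch under stated assumptions: it uses Python 3's standard `ssl` module rather than the Python 2.7 / pyOpenSSL stack from the question, and the helper names `build_negotiating_context` and `negotiated_version` are hypothetical, not part of Scrapy or pyOpenSSL. `ssl.PROTOCOL_TLS_CLIENT` is similar in spirit to `SSL.SSLv23_METHOD`: it does not pin a single version but lets both sides negotiate.

```python
import socket
import ssl


def build_negotiating_context():
    """Return a client context that negotiates the protocol version.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    # PROTOCOL_TLS_CLIENT does not pin one protocol version; client and
    # server settle on the highest version that both of them support.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Trust the system's default CA certificates for verification.
    ctx.load_default_certs()
    return ctx


def negotiated_version(host, port=443, timeout=10.0):
    """Connect to host and return the agreed protocol name as a string.

    Hypothetical helper; raises ssl.SSLError if the handshake fails.
    """
    ctx = build_negotiating_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_version('example.com')` with the host taken from one of the failing URLs returns the protocol name the server settled on. If that host only accepts versions newer than the one the downloader is pinned to, a pinned method would be a plausible cause of the handshake failure in the log above.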

Regarding python-2.7 - python.failure.Failure OpenSSL.SSL.Error in Scrapy (version 1.0.4), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34875175/
