Relevant code
def start_requests(self):
    requests = [
        Request(url['url'], meta=url['meta'], callback=self.parse, errback=self.handle_error)
        for url in self.start_urls
        if valid_url(url['url'])
    ]
    return requests

def handle_error(self, err):
    # Errors being saved in DB
    # So I don't want them displayed in the logs
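I have my own code that saves error codes to a database, so I don't want those errors to show up in the log output. How can I suppress them?
Note that I don't want to hide all errors, only the ones handled here.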
Best answer
Try using self.skipped.add, self.failed.add, and an isinstance condition in the handle_error method.
from scrapy.spidermiddlewares.httperror import HttpError

def on_error(self, failure):
    if isinstance(failure.value, HttpError):
        response = failure.value.response
        if response.status in self.bypass_status_codes:
            # Status code we chose to tolerate: record it and parse the response anyway
            self.skipped.add(response.url[-3:])
            return self.parse(response)

    # it assumes there is a response attached to failure
    self.failed.add(failure.value.response.url[-3:])
    return failure
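
To show just the return-value pattern the answer relies on, here is a minimal sketch that runs without Scrapy. FakeHttpError, FakeFailure, ErrorRecorder, and handled_statuses are hypothetical stand-ins I made up for illustration; they are not Scrapy or Twisted classes. My understanding is that Scrapy only logs a download error when the errback passes the failure on (returns or raises it), and that an errback returning None is treated as having handled it, but please verify that against your Scrapy version.

```python
class FakeHttpError(Exception):
    """Hypothetical stand-in for scrapy's HttpError (not the real class)."""

    def __init__(self, status, url):
        super().__init__(f"HTTP {status} for {url}")
        self.status = status
        self.url = url


class FakeFailure:
    """Hypothetical stand-in for a Twisted Failure: the exception lives in .value."""

    def __init__(self, value):
        self.value = value


class ErrorRecorder:
    """Hypothetical spider-like object; the `saved` list stands in for the database."""

    # Assumed set of status codes that count as "handled here"
    handled_statuses = {404, 410}

    def __init__(self):
        self.saved = []

    def handle_error(self, failure):
        exc = failure.value
        if isinstance(exc, FakeHttpError) and exc.status in self.handled_statuses:
            # Handled case: record the error and return None so nothing propagates
            self.saved.append((exc.status, exc.url))
            return None
        # Anything else is not ours to swallow: pass the failure along unchanged
        return failure
```

With this shape, only the failures matched by the isinstance check are swallowed; everything else is returned untouched, which fits the requirement of hiding only the errors handled here.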
Regarding python - Scrapy suppressing handled errors, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36682569/