python - How to repeat a duplicate request in Scrapy

Tags: python web-scraping scrapy web-crawler

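When I send a request to the API I am scraping, it sometimes fails to load properly and returns -1 instead of a price.

So I added a while loop that repeats the request whenever it receives -1, but because the request is a duplicate, the spider stops after the first request.

So my question is: how can I change this so that the repeated requests are processed?

Sample code: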

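     is_checked = False
     while not is_checked:
         # With the inline requests library, yielding a Request evaluates to its response
         response = yield scrapy.Request("https://api.bookscouter.com/v3/prices/sell/" + isbn + ".json")
         jsonresponse = loads(response.body)
         sellPrice = jsonresponse['data']['Prices'][0]['Price']
         # -1 means the API did not load properly, so loop and request again
         if sellPrice != -1:
             is_checked = True
             yield {'SellPrice': sellPrice}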

Note that I am using the inline requests library, but that is not relevant to the solution.

Best answer

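To force a duplicate request to be scheduled, set dont_filter=True in the Request constructor. By default, Scrapy's duplicate filter silently drops any request it has already seen, which is why the spider stopped after the first request. In the example above, change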

response = yield scrapy.Request("https://api.bookscouter.com/v3/prices/sell/"+isbn+".json")

to

response = yield scrapy.Request("https://api.bookscouter.com/v3/prices/sell/"+isbn+".json", dont_filter=True)

The original question, "How to repeat a duplicate request in Scrapy", can be found on Stack Overflow: https://stackoverflow.com/questions/46525093/
