python - TypeError: 'Request' object is not subscriptable

Tags: python scrapy scrapy-spider

I get TypeError: 'Request' object is not subscriptable when I try to access data passed back from a secondary web request:

import scrapy

class MyItem(scrapy.Item):
    main_url = scrapy.Field()
    addr_name = scrapy.Field()
    addr = scrapy.Field()
    addr_city = scrapy.Field()

class ServiceCanadaSpider(scrapy.Spider):
    name = 'servicecan'
    start_urls = ['http://www.servicecanada.gc.ca/tbsc-fsco/sc-lst.jsp?prov=AB&lang=eng']

    def parse(self, response):
        with open('test', 'w') as f:
            for title in response.xpath('//li/ul/li/a'):
                f.write(title.xpath('text()').extract_first())
                #get url for info page
                url='http://www.servicecanada.gc.ca' + title.xpath('@href').extract_first()
                #parse info page
                item = MyItem()
                request = scrapy.Request(url, callback=self.parse_info_page)
                request.meta['item'] = item

                f.write(',' + url)
                yield request
                f.write(',' + request['addr_name'])
                #f.write(',' + request.addr)
                #f.write(',' + request.addr_city)
                f.write('\n')

    def parse_info_page(self, response):
        item = response.meta['item']
        item['main_url'] = response.url
        if len(response.xpath('//td[@id="offInfo"]/text()')) == 3:
            item['addr_name']='';
            item['addr'] = response.xpath('//td[@id="offInfo"]/text()').extract()[0].replace('\n','')
            item['addr_city'] = response.xpath('//td[@id="offInfo"]/text()').extract()[1].replace('\n','')
        else:
            item['addr_name']=response.xpath('//td[@id="offInfo"]/text()').extract()[0].replace('\n','')
            item['addr'] = response.xpath('//td[@id="offInfo"]/text()').extract()[1].replace('\n','')
            item['addr_city'] = response.xpath('//td[@id="offInfo"]/text()').extract()[2].replace('\n','')
        return [item]

When I yield the request, I can see the data in its MyItem instance:

{'addr': ' 802 Bow Valley Trail',
 'addr_city': ' Canmore, Alberta',
 'addr_name': ' Canmore Gateway Shops - Building C, Suite 113',
 'main_url': 'http://www.servicecanada.gc.ca/tbsc-fsco/sc-dsp.jsp?rc=4865&lang=eng'}

Best answer

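The Request class indeed does not support subscripting, i.e. the [] operator. If you want to access the fields of the object you attached to the Request instance through its meta attribute, you have to do so explicitly: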

request = scrapy.Request(url, callback=self.parse_info_page)
request.meta['item'] = item

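# Item fields are read with [], not as attributes. The field is only set once
# parse_info_page has run, so reading it before then raises KeyError.
f.write(',' + request.meta['item']['addr_name'])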

Regarding python - TypeError: 'Request' object is not subscriptable, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48732571/
