python - scrapy exceptions.TypeError: 'int' object has no attribute '__getitem__'

Tags: python mysql scrapy

I ran into a problem while using Scrapy to collect data into MySQL, and I hope someone can suggest a solution, thanks. The error raised from pipelines.py:

2013-12-06 18:07:02+0800 [-] ERROR: Unhandled error in Deferred:
2013-12-06 18:07:02+0800 [-] Unhandled Error
    Traceback (most recent call last):
      File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
        self.__bootstrap_inner()
      File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.7/threading.py", line 504, in run
        self.__target(*self.__args, **self.__kwargs)
    --- <exception caught here> ---
      File "/usr/local/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 191, in _worker
        result = context.call(ctx, function, *args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
        return self.currentContext().callWithContext(ctx, func, *args, **kw)
      File "/usr/local/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
        return func(*args,**kw)
      File "/usr/local/lib/python2.7/dist-packages/twisted/enterprise/adbapi.py", line 448, in _runInteraction
        result = interaction(trans, *args, **kw)
      File "/home/hugo/spider/spider/pipelines.py", line 39, in _conditional_insert
        tx.execute('INSERT INTO book_updata values (%s, %s, %s, %s, %s)' ,(item['name'][i], item['siteid'][i], item['page_url'][i], item['page_title'][i], time.time()))
    exceptions.TypeError: 'int' object has no attribute '__getitem__'

Error: exceptions.TypeError: 'int' object has no attribute '__getitem__'

And the code:

# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/topics/item-pipeline.html
# -*- coding: utf-8 -*-
from scrapy import log
from twisted.enterprise import adbapi
from scrapy.http import Request  
from scrapy.exceptions import DropItem 
from scrapy.contrib.pipeline.images import ImagesPipeline 
import time  
import MySQLdb  
import MySQLdb.cursors
import socket
import select
import sys
import os
import errno

class MySQLStorePipeline(object):
    def __init__ (self):

        self.dbpool = adbapi.ConnectionPool('MySQLdb', 
              db = 'test', 
              user = 'root', 
              passwd = '153325', 
              cursorclass =MySQLdb.cursors.DictCursor,  
              charset = 'utf8', 
              use_unicode = False
       )
    def process_item(self,item, spider):
        query = self.dbpool.runInteraction(self._conditional_insert,item)  
        return item
    def _conditional_insert (self, tx, item):
        for i in range(len(item['name'])):
            tx.execute("select * from book where name = '%s'" % (item['name'][i]))
            result = tx.fetchone()
            #(name, page_url, page_title, siteid, date) 
            if result:
                for i in range(len(item['name'])):
                    tx.execute('INSERT INTO book_updata values (%s, %s, %s, %s, %s)' ,(item['name'][i], item['siteid'][i], item['page_url'][i], item['page_title'][i], time.time()))
                    log.msg("\n ====Old novel: %s is update!==== \n" % item['name'][i], level=log.DEBUG)
            else:
                log.msg("\n ===New novel: %s is into db==== \n" % item['name'][i], level=log.DEBUG)
                tx.execute("INSERT INTO book (name, category, page_url, page_title, author, img_url, intro, state, time) values ('%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s')" % (item['name'][i], item['category'][i], item['page_url'][i], item['page_title'][i], item['author'][i], item['img_url'][i], item['intro'][i], item['state'][i], int(time.time())))
    def handle_error(self, e):
        log.err(e)

Best Answer

One of the item[xxx] values in tx.execute('INSERT INTO book_updata ...) appears to be an int rather than a list or dict, so indexing it with [i] fails. Check the data format inside item and see whether one of its fields has the wrong format.
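As a concrete way to check this, here is a minimal debugging sketch (not from the original post; it assumes the item declares the same field names used in the pipeline above). It prints the type of every field and wraps any scalar value in a one-element list, so that item[key][i] no longer raises TypeError:

    # Minimal debugging sketch; field names are assumed from the pipeline above.
    FIELDS = ('name', 'siteid', 'page_url', 'page_title',
              'category', 'author', 'img_url', 'intro', 'state')

    def normalize_item(item):
        for key in FIELDS:
            if key not in item:
                continue
            value = item[key]
            # Show which field arrives as a plain int instead of a list
            print('%s is %s: %r' % (key, type(value).__name__, value))
            if not isinstance(value, (list, tuple)):
                # e.g. siteid scraped as a bare int -> wrap it so [i] works
                item[key] = [value]
        return item

Calling normalize_item(item) at the top of process_item (or just keeping the print while debugging) shows which field comes through as an int; the cleaner fix is to make the spider always yield list-valued fields of equal length.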

Regarding python - scrapy exceptions.TypeError: 'int' object has no attribute '__getitem__', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20421483/

Related articles:

python - Selenium Firefox webdriver causes error: Service geckodriver unexpectedly exited. Status code was: 2

python - mysqldbcompare and mysqldiff unable to compare

python - Scrapy: looping over search results only returns the first item

python - Implementing the neural network cost function with Python (Week 5 Coursera)

python - How to manually build a C++ extension with mingw-w64, Python and pybind11?

php - Login with php and mysqli

python - MongoDB InvalidDocument: Cannot encode object

python - Scrapy spider not following links when using Celery

python - Uppercasing all strings in a double loop in Python

PHP: Export mysql database to CSV based on 2 columns?