python: Mac OS X malloc error: pointer being freed was not allocated. Abort trap: 6

Tags: python, python-multithreading

I am running a multithreaded Python script. It crawls the web and inserts/updates rows in MySQL. Here is my code:

my_thread.py

import threading
import time

class MyThread (threading.Thread):
    def __init__(self, threadID, threadname, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.threadname = threadname
        self.queue = q
        self.__exitFlag = False
        self.__signal_lock = threading.Lock()

    def run(self):
        print "Starting " + self.threadname
        self.process_data()
        print "Exiting " + self.threadname

    def stop(self):
        with self.__signal_lock:
            self.__exitFlag = True

    def process_data(self):
        while not self.__exitFlag:
            if not self.queue.empty():
                data = self.queue.get()
                # crawl data from the web...
                # update to mysql
                # assuming we have already connected mysql:
                # db = MySQLDb()
                # db.connect
                query = ""
                db.query(query)

mysql_db.py

import MySQLdb
import MySQLdb.cursors

class MySQLDb:
    conn = None

    def connect(self):
        self.conn = MySQLdb.connect(
            host="127.0.0.1",
            user = "root",
            passwd = "password",
            db = "moviestats")

        self.cursor = self.conn.cursor(MySQLdb.cursors.DictCursor)

    def query(self, sql):
        try:
            self.cursor.execute(sql)
            self.conn.commit()
        except (AttributeError, MySQLdb.OperationalError):
            # solution to: MySQL server has gone away
            self.cursor.close()
            self.connect()
            self.cursor = self.conn.cursor(MySQLdb.cursors.DictCursor)
            self.cursor.execute(sql)
            self.conn.commit()

The error log is as follows:

Process:         Python [905]
Path:            /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         2.7.7 (2.7.7)
Code Type:       X86-64 (Native)
Parent Process:  bash [751]
Responsible:     Terminal [410]
User ID:         501

Date/Time:       2014-07-09 22:31:43.221 +0800
OS Version:      Mac OS X 10.9.3 (13D65)
Report Version:  11
....

....
Crashed Thread:  5

Exception Type:  EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000

Application Specific Information:
abort() called
*** error for object 0x100a4b600: pointer being freed was not allocated
......
Thread 5 Crashed:
0   libsystem_kernel.dylib          0x00007fff83153866 __pthread_kill + 10
1   libsystem_pthread.dylib         0x00007fff8de8735c pthread_kill + 92
2   libsystem_c.dylib               0x00007fff8ef88b1a abort + 125
3   libsystem_malloc.dylib          0x00007fff8220707f free + 411
4   libmysqlclient.18.dylib         0x0000000101027302 vio_delete + 44
5   libmysqlclient.18.dylib         0x000000010100709a end_server + 48
6   libmysqlclient.18.dylib         0x0000000101006f81 cli_safe_read + 49
7   libmysqlclient.18.dylib         0x000000010100b469 cli_read_query_result + 26
8   libmysqlclient.18.dylib         0x000000010100a648 mysql_real_query + 83
9   _mysql.so                       0x0000000100533be8 _mysql_ConnectionObject_query + 85
10  org.python.python               0x00000001000c2fad PyEval_EvalFrameEx + 21405
11  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
12  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
13  org.python.python               0x00000001000c4fb3 PyEval_EvalCodeEx + 2115
14  org.python.python               0x00000001000c33f0 PyEval_EvalFrameEx + 22496
15  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
16  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
17  org.python.python               0x00000001000c4fb3 PyEval_EvalCodeEx + 2115
18  org.python.python               0x00000001000c33f0 PyEval_EvalFrameEx + 22496
19  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
20  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
21  org.python.python               0x00000001000c3bfa PyEval_EvalFrameEx + 24554
22  org.python.python               0x00000001000c4fb3 PyEval_EvalCodeEx + 2115
23  org.python.python               0x000000010003eac0 function_call + 176
24  org.python.python               0x000000010000ceb2 PyObject_Call + 98
25  org.python.python               0x000000010001f56d instancemethod_call + 365
26  org.python.python               0x000000010000ceb2 PyObject_Call + 98
27  org.python.python               0x00000001000bc957 PyEval_CallObjectWithKeywords + 87
28  org.python.python               0x0000000100102f27 t_bootstrap + 71
29  libsystem_pthread.dylib         0x00007fff8de86899 _pthread_body + 138
30  libsystem_pthread.dylib         0x00007fff8de8672a _pthread_start + 137
31  libsystem_pthread.dylib         0x00007fff8de8afc9 thread_start + 13

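I run the script with 50 threads. The error occurs intermittently, but it is reproducible. I have narrowed the problem down to the inserts/updates to MySQL. I have read that this may be a concurrency issue, but how do I fix it?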

Best answer

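I ran into the same malloc error when using MySQLdb on OS X. In my case the cause was sharing a single MySQLdb connection between threads. Using one connection per thread fixed it for me.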
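
One way to get a connection per thread is to keep it in `threading.local` storage and create it lazily the first time a thread asks for it. The sketch below is only an illustration of that idea: `PerThreadConnection`, `FakeConnection`, and `demo` are hypothetical names, and `FakeConnection` stands in for a real connection so the example runs without a MySQL server. In the script from the question, the factory would presumably be something like `lambda: MySQLdb.connect(host="127.0.0.1", user="root", passwd="password", db="moviestats")`.

```python
import threading


class PerThreadConnection(object):
    """Hand each thread its own connection, created lazily on first use."""

    def __init__(self, connect):
        # `connect` is any zero-argument callable that returns a new
        # connection (hypothetical parameter, not part of MySQLdb).
        self._connect = connect
        self._local = threading.local()

    def get(self):
        # Attributes on threading.local are only visible to the thread
        # that set them, so each thread sees its own `conn` or none at all.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = self._connect()
            self._local.conn = conn
        return conn


class FakeConnection(object):
    """Stand-in for a real database connection (assumption for the demo)."""
    pass


def demo(num_threads=4):
    connections = PerThreadConnection(FakeConnection)
    seen = []
    seen_lock = threading.Lock()

    def worker():
        first = connections.get()
        second = connections.get()  # same thread, so the same object
        with seen_lock:
            seen.append((first, second))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Each worker thread would then call `connections.get()` inside its own `run()` method instead of touching a module-level `db` object, so no two threads ever issue queries on the same connection.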

From the documentation at http://mysql-python.sourceforge.net/MySQLdb.html :

The MySQL protocol can not handle multiple threads using the same connection at once. Some earlier versions of MySQLdb utilized locking to achieve a threadsafety of 2. While this is not terribly hard to accomplish using the standard Cursor class (which uses mysql_store_result()), it is complicated by SSCursor (which uses mysql_use_result(); with the latter you must ensure all the rows have been read before another query can be executed. It is further complicated by the addition of transactions, since transactions start when a cursor execute a query, but end when COMMIT or ROLLBACK is executed by the Connection object. Two threads simply cannot share a connection while a transaction is in progress, in addition to not being able to share it during query execution. This excessively complicated the code to the point where it just isn't worth it.

The general upshot of this is: Don't share connections between threads. It's really not worth your effort or mine, and in the end, will probably hurt performance, since the MySQL server runs a separate thread for each connection. You can certainly do things like cache connections in a pool, and give those connections to one thread at a time. If you let two threads use a connection simultaneously, the MySQL client library will probably upchuck and die. You have been warned.
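
The pooling approach mentioned in the quote (cache connections and give each one to a single thread at a time) can be sketched with a thread-safe queue. This is a minimal illustration under stated assumptions, not MySQLdb's own API: `ConnectionPool`, `FakeConnection`, and `demo` are hypothetical names, and `FakeConnection` only records how many threads use it at once so the example runs without a database.

```python
import contextlib
import threading
import time

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


class ConnectionPool(object):
    """Fixed-size pool that lends each connection to one thread at a time."""

    def __init__(self, connect, size):
        # `connect` is a zero-argument factory (hypothetical parameter).
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextlib.contextmanager
    def connection(self):
        conn = self._idle.get()   # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand it back, even on error


class FakeConnection(object):
    """Stand-in connection that tracks concurrent use (demo assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.max_active = 0
        self.queries = 0

    def query(self, sql):
        with self._lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            self.queries += 1
        time.sleep(0.001)         # pretend the query takes some time
        with self._lock:
            self.active -= 1


def demo(num_threads=8, pool_size=2):
    created = []

    def connect():
        conn = FakeConnection()
        created.append(conn)
        return conn

    pool = ConnectionPool(connect, pool_size)

    def worker():
        with pool.connection() as conn:
            conn.query("SELECT 1")

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return created
```

With 50 crawler threads, a pool like this caps the number of open connections at `size` while still guaranteeing that a connection is never used by two threads simultaneously. With a real database it would probably also make sense to commit or roll back before returning a connection to the pool, so that a transaction never spans two borrowers.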

Regarding "python: Mac OS X malloc error: pointer being freed was not allocated. Abort trap: 6", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24738717/
