Python MySQLdb response times differ wildly on similar-sized result sets

I am writing a crontab job in Python with MySQLdb that pulls data from MySQL, generates XML files, and compresses them. Yes, it is an archiving task that runs every day at noon.

My code:

#!/usr/bin/env python
#encoding: utf-8
from dmconfig import DmConf
#from dmdb import Dmdb
import redis
import MySQLdb
import dawnutils


import time
from datetime import datetime, timedelta, date

conf = DmConf().loadConf()

db = MySQLdb.connect(host=conf["DbHost"],user=conf['DbAccount'],passwd=conf['DbPassword'],\
        db=conf['DbName'],charset=conf['DbCharset'])
cache = redis.Redis(host=conf['RedisHost'], port=conf['RedisPort'], 
        db=conf['Redisdbid'], password=conf['RedisPassword'])

#cursor = db.cursor()

def try_reconnect(conn):
    # Return a live connection: reuse conn if ping() succeeds, otherwise open a new one.
    try:
        conn.ping()
    except MySQLdb.Error:
        conn = MySQLdb.connect(host=conf["DbHost"],user=conf['DbAccount'],passwd=conf['DbPassword'],\
            db=conf['DbName'],charset=conf['DbCharset'])
    return conn


def zip_task(device, start, stop):
    global db
    format = "%Y%m%d%H%M%S"
    begin = time.strftime("%Y-%m-%d %H:%M:%S",time.strptime(start,format))
    end = time.strftime("%Y-%m-%d %H:%M:%S",time.strptime(stop,format))
    print "%s (%s,%s)"%(device, begin, end)
    sql = "SELECT * from `period` WHERE `snrCode` = \"%s\" AND `time` > \"%s\" AND `time` < \"%s\" ORDER BY `recId` DESC"%(device, begin, end)
    print sql

    # Reconnect first so the cursor is created on the live connection.
    db = try_reconnect(db)
    cursor = db.cursor()
    t1 = time.time()
    results = ()  # stays empty if the query fails, so len(results) below is safe
    try:
        cursor.execute(sql)
        results = cursor.fetchall()
    except MySQLdb.Error,e:
        print "Error %s"%(e)

    print ("SQL takes %f seconds"%(time.time()-t1))

    print ("len of reconds, %d"%len(results))

    #for row in results:
        #print row


def dispatcher(devSet, start, stop):
    print "size of set: %d"%len(devSet)
    print devSet
    for dev in devSet:
        zip_task(dev, start, stop)

def archive_task_queue():
    today = datetime.now()
    oneday = timedelta(days=1)
    yesterday = today - oneday
    format = "%Y%m%d%H%M%S"
    begin = time.strftime(format, yesterday.timetuple())[:8] + '120000'
    end = time.strftime(format, today.timetuple())[:8] + '120000'

    sql = "SELECT * from `logbook` WHERE `login` > \"%s\" AND `login` < \"%s\" AND `logout` > \"%s\" AND `logout` < \"%s\""%(begin, end, begin, end)
    print sql

    cursor = db.cursor()
    reclist = []
    try:
        cursor.execute(sql)
        results = cursor.fetchall()

        for row in results:
            #print row
            reclist.append(row[1])
    except MySQLdb.Error,e:
        print "Error %s"%(e)

    #reclist = [u'A2H300001']

    if len(reclist):
        dispatcher(set(reclist), begin, end)

    db.close()

if __name__ == '__main__':
    archive_task_queue()

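In my code, I first query the device event log (`logbook`) to get the set of devices that had events that day, and then query the data set of each device one by one. The problem shows up in this second stage of queries. Here is my console output after a run: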

SELECT * from `logbook` WHERE `login` > "20160720120000" AND `login` < "20160721120000" AND `logout` > "20160720120000" AND `logout` < "20160721120000"
size of set: 4
set([u'B1H700001', u'B1H700002', u'A1E500018', u'A2H300001'])
B1H700001 (2016-07-20 12:00:00,2016-07-21 12:00:00)
SELECT * from `period` WHERE `snrCode` = "B1H700001" AND `time` > "2016-07-20 12:00:00" AND `time` < "2016-07-21 12:00:00" ORDER BY `recId` DESC
SQL takes 0.018232 seconds
len of reconds, 597
B1H700002 (2016-07-20 12:00:00,2016-07-21 12:00:00)
SELECT * from `period` WHERE `snrCode` = "B1H700002" AND `time` > "2016-07-20 12:00:00" AND `time` < "2016-07-21 12:00:00" ORDER BY `recId` DESC
SQL takes 0.974020 seconds
len of reconds, 4642
A1E500018 (2016-07-20 12:00:00,2016-07-21 12:00:00)
SELECT * from `period` WHERE `snrCode` = "A1E500018" AND `time` > "2016-07-20 12:00:00" AND `time` < "2016-07-21 12:00:00" ORDER BY `recId` DESC
SQL takes 0.342373 seconds
len of reconds, 0
A2H300001 (2016-07-20 12:00:00,2016-07-21 12:00:00)
SELECT * from `period` WHERE `snrCode` = "A2H300001" AND `time` > "2016-07-20 12:00:00" AND `time` < "2016-07-21 12:00:00" ORDER BY `recId` DESC

SQL takes 68.173677 seconds
len of reconds, 5794

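The query times are strange: B1H700002 takes 0.9 seconds for 4642 data points, while A2H300001 takes 68 seconds for 5794 data points.

I then narrowed the problem down to querying only that specific device ID (see the commented-out `reclist` line in my code above). The result was the same: the query takes 65 seconds.

Any clues?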

Best answer

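I experimented further with this SQL query and eventually found that the slowdown is related to MySQLdb's memory usage. Although the full result set is only 5794 rows, the query takes just 0.3 seconds if I add LIMIT 5000, and more than 60 seconds without it.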
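A comparison like this can be reproduced with a small timing helper that measures execute plus fetchall together, the same span the question's code times. This is only a minimal sketch: `time_query` is a hypothetical name that does not appear in the original code, and it assumes nothing beyond a standard DB-API cursor.

```python
import time


def time_query(cursor, sql, params=None):
    """Run one query and return (elapsed_seconds, row_count).

    Hypothetical helper: it times execute() and fetchall() together,
    so client-side row handling is included in the measurement.
    """
    started = time.time()
    if params is None:
        cursor.execute(sql)
    else:
        cursor.execute(sql, params)
    rows = cursor.fetchall()
    return time.time() - started, len(rows)
```

Calling it once with the original `period` query and once with ` LIMIT 5000` appended would show whether the gap described above also appears in another environment.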

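So, as a workaround, I use LIMIT with pagination to fetch a limited number of rows per query and append each page to the rows already fetched. The total time dropped to under 1 second.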
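The answer does not show its pagination code, so the following is only a sketch of one way to do it, using LIMIT/OFFSET pages. The name `fetch_in_pages` and its `page_size` and `placeholder` parameters are assumptions for illustration. It also assumes the base query has a deterministic ORDER BY (such as `recId`, if that column is unique) so that pages neither overlap nor skip rows.

```python
def fetch_in_pages(cursor, base_sql, params, page_size=5000, placeholder="%s"):
    """Collect all rows of base_sql by running it page by page.

    Hypothetical helper. base_sql must not already contain LIMIT.
    placeholder is the driver's parameter marker ("%s" for MySQLdb).
    """
    paged_sql = "{0} LIMIT {1} OFFSET {1}".format(base_sql, placeholder)
    rows = []
    offset = 0
    while True:
        cursor.execute(paged_sql, tuple(params) + (page_size, offset))
        page = cursor.fetchall()
        rows.extend(page)
        # A short (or empty) page means the result set is exhausted.
        if len(page) < page_size:
            break
        offset += page_size
    return rows
```

With the question's query this might be called as `fetch_in_pages(db.cursor(), "SELECT * FROM period WHERE snrCode = %s AND time > %s AND time < %s ORDER BY recId DESC", (device, begin, end))`. Passing the values as parameters also avoids building the SQL string by hand.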

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/38492932/
