python - 优化python中的mysql查询

我正在使用 python 中的 mysql-connector 连接到 mysql 数据库。在正常的事件运行中，我不需要处理超过 1000 行，但有时我可能需要处理 20k 多行数据，这就是我的查询变得非常慢并超时的地方。我尝试过在 mysql-connector 中使用迭代器来限制在给定时间处理的数量，但似乎这只是为 py 中的 MySQLdb 实现的。

SELECT
        ROUND(311.30004 - (20.110938 * temp.value) 
        - (2.012626 * hum.value) - (2.006346 * airspeed.value) 
        + (0.4059 * POWER(temp.value, 2) + (0.006604 * POWER(hum.value, 2)) 
        + (3.151145 * POWER(airspeed.value, 2) + (0.05555 * temp.value * hum.value) 
        - (0.37037 * temp.value * airspeed.value) + (0.03968 * hum.value * airspeed.value))), 2) 
        AS value, temp.time, temp.x, temp.y, temp.z, temp.round_id, 'predicted_frequency', temp.round_number, temp.day_of_production
FROM round_data_temperature AS temp 
    INNER JOIN round_data_humidity AS hum 
        ON temp.round_number = hum.round_number 
           AND temp.x = hum.x 
           AND temp.y = hum.y AND DATE_FORMAT(temp.time, '%Y-%m-%d %H:%i:00') = DATE_FORMAT(hum.time, '%Y-%m-%d %H:%i:00')
    INNER JOIN round_data_airspeed AS airspeed 
        ON temp.round_number = airspeed.round_number 
            AND temp.x = airspeed.x AND temp.y = airspeed.y 
            AND DATE_FORMAT(temp.time, '%Y-%m-%d %H:%i:00') = DATE_FORMAT(airspeed.time, '%Y-%m-%d %H:%i:00')
WHERE temp.round_id ='xxxxx'

这段代码仅在 mysql 上运行得非常快，但在 python 中的 mysql-connector 中运行速度非常慢并且超时，有 20k 多行，所以我决定立即插入以避免在 python 中选择非常大的数据包。我做了以下操作:

INSERT INTO round_data_pf (value, time, x, y, z, round_id, observable_name, round_number, day_of_production)
SELECT
        ROUND(311.30004 - (20.110938 * temp.value) 
        - (2.012626 * hum.value) - (2.006346 * airspeed.value) 
        + (0.4059 * POWER(temp.value, 2) + (0.006604 * POWER(hum.value, 2)) 
        + (3.151145 * POWER(airspeed.value, 2) + (0.05555 * temp.value * hum.value) 
        - (0.37037 * temp.value * airspeed.value) + (0.03968 * hum.value * airspeed.value))), 2) 
        AS value, temp.time, temp.x, temp.y, temp.z, temp.round_id, 'pf', temp.round_number, temp.day_of_production
FROM round_data_temperature AS temp 
    INNER JOIN round_data_humidity AS hum 
        ON temp.round_number = hum.round_number 
            AND temp.x = hum.x 
            AND temp.y = hum.y 
            AND DATE_FORMAT(temp.time, '%Y-%m-%d %H:%i:00') = DATE_FORMAT(hum.time, '%Y-%m-%d %H:%i:00')
    INNER JOIN round_data_airspeed AS airspeed 
        ON temp.round_number = airspeed.round_number 
            AND temp.x = airspeed.x 
            AND temp.y = airspeed.y 
            AND DATE_FORMAT(temp.time, '%Y-%m-%d %H:%i:00') = DATE_FORMAT(airspeed.time, '%Y-%m-%d %H:%i:00')
WHERE temp.round_id ='xxxxx'

当我有大约 20k 行数据时，这段代码在 python 中的 mysql 和 mysql-connector 中都会超时。

我正在寻求优化代码，以便当我有超过 20k 行时，它可以更快地选择和插入。

最佳答案

确保您有正确的索引

table round_data_temperature composite index  on columns (round_id, round_number, x, y,  time)

table  round_data_humidity     composite index  on columns (round_number, x, y time) 

table  round_data_airspeed   comoosite index on clumns (round_number, x, y,time   )

您必须考虑这样一个事实:关系中涉及的每个表只能使用一个索引。如果您没有在表上建立索引，则表将被完全扫描。

单列索引和复合索引之间的区别在于，索引中存在的所有列(从左到右)都用于过滤查询，避免访问表数据，因此在某些情况下使用复合索引(正确定义)可以显着改进查询..

关于python - 优化python中的mysql查询，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/55516134/

python - 优化python中的mysql查询

上一篇：mysql - MARIADB/MYSQL : multiple connections from localhost to localhost

下一篇：php - 在一天中的不同时间段运行相同的 MYSQL 查询，是否有更好的方法？