Use case: I have a cron job that checks some statistics every 5 minutes and inserts them into a database table named `stats`.
**Structure**
`time` as DATETIME (index)
`skey` as VARCHAR(50) (index)
`value` as BIGINT
Primary key (`time`, `skey`)
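
For reference, the structure above could be written as DDL roughly like this. It is only a sketch: the index name `idx_skey` and the `NOT NULL` on `value` are assumptions, and the definition in the fiddle may differ.

```sql
-- Sketch of the stats table as described above (not copied from the fiddle).
CREATE TABLE stats (
    `time`  DATETIME    NOT NULL,
    `skey`  VARCHAR(50) NOT NULL,
    `value` BIGINT      NOT NULL,      -- assumption: every sample stores a value
    PRIMARY KEY (`time`, `skey`),      -- also serves lookups by `time` alone (leftmost prefix)
    KEY idx_skey (`skey`)              -- hypothetical name for the index on `skey`
);
```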
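Now I want to create a chart that shows the running daily average over the course of a day, for example for the playing-users chart:
From 0-1 I have 10 playing users (avg value from 0-1 is now 10)
From 1-2 I have 6 playing users (avg value is now 8 => (10+6) / 2)
From 2-3 I have 14 playing users (avg value is now 10 => (10+6+14) / 3)
The next day it starts over from the beginning.
I already have working queries, but they take more than 3.5 seconds to run.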
First attempt:

    SELECT *
         , (SELECT AVG(value)
            FROM stats as b
            WHERE b.skey = stats.skey
              AND b.time <= stats.time
              AND DATE(b.time) = DATE(stats.time))
    FROM stats
    ORDER BY stats.time DESC

Second attempt:

    SELECT *
         , (SELECT AVG(b.value)
            FROM stats as b
            WHERE b.skey = stats.skey
              AND DATE(b.time) = DATE(stats.time)
              AND b.time <= stats.time) as avg
    FROM stats
    WHERE skey = 'playingUsers'
    GROUP BY HOUR(stats.time)
           , DATE(stats.time)

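The first attempt takes every entry and computes the average for it.
The second attempt groups by hour (as in my example).
Either way, it makes no difference to the performance.
Is there a way to improve the MySQL performance, or do I have to change the whole logic behind it?
DB fiddle: https://www.db-fiddle.com/f/krFmR1yPsmnPny2zi5NJGv/4
Best answer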
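I suggest separating the calculation of the hourly averages from the calculation of the daily average, and computing the hourly values only once by grouping.
If you are using MySQL 8, I suggest using a CTE as follows:

    WITH HOURLY AS (
        -- one row per (date, hour): average of the samples taken in that hour
        SELECT DISTINCT
               DATE_,
               HOUR_,
               AVG(b.value) AS avg_per_hour
        FROM (SELECT s.value, DATE(s.time) DATE_, HOUR(s.time) HOUR_
              FROM stats s
              WHERE skey = 'playingUsers'
             ) b
        GROUP BY b.DATE_, b.HOUR_
        ORDER BY b.DATE_ DESC, b.HOUR_ DESC
    )
    -- running average of the hourly averages within each day
    SELECT *
         , (SELECT AVG(b.avg_per_hour)
            FROM HOURLY AS b
            WHERE b.DATE_ = HOURLY.DATE_
              AND b.HOUR_ <= HOURLY.HOUR_) AS avg
    FROM HOURLY
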
This statement takes < 300 ms in the given fiddle.
The calculation corresponds to the algorithm you described in the example above.
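
As a quick check of that correspondence, here is a minimal sketch that feeds the three hourly values from the question (10, 6, 14) into the same correlated-subquery pattern. The inline `HOURLY` rows are made-up stand-ins for real data and cover a single day only, so the `DATE_` condition is left out.

```sql
-- Stand-in for the HOURLY CTE: hours 0, 1, 2 with the values from the question.
WITH HOURLY AS (
    SELECT 0 AS HOUR_, 10 AS avg_per_hour
    UNION ALL SELECT 1, 6
    UNION ALL SELECT 2, 14
)
SELECT HOURLY.HOUR_
     , HOURLY.avg_per_hour
       -- average of all hourly values up to and including the current hour
     , (SELECT AVG(b.avg_per_hour)
        FROM HOURLY AS b
        WHERE b.HOUR_ <= HOURLY.HOUR_) AS avg
FROM HOURLY
ORDER BY HOURLY.HOUR_;
```

The `avg` column comes out as 10, 8 and 10 for hours 0, 1 and 2, matching (10), (10+6)/2 and (10+6+14)/3.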
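However, the results differ from those of the statements you posted, presumably because those average the raw rows up to each timestamp, whereas this query averages the hourly averages.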
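
On MySQL 8 the running average could also be written with a window function instead of a correlated subquery. The following is an untested sketch against the `stats` table from the question, not something benchmarked on the fiddle; the aliases `date_`, `hour_` and `running_avg` are names chosen here for illustration.

```sql
WITH hourly AS (
    -- one row per (date, hour) for the playingUsers key
    SELECT DATE(s.time) AS date_
         , HOUR(s.time) AS hour_
         , AVG(s.value) AS avg_per_hour
    FROM stats AS s
    WHERE s.skey = 'playingUsers'
    GROUP BY DATE(s.time), HOUR(s.time)
)
SELECT date_
     , hour_
     , avg_per_hour
       -- running average that restarts every day and grows hour by hour
     , AVG(avg_per_hour) OVER (
           PARTITION BY date_
           ORDER BY hour_
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_avg
FROM hourly
ORDER BY date_ DESC, hour_ DESC;
```

`PARTITION BY date_` makes the average start over each day, and the explicit frame limits each row to the hours up to and including its own, so it should give the same numbers as the subquery version.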
Regarding MySQL daily average statistics, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57338188/