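I have three tables and need to join their data on a common field.
Example pseudo table definitions:
barometer_log(device, pressure float, sampleTime timestamp)
temperature_log(device int, temperature float, sampleTime timestamp)
magnitude_log(device int, magnitude float, utcTime timestamp)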
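Each table will eventually contain billions of rows, but currently each holds about 500,000 rows.
I need to combine the data from the tables (a FULL JOIN) so that sampleTime is merged into a single column (COALESCE), giving me rows of: device, sampleTime, pressure, temperature, magnitude
I need to be able to query the data by specifying a device plus a start and end date, e.g. SELECT .... WHERE device = 1000 AND sampleTime BETWEEN '2011-10-11' AND '2011-10-17'
I have tried different UNION ALL techniques with RIGHT and LEFT joins, as suggested in "MySql full join (union) and ordering on multiple date columns", but the query either takes so long that I have to stop it, or it throws an error about temporary file size after running for hours. What is the best way to query these three tables and merge their output within an acceptable time frame?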
Here are the full table definitions, as requested. Note: the device table is not included.
magnitude_log
CREATE TABLE magnitude_log (
    device int(11) NOT NULL,
    magnitude float NOT NULL,
    sampleTime timestamp NOT NULL,
    PRIMARY KEY (device, sampleTime),
    CONSTRAINT magnitudeLog_device
        FOREIGN KEY (device)
        REFERENCES device (id)
        ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
barometer_log
CREATE TABLE barometer_log (
    device int(11) NOT NULL,
    pressure float NOT NULL,
    sampleTime timestamp NOT NULL,
    PRIMARY KEY (device, sampleTime),
    CONSTRAINT barometerLog_device
        FOREIGN KEY (device)
        REFERENCES device (id)
        ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
temperature_log
CREATE TABLE temperature_log (
    device int(11) NOT NULL,
    sampleTime timestamp NOT NULL,
    temperature float DEFAULT NULL,
    PRIMARY KEY (device, sampleTime),
    CONSTRAINT temperatureLog_device
        FOREIGN KEY (device)
        REFERENCES device (id)
        ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Best answer
First, get all combinations of (device, sampleTime) from all 3 tables for the requested period:
-------- Q --------
SELECT device, sampleTime
FROM magnitude_log
WHERE device = 1000
  AND sampleTime >= '2011-10-11'
  AND sampleTime <  '2011-10-18'
UNION
SELECT device, sampleTime
FROM barometer_log
WHERE device = 1000
  AND sampleTime >= '2011-10-11'
  AND sampleTime <  '2011-10-18'
UNION
SELECT device, sampleTime
FROM temperature_log
WHERE device = 1000
  AND sampleTime >= '2011-10-11'
  AND sampleTime <  '2011-10-18'
Then LEFT JOIN the 3 tables to it:
SELECT
    q.device
  , q.sampleTime
  , b.pressure
  , t.temperature
  , m.magnitude
FROM
    ( Q ) AS q    -- Q stands for the UNION query above
  LEFT JOIN
    ( SELECT *
      FROM magnitude_log
      WHERE device = 1000
        AND sampleTime >= '2011-10-11'
        AND sampleTime <  '2011-10-18'
    ) AS m
      ON (m.device, m.sampleTime) = (q.device, q.sampleTime)
  LEFT JOIN
    ( SELECT *
      FROM barometer_log
      WHERE device = 1000
        AND sampleTime >= '2011-10-11'
        AND sampleTime <  '2011-10-18'
    ) AS b
      ON (b.device, b.sampleTime) = (q.device, q.sampleTime)
  LEFT JOIN
    ( SELECT *
      FROM temperature_log
      WHERE device = 1000
        AND sampleTime >= '2011-10-11'
        AND sampleTime <  '2011-10-18'
    ) AS t
      ON (t.device, t.sampleTime) = (q.device, q.sampleTime)
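For reference, here is one way the two steps might be assembled into a single runnable statement. This is a sketch, not the answer's exact query: Q is inlined as a derived table, and each log table is joined directly on its primary key (device, sampleTime) instead of through a filtered subquery. Since Q is already restricted to the device and date range, the direct joins return the same rows. The ORDER BY is an addition of mine. Whether this variant is faster depends on the MySQL version and the data, so compare the plans with EXPLAIN.

```sql
-- Sketch: full-join emulation for one device and one half-open date range.
SELECT
    q.device,
    q.sampleTime,
    b.pressure,
    t.temperature,
    m.magnitude
FROM
    ( -- Q: every (device, sampleTime) pair seen in any of the three tables
      SELECT device, sampleTime
      FROM magnitude_log
      WHERE device = 1000
        AND sampleTime >= '2011-10-11'
        AND sampleTime <  '2011-10-18'
      UNION
      SELECT device, sampleTime
      FROM barometer_log
      WHERE device = 1000
        AND sampleTime >= '2011-10-11'
        AND sampleTime <  '2011-10-18'
      UNION
      SELECT device, sampleTime
      FROM temperature_log
      WHERE device = 1000
        AND sampleTime >= '2011-10-11'
        AND sampleTime <  '2011-10-18'
    ) AS q
    -- Each join is a lookup on the table's PRIMARY KEY (device, sampleTime)
    LEFT JOIN magnitude_log AS m
        ON m.device = q.device AND m.sampleTime = q.sampleTime
    LEFT JOIN barometer_log AS b
        ON b.device = q.device AND b.sampleTime = q.sampleTime
    LEFT JOIN temperature_log AS t
        ON t.device = q.device AND t.sampleTime = q.sampleTime
ORDER BY q.sampleTime;   -- ordering added for readability (assumption)
```

Note the half-open range (>= '2011-10-11' and < '2011-10-18'): it covers all of 2011-10-17, whereas BETWEEN '2011-10-11' AND '2011-10-17' would stop at midnight at the start of the 17th.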
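The longer the period you query, the longer the query will struggle with the UNION subquery. You may consider keeping Q as a separate table, possibly populated via triggers with the unique (device, sampleTime) combinations from the other three tables.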
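A minimal sketch of that idea follows. The table name sample_index and the trigger names are hypothetical, chosen only for illustration, and the one-time backfill is an assumption about how existing rows would be loaded.

```sql
-- Hypothetical helper table: one row per distinct (device, sampleTime) pair.
CREATE TABLE sample_index (
    device int(11) NOT NULL,
    sampleTime timestamp NOT NULL,
    PRIMARY KEY (device, sampleTime)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- One-time backfill from the existing data; INSERT IGNORE skips duplicates.
INSERT IGNORE INTO sample_index (device, sampleTime)
    SELECT device, sampleTime FROM magnitude_log;
INSERT IGNORE INTO sample_index (device, sampleTime)
    SELECT device, sampleTime FROM barometer_log;
INSERT IGNORE INTO sample_index (device, sampleTime)
    SELECT device, sampleTime FROM temperature_log;

-- Keep the helper table current as new samples arrive.
CREATE TRIGGER magnitude_log_ai AFTER INSERT ON magnitude_log
FOR EACH ROW
    INSERT IGNORE INTO sample_index (device, sampleTime)
    VALUES (NEW.device, NEW.sampleTime);

CREATE TRIGGER barometer_log_ai AFTER INSERT ON barometer_log
FOR EACH ROW
    INSERT IGNORE INTO sample_index (device, sampleTime)
    VALUES (NEW.device, NEW.sampleTime);

CREATE TRIGGER temperature_log_ai AFTER INSERT ON temperature_log
FOR EACH ROW
    INSERT IGNORE INTO sample_index (device, sampleTime)
    VALUES (NEW.device, NEW.sampleTime);

-- The helper table then takes the place of the UNION subquery Q.
SELECT
    q.device,
    q.sampleTime,
    b.pressure,
    t.temperature,
    m.magnitude
FROM sample_index AS q
    LEFT JOIN magnitude_log AS m
        ON m.device = q.device AND m.sampleTime = q.sampleTime
    LEFT JOIN barometer_log AS b
        ON b.device = q.device AND b.sampleTime = q.sampleTime
    LEFT JOIN temperature_log AS t
        ON t.device = q.device AND t.sampleTime = q.sampleTime
WHERE q.device = 1000
  AND q.sampleTime >= '2011-10-11'
  AND q.sampleTime <  '2011-10-18';
```

This trades extra work on every insert for a cheaper read. The sketch does not handle deletes or updates: pairs removed from the log tables stay in sample_index and would show up as rows with all three measurements NULL. In MySQL, cascaded foreign key actions do not activate triggers, so rows removed through ON DELETE CASCADE would need a separate cleanup step.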
Source: "Simulating a FULL JOIN in MySQL with large data sets", Stack Overflow: https://stackoverflow.com/questions/8306919/