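Similar to the question "BigQuery combine tables based on closest timestamp and matching value".
I have three tables. For each row of table numberTwo, I need to get the hint from table numberOne that has the same cod value and the closest time when comparing time1 and time2. If the cod does not appear in table numberOne, it should instead take the hint that matches the cod in table numberThree.
To make it easier to understand, this is what I need to do:
Table numberOne: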
| id | cod | hint | time1 |
---------------------------------------------------
| 1 | ABC | V | 2016-11-03 18:00:00 UTC |
| 2 | ABC | W | 2016-11-03 12:00:00 UTC |
| 3 | CDE | X | 2016-11-03 19:00:00 UTC |
| 4 | CDE | Y | 2016-11-03 19:30:00 UTC |
| 5 | EFG | Z | 2016-11-03 18:00:00 UTC |
Table numberTwo:
| id | cod | value | time2 |
----------------------------------------------------
| 1 | ABC | xyz2 | 2016-11-03 18:20:00 UTC |
| 2 | FHK | h323 | 2016-11-03 11:30:00 UTC |
| 3 | ABC | rewq | 2016-11-03 09:00:00 UTC |
| 4 | IJK | abce | 2016-11-03 19:10:00 UTC |
Table numberThree:
| id | cod | hint |
--------------------------
| 1 | FHK | tes1 |
| 2 | IJK | tes2 |
| 3 | MNK | tes3 |
| 4 | MOP | tes4 |
So, for row #1 of table numberTwo, I would take all rows in table numberOne with cod: ABC:
| 1 | ABC | V | 2016-11-03 18:00:00 UTC |
| 2 | ABC | W | 2016-11-03 12:00:00 UTC |
Among these, I would take the one whose timestamp is closest to time2:
| 1 | ABC | V | 2016-11-03 18:00:00 UTC |
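If the cod does not appear in table numberOne, it is matched against table numberThree. The cods in numberOne and numberThree are unique to each table, so the same cod never appears in both. The lookup could therefore just as well try table numberThree first.
After processing every row, I would end up with a table like this:
Desired table: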
| id | cod | hint | value | time2 |
--------------------------------------------------------------
| 1 | ABC | V | xyz2 | 2016-11-03 18:20:00 UTC |
| 2 | FHK | tes1 | h323 | |
| 3 | ABC | W | rewq | 2016-11-03 09:00:00 UTC |
| 4 | IJK | tes2 | abce | |
Best answer
Try the query below:
WITH
-- Sample data from the question; uncomment this block to run the query stand-alone.
/*
TableNumberOne AS (
  SELECT 1 AS id, 'ABC' AS cod, 'V' AS hint, TIMESTAMP '2016-11-03 18:00:00 UTC' AS time1 UNION ALL
  SELECT 2 AS id, 'ABC' AS cod, 'W' AS hint, TIMESTAMP '2016-11-03 12:00:00 UTC' AS time1 UNION ALL
  SELECT 3 AS id, 'CDE' AS cod, 'X' AS hint, TIMESTAMP '2016-11-03 19:00:00 UTC' AS time1 UNION ALL
  SELECT 4 AS id, 'CDE' AS cod, 'Y' AS hint, TIMESTAMP '2016-11-03 19:30:00 UTC' AS time1 UNION ALL
  SELECT 5 AS id, 'EFG' AS cod, 'Z' AS hint, TIMESTAMP '2016-11-03 18:00:00 UTC' AS time1
),
TableNumberTwo AS (
  SELECT 1 AS id, 'ABC' AS cod, 'xyz2' AS value, TIMESTAMP '2016-11-03 18:20:00 UTC' AS time2 UNION ALL
  SELECT 2 AS id, 'FHK' AS cod, 'h323' AS value, TIMESTAMP '2016-11-03 11:30:00 UTC' AS time2 UNION ALL
  SELECT 3 AS id, 'ABC' AS cod, 'rewq' AS value, TIMESTAMP '2016-11-03 09:00:00 UTC' AS time2 UNION ALL
  SELECT 4 AS id, 'IJK' AS cod, 'abce' AS value, TIMESTAMP '2016-11-03 19:10:00 UTC' AS time2
),
TableNumberThree AS (
  SELECT 1 AS id, 'FHK' AS cod, 'tes1' AS hint UNION ALL
  SELECT 2 AS id, 'IJK' AS cod, 'tes2' AS hint UNION ALL
  SELECT 3 AS id, 'MNK' AS cod, 'tes3' AS hint UNION ALL
  SELECT 4 AS id, 'MOP' AS cod, 'tes4' AS hint
),
*/
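tempTable AS (
  -- Pair each TableNumberTwo row with every TableNumberOne row sharing its cod,
  -- and rank the candidates by absolute time difference (win = 1 is the closest).
  SELECT
    t2.id, t2.cod, t2.value, t2.time2, t1.hint,
    ROW_NUMBER() OVER(PARTITION BY t2.id, t2.cod, t2.value
                      ORDER BY ABS(TIMESTAMP_DIFF(t2.time2, t1.time1, SECOND))) AS win
  FROM TableNumberTwo AS t2
  LEFT JOIN TableNumberOne AS t1
    ON t1.cod = t2.cod
)
-- Keep only the closest candidate. Rows with no hint from TableNumberOne fall back
-- to TableNumberThree and get an empty time2, as in the desired table.
-- Note: here t1 aliases tempTable and t2 aliases TableNumberThree.
SELECT
  t1.id, t1.cod, IFNULL(t1.hint, t2.hint) AS hint, value,
  IF(t1.hint IS NULL, NULL, time2) AS time2
FROM tempTable AS t1
LEFT JOIN TableNumberThree AS t2
  ON t1.cod = t2.cod AND t1.hint IS NULL
WHERE win = 1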
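The matching rule can also be checked outside BigQuery. What follows is a minimal Python sketch, not part of the original answer, that applies the same rule to the sample data from the question: take the hint with the closest time1 from numberOne, otherwise fall back to numberThree and leave time2 empty. The names `ts` and `combine` and the tuple layouts are assumptions made for illustration only.

```python
from datetime import datetime


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp (all sample values are UTC)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Sample rows from the question: (id, cod, hint, time1)
table_one = [
    (1, "ABC", "V", ts("2016-11-03 18:00:00")),
    (2, "ABC", "W", ts("2016-11-03 12:00:00")),
    (3, "CDE", "X", ts("2016-11-03 19:00:00")),
    (4, "CDE", "Y", ts("2016-11-03 19:30:00")),
    (5, "EFG", "Z", ts("2016-11-03 18:00:00")),
]
# (id, cod, value, time2)
table_two = [
    (1, "ABC", "xyz2", ts("2016-11-03 18:20:00")),
    (2, "FHK", "h323", ts("2016-11-03 11:30:00")),
    (3, "ABC", "rewq", ts("2016-11-03 09:00:00")),
    (4, "IJK", "abce", ts("2016-11-03 19:10:00")),
]
# (id, cod, hint)
table_three = [
    (1, "FHK", "tes1"),
    (2, "IJK", "tes2"),
    (3, "MNK", "tes3"),
    (4, "MOP", "tes4"),
]


def combine(one, two, three):
    """Hypothetical helper: closest hint from `one`, else the hint from `three`."""
    fallback = {cod: hint for _, cod, hint in three}
    result = []
    for row_id, cod, value, time2 in two:
        # All rows of table one with the same cod, keyed by absolute time gap.
        candidates = [
            (abs(time2 - time1), hint)
            for _, other_cod, hint, time1 in one
            if other_cod == cod
        ]
        if candidates:
            # The smallest gap wins; time2 is kept.
            _, hint = min(candidates, key=lambda pair: pair[0])
            result.append((row_id, cod, hint, value, time2))
        else:
            # No match in table one: use table three's hint, leave time2 empty.
            result.append((row_id, cod, fallback.get(cod), value, None))
    return result


rows = combine(table_one, table_two, table_three)
for row in rows:
    print(row)
```

When two candidates are equally close, this sketch keeps the first one listed, whereas ROW_NUMBER() in the query makes no guarantee about which of the tied rows gets win = 1.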
Regarding "mysql - BigQuery - combine three tables based on matching value or timestamp", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/40897056/