I want to fill the NULL values in the device column with the associated non-NULL value for each session_id. How can I achieve this?
Here is the sample data:
+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 |         |
| 351acc     | step2 |         |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 |         |
| 350bca     | step3 |         |
| 350bca     | step4 | desktop |
+------------+-------+---------+
Desired output:
+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 | mobile  |
| 351acc     | step2 | mobile  |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 | desktop |
| 350bca     | step3 | desktop |
| 350bca     | step4 | desktop |
+------------+-------+---------+
Best answer
The window function first_value() with the right ordering is probably the cheapest:
SELECT session_id, step
     , COALESCE(device
              , first_value(device) OVER (PARTITION BY session_id ORDER BY device IS NULL, step)
               ) AS device
FROM   tbl
ORDER  BY session_id DESC, step;
ORDER BY device IS NULL, step sorts NULL values last within each session_id, so the device from the earliest step that has a non-NULL value is picked.
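To see why this ordering works, here is a minimal sketch, assuming PostgreSQL and a temporary table named tbl with text columns as in the question. The device_missing and filled column aliases are made up for illustration. The boolean device IS NULL sorts false before true, so rows with a device come first in each partition and first_value() picks from them.

```sql
-- Assumption: PostgreSQL; tbl mirrors the sample data from the question.
CREATE TEMP TABLE tbl (session_id text, step text, device text);

INSERT INTO tbl (session_id, step, device) VALUES
  ('351acc', 'step1', NULL)
, ('351acc', 'step2', NULL)
, ('351acc', 'step3', 'mobile')
, ('351acc', 'step4', 'mobile')
, ('350bca', 'step1', 'desktop')
, ('350bca', 'step2', NULL)
, ('350bca', 'step3', NULL)
, ('350bca', 'step4', 'desktop');

-- device_missing exposes the sort key used inside the window:
-- false (device present) sorts before true (device is NULL).
SELECT session_id, step, device
     , device IS NULL AS device_missing
     , first_value(device) OVER (PARTITION BY session_id
                                 ORDER BY device IS NULL, step) AS filled
FROM   tbl
ORDER  BY session_id DESC, step;
```

If a session_id has no non-NULL device at all, first_value() has nothing to pick from and the column stays NULL for that session.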
If the non-NULL device is always the same for a given session_id, you can simplify to just ORDER BY device IS NULL. And you don't need COALESCE.
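The simplified form might look like the sketch below. It assumes PostgreSQL and only holds when each session_id has a single distinct non-NULL device. The sample rows are inlined as a CTE named tbl so the statement runs on its own.

```sql
-- Assumption: every session_id has at most one distinct non-NULL device.
WITH tbl (session_id, step, device) AS (
   VALUES
     ('351acc', 'step1', NULL::text)
   , ('351acc', 'step2', NULL)
   , ('351acc', 'step3', 'mobile')
   , ('351acc', 'step4', 'mobile')
   , ('350bca', 'step1', 'desktop')
   , ('350bca', 'step2', NULL)
   , ('350bca', 'step3', NULL)
   , ('350bca', 'step4', 'desktop')
   )
SELECT session_id, step
       -- No COALESCE: the first row in each partition is non-NULL
       -- whenever the session has any device value at all.
     , first_value(device) OVER (PARTITION BY session_id
                                 ORDER BY device IS NULL) AS device
FROM   tbl
ORDER  BY session_id DESC, step;
```

COALESCE can be dropped here because a row that already has a device gets back that same value from first_value().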
Regarding "sql - Replace NULL values per partition", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67007078/