I have the following MySQL table:
accountNum  date  status  action  qty  time
----------  ----  ------  ------  ---  -----
1234        2017  filled  B       10   11:20
1234        2017  filled  S       10   11:20
2345        2017  filled  B       20   12:00
2345        2017  filled  B       10   12:00
4444        2017  filled  B       5    01:00
4444        2017  filled  S       5    02:00
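Here I want to compare pairs of rows: one with action "B" followed by one with action "S". When two such rows are found (B first, then S), I need to check that accountNum, date, time, and status are the same in both.
So for the test data above I should get only the first two rows: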
accountNum  date  status  action  qty  time
----------  ----  ------  ------  ---  -----
1234        2017  filled  B       10   11:20
1234        2017  filled  S       10   11:20
What kind of query should I write for this?
Best answer
I would start by finding the key combinations that have both actions:
    select accountNum, date, status, time
    from yourTable
    where action in ('B', 'S')
    group by accountNum, date, status, time
    having count(distinct action) = 2
Then you can join that result back to the original table to keep only the rows you want:
    select t1.*
    from yourTable t1
    join (
        select accountNum, date, status, time
        from yourTable
        where action in ('B', 'S')
        group by accountNum, date, status, time
        having count(distinct action) = 2
    ) t2
    on  t1.accountNum = t2.accountNum
    and t1.date = t2.date
    and t1.status = t2.status
    and t1.time = t2.time
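As a quick check, the join query above can be run against the sample rows from the question. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of MySQL, the column types are guesses, the helper name matched_rows is made up, and an order by clause is added solely to make the output order predictable.

```python
import sqlite3

# Sample rows copied from the question. SQLite stands in for MySQL here,
# and the column types below are assumptions.
ROWS = [
    (1234, 2017, "filled", "B", 10, "11:20"),
    (1234, 2017, "filled", "S", 10, "11:20"),
    (2345, 2017, "filled", "B", 20, "12:00"),
    (2345, 2017, "filled", "B", 10, "12:00"),
    (4444, 2017, "filled", "B", 5, "01:00"),
    (4444, 2017, "filled", "S", 5, "02:00"),
]

# The answer's query, plus an "order by" added only for stable output.
QUERY = """
select t1.*
from yourTable t1
join (
    select accountNum, date, status, time
    from yourTable
    where action in ('B', 'S')
    group by accountNum, date, status, time
    having count(distinct action) = 2
) t2
on  t1.accountNum = t2.accountNum
and t1.date = t2.date
and t1.status = t2.status
and t1.time = t2.time
order by t1.accountNum, t1.action
"""


def matched_rows():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table yourTable ("
        "accountNum integer, date integer, status text, "
        "action text, qty integer, time text)"
    )
    conn.executemany("insert into yourTable values (?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in matched_rows():
        print(row)
```

On the sample data this returns only the two rows for account 1234. Account 2345 is excluded because both of its rows are B, and account 4444 because its B and S rows have different times.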
Edit
I'm not a Hive expert, but if distinct and having are not allowed in a subquery, you could probably write the query like this:
    select t1.*
    from yourTable t1
    join (
        select accountNum, date, status, time, count(action) as cnt
        from yourTable
        where action in ('B', 'S')
        group by accountNum, date, status, time
    ) t2
    on  t1.accountNum = t2.accountNum
    and t1.date = t2.date
    and t1.status = t2.status
    and t1.time = t2.time
    where t2.cnt = 2
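If the same accountNum/date/time/status combination cannot have multiple instances of the same action, you can drop distinct entirely, and the having clause can be moved to the outer query as a where condition. Note that the sample data in the question does not satisfy this condition: account 2345 has two B rows at 12:00, so its count(action) is also 2.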
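The distinct-free version relies on each key having at most one row per action. The sketch below checks what happens on the question's sample data and tries one possible workaround that is not part of the original answer: comparing min(action) with max(action), which differ only when a group contains both B and S. It assumes SQLite through Python's sqlite3 module as a stand-in, and it has not been tried on Hive or MySQL. The names JOIN_TEMPLATE, COUNT_QUERY, MIN_MAX_QUERY, lo, hi, and run are all hypothetical.

```python
import sqlite3

# Sample rows copied from the question; column types are assumptions.
ROWS = [
    (1234, 2017, "filled", "B", 10, "11:20"),
    (1234, 2017, "filled", "S", 10, "11:20"),
    (2345, 2017, "filled", "B", 20, "12:00"),
    (2345, 2017, "filled", "B", 10, "12:00"),
    (4444, 2017, "filled", "B", 5, "01:00"),
    (4444, 2017, "filled", "S", 5, "02:00"),
]

# Shared shape of both queries; "order by" is added only for stable output.
JOIN_TEMPLATE = """
select t1.*
from yourTable t1
join (
    select accountNum, date, status, time, {aggregates}
    from yourTable
    where action in ('B', 'S')
    group by accountNum, date, status, time
) t2
on  t1.accountNum = t2.accountNum
and t1.date = t2.date
and t1.status = t2.status
and t1.time = t2.time
where {condition}
order by t1.accountNum, t1.action, t1.qty
"""

# The edited query from the answer: counts rows, not distinct actions.
COUNT_QUERY = JOIN_TEMPLATE.format(
    aggregates="count(action) as cnt",
    condition="t2.cnt = 2",
)

# Hypothetical alternative: min and max differ only if both B and S occur.
MIN_MAX_QUERY = JOIN_TEMPLATE.format(
    aggregates="min(action) as lo, max(action) as hi",
    condition="t2.lo <> t2.hi",
)


def run(query):
    """Run a query against a fresh in-memory copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table yourTable ("
        "accountNum integer, date integer, status text, "
        "action text, qty integer, time text)"
    )
    conn.executemany("insert into yourTable values (?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print("count(action) = 2:", [r[0] for r in run(COUNT_QUERY)])
    print("min <> max:       ", [r[0] for r in run(MIN_MAX_QUERY)])
```

On this data the count-based query also returns the two B rows for account 2345, while the min/max variant returns only the rows for account 1234. The min/max comparison is sufficient here because the where clause already limits action to B and S.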
Source: "mysql - Compare 2 rows in a table", a similar question on Stack Overflow: https://stackoverflow.com/questions/43674377/