I have a table with the following columns:
application_uuid
changed_at_utc
changed_by
name
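I want to sort by application_uuid and changed_at_utc. Then I want to keep only the rows that come directly after a row where application_status has the text "Ready for Scoring".
Using Python and Pandas, I would do something like this...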
application_statuses = application_statuses.sort_values(['application_uuid', 'changed_at_utc'], ascending=[True, True]).reset_index(drop=True)
indexes = application_statuses[application_statuses['application_status']=='Ready for Scoring'].index + 1
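# .ix was removed in pandas 1.0; use .loc and keep only labels that exist (the last row has no successor)
next_statuses = application_statuses.loc[indexes.intersection(application_statuses.index)]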
How can I do the same thing in SQL?
Best answer
Based on your explanation, you can use the lead function to do this.
select next_application_status,application_uuid,changed_at_utc,changed_by
from (select t.*,
lead(application_status) over(order by application_uuid,changed_at_utc) as next_application_status
from tablename t
) t1
where application_status = 'Ready for Scoring'
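To see what the lead query above returns, here is a minimal sketch that runs it from Python against an in-memory SQLite database. The sample rows are hypothetical, SQLite stands in for SQL Server (this assumes SQLite 3.25 or later, which supports window functions), and an outer order by is added only to make the output order deterministic.

```python
import sqlite3

# Hypothetical sample data; the table name `tablename` follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename ("
    "application_uuid text, changed_at_utc text, "
    "changed_by text, application_status text)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?)",
    [
        ("a", "2017-01-01 10:00", "u1", "Submitted"),
        ("a", "2017-01-01 11:00", "u2", "Ready for Scoring"),
        ("a", "2017-01-01 12:00", "u3", "Scored"),
        ("b", "2017-01-02 09:00", "u1", "Submitted"),
        ("b", "2017-01-02 10:00", "u2", "Ready for Scoring"),
        ("c", "2017-01-03 09:00", "u4", "Submitted"),
    ],
)

# lead() looks one row ahead in the global (uuid, timestamp) ordering.
query = """
select next_application_status, application_uuid, changed_at_utc, changed_by
from (select t.*,
             lead(application_status)
                 over (order by application_uuid, changed_at_utc)
                 as next_application_status
      from tablename t
     ) t1
where application_status = 'Ready for Scoring'
order by application_uuid, changed_at_utc
"""
lead_rows = conn.execute(query).fetchall()
for row in lead_rows:
    print(row)
# ('Scored', 'a', '2017-01-01 11:00', 'u2')
# ('Submitted', 'b', '2017-01-02 10:00', 'u2')  <- status taken from application c
```

Note that the uuid, timestamp, and changed_by columns in the result belong to the "Ready for Scoring" row itself; only next_application_status comes from the following row. Without a partition, the last row of one application picks up the first status of the next application.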
If this must be done separately for each application_uuid, include a partition by in the lead call, as shown below.
select next_application_status,application_uuid,changed_at_utc,changed_by
from (select t.*,
lead(application_status) over(partition by application_uuid order by changed_at_utc) as next_application_status
from tablename t
) t1
where application_status = 'Ready for Scoring'
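The effect of partition by can be checked with a small sketch, again using Python's sqlite3 module as a stand-in for SQL Server (assuming SQLite 3.25 or later) and hypothetical sample rows. The second query is a variation that is not part of the answer: it uses lag instead of lead, so the result contains the columns of the following row itself, which is closer to what the Pandas snippet in the question selects.

```python
import sqlite3

# Hypothetical sample data; the table name `tablename` follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename ("
    "application_uuid text, changed_at_utc text, "
    "changed_by text, application_status text)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?)",
    [
        ("a", "2017-01-01 10:00", "u1", "Submitted"),
        ("a", "2017-01-01 11:00", "u2", "Ready for Scoring"),
        ("a", "2017-01-01 12:00", "u3", "Scored"),
        ("b", "2017-01-02 09:00", "u1", "Submitted"),
        ("b", "2017-01-02 10:00", "u2", "Ready for Scoring"),
        ("c", "2017-01-03 09:00", "u4", "Submitted"),
    ],
)

# Partitioned lead(): the look-ahead never crosses into another application.
lead_query = """
select next_application_status, application_uuid, changed_at_utc, changed_by
from (select t.*,
             lead(application_status)
                 over (partition by application_uuid order by changed_at_utc)
                 as next_application_status
      from tablename t
     ) t1
where application_status = 'Ready for Scoring'
order by application_uuid, changed_at_utc
"""
partitioned_rows = conn.execute(lead_query).fetchall()
print(partitioned_rows)
# [('Scored', 'a', '2017-01-01 11:00', 'u2'), (None, 'b', '2017-01-02 10:00', 'u2')]

# Variation (assumption, not from the answer): lag() marks each row with the
# previous status, so filtering on it returns the following rows themselves.
lag_query = """
select application_status, application_uuid, changed_at_utc, changed_by
from (select t.*,
             lag(application_status)
                 over (partition by application_uuid order by changed_at_utc)
                 as previous_application_status
      from tablename t
     ) t1
where previous_application_status = 'Ready for Scoring'
order by application_uuid, changed_at_utc
"""
following_rows = conn.execute(lag_query).fetchall()
print(following_rows)
# [('Scored', 'a', '2017-01-01 12:00', 'u3')]
```

With the partition in place, application b yields NULL (None in Python) because no status follows its "Ready for Scoring" row, rather than borrowing a status from application c.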
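If you need all rows after the row whose application_status is 'Ready for Scoring', get the timestamp of that row and select every row with a later timestamp. This assumes each application_uuid has at most one row with the 'Ready for Scoring' status.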
select application_status,application_uuid,changed_at_utc,changed_by
from (select t.*,
max(case when application_status='Ready for Scoring' then changed_at_utc end) over(partition by application_uuid) as status_time
from tablename t
) t1
where changed_at_utc > status_time
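As a rough check of the query above, the following sketch runs it through Python's sqlite3 module on hypothetical sample rows (SQLite 3.25 or later is assumed as a stand-in for SQL Server, and timestamps are stored as sortable text). An outer order by is added only so the output order is deterministic.

```python
import sqlite3

# Hypothetical sample data; the table name `tablename` follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename ("
    "application_uuid text, changed_at_utc text, "
    "changed_by text, application_status text)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?)",
    [
        ("a", "2017-01-01 10:00", "u1", "Submitted"),
        ("a", "2017-01-01 11:00", "u2", "Ready for Scoring"),
        ("a", "2017-01-01 12:00", "u3", "Scored"),
        ("a", "2017-01-01 13:00", "u4", "Approved"),
        ("b", "2017-01-02 09:00", "u1", "Submitted"),
        ("b", "2017-01-02 10:00", "u2", "Ready for Scoring"),
        ("c", "2017-01-03 09:00", "u4", "Submitted"),
    ],
)

# status_time holds, for every row of an application, the timestamp of its
# 'Ready for Scoring' row (NULL when the application has no such row).
query = """
select application_status, application_uuid, changed_at_utc, changed_by
from (select t.*,
             max(case when application_status = 'Ready for Scoring'
                      then changed_at_utc end)
                 over (partition by application_uuid) as status_time
      from tablename t
     ) t1
where changed_at_utc > status_time
order by application_uuid, changed_at_utc
"""
later_rows = conn.execute(query).fetchall()
for row in later_rows:
    print(row)
# ('Scored', 'a', '2017-01-01 12:00', 'u3')
# ('Approved', 'a', '2017-01-01 13:00', 'u4')
```

Application b contributes nothing because no row follows its "Ready for Scoring" row, and application c is dropped because its status_time is NULL, so the comparison in the where clause is never true.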
Regarding "SQL Server : Filter for only the rows directly after a row containing specific text", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42353662/