我正在尝试使用 Oracle SQL 编写一些逻辑,但很难做到正确。首先,我需要我的脚本来识别重复项。然后确定重复项中哪个是最新的。我正在使用的数据库在应用程序外部进行了大量手动数据插入。这会导致在使用 ID 编号时项目出现乱序(我使用 start date
和 ID
编号作为衡量顺序的方法,因为表格没有其他方法)。
如果我需要确定员工 12311 的最新角色,我该怎么做?
这是我目前所拥有的:
表格
ID | EMPLOYEE | ROLE | STARTDATE |
-----|----------|------------|-----------------------|
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z |
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z |
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z |
5432 | 23432 | Manager | 2012-11-02T00:00:00Z |
3452 | 23432 | Associate | 2011-04-23T00:00:00Z |
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z |
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z |
代码
SELECT "ID", "EMPLOYEE", "ROLE", "STARTDATE",
(CASE
WHEN RANK() OVER (PARTITION BY "EMPLOYEE" ORDER BY "STARTDATE" DESC) = 1
THEN 1
ELSE 0
END) "MAX DATE",
(CASE
WHEN RANK() OVER (PARTITION BY "EMPLOYEE" ORDER BY "ID" DESC) = 1
THEN 1
ELSE 0
END) "MAX ID"
FROM (
SELECT DISTINCT EMPLOYEE "E.EMPLOYEE",
E.ID "ID",
LR.DESCRIPTION "ROLE",
ROLE_START_DATE "STARTDATE"
FROM EMPLOYEES E
JOIN ROLES R ON E.EMPLOYEE_ID = R.EMPLOYEE_ID
JOIN LU_ROLES LR ON R.ROLE_ID = LR.ROLE_ID
WHERE ROLE_START_DATE <= DATE '2017-12-03')
ORDER BY 2
结果
ID | EMPLOYEE | ROLE | STARTDATE | MAX Date | Max ID
-----|----------|------------|----------------------------------|--------
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z | 1 | 0
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z | 1 | 0
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z | 0 | 1
5432 | 23432 | Manager | 2012-11-02T00:00:00Z | 1 | 1
3452 | 23432 | Associate | 2011-04-23T00:00:00Z | 0 | 0
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z | 1 | 1
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z | 0 | 0
我希望脚本只使用重复的 STARTDATES
,而不是查看每个员工的所有记录并确定最近的记录。
基本上,如果最近的 STARTDATE
重复,请确定哪个 ID
最高。
所以它应该是这样的:
ID | EMPLOYEE | ROLE | STARTDATE | MAX Date | Max ID
-----|----------|------------|----------------------------------|--------
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z | 1 | 1
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z | 1 | 0
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z | 0 | 0
5432 | 23432 | Manager | 2012-11-02T00:00:00Z | 1 | 1
3452 | 23432 | Associate | 2011-04-23T00:00:00Z | 0 | 0
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z | 1 | 1
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z | 0 | 0
我完全愿意接受更好的方法。如果您能提供任何帮助,我们将不胜感激!
用解决方案编辑:
为此感谢@Littlefoot。我能够修改我的脚本以包含以下内容:
SELECT "ID", "EMPLOYEE", "ROLE", "STARTDATE",
ROW_NUMBER() OVER (PARTITION BY "EMPLOYEE" ORDER BY "STARTDATE" DESC, "ID" DESC) RN
FROM (
SELECT DISTINCT EMPLOYEE "E.EMPLOYEE",
E.ID "ID",
LR.DESCRIPTION "ROLE",
ROLE_START_DATE "STARTDATE"
FROM EMPLOYEES E
JOIN ROLES R ON E.EMPLOYEE_ID = R.EMPLOYEE_ID
JOIN LU_ROLES LR ON R.ROLE_ID = LR.ROLE_ID
WHERE ROLE_START_DATE <= DATE '2017-12-03')
ORDER BY 2
结果:
ID | EMPLOYEE | ROLE | STARTDATE | RN |
-----|----------|------------|----------------------------------|
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z | 1 |
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z | 2 |
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z | 3 |
5432 | 23432 | Manager | 2012-11-02T00:00:00Z | 1 |
3452 | 23432 | Associate | 2011-04-23T00:00:00Z | 2 |
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z | 1 |
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z | 2 |
然后我按 RN=1
最佳答案
If I need to determine the most current role for Employee 12311, how would I do so?
RN最低的那个?为什么需要两个 MAX
列,而一个列本身就可以完成?如:
SQL> with test (id, empid, role, startdate) as
2 (select 3432, 12311, 'supervisor', date '2016-07-12' from dual union
3 select 3421, 12311, 'analyst' , date '2016-07-12' from dual union
4 select 4321, 12311, 'help desk' , date '2014-05-12' from dual union
5 --
6 select 5432, 23432, 'manager' , date '2012-11-02' from dual union
7 select 3452, 23432, 'associate' , date '2011-04-23' from dual
8 )
9 select id, empid, role, startdate,
10 row_number() over (partition by empid order by startdate desc, id desc) rn
11 from test;
ID EMPID ROLE STARTDATE RN
---------- ---------- ---------- ---------- ----------
3432 12311 supervisor 2016-07-12 1
3421 12311 analyst 2016-07-12 2
4321 12311 help desk 2014-05-12 3
5432 23432 manager 2012-11-02 1
3452 23432 associate 2011-04-23 2
SQL>
该查询将是另一个查询的来源,它使用 WHERE
子句,即
<snip>
9 select id, empid, role, startdate
10 from (select id, empid, role, startdate,
11 row_number() over (partition by empid order by startdate desc, id desc) rn
12 from test
13 )
14 where rn = 1;
ID EMPID ROLE STARTDATE
---------- ---------- ---------- ----------
3432 12311 supervisor 2016-07-12
5432 23432 manager 2012-11-02
SQL>
关于sql - 重复分析,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51617388/