gpt4 book ai didi

sql - 重复分析

转载 作者:行者123 更新时间:2023-12-02 06:27:49 25 4
gpt4 key购买 nike

我正在尝试使用 Oracle SQL 编写一些逻辑,但很难做到正确。首先,我需要我的脚本来识别重复项。然后确定重复项中哪个是最新的。我正在使用的数据库在应用程序外部进行了大量手动数据插入。这会导致在使用 ID 编号时项目出现乱序(我使用 start dateID 编号作为衡量顺序的方法,因为表格没有其他方法)。

如果我需要确定员工 12311 的最新角色,我该怎么做?

这是我目前所拥有的:

表格

  ID | EMPLOYEE |       ROLE |   STARTDATE           | 
-----|----------|------------|-----------------------|
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z |
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z |
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z |
5432 | 23432 | Manager | 2012-11-02T00:00:00Z |
3452 | 23432 | Associate | 2011-04-23T00:00:00Z |
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z |
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z |

代码

SELECT "ID", "EMPLOYEE", "ROLE", "STARTDATE",
(CASE
WHEN RANK() OVER (PARTITION BY "EMPLOYEE" ORDER BY "STARTDATE" DESC) = 1
THEN 1
ELSE 0
END) "MAX DATE",
(CASE
WHEN RANK() OVER (PARTITION BY "EMPLOYEE" ORDER BY "ID" DESC) = 1
THEN 1
ELSE 0
END) "MAX ID"
FROM (
SELECT DISTINCT EMPLOYEE "E.EMPLOYEE",
E.ID "ID",
LR.DESCRIPTION "ROLE",
ROLE_START_DATE "STARTDATE"
FROM EMPLOYEES E
JOIN ROLES R ON E.EMPLOYEE_ID = R.EMPLOYEE_ID
JOIN LU_ROLES LR ON R.ROLE_ID = LR.ROLE_ID
WHERE ROLE_START_DATE <= DATE '2017-12-03')
ORDER BY 2

结果

  ID | EMPLOYEE |       ROLE |   STARTDATE           | MAX Date | Max ID
-----|----------|------------|----------------------------------|--------
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z | 1 | 0
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z | 1 | 0
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z | 0 | 1
5432 | 23432 | Manager | 2012-11-02T00:00:00Z | 1 | 1
3452 | 23432 | Associate | 2011-04-23T00:00:00Z | 0 | 0
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z | 1 | 1
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z | 0 | 0

我希望脚本只使用重复的 STARTDATES,而不是查看每个员工的所有记录并确定最近的记录。

基本上,如果最近的 STARTDATE 重复,请确定哪个 ID 最高。

所以它应该是这样的:

  ID | EMPLOYEE |       ROLE |   STARTDATE           | MAX Date | Max ID
-----|----------|------------|----------------------------------|--------
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z | 1 | 1
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z | 1 | 0
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z | 0 | 0
5432 | 23432 | Manager | 2012-11-02T00:00:00Z | 1 | 1
3452 | 23432 | Associate | 2011-04-23T00:00:00Z | 0 | 0
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z | 1 | 1
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z | 0 | 0

我完全愿意接受更好的方法。如果您能提供任何帮助,我们将不胜感激!

用解决方案编辑:

为此感谢@Littlefoot。我能够修改我的脚本以包含以下内容:

   SELECT "ID", "EMPLOYEE", "ROLE", "STARTDATE",
ROW_NUMBER() OVER (PARTITION BY "EMPLOYEE" ORDER BY "STARTDATE" DESC, "ID" DESC) RN
FROM (
SELECT DISTINCT EMPLOYEE "E.EMPLOYEE",
E.ID "ID",
LR.DESCRIPTION "ROLE",
ROLE_START_DATE "STARTDATE"
FROM EMPLOYEES E
JOIN ROLES R ON E.EMPLOYEE_ID = R.EMPLOYEE_ID
JOIN LU_ROLES LR ON R.ROLE_ID = LR.ROLE_ID
WHERE ROLE_START_DATE <= DATE '2017-12-03')
ORDER BY 2

结果:

  ID | EMPLOYEE |       ROLE |   STARTDATE           |       RN | 
-----|----------|------------|----------------------------------|
3432 | 12311 | Supervisor | 2016-07-12T00:00:00Z | 1 |
3421 | 12311 | Analyst | 2016-07-12T00:00:00Z | 2 |
4321 | 12311 | Help Desk | 2014-05-12T00:00:00Z | 3 |
5432 | 23432 | Manager | 2012-11-02T00:00:00Z | 1 |
3452 | 23432 | Associate | 2011-04-23T00:00:00Z | 2 |
7652 | 54332 | Analyst | 2015-10-15T00:00:00Z | 1 |
5691 | 54332 | Assistant | 2013-10-15T00:00:00Z | 2 |

然后我按 RN=1

过滤我的结果

最佳答案

If I need to determine the most current role for Employee 12311, how would I do so?

RN最低的那个?为什么需要两个 MAX 列,而一个列本身就可以完成?如:

SQL> with test (id, empid, role, startdate) as
2 (select 3432, 12311, 'supervisor', date '2016-07-12' from dual union
3 select 3421, 12311, 'analyst' , date '2016-07-12' from dual union
4 select 4321, 12311, 'help desk' , date '2014-05-12' from dual union
5 --
6 select 5432, 23432, 'manager' , date '2012-11-02' from dual union
7 select 3452, 23432, 'associate' , date '2011-04-23' from dual
8 )
9 select id, empid, role, startdate,
10 row_number() over (partition by empid order by startdate desc, id desc) rn
11 from test;

ID EMPID ROLE STARTDATE RN
---------- ---------- ---------- ---------- ----------
3432 12311 supervisor 2016-07-12 1
3421 12311 analyst 2016-07-12 2
4321 12311 help desk 2014-05-12 3
5432 23432 manager 2012-11-02 1
3452 23432 associate 2011-04-23 2

SQL>

该查询将是另一个查询的来源,它使用 WHERE 子句,即

  <snip>
9 select id, empid, role, startdate
10 from (select id, empid, role, startdate,
11 row_number() over (partition by empid order by startdate desc, id desc) rn
12 from test
13 )
14 where rn = 1;

ID EMPID ROLE STARTDATE
---------- ---------- ---------- ----------
3432 12311 supervisor 2016-07-12
5432 23432 manager 2012-11-02

SQL>

关于sql - 重复分析,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51617388/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com