I have a table like this:
我有一张这样的桌子:
t104f005_employee_no |
t104f040_position_no |
t104f025_date_effective |
t104f030_date_to |
11354 |
89043 |
01/07/1999 |
01/04/2012 |
11354 |
89043 |
02/04/2012 |
31/08/2014 |
11354 |
89043 |
01/09/2014 |
21/09/2014 |
11354 |
89043 |
22/09/2014 |
27/09/2015 |
11354 |
23273 |
04/05/2015 |
27/09/2015 |
11354 |
94040 |
28/09/2015 |
10/01/2016 |
11354 |
94040 |
11/01/2016 |
08/05/2017 |
11354 |
94040 |
09/05/2017 |
24/03/2019 |
11354 |
99406 |
26/02/2018 |
01/07/2018 |
11354 |
99406 |
02/07/2018 |
30/09/2018 |
11354 |
99406 |
01/10/2018 |
01/11/2018 |
11354 |
97293 |
02/11/2018 |
25/11/2018 |
11354 |
99406 |
26/11/2018 |
10/03/2019 |
11354 |
97293 |
11/03/2019 |
24/03/2019 |
11354 |
99406 |
25/03/2019 |
30/09/2019 |
11354 |
99406 |
01/10/2019 |
15/03/2020 |
11354 |
97293 |
11/11/2019 |
24/11/2019 |
11354 |
99406 |
16/03/2020 |
10/05/2020 |
11354 |
101808 |
11/05/2020 |
14/06/2020 |
11354 |
99406 |
11/05/2020 |
14/06/2020 |
11354 |
101808 |
15/06/2020 |
04/10/2020 |
11354 |
99406 |
15/06/2020 |
04/10/2020 |
11354 |
101808 |
05/10/2020 |
25/10/2020 |
11354 |
99406 |
05/10/2020 |
21/11/2021 |
11354 |
101808 |
26/10/2020 |
31/01/2021 |
11354 |
101808 |
01/02/2021 |
28/02/2021 |
11354 |
101808 |
01/03/2021 |
30/06/2021 |
11354 |
101806 |
01/07/2021 |
18/07/2021 |
11354 |
101808 |
19/07/2021 |
21/11/2021 |
11354 |
101808 |
22/11/2021 |
28/11/2021 |
11354 |
99406 |
22/11/2021 |
28/11/2021 |
11354 |
101808 |
29/11/2021 |
20/03/2022 |
11354 |
101806 |
21/03/2022 |
17/04/2022 |
11354 |
101808 |
21/03/2022 |
22/05/2022 |
11354 |
101808 |
23/05/2022 |
09/04/2023 |
11354 |
101903 |
15/08/2022 |
12/02/2023 |
11354 |
105009 |
13/02/2023 |
28/02/2023 |
11354 |
105009 |
01/03/2023 |
09/04/2023 |
11354 |
101808 |
10/04/2023 |
NULL |
11354 |
80717 |
15/05/2023 |
12/05/2024 |
And I need to be able to identify the continuous position no periods with the min start and max end date outputted for each continuous period. Expected outcome would look like this:
我需要能够识别连续的位置没有期间输出的最小开始和最大结束日期为每个连续的期间。预期结果如下所示:
t104f005_employee_no |
t104f040_position_no |
t104f025_date_effective |
t104f030_date_to |
11354 |
89043 |
01/07/1999 |
27/09/2015 |
11354 |
23273 |
04/05/2015 |
27/09/2015 |
11354 |
94040 |
28/09/2015 |
24/03/2019 |
11354 |
99406 |
26/02/2018 |
01/11/2018 |
11354 |
97293 |
02/11/2018 |
25/11/2018 |
11354 |
99406 |
26/11/2018 |
10/03/2019 |
11354 |
97293 |
11/03/2019 |
24/03/2019 |
11354 |
99406 |
25/03/2019 |
28/11/2021 |
11354 |
97293 |
11/11/2019 |
24/11/2019 |
11354 |
101808 |
11/05/2020 |
30/06/2021 |
11354 |
101806 |
01/07/2021 |
18/07/2021 |
11354 |
101808 |
19/07/2021 |
NULL |
11354 |
101806 |
21/03/2022 |
17/04/2022 |
11354 |
101903 |
15/08/2022 |
12/02/2023 |
11354 |
105009 |
13/02/2023 |
28/02/2023 |
11354 |
105009 |
01/03/2023 |
09/04/2023 |
11354 |
80717 |
15/05/2023 |
12/05/2024 |
I have tried the following methods but it's not working out and I'm a bit stumped now.
我尝试了以下方法,但都不起作用,现在我有点困惑了。
WITH RankedPositions AS (
SELECT
[t104f005_employee_no],
[t104f040_position_no],
[t104f025_date_effective],
[t104f030_date_to],
ROW_NUMBER() OVER (PARTITION BY [t104f005_employee_no] ORDER BY [t104f025_date_effective]) -
ROW_NUMBER() OVER (PARTITION BY [t104f005_employee_no], [t104f040_position_no] ORDER BY [t104f025_date_effective]) AS grp
FROM [AUR11PROD].[dbo].[t104_employment_history] with (nolock)
WHERE [t104f005_employee_no] = '11354'
)
SELECT
[t104f005_employee_no],
[t104f040_position_no],
MIN([t104f025_date_effective]) AS min_startdate,
MAX([t104f030_date_to]) AS max_enddate
FROM RankedPositions
GROUP BY [t104f005_employee_no], [t104f040_position_no], grp
ORDER BY [t104f005_employee_no], min_startdate;
and this one
还有这个
WITH RECURSIVE ContinuousPositions AS (
SELECT
employee_no,
position_no,
startdate,
enddate
FROM your_table_name
WHERE NOT EXISTS (
SELECT 1
FROM your_table_name AS t2
WHERE t2.employee_no = your_table_name.employee_no
AND t2.position_no = your_table_name.position_no
AND t2.startdate < your_table_name.startdate
)
UNION ALL
SELECT
cp.employee_no,
cp.position_no,
cp.startdate,
t.enddate
FROM ContinuousPositions AS cp
JOIN your_table_name AS t ON (
cp.employee_no = t.employee_no
AND cp.position_no = t.position_no
AND cp.enddate = DATEADD(day, -1, t.startdate)
)
)
SELECT
employee_no,
position_no,
MIN(startdate) AS min_startdate,
MAX(enddate) AS max_enddate
FROM ContinuousPositions
GROUP BY employee_no, position_no
ORDER BY employee_no, min_startdate;
更多回答
have you tried window function?
你试过窗户功能了吗?
@BagusTesa yep tried that.
@BagusTesa YEP试过了。
优秀答案推荐
This is a gaps and islands problem. One way is to use LEAD()
or LAG()
to identify when the continuous date break and set a flag (g
). Performing a cumulative sum() on the flag gives you the required grouping (grp
).
这是一个缺口和岛屿问题。一种方法是使用Lead()或Lag()来标识连续日期何时中断,并设置一个标志(G)。对标志执行累积求和()将得到所需的分组(GRP)。
with
cte as
(
select employee_no, position_no, date_effective, date_expiry,
g = case when dateadd(day, -1, date_effective)
<> lag(date_expiry) over (partition by employee_no, position_no
order by date_effective)
then 1
else 0
end
from employment_history
),
cte2 as
(
select employee_no, position_no, date_effective,
date_expiry = isnull(date_expiry, '99991231'),
grp = sum(g) over (partition by employee_no, position_no
order by date_effective)
from cte
)
select employee_no, position_no,
date_effective = min(date_effective),
date_expiry = nullif(max(date_expiry), '99991231')
from cte2
group by employee_no, position_no, grp
order by employee_no, date_effective
db<>fiddle demo
DB<>小提琴演示
更多回答
Hey @squirrel, thank you! I tried this but I am getting this error. Msg 8127, Level 16, State 1, Line 27 Column "cte2.t104f025_date_effective" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause. I am on this version - Microsoft SQL Server 2016 (SP3-GDR) (KB5021129) - 13.0.6430.49 (X64) Jan 22 2023 17:38:22 Copyright (c) Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2016 Datacenter 10.0 <X64> (Build 14393: ) (Hypervisor)
嘿@松鼠,谢谢你!我试过了,但我收到了这个错误。消息8127,级别16,状态1,第27行列“cte2.t104f025_DATE_Efficient”在ORDER BY子句中无效,因为它既未包含在聚合函数中,也未包含在GROUP BY子句中。我正在使用此版本-Microsoft SQL Server2016(SP3-GDR)(KB5021129)-13.0.6430.49(X64)Jan 22 2023年17:38:22版权所有(C)Microsoft Corporation企业版:Windows Server 2016数据中心10.0(内部版本14393:)(Hypervisor)上的基于核心的许可(64位)
I am not using the exact same column name as yours as my brain can't process those additional t104...
prefix. Just compare the DDL and do a find and replace with your actual column name, it will work
我没有使用与您完全相同的列名,因为我的大脑无法处理这些额外的T104……前缀。只需比较DDL并使用您的实际列名进行查找和替换,它就会起作用
Cheers @squirrell. I must have edited something incorrectly the 1st time round.
干杯@松鼠。我第一次一定是编辑错了什么。
@DaleK - I did, still learning.
@Dalek-我做到了,还在学习。
我是一名优秀的程序员,十分优秀!