gpt4 book ai didi

sql - 在 T-SQL 中计算相交时间间隔

转载 作者:行者123 更新时间:2023-12-02 18:37:19 24 4
gpt4 key购买 nike

代码:

CREATE TABLE #Temp1 (CoachID INT, BusyST DATETIME, BusyET DATETIME)
CREATE TABLE #Temp2 (CoachID INT, AvailableST DATETIME, AvailableET DATETIME)

INSERT INTO #Temp1 (CoachID, BusyST, BusyET)
SELECT 1,'2016-08-17 09:12:00','2016-08-17 10:11:00'
UNION
SELECT 3,'2016-08-17 09:30:00','2016-08-17 10:00:00'
UNION
SELECT 4,'2016-08-17 12:07:00','2016-08-17 13:10:00'

INSERT INTO #Temp2 (CoachID, AvailableST, AvailableET)
SELECT 1,'2016-08-17 09:07:00','2016-08-17 11:09:00'
UNION
SELECT 2,'2016-08-17 09:11:00','2016-08-17 09:30:00'
UNION
SELECT 3,'2016-08-17 09:24:00','2016-08-17 13:08:00'
UNION
SELECT 1,'2016-08-17 11:34:00','2016-08-17 12:27:00'
UNION
SELECT 4,'2016-08-17 09:34:00','2016-08-17 13:00:00'
UNION
SELECT 5,'2016-08-17 09:10:00','2016-08-17 09:55:00'

--RESULT-SET QUERY GOES HERE

DROP TABLE #Temp1
DROP TABLE #Temp2

期望的输出:

CoachID CanCoachST                  CanCoachET                  NumOfCoaches
1 2016-08-17 09:12:00.000 2016-08-17 09:24:00.000 2 --(ID2 = 2,5)
1 2016-08-17 09:24:00.000 2016-08-17 09:30:00.000 3 --(ID2 = 2,3,5)
1 2016-08-17 09:30:00.000 2016-08-17 09:34:00.000 1 --(ID2 = 5)
1 2016-08-17 09:34:00.000 2016-08-17 09:55:00.000 2 --(ID2 = 4,5)
1 2016-08-17 09:55:00.000 2016-08-17 10:00:00.000 1 --(ID2 = 4)
1 2016-08-17 10:00:00.000 2016-08-17 10:11:00.000 2 --(ID2 = 3,4)
3 2016-08-17 09:30:00.000 2016-08-17 09:34:00.000 1 --(ID2 = 5)
3 2016-08-17 09:34:00.000 2016-08-17 09:55:00.000 2 --(ID2 = 4,5)
3 2016-08-17 09:55:00.000 2016-08-17 10:00:00.000 1 --(ID2 = 4)
4 2016-08-17 12:07:00.000 2016-08-17 12:27:00.000 2 --(ID2 = 1,3)
4 2016-08-17 12:27:00.000 2016-08-17 13:08:00.000 1 --(ID2 = 3)
4 2016-08-17 13:08:00.000 2016-08-17 13:10:00.000 0 --(No one is available)

目标:将 #Temp1 视为团队教练 (ID1) 及其 session 时间(ST1 = session 开始时间,ET1 = session 结束时间)的表。将 #Temp2 视为团队教练 (ID2) 及其总可用时间(ST2 = 可用开始时间,ET2 = 可用结束时间)的表。

现在,我们的目标是从 #Temp2 中找到所有可能的教练,这些教练可以在 #Temp1 的教练 session 时间进行指导。

例如,对于 ID1 = 1 的教练,他在 9:12 到 10:11 之间很忙(如果该信息很重要,数据可以跨越多天),我们有教练 ID2 = 2 和 5,可以在 9:12 到 9:24 之间教练,教练 ID2 = 2,3, 5 可以在 9:24 到 9:30 之间教练,教练 ID2 = 5,可以在 9:30 到 9:34 之间教练,教练 ID2 = 4 和 5,可以在 9:34 到 9:55 之间教练,教练ID2 = 4,可以在9:55到10:00之间教练,并且教练 ID2 = 3 和 4 可以在 10:00 到 10:11 之间进行教练(请注意,ID 3 虽然在 9:24 到 13:08 之间在 #Temp2 表中可用,但它无法在 10:00 到 13:08 之间进行教练 ID1 = 1 9:24 和 10:00,因为 9:30 到 10:00 之间也很忙。

到目前为止我的努力:到目前为止只处理打破 #Temp1 的时间范围。仍然需要弄清楚 A) 如何从输出中删除非繁忙时间窗口 B) 添加一个字段/将其映射到右侧 T1 的 CoachID。

;WITH ED
AS (SELECT BusyET, CoachID FROM #Temp1
UNION ALL
SELECT BusyST, CoachID FROM #Temp1
)
,Brackets
AS (SELECT MIN(BusyST) AS BusyST
,( SELECT MIN(BusyET)
FROM ED e
WHERE e.BusyET > MIN(BusyST)
) AS BusyET
FROM #Temp1 T
UNION ALL
SELECT B.BusyET
,e.BusyET
FROM Brackets B
INNER JOIN ED E ON B.BusyET < E.BusyET
WHERE NOT EXISTS (
SELECT *
FROM ED E2
WHERE E2.BusyET > B.BusyET
AND E2.BusyET < E.BusyET
)
)
SELECT *
FROM Brackets
ORDER BY BusyST;

我认为我需要加入比较 ID 彼此不匹配的两个表之间的 ST/ET 日期。但我无法弄清楚如何实际获取 session 时间窗口和唯一计数。

更新了更好的架构/数据集。另请注意,尽管 CoachID 4 未“计划”可用,但他在最后几分钟仍被列为忙碌。并且可能存在在这段时间没有其他人可以工作的情况,在这种情况下,我们可以返回 0 cnt 记录(如果真的很复杂,则不返回它)。

同样,我们的目标是找到所有可用 CoachID 的计数和组合及其可用时间窗口,以指导繁忙表中列出的 CoachID。

更新了更多与示例数据匹配的示例描述。

最佳答案

此答案中的查询受到 Packing Intervals 的启发。作者:Itzik Ben-Gan。

<小时/>

起初,我并不理解要求的全部复杂性,并假设 Table1Table2 中的间隔不重叠。我认为同一个教练不可能同时忙碌和空闲。

事实证明我的假设是错误的,因此我在下面留下的查询的第一个变体必须通过初步步骤进行扩展,该步骤从存储在 Table1 中的间隔中减去存储在 Table1 中的所有间隔表2

它使用了类似的想法。每个“可用”间隔的开始都用 +1 EventType 标记,“可用”间隔的结束用 -1 EventType 标记。对于“忙”间隔,标记相反。 “忙”间隔以 -1 开始,以 +1 结束。这是在 C1_Subtract 中完成的。

然后运行总计告诉我们“真正”可用的间隔在哪里 (C2_Subtract)。最后,CTE_Available 只留下“真正”可用的间隔。

示例数据

我添加了几行来说明如果没有可用的教练会发生什么。我还添加了 CoachID=9,它不在查询的第一个变体的初始结果中。

CREATE TABLE #Temp1 (CoachID INT, BusyST DATETIME, BusyET DATETIME);
CREATE TABLE #Temp2 (CoachID INT, AvailableST DATETIME, AvailableET DATETIME);
-- Start time is inclusive
-- End time is exclusive

INSERT INTO #Temp1 (CoachID, BusyST, BusyET) VALUES
(1, '2016-08-17 09:12:00','2016-08-17 10:11:00'),
(3, '2016-08-17 09:30:00','2016-08-17 10:00:00'),
(4, '2016-08-17 12:07:00','2016-08-17 13:10:00'),

(6, '2016-08-17 15:00:00','2016-08-17 16:00:00'),
(9, '2016-08-17 15:00:00','2016-08-17 16:00:00');

INSERT INTO #Temp2 (CoachID, AvailableST, AvailableET) VALUES
(1,'2016-08-17 09:07:00','2016-08-17 11:09:00'),
(2,'2016-08-17 09:11:00','2016-08-17 09:30:00'),
(3,'2016-08-17 09:24:00','2016-08-17 13:08:00'),
(1,'2016-08-17 11:34:00','2016-08-17 12:27:00'),
(4,'2016-08-17 09:34:00','2016-08-17 13:00:00'),
(5,'2016-08-17 09:10:00','2016-08-17 09:55:00'),

(7,'2016-08-17 15:10:00','2016-08-17 15:20:00'),
(8,'2016-08-17 15:15:00','2016-08-17 15:25:00'),
(7,'2016-08-17 15:40:00','2016-08-17 15:55:00'),
(9,'2016-08-17 15:05:00','2016-08-17 15:07:00'),
(9,'2016-08-17 15:40:00','2016-08-17 16:55:00');

CTE_Available的中间结果

+---------+-------------------------+-------------------------+
| CoachID | AvailableST | AvailableET |
+---------+-------------------------+-------------------------+
| 1 | 2016-08-17 09:07:00.000 | 2016-08-17 09:12:00.000 |
| 1 | 2016-08-17 10:11:00.000 | 2016-08-17 11:09:00.000 |
| 1 | 2016-08-17 11:34:00.000 | 2016-08-17 12:27:00.000 |
| 2 | 2016-08-17 09:11:00.000 | 2016-08-17 09:30:00.000 |
| 3 | 2016-08-17 09:24:00.000 | 2016-08-17 09:30:00.000 |
| 3 | 2016-08-17 10:00:00.000 | 2016-08-17 13:08:00.000 |
| 4 | 2016-08-17 09:34:00.000 | 2016-08-17 12:07:00.000 |
| 5 | 2016-08-17 09:10:00.000 | 2016-08-17 09:55:00.000 |
| 7 | 2016-08-17 15:10:00.000 | 2016-08-17 15:20:00.000 |
| 7 | 2016-08-17 15:40:00.000 | 2016-08-17 15:55:00.000 |
| 8 | 2016-08-17 15:15:00.000 | 2016-08-17 15:25:00.000 |
| 9 | 2016-08-17 16:00:00.000 | 2016-08-17 16:55:00.000 |
+---------+-------------------------+-------------------------+

现在,我们可以在查询的第一个变体中使用 CTE_Available 的这些中间结果,而不是 #Temp2。请参阅查询第一个变体下面的详细说明。

完整查询

WITH
C1_Subtract
AS
(
SELECT
CoachID
,AvailableST AS ts
,+1 AS EventType
FROM #Temp2

UNION ALL

SELECT
CoachID
,AvailableET AS ts
,-1 AS EventType
FROM #Temp2

UNION ALL

SELECT
CoachID
,BusyST AS ts
,-1 AS EventType
FROM #Temp1

UNION ALL

SELECT
CoachID
,BusyET AS ts
,+1 AS EventType
FROM #Temp1
)
,C2_Subtract AS
(
SELECT
C1_Subtract.*
,SUM(EventType)
OVER (
PARTITION BY CoachID
ORDER BY ts, EventType DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS cnt
,LEAD(ts)
OVER (
PARTITION BY CoachID
ORDER BY ts, EventType DESC)
AS NextTS
FROM C1_Subtract
)
,CTE_Available
AS
(
SELECT
C2_Subtract.CoachID
,C2_Subtract.ts AS AvailableST
,C2_Subtract.NextTS AS AvailableET
FROM C2_Subtract
WHERE cnt > 0
)
,CTE_Intervals
AS
(
SELECT
TBusy.CoachID AS BusyCoachID
,TBusy.BusyST
,TBusy.BusyET
,CA.CoachID AS AvailableCoachID
,CA.AvailableST
,CA.AvailableET
-- max of start time
,CASE WHEN CA.AvailableST < TBusy.BusyST
THEN TBusy.BusyST
ELSE CA.AvailableST
END AS ST
-- min of end time
,CASE WHEN CA.AvailableET > TBusy.BusyET
THEN TBusy.BusyET
ELSE CA.AvailableET
END AS ET
FROM
#Temp1 AS TBusy
CROSS APPLY
(
SELECT
TAvailable.*
FROM
CTE_Available AS TAvailable
WHERE
-- the same coach can't be available and busy
TAvailable.CoachID <> TBusy.CoachID
-- intervals intersect
AND TAvailable.AvailableST < TBusy.BusyET
AND TAvailable.AvailableET > TBusy.BusyST
) AS CA
)
,C1 AS
(
SELECT
BusyCoachID
,AvailableCoachID
,ST AS ts
,+1 AS EventType
FROM CTE_Intervals

UNION ALL

SELECT
BusyCoachID
,AvailableCoachID
,ET AS ts
,-1 AS EventType
FROM CTE_Intervals

UNION ALL

SELECT
CoachID AS BusyCoachID
,CoachID AS AvailableCoachID
,BusyST AS ts
,+1 AS EventType
FROM #Temp1

UNION ALL

SELECT
CoachID AS BusyCoachID
,CoachID AS AvailableCoachID
,BusyET AS ts
,-1 AS EventType
FROM #Temp1
)
,C2 AS
(
SELECT
C1.*
,SUM(EventType)
OVER (
PARTITION BY BusyCoachID
ORDER BY ts, EventType DESC, AvailableCoachID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
- 1 AS cnt
,LEAD(ts)
OVER (
PARTITION BY BusyCoachID
ORDER BY ts, EventType DESC, AvailableCoachID)
AS NextTS
FROM C1
)
SELECT
BusyCoachID AS CoachID
,ts AS CanCoachST
,NextTS AS CanCoachET
,cnt AS NumOfCoaches
FROM C2
WHERE ts <> NextTS
ORDER BY BusyCoachID, CanCoachST
;

最终结果

+---------+-------------------------+-------------------------+--------------+
| CoachID | CanCoachST | CanCoachET | NumOfCoaches |
+---------+-------------------------+-------------------------+--------------+
| 1 | 2016-08-17 09:12:00.000 | 2016-08-17 09:24:00.000 | 2 |
| 1 | 2016-08-17 09:24:00.000 | 2016-08-17 09:30:00.000 | 3 |
| 1 | 2016-08-17 09:30:00.000 | 2016-08-17 09:34:00.000 | 1 |
| 1 | 2016-08-17 09:34:00.000 | 2016-08-17 09:55:00.000 | 2 |
| 1 | 2016-08-17 09:55:00.000 | 2016-08-17 10:00:00.000 | 1 |
| 1 | 2016-08-17 10:00:00.000 | 2016-08-17 10:11:00.000 | 2 |
| 3 | 2016-08-17 09:30:00.000 | 2016-08-17 09:34:00.000 | 1 |
| 3 | 2016-08-17 09:34:00.000 | 2016-08-17 09:55:00.000 | 2 |
| 3 | 2016-08-17 09:55:00.000 | 2016-08-17 10:00:00.000 | 1 |
| 4 | 2016-08-17 12:07:00.000 | 2016-08-17 12:27:00.000 | 2 |
| 4 | 2016-08-17 12:27:00.000 | 2016-08-17 13:08:00.000 | 1 |
| 4 | 2016-08-17 13:08:00.000 | 2016-08-17 13:10:00.000 | 0 |
| 6 | 2016-08-17 15:00:00.000 | 2016-08-17 15:10:00.000 | 0 |
| 6 | 2016-08-17 15:10:00.000 | 2016-08-17 15:15:00.000 | 1 |
| 6 | 2016-08-17 15:15:00.000 | 2016-08-17 15:20:00.000 | 2 |
| 6 | 2016-08-17 15:20:00.000 | 2016-08-17 15:25:00.000 | 1 |
| 6 | 2016-08-17 15:25:00.000 | 2016-08-17 15:40:00.000 | 0 |
| 6 | 2016-08-17 15:40:00.000 | 2016-08-17 15:55:00.000 | 1 |
| 6 | 2016-08-17 15:55:00.000 | 2016-08-17 16:00:00.000 | 0 |
| 9 | 2016-08-17 15:00:00.000 | 2016-08-17 15:10:00.000 | 0 |
| 9 | 2016-08-17 15:10:00.000 | 2016-08-17 15:15:00.000 | 1 |
| 9 | 2016-08-17 15:15:00.000 | 2016-08-17 15:20:00.000 | 2 |
| 9 | 2016-08-17 15:20:00.000 | 2016-08-17 15:25:00.000 | 1 |
| 9 | 2016-08-17 15:25:00.000 | 2016-08-17 15:40:00.000 | 0 |
| 9 | 2016-08-17 15:40:00.000 | 2016-08-17 15:55:00.000 | 1 |
| 9 | 2016-08-17 15:55:00.000 | 2016-08-17 16:00:00.000 | 0 |
+---------+-------------------------+-------------------------+--------------+

我建议创建以下索引以避免执行计划中的某些排序。

CREATE UNIQUE NONCLUSTERED INDEX [IX_CoachID_BusyST] ON #Temp1
(
CoachID ASC,
BusyST ASC
);

CREATE UNIQUE NONCLUSTERED INDEX [IX_CoachID_BusyET] ON #Temp1
(
CoachID ASC,
BusyET ASC
);

CREATE UNIQUE NONCLUSTERED INDEX [IX_CoachID_AvailableST] ON #Temp2
(
CoachID ASC,
AvailableST ASC
);

CREATE UNIQUE NONCLUSTERED INDEX [IX_CoachID_AvailableET] ON #Temp2
(
CoachID ASC,
AvailableET ASC
);

但是,在实际数据上,瓶颈可能在其他地方,这可能取决于数据分布。该查询相当复杂,在没有实际数据的情况下对其进行调整将需要过多的猜测。

<小时/>

查询的第一个变体

逐步、CTE 到 CTE 运行查询并检查中间结果以了解其工作原理。

CTE_Intervals 为我们提供了与繁忙间隔相交的可用间隔列表。C1 将开始时间和结束时间与相应的 EventType 放在同一列中。这将帮助我们跟踪间隔何时开始或结束。EventType 的运行总数给出了可用教练的数量。 C1 将忙碌的教练加入其中,以便在没有教练可用时正确计算案例。

WITH
CTE_Intervals
AS
(
SELECT
TBusy.CoachID AS BusyCoachID
,TBusy.BusyST
,TBusy.BusyET
,CA.CoachID AS AvailableCoachID
,CA.AvailableST
,CA.AvailableET
-- max of start time
,CASE WHEN CA.AvailableST < TBusy.BusyST
THEN TBusy.BusyST
ELSE CA.AvailableST
END AS ST
-- min of end time
,CASE WHEN CA.AvailableET > TBusy.BusyET
THEN TBusy.BusyET
ELSE CA.AvailableET
END AS ET
FROM
#Temp1 AS TBusy
CROSS APPLY
(
SELECT
TAvailable.*
FROM
#Temp2 AS TAvailable
WHERE
-- the same coach can't be available and busy
TAvailable.CoachID <> TBusy.CoachID
-- intervals intersect
AND TAvailable.AvailableST < TBusy.BusyET
AND TAvailable.AvailableET > TBusy.BusyST
) AS CA
)
,C1 AS
(
SELECT
BusyCoachID
,AvailableCoachID
,ST AS ts
,+1 AS EventType
FROM CTE_Intervals

UNION ALL

SELECT
BusyCoachID
,AvailableCoachID
,ET AS ts
,-1 AS EventType
FROM CTE_Intervals

UNION ALL

SELECT
CoachID AS BusyCoachID
,CoachID AS AvailableCoachID
,BusyST AS ts
,+1 AS EventType
FROM #Temp1

UNION ALL

SELECT
CoachID AS BusyCoachID
,CoachID AS AvailableCoachID
,BusyET AS ts
,-1 AS EventType
FROM #Temp1
)
,C2 AS
(
SELECT
C1.*
,SUM(EventType)
OVER (
PARTITION BY BusyCoachID
ORDER BY ts, EventType DESC, AvailableCoachID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
- 1 AS cnt
,LEAD(ts)
OVER (
PARTITION BY BusyCoachID
ORDER BY ts, EventType DESC, AvailableCoachID)
AS NextTS
FROM C1
)
SELECT
BusyCoachID AS CoachID
,ts AS CanCoachST
,NextTS AS CanCoachET
,cnt AS NumOfCoaches
FROM C2
WHERE ts <> NextTS
ORDER BY BusyCoachID, CanCoachST
;

DROP TABLE #Temp1;
DROP TABLE #Temp2;

结果

我已为每行添加了注释,其中包含已统计的可用教练的 ID。

现在我明白为什么我的初始结果与您的预期结果不一样了。

+---------+---------------------+---------------------+--------------+
| CoachID | CanCoachST | CanCoachET | NumOfCoaches |
+---------+---------------------+---------------------+--------------+
| 1 | 2016-08-17 09:12:00 | 2016-08-17 09:24:00 | 2 | 2,5
| 1 | 2016-08-17 09:24:00 | 2016-08-17 09:30:00 | 3 | 2,3,5
| 1 | 2016-08-17 09:30:00 | 2016-08-17 09:34:00 | 2 | 3,5
| 1 | 2016-08-17 09:34:00 | 2016-08-17 09:55:00 | 3 | 3,4,5
| 1 | 2016-08-17 09:55:00 | 2016-08-17 10:11:00 | 2 | 3,4
| 3 | 2016-08-17 09:30:00 | 2016-08-17 09:34:00 | 2 | 1,5
| 3 | 2016-08-17 09:34:00 | 2016-08-17 09:55:00 | 3 | 1,4,5
| 3 | 2016-08-17 09:55:00 | 2016-08-17 10:00:00 | 2 | 1,4
| 4 | 2016-08-17 12:07:00 | 2016-08-17 12:27:00 | 2 | 3,1
| 4 | 2016-08-17 12:27:00 | 2016-08-17 13:08:00 | 1 | 3
| 4 | 2016-08-17 13:08:00 | 2016-08-17 13:10:00 | 0 | none
| 6 | 2016-08-17 15:00:00 | 2016-08-17 15:10:00 | 0 | none
| 6 | 2016-08-17 15:10:00 | 2016-08-17 15:15:00 | 1 | 7
| 6 | 2016-08-17 15:15:00 | 2016-08-17 15:20:00 | 2 | 7,8
| 6 | 2016-08-17 15:20:00 | 2016-08-17 15:25:00 | 1 | 8
| 6 | 2016-08-17 15:25:00 | 2016-08-17 15:40:00 | 0 | none
| 6 | 2016-08-17 15:40:00 | 2016-08-17 15:55:00 | 1 | 7
| 6 | 2016-08-17 15:55:00 | 2016-08-17 16:00:00 | 0 | none
+---------+---------------------+---------------------+--------------+

关于sql - 在 T-SQL 中计算相交时间间隔,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39009645/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com