
sql-server - Removing consecutive duplicate values occurring within a date-based timeline

Reposted. Author: 行者123. Updated: 2023-12-03 02:08:00

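I have a table containing date-based user actions. The table serves as a timeline of events. The following example shows how two people changed their job roles over time: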

DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'),
(2, 200, 'Promoted', '2008-01-01');

SELECT * FROM @tbl ORDER BY UserID, ActionDate DESC;

This gives the following, with the most recent events shown first:

[Image: all nine rows, ordered by UserID, then ActionDate descending]

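I need to display the table in reverse date order, but with any event removed when it immediately follows an earlier event with the same [UserID/ActionID]. For example, if someone is promoted and then promoted again, the second promotion is left out of the results because it counts as a duplicate of the previous action.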

The desired output is therefore:

[Image: desired output]

After some research, I tried using ROW_NUMBER() to identify the duplicates:

SELECT
*,
ROW_NUMBER() OVER (PARTITION BY UserID, ActionID ORDER BY ActionDate ASC) AS RowNum
FROM
@tbl
ORDER BY
UserID, ActionDate DESC;

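...but it doesn't quite work, because the numbering does not reset after each different action. I may be overthinking this, but I'm struggling for inspiration: searches return countless questions where people simply remove duplicates from a list.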

Accepted answer

I would use LEAD to eliminate the unwanted rows.

USE tempdb;

DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'),
(2, 200, 'Promoted', '2008-01-01');

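;WITH src AS
(
    -- Ordered newest-first, so LEAD returns the ActionID of the user's previous (older) event.
    SELECT t.*
        , l = LEAD(t.ActionID) OVER (PARTITION BY t.UserID ORDER BY t.ActionDate DESC)
    FROM @tbl t
)
SELECT src.UserID
    , src.ActionID
    , src.ActionDesc
    , src.ActionDate
FROM src
WHERE src.l <> src.ActionID   -- the previous event was a different action
    OR src.l IS NULL          -- no previous event: the user's earliest row
ORDER BY src.UserID, src.ActionDate DESC;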

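Because the window is ordered by ActionDate DESC, LEAD returns the ActionID of the same user's chronologically previous event. The WHERE clause in the query above eliminates a row from the output when that previous event has the same ActionID as the current row. src.l IS NULL keeps each user's earliest row, which has no previous event to compare against.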
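To make the filter easier to follow, here is a minimal sketch that exposes the value LEAD returns for each row instead of filtering on it. It uses only the first person's rows from the sample data; the column names PrevActionID and Kept are illustrative and not part of the original answer.

```sql
-- Sketch: show what LEAD sees for each row (first person's rows only).
DECLARE @tbl TABLE (
    UserID int,
    ActionID int,
    ActionDesc nvarchar(50),
    ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 200, 'Promoted', '2002-02-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (1, 200, 'Promoted', '2005-03-01'),
    (1, 200, 'Promoted', '2006-03-01');

;WITH src AS
(
    SELECT t.*
        -- PrevActionID (illustrative name): ActionID of the next-older event for this user.
        , PrevActionID = LEAD(t.ActionID) OVER (PARTITION BY t.UserID ORDER BY t.ActionDate DESC)
    FROM @tbl t
)
SELECT src.ActionDate
    , src.ActionID
    , src.PrevActionID
    -- Kept (illustrative name): 1 when the WHERE clause of the answer would keep the row.
    -- A NULL PrevActionID makes the comparison unknown, so the ELSE branch returns 1.
    , Kept = CASE WHEN src.PrevActionID = src.ActionID THEN 0 ELSE 1 END
FROM src
ORDER BY src.ActionDate DESC;

-- ActionDate   ActionID  PrevActionID  Kept
-- 2006-03-01   200       200           0
-- 2005-03-01   200       300           1
-- 2004-03-01   300       200           1
-- 2002-02-01   200       200           0
-- 2001-01-01   200       200           0
-- 2000-01-01   200       NULL          1
```

The rows with Kept = 1 are the three rows the answer's query returns for the first person.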

Results:

╔════════╦══════════╦════════════╦═════════════════════════╗
║ UserID ║ ActionID ║ ActionDesc ║       ActionDate        ║
╠════════╬══════════╬════════════╬═════════════════════════╣
║      1 ║      200 ║ Promoted   ║ 2005-03-01 00:00:00.000 ║
║      1 ║      300 ║ Moved      ║ 2004-03-01 00:00:00.000 ║
║      1 ║      200 ║ Promoted   ║ 2000-01-01 00:00:00.000 ║
║      2 ║      200 ║ Promoted   ║ 2008-01-01 00:00:00.000 ║
║      2 ║      300 ║ Moved      ║ 2007-01-01 00:00:00.000 ║
║      2 ║      200 ║ Promoted   ║ 2006-01-01 00:00:00.000 ║
╚════════╩══════════╩════════════╩═════════════════════════╝

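For tables with a large number of rows, you want to keep the number of aggregates used in the query as low as possible; LEAD provides this functionality with just one aggregate. The execution plan for my version: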

[Image: execution plan]
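If you want to check the cost on your own data rather than rely on a screenshot, one option is to run the query with SET STATISTICS IO and SET STATISTICS TIME switched on. The sketch below is an assumption-laden illustration, not part of the original answer: it swaps the table variable for a temporary table with the hypothetical name #Actions, and adds a hypothetical clustered index whose key order matches the window's PARTITION BY and ORDER BY, which may let the optimizer avoid a separate sort.

```sql
-- Sketch: measure the LEAD query against a temp table (names are illustrative).
CREATE TABLE #Actions (
    UserID int NOT NULL,
    ActionID int NOT NULL,
    ActionDesc nvarchar(50) NOT NULL,
    ActionDate datetime NOT NULL
);

-- Key order mirrors PARTITION BY UserID ORDER BY ActionDate DESC.
CREATE CLUSTERED INDEX IX_Actions_User_Date
    ON #Actions (UserID, ActionDate DESC);

INSERT INTO #Actions (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (2, 200, 'Promoted', '2006-01-01'),
    (2, 300, 'Moved',    '2007-01-01');

-- Report logical reads and CPU/elapsed time on the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

WITH src AS
(
    SELECT a.*
        , l = LEAD(a.ActionID) OVER (PARTITION BY a.UserID ORDER BY a.ActionDate DESC)
    FROM #Actions a
)
SELECT src.UserID
    , src.ActionID
    , src.ActionDesc
    , src.ActionDate
FROM src
WHERE src.l <> src.ActionID
    OR src.l IS NULL
ORDER BY src.UserID, src.ActionDate DESC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

DROP TABLE #Actions;
```

With only a handful of rows the numbers are meaningless; the measurement only becomes useful once the table is loaded with a realistic volume of data.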

Regarding sql-server - Removing consecutive duplicate values occurring within a date-based timeline, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52223565/
