gpt4 book ai didi

postgresql - 窗口函数 LAG 可以引用正在计算其值的列吗?

转载 作者:行者123 更新时间:2023-11-29 11:19:46 25 4
gpt4 key购买 nike

我需要根据当前记录的某些其他列和先前记录的 X 值(使用一些分区和顺序)计算某些列 X 的值。基本上我需要在表单中实现查询

SELECT <some fields>, 
<some expression using LAG(X) OVER(PARTITION BY ... ORDER BY ...) AS X
FROM <table>

这是不可能的,因为在窗口函数中只能使用现有的列,所以我正在寻找克服这个问题的方法。

这是一个例子。我有一张有事件的 table 。每个事件都有 typetime_stamp

create table event (id serial, type integer, time_stamp integer);

我不想找到“重复”事件(跳过它们)。重复是指以下内容。让我们按 time_stamp 升序排列给定 type 的所有事件。然后

  1. 第一个事件不是重复的
  2. 非重复之后的所有事件并且在它之后的某个时间范围内(即它们的 time_stamp 不大于前一个非重复的 time_stamp 加上一些常量 TIMEFRAME) 是重复的
  3. 下一个 time_stamp 比上一个非重复事件多于 TIMEFRAME 的事件不重复
  4. 等等

对于这个数据

insert into event (type, time_stamp) 
values
(1, 1), (1, 2), (2, 2), (1,3), (1, 10), (2,10),
(1,15), (1, 21), (2,13),
(1, 40);

TIMEFRAME=10 结果应该是

time_stamp | type | duplicate
-----------------------------
1 | 1 | false
2 | 1 | true
3 | 1 | true
10 | 1 | true
15 | 1 | false
21 | 1 | true
40 | 1 | false
2 | 2 | false
10 | 2 | true
13 | 2 | false

我可以根据当前 time_stamp 和上一个非重复事件的 time_stamp 计算 duplicate 字段的值,如下所示:

WITH evt AS (
SELECT
time_stamp,
CASE WHEN
time_stamp - LAG(current_non_dupl_time_stamp) OVER w >= TIMEFRAME
THEN
time_stamp
ELSE
LAG(current_non_dupl_time_stamp) OVER w
END AS current_non_dupl_time_stamp
FROM event
WINDOW w AS (PARTITION BY type ORDER BY time_stamp ASC)
)
SELECT time_stamp, time_stamp != current_non_dupl_time_stamp AS duplicate

但这不起作用,因为计算的字段不能在 LAG 中引用:

ERROR:  column "current_non_dupl_time_stamp" does not exist.

那么问题来了:我可以重写这个查询来达到我需要的效果吗?

最佳答案

朴素的递归链编织器:


        -- temp view to avoid nested CTE
CREATE TEMP VIEW drag AS
SELECT e.type,e.time_stamp
, ROW_NUMBER() OVER www as rn -- number the records
, FIRST_VALUE(e.time_stamp) OVER www as fst -- the "group leader"
, EXISTS (SELECT * FROM event x
WHERE x.type = e.type
AND x.time_stamp < e.time_stamp) AS is_dup
FROM event e
WINDOW www AS (PARTITION BY type ORDER BY time_stamp)
;

WITH RECURSIVE ttt AS (
SELECT d0.*
FROM drag d0 WHERE d0.is_dup = False -- only the "group leaders"
UNION ALL
SELECT d1.type, d1.time_stamp, d1.rn
, CASE WHEN d1.time_stamp - ttt.fst > 20 THEN d1.time_stamp
ELSE ttt.fst END AS fst -- new "group leader"
, CASE WHEN d1.time_stamp - ttt.fst > 20 THEN False
ELSE True END AS is_dup
FROM drag d1
JOIN ttt ON d1.type = ttt.type AND d1.rn = ttt.rn+1
)
SELECT * FROM ttt
ORDER BY type, time_stamp
;

结果:


CREATE TABLE
INSERT 0 10
CREATE VIEW
type | time_stamp | rn | fst | is_dup
------+------------+----+-----+--------
1 | 1 | 1 | 1 | f
1 | 2 | 2 | 1 | t
1 | 3 | 3 | 1 | t
1 | 10 | 4 | 1 | t
1 | 15 | 5 | 1 | t
1 | 21 | 6 | 1 | t
1 | 40 | 7 | 40 | f
2 | 2 | 1 | 2 | f
2 | 10 | 2 | 2 | t
2 | 13 | 3 | 2 | t
(10 rows)

关于postgresql - 窗口函数 LAG 可以引用正在计算其值的列吗?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/34338991/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com