gpt4 book ai didi

sql - 如何在某个值差距之前识别每组的行?

转载 作者:行者123 更新时间:2023-12-04 08:35:46 25 4
gpt4 key购买 nike

我想根据 PostgreSQL 中相邻行之间另一列值的差异更新表中的某个列。
这是一个测试设置:

CREATE TABLE test(
main INTEGER,
sub_id INTEGER,
value_t INTEGER);

INSERT INTO test (main, sub_id, value_t)
VALUES
(1,1,8),
(1,2,7),
(1,3,3),
(1,4,85),
(1,5,40),
(2,1,3),
(2,2,1),
(2,3,1),
(2,4,8),
(2,5,41);
我的目标是确定每组 mainsub_id 1 开始 diff 中的哪个值通过 sub_id 按升序检查超过某个阈值(例如 <10 或 >-10) .在达到阈值之前,我想标记每个通过的行 条件为 FALSE 的一行按填写栏 newval带有一个值,例如 1 .
enter image description here
我应该使用循环还是有更智能的解决方案?
伪代码中的任务描述:
FOR i in GROUP [PARTITION BY main ORDER BY sub_id]:
DO until diff > 10 OR diff <-10
SET newval = 1 AND LEAD(newval) = 1

最佳答案

基础版 SELECT尽可能快地:

SELECT *, bool_and(diff BETWEEN -10 AND 10) OVER (PARTITION BY main ORDER BY sub_id) AS flag
FROM (
SELECT *, value_t - lag(value_t, 1, value_t) OVER (PARTITION BY main ORDER BY sub_id) AS diff
FROM test
) sub;
细点
  • 您的思维模型围绕窗口函数演进 lead() .但它的对应物 lag() 为此目的更有效,因为在包含大间隙之前的行时不会出现逐一错误。 或者 , 使用 lead()使用倒序排序 ( ORDER BY sub_id DESC )。
  • 避免 NULL对于分区中的第一行,提供 value_t默认为第三个参数,这使得差异 0而不是 NULL。两者 lead()lag()有那个能力。
  • diff BETWEEN -10 AND 10略快于 @diff < 11 (也更清晰、更灵活)。 ( @ being the "absolute value" operator ,相当于 abs() function 。)
  • bool_or() or bool_and() 在外窗函数中,将所有行标记为大间隙可能是最便宜的。

  • 您的 UPDATE

    Until the threshold is reached I would like to flag every passed row AND the one row where the condition is FALSE by filling column newval with a value e.g. 1.


    再次,尽可能快。
    UPDATE test AS t
    SET newval = 1
    FROM (
    SELECT main, sub_id
    , bool_and(diff BETWEEN -10 AND 10) OVER (PARTITION BY main ORDER BY sub_id) AS flag
    FROM (
    SELECT main, sub_id
    , value_t - lag(value_t, 1, value_t) OVER (PARTITION BY main ORDER BY sub_id) AS diff
    FROM test
    ) sub
    ) u
    WHERE (t.main, t.sub_id) = (u.main, u.sub_id)
    AND u.flag;
    细点
  • 在单个查询中计算所有值通常比相关子查询快得多。
  • 添加的 WHERE 条件 AND u.flag确保我们只更新实际需要更新的行。
    如果某些行可能已经在 newval 中具有正确的值, 添加另一个子句以避免那些空更新:AND t.newval IS DISTINCT FROM 1看:
  • How do I (or can I) SELECT DISTINCT on multiple columns?

  • SET newval = 1分配一个常数(即使在这种情况下我们可以使用实际计算的值),这会便宜一些。

  • 分贝<> fiddle here

    关于sql - 如何在某个值差距之前识别每组的行?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/64810641/

    25 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com