
sql - Conditional cumulative average in PostgreSQL

Reposted. Author: 行者123. Updated: 2023-11-29 11:34:24
I have a simple table, defined as:

    DROP TABLE IF EXISTS running_averages;

    CREATE TABLE running_averages
    (
        avg_id SERIAL NOT NULL PRIMARY KEY,
        num1   integer,
        num2   integer DEFAULT 0
    );

    INSERT INTO running_averages(num1, num2)
    SELECT 100, 100 UNION ALL
    SELECT 200, 175 UNION ALL
    SELECT -400, NULL UNION ALL
    SELECT 300, 200 UNION ALL
    SELECT -100, NULL;

In this table, whenever num1 is negative, num2 should be set to the cumulative average as of the previous row. My current query is:

    SELECT *,
           num1 * num2 AS current_total,
           SUM(num1 * num2) OVER(order by avg_id) AS cumulative_sum,
           SUM(num1) OVER(order by avg_id) AS cumulative_num1,
           CASE WHEN num1 > 0 THEN
                SUM(num1 * num2) OVER(order by avg_id)
                / SUM(num1) OVER(order by avg_id)
           ELSE
                0
           END AS cumulative_average
    FROM running_averages;

Result:

    avg_id  num1  num2  current_total  cumulative_sum  cumulative_num1  cumulative_average
         1   100   100         10,000          10,000              100                 100
         2   200   175         35,000          45,000              300                 150
         3  -400  NULL           NULL          45,000             -100                   0
         4   300   200         60,000         105,000              200                 525
         5  -100  NULL           NULL         105,000              100                   0

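I can't figure out how to get the previous row's cumulative average when num1 in the current row is negative. Instead of the output above, the expected output is: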

    avg_id  num1  num2  current_total  cumulative_sum  cumulative_num1  cumulative_average
         1   100   100         10,000          10,000              100                 100
         2   200   175         35,000          45,000              300                 150
         3  -400   150        -60,000         -15,000             -100                 150
         4   300   200         60,000          45,000              200                 225
         5  -100   225        -22,500          22,500              100                 225

How can I get the column's value from the previous row in this case?

Edit:

I have edited the SQL script above. I really like the approach in Gordon Linoff's answer. Unfortunately, with the changed script it produces incorrect results:

    avg_id  num1  num2  new_num2
         1   100   100       100
         2   200   175       175
         3  -400   150       150  (correct)
         4   300   200       200
         5  -100   225        50  (incorrect)

Edit 2

I also tested Multisync's answer, and it produces incorrect results too:

    avg_id  num1  num2  current_total  cumulative_sum  cumulative_num1  cumulative_average
         1   100   100         10,000          10,000              100                 100
         2   200   175         35,000          45,000              300                 150
         3  -400   150        -60,000         -15,000             -100                 150  (num2 correct)
         4   300   200         60,000          45,000              200                 225
         5  -100   175        -17,500          27,500              100                 275  (num2 incorrect)

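Edit 3

I have accepted Multisync's updated answer because it produces the correct result. I would also like to know how a query like this can be improved when it contains many aggregate and window functions. Any references on this topic would be helpful.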

Best answer

The only approach I can think of is a recursive query:

    with recursive tmp (avg_id, num1, num2, sum_m, sum_num1, last_id) as (
        -- anchor row: start the running totals from the first row
        select avg_id, num1, num2, num1 * num2, num1, avg_id
        from running_averages
        where avg_id = 1
        union all
        -- recursive step: take the next row and extend the running totals
        select r.avg_id, r.num1,
               -- negative num1: use the running average so far as num2
               case when r.num1 < 0 then t.sum_m / t.sum_num1 else r.num2 end,
               t.sum_m + case when r.num1 < 0 then t.sum_m / t.sum_num1 else r.num2 end * r.num1,
               t.sum_num1 + r.num1,
               r.avg_id
        from running_averages r
        join tmp t on r.avg_id = t.last_id + 1
    )
    select avg_id, num1, num2,
           num1 * num2 AS current_total,
           SUM(num1 * num2) OVER(order by avg_id) AS cumulative_sum,
           SUM(num1) OVER(order by avg_id) AS cumulative_num1,
           SUM(num1 * num2) OVER(order by avg_id)
           / SUM(num1) OVER(order by avg_id) AS cumulative_average
    from tmp;
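
The final SELECT above repeats the same OVER(order by avg_id) clause four times. As a possible tidy-up, and one that may also bear on the question's request for ways to improve queries with many window functions, PostgreSQL's WINDOW clause lets a window be defined once and referenced by name. A minimal sketch against the base table running_averages from the question (the window name w is an arbitrary choice, not something from the original posts):

```sql
-- Name the window once, then reuse it for every running aggregate.
-- Assumes the running_averages table from the question already exists.
SELECT avg_id, num1, num2,
       num1 * num2             AS current_total,
       SUM(num1 * num2) OVER w AS cumulative_sum,
       SUM(num1)        OVER w AS cumulative_num1
FROM running_averages
WINDOW w AS (ORDER BY avg_id);  -- "w" is a hypothetical name
```

In the recursive query, the same WINDOW clause would go after "from tmp". It affects readability only, not the result.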

For the recursive query to work, avg_id must contain consecutive numbers (you can use row_number() instead; I did not, to keep things simple).
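
The row_number() variant mentioned here might look like the sketch below. It is an illustration rather than the answerer's code: the CTE name ordered and the column rn are hypothetical, and the recursion steps over rn instead of avg_id, so gaps in avg_id no longer matter. It keeps the original's integer division, and it still assumes the first row has a non-NULL num2 and that the running sum of num1 is never zero when a negative row is reached.

```sql
WITH RECURSIVE ordered AS (
    -- rn is a gap-free 1, 2, 3, ... sequence even if avg_id has gaps
    SELECT avg_id, num1, num2,
           row_number() OVER (ORDER BY avg_id) AS rn
    FROM running_averages
),
tmp (avg_id, num1, num2, sum_m, sum_num1, rn) AS (
    -- anchor row: the first row in avg_id order
    SELECT avg_id, num1, num2, num1 * num2, num1, rn
    FROM ordered
    WHERE rn = 1
    UNION ALL
    -- recursive step: join each row to its predecessor by rn
    SELECT o.avg_id, o.num1,
           CASE WHEN o.num1 < 0 THEN t.sum_m / t.sum_num1 ELSE o.num2 END,
           t.sum_m + CASE WHEN o.num1 < 0 THEN t.sum_m / t.sum_num1 ELSE o.num2 END * o.num1,
           t.sum_num1 + o.num1,
           o.rn
    FROM ordered o
    JOIN tmp t ON o.rn = t.rn + 1
)
SELECT avg_id, num1, num2,
       sum_m    AS cumulative_sum,
       sum_num1 AS cumulative_num1
FROM tmp
ORDER BY avg_id;
```

Because tmp already carries the running totals in sum_m and sum_num1, this sketch selects them directly instead of recomputing them with window functions. On the sample data it should fill in num2 = 150 for row 3 and num2 = 225 for row 5, matching the expected output in the question.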

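num2 changes during the computation, which is why I can't think of anything other than a recursive query (the output of one step is the input of the next).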
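
To make that dependency concrete, here is the computation on the sample data written out step by step. The symbols are introduced only for this illustration: S_k is the running sum of num1 * num2 through row k (sum_m in the query) and N_k is the running sum of num1 (sum_num1).

```latex
\begin{aligned}
S_1 &= 100 \cdot 100 = 10000, & N_1 &= 100 \\
S_2 &= 10000 + 200 \cdot 175 = 45000, & N_2 &= 300 \\
\text{num2}_3 &= S_2 / N_2 = 150 \\
S_3 &= 45000 + (-400) \cdot 150 = -15000, & N_3 &= -100 \\
S_4 &= -15000 + 300 \cdot 200 = 45000, & N_4 &= 200 \\
\text{num2}_5 &= S_4 / N_4 = 225 \\
S_5 &= 45000 + (-100) \cdot 225 = 22500, & N_5 &= 100
\end{aligned}
```

The value filled in for row 5 depends on S_4, which in turn depends on the value filled in for row 3. A window function over the stored num2 column cannot see those filled-in values, which is why each row has to be computed from the previous row's result.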

Regarding "sql - Conditional cumulative average in PostgreSQL", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26961586/
