
sql - Why does a field need to be included in GROUP BY when using OVER (PARTITION BY x)?

Repost. Author: 行者123. Updated: 2023-11-29 11:27:58

I have a table and want a simple sum of one field, grouped by two columns. I then also want the total of all values for each year_num.

See the example: http://rextester.com/QSLRS68794

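With the commented-out line enabled, this query throws: "42803: column "foo.num_cust" must appear in the GROUP BY clause or be used in an aggregate function", and I don't understand why. Why does an aggregate function used with OVER (PARTITION BY x) require the summed field to be in the GROUP BY?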

select 
year_num
,age_bucket
,sum(num_cust)
--,sum(num_cust) over (partition by year_num) --THROWS ERROR!!
from
foo
group by
year_num
,age_bucket
order by 1,2

Table:

| loc_id |  year_num |  gen |  cust_category |  cust_age |  num_cust |  age_bucket |
|--------|-----------|------|----------------|-----------|-----------|-------------|
| 1 | 2016 | M | cash | 41 | 2 | 04_<45 |
| 1 | 2016 | F | Prepaid | 41 | 1 | 03_<35 |
| 1 | 2016 | F | cc | 61 | 1 | 05_45+ |
| 1 | 2016 | F | cc | 19 | 2 | 02_<25 |
| 1 | 2016 | M | cc | 64 | 1 | 05_45+ |
| 1 | 2016 | F | cash | 46 | 1 | 05_45+ |
| 1 | 2016 | F | cash | 27 | 3 | 03_<35 |
| 1 | 2016 | M | cash | 42 | 1 | 04_<45 |
| 1 | 2017 | F | cc | 35 | 1 | 04_<45 |
| 1 | 2017 | F | cc | 37 | 1 | 04_<45 |
| 1 | 2017 | F | cash | 46 | 1 | 05_45+ |
| 1 | 2016 | F | cash | 19 | 4 | 02_<25 |
| 1 | 2017 | M | cash | 43 | 1 | 04_<45 |
| 1 | 2017 | M | cash | 29 | 1 | 03_<35 |
| 1 | 2016 | F | cc | 13 | 1 | 01_<18 |
| 1 | 2017 | F | cash | 16 | 2 | 01_<18 |
| 1 | 2016 | F | cc | 17 | 2 | 01_<18 |
| 1 | 2016 | M | cc | 17 | 2 | 01_<18 |
| 1 | 2017 | F | cash | 18 | 9 | 02_<25 |

Desired output:

| year_num | age_bucket | sum | sum over (year_num) |
|----------|------------|-----|---------------------|
| 2016 | 01_<18 | 5 | 21 |
| 2016 | 02_<25 | 6 | 21 |
| 2016 | 03_<35 | 4 | 21 |
| 2016 | 04_<45 | 3 | 21 |
| 2016 | 05_45+ | 3 | 21 |
| 2017 | 01_<18 | 2 | 16 |
| 2017 | 02_<25 | 9 | 16 |
| 2017 | 03_<35 | 1 | 16 |
| 2017 | 04_<45 | 3 | 16 |
| 2017 | 05_45+ | 1 | 16 |

Best answer

You need to nest the sum():

select year_num, age_bucket, sum(num_cust),
sum(sum(num_cust)) over (partition by year_num) --WORKS!!
from foo
group by year_num, age_bucket
order by 1, 2;

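Why? Because the window function is not doing the aggregation. Its argument has to be an expression that can be evaluated after the group by (this is an aggregation query, so the window function only sees the grouped rows). num_cust is not a group by key, so it has to be wrapped in an aggregate function: the inner sum(num_cust) is the ordinary aggregate for each (year_num, age_bucket) group, and the outer sum(...) over (partition by year_num) is the window function that adds those group sums up per year.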
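To make that evaluation order concrete, here is a minimal Python sketch (not SQL, and not how any database implements it) that mimics the two stages by hand. The function name `group_then_window` is hypothetical, and the rows are only a six-row subset of the question's table, so the totals differ from the desired output above.

```python
from collections import defaultdict

# (year_num, age_bucket, num_cust): a six-row subset of the question's table.
rows = [
    (2016, "01_<18", 1),
    (2016, "01_<18", 2),
    (2016, "02_<25", 2),
    (2016, "02_<25", 4),
    (2017, "01_<18", 2),
    (2017, "02_<25", 9),
]


def group_then_window(rows):
    # Stage 1 (GROUP BY year_num, age_bucket): one sum(num_cust) per group.
    grouped = defaultdict(int)
    for year_num, age_bucket, num_cust in rows:
        grouped[(year_num, age_bucket)] += num_cust

    # Stage 2 (window step): only the grouped rows exist at this point, so the
    # per-year total is built from the group sums, i.e. sum(sum(num_cust)).
    year_total = defaultdict(int)
    for (year_num, _bucket), group_sum in grouped.items():
        year_total[year_num] += group_sum

    # One output row per group, with the year total repeated on each row.
    return [
        (year_num, age_bucket, group_sum, year_total[year_num])
        for (year_num, age_bucket), group_sum in sorted(grouped.items())
    ]


result = group_then_window(rows)
for row in result:
    print(row)
```

In stage 2 there is no individual num_cust value left to read, only one sum per group, which is the situation the error message describes.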

Perhaps this is clearer if you use a subquery:

select year_num, age_bucket, sum_num_cust,
sum(sum_num_cust) over (partition by year_num)
from (select year_num, age_bucket, sum(num_cust) as sum_num_cust
from foo
group by year_num, age_bucket
) ya
order by 1, 2;

These two queries do exactly the same thing. With the subquery, though, it should be more obvious why the extra aggregation is needed.
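As a quick check, the subquery form can be run against the question's data. The sketch below is only an assumption-laden harness: it uses Python's built-in sqlite3 module with an in-memory database instead of the PostgreSQL setup implied by the error message, it assumes the bundled SQLite is 3.25 or newer (window function support), and it loads only the three columns the query reads. The `year_total` alias and the `run_subquery_form` helper are names added here for illustration.

```python
import sqlite3

# (year_num, age_bucket, num_cust) for all 19 rows of the question's table;
# the other columns are left out because the query never reads them.
ROWS = [
    (2016, "04_<45", 2), (2016, "03_<35", 1), (2016, "05_45+", 1),
    (2016, "02_<25", 2), (2016, "05_45+", 1), (2016, "05_45+", 1),
    (2016, "03_<35", 3), (2016, "04_<45", 1), (2017, "04_<45", 1),
    (2017, "04_<45", 1), (2017, "05_45+", 1), (2016, "02_<25", 4),
    (2017, "04_<45", 1), (2017, "03_<35", 1), (2016, "01_<18", 1),
    (2017, "01_<18", 2), (2016, "01_<18", 2), (2016, "01_<18", 2),
    (2017, "02_<25", 9),
]

# The subquery form from the answer, with an added alias for the window column.
SUBQUERY_FORM = """
select year_num, age_bucket, sum_num_cust,
       sum(sum_num_cust) over (partition by year_num) as year_total
from (select year_num, age_bucket, sum(num_cust) as sum_num_cust
      from foo
      group by year_num, age_bucket
     ) ya
order by 1, 2
"""


def run_subquery_form():
    # Build a throwaway in-memory table named foo and run the query on it.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table foo (year_num integer, age_bucket text, num_cust integer)"
        )
        conn.executemany("insert into foo values (?, ?, ?)", ROWS)
        return conn.execute(SUBQUERY_FORM).fetchall()
    finally:
        conn.close()


result = run_subquery_form()
for row in result:
    print(row)
```

The rows returned should line up with the desired output table in the question: five buckets per year, with 21 repeated for 2016 and 16 for 2017.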

Regarding "sql - Why does a field need to be included in GROUP BY when using OVER (PARTITION BY x)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46612628/
