
google-bigquery - Percentile function with GROUP BY in BigQuery

Reposted. Author: 行者123. Updated: 2023-12-03 22:16:23

In my census table, I want to group by state and get, for each state, the median county population and the number of counties.

In psql, Redshift, and Snowflake, I can do this:

psql=> SELECT state, count(county), PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "population2000") AS median FROM CENSUS GROUP BY state;
  state   | count | median
----------+-------+--------
 Alabama  |    67 |  36583
 Alaska   |    24 | 7296.5
 Arizona  |    15 | 116320
 Arkansas |    75 |  20229
...
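To make the target result concrete, here is a minimal Python sketch of what the grouped query above computes: a row count and a linearly interpolated median per state. The interpolation explains a value like 7296.5 for Alaska, whose 24 counties have no single middle row. The helper names `percentile_cont` and `summarize` and the sample rows are made up for illustration, and the sketch assumes no NULL counties or populations.

```python
from collections import defaultdict


def percentile_cont(values, fraction):
    """Interpolated percentile of a non-empty list of numbers.

    Mirrors the usual PERCENTILE_CONT definition: find the fractional
    position in the sorted values and interpolate between its neighbours.
    """
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def summarize(rows):
    """Return {state: (county_count, median_population)}."""
    populations = defaultdict(list)
    for state, _county, population in rows:
        populations[state].append(population)
    return {
        state: (len(pops), percentile_cont(pops, 0.5))
        for state, pops in populations.items()
    }


# Hypothetical sample data: (state, county, population2000)
rows = [
    ("A", "a1", 100),
    ("A", "a2", 300),
    ("A", "a3", 200),
    ("B", "b1", 10),
    ("B", "b2", 20),
]
summary = summarize(rows)
```

With an odd number of counties the median is the middle value; with an even number it falls halfway between the two middle values.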

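I am trying to find a good way to do this in BigQuery standard SQL. I noticed that an undocumented percentile_cont analytic function is available, but I had to resort to some major hacks to make it do what I want.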

I would like to be able to do the same thing, using what I gather are the correct arguments:
SELECT
  state,
  COUNT(county),
  PERCENTILE_CONT(population2000, 0.5) OVER () AS `medPop`
FROM
  CENSUS
GROUP BY
  state;

But this query produces the error
SELECT list expression references column population2000 which is neither grouped nor aggregated at

I can get the answer I want, but I would be very disappointed if this were the recommended way to do it:
SELECT
  MAX(nCounties) AS nCounties,
  state,
  MAX(medPop) AS medPop
FROM (
  SELECT
    nCounties,
    T1.state,
    (PERCENTILE_CONT(population2000, 0.5) OVER (PARTITION BY T1.state)) AS `medPop`
  FROM
    census T1
  LEFT OUTER JOIN (
    SELECT
      COUNT(county) AS `nCounties`,
      state
    FROM
      census
    GROUP BY
      state) T2
  ON
    T1.state = T2.state) T3
GROUP BY
  state

Is there a better way to do what I want? Also, will the PERCENTILE_CONT function ever be documented?

Thanks for reading!

Best answer

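Thanks for your interest. PERCENTILE_CONT is under development, and we will publish documentation once it is officially released. We will support it first as an analytic function, and we plan to support it later as an aggregate function (allowing GROUP BY). Between the two releases, a simpler workaround is: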

SELECT
  state,
  ANY_VALUE(nCounties) AS nCounties,
  ANY_VALUE(medPop) AS medPop
FROM (
  SELECT
    state,
    COUNT(county) OVER (PARTITION BY state) AS nCounties,
    PERCENTILE_CONT(population2000, 0.5) OVER (PARTITION BY state) AS medPop
  FROM
    CENSUS)
GROUP BY
  state
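The shape of this workaround can be sketched in Python to show why ANY_VALUE is safe here: the inner query stamps the same count and median onto every row of a state, so the outer GROUP BY can keep any one of them. This is only an illustration of the idea, not how BigQuery executes it. The function name `window_then_collapse` and the sample rows are hypothetical, and `statistics.median` stands in for PERCENTILE_CONT(..., 0.5), which it matches for the 0.5 case only.

```python
from statistics import median


def window_then_collapse(rows):
    """Emulate the two query levels on (state, county, population) rows."""
    # Collect the populations of each partition (PARTITION BY state).
    partitions = {}
    for state, _county, population in rows:
        partitions.setdefault(state, []).append(population)

    # Inner query: every input row gets its partition's count and median.
    windowed = [
        (state, len(partitions[state]), median(partitions[state]))
        for state, _county, _population in rows
    ]

    # Outer query: GROUP BY state, keeping the first row seen per state,
    # which plays the role of ANY_VALUE.
    collapsed = {}
    for state, n_counties, med_pop in windowed:
        collapsed.setdefault(state, (n_counties, med_pop))
    return windowed, collapsed


# Hypothetical sample data: (state, county, population2000)
rows = [
    ("A", "a1", 100),
    ("A", "a2", 300),
    ("A", "a3", 200),
    ("B", "b1", 10),
    ("B", "b2", 20),
]
windowed, collapsed = window_then_collapse(rows)
```

The intermediate `windowed` list has one entry per county, which is why the collapse step is needed to get back to one row per state.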

This post, google-bigquery - Percentile function with GROUP BY in BigQuery, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/45579692/
