
sql - How to fix a wrong most-repeated value in PostgreSQL

Reposted. Author: 行者123. Updated: 2023-11-29 14:36:09

I have a query that selects the most repeated value from my_table. The query is as follows:

SELECT
    gid,
    max_height
FROM (
    SELECT gid, max_height,
           ROW_NUMBER() OVER (PARTITION BY gid ORDER BY freq DESC) AS rn
    FROM (
        SELECT gid, max_height, COUNT(id) AS freq
        FROM my_table
        GROUP BY 1, 2
        ORDER BY 1, 2
    ) hgt_freq
) ranked_hgt_req
WHERE rn = 1

my_table contains three columns, for example:

gid id  max_height
3 1 19.3
3 2 19.3
3 3 20.3
3 4 20.3
3 5 19.3
3 6 19.3
3 7 21.4
3 8 21.4
3 9 21.4
3 10 21.4
3 11 21.4
3 12 21.4
22 1 23.1
22 2 23.1
22 3 23.1
22 4 23.1
22 5 23.1
22 6 23.1
22 7 22.1
22 8 22.1
22 9 22.1
22 10 22.1
22 11 22.1
22 12 22.1
29 1 24
29 2 24
29 3 24
29 4 18.9
29 5 18.9
29 6 18.9
29 7 NULL
29 8 NULL
29 9 27.1
29 10 27.1
29 11 6.5
29 12 6.5

The problem is that the query ranks the values by descending frequency only, which gives the wrong value for gid = 22. The output of the query is:

gid    max_height
3 21.4
22 22.1
29 24.0

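For gid = 22 there are two most repeated values, 23.1 and 22.1, so the query should return 23.1. Can anyone point out how to solve this, or is there a better way to approach it? The process needs to be automated for a large number of records (gids).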

Best answer

Use distinct on:

select distinct on (gid) gid, max_height
from (
    select gid, max_height, count(id) as freq
    from my_table
    group by 1, 2
) s
order by gid, freq desc

 gid | max_height
-----+------------
   3 |       21.4
  22 |       23.1
  29 |         24
(3 rows)

From the documentation:

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
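To make the quoted rule concrete, here is a minimal sketch that feeds the two tied rows for gid = 22 straight into DISTINCT ON. The inline VALUES list and the alias s are illustrative stand-ins for the aggregated subquery and are not part of the original answer.

```sql
-- Two rows with the same gid and the same frequency (a tie).
-- DISTINCT ON (gid) keeps whichever row ORDER BY places first.
SELECT DISTINCT ON (gid) gid, max_height, freq
FROM (VALUES (22, 23.1, 6),
             (22, 22.1, 6)) AS s (gid, max_height, freq)
ORDER BY gid, freq DESC, max_height DESC;  -- last key breaks the tie
```

With max_height DESC as the final sort key, the row with 23.1 sorts first and is the one kept. Without that key, PostgreSQL is free to keep either row.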


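gid = 29 has two most frequent values (24 and 18.9, three rows each), as does gid = 22. In such cases you can choose which one is returned by adding a tie-breaking expression to the order by: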

select distinct on (gid) gid, max_height
from (
    select gid, max_height, count(id) as freq
    from my_table
    group by 1, 2
) s
order by gid, freq desc, max_height desc;

 gid | max_height
-----+------------
   3 |       21.4
  22 |       23.1
  29 |         24
(3 rows)

select distinct on (gid) gid, max_height
from (
    select gid, max_height, count(id) as freq
    from my_table
    group by 1, 2
) s
order by gid, freq desc, max_height;

 gid | max_height
-----+------------
   3 |       21.4
  22 |       22.1
  29 |       18.9
(3 rows)
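The same tie-break should also work in the question's original ROW_NUMBER() query, if you would rather keep the window-function form. The following is a sketch under that assumption, not part of the original answer. The CTE named my_table holds only the sample rows for gid 22 and 29 so that the statement runs on its own; remove the CTE to query the real table.

```sql
-- Sample rows for gid 22 and 29 only. This CTE shadows the real my_table
-- so the statement is self-contained; drop it to run against actual data.
WITH my_table (gid, id, max_height) AS (
    VALUES
        (22, 1, 23.1), (22, 2, 23.1), (22, 3, 23.1),
        (22, 4, 23.1), (22, 5, 23.1), (22, 6, 23.1),
        (22, 7, 22.1), (22, 8, 22.1), (22, 9, 22.1),
        (22, 10, 22.1), (22, 11, 22.1), (22, 12, 22.1),
        (29, 1, 24), (29, 2, 24), (29, 3, 24),
        (29, 4, 18.9), (29, 5, 18.9), (29, 6, 18.9),
        (29, 7, NULL::numeric), (29, 8, NULL::numeric),
        (29, 9, 27.1), (29, 10, 27.1),
        (29, 11, 6.5), (29, 12, 6.5)
)
SELECT gid, max_height
FROM (
    SELECT gid, max_height,
           ROW_NUMBER() OVER (
               PARTITION BY gid
               -- Second key breaks frequency ties: the larger height wins.
               ORDER BY freq DESC, max_height DESC
           ) AS rn
    FROM (
        -- One row per (gid, max_height) with its frequency.
        SELECT gid, max_height, COUNT(id) AS freq
        FROM my_table
        GROUP BY gid, max_height
    ) hgt_freq
) ranked_hgt_freq
WHERE rn = 1
ORDER BY gid;
```

On these sample rows the query returns 23.1 for gid = 22 and 24 for gid = 29. Note that PostgreSQL sorts NULLs first in descending order, so a NULL max_height that tied for the top frequency would win the tie; appending NULLS LAST to max_height DESC avoids that if it is not what you want.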

Regarding sql - How to fix a wrong most-repeated value in PostgreSQL, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44467033/
