gpt4 book ai didi

postgresql - 提高复杂 PostgreSQL 查询的性能

转载 作者:行者123 更新时间:2023-11-29 12:26:46 25 4
gpt4 key购买 nike

我正在尝试弄清楚如何缩短此查询的时间。有人告诉我有关 EXPLAIN ANALYZE 的信息,但我不知道如何解释结果以及要进行哪些修复。有什么建议么?请注意,我使用的是第三方数据库 (cartoDB),因此我认为我没有创建索引的选项。

这是查询。这里面的两个表大概40行,32000行左右。

EXPLAIN ANALYZE SELECT
id, identifier,
CASE
WHEN dist < 8046. THEN 1
WHEN dist < 16093. THEN 2
WHEN dist < 40233. THEN 3
WHEN dist < 80467. THEN 4
WHEN dist < 160934. THEN 5
ELSE 6
END AS grp,
count(*)
FROM (
SELECT s.id, s.identifier, ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist
FROM full_data_for_testing_deid_2 c, demo_locations_table s) AS loc_dist
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3

这是对 EXECUTE ANALYZE 的响应:

{
"fields" : {
"QUERY PLAN" : {
"type" : "string"
}
},
"rows" : [
{
"QUERY PLAN" : "GroupAggregate (cost=373146.40..651612.12 rows=1058805 width=128) (actual time=34120.054..37536.893 rows=197 loops=1)"
},
{
"QUERY PLAN" : " -> Sort (cost=373146.40..373675.81 rows=1058805 width=128) (actual time=34120.000..36504.439 rows=1058805 loops=1)"
},
{
"QUERY PLAN" : " Sort Key: s.id, s.identifier, (CASE WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 8046::double precision) THEN 1 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 16093::double precision) THEN 2 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 40233::double precision) THEN 3 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 80467::double precision) THEN 4 WHEN (_st_distance(geography(s.the_geom), geography(c.the_geom), 0::double precision, false) < 160934::double precision) THEN 5 ELSE 6 END)"
},
{
"QUERY PLAN" : " Sort Method: external merge Disk: 35200kB"
},
{
"QUERY PLAN" : " -> Nested Loop (cost=0.00..283194.48 rows=1058805 width=128) (actual time=0.688..13487.097 rows=1058805 loops=1)"
},
{
"QUERY PLAN" : " -> Seq Scan on full_data_for_testing_deid_2 c (cost=0.00..6845.26 rows=32085 width=32) (actual time=0.006..130.054 rows=32085 loops=1)"
},
{
"QUERY PLAN" : " -> Materialize (cost=0.00..1.13 rows=33 width=96) (actual time=0.001..0.028 rows=33 loops=32085)"
},
{
"QUERY PLAN" : " -> Seq Scan on demo_locations_table s (cost=0.00..1.10 rows=33 width=96) (actual time=0.003..0.034 rows=33 loops=1)"
},
{
"QUERY PLAN" : "Total runtime: 37569.205 ms"
}
],
"time" : 37.574,
"total_rows" : 9
}

最佳答案

问题出在笛卡尔积上: 选择 s.id, s.identifier, ST_Distance_Sphere(s.the_geom, c.the_geom) AS dist FROM full_data_for_testing_deid_2 c, demo_locations_table s

接下来是嵌套循环。我不认为你想在这里做笛卡尔。您可以通过更具体的 JOIN ON 轻松切断一些不必要的循环。两点之间的距离是交换函数。因此,只需向其中添加以下条件即可:c.pk > s.pk 根据您的需要(没有关于架构设计的信息)

关于postgresql - 提高复杂 PostgreSQL 查询的性能,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/31481798/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com