
performance - Direct query much slower than a subquery using a join


I have two tables. Their structure is roughly as follows; however, I have changed the names.

CREATE TABLE overlay_polygon
(
  overlay_polygon_id SERIAL PRIMARY KEY,
  some_other_polygon_id INTEGER REFERENCES some_other_polygon (some_other_polygon_id),
  dollar_value NUMERIC,
  geom GEOMETRY(Polygon,26915)
);

CREATE TABLE point
(
  point_id SERIAL PRIMARY KEY,
  some_other_polygon_id INTEGER REFERENCES some_other_polygon (some_other_polygon_id),
  -- A bunch of other fields that this query won't touch
  geom GEOMETRY(Point,26915)
);

point has a spatial index named spix_point on its geom column, and an index on its some_other_polygon_id column as well.
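
For reference, the corresponding index definitions would look roughly like this (a sketch only; the GiST access method and the name of the second index are assumptions, not taken from the original post):

-- hypothetical DDL matching the indexes described above
CREATE INDEX spix_point ON point USING GIST (geom);                          -- spatial index
CREATE INDEX point_some_other_polygon_id_idx ON point (some_other_polygon_id); -- plain b-tree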

There are about 500,000 rows in point, and nearly every row in point intersects some row in overlay_polygon. Originally, my overlay_polygon table contained a number of rows with very small areas (mostly less than 1 square meter) that did not spatially intersect any row in point. After deleting the small rows that intersect nothing, 38 rows remain.

As its name suggests, overlay_polygon is a table of polygons produced by overlaying the polygons of 3 other tables (including some_other_polygon). In particular, I need to do some calculations using dollar_value and some of the columns on point. When I set out to delete the rows that intersect no points, in order to speed up future processing, I ended up querying for a COUNT of rows. The most obvious query seemed to be the following.

SELECT op.*, COUNT(point_id) AS num_points
FROM overlay_polygon op
LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id AND ST_Intersects(op.geom, point.geom)
GROUP BY op.overlay_polygon_id
ORDER BY op.overlay_polygon_id
;

Here is its EXPLAIN (ANALYZE, BUFFERS):

GroupAggregate  (cost=544.45..545.12 rows=38 width=8049) (actual time=284962.944..540959.914 rows=38 loops=1)
  Buffers: shared hit=58694 read=17119, temp read=189483 written=189483
  I/O Timings: read=39171.525
  ->  Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=271754.952..534154.573 rows=415224 loops=1)
        Sort Key: op.overlay_polygon_id
        Sort Method: external merge  Disk: 897016kB
        Buffers: shared hit=58694 read=17119, temp read=189483 written=189483
        I/O Timings: read=39171.525
        ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8049) (actual time=0.110..46755.284 rows=415224 loops=1)
              Buffers: shared hit=58694 read=17119
              I/O Timings: read=39171.525
              ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8045) (actual time=0.043..153.255 rows=38 loops=1)
                    Buffers: shared hit=1 read=10
                    I/O Timings: read=152.866
              ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=50.229..1139.868 rows=10927 loops=38)
                    Index Cond: (op.geom && geom)
                    Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                    Rows Removed by Filter: 13353
                    Buffers: shared hit=58693 read=17109
                    I/O Timings: read=39018.660
Total runtime: 542172.156 ms

However, I found that this query runs very, very quickly:

SELECT *
FROM overlay_polygon
JOIN (SELECT op.overlay_polygon_id, COUNT(point_id) AS num_points
FROM overlay_polygon op
LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id AND ST_Intersects(op.geom, point.geom)
GROUP BY op.overlay_polygon_id
) x ON x.overlay_polygon_id = overlay_polygon.overlay_polygon_id
ORDER BY overlay_polygon.overlay_polygon_id
;

Its EXPLAIN (ANALYZE, BUFFERS) is below.

Sort  (cost=557.78..557.88 rows=38 width=8057) (actual time=18904.661..18904.748 rows=38 loops=1)
  Sort Key: overlay_polygon.overlay_polygon_id
  Sort Method: quicksort  Memory: 126kB
  Buffers: shared hit=58690 read=17134
  I/O Timings: read=9924.328
  ->  Hash Join  (cost=544.88..556.78 rows=38 width=8057) (actual time=18903.697..18904.210 rows=38 loops=1)
        Hash Cond: (overlay_polygon.overlay_polygon_id = op.overlay_polygon_id)
        Buffers: shared hit=58690 read=17134
        I/O Timings: read=9924.328
        ->  Seq Scan on overlay_polygon  (cost=0.00..11.38 rows=38 width=8045) (actual time=0.127..0.411 rows=38 loops=1)
              Buffers: shared hit=2 read=9
              I/O Timings: read=0.173
        ->  Hash  (cost=544.41..544.41 rows=38 width=12) (actual time=18903.500..18903.500 rows=38 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 2kB
              Buffers: shared hit=58688 read=17125
              I/O Timings: read=9924.154
              ->  HashAggregate  (cost=543.65..544.03 rows=38 width=8) (actual time=18903.276..18903.379 rows=38 loops=1)
                    Buffers: shared hit=58688 read=17125
                    I/O Timings: read=9924.154
                    ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8) (actual time=0.052..17169.606 rows=415224 loops=1)
                          Buffers: shared hit=58688 read=17125
                          I/O Timings: read=9924.154
                          ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8038) (actual time=0.004..0.537 rows=38 loops=1)
                                Buffers: shared hit=1 read=10
                                I/O Timings: read=0.279
                          ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=4.422..381.991 rows=10927 loops=38)
                                Index Cond: (op.geom && geom)
                                Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                                Rows Removed by Filter: 13353
                                Buffers: shared hit=58687 read=17115
                                I/O Timings: read=9923.875
Total runtime: 18905.293 ms

As you can see, they have comparable cost estimates, though I'm not sure how accurate those estimates are. I'm skeptical of cost estimates that involve PostGIS functions. Both tables have had VACUUM ANALYZE FULL run on them since their last modification and before the queries were run.
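
(For reference, a sketch of that maintenance step in standard PostgreSQL syntax, assuming it was run on each table individually:)

VACUUM FULL ANALYZE overlay_polygon;
VACUUM FULL ANALYZE point;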

Perhaps I simply cannot read my EXPLAIN ANALYZE output, but I don't understand why these queries have such wildly different run times. Can anyone spot anything? The only possibility I can think of involves the number of columns in the LEFT JOIN.
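
A sketch of one way to test that last hypothesis: keep only the key column on the left side so the rows being joined, sorted, and aggregated stay narrow (this is essentially the inner part of the fast query, with an ORDER BY added):

SELECT op.overlay_polygon_id, COUNT(point.point_id) AS num_points
FROM overlay_polygon op
LEFT JOIN point ON op.some_other_polygon_id = point.some_other_polygon_id
               AND ST_Intersects(op.geom, point.geom)
GROUP BY op.overlay_polygon_id
ORDER BY op.overlay_polygon_id;

If this form also runs fast, the roughly 8 kB row width contributed by op.* in the direct query is the likely factor.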

Edit 1

Following @ChrisTravers' suggestion, I increased work_mem and reran the first query. I don't consider it a significant improvement.

I executed

SET work_mem='4MB';

(It was previously 1 MB.)

and then ran the first query again, which gave these results.

GroupAggregate  (cost=544.45..545.12 rows=38 width=8049) (actual time=339910.046..495775.478 rows=38 loops=1)
  Buffers: shared hit=58552 read=17261, temp read=112133 written=112133
  ->  Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=325391.923..491329.208 rows=415224 loops=1)
        Sort Key: op.overlay_polygon_id
        Sort Method: external merge  Disk: 896904kB
        Buffers: shared hit=58552 read=17261, temp read=112133 written=112133
        ->  Nested Loop Left Join  (cost=0.00..543.46 rows=38 width=8049) (actual time=14.698..234266.573 rows=415224 loops=1)
              Buffers: shared hit=58552 read=17261
              ->  Seq Scan on overlay_polygon op  (cost=0.00..11.38 rows=38 width=8045) (actual time=14.612..15.384 rows=38 loops=1)
                    Buffers: shared read=11
              ->  Index Scan using spix_point on point  (cost=0.00..13.99 rows=1 width=200) (actual time=95.262..5451.636 rows=10927 loops=38)
                    Index Cond: (op.geom && geom)
                    Filter: ((op.some_other_polygon_id = some_other_polygon_id) AND _st_intersects(op.geom, geom))
                    Rows Removed by Filter: 13353
                    Buffers: shared hit=58552 read=17250
Total runtime: 496936.775 ms

Edit 2

OK, there is a bad smell here that I hadn't noticed before (mostly because I have trouble reading the ANALYZE output). Sorry I didn't spot it sooner.

Sort  (cost=544.45..544.55 rows=38 width=8049) (actual time=271754.952..534154.573 rows=415224 loops=1)

Estimated rows: 38. Actual rows: over 400K. Any ideas?

Best Answer

My immediate thought is that this may have to do with the work_mem limit. The difference between the plans is that in the first you join and then aggregate, while in the second you aggregate and then join. That means the aggregation operates over a narrower set of rows, which means the operation uses less memory.

It would be interesting to see what changes if you try doubling work_mem and running it again.
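
For example, something along these lines in the same session (a sketch; the value shown is only illustrative of "double the current setting"):

SHOW work_mem;          -- check the current session setting
SET work_mem = '2MB';   -- double it for this session only
-- then re-run the EXPLAIN (ANALYZE, BUFFERS) on the slow query above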

Edit: Now that we know increasing work_mem only yields a modest improvement, the next issue is the sort's row estimate. I suspect the sort actually blows past work_mem, and that the planner expects it to be cheap because it expects only 38 rows but gets a great many instead. I'm not clear where the planner gets that information, since it's clear the planner estimates (correctly) that 38 rows is what we expect out of the aggregate. This part is starting to look to me like a planner bug, but I have trouble putting my finger on it. It might be worth writing it up and raising it on the pgsql-general mailing list. It almost looks to me as if the planner is confusing the memory needed for the sort with the memory needed for the aggregate.
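
For what it's worth, the arithmetic visible in the plans above seems to account for the 38: the spix_point index scan is estimated at rows=1 per loop across 38 outer rows, so the join (and therefore the sort feeding the aggregate) is estimated at 38 × 1 = 38 rows, while the scan actually returns about 10,927 rows per loop, and 38 × 10,927 ≈ 415,000, matching the 415,224 rows actually sorted.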

Regarding "performance - Direct query much slower than a subquery using a join", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15126996/
