gpt4 book ai didi

performance - postgres 的性能真的很慢

转载 作者:行者123 更新时间:2023-11-29 12:31:42 27 4
gpt4 key购买 nike

我有一个表,我只插入行,从不删除。在每个循环中,我插入大约 36k 行。

我需要从此表中获取行以执行操作。问题是在每个循环中查询性能真的很差。

例如,在循环 31 上:

explain analyze select exp(least(709,a.value)), a.from, a.to,a.material,a.transport from resultTable a where a.loop=31;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on resultTable a (cost=36.58..4431.79 rows=2425 width=48) (actual time=7.351..33894.217 rows=34640 loops=1)
Recheck Cond: (loop = 31)
-> Bitmap Index Scan on "resultTable_idx_mo" (cost=0.00..35.97 rows=2425 width=0) (actual time=4.880..4.880 rows=34640 loops=1)
Index Cond: (loop = 31)
Total runtime: 33897.070 ms
(5 rows)

对于循环 43:

explain analyze select exp(least(709,a.value)), a.from, a.to,a.material,a.transport from resultTable a where a.loop=43;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on resultTable a (cost=36.58..4431.79 rows=2425 width=48) (actual time=10.129..125460.445 rows=34640 loops=1)
Recheck Cond: (loop = 43)
-> Bitmap Index Scan on "resultTable_idx_mo" (cost=0.00..35.97 rows=2425 width=0) (actual time=4.618..4.618 rows=34640 loops=1)
Index Cond: (loop 43)
Total runtime: 125463.516 ms
(5 rows)

时间呈指数增长。我在每个循环中都在做 VACUUM 和 REINDEX(我也尝试过没有,但结果是一样的)。

知道如何改进时间吗?

提前致谢。

分区后:

    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=14.47..2686.29 rows=1649 width=48) (actual time=18.562..220124.597 rows=34640 loops=1)
-> Append (cost=14.47..2682.17 rows=1649 width=48) (actual time=5.189..32.743 rows=34640 loops=1)
-> Bitmap Heap Scan on resultTable a (cost=14.47..1655.44 rows=866 width=48) (actual time=0.008..0.008 rows=0 loops=1)
Recheck Cond: (loop = 60)
-> Bitmap Index Scan on "resultTable_idx_mo" (cost=0.00..14.26 rows=866 width=0) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: (loop = 60)
-> Bitmap Heap Scan on result_table_child_70 a (cost=8.82..1026.73 rows=783 width=48) (actual time=5.181..29.068 rows=34640 loops=1)
Recheck Cond: (loop = 60)
-> Bitmap Index Scan on resultTable_child_70_idx (cost=0.00..8.63 rows=783 width=0) (actual time=4.843..4.843 rows=34640 loops=1)
Index Cond: (loop = 60)
Total runtime: 220128.290 ms

在分析表并设置 enable_bitmapscan=off 之后(仍然使用分区):

 Result  (cost=0.00..2761.06 rows=33652 width=389) (actual time=9.714..378389.177 rows=34640 loops=1)
-> Append (cost=0.00..2676.93 rows=33652 width=389) (actual time=0.119..34.065 rows=34640 loops=1)
-> Index Scan using "resultTable_idx_mo" on resultTable a (cost=0.00..12.84 rows=5 width=48) (actual time=0.058..0.058 rows=0 loops=1)
Index Cond: (loop= 79)
-> Index Scan using resultTable_child_80_idx on resultTable_child_80 a (cost=0.00..2664.10 rows=33647 width=389) (actual time=0.061..30.303 rows=34640 loops=1)
Index Cond: (loop = 79)
Total runtime: 378393.671 ms
(7 rows)

最佳答案

如果集群和分区不起作用,我开始怀疑您的存储系统存在严重问题。堆扫描 35K 行的 10 秒太慢了几个数量级。您使用的是什么版本的 Postgres?您的存储是什么样的?检查你的 iostats。

我用 Pg 9.0.4 在小型 VM(1 个小数 cpu、1GB 内存、NFS 磁盘装载)上设置了一个实验,创建您的表和索引并注入(inject) 1000 批处理,每批处理有 36000 条记录。

insert into r(loop,value,xfrom,xto,material,transport,pk) select 0,0,0,0,0,0,i from generate_series(0,35999) i;
insert into r(loop,value,xfrom,xto,material,transport,pk) select 1,0,0,0,0,0,i from generate_series(36000,71999) i;
...

在任何批处理上运行您的选择始终低于 40 毫秒:

explain analyze select exp(least(709,a.value)), a.xfrom, a.xto, a.material,a.transport from r a where a.loop=31;
Index Scan using "resultTable_idx_mo" on r a (cost=0.00..1596.35 rows=37680 width=21) (actual time=0.087..34.038 rows=36000 loops=1)
Index Cond: (loop = 31)
Total runtime: 36.332 ms

explain analyze select exp(least(709,a.value)), a.xfrom, a.xto,a.material,a.transport from r a where a.loop=87;
Index Scan using "resultTable_idx_mo" on r a (cost=0.00..1421.35 rows=33480 width=21) (actual time=0.105..37.357 rows=36000 loops=1)
Index Cond: (loop = 87)
Total runtime: 39.365 ms

请注意,我的计划提出了普通的 IndexScan 而不是 BitmapScan,然后是 HeapScan。您是否调整了计划优化配置以影响计划?

在您的解释中,我注意到估计的行数远远少于实际的行数(1649 对 34640)。这表明您对表的统计不准确。您应该运行 ANAYLZE resultTable;

当这些简单的索引扫描选择花费超过几十毫秒时,我认为您的 postgres 配置或存储系统中存在严重问题。当您运行查询时,此数据库上是否还有其他事件?也许您的结果正在与其他查询竞争资源?

关于performance - postgres 的性能真的很慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/9685067/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com