gpt4 book ai didi

postgresql 对每个表的第一个查询很慢

转载 作者:行者123 更新时间:2023-12-03 03:25:40 33 4
gpt4 key购买 nike

问题

我有一个包含多个大表的数据库,每个表都有超过 5 亿条记录,每个记录非常简单,只有 5 个文本/浮点字段。我对表进行了分区,并准备了一个函数来查询表。但第一次查询总是比后面的查询慢 10 倍。所有 table 都会发生这种情况。

详细信息

根据这里的许多帖子,我尝试了以下方法:

查询过程的分析如下,在 psql 连接后,我执行第一个查询,然后立即使用不同的参数进行第二个查询:

postgres=> EXPLAIN (ANALYZE, BUFFERS) select * from myschema.myfunc(tableID,arg1, arg2) limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.25..0.26 rows=1 width=20) (actual time=4860.718..4860.718 rows=1 loops=1)
Buffers: shared hit=4479 read=21
I/O Timings: read=209.969
-> Function Scan on get_specific_point_data (cost=0.25..0.35 rows=10 width=20) (actual time=4860.717..4860.717 rows=1 loops=1)
Buffers: shared hit=4479 read=21
I/O Timings: read=209.969
Planning Time: 0.049 ms
Execution Time: 4860.824 ms
(8 rows)

postgres=> EXPLAIN (ANALYZE, BUFFERS) select * from myschema.myfunc(tableID,arg3, arg4) limit 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.25..0.26 rows=1 width=20) (actual time=448.253..448.254 rows=1 loops=1)
Buffers: shared hit=89 read=19
I/O Timings: read=436.326
-> Function Scan on get_specific_point_data (cost=0.25..0.35 rows=10 width=20) (actual time=448.252..448.252 rows=1 loops=1)
Buffers: shared hit=89 read=19
I/O Timings: read=436.326
Planning Time: 0.051 ms
Execution Time: 448.292 ms

配置

postgresql 服务器部署在 azure 上,具有 2 个 vCores CPU、约 600Gb 存储。假设它有 4Gb RAM。以下是一些配置:

            name            | setting
----------------------------+---------
autovacuum_work_mem | -1
dynamic_shared_memory_type | windows
maintenance_work_mem | 65536
work_mem | 4096
block_size | 8192
effective_cache_size | 393216 (?? units are not Kb? Azure portal say: "unit is 8kb"?
log_rotation_size | 102400
max_wal_size | 1024
min_parallel_index_scan_size | 64
min_parallel_table_scan_size | 1024
min_wal_size | 256
segment_size | 131072
track_activity_query_size | 1024
wal_block_size | 8192
wal_segment_size | 16777216

我还打开了pg_prewarm Azure 共享库中的插件,但不确定是否需要任何进一步的步骤/设置来激活它。

如有任何建议,我们将不胜感激!

更新

根据 @jjanes 和 @Zegarek 关于隐藏计划信息的功能的评论,一些额外的细节。尝试在此处准备。我再次从新连接重新运行查询,第一个查询速度较慢,第二个查询速度快了 10 倍。

PREPARE                                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=452763.67..467166.58 rows=200 width=20) (actual time=1.977..2.196 rows=1604 loops=1)
Buffers: shared hit=114
-> Sort (cost=452763.67..459965.12 rows=2880582 width=20) (actual time=1.976..2.031 rows=1604 loops=1)
Sort Key: values_table_r1_h0.date
Sort Method: quicksort Memory: 126kB
Buffers: shared hit=114
-> Nested Loop (cost=273.05..25554.05 rows=2880582 width=20) (actual time=1.000..1.656 rows=1604 loops=1)
Buffers: shared hit=111
-> Limit (cost=272.62..272.62 rows=1 width=12) (actual time=0.955..0.956 rows=1 loops=1)
Buffers: shared hit=91
-> Sort (cost=272.62..272.64 rows=9 width=12) (actual time=0.952..0.953 rows=1 loops=1)
Sort Key: ((abs((id_table.field1 - '96.7'::double precision)) + abs((id_table.field2 - '50.7'::double precision))))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=91
-> Index Scan using id_table_id2_lat_idx on id_table (cost=0.43..272.57 rows=9 width=12) (actual time=0.139..0.937 rows=9 loops=1)
Buffers: shared hit=88
Index Cond: ((field1 < '96.7'::double precision) AND (field1 > '96.7'::double precision))
Filter: ((field2 < '50.7'::double precision) AND (field2 > '50.7'::double precision))
-> Append (cost=0.43..9215.18 rows=166333 width=16) (actual time=0.020..0.514 rows=1604 loops=1)
Buffers: shared hit=20
-> Bitmap Heap Scan on values_table_r1_h0 (cost=4.22..14.82 rows=1 width=16) (never executed)
Recheck Cond: (id = id_table.id)
Filter: (field3 = ANY ('{6,7,8,9,10}'::integer[]))
-> Bitmap Index Scan on values_table_r1_h0_pt_id_idx (cost=0.00..4.22 rows=9 width=0) (never executed)
Index Cond: (id = id_table.id)
...
-> Index Scan using values_table_r9_h8_pt_id_idx on values_table_r9_h8 (cost=0.43..77.42 rows=1522 width=16) (actual time=0.015..0.435 rows=1604 loops=1)
Index Cond: (pt_id = id_table.pt_id)
Filter: (field3 = ANY ('{6,7,8,9,10}'::integer[]))
Rows Removed by Filter: 280
...
-> Bitmap Heap Scan on values_table_r10_h9 (cost=4.22..14.82 rows=1 width=16) (never executed)
Recheck Cond: (id = id_table.id)
Filter: (field3 = ANY ('{6,7,8,9,10}'::integer[]))
-> Bitmap Index Scan on values_table_r10_h9_pt_id_idx (cost=0.00..4.22 rows=9 width=0) (never executed)
Index Cond: (id = id_table.id)
Planning Time: 4199.297 ms
Execution Time: 4.114 ms
(317 rows)

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=452745.97..467148.88 rows=200 width=20) (actual time=1.719..2.033 rows=1715 loops=1)
-> Sort (cost=452745.97..459947.42 rows=2880582 width=20) (actual time=1.718..1.772 rows=1715 loops=1)
Sort Key: values_table_r1_h0.date
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=255.34..25536.35 rows=2880582 width=20) (actual time=0.728..1.416 rows=1715 loops=1)
-> Limit (cost=254.92..254.92 rows=1 width=12) (actual time=0.693..0.694 rows=1 loops=1)
-> Sort (cost=254.92..254.94 rows=9 width=12) (actual time=0.692..0.693 rows=1 loops=1)
Sort Key: ((abs((id_table.field1 - '97.4'::double precision)) + abs((id_table.field2 - '52.26'::double precision))))
Sort Method: quicksort Memory: 25kB
-> Index Scan using id_table_id2_lat_idx on id_table (cost=0.43..254.87 rows=9 width=12) (actual time=0.184..0.688 rows=9 loops=1)
Index Cond: ((field1 < '97.6'::double precision) AND (field1 > '97.4'::double precision))
Filter: ((field2 < '52.3'::double precision) AND (field2 > '52.32'::double precision))
-> Append (cost=0.43..9215.18 rows=166333 width=16) (actual time=0.019..0.532 rows=1715 loops=1)
-> Bitmap Heap Scan on values_table_r1_h0 (cost=4.22..14.82 rows=1 width=16) (never executed)
Recheck Cond: (id = id_table.id)
Filter: (field3 = ANY ('{6,7,8,9,10}'::integer[]))
-> Bitmap Index Scan on values_table_r1_h0_pt_id_idx (cost=0.00..4.22 rows=9 width=0) (never executed)
Index Cond: (id = id_table.id)
...
-> Index Scan using values_table_r5_h3_pt_id_idx on values_table_r5_h3 (cost=0.44..87.53 rows=1758 width=16) (actual time=0.016..0.470 rows=1715 loops=1)
Index Cond: (pt_id = pts_lake_id3.pt_id)
Filter: (month = ANY ('{6,7,8,9,10}'::integer[]))
Rows Removed by Filter: 281
...
-> Bitmap Heap Scan on values_table_r10_h9 (cost=4.22..14.82 rows=1 width=16) (never executed)
Recheck Cond: (id = id_table.id)
Filter: (field3 = ANY ('{6,7,8,9,10}'::integer[]))
-> Bitmap Index Scan on values_table_r10_h9_pt_id_idx (cost=0.00..4.22 rows=9 width=0) (never executed)
Index Cond: (id = id_table.id)
Planning Time: 5.590 ms
Execution Time: 3.362 ms
(317 rows)

第一个查询的额外时间似乎是规划时间:4348.720 毫秒。一个连接与多个查询显示时间顺序为:

 Planning Time: 4455.529 ms
Execution Time: 267.157 ms
Planning Time: 6.161 ms
Execution Time: 177.163 ms
Planning Time: 5.950 ms
Execution Time: 279.961 ms
Planning Time: 5.447 ms
Execution Time: 180.953 ms
Planning Time: 7.334 ms
Execution Time: 571.524 ms
Planning Time: 10.463 ms
Execution Time: 1108.749 ms

所以我只需要尝试减少第一个查询的计划时间?

  • 向服务器添加更多资源(CPU、RAM)有助于减少第一次加载时间吗?
  • pg_prewarm在这种情况下有帮助吗?
  • 如果它们都不起作用,当用户从 GUI 连接到数据库时,我可以在后台查询每个表,以隐形执行第一个慢速查询,而不影响 GUI 响应时间吗?

最佳答案

看起来问题在于加载有关对象的元数据(我猜主要是分区和分区索引)所需的时间,以便它可以进行规划。我认为没有任何解决方案,只有解决方法。一旦建立连接,请保留该连接,以便下次不需要重复加载元数据。如果这不方便,您可以使用像 pgbouncer 这样的外部连接池来为您保留连接。

关于postgresql 对每个表的第一个查询很慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/75700453/

33 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com