gpt4 book ai didi

postgresql - postgres查询优化问题

转载 作者:行者123 更新时间:2023-11-29 11:36:29 29 4
gpt4 key购买 nike

下面是两个几乎相同的 postgres 查询,但产生的查询计划和执行时间却大不相同。我假设第一个查询很快,因为只有 196 个 form_instance 记录 form_id = 'W40',而有 7000 个 form_id = 'W30L'。但为什么从 200 条记录跳到 7000 条记录(这对我来说似乎相对较小)会导致查询时间如此惊人地增加?我尝试以各种方式为数据编制索引以加快速度,但基本上感到困惑。我怎样才能加快速度? (请注意,两个表的架构都包含在底部)。

explain analyze select form_id,form_instance_id,answer,field_id
from form_instances,field_instances
where workflow_state = 'DRqueued' and form_instance_id = form_instances.id
and field_id in ('Book_EstimatedDueDate','H_SubmittedDate','H_Ccode','miscarriage','miscarriage_of_multiple','stillbirth','AP_IUFD_of_multiple','maternal_death','birth_includes_transport','newborn_death','H_Pid','H_Mid1','H_Mid2','H_Mid3')
and (form_id = 'W40');

QUERY PLAN
Nested Loop (cost=0.00..70736.14 rows=4646 width=29) (actual time=0.000..20.000 rows=2399 loops=1)
-> Index Scan using form_id_and_workflow_state on form_instances (cost=0.00..1041.42 rows=507 width=8) (actual time=0.000..0.000 rows=196 loops=1)
Index Cond: (((form_id)::text = 'W40'::text) AND ((workflow_state)::text = 'DRqueued'::text))
-> Index Scan using index_field_instances_on_form_instance_id on field_instances (cost=0.00..137.25 rows=17 width=25) (actual time=0.000..0.102 rows=12 loops=196)
Index Cond: (field_instances.form_instance_id = form_instances.id)
Filter: ((field_instances.field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[]))
Total runtime: 30.000 ms
(7 rows)

explain analyze select form_id,form_instance_id,answer,field_id
from form_instances,field_instances
where workflow_state = 'DRqueued' and form_instance_id = form_instances.id
and field_id in ('Book_EstimatedDueDate','H_SubmittedDate','H_Ccode','miscarriage','miscarriage_of_multiple','stillbirth','AP_IUFD_of_multiple','maternal_death','birth_includes_transport','newborn_death','H_Pid','H_Mid1','H_Mid2','H_Mid3')
and (form_id = 'W30L');

QUERY PLAN
Hash Join (cost=34300.46..160865.40 rows=31045 width=29) (actual time=65670.000..74960.000 rows=102777 loops=1)
Hash Cond: (field_instances.form_instance_id = form_instances.id)
-> Bitmap Heap Scan on field_instances (cost=29232.57..152163.82 rows=531718 width=25) (actual time=64660.000..72800.000 rows=526842 loops=1)
Recheck Cond: ((field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[]))
-> Bitmap Index Scan on index_field_instances_on_field_id (cost=0.00..29099.64 rows=531718 width=0) (actual time=64630.000..64630.000 rows=594515 loops=1)
Index Cond: ((field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[]))
-> Hash (cost=5025.54..5025.54 rows=3388 width=8) (actual time=980.000..980.000 rows=10457 loops=1)
-> Bitmap Heap Scan on form_instances (cost=90.99..5025.54 rows=3388 width=8) (actual time=10.000..950.000 rows=10457 loops=1)
Recheck Cond: (((form_id)::text = 'W30L'::text) AND ((workflow_state)::text = 'DRqueued'::text))
-> Bitmap Index Scan on form_id_and_workflow_state (cost=0.00..90.14 rows=3388 width=0) (actual time=0.000..0.000 rows=10457 loops=1)
Index Cond: (((form_id)::text = 'W30L'::text) AND ((workflow_state)::text = 'DRqueued'::text))
Total runtime: 75080.000 ms

# \d form_instances Table "public.form_instances" Column | Type | Modifiers
-----------------+-----------------------------+-------------------------------------------------------------
id | integer | not null default nextval('form_instances_id_seq'::regclass)
form_id | character varying(255) |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
created_by_id | integer |
updated_by_id | integer |
workflow | character varying(255) |
workflow_state | character varying(255) |
validation_data | text |
Indexes:
"form_instances_pkey" PRIMARY KEY, btree (id)
"form_id_and_workflow_state" btree (form_id, workflow_state)
"index_form_instances_on_form_id" btree (form_id)
"index_form_instances_on_workflow_state" btree (workflow_state)

# \d field_instances
Table "public.field_instances"
Column | Type | Modifiers
------------------+-----------------------------+--------------------------------------------------------------
id | integer | not null default nextval('field_instances_id_seq'::regclass)
form_instance_id | integer |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
created_by_id | integer |
updated_by_id | integer |
field_id | character varying(255) |
answer | text |
state | character varying(255) |
explanation | text |
idx | integer | not null default 0
Indexes:
"field_instances_pkey" PRIMARY KEY, btree (id)
"field_instances__lower_answer" btree (lower(answer))
"index_field_instances_on_answer" btree (answer)
"index_field_instances_on_field_id" btree (field_id)
"index_field_instances_on_field_id_and_answer" btree (field_id, answer)
"index_field_instances_on_form_instance_id" btree (form_instance_id)
"index_field_instances_on_idx" btree (idx)

最佳答案

之前的评论,但由于它似乎已经解决了问题,所以我将提升为实际答案。

系统对可能有多少行的估计是错误的。我们可以看到,在第二个查询中,它从位图索引扫描中估计了 3388 行,但实际上得到了 10457 行。

所以你可能想要vacuum full analyze;

此外,其他可以极大帮助的命令包括 reindex 和/或 cluster

OP 声明 vacuum 没有帮助,但 reindex 有帮助。

关于postgresql - postgres查询优化问题,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/6442046/

29 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com