gpt4 book ai didi

sql - 简单情况下的子查询性能

转载 作者:行者123 更新时间:2023-11-29 11:51:16 24 4
gpt4 key购买 nike

Postgresql 10 - Ubuntu LTS 最新版本 - 1CPU 2GB Ram - 没有安装其他软件
两个表,都有它们的索引:
关注 (22 条记录)
提示(250 万条记录)

select users_id_to from follows where users_id_from =1 

需要 0,041 毫秒

select tips.id
from tips
where tips.users_id in (2,3,4,5,6,8,79407,38463,42798,94150,76554,56777,71407,51788,4624,41079,13549,75920,18979,6078,26178,18316)

Bitmap Heap Scan on tips (cost=101.72..2122.76 rows=556 width=8) (actual time=0.267..1.120 rows=597 loops=1)
Recheck Cond: (users_id = ANY ('{2,3,4,5,6,8,79407,38463,42798,94150,76554,56777,71407,51788,4624,41079,13549,75920,18979,6078,26178,18316}'::bigint[]))
Heap Blocks: exact=594
-> Bitmap Index Scan on tips_idx_users_id01 (cost=0.00..101.58 rows=556 width=0) (actual time=0.188..0.188 rows=597 loops=1)
Index Cond: (users_id = ANY ('{2,3,4,5,6,8,79407,38463,42798,94150,76554,56777,71407,51788,4624,41079,13549,75920,18979,6078,26178,18316}'::bigint[]))
Planning time: 0.210 ms
Execution time: 1.193 ms

需要 1.2 毫秒(第一次运行时为 4.7 毫秒)

select tips.id
from tips
where tips.users_id in (select users_id_to
from follows
where users_id_from = 1
)


Merge Semi Join (cost=2.29..22.07 rows=573 width=8) (actual time=0.540..10632.242 rows=597 loops=1)
Merge Cond: (tips.users_id = follows.users_id_to)
Buffers: shared hit=1095506 read=1264002
-> Index Scan using tips_idx_users_id01 on tips (cost=0.43..205139.43 rows=2500000 width=16) (actual time=0.021..10180.667 rows=2353909 loops=1)
Buffers: shared hit=1095505 read=1264002
-> Sort (cost=1.77..1.82 rows=22 width=8) (actual time=0.051..0.084 rows=22 loops=1)
Sort Key: follows.users_id_to
Sort Method: quicksort Memory: 26kB
Buffers: shared hit=1
-> Seq Scan on follows (cost=0.00..1.27 rows=22 width=8) (actual time=0.012..0.019 rows=22 loops=1)
Filter: (users_id_from = 1)
Buffers: shared hit=1
Planning time: 0.954 ms
Execution time: 10632.376 ms

需要 10433 毫秒
定义:

CREATE TABLE public.follows (
id bigserial NOT NULL,
users_id_from bigint NOT NULL DEFAULT 0,
users_id_to bigint NOT NULL DEFAULT 0,
has_accepted boolean NOT NULL DEFAULT true,
created_on timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT followings_pkey
PRIMARY KEY (id)
)

CREATE TABLE public.tips (
id bigserial NOT NULL,
users_id bigint NOT NULL,
temp_id bigint NOT NULL,
first_seen numeric(12,2) NOT NULL DEFAULT 0,
created_on timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
expire_on_gmt timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
ip_from inet NOT NULL DEFAULT '0.0.0.0'::inet,
"type" smallint NOT NULL DEFAULT 0,
growth numeric(8,1) NOT NULL DEFAULT 0.0,
seen boolean DEFAULT false,

CONSTRAINT tips_pkey
PRIMARY KEY (id)
)

CREATE INDEX tips_idx_users_id01
ON public.tips
(users_id);

我真的不明白为什么性能这么差,似乎服务器在后台做了一个 JOIN ...
任何帮助表示赞赏。

谢谢
佩雷斯

编辑 - 2018.10.9
尽管接受了答案,但由于对 Pavel Stehule 进行了更深入的调查(请参阅下面的帖子),谁立即解决了问题,真正的问题是 follows 表的统计信息不正确. VACUUM ANALYZE 解决了这个问题,现在两个查询都运行得很快。

最佳答案

我建议将查询写成:

select t.id
from tips t
where exists (select 1
from follows f
where f.users_id_from = 1 and f.users_id_to = t.users_id
);

并在 follow(users_id_to, users_id_from) 上创建索引——两列按此顺序。

至于Postgres为什么选择那个执行计划。 Postgres 认为它​​是最好的。有时优化器会犯错误。也许桌面上的统计数据不是最新的。

编辑:

嗯。我想知道这些版本中的任何一个是否会鼓励 Postgres 在 tips(id) 上使用索引:

with f as (
select users_id_to
from follows
where users_id_from = 1
)
select t.id
from tips t
where t.users_id in (select f.users_id_to from f);

这为 Postgres 提供了实现子查询然后使用索引的选项(鼓励?)。

第二个是一个简单的join:

select t.id
from tips t join
follows f
on f.users_id_to = t.id
where f.users_id_from = 1

关于sql - 简单情况下的子查询性能,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/52685384/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com