gpt4 book ai didi

postgresql - 如何优化此 Postgres 计数查询

转载 作者:行者123 更新时间:2023-11-29 12:00:31 29 4
gpt4 key购买 nike

EXPLAIN ANALYZE 
SELECT count(*)
FROM "businesses"
WHERE (
source = 'facebook'
OR EXISTS(
SELECT *
FROM provider_business_map pbm
WHERE
pbm.hotstepper_business_id=businesses.id
AND pbm.provider_name='facebook'
)
);
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=233538965.74..233538965.75 rows=1 width=0) (actual time=116169.720..116169.721 rows=1 loops=1)
-> Seq Scan on businesses (cost=0.00..233521096.48 rows=7147706 width=0) (actual time=11.284..116165.646 rows=3693 loops=1)
Filter: (((source)::text = 'facebook'::text) OR (alternatives: SubPlan 1 or hashed SubPlan 2))
SubPlan 1
-> Index Scan using idx_provider_hotstepper_business on provider_business_map pbm (cost=0.00..16.29 rows=1 width=0) (never executed)
Index Cond: (((provider_name)::text = 'facebook'::text) AND (hotstepper_business_id = businesses.id))
SubPlan 2
-> Index Scan using idx_provider_hotstepper_business on provider_business_map pbm (cost=0.00..16.28 rows=1 width=4) (actual time=0.045..5.685 rows=3858 loops=1)
Index Cond: ((provider_name)::text = 'facebook'::text)
Total runtime: 116169.820 ms
(10 rows)

此查询需要一分钟多的时间,它正在计算结果约为 3000。瓶颈似乎是顺序扫描,但我不确定我需要在数据库上使用什么索引来优化它。还值得注意的是,我还没有调整 postgres,所以如果有任何调整可以帮助它可能值得考虑。虽然我的数据库是 15GB,但我不打算很快将所有这些都放入内存中,所以我不确定更改与 RAM 相关的值是否会有很大帮助。

最佳答案

OR 以糟糕的表现而臭名昭著。尝试将其拆分为两个表上两个完全独立的查询的联合:​​

SELECT COUNT(*) FROM (
SELECT id
FROM businesses
WHERE source = 'facebook'
UNION -- union makes the ids unique in the result
SELECT hotstepper_business_id
FROM provider_business_map
WHERE provider_name = 'facebook'
AND hotstepper_business_id IS NOT NULL
) x

如果 hotstepper_business_id 不能为空,您可以删除该行

AND hotstepper_business_id IS NOT NULL

如果你想要整个业务行,你可以简单地用 IN (...) 包装上面的查询:

SELECT * FROM businesses
WHERE ID IN (
-- above inner query
)

但是一个性能更好的查询是修改上面的查询使用一个连接:

SELECT *
FROM businesses
WHERE source = 'facebook'
UNION
SELECT b.*
FROM provider_business_map m
JOIN businesses b
ON b.id = m.hotstepper_business_id
WHERE provider_name = 'facebook'

关于postgresql - 如何优化此 Postgres 计数查询,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/17440690/

29 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com