gpt4 book ai didi

sql - 为什么我的子查询甚至对过滤的行也执行?

转载 作者:行者123 更新时间:2023-11-29 12:06:34 28 4
gpt4 key购买 nike

我有一个看起来有点像这样的查询(注意:实际查询是由 Hibernate 生成的,有点复杂):

select * from outage_revisions orev
join outages o
on orev.outage=o.id
where o.observed_end is null
and orev.observation_date =
(select max(observation_date)
from outage_revisions orev2
where orev2.observation_date <= '2011-11-21 00:00:00'
and orev2.outage = orev.outage);

此查询运行非常缓慢(大约 15 分钟)。但是,如果我用子查询取出 where 子句的一部分,它几乎立即返回(大约 83 毫秒),只有大约 14 行。

此外,子查询本身非常快(大约 31 毫秒):

select max(observation_date) from outage_revisions orev2
where orev2.observation_date <= '2011-11-21 00:00:00'
and orev2.outage = 1

我的问题是:如果不包括子查询筛选器的完整查询只返回 14 行,为什么添加子查询会使查询变慢这么多?子查询不应该添加最多大约 31*14 毫秒吗?

下面是完整查询的计划:

Nested Loop  (cost=0.00..71078813.16 rows=1 width=115)
-> Seq Scan on outagerevisions orev (cost=0.00..71077624.67 rows=284 width=79)
Filter: (observationdate = (SubPlan 2))
SubPlan 2
-> Result (cost=1250.56..1250.57 rows=1 width=0)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..1250.56 rows=1 width=8)
-> Index Scan Backward using idx_observationdate on outagerevisions orev2 (cost=0.00..2501.12 rows=2 width=8)
Index Cond: (observationdate <= '2011-11-21 00:00:00'::timestamp without time zone)
Filter: ((observationdate IS NOT NULL) AND (outage = $0))
-> Index Scan using outages_pkey on outages o (cost=0.00..4.17 rows=1 width=36)
Index Cond: (o.id = orev.outage)
Filter: (o.observedend IS NULL)

最佳答案

我的猜测是 PostgreSQL 只是在执行查询的方式上做出了错误的选择。尽管在执行相关子查询之前它应该缩小到 9 行似乎很明显,但它可能没有这样做,因此子查询必须运行 60,000 次。在这样做的同时,它还必须跟踪哪些行将继续进行下一步,等等。

以下是您可以尝试编写的其他几种方法:

SELECT
<column list>
FROM
Outage_Revisions OREV
JOIN Outages O ON
OREV.outage = O.id
LEFT OUTER JOIN Outage_Revisions OREV2 ON
OREV2.outage = OREV.outage AND
OREV2.observation_date <= '2011-11-21 00:00:00' AND
OREV2.observation_date > OREV.observation_date
WHERE
O.observed_end IS NULL AND
OREV2.outage IS NULL

或者(假设PostgreSQL和Hibernate支持加入子查询)

SELECT
<column list>
FROM
Outage_Revisions OREV
JOIN Outages O ON
OREV.outage = O.id
JOIN (SELECT OREV2.outage, MAX(OREV2.observation_date) AS max_observation_date
FROM Outage_Revisions OREV2
WHERE OREV2.observation_date <= '2011-11-21 00:00:00'
GROUP BY OREV2.outage) SQ ON
SQ.outage = OREV.outage AND
SQ.max_observation_date = OREV.observation_date
WHERE
O.observed_end IS NULL

您可以调整最后一个查询中的连接顺序。

关于sql - 为什么我的子查询甚至对过滤的行也执行?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/8227853/

28 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com