sql - Changing ORDER BY from id to another indexed column (with low LIMIT) has a huge cost

Reposted. Author: 行者123. Updated: 2023-11-29 11:45:27

I have a query on a table with 500,000 rows.

Basically:

WHERE s3_.id = 287
ORDER BY m0_.id DESC
LIMIT 25

=> query runtime = 20 ms

WHERE s3_.id = 287
ORDER BY m0_.created_at DESC
LIMIT 25

=> query runtime = 15000 ms or more

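There is an index on created_at.

The query plans are completely different.

Unfortunately, I am not a query plan guru. I would like to reproduce the fast query plan while ordering by created_at.

Is that possible, and how would I do it?

Query plan - slow query (ordered by m0_.created_at): http://explain.depesz.com/s/KBl

Query plan - fast query (ordered by m0_.id): http://explain.depesz.com/s/2pYZ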
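
For readers who want to produce plans like the two linked above, here is a minimal sketch for PostgreSQL. The column list is shortened and the user_ and guest_data joins are left out for illustration only, so the resulting plan may differ from the posted ones. Note that EXPLAIN with ANALYZE really executes the query, so the slow variant takes its full runtime.

```sql
-- Sketch: capture the actual execution plan for the slow variant.
-- ANALYZE runs the query for real; BUFFERS adds buffer usage statistics.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m0_.id, m0_.created_at            -- shortened column list (assumption)
FROM   message m0_
JOIN   conversation c1_ ON m0_.conversation_id = c1_.id
JOIN   listing      l2_ ON c1_.listing_id = l2_.id
JOIN   site         s3_ ON l2_.site_id = s3_.id
WHERE  s3_.id = 287
ORDER  BY m0_.created_at DESC            -- use m0_.id here for the fast variant
LIMIT  25;
```

Comparing the row estimates with the actual row counts in the output is usually the first step when two variants behave this differently.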

Full query

SELECT m0_.id AS id0, m0_.content AS content1, m0_.created_at AS created_at2,
c1_.id AS id3, l2_.id AS id4, l2_.reference AS reference5,
s3_.id AS id6, s3_.name AS name7, s3_.code AS code8,
u4_.email AS email9, u4_.id AS id10, u4_.firstname AS firstname11, u4_.lastname AS lastname12,
u5_.email AS email13, u5_.id AS id14, u5_.firstname AS firstname15, u5_.lastname AS lastname16,
g6_.id AS id17, g6_.firstname AS firstname18, g6_.lastname AS lastname19, g6_.email AS email20,
m0_.conversation_id AS conversation_id21, m0_.author_user_id AS author_user_id22, m0_.author_guest_id AS author_guest_id23,
c1_.author_user_id AS author_user_id24, c1_.author_guest_id AS author_guest_id25, c1_.listing_id AS listing_id26,
l2_.poster_id AS poster_id27, l2_.site_id AS site_id28, l2_.building_id AS building_id29, l2_.type_id AS type_id30, l2_.neighborhood_id AS neighborhood_id31, l2_.facility_bathroom_id AS facility_bathroom_id32, l2_.facility_kitchen_id AS facility_kitchen_id33, l2_.facility_heating_id AS facility_heating_id34, l2_.facility_internet_id AS facility_internet_id35, l2_.facility_condition_id AS facility_condition_id36, l2_.original_translation_id AS original_translation_id37,
u4_.site_id AS site_id38, u4_.address_id AS address_id39, u4_.billing_address_id AS billing_address_id40,
u5_.site_id AS site_id41, u5_.address_id AS address_id42, u5_.billing_address_id AS billing_address_id43,
g6_.site_id AS site_id44
FROM message m0_
INNER JOIN conversation c1_ ON m0_.conversation_id = c1_.id
INNER JOIN listing l2_ ON c1_.listing_id = l2_.id
INNER JOIN Site s3_ ON l2_.site_id = s3_.id
INNER JOIN user_ u4_ ON l2_.poster_id = u4_.id
LEFT JOIN user_ u5_ ON m0_.author_user_id = u5_.id
LEFT JOIN guest_data g6_ ON m0_.author_guest_id = g6_.id
WHERE s3_.id = 287
ORDER BY m0_.created_at DESC
LIMIT 25 OFFSET 0

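Best answer

Repaired query

Your WHERE condition is on a table that is joined via LEFT JOIN nodes. The WHERE condition forces those joins to behave like [INNER] JOIN. That is pointless and may confuse the query planner, especially in a query with many tables and therefore many possible query plans. Setting this right drastically reduces the number of possible query plans, which makes it easier for Postgres to find a good one.
More details in the answer to the additionally spawned question.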

SELECT m0_.id AS id0, ...
FROM site s3_
JOIN listing l2_ ON l2_.site_id = s3_.id
JOIN conversation c1_ ON c1_.listing_id = l2_.id
JOIN message m0_ ON m0_.conversation_id = c1_.id

LEFT JOIN user_ u4_ ON u4_.id = l2_.poster_id
LEFT JOIN user_ u5_ ON u5_.id = m0_.author_user_id
LEFT JOIN guest_data g6_ ON g6_.id = m0_.author_guest_id
WHERE s3_.id = '287' -- ??
ORDER BY m0_.created_at DESC
LIMIT 25
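
The claim that a WHERE condition on an outer-joined table makes the LEFT JOIN behave like an [INNER] JOIN can be checked on a toy example. The tables parent and child and the column flag below are hypothetical and are not part of the question's schema.

```sql
-- Hypothetical toy tables, only to illustrate the join semantics.
CREATE TEMP TABLE parent (id int PRIMARY KEY);
CREATE TEMP TABLE child  (id int PRIMARY KEY, parent_id int, flag int);

INSERT INTO parent VALUES (1), (2);
INSERT INTO child  VALUES (10, 1, 287);   -- parent 2 has no child row

-- LEFT JOIN alone keeps parent 2, with NULL in the child columns.
SELECT p.id, c.flag
FROM   parent p
LEFT   JOIN child c ON c.parent_id = p.id;

-- A WHERE condition on the outer-joined table rejects the NULL-padded row,
-- so only parent 1 is left ...
SELECT p.id, c.flag
FROM   parent p
LEFT   JOIN child c ON c.parent_id = p.id
WHERE  c.flag = 287;

-- ... which is the same result a plain [INNER] JOIN returns.
SELECT p.id, c.flag
FROM   parent p
JOIN   child c ON c.parent_id = p.id
WHERE  c.flag = 287;
```

Writing the join as the plain JOIN it effectively is states the intent directly instead of leaving the planner to work it out.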

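Why s3_.id = '287'?

287 looks like it should be of type integer, and numeric constants are normally written without quotes: 287. What is the actual data type (and why)? Either way, it is only a minor problem.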
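
To answer the data type question, something like the following sketch could be run. It assumes the table was created with the unquoted name Site, which PostgreSQL stores in lower case as site.

```sql
-- An unquoted numeric constant is typed integer; a quoted one starts out as
-- type "unknown" and is resolved from context (here, the column's type).
SELECT pg_typeof(287) AS unquoted, pg_typeof('287') AS quoted;

-- Look up the declared type of site.id (table name assumed to be lower case).
SELECT table_name, column_name, data_type
FROM   information_schema.columns
WHERE  table_name  = 'site'
AND    column_name = 'id';
```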

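Reading the query plans

@FuzzyTree already hinted (quite accurately) that sorting on a column other than the one used in the WHERE clause complicates things. But that is not the elephant in the room here.

The combination with LIMIT 25 makes the difference huge. Both query plans show a reduction from rows=124616 to rows=25 in their last step, which is huge.

Both query plans also show: Seq Scan on site s3_ ... rows=1. So if you ORDER BY s3_.id in the fast variant, you are not actually ordering anything, while the other query has to find the top 25 rows out of 124616 candidates. That is hardly a fair comparison.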
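
The size of the candidate set that LIMIT 25 is applied to can be checked directly. This is a sketch that uses only the four tables on the path from site to message; the user_ and guest_data joins are left out on the assumption that they do not remove rows. The plans above report rows=124616 for this step.

```sql
-- Sketch: count the rows that survive the site filter, i.e. the set the
-- top 25 rows have to be picked from.
SELECT count(*) AS candidates
FROM   site s3_
JOIN   listing      l2_ ON l2_.site_id = s3_.id
JOIN   conversation c1_ ON c1_.listing_id = l2_.id
JOIN   message      m0_ ON m0_.conversation_id = c1_.id
WHERE  s3_.id = 287;
```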

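Solution

After clarification, the problem is clearer. You are selecting a large number of rows by one criterion but ordering by another. No conventional index design can cover this, not even if both columns resided in the same table (which they do not).

I think we found a (non-trivial) solution for this class of problems under a related question on dba.SE.

Of course, all the usual advice for query optimization and general performance optimization applies.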

Regarding "sql - Changing ORDER BY from id to another indexed column (with low LIMIT) has a huge cost", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24872582/
