sql - 如何调整 postgres 中的 index

sql - 如何调整 postgres 中的 index_scan 成本？

转载作者：行者123 更新时间：2023-11-29 11:49:51

对于以下查询:

SELECT *
FROM "routes_trackpoint"
WHERE "routes_trackpoint"."track_id" = 593
ORDER BY "routes_trackpoint"."id" ASC
LIMIT 1;

Postgres 正在选择一个查询计划，该计划读取“id”索引中的所有行以执行排序，并执行手动过滤以获取具有正确轨道 ID 的条目:

Limit  (cost=0.43..511.22 rows=1 width=65) (actual time=4797.964..4797.966 rows=1 loops=1)
   Buffers: shared hit=3388505
   ->  Index Scan using routes_trackpoint_pkey on routes_trackpoint  (cost=0.43..719699.79 rows=1409 width=65) (actual time=4797.958..4797.958 rows=1 loops=1)
         Filter: (track_id = 75934)
         Rows Removed by Filter: 13005436
         Buffers: shared hit=3388505
 Total runtime: 4798.019 ms
(7 rows)

禁用索引扫描，查询计划(SET enable_indexscan=OFF;)更好，响应更快。

Limit  (cost=6242.46..6242.46 rows=1 width=65) (actual time=77.584..77.586 rows=1 loops=1)
   Buffers: shared hit=1075 read=6
   ->  Sort  (cost=6242.46..6246.64 rows=1674 width=65) (actual time=77.577..77.577 rows=1 loops=1)
         Sort Key: id
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=1075 read=6
         ->  Bitmap Heap Scan on routes_trackpoint  (cost=53.41..6234.09 rows=1674 width=65) (actual time=70.384..74.782 rows=1454 loops=1)
               Recheck Cond: (track_id = 75934)
               Buffers: shared hit=1075 read=6
               ->  Bitmap Index Scan on routes_trackpoint_track_id  (cost=0.00..52.99 rows=1674 width=0) (actual time=70.206..70.206 rows=1454 loops=1)
                     Index Cond: (track_id = 75934)
                     Buffers: shared hit=2 read=6
 Total runtime: 77.655 ms
(13 rows)

如何让 Postgres 自动选择更好的计划？

我尝试了以下方法:

ALTER TABLE routes_trackpoint ALTER COLUMN track_id SET STATISTICS 5000;
ALTER TABLE routes_trackpoint ALTER COLUMN id SET STATISTICS 5000;
ANALYZE routes_trackpoint;

但查询计划保持不变。

表定义是:

watchdog2=# \d routes_trackpoint
                                   Table "public.routes_trackpoint"
  Column  |           Type           |                           Modifiers                            
----------+--------------------------+----------------------------------------------------------------
 id       | integer                  | not null default nextval('routes_trackpoint_id_seq'::regclass)
 track_id | integer                  | not null
 position | geometry(Point,4326)     | not null
 speed    | double precision         | not null
 bearing  | double precision         | not null
 valid    | boolean                  | not null
 created  | timestamp with time zone | not null
Indexes:
    "routes_trackpoint_pkey" PRIMARY KEY, btree (id)
    "routes_trackpoint_position_id" gist ("position")
    "routes_trackpoint_track_id" btree (track_id)
Foreign-key constraints:
    "track_id_refs_id_d59447ae" FOREIGN KEY (track_id) REFERENCES routes_track(id) DEFERRABLE INITIALLY DEFERRED

PS:我们强制 postgres 按“创建”排序，这也有助于他使用“track_id”上的索引。

最佳答案

尽可能避免LIMIT。计划 #1:使用 NOT EXISTS() 获取第一个

EXPLAIN ANALYZE
SELECT * FROM routes_trackpoint tp
WHERE tp.track_id = 593
AND NOT EXISTS (
        SELECT * FROM routes_trackpoint nx
        WHERE nx.track_id = tp.track_id AND nx.id < tp.id
        );

计划 #2:使用 row_number() OVER some_window 获取组中的第一个。

EXPLAIN ANALYZE
SELECT tp.*
FROM routes_trackpoint tp
JOIN (select track_id, id
        , row_number() OVER (partition BY track_id ORDER BY id) rn
        FROM routes_trackpoint tp2
        ) omg ON omg.id = tp.id
WHERE tp.track_id = 593
AND omg.rn = 1
        ;

或者 - 甚至更好 - 将 WHERE 子句移动到子查询:

EXPLAIN ANALYZE
SELECT tp.*
FROM routes_trackpoint tp
JOIN (select track_id, id
        , row_number() OVER (partition BY track_id ORDER BY id) rn
        FROM routes_trackpoint tp2
        WHERE tp2.track_id = 593
        ) omg ON omg.id = tp.id
WHERE 1=1
-- AND tp.track_id = 593
AND omg.rn = 1
        ;

计划#3 使用 postgres 特定的 DISTINCT ON() 构造(感谢@a_horse_with_no_name):

-- EXPLAIN ANALYZE
SELECT DISTINCT ON (track_id) track_id, id
FROM routes_trackpoint tp2
WHERE tp2.track_id = 593
-- order by track_id, created desc
order by track_id, id
        ;

关于sql - 如何调整 postgres 中的 index_scan 成本？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/33856163/

文章推荐：具有 100k 记录的 SQL Server 表，2 个内部连接极慢

文章推荐： sql - 使用模式匹配字符的 postgresql 替换函数

文章推荐： postgresql - PostGIS : schema is empty 的 pg_dump

文章推荐： sql - 优化 ActiveRecord 查询

sql - 什么时候适合使用 index_scan 来获取已排序的数据？
有什么经验法则吗？问题是我刚刚想到了这个问题，其中索引无助于通过其预定义的排序更快地运行查询。我有下表tbl: | p_id | s_id | w_id | amount | cur
sql - 如何调整 postgres 中的 index_scan 成本？
对于以下查询: SELECT * FROM "routes_trackpoint" WHERE "routes_trackpoint"."track_id" = 593 ORDER BY "route

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

sql - 如何调整 postgres 中的 index_scan 成本？