
postgresql - Big data query

Reposted. Author: 行者123. Updated: 2023-11-29 12:58:35

I have a partitioned table, currently with a bit over 200 partitions, and each partition holds roughly 12 million rows. SELECT queries against it are now very slow. Indexes on the relevant columns are already in place, but it is still slow. Looking at the execution plan, I can see that a large amount of data is being read from disk. How should I go about tuning and optimizing this?

   gjdd4=# \d t_bus_position_20160306_20160308
Table "public.t_bus_position_20160306_20160308"
Column | Type | Modifiers
------------------------+--------------------------------+--------------------
pos_uuid | character varying(20) | collate zh_CN.utf8
pos_line_uuid | character varying(20) |
pos_line_type | character varying(20) |
pos_bus_uuid | character varying(20) | collate zh_CN.utf8
pos_dev_uuid | character varying(20) |
pos_sta_uuid | character varying(20) |
pos_drv_ic_card | character varying(30) |
pos_lng | character varying(30) |
pos_lat | character varying(30) |
pos_bus_speed | character varying(20) |
pos_real_time_status | character varying(20) |
pos_gather_time | timestamp(6) without time zone |
pos_storage_time | timestamp(6) without time zone |
pos_is_offset | boolean |
pos_is_overspeed | character varying(1) |
pos_cursor_over_ground | character varying(20) |
pos_all_alarms | character varying(30) |
pos_is_in_station | character varying(1) |
pos_closed_alarms | character varying(30) |
pos_dis_to_pre_i | integer |
pos_odometer_i | bigint |
pos_relative_location | real |
pos_dis_to_pre | real |
pos_odometer | double precision |
pos_gather_time1 | bigint |
Indexes:
"idx_multi" btree (pos_bus_uuid, pos_gather_time DESC)
"idx_trgm" btree (replace(to_char(pos_gather_time, 'YYYYMMDDHH24'::text), ' '::text, ''::text))
"idx_trgm1" btree (to_char(pos_gather_time, 'YYYYMMDD'::text))
"tp_20160306_20160308_pos_dev_uuid_idx" btree (pos_dev_uuid)
Check constraints:
"t_bus_position_20160306_20160308_pos_gather_time_check" CHECK (pos_gather_time >= '2016-03-06 00:00:00'::timestamp without time zone AND
pos_gather_time < '2016-03-09 00:00:00'::timestamp without time zone)

The execution plan looks like this:

    gjdd4=# explain(costs,buffers,timing,analyze) select pos_bus_uuid from    test2 group by pos_bus_uuid;
 HashAggregate  (cost=802989.75..802993.00 rows=325 width=21) (actual time=42721.528..42721.679 rows=354 loops=1)
   Group Key: pos_bus_uuid
   Buffers: shared hit=3560 read=567491
   I/O Timings: read=20231.511
   ->  Seq Scan on test2  (cost=0.00..756602.00 rows=18555100 width=21) (actual time=0.067..27749.533 rows=18555100 loops=1)
         Buffers: shared hit=3560 read=567491
         I/O Timings: read=20231.511
 Planning time: 0.116 ms
 Execution time: 42721.839 ms
(9 rows)

Time: 42722.629 ms

Best Answer

Your query does not perform any real aggregation; it is effectively just a DISTINCT. If that is what you actually want (all of the distinct pos_bus_uuid values), you can use a technique known as a loose index scan.
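As an aside (a minimal illustration, not part of the original answer), the GROUP BY in the plan above returns exactly the same rows as a plain DISTINCT, which is why there is no real aggregation work to do:

    -- equivalent to "select pos_bus_uuid from test2 group by pos_bus_uuid"
    SELECT DISTINCT pos_bus_uuid FROM test2;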

Here it is, customized for your query and assuming that pos_bus_uuid has a NOT NULL constraint:

    WITH RECURSIVE t AS (
        (SELECT pos_bus_uuid FROM test2 ORDER BY pos_bus_uuid LIMIT 1)  -- parentheses required
        UNION ALL
        SELECT (SELECT pos_bus_uuid FROM test2
                 WHERE pos_bus_uuid > t.pos_bus_uuid
                 ORDER BY pos_bus_uuid LIMIT 1)
        FROM t
        WHERE t.pos_bus_uuid IS NOT NULL
    )
    SELECT pos_bus_uuid FROM t WHERE pos_bus_uuid IS NOT NULL;

Your existing index whose leading column is pos_bus_uuid (idx_multi) should be enough to satisfy this query.
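To see why this helps (a hedged sketch, not part of the original answer): each step of the recursive CTE fetches only the next distinct value with a single-row index lookup, so the query does on the order of one index probe per distinct pos_bus_uuid (about 354 of them, per the plan above) instead of sequentially scanning all 18.5 million rows. You can check the plan shape by wrapping the same query in EXPLAIN; the recursive part should show an index scan on idx_multi with LIMIT 1 rather than a Seq Scan on test2:

    -- Hypothetical verification: look for Index Only Scan / Index Scan using idx_multi
    -- inside the recursive part, instead of Seq Scan on test2.
    EXPLAIN (ANALYZE, BUFFERS)
    WITH RECURSIVE t AS (
        (SELECT pos_bus_uuid FROM test2 ORDER BY pos_bus_uuid LIMIT 1)
        UNION ALL
        SELECT (SELECT pos_bus_uuid FROM test2
                 WHERE pos_bus_uuid > t.pos_bus_uuid
                 ORDER BY pos_bus_uuid LIMIT 1)
        FROM t
        WHERE t.pos_bus_uuid IS NOT NULL
    )
    SELECT pos_bus_uuid FROM t WHERE pos_bus_uuid IS NOT NULL;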

Regarding postgresql - Big data query, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36490290/
