postgresql - Query optimization with timestamp and GROUP BY in PostgreSQL

Repost. Author: 行者123. Updated: 2023-11-29 12:23:24

I want to query a table with the following structure:

            Table "public.company_geo_table"
       Column       |  Type  | Collation | Nullable | Default
--------------------+--------+-----------+----------+---------
 geoname_id         | bigint |           |          |
 date               | text   |           |          |
 cik                | text   |           |          |
 count              | bigint |           |          |
 country_iso_code   | text   |           |          |
 subdivision_1_name | text   |           |          |
 city_name          | text   |           |          |
Indexes:
    "cik_country_index" btree (cik, country_iso_code)
    "cik_geoname_index" btree (cik, geoname_id)
    "cik_index" btree (cik)
    "date_index" brin (date)

I am trying the following SQL query. It needs to select a specific cik number within a date range and group the rows by cik and geoname_id (the different regions).

select cik, geoname_id, sum(count) as total
from company_geo_table
where cik = '1111111'
and date between '2016-01-01' and '2016-01-10'
group by cik, geoname_id

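The EXPLAIN output shows that only the cik index and the date index are used, and the cik_geoname index is not. Why? Is there any way to optimize my solution? Would a new index help? Thank you in advance.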

HashAggregate  (cost=117182.79..117521.42 rows=27091 width=47) (actual time=560132.903..560134.229 rows=3552 loops=1)
  Group Key: cik, geoname_id
  ->  Bitmap Heap Scan on company_geo_table  (cost=16467.77..116979.48 rows=27108 width=23) (actual time=6486.232..560114.828 rows=8175 loops=1)
        Recheck Cond: ((date >= '2016-01-01'::text) AND (date <= '2016-01-10'::text) AND (cik = '1288776'::text))
        Rows Removed by Index Recheck: 16621155
        Heap Blocks: lossy=193098
        ->  BitmapAnd  (cost=16467.77..16467.77 rows=27428 width=0) (actual time=6469.640..6469.641 rows=0 loops=1)
              ->  Bitmap Index Scan on date_index  (cost=0.00..244.81 rows=7155101 width=0) (actual time=53.034..53.035 rows=8261120 loops=1)
                    Index Cond: ((date >= '2016-01-01'::text) AND (date <= '2016-01-10'::text))
              ->  Bitmap Index Scan on cik_index  (cost=0.00..16209.15 rows=739278 width=0) (actual time=6370.930..6370.930 rows=676231 loops=1)
                    Index Cond: (cik = '1111111'::text)
Planning time: 12.909 ms
Execution time: 560135.432 ms

Best answer

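The row estimate is not good: the planner expected 27108 rows from the bitmap heap scan, but only 8175 came back. The value '1111111' is probably used very often. I am not sure how much it matters, but the cik column appears to have the wrong data type (text), which may be the reason, or part of the reason, for the bad estimate.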

Bitmap Heap Scan on company_geo_table  (cost=16467.77..116979.48 rows=27108 width=23) (actual time=6486.232..560114.828 rows=8175 loops=1)
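
A minimal sketch of how the data-type concern could be addressed. It assumes that every value in date is a valid ISO date string and that every cik is purely numeric with no significant leading zeros; the question confirms neither. ALTER COLUMN ... TYPE rewrites the table and rebuilds its indexes under an exclusive lock, so it is safer to try this on a copy first.

```sql
-- Assumption: all "date" values look like 'YYYY-MM-DD' and all cik values
-- contain only digits; either cast fails with an error otherwise.
ALTER TABLE company_geo_table
    ALTER COLUMN date TYPE date   USING date::date,
    ALTER COLUMN cik  TYPE bigint USING cik::bigint;

-- Refresh planner statistics so row estimates reflect the new column types.
ANALYZE company_geo_table;
```

The query from the question should keep working unchanged after the conversion, because the quoted literals are coerced to the column types.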

It looks like a composite index on (date, cik) could help.
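
The suggestion could be tried roughly as follows. The index names are hypothetical. The (cik, date) variant is not part of the answer; it is included because B-tree indexes generally serve a query best when the equality column comes first and the range column last, so it may be worth comparing the two.

```sql
-- Composite B-tree index as suggested in the answer (hypothetical name).
CREATE INDEX date_cik_index ON company_geo_table (date, cik);

-- Variant (an assumption, not from the answer): equality column first,
-- range column second. Keep whichever index the planner actually prefers.
CREATE INDEX cik_date_index ON company_geo_table (cik, date);

-- Refresh statistics, then compare the new plan with the original one.
ANALYZE company_geo_table;

EXPLAIN (ANALYZE, BUFFERS)
SELECT cik, geoname_id, sum(count) AS total
FROM company_geo_table
WHERE cik = '1111111'
  AND date BETWEEN '2016-01-01' AND '2016-01-10'
GROUP BY cik, geoname_id;
```

If the new plan no longer shows a BitmapAnd over date_index with lossy heap blocks, the roughly 16.6 million rows removed by the index recheck in the original plan should disappear as well.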

Regarding "postgresql - Query optimization with timestamp and GROUP BY in PostgreSQL", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56918608/
