sql - Same query, different tables, different execution times on Postgres

Reposted by: 搜寻专家 | Updated: 2023-10-30 22:11:26

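I have a performance problem with Postgres. I have two tables with the same structure and the same indexes, and I ran the same CLUSTER on the id_coordinate index of both tables. The tables have the following structure: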

     Column     |   Type   |                  Modifiers                | Storage | Description
----------------+----------+-------------------------------------------+---------+-------------
 id_best_server | integer  | not null default nextval('seq'::regclass) | plain   |
 date           | date     | not null                                  | plain   |
 id_coordinate  | integer  | not null                                  | plain   |
 mnc            | smallint |                                           | plain   |
 id_cell        | integer  |                                           | plain   |
 rx_level       | real     |                                           | plain   |
 rx_quality     | real     |                                           | plain   |
 sqi            | real     |                                           | plain   |

Indexes:
    "history_best_server_until_2013_10_pkey" PRIMARY KEY, btree (id_best_server)
    "ix_history_best_server_until_2013_10_id_coordinate" btree (id_coordinate) CLUSTER
    "ix_history_best_server_until_2013_10_id_best_server" btree (id_best_server)

The query being executed:

EXPLAIN ANALYZE SELECT DISTINCT ON (x, y) x, y, rx_level, rx_quality, date, mnc, id_cell
FROM
(
    SELECT X(co.location) AS x, Y(co.location) AS y, tems.rx_level, tems.rx_quality, date, mnc, id_cell
    FROM tems.history_best_server_until_2012_10 AS tems
    JOIN gis.coordinate AS co ON tems.id_coordinate = co.id_coordinate
        AND co.location && setsrid(makeBox2d(GeomFromText('POINT(101000 461500)', 2710),
                             GeomFromText('POINT(102400 463610)', 2710)
                             ), 2710)
    WHERE mnc = 41
) AS j1
ORDER BY x, y, date DESC

Both tables have almost the same number of rows (about 8M). When I run the query above, on one table I get these results:

"Unique  (cost=245742.87..245805.99 rows=8416 width=118) (actual time=3420.966..3425.584 rows=10009 loops=1)"
"  ->  Sort  (cost=245742.87..245763.91 rows=8416 width=118) (actual time=3420.963..3422.236 rows=10212 loops=1)"
"        Sort Key: (x(co.location)), (y(co.location)), tems.date"
"        Sort Method: quicksort  Memory: 1182kB"
"        ->  Hash Join  (cost=61069.15..245194.20 rows=8416 width=118) (actual time=191.365..3405.590 rows=10212 loops=1)"
"              Hash Cond: (tems.id_coordinate = co.id_coordinate)"
"              ->  Seq Scan on history_best_server_until_2012_10 tems  (cost=0.00..147705.35 rows=3226085 width=22) (actual time=0.009..1749.468 rows=3230507 loops=1)"
"                    Filter: (mnc = 41)"
"              ->  Hash  (cost=60697.73..60697.73 rows=29714 width=104) (actual time=46.828..46.828 rows=31806 loops=1)"
"                    Buckets: 4096  Batches: 1  Memory Usage: 1864kB"
"                    ->  Bitmap Heap Scan on coordinate co  (cost=937.22..60697.73 rows=29714 width=104) (actual time=14.975..35.561 rows=31806 loops=1)"
"                          Recheck Cond: (location && '0103000020960A000001000000050000000000000080A8F84000000000F02A1C410000000080A8F84000000000E84B1C41000000000000F94000000000E84B1C41000000000000F94000000000F02A1C410000000080A8F84000000000F02A1C41'::geome (...)"
"                          ->  Bitmap Index Scan on ix_coordinate_location  (cost=0.00..929.79 rows=29714 width=0) (actual time=14.593..14.593 rows=31806 loops=1)"
"                                Index Cond: (location && '0103000020960A000001000000050000000000000080A8F84000000000F02A1C410000000080A8F84000000000E84B1C41000000000000F94000000000E84B1C41000000000000F94000000000F02A1C410000000080A8F84000000000F02A1C41'::g (...)"
"Total runtime: 3426.635 ms"

On the other table, it looks like this:

"Unique  (cost=267070.35..267138.75 rows=9120 width=118) (actual time=172.333..177.232 rows=10051 loops=1)"
"  ->  Sort  (cost=267070.35..267093.15 rows=9120 width=118) (actual time=172.330..173.708 rows=10256 loops=1)"
"        Sort Key: (x(co.location)), (y(co.location)), tems.date"
"        Sort Method: quicksort  Memory: 1186kB"
"        ->  Nested Loop  (cost=937.22..266470.49 rows=9120 width=118) (actual time=14.876..156.322 rows=10256 loops=1)"
"              ->  Bitmap Heap Scan on coordinate co  (cost=937.22..60697.73 rows=29714 width=104) (actual time=14.788..29.510 rows=31806 loops=1)"
"                    Recheck Cond: (location && '0103000020960A000001000000050000000000000080A8F84000000000F02A1C410000000080A8F84000000000E84B1C41000000000000F94000000000E84B1C41000000000000F94000000000F02A1C410000000080A8F84000000000F02A1C41'::geometry)"
"                    ->  Bitmap Index Scan on ix_coordinate_location  (cost=0.00..929.79 rows=29714 width=0) (actual time=14.409..14.409 rows=31806 loops=1)"
"                          Index Cond: (location && '0103000020960A000001000000050000000000000080A8F84000000000F02A1C410000000080A8F84000000000E84B1C41000000000000F94000000000E84B1C41000000000000F94000000000F02A1C410000000080A8F84000000000F02A1C41'::geometr (...)"
"              ->  Index Scan using ix_history_best_server_until_2013_10_id_coordinate on history_best_server_until_2013_10 tems  (cost=0.00..6.91 rows=1 width=22) (actual time=0.003..0.003 rows=0 loops=31806)"
"                    Index Cond: (id_coordinate = co.id_coordinate)"
"                    Filter: (mnc = 41)"
"Total runtime: 178.280 ms"

The total runtimes differ widely: about 3427 ms on the first table versus about 178 ms on the second.

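Without "WHERE mnc = 41", both queries run fast. I don't know what causes the sequential scan in the first case. Note that mnc can take only one of 3 possible values. The frequency of each value is about 41%, 39%, 20% on the faster table and about 43%, 41%, 16% on the slower table.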

Added: here are the statistics for the fast table.

             tablename             |    attname     | n_distinct | correlation | most_common_freqs
-----------------------------------+----------------+------------+-------------+-------------------
 history_best_server_until_2013_10 | id_best_server |         -1 |           1 |
 history_best_server_until_2013_10 | date           |       1122 |   -0.206991 | many values
 history_best_server_until_2013_10 | id_coordinate  |  -0.373645 |           1 | many values
 history_best_server_until_2013_10 | mnc            |          3 |     0.30477 | {0.411783,0.386967,0.20125}
 history_best_server_until_2013_10 | id_cell        |       5811 |  -0.0759416 | many values
 history_best_server_until_2013_10 | rx_level       |      14961 |   -0.122292 | many values
 history_best_server_until_2013_10 | rx_quality     |         16 |    0.360472 | many values
 history_best_server_until_2013_10 | sqi            |       5552 |    0.212023 | many values
(8 rows)

And these are for the slow one:

             tablename             |    attname     | n_distinct | correlation | most_common_freqs
-----------------------------------+----------------+------------+-------------+-------------------
 history_best_server_until_2012_10 | id_best_server |         -1 |           1 |
 history_best_server_until_2012_10 | date           |        954 |   -0.205897 | many values
 history_best_server_until_2012_10 | id_coordinate  |  -0.421911 |           1 | many values
 history_best_server_until_2012_10 | mnc            |          3 |    0.314319 | {0.4349,0.402433,0.162667}
 history_best_server_until_2012_10 | id_cell        |       5617 |  -0.0715787 | many values
 history_best_server_until_2012_10 | rx_level       |      14129 |   -0.115288 | many values
 history_best_server_until_2012_10 | rx_quality     |         22 |    0.368943 | many values
 history_best_server_until_2012_10 | sqi            |       5320 |    0.226596 | many values

Table definition of gis.coordinate:

                                                  Table "gis.coordinate"
    Column     |   Type   |                               Modifiers                                | Storage | Description
---------------+----------+------------------------------------------------------------------------+---------+-------------
 id_coordinate | integer  | not null default nextval('gis.coordinate_id_coordinate_seq'::regclass) | plain   |
 location      | geometry |                                                                        | main    |

Indexes:
    "coordinate_pkey" PRIMARY KEY, btree (id_coordinate)
    "ix_pk_coordinate" UNIQUE, btree (id_coordinate) CLUSTER
    "ix_coordinate_location" gist (location)

Check constraints:
    "enforce_dims_location" CHECK (ndims(location) = 2)
    "enforce_geotype_location" CHECK (geometrytype(location) = 'POINT'::text OR location IS NULL)
    "enforce_srid_location" CHECK (srid(location) = 2710)

Best answer

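It's not the same data, so expecting the same plan is unreasonable unless the statistics (the number of rows with mnc = 41, the way values are distributed across the table, and so on) are similar.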
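To compare the statistics the planner actually sees for the two tables, a query along these lines could be run against the standard pg_stats view. This is only a sketch: it assumes both tables live in the tems schema (the question shows that only for the 2012_10 table) and that ANALYZE has been run recently.

```sql
-- Side-by-side planner statistics for the columns that drive the plan choice.
-- Assumption: both history tables are in the "tems" schema.
SELECT tablename,
       attname,
       n_distinct,
       correlation,
       most_common_vals,
       most_common_freqs
FROM pg_stats
WHERE schemaname = 'tems'
  AND tablename IN ('history_best_server_until_2012_10',
                    'history_best_server_until_2013_10')
  AND attname IN ('mnc', 'id_coordinate')
ORDER BY attname, tablename;
```

Including most_common_vals next to most_common_freqs shows which mnc value each frequency belongs to, which the frequency list alone does not.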

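In one case the value is likely present and spread all over the table, while in the other the matching rows are grouped in a narrow range. In the first case, sequentially scanning the rows will often be faster; in the other, an index scan will usually be faster.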
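One way to check whether the sequential scan really is the cheaper option on the slow table is to refresh its statistics and then re-run the query with sequential scans discouraged for a single transaction. The following is a diagnostic sketch, not a recommended production setting; it reuses the query from the question unchanged.

```sql
-- Refresh planner statistics for the slow table in case they are stale.
ANALYZE tems.history_best_server_until_2012_10;

BEGIN;
-- Heavily penalize sequential scans for this transaction only (diagnostic use).
SET LOCAL enable_seqscan = off;

EXPLAIN ANALYZE
SELECT DISTINCT ON (x, y) x, y, rx_level, rx_quality, date, mnc, id_cell
FROM (
    SELECT X(co.location) AS x, Y(co.location) AS y,
           tems.rx_level, tems.rx_quality, date, mnc, id_cell
    FROM tems.history_best_server_until_2012_10 AS tems
    JOIN gis.coordinate AS co ON tems.id_coordinate = co.id_coordinate
        AND co.location && setsrid(makeBox2d(GeomFromText('POINT(101000 461500)', 2710),
                                             GeomFromText('POINT(102400 463610)', 2710)), 2710)
    WHERE mnc = 41
) AS j1
ORDER BY x, y, date DESC;

-- End the transaction so the setting does not leak into later statements.
ROLLBACK;
```

If the resulting plan uses the id_coordinate index and its actual time is much lower than the hash join's, the planner's cost estimate for this table is off rather than the data layout being fundamentally different; if it is slower, the sequential scan was the right choice for this data.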

Regarding "sql - Same query, different tables, different execution times on Postgres", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27426435/
