gpt4 book ai didi

mysql - 如何结合这两个mysql查询来保持速度呢?

转载 作者:可可西里 更新时间:2023-11-01 06:30:13 24 4
gpt4 key购买 nike

我有两个 MySQL 查询,一个接一个,运行速度非常快:

查询 1

SELECT Ads.AdId FROM Ads, AdsGeometry WHERE 
AdsGeometry.AdId = Ads.AdId AND
(ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint))
GROUP BY Ads.AdId

此查询在 0.0013 秒内运行,并返回 4 行。

查询 2

 SELECT Ads.AdId FROM Ads, AdsHierarchy WHERE 
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId

此查询在 0.0094 秒内运行,并返回 67 行(其中 3 行与上述查询相同)。

我正在尝试将这两个查询合并为一个查询,因为稍后,两个查询的结果集应该一起排序,我想使用 MySQL 进行排序。这是我试过的,在它下面,你会发现它也有解释:

SELECT Ads.AdId FROM Ads, AdsHierarchy, AdsGeometry WHERE 
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsGeometry.AdId = Ads.AdId AND (
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint) OR
AdsHierarchy.locations_LocationId = 148022797
)
GROUP BY Ads.AdId

id select_type table type possible_keys key key_len ref rows Extra
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 SIMPLE AdsGeometry ALL PRIMARY,GeomPoint,sx_adsgeometry_geompoint NULL NULL NULL 682848 Using temporary; Using filesort
1 SIMPLE Ads eq_ref PRIMARY PRIMARY 4 dbname.AdsGeometry.AdId 1 Using where; Using index
1 SIMPLE AdsHierarchy ref Ads_AdsHierarchy,locations_LocationId Ads_AdsHierarchy 4 dbname.Ads.AdId 1 Using where

虽然此查询返回了正确的结果集(68 行),但它需要 6.5937 秒才能运行。如果我理解正确的话,AdsHierarchy 表没有使用它的索引,AdsGeometry 表也没有。

有什么方法可以将两个查询(或可能的更多位置,或像这样的基于多边形的查询)合并在一起,并保持合理的运行速度?

谢谢!

编辑:一些信息,关于有问题的 3 个表的索引

AdsGeometry 表是 MyISAM,主键是 AdId

SHOW INDEXES FROM AdsGeometry 的结果是:

Table           Non_unique  Key_name                    Seq_in_index    Column_name     Collation   Cardinality     Sub_part    Packed      Null    Index_type    Comment   Index_comment
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AdsGeometry 0 PRIMARY 1 AdId A 682848 NULL NULL BTREE
AdsGeometry 1 Latitude 1 Latitude A NULL NULL NULL BTREE
AdsGeometry 1 Longitude 1 Longitude A NULL NULL NULL BTREE
AdsGeometry 1 GeomPoint 1 GeomPoint A NULL 32 NULL SPATIAL
AdsGeometry 1 sx_adsgeometry_geompoint 1 GeomPoint A NULL 32 NULL SPATIAL
AdsGeometry 1 Latitude_2 1 Latitude A NULL NULL NULL BTREE
AdsGeometry 1 Latitude_2 2 Longitude A NULL NULL NULL BTREE

AdsHierarchy 表类型是 InnoDB,主键是 AdsHierarchyId

SHOW INDEXES FROM AdsHierarchy 的结果是:

Table           Non_unique  Key_name                    Seq_in_index    Column_name           Collation     Cardinality     Sub_part    Packed      Null    Index_type    Comment   Index_comment
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AdsHierarchy 0 PRIMARY 1 AdsHierarchyId A 2479044 NULL NULL BTREE
AdsHierarchy 1 Ads_AdsHierarchy 1 ads_AdId A 2479044 NULL NULL BTREE
AdsHierarchy 1 locations_LocationId 1 locations_LocationId A 123952 NULL NULL BTREE

Ads 表类型为 InnoDB,主键为 AdId

SHOW INDEXES FROM Ads 的结果是:

Table           Non_unique  Key_name                    Seq_in_index    Column_name           Collation     Cardinality     Sub_part    Packed      Null    Index_type    Comment   Index_comment
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Ads 0 PRIMARY 1 AdId A 705411 NULL NULL BTREE
Ads 1 Accounts_Ads 1 accounts_AccountId A 2 NULL NULL BTREE
Ads 1 Ads_Locations 1 locations_LocationId A 88176 NULL NULL BTREE
Ads 1 Categories_Ads 1 categories_CategoryId A 16 NULL NULL BTREE
Ads 1 Currencies_Ads 1 currencies_Currency A 2 NULL NULL BTREE
Ads 1 countries_CountryId 1 countries_CountryId A 204 NULL NULL BTREE
Ads 1 ExternalId 1 ExternalId A 705411 NULL NULL BTREE
Ads 1 ExternalId 2 accounts_AccountId A 705411 NULL NULL BTREE
Ads 1 xml_XMLId 1 xml_XMLId A 4 NULL NULL BTREE
Ads 1 streets_StreetId 1 streets_StreetId A 2 NULL NULL YES BTREE

编辑 2:用隐式连接重写查询,并解释:

这是查询,重写为使用隐式连接,但它仍然运行得很慢(5.503 秒)

 SELECT a.AdId FROM Ads AS a 
JOIN AdsHierarchy AS ah ON a.AdId = ah.ads_AdId
JOIN AdsGeometry AS ag ON a.AdId = ag.AdId
WHERE
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), ag.GeomPoint)
OR ah.locations_LocationId = 148022797
GROUP BY a.AdId

id select_type table type possible_keys key key_len ref rows Extra
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 SIMPLE a index PRIMARY PRIMARY 4 NULL 627853 Using index
1 SIMPLE ag eq_ref PRIMARY,GeomPoint,sx_adsgeometry_geompoint PRIMARY 8 micasa_dev.a.AdId 1 Using index condition
1 SIMPLE ah ref Ads_AdsHierarchy,locations_LocationId Ads_AdsHierarchy 4 micasa_dev.a.AdId 1 Using where

编辑 3:尝试对两个查询进行 UNION

还尝试了@RobertKoch 提供的UNION 方法。

虽然以下 UNION 查询运行得非常快(0.06 秒)

SELECT Ads.AdId FROM Ads, AdsGeometry 
WHERE
AdsGeometry.AdId = Ads.AdId AND
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint)
GROUP BY Ads.AdId
UNION
SELECT Ads.AdId FROM Ads, AdsHierarchy WHERE
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId

我仍然无法使用此方法,因为稍后我需要根据 Ads 表对我从两个查询的合并中获得的结果集进行排序。

如果我尝试执行以下操作,查询将再次变得极其缓慢(3.7 秒):

SELECT Ads.AdId FROM Ads WHERE Ads.AdId IN (
SELECT Ads.AdId FROM Ads, AdsGeometry
WHERE
AdsGeometry.AdId = Ads.AdId AND
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint)
GROUP BY Ads.AdId
UNION
SELECT Ads.AdId FROM Ads, AdsHierarchy WHERE
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId
) WHERE Ads.AdId > 100000
ORDER BY Ads.ModifiedDate ASC

编辑 4:更改 UNION 所在的位置,似乎可以解决问题

如果我将上面的 UNION 查询修改为

SELECT Ads.AdId
FROM Ads,
(SELECT Ads.AdId
FROM Ads,
AdsGeometry
WHERE AdsGeometry.AdId = Ads.AdId
AND ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint)
GROUP BY Ads.AdId
UNION SELECT Ads.AdId
FROM Ads,
AdsHierarchy
WHERE Ads.AdId = AdsHierarchy.ads_AdId
AND AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId) AS nt
WHERE Ads.AdId = nt.AdId
AND Ads.AdId > 1000000
ORDER BY Ads.ModifiedDate ASC

然后查询再次快速运行(~0.0007 秒)。

If no solution comes without UNION, I am willing to give the bounty to anyone who can explain the difference between the two UNION versions (this one, and the one in EDIT 3), and explain to me, why the query runs fast when it is written in the following order, and runs slow, when written in the above order.

如果需要任何其他信息,请在评论中询问,我会尽量提供!谢谢

*注意:* 我在两个 UNION 查询中添加了一个 ORDER,以使其更清楚,虽然我只从表中选择了 AdId,但我仍然需要来自Ads 表。

编辑 5:@bovko 的请求

1   SIMPLE  Ads     index   NULL                countries_CountryId     2   NULL                    627853  Using index; Using temporary
1 SIMPLE ag eq_ref PRIMARY PRIMARY 8 micasa_dev.Ads.AdId 1 Using where; Distinct
1 SIMPLE ah ref Ads_AdsHierarchy Ads_AdsHierarchy 4 micasa_dev.Ads.AdId 1 Using where; Distinct

最佳答案

IN ( SELECT ... ) 通常效率低下。避免它。

到目前为止,所有的答案都比他们需要的更努力。在UNION之后之前,JOIN 似乎是不必要的。请参阅下面的更多注释。

SELECT  Ads.AdId
FROM Ads,
JOIN (
( SELECT AdId
FROM AdsGeometry
WHERE ST_CONTAINS(GeomFromText('Polygon(( -4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018, -4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882, -4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563 ))'),
AdsGeometry.GeomPoint)
AND AdId > 1000000 )
UNION DISTINCT
( SELECT ads_AdId AS AdId
FROM AdsHierarchy
WHERE locations_LocationId = 148022797
AND ads_AdId > 1000000 )
) AS nt ON Ads.AdId = nt.AdId
ORDER BY Ads.ModifiedDate ASC

注意事项:

  • AdsGeometryAdsHierarchy 都有 adId(不同的名称);无需在内部查询中执行 JOIN,除非可能验证它是否存在于 Ads 中。这是一个问题吗?无论如何,我的查询将在外部 SELECT 的 JOIN 中处理。
  • UNION DISTINCT 是必需的,因为两个 SELECTs 可能会获取相同的 ID。
  • > 1000000 移到里面以减少 UNION 收集的值的数量。
  • UNION 将始终(在旧版本的 MySQL 中)或有时(在新版本中)创建一个临时表。你受困于此。
  • IN ( SELECT ... ) 通常优化得非常糟糕;避免它。
  • 我添加了一些括号;可以(但目前没有必要)将ORDER BY等添加到UNION; parent 明确表示它属于什么。
  • 外部查询的唯一原因是获取 ModifiedDate 进行排序。您可以通过删除该要求来加快速度。 (UNION 可能会创建一个 tmp 表;此 ORDER BY 可能会创建另一个。)

关于mysql - 如何结合这两个mysql查询来保持速度呢?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/39411259/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com