
mysql - Implicit JOIN shows fewer rows in EXPLAIN but runs slower than explicit JOIN

Reposted. Author: 搜寻专家. Updated: 2023-10-30 20:24:31

I am trying to optimize the query below.

SELECT `publisher_id`, `publisher_name`, SUM(`views`) AS `total_views`, SUM(`channels`) AS `total_channels`
FROM (
    SELECT DISTINCT `name` AS `publisher_name`, `id` AS `publisher_id`
    FROM `publishers`
    WHERE TRIM(`name`) <> ''
) AS `publisher_names`
INNER JOIN
(
    SELECT `twitch_name`, `publishers`
    FROM `game_profiles`
    WHERE `twitch_name` IS NOT NULL
    AND `publishers` IS NOT NULL
    AND TRIM(`publishers`) <> ''
) AS `game_list`
    ON `game_list`.`publishers` LIKE CONCAT('%', `publisher_names`.`publisher_name`, '%')
INNER JOIN
(
    SELECT `games`.`id` AS `id`, `games`.`name`, `games`.`simple_name`, `games`.`box`, SUM(`channels`) AS `channels`, SUM(`viewers`) AS `views`
    FROM `games`
    WHERE `log_date` >= SUBDATE(NOW(), INTERVAL 1 WEEK)
    AND `log_date` <= SUBDATE(NOW(), INTERVAL 0 WEEK)
    GROUP BY `games`.`id`
) AS `view_list`
    ON `game_list`.`twitch_name` = `view_list`.`name`
GROUP BY `publisher_id` ORDER BY `total_views` DESC LIMIT 10;

Checking the query's performance with the EXPLAIN command, I get the following result.

EXPLAIN result for explicit JOIN

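Basically, the games table contains hourly view and channel counts, the game_profiles table maps each game to its publishers, and the publishers table contains a more detailed row for each existing publisher. What I want to achieve is to show the top 10 publishers by the total views of their games over the past week.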

Out of ideas, I tried using an implicit JOIN instead. The query is written below.

SELECT `publishers`.`id` AS `publisher_id`, `publishers`.`name` AS `publisher_name`, 
SUM(`games`.`viewers`) AS `total_views`, SUM(`games`.`channels`) AS `total_channels`
FROM `game_profiles`, `publishers`, `games`
WHERE `game_profiles`.`twitch_name` IS NOT NULL
AND `game_profiles`.`publishers` IS NOT NULL AND TRIM(`game_profiles`.`publishers`) <> ''
AND `game_profiles`.`publishers` LIKE CONCAT('%', `publishers`.`name`, '%')
AND `game_profiles`.`twitch_name` = `games`.`name`
AND `games`.`log_date` >= SUBDATE(NOW(), INTERVAL 1 WEEK)
AND `games`.`log_date` <= SUBDATE(NOW(), INTERVAL 0 WEEK)
GROUP BY `publisher_id` ORDER BY `total_views` DESC LIMIT 10;

This gives me the following EXPLAIN result.

EXPLAIN result for implicit JOIN

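As far as I can tell, this should return the same result, but the query runs so slowly in MySQL Workbench that I could not wait for it to finish, so I cannot verify that it really returns the same rows. Judging from the EXPLAIN output alone, though, I would expect the latter query to run faster. Am I missing something about why that is not the case? Thank you very much.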

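P.S. My database design is not really optimal. It is more of a prototype database, and no normalization was done when it was built. I just want to better understand what is happening in these queries. Thanks.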

Best answer

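In the second query you are performing an implicit CROSS JOIN, which is not advisable and will make your query run forever. Conceptually, it means that you first select every combination of rows from all the tables and only filter the result set after that operation.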
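To make the point about the comma syntax concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration. It shows that a comma join without a WHERE clause is the full Cartesian product, and that once the join condition is added it returns the same rows as the explicit INNER JOIN form.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for MySQL.
# The sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game_profiles (twitch_name TEXT, publishers TEXT);
CREATE TABLE games (name TEXT, viewers INTEGER);
INSERT INTO game_profiles VALUES ('Dota 2', 'Valve'), ('Portal', 'Valve');
INSERT INTO games VALUES ('Dota 2', 100), ('Dota 2', 50), ('Portal', 7);
""")

# Comma join with no condition: every row paired with every row (2 x 3).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM game_profiles, games"
).fetchone()[0]

# Implicit join: comma syntax, join condition placed in WHERE.
implicit = conn.execute("""
    SELECT gp.twitch_name, g.viewers
    FROM game_profiles AS gp, games AS g
    WHERE gp.twitch_name = g.name
    ORDER BY 1, 2
""").fetchall()

# Explicit join: the same condition written in ON.
explicit = conn.execute("""
    SELECT gp.twitch_name, g.viewers
    FROM game_profiles AS gp
    INNER JOIN games AS g ON gp.twitch_name = g.name
    ORDER BY 1, 2
""").fetchall()

print(cross_count)           # size of the unfiltered Cartesian product
print(implicit == explicit)  # both forms return the same rows
```

On tables of realistic size the unfiltered product grows multiplicatively, which is why the conditions that restrict it matter so much.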

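As for the first query.

Your database is not designed very well.

The clause game_list.publishers LIKE CONCAT('%', publisher_names.publisher_name, '%') is far from optimal. A LIKE pattern with a leading wildcard cannot use an index, so every publisher has to be compared against every game profile. There should be a link table instead.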
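The link table suggested above could look roughly like the following sketch. It again uses Python's sqlite3 as a stand-in for MySQL. The game_publishers table, its columns, the id column on game_profiles, and the sample rows are all assumptions made for illustration; they do not appear in the question.

```python
import sqlite3

# SQLite stands in for MySQL. game_publishers, game_profiles.id, and the
# sample data are hypothetical; only the publishers and game_profiles
# table names and the twitch_name column come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publishers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE game_profiles (id INTEGER PRIMARY KEY, twitch_name TEXT);

-- Hypothetical link table: one row per (game, publisher) pair,
-- replacing the free-text publishers column matched with LIKE.
CREATE TABLE game_publishers (
    game_profile_id INTEGER NOT NULL REFERENCES game_profiles(id),
    publisher_id    INTEGER NOT NULL REFERENCES publishers(id),
    PRIMARY KEY (game_profile_id, publisher_id)
);

INSERT INTO publishers VALUES (1, 'Valve'), (2, 'Riot Games');
INSERT INTO game_profiles VALUES (10, 'Dota 2'), (11, 'League of Legends');
INSERT INTO game_publishers VALUES (10, 1), (11, 2);
""")

# Publishers are matched to games by key equality instead of a
# substring search over a text column.
rows = conn.execute("""
    SELECT p.name, gp.twitch_name
    FROM publishers AS p
    INNER JOIN game_publishers AS link ON link.publisher_id = p.id
    INNER JOIN game_profiles AS gp ON gp.id = link.game_profile_id
    ORDER BY p.id
""").fetchall()
print(rows)
```

With a table like this, both joins compare keys for equality, which an index can serve, and a game with several publishers simply gets several rows in the link table.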

So the indexing is probably poor as well. Check for missing indexes, especially on the games table, column log_date.

WHERE log_date >= SUBDATE(NOW(), INTERVAL 1 WEEK) 
AND log_date <= SUBDATE(NOW(), INTERVAL 0 WEEK)

By the way, for better readability this can be rewritten using BETWEEN:

WHERE log_date BETWEEN SUBDATE(NOW(), INTERVAL 1 WEEK) 
AND SUBDATE(NOW(), INTERVAL 0 WEEK)
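The index on games.log_date recommended above can be tried out in a small sketch. This is SQLite driven from Python rather than MySQL, so the date functions and the plan output differ from what MySQL's EXPLAIN prints. The index name idx_games_log_date and the column types are assumptions.

```python
import sqlite3

# SQLite stands in for MySQL; the column types and the index name are
# assumptions. datetime('now', '-7 days') plays the role of
# SUBDATE(NOW(), INTERVAL 1 WEEK).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (
    id INTEGER, name TEXT, viewers INTEGER, channels INTEGER, log_date TEXT
);
CREATE INDEX idx_games_log_date ON games (log_date);
""")

# Ask the planner how it would execute the date-range filter.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT SUM(viewers), SUM(channels)
    FROM games
    WHERE log_date BETWEEN datetime('now', '-7 days') AND datetime('now')
""").fetchall()

# The last column of each plan row is the human-readable detail.
plan_text = " | ".join(row[3] for row in plan)
print(plan_text)
```

The same CREATE INDEX statement is also valid MySQL syntax. There, rerunning EXPLAIN on the original query is the way to check whether the optimizer actually picks the index up.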

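TRIM(publishers) <> '' is not sargable, because wrapping the column in a function prevents an index on it from being used. Try to avoid this. publishers <> '' should be sufficient.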

The grouping of the games table in the last INNER JOIN is probably not optimal either. For questions like this it is best to provide a SQL Fiddle with sample data.

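But you make the same mistake in all of your subqueries. You use INNER JOIN (SELECT x WHERE y) AS z ON z.something = a.something. This destroys index performance, because the derived table may be materialized into a temporary table that does not carry the base table's indexes.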

The optimized query would look something like this (not verified):

SELECT
    publisher_names.id AS publisher_id
    ,publisher_names.name AS publisher_name
    ,SUM(view_list.viewers) AS total_views   -- base column in games is viewers
    ,SUM(view_list.channels) AS total_channels
FROM publishers AS publisher_names
-- join the base tables directly instead of derived tables
INNER JOIN game_profiles AS game_list ON
    game_list.twitch_name IS NOT NULL
    AND game_list.publishers IS NOT NULL
    AND game_list.publishers <> ''
    AND game_list.publishers LIKE CONCAT('%', publisher_names.name, '%')
INNER JOIN games AS view_list
    ON view_list.log_date BETWEEN SUBDATE(NOW(), INTERVAL 1 WEEK)
    AND SUBDATE(NOW(), INTERVAL 0 WEEK)
    AND game_list.twitch_name = view_list.name
WHERE publisher_names.name <> ''
GROUP BY publisher_id
ORDER BY total_views DESC
LIMIT 10;   -- top 10 publishers, as in the original query

Regarding "mysql - Implicit JOIN shows fewer rows in EXPLAIN but runs slower than explicit JOIN", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43199748/
