
mysql - Why does adding an INNER JOIN make this query so slow?

Reposted. Author: 可可西里. Updated: 2023-11-01 08:31:26

I have a database with the following three tables:

The matches table has 200,000 matches...

CREATE TABLE `matches` (
`match_id` bigint(20) unsigned NOT NULL,
`start_time` int(10) unsigned NOT NULL,
PRIMARY KEY (`match_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The heroes table has about 100 heroes...

CREATE TABLE `heroes` (
`hero_id` smallint(5) unsigned NOT NULL,
`name` char(40) NOT NULL,
PRIMARY KEY (`hero_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The matches_heroes table has 2,000,000 relations (10 random heroes per match)...

CREATE TABLE `matches_heroes` (
`relation_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`match_id` bigint(20) unsigned NOT NULL,
`hero_id` smallint(6) unsigned NOT NULL,
PRIMARY KEY (`relation_id`),
KEY `match_id` (`match_id`),
KEY `hero_id` (`hero_id`),
CONSTRAINT `matches_heroes_ibfk_2` FOREIGN KEY (`hero_id`)
REFERENCES `heroes` (`hero_id`),
CONSTRAINT `matches_heroes_ibfk_1` FOREIGN KEY (`match_id`)
REFERENCES `matches` (`match_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=3689891 DEFAULT CHARSET=utf8

The following query takes over 1 second, which seems slow to me for something this simple:

SELECT SQL_NO_CACHE COUNT(*) AS match_count
FROM matches INNER JOIN matches_heroes ON matches.match_id = matches_heroes.match_id
WHERE hero_id = 5

Removing just the WHERE clause doesn't help, but if I also remove the INNER JOIN, like this:

SELECT SQL_NO_CACHE COUNT(*) AS match_count FROM matches

...it takes only 0.05 seconds. It seems the INNER JOIN is what's expensive. I don't have much experience with joins. Is this normal, or am I doing something wrong?

Update #1: Here are the EXPLAIN results.

id  select_type  table           type    possible_keys                      key      key_len  ref                                 rows   Extra
1   SIMPLE       matches_heroes  ref     match_id,hero_id,match_id_hero_id  hero_id  2        const                               34742
1   SIMPLE       matches         eq_ref  PRIMARY                            PRIMARY  8        mydatabase.matches_heroes.match_id  1      Using index

Update #2: After listening to your input, I think it is working correctly and is fast enough. Please let me know if you disagree. Thanks for all the help. I really appreciate it.

Best Answer

Use COUNT(matches.match_id) instead of COUNT(*): when joining, it is better not to use *, since it can cause extra work, and counting a column from the join is the best way to make sure you aren't requesting any additional operations. (Not actually an issue for a MySQL inner join; my mistake.)

Also, verify that all your keys are defragmented and that you have enough RAM for the indexes to be loaded into memory.

Update 1:

Try adding a composite index on (match_id, hero_id), since it should give better performance.

ALTER TABLE `matches_heroes` ADD KEY `match_id_hero_id` (`match_id`,`hero_id`)
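The payoff of a composite index here is that it can *cover* the count query, so the engine never touches the table rows. This can be sketched outside MySQL: SQLite applies the same leftmost-prefix rule, and its EXPLAIN QUERY PLAN makes the covering access visible. In this sketch the column order is (hero_id, match_id), since an index leading with hero_id is what covers a `WHERE hero_id = ?` lookup; table and index names are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE matches_heroes (match_id INTEGER, hero_id INTEGER)")
con.executemany("INSERT INTO matches_heroes VALUES (?, ?)",
                [(m, h) for m in range(1000) for h in range(10)])

# Composite index: hero_id first so the WHERE clause can seek on it,
# match_id second so COUNT(*) is answered from the index alone.
con.execute("CREATE INDEX hero_id_match_id ON matches_heroes (hero_id, match_id)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM matches_heroes WHERE hero_id = 5"
).fetchall()
# The detail column reports something like:
# "SEARCH matches_heroes USING COVERING INDEX hero_id_match_id (hero_id=?)"
print(plan[0][3])
```

The same idea shows up in MySQL's EXPLAIN as "Using index" in the Extra column.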


Update 2:

I'm not satisfied with the accepted answer. MySQL is not that slow for 2 million records, so I ran a benchmark on my Ubuntu PC (i7 processor, standard HDD).

-- pre-requirements

CREATE TABLE seq_numbers (
number INT NOT NULL
) ENGINE = MYISAM;


DELIMITER $$
CREATE PROCEDURE InsertSeq(IN MinVal INT, IN MaxVal INT)
BEGIN
DECLARE i INT;
SET i = MinVal;
START TRANSACTION;
WHILE i <= MaxVal DO
INSERT INTO seq_numbers VALUES (i);
SET i = i + 1;
END WHILE;
COMMIT;
END$$
DELIMITER ;

CALL InsertSeq(1,200000)
;

ALTER TABLE seq_numbers ADD PRIMARY KEY (number)
;

-- create tables

-- DROP TABLE IF EXISTS `matches`
CREATE TABLE `matches` (
`match_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`start_time` int(10) unsigned NOT NULL,
PRIMARY KEY (`match_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
;

CREATE TABLE `heroes` (
`hero_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`name` char(40) NOT NULL,
PRIMARY KEY (`hero_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
;

CREATE TABLE `matches_heroes` (
`match_id` bigint(20) unsigned NOT NULL,
`hero_id` smallint(6) unsigned NOT NULL,
PRIMARY KEY (`match_id`,`hero_id`),
KEY (match_id),
KEY (hero_id),
CONSTRAINT `matches_heroes_ibfk_2` FOREIGN KEY (`hero_id`) REFERENCES `heroes` (`hero_id`),
CONSTRAINT `matches_heroes_ibfk_1` FOREIGN KEY (`match_id`) REFERENCES `matches` (`match_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=MyISAM DEFAULT CHARSET=utf8
;
-- insert DATA
-- 100
INSERT INTO heroes(name)
SELECT SUBSTR(CONCAT(char(RAND()*25+65),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97),char(RAND()*25+97)),1,RAND()*9+4) as RandomName
FROM seq_numbers WHERE number <= 100

-- 200000
INSERT INTO matches(start_time)
SELECT rand()*1000000
FROM seq_numbers WHERE number <= 200000

-- 2000000
INSERT INTO matches_heroes(hero_id,match_id)
SELECT a.hero_id, b.match_id
FROM heroes as a
INNER JOIN matches as b ON 1=1
LIMIT 2000000

-- warm-up database, load INDEXes in ram (optional, works only for MyISAM tables)
LOAD INDEX INTO CACHE matches_heroes,matches,heroes


-- get random hero_id
SET @randHeroId=(SELECT hero_id FROM matches_heroes ORDER BY rand() LIMIT 1);


-- test 1

SELECT SQL_NO_CACHE @randHeroId,COUNT(*) AS match_count
FROM matches as a
INNER JOIN matches_heroes as b ON a.match_id = b.match_id
WHERE b.hero_id = @randHeroId
; -- Time: 0.039s


-- test 2: adding some complexity
SET @randName = (SELECT `name` FROM heroes WHERE hero_id = @randHeroId LIMIT 1);

SELECT SQL_NO_CACHE @randName, COUNT(*) AS match_count
FROM matches as a
INNER JOIN matches_heroes as b ON a.match_id = b.match_id
INNER JOIN heroes as c ON b.hero_id = c.hero_id
WHERE c.name = @randName
; -- Time: 0.037s

Conclusion: my test results are roughly 20x faster. My server was at about 80% load before the test, since it is not a dedicated MySQL server and other CPU-intensive tasks were running. So if you run the whole script above and get slower results, it may be because:

  1. You are on a shared host with too much load. There is not much you can do about that: complain to your current host, pay for a better host or VM, or try another host.
  2. Your configured key_buffer_size (for MyISAM) or innodb_buffer_pool_size (for InnoDB) is too small; the optimal size here would be over 150 MB.
  3. You do not have enough free RAM; you need roughly 100-150 MB of RAM to load the indexes into memory. Solution: free some memory or buy more RAM.
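As a sketch of points 2 and 3, the two buffer variables are set in the MySQL configuration file. The path and the values below are illustrative, not tuned recommendations; size them so your indexes fit, within available RAM.

```ini
# my.cnf -- illustrative values, not tuned recommendations
[mysqld]
key_buffer_size         = 256M   # MyISAM index cache
innodb_buffer_pool_size = 256M   # InnoDB data and index cache
```

You can check the current values with SHOW VARIABLES LIKE '%buffer%'.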

Note that because the test script generates fresh data, index fragmentation is ruled out as an issue. I hope this helps; let me know if you run into problems while testing.


Observation:


SELECT SQL_NO_CACHE COUNT(*) AS match_count 
FROM matches INNER JOIN matches_heroes ON matches.match_id = matches_heroes.match_id
WHERE hero_id = 5

is equivalent to:

SELECT SQL_NO_CACHE COUNT(*) AS match_count 
FROM matches_heroes
WHERE hero_id = 5

So if that count is all you need, you do not need the join at all, but I guess this was just an example.
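The equivalence holds because matches_heroes.match_id is a foreign key into matches, so every relation row joins to exactly one match and the join neither adds nor drops rows. A quick sketch (SQLite via Python, with toy data) confirms both forms return the same count:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE matches (match_id INTEGER PRIMARY KEY, start_time INTEGER);
    CREATE TABLE matches_heroes (match_id INTEGER REFERENCES matches, hero_id INTEGER);
""")
con.executemany("INSERT INTO matches VALUES (?, 0)", [(m,) for m in range(100)])
# Each match gets two relation rows: hero 5 and hero 7.
con.executemany("INSERT INTO matches_heroes VALUES (?, ?)",
                [(m, h) for m in range(100) for h in (5, 7)])

joined = con.execute("""
    SELECT COUNT(*) FROM matches
    INNER JOIN matches_heroes ON matches.match_id = matches_heroes.match_id
    WHERE hero_id = 5
""").fetchone()[0]

direct = con.execute(
    "SELECT COUNT(*) FROM matches_heroes WHERE hero_id = 5"
).fetchone()[0]

print(joined, direct)  # both 100
```

If the asker's real query selects columns from matches (say, start_time), the join becomes necessary again.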

Regarding mysql - Why does adding an INNER JOIN make this query so slow?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25763730/
