gpt4 book ai didi

MySQL 索引查询对特定列值花费很长时间

转载 作者:行者123 更新时间:2023-11-29 06:07:03 24 4
gpt4 key购买 nike

我有 2 个 MySQL(版本 14.14 Distrib 5.5.49)表,看起来像这样:

CREATE TABLE `Document` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`CompanyCode` int(10) unsigned NOT NULL,
`B` int(10) unsigned NOT NULL,
`C` int(10) unsigned NOT NULL,
`DocumentCode` int(10) unsigned NOT NULL,
`E` int(11) DEFAULT '0',
`EpochSeconds` int(11) DEFAULT '0',
`G` int(10) unsigned NOT NULL,
`H` int(10) unsigned NOT NULL,
`I` int(11) DEFAULT '0',
`J` int(11) DEFAULT '0',
`K` varchar(48) DEFAULT '',
PRIMARY KEY (`Id`),
KEY `Idx1` (`CompanyCode`),
KEY `Idx2` (`B`,`C`),
KEY `Idx3` (`CompanyCode`,`DocumentCode`),
KEY `Idx4` (`CompanyCode`,`B`,`C`),
KEY `Idx5` (`H`),
KEY `Idx6` (`CompanyCode`,`K`),
KEY `Idx7` (`K`),
KEY `Idx8` (`K`,`E`),
KEY `NEWIDX` (`DocumentCode`,`EpochSeconds`),
) ENGINE=MyISAM AUTO_INCREMENT=397783215 DEFAULT CHARSET=latin1

CREATE TABLE `Company` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`CompanyCode` int(10) unsigned NOT NULL,
`CompanyName` varchar(150) NOT NULL,
`C` varchar(2) NOT NULL,
`D` varchar(10) NOT NULL,
`E` varchar(150) NOT NULL,
PRIMARY KEY (`Id`),
KEY `Idx1` (`CompanyCode`),
KEY `Idx2` (`CompanyName`),
KEY `Idx3` (`C`),
KEY `Idx4` (`D`,`C`)
KEY `Idx5` (`E`)
) ENGINE=MyISAM AUTO_INCREMENT=9218804 DEFAULT CHARSET=latin1

我省略了 Company 中的大部分列定义,因为我不想不必要地使问题复杂化,但是那些缺失的列不涉及任何 KEY 定义。

Document 有大约 1250 万行,Company 有大约 600,000 行。
我已将 KEY NEWIDX 添加到 Document 以促进以下查询:

SELECT Document.*, Company.CompanyName FROM Document, Company where Document.DocumentCode = ? and Document.CompanyCode = Company.CompanyCode ORDER BY Document.EpochSeconds desc LIMIT 0, 30;

执行计划:

+----+-------------+--------------+------+-----------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+------------------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
| 1 | SIMPLE | Company | ALL | Idx1 | NULL | NULL | NULL | 593729 | Using temporary; Using filesort |
| 1 | SIMPLE | Document | ref | Idx1,Idx4,Idx6,NEWIDX,Idx3 | Idx3 | 8 | db.Company.CompanyCode,const | 3 | |
+----+-------------+-------+------+-----------------------------------------------------------+-------------+---------+----------------------+--------+------------------------+

如果上面 Document.DocumentCode 的值不是 8,则查询立即返回(0.00 秒)。如果值为 8,则查询需要 38 到 45 秒之间的任意时间。如果我从查询中删除 Company,例如

SELECT * FROM Document where DocumentCode = 8 ORDER BY EpochSeconds desc LIMIT 0, 30;

执行计划:

+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+
| 1 | SIMPLE | Documents | ref | NEWIDX | NEWIDX | 4 | const | 3654177 | Using where |
+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+

...然后查询立即返回(0.00 秒)。

  • Document.DocumentCode 的可能值范围是 369,这些值之间的分布足够合理。
  • Document 中有约 315 万行 DocumentCode = 8。
  • 另外,请考虑 Document 中约有 150 万行的 DocumentCode = 9,并且该查询会立即返回。

我还在 Document 表上运行了 mysqlcheck 实用程序,它没有报告任何问题。

当在查询中使用 Company 连接时,为什么 DocumentCode = 8 的查询会花费这么长时间,而 DocumentCode 的任何其他值返回得如此之快?


这是 DocumentCode = 8 的执行计划的比较:

+----+-------------+--------------+------+-----------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+------------------------------------------+-------------+---------+------------------------------+--------+---------------------------------+
| 1 | SIMPLE | Company | ALL | Idx1 | NULL | NULL | NULL | 593729 | Using temporary; Using filesort |
| 1 | SIMPLE | Document | ref | Idx1,Idx4,Idx6,NEWIDX,Idx3 | Idx3 | 8 | db.Company.CompanyCode,const | 3 | |
+----+-------------+-------+------+-----------------------------------------------------------+-------------+---------+----------------------+--------+------------------------+

和 DocumentCode = 9:

+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+
| 1 | SIMPLE | Document | ref | Idx1,Idx4,Idx6,NEWIDX,Idx3 | NEWIDX | 4 | const | 1953090 | Using where |
| 1 | SIMPLE | Company | ref | Idx1 | Idx1 | 4 | db.Document.CompanyCode | 1 | |
+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+

它们显然不同,但我对它们的理解还不足以解释正在发生的事情。此外,执行 ANALYZE TABLE DocumentANALYZE TABLE Company 都报告 OK

最佳答案

使用 STRAIGHT_JOIN 强制 MySQL 执行连接的顺序

SELECT Document.*, 
Company.CompanyName
FROM Document
STRAIGHT_JOIN Company
ON Document.CompanyCode = Company.CompanyCode
WHERE Document.DocumentCode = ?
ORDER BY Document.EpochSeconds DESC
LIMIT 0, 30;

关于MySQL 索引查询对特定列值花费很长时间,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40844272/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com