gpt4 book ai didi

mysql - 提高性能或重新设计 'greatest-n-per-group' mysql 查询

转载 作者:可可西里 更新时间:2023-11-01 08:06:39 25 4
gpt4 key购买 nike

我正在使用 MySQL5,目前我有一个查询可以获取我需要的信息,但我觉得它在性能方面可以得到改进。

这是我构建的查询(大致遵循 this guide ):

SELECT d.*, dc.date_change, dc.cwd, h.name as hub
FROM livedata_dom AS d
LEFT JOIN ( SELECT dc1.*
FROM livedata_domcabling as dc1
LEFT JOIN livedata_domcabling AS dc2
ON dc1.dom_id = dc2.dom_id AND dc1.date_change < dc2.date_change
WHERE dc2.dom_id IS NULL
ORDER BY dc1.date_change desc) AS dc ON (d.id = dc.dom_id)
LEFT JOIN livedata_hub AS h ON (d.id = dc.dom_id AND dc.hub_id = h.id)
WHERE d.cluster = 'localhost'
GROUP BY d.id;

编辑:使用 ORDER BY + GROUP BY 来避免获得多个 dom 条目,以防“domcabling”有一个带有 null date_change 的条目,而另一个带有相同“dom”的日期。

我觉得我正在用火箭筒杀死一只老鼠。此查询需要超过 3 秒,而“livedata_dom”和“livedata_domcabling”中只有大约 5k 个条目。此外,EXPLAIN 告诉我使用了 2 个文件排序:

+----+-------------+------------+--------+-----------------------------+-----------------------------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+-----------------------------+-----------------------------+---------+-----------------+------+----------------------------------------------+
| 1 | PRIMARY | d | ALL | NULL | NULL | NULL | NULL | 3 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 1 | PRIMARY | h | eq_ref | PRIMARY | PRIMARY | 4 | dc.hub_id | 1 | |
| 2 | DERIVED | dc1 | ALL | NULL | NULL | NULL | NULL | 4 | Using filesort |
| 2 | DERIVED | dc2 | ref | livedata_domcabling_dc592d9 | livedata_domcabling_dc592d9 | 4 | live.dc1.dom_id | 2 | Using where; Not exists |
+----+-------------+------------+--------+-----------------------------+-----------------------------+---------+-----------------+------+----------------------------------------------+

我如何更改此查询以使其更有效率?

使用虚拟数据(下面提供),这是预期的结果:

+-----+-------+---------+--------+----------+------------+-----------+---------------------+------+-----------+
| id | mb_id | prod_id | string | position | name | cluster | date_change | cwd | hub |
+-----+-------+---------+--------+----------+------------+-----------+---------------------+------+-----------+
| 249 | 47 | 47 | 47 | 47 | SuperDOM47 | localhost | NULL | NULL | NULL |
| 250 | 48 | 48 | 48 | 48 | SuperDOM48 | localhost | 2014-04-16 05:23:00 | 32A | megahub01 |
| 251 | 49 | 49 | 49 | 49 | SuperDOM49 | localhost | NULL | 22B | megahub01 |
+-----+-------+---------+--------+----------+------------+-----------+---------------------+------+-----------+

基本上每个'dom'条目我需要1行,用

  1. 具有最高 date_change 的“domcabling”记录
    • 如果记录不存在,我需要空字段
    • 每个 dom 的一个条目可能有一个空的 date_change 字段(空的日期时间字段被认为比任何其他日期时间都早)
  2. “集线器”的名称,当找到“domcabling”条目时,否则为空

为 3 个表创建表 + 虚拟插入:

livedata_dom(约 5000 个条目)

CREATE TABLE `livedata_dom` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`mb_id` varchar(12) NOT NULL,
`prod_id` varchar(8) NOT NULL,
`string` int(11) NOT NULL,
`position` int(11) NOT NULL,
`name` varchar(30) NOT NULL,
`cluster` varchar(9) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `mb_id` (`mb_id`),
UNIQUE KEY `prod_id` (`prod_id`),
UNIQUE KEY `name` (`name`),
UNIQUE KEY `livedata_domgood_string_7bff074107b0e5a0_uniq` (`string`,`position`,`cluster`)
) ENGINE=InnoDB AUTO_INCREMENT=5485 DEFAULT CHARSET=latin1;

INSERT INTO `livedata_dom` VALUES (251,'49','49',49,49,'SuperDOM49','localhost'),(250,'48','48',48,48,'SuperDOM48','localhost'),(249,'47','47',47,47,'SuperDOM47','localhost');

livedata_domcabling(大约 10000 个条目并且增长缓慢)

CREATE TABLE `livedata_domcabling` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dom_id` int(11) NOT NULL,
`hub_id` int(11) NOT NULL,
`cwd` varchar(3) NOT NULL,
`date_change` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `livedata_domcabling_dc592d9` (`dom_id`),
KEY `livedata_domcabling_4366aa6e` (`hub_id`),
CONSTRAINT `dom_id_refs_id_73e89ce0c50bf0a6` FOREIGN KEY (`dom_id`) REFERENCES `livedata_dom` (`id`),
CONSTRAINT `hub_id_refs_id_179c89d8bfd74cdf` FOREIGN KEY (`hub_id`) REFERENCES `livedata_hub` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5397 DEFAULT CHARSET=latin1;

INSERT INTO `livedata_domcabling` VALUES (1,251,1,'22B',NULL),(2,250,1,'33A',NULL),(6,250,1,'32A','2014-04-16 05:23:00'),(5,250,1,'22B','2013-05-22 00:00:00');

livedata_hub(约 100 个条目)

CREATE TABLE `livedata_hub` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(14) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=98 DEFAULT CHARSET=latin;

INSERT INTO `livedata_hub` VALUES (1,'megahub01');

最佳答案

试试这个重写(在 SQL-Fiddle 中测试:

SELECT 
d.*, dc.date_change, dc.cwd, h.name as hub
FROM
livedata_dom AS d
LEFT JOIN
livedata_domcabling as dc
ON dc.id =
( SELECT id
FROM livedata_domcabling AS dcc
WHERE dcc.dom_id = d.id
ORDER BY date_change DESC
LIMIT 1
)
LEFT JOIN
livedata_hub AS h
ON dc.hub_id = h.id
WHERE
d.cluster = 'localhost' ;

(dom_id, date_change) 上的索引将有助于提高效率。

我不确定 d.cluster = 'localhost' 的选择性(livedata_dom 表有多少行符合这个条件?)但是添加了一个索引在 (cluster) 上也可能有帮助。

关于mysql - 提高性能或重新设计 'greatest-n-per-group' mysql 查询,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/12980823/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com