
MySQL: optimizing a GROUP BY query on a large table

Repost. Author: 行者123. Updated: 2023-11-29 11:40:43

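I have a huge table of about 40 million rows (GPS tracker positions), logged every 10 seconds from several devices inside the company. I want to select only the first row of each minute, so I used GROUP BY. The problem is that the table grows every 10 seconds. I have tried nearly everything and googled for many hours, so I decided to ask.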

I am using MySQL 5.7.11 with a 50 GB InnoDB buffer pool; the server is a Xeon X5650 with 64 GB of RAM.

Table structure:

CREATE TABLE `eventData` (
`id` bigint(20) NOT NULL,
`position` point NOT NULL,
`speed` decimal(6,2) DEFAULT NULL,
`time` datetime DEFAULT NULL,
`device_id` int(9) DEFAULT NULL,
`processed` tinyint(1) NOT NULL DEFAULT '0',
`time_m` datetime GENERATED ALWAYS AS ((`time` - interval second(`time`) second)) VIRTUAL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_czech_ci ROW_FORMAT=DYNAMIC;

ALTER TABLE `eventData`
ADD PRIMARY KEY (`id`),
ADD KEY `time` (`time`),
ADD KEY `device_id` (`device_id`,`processed`),
ADD KEY `time_m` (`time_m`);

SQL:

SELECT e.time, e.time_m, X(e.position) AS lat, Y(e.position) AS lng
FROM eventData AS e
WHERE
e.device_id = 86 AND
e.time BETWEEN '2016-02-29' AND '2016-03-06'
GROUP BY DAY(e.time),HOUR(e.time),MINUTE(e.time);
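
One caveat about the query above: it selects columns that are neither aggregated nor part of the GROUP BY, so MySQL 5.7 accepts it only when ONLY_FULL_GROUP_BY has been removed from sql_mode, and the row returned for each minute is then not guaranteed to be the first one. A minimal sketch of a deterministic variant follows. It assumes that `id` grows with `time` for a given device (plausible for an auto-increment key on an append-only log, but not stated in the question); the alias `first_id` is made up for illustration.

```sql
-- Step 1 (derived table f): find the smallest id in each minute bucket.
-- time_m is the generated column from the table definition (time truncated to the minute).
-- Assumption: within one device, a smaller id means an earlier row.
SELECT e.time, e.time_m, X(e.position) AS lat, Y(e.position) AS lng
FROM eventData AS e
JOIN (
    SELECT MIN(id) AS first_id          -- hypothetical alias
    FROM eventData
    WHERE device_id = 86
      AND time BETWEEN '2016-02-29' AND '2016-03-06'
    GROUP BY time_m
) AS f ON f.first_id = e.id             -- Step 2: fetch those rows by primary key
ORDER BY e.time;
```

This form also runs with ONLY_FULL_GROUP_BY enabled, because the inner query selects only an aggregate. Whether it is faster than the original has not been measured here.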

EXPLAIN:

EXPLAIN SELECT e.time, e.time_m, X(e.position) AS lat, Y(e.position) AS lng FROM eventData AS e WHERE   e.device_id = 86 AND   e.time BETWEEN '2016-02-29' AND '2016-03-06'   GROUP BY DAY(e.time),HOUR(e.time),MINUTE(e.time);
+----+-------------+-------+------------+------+----------------+-----------+---------+-------+---------+----------+---------------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+----------------+-----------+---------+-------+---------+----------+---------------------------------------------------------------------+
| 1 | SIMPLE | e | NULL | ref | time,device_id | device_id | 5 | const | 2122632 | 6.40 | Using index condition; Using where; Using temporary; Using filesort |
+----+-------------+-------+------------+------+----------------+-----------+---------+-------+---------+----------+---------------------------------------------------------------------+
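
Reading the plan above: a `key_len` of 5 corresponds to the nullable INT `device_id` alone, so MySQL locates every row for device 86 (about 2.1 million by its estimate) through the `device_id` index and only then applies the `time` range, which it expects to keep roughly 6.4% of them. A composite index on both columns is a common way to let the range be resolved inside the index. The sketch below uses a hypothetical index name, `device_time`; whether it brings the grouped query under 5 seconds is not something the figures in the question can confirm.

```sql
-- Hypothetical index name; the columns come from the table definition above.
-- Equality column first, range column second.
ALTER TABLE `eventData`
  ADD KEY `device_time` (`device_id`, `time`);

-- Re-run EXPLAIN to see whether the optimizer picks the new index.
EXPLAIN
SELECT e.time, e.time_m, X(e.position) AS lat, Y(e.position) AS lng
FROM eventData AS e
WHERE e.device_id = 86
  AND e.time BETWEEN '2016-02-29' AND '2016-03-06'
GROUP BY DAY(e.time), HOUR(e.time), MINUTE(e.time);
```

If the index is used, `key` should show `device_time` and the `rows` estimate should drop toward the number of rows in the requested week.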

DESCRIBE:

DESCRIBE eventData;
+------------------+------------------------+------+-----+---------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------------+------+-----+---------+-------------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| position | point | NO | | NULL | |
| speed | decimal(6,2) | YES | | NULL | |
| time | datetime | YES | MUL | NULL | |
| device_id | int(9) | YES | MUL | NULL | |
| processed | tinyint(1) | NO | | 0 | |
| time_m | datetime | YES | MUL | NULL | VIRTUAL GENERATED |
+------------------+------------------------+------+-----+---------+-------------------+

What I have already tried:

  • No GROUP BY: ~0.06s
  • GROUP BY day, hour, minute: ~4.76s
  • GROUP BY the virtual column (time_m): ~4.92s
  • GROUP BY e.time DIV 500: ~5.02s

I need a result better than 5 seconds. Please help.

Best answer

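You could partition the table, for example by year. Because the indexes are smaller, this can improve performance significantly. If that is not possible in your environment, try: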

GROUP BY date_format(e.time,'%d%H%i');
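
For context, here is how that clause might slot into the query from the question; this is a sketch, not a measured improvement. The format string '%d%H%i' yields day of month, hour and minute, so it forms the same groups as DAY/HOUR/MINUTE and would merge rows from different months whenever the same day number occurs twice in the range. The variant below adds year and month ('%Y%m%d%H%i'), an adjustment that is not part of the answer. Like the original query, it relies on ONLY_FULL_GROUP_BY being disabled.

```sql
-- Assumes ONLY_FULL_GROUP_BY is not set in sql_mode, as the original query does.
-- '%Y%m%d%H%i' produces one key per calendar minute, e.g. '201602291405'.
SELECT e.time, e.time_m, X(e.position) AS lat, Y(e.position) AS lng
FROM eventData AS e
WHERE e.device_id = 86
  AND e.time BETWEEN '2016-02-29' AND '2016-03-06'
GROUP BY DATE_FORMAT(e.time, '%Y%m%d%H%i');
```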

On the topic of optimizing a GROUP BY query on a large MySQL table, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35819790/
