
MySQL query performance help

Reposted. Author: 行者123. Updated: 2023-11-29 01:01:45

I have a large table that stores the words contained in emails:

mysql> explain t_message_words;
+----------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------+------+-----+---------+----------------+
| mwr_key | int(11) | NO | PRI | NULL | auto_increment |
| mwr_message_id | int(11) | NO | MUL | NULL | |
| mwr_word_id | int(11) | NO | MUL | NULL | |
| mwr_count | int(11) | NO | | 0 | |
+----------------+---------+------+-----+---------+----------------+

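The table contains about 100M rows.
mwr_message_id is a FK to the messages table.
mwr_word_id is a FK to the words table.
mwr_count is the number of occurrences of word mwr_word_id in message mwr_message_id.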

To find the most frequently used words, I use the following query:

SELECT SUM(mwr_count) AS word_count, mwr_word_id
FROM t_message_words
GROUP BY mwr_word_id
ORDER BY word_count DESC
LIMIT 100;

It runs almost forever (more than half an hour on the test server):

mysql> show processlist;
+----+------+----------------+--------+---------+------+----------------------+-----------------------------------------------------
| Id | User | Host | db | Command | Time | State | Info
+----+------+----------------+--------+---------+------+----------------------+-----------------------------------------------------
| 41 | root | localhost:3148 | tst_db | Query | 1955 | Copying to tmp table | SELECT SUM(mwr_count) AS word_count, mwr_word_id
FROM t_message_words
GROUP BY mwr_word_id |
+----+------+----------------+--------+---------+------+----------------------+-----------------------------------------------------
3 rows in set (0.00 sec)

What can I do to "speed up" the query (other than adding more RAM, more CPU, or faster disks)?

Thanks in advance,
Stefano

P.S. The EXPLAIN result:

mysql> EXPLAIN SELECT SUM(mwr_count) AS word_count, mwr_word_id
-> FROM t_message_words
-> GROUP BY mwr_word_id
-> ORDER BY word_count DESC
-> LIMIT 100;
+----+-------------+-----------------+-------+---------------+----------------------+---------+------+----------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------+---------------+----------------------+---------+------+----------+---------------------------------+
| 1 | SIMPLE | t_message_words | index | NULL | IDX_t_message_words2 | 4 | NULL | 94823285 | Using temporary; Using filesort |
+----+-------------+-----------------+-------+---------------+----------------------+---------+------+----------+---------------------------------+
1 row in set (0.01 sec)

Best answer

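I can't tell whether you have indexes on both mwr_message_id and mwr_word_id, only on mwr_word_id, or no index at all other than the primary key. If you don't have an index on mwr_word_id (or a composite index with mwr_word_id as its first column), I suggest adding one.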
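A minimal sketch of such an index follows. The index name `idx_mwr_word_count` is hypothetical, and including mwr_count as a second column is an assumption on top of the suggestion above: it would let MySQL answer the query from the index alone, without reading the table rows. Building an index on roughly 100M rows can itself take a long time, so it is worth trying on the test server first.

```sql
-- Hypothetical index name; mwr_count as a second column is an assumption
-- intended to make the index "covering" for the SUM(mwr_count) query.
ALTER TABLE t_message_words
  ADD INDEX idx_mwr_word_count (mwr_word_id, mwr_count);

-- Re-check the plan afterwards. "Using index" in the Extra column would
-- indicate that the query is served from the index alone.
EXPLAIN SELECT SUM(mwr_count) AS word_count, mwr_word_id
FROM t_message_words
GROUP BY mwr_word_id
ORDER BY word_count DESC
LIMIT 100;
```

Even with this index, sorting by the aggregate `word_count` likely still requires a sort step, because the sort key is computed per group and cannot come from an index. That sort runs over one row per distinct word rather than over all 100M rows.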

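If you already have such an index and the query is still painfully slow for what is a very common scenario, I suggest adding some redundancy to the words table: a column holding the total number of occurrences of each mwr_word_id, summed over all mwr_message_id values.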

You can also add triggers on t_message_words to keep this redundant total up to date.
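The redundant total and its triggers could be sketched roughly as below. The question does not show the words table, so the names `t_words`, `wrd_key`, and `wrd_total_count`, as well as the trigger and index names, are all hypothetical. The `DELIMITER` lines are a feature of the mysql command-line client and are only needed for the multi-statement trigger body.

```sql
-- Hypothetical names throughout: t_words, wrd_key, wrd_total_count.
ALTER TABLE t_words
  ADD COLUMN wrd_total_count BIGINT NOT NULL DEFAULT 0,
  ADD INDEX idx_wrd_total_count (wrd_total_count);

-- One-off backfill from the existing data (slow, but run only once).
UPDATE t_words w
JOIN (
  SELECT mwr_word_id, SUM(mwr_count) AS total
  FROM t_message_words
  GROUP BY mwr_word_id
) s ON s.mwr_word_id = w.wrd_key
SET w.wrd_total_count = s.total;

-- Keep the total in sync on INSERT and DELETE.
CREATE TRIGGER trg_mwr_after_insert
AFTER INSERT ON t_message_words
FOR EACH ROW
  UPDATE t_words
  SET wrd_total_count = wrd_total_count + NEW.mwr_count
  WHERE wrd_key = NEW.mwr_word_id;

CREATE TRIGGER trg_mwr_after_delete
AFTER DELETE ON t_message_words
FOR EACH ROW
  UPDATE t_words
  SET wrd_total_count = wrd_total_count - OLD.mwr_count
  WHERE wrd_key = OLD.mwr_word_id;

-- On UPDATE, remove the old contribution and add the new one; this also
-- covers a row whose mwr_word_id changes.
DELIMITER //
CREATE TRIGGER trg_mwr_after_update
AFTER UPDATE ON t_message_words
FOR EACH ROW
BEGIN
  UPDATE t_words
  SET wrd_total_count = wrd_total_count - OLD.mwr_count
  WHERE wrd_key = OLD.mwr_word_id;
  UPDATE t_words
  SET wrd_total_count = wrd_total_count + NEW.mwr_count
  WHERE wrd_key = NEW.mwr_word_id;
END//
DELIMITER ;

-- The "most used words" query then reads the precomputed totals.
SELECT wrd_key AS mwr_word_id, wrd_total_count AS word_count
FROM t_words
ORDER BY wrd_total_count DESC
LIMIT 100;
```

The trade-off is extra work on every write to t_message_words in exchange for a cheap read, so this sketch suits the case where the top-words query runs often relative to inserts.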

Regarding MySQL query performance help, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/2690894/
