gpt4 book ai didi

MySQL 性能,当查询使用 order by/group by 时子查询使用临时文件排序

转载 作者:可可西里 更新时间:2023-11-01 08:46:06 25 4
gpt4 key购买 nike

在为用户创建的游戏 map 存档制作标签表时,获取包含所有提供的标签的 map 的 map ID 的 SQL 是,...是标签,# 是标签的数量:

SELECT DISTINCT map_id 
FROM `map_tag`
INNER JOIN `tag` USING (tag_id)
WHERE tag IN (...)
GROUP BY map_id HAVING COUNT(DISTINCT tag_id) = #
ORDER BY map_id DESC

/* Affected rows: 0 Found rows: 83,597 Warnings: 0 Duration for 1 query: 0.032 sec. (+ 0.531 sec. network) */

+----+-------------+---------+-------+---------------+---------+---------+-------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+-------+--------+--------------------------+
| 1 | SIMPLE | tag | const | PRIMARY,tag | tag | 767 | const | 1 | Using index |
| 1 | SIMPLE | map_tag | index | NULL | PRIMARY | 8 | NULL | 888729 | Using where; Using index |
+----+-------------+---------+-------+---------------+---------+---------+-------+--------+--------------------------+

然后我加入 map 本身,SQL 变成:

SELECT 
`map`.*
FROM (
SELECT DISTINCT map_id
FROM `map_tag`
INNER JOIN `tag` USING (tag_id)
WHERE tag IN (...)
GROUP BY map_id HAVING COUNT(DISTINCT tag_id) = #
ORDER BY map_id DESC
) matching
INNER JOIN `map` USING (map_id)
INNER JOIN `map_tag` USING (map_id)
INNER JOIN `tag` USING (tag_id)
LIMIT 0, 10

/* Affected rows: 0 Found rows: 10 Warnings: 0 Duration for 1 query: 0.297 sec. */

+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 83597 | |
| 1 | PRIMARY | map | eq_ref | PRIMARY | PRIMARY | 4 | matching.map_id | 1 | |
| 1 | PRIMARY | map_tag | ref | PRIMARY | PRIMARY | 4 | matching.map_id | 2 | Using index |
| 1 | PRIMARY | tag | eq_ref | PRIMARY | PRIMARY | 4 | maps.local.map_tag.tag_id | 1 | Using index |
| 2 | DERIVED | tag | const | PRIMARY,tag | tag | 767 | | 1 | Using index |
| 2 | DERIVED | map_tag | index | NULL | PRIMARY | 8 | NULL | 888729 | Using where; Using index |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------+

当我想实际使用标签时,问题就出现了。

SELECT 
`map`.*,
GROUP_CONCAT(`tag`.tag) AS tags
FROM (
SELECT DISTINCT map_id
FROM `map_tag`
INNER JOIN `tag` USING (tag_id)
WHERE tag IN (...)
GROUP BY map_id HAVING COUNT(DISTINCT tag_id) = #
ORDER BY map_id DESC
) matching
INNER JOIN `map` USING (map_id)
INNER JOIN `map_tag` USING (map_id)
INNER JOIN `tag` USING (tag_id)
GROUP BY map_id
LIMIT 0, 10

/* Affected rows: 0 Found rows: 10 Warnings: 0 Duration for 1 query: 47.641 sec. */

+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 83597 | Using temporary; Using filesort |
| 1 | PRIMARY | map | eq_ref | PRIMARY | PRIMARY | 4 | matching.map_id | 1 | |
| 1 | PRIMARY | map_tag | ref | PRIMARY | PRIMARY | 4 | matching.map_id | 2 | Using index |
| 1 | PRIMARY | tag | eq_ref | PRIMARY | PRIMARY | 4 | maps.local.map_tag.tag_id | 1 | |
| 2 | DERIVED | tag | const | PRIMARY,tag | tag | 767 | | 1 | Using index |
| 2 | DERIVED | map_tag | index | NULL | PRIMARY | 8 | NULL | 888729 | Using where; Using index |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+---------------------------------+

一个 47 秒的查询,从 INNER JOINmap 表之前的 0.3 秒增加了。子查询切换到使用临时和文件排序,我不知道为什么。我在所有相关表中为 map_id 设置了索引,但出于某种原因,它在执行 GROUP BY 时没有使用它们。 ORDER BY 也会导致此行为。

我需要做些什么来改变表以便使用索引吗?有没有更有效的方法来引入 map 表并获取所有标签,而不仅仅是匹配的标签?


目标是有,如果有三个map(这个不表示表结构,tags就是mapmap_tag标记表关系):

+-------+---------------+
| name | tags |
+-------+---------------+
| map A | aaa, bbb, ccc |
| map B | bbb, ccc, zzz |
| map C | ccc, zzz, yyy |
+-------+---------------+

如果我搜索标签“bbb”和“ccc”,我得到的结果是:

+-------+---------------+
| name | tags |
+-------+---------------+
| map A | aaa, bbb, ccc |
| map B | bbb, ccc, zzz |
+-------+---------------+

所有标签都属于每个 map ,而不仅仅是匹配的标签,并且我能够按 map 列对生成的 map 行进行排序,而 MySQL 不会忽略指标:

...
ORDER BY `map`.published DESC

/* Affected rows: 0 Found rows: 10 Warnings: 0 Duration for 1 query: 00:01:35 (+ 0.078 sec. network) */

最佳答案

不是真的理解你的问题,也不是对评论的回答,但是......我会尝试以这种方式构建它......你的内部查询是来自 map_tag 和符合条件的标签上的标签表的连接,以及组concat of distinct 是在那里完成的,具有按 map ID 分组的计数。完成...现在您可以加入符合条件的 map 表。

为了帮助索引优化,我可以建议以下索引

table       index
map_tag ( map_id, tag_id )
tag ( tag_id, tag )
map ( map_id )

SELECT
m.*,
PreTags.allTags
from
( SELECT
mt.map_id,
GROUP_CONCAT(DISTINCT t.tag ORDER BY t.tag SEPARATOR ',') allTags
FROM
map_tag mt
JOIN `tag` t
ON mt.tag_id = t.tag_id
group by
mt.map_id
having
SUM( case when t.tag in (...) then 1 else 0 end ) > 1
order by
mt.map_id DESC ) PreTags
JOIN map m
ON PreTags.map_id = m.map_id
limit
0, 10

这样,内部查询会为您执行组连接,因此您不必在获取最终 map 条目时在外部重新应用它...而且由于内部查询按 map_id 分组,您不会有来自内部查询的重复项。

这里有另一个选项,我很好奇它的性能。

SELECT
m.*,
FullTags.allTags
from
( SELECT
Just10.map_id,
GROUP_CONCAT(DISTINCT t.tag ORDER BY t.tag SEPARATOR ',') allTags
from
( SELECT mt.map_id
FROM map_tag mt
where mt.tag_id in ( select t.tag_id
from `tag` t
where t.tag in (...) )
group by mt.map_id
having COUNT(*) > 1
order by mt.map_id DESC
limit 0, 10 ) Just10
JOIN map_tag mt2
ON Just10.map_id = mt2.map_id
JOIN `tag` t
ON mt2.tag_id = t.tag_id
group by
Just10.map_id ) FullTags
JOIN map m
ON FullTags.map_id = m.map_id

最内层的查询只获取最多 10 个条目,这些条目具有多个与您正在查找的标签匹配的条目并应用排序依据。然后,仅针对这 10 个返回并获取 group_concat() - 同样,这仅适用于最多 10 个记录,然后最终加入以获取其余的 map 数据。

关于MySQL 性能,当查询使用 order by/group by 时子查询使用临时文件排序,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/27956078/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com