
mysql - Optimizing a self-join that uses filesort

Reposted. Author: 行者123. Updated: 2023-11-29 00:22:30

I have this query:

SELECT DISTINCT 
t1.`signature_id` AS id1,
t2.`signature_id` AS id2,
COUNT(DISTINCT t3.serial) AS weight
FROM `gc_con_sig` AS t1
INNER JOIN `gc_con_sig` AS t2
ON ((t1.`signature_id` != t2.`signature_id`)
AND (t1.`petition_id` = t2.`petition_id`))
INNER JOIN `wtp_data_petitions` AS t3
ON (t3.`serial` = t1.`petition_serial`)
GROUP BY t1.`signature_id`, t2.`signature_id`
HAVING weight > 0;

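It basically produces every permutation (ordered pair) of signature_id values, together with the number of petitions both of them signed (the weight).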
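To make the intended result concrete, here is a minimal Python sketch of what the query computes. The rows are hypothetical toy data, not taken from the question, and the names `rows`, `signers` and `weight` are made up for illustration.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy rows of (petition_id, signature_id); not data from the question.
rows = [
    ("p1", "a"), ("p1", "b"), ("p1", "c"),
    ("p2", "a"), ("p2", "b"),
]

# Group signers by petition, mirroring the self-join condition
# t1.petition_id = t2.petition_id. Using a set collapses duplicate rows,
# which plays the role of COUNT(DISTINCT ...) in the query.
signers = defaultdict(set)
for petition_id, signature_id in rows:
    signers[petition_id].add(signature_id)

# For every ordered pair (id1, id2) with id1 != id2,
# count how many petitions both of them signed.
weight = defaultdict(int)
for ids in signers.values():
    for id1, id2 in permutations(sorted(ids), 2):
        weight[(id1, id2)] += 1

for pair, w in sorted(weight.items()):
    print(pair, w)
```

A petition with n signers contributes n(n-1) ordered pairs, so the intermediate result of the self-join grows quadratically with the number of signatures per petition.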

I'm trying to run it against this table (gc_con_sig):

`petition_id` varchar(64) NOT NULL DEFAULT '' COMMENT 'Petition ID defined by API',
`signature_id` varchar(34) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
`petition_serial` int(11) DEFAULT NULL,
KEY `signature_id` (`signature_id`),
KEY `petition_id` (`petition_id`),
KEY `signature_petition_idx` (`signature_id`,`petition_id`),
KEY `pcidx` (`petition_id`,`signature_id`),
KEY `sig_pet_ser_idx` (`petition_serial`)

This is the EXPLAIN output I get:

+----+-------------+-------+--------+-----------------------------------+---------+---------+------------------------+--------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                     | key     | key_len | ref                    | rows   | Extra                                        |
+----+-------------+-------+--------+-----------------------------------+---------+---------+------------------------+--------+----------------------------------------------+
| 1  | SIMPLE      | t1    | ALL    | petition_id,pcidx,sig_pet_ser_idx | NULL    | NULL    | NULL                   | 200659 | Using where; Using temporary; Using filesort |
| 1  | SIMPLE      | t3    | eq_ref | PRIMARY                           | PRIMARY | 4       | wtp.t1.petition_serial | 1      | Using index                                  |
| 1  | SIMPLE      | t2    | ref    | petition_id,pcidx                 | pcidx   | 194     | wtp.t1.petition_id     | 5016   | Using where; Using index                     |
+----+-------------+-------+--------+-----------------------------------+---------+---------+------------------------+--------+----------------------------------------------+

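I have already tuned the various MySQL configuration settings that mysqltuner recommended, but this query will not finish (at least not within an hour) on a machine with 17GB of RAM (12GB allocated to MySQL).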

Any ideas?

Best answer

Can a signature be on multiple petitions? Can serial be NULL?

Assuming the answer to both questions is "no", you could try:

SELECT t1.`signature_id` AS id1, t2.`signature_id` AS id2,
COUNT(*) AS weight
FROM `gc_con_sig` t1 INNER JOIN
`gc_con_sig` t2
ON (t1.`signature_id` != t2.`signature_id`) AND
(t1.`petition_id` = t2.`petition_id`)
GROUP BY t1.`signature_id`, t2.`signature_id`;

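count(distinct serial) counts the distinct non-NULL values in the column. If none of the values is NULL and there are no duplicates within a group, this is equivalent to count(*).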

The having clause is not needed, because the on clause essentially guarantees at least one match.

Finally, you never need select distinct when you are using group by correctly.
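The equivalence claimed above can be checked on a small scale. The following is only a sketch: it uses Python's built-in sqlite3 module instead of MySQL, the tables contain hypothetical toy rows, and it assumes that each (petition_id, signature_id) pair occurs once, that petition_serial is never NULL, and that every petition_serial has a matching row in wtp_data_petitions. The having condition is written out as the aggregate expression rather than the alias, purely as a portability precaution.

```python
import sqlite3

# In-memory SQLite database with hypothetical toy data (not the asker's data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wtp_data_petitions (serial INTEGER PRIMARY KEY);
CREATE TABLE gc_con_sig (
    petition_id     TEXT NOT NULL,
    signature_id    TEXT,
    petition_serial INTEGER
);
INSERT INTO wtp_data_petitions VALUES (1), (2);
INSERT INTO gc_con_sig VALUES
    ('p1', 'a', 1), ('p1', 'b', 1), ('p1', 'c', 1),
    ('p2', 'a', 2), ('p2', 'b', 2);
""")

# The query from the question (ORDER BY added only to make results comparable).
original = conn.execute("""
SELECT DISTINCT t1.signature_id AS id1, t2.signature_id AS id2,
       COUNT(DISTINCT t3.serial) AS weight
FROM gc_con_sig AS t1
INNER JOIN gc_con_sig AS t2
        ON t1.signature_id != t2.signature_id
       AND t1.petition_id = t2.petition_id
INNER JOIN wtp_data_petitions AS t3
        ON t3.serial = t1.petition_serial
GROUP BY t1.signature_id, t2.signature_id
HAVING COUNT(DISTINCT t3.serial) > 0
ORDER BY id1, id2
""").fetchall()

# The simplified query from the answer: no third join, no DISTINCT, no HAVING.
simplified = conn.execute("""
SELECT t1.signature_id AS id1, t2.signature_id AS id2,
       COUNT(*) AS weight
FROM gc_con_sig AS t1
INNER JOIN gc_con_sig AS t2
        ON t1.signature_id != t2.signature_id
       AND t1.petition_id = t2.petition_id
GROUP BY t1.signature_id, t2.signature_id
ORDER BY id1, id2
""").fetchall()

print(original == simplified)
for row in simplified:
    print(row)
```

If any of the stated assumptions does not hold for the real data (for example, duplicate rows for the same signature and petition), the two queries can return different weights, so a comparison like this on a sample of the real table is worth doing before switching.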

Regarding mysql - optimizing a self-join that uses filesort, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20599168/
