gpt4 book ai didi

php - 缓慢的 MySQL 查询 - 将数据缓存在 PHP 数组中?

转载 作者:可可西里 更新时间:2023-11-01 06:37:08 25 4
gpt4 key购买 nike

我需要使用 PHP 从 MySQL 数据库中选择一些数据。它可以在一个单一的 MySQL 查询中完成,这需要 5 分钟才能在一台好的服务器上运行(在超过 10 个 Mio 行的表上进行多个 JOIN)。

我想知道在 PHP 中拆分查询并使用一些循环而不是 MySQL 是否是更好的做法。此外,从一个数组中包含 150 000 行的表中查询所有电子邮件然后检查数组而不是执行数千个 MySQL SELECT 会更好吗?

这是查询:

SELECT count(contacted_emails.id), contacted_emails.email 
FROM contacted_emails
LEFT OUTER JOIN blacklist ON contacted_emails.email = blacklist.email
LEFT OUTER JOIN submission_authors ON contacted_emails.email = submission_authors.email
LEFT OUTER JOIN users ON contacted_emails.email = users.email
GROUP BY contacted_emails.email
HAVING count(contacted_emails.id) > 3

EXPLAIN 返回:EXPLAIN

4个表中的索引分别是:

contacted_emails: id, blacklist_section_id, journal_id and mail
blacklist: id, email and name
submission_authors: id, hash_key and email
users: id, email, firstname, lastname, editor_id, title_id, country_id, workplace_id

工作类型_id

contacted_emails 表的创建方式如下:

CREATE TABLE contacted_emails ( 
id int(10) unsigned NOT NULL AUTO_INCREMENT,
email varchar(150) COLLATE utf8_unicode_ci NOT NULL,
contacted_at datetime NOT NULL,
created_at datetime NOT NULL,
blacklist_section_id int(11) unsigned NOT NULL,
journal_id int(10) DEFAULT NULL,
PRIMARY KEY (id),
KEY blacklist_section_id (blacklist_section_id),
KEY journal_id (journal_id),
KEY email (email) )
ENGINE=InnoDB AUTO_INCREMENT=4491706 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

最佳答案

您的索引看起来不错。

性能问题似乎是因为您要JOIN 所有行,然后使用HAVING 进行过滤。

这可能会更好:

SELECT * 
FROM (
SELECT email, COUNT(id) AS number_of_contacts
FROM contacted_emails
GROUP BY email
HAVING COUNT(id) > 3
) AS ce
LEFT OUTER JOIN blacklist AS bl ON ce.email = bl.email
LEFT OUTER JOIN submission_authors AS sa ON ce.email = sa.email
LEFT OUTER JOIN users AS u ON ce.email = u.email
/* EDIT: Exclude-join clause added based on comments below */
WHERE bl.email IS NULL
AND sa.email IS NULL
AND u.email IS NULL

在这里,您在 JOIN 之前限制了您的初始 GROUPed 数据集,这明显更优化。

虽然给定了原始查询的上下文,但似乎根本没有使用 LEFT OUTER JOIN 表,因此以下可能会以更少的开销返回完全相同的结果:

SELECT email, COUNT(id) AS number_of_contacts
FROM contacted_emails
GROUP BY email
HAVING count(id) > 3

那些 JOINed 表到底有什么意义? LEFT JOIN 阻止他们减少任何数据,而您只查看来自 contacted_emails 的聚合数据。您是想改用 INNER JOIN 吗?


编辑:您提到联接的目的是排除现有表中的电子邮件。我修改了我的第一个查询以执行正确的排除连接(这是您最初发布的代码中的错误)。

这是另一个可能适合您的选项:

SELECT 
FROM contacted_emails
LEFT JOIN (
SELECT email FROM blacklist
UNION ALL SELECT email FROM submission_authors
UNION ALL SELECT email FROM users
) AS existing ON contacted_emails.email = existing.email
WHERE existing.email IS NULL
GROUP BY contacted_emails.email
HAVING COUNT(id) > 3

我在这里所做的是收集子查询中的现有电子邮件,并对该派生表执行单个排除连接。

另一种表达方式是 WHERE 子句中的非相关子查询:

SELECT 
FROM contacted_emails
WHERE email NOT IN (
SELECT email FROM blacklist
UNION ALL SELECT email FROM submission_authors
UNION ALL SELECT email FROM users
)
GROUP BY email
HAVING COUNT(id) > 3

全部尝试一下,看看哪个给出了 MySQL 中最好的执行计划

关于php - 缓慢的 MySQL 查询 - 将数据缓存在 PHP 数组中?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32143672/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com